I have a table with 10 billion records, but there is no key on it. I cannot
build a key on it because it is the data source.
However, the data source contains duplicate rows.
I have used DTS to transform the data into a new table and delete the
duplicate rows. As there are 10 billion records, I need to divide the job into 3
parts, and the process takes 6 hours for each part.
Is there a better method to solve my problem?
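One possible alternative, offered only as a sketch: let the database engine remove the duplicates in a single set-based statement instead of filtering rows inside the DTS transform. The table names `dbo.SourceData` and `dbo.SourceData_Dedup` are hypothetical, and the sketch assumes every column can take part in a `DISTINCT` comparison (no `text` or `image` columns).

```sql
-- Hypothetical names: dbo.SourceData is the keyless source table,
-- dbo.SourceData_Dedup is a new table created by SELECT ... INTO.
SELECT DISTINCT *
INTO dbo.SourceData_Dedup
FROM dbo.SourceData;
```

If the job still has to be split into parts, splitting on a range of one of the columns (with a `WHERE` clause) keeps all copies of a given row in the same part, so each part can be de-duplicated independently.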
Hello, I have a question. I loaded 2 files into SQL, and the files have some rows that share the same model number. How can I merge the rows that have the same model number into one row, taking (if possible) the average of their price column and combining their stock column? Any help would be very much appreciated. Thank you.
I tried this, but it does not work:
SELECT Model_number FROM Products Join Where Model_number='3CM3C1670800B'
I have also tried this. It should work, but I have an error somewhere:
delete from Productsfrom part_number a join (select part_number, max(part_number) from part_number group by part_number having count(*) > 1) b on a.part_number = b.part_number and part_number < b.part_number
We have the below query, which pulls in Sales and Revenue information. Since the sale is recorded in just one month and the revenue is recorded each month, we need the results of this query to list the Sales amount only once, but still list all the revenue amounts for each month. In this example, the sale is recorded in year 2014 and month 10, but there are revenues in every month for the rest of 2014 and the start of 2015. We only want the sales amount to appear once in this result set.
Dear Gurus,
I have a table with the following entries.
Table name = Customer

Name         Weight
------------ -----------
Sanjeev      85
Sanjeev      75
Rajeev       80
Rajeev       45
Sandy        35
Sandy        30
Harry        15
Harry        45

I need output as follows:

Name         Weight
------------ -----------
Sanjeev      85
Rajeev       80
Sandy        30
Harry        45

OR

Name         Weight
------------ -----------
Sanjeev      75
Rajeev       45
Sandy        35
Harry        15

i.e. only distinct Names should display, each with only one value of Weight. I tried 'group by' on the Name column, but it shows me all rows. Could anyone help me with the above?
Thanking you in advance.
Regards,
Sanjeev
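A minimal sketch of one way to get a single row per name: group by `Name` only and wrap `Weight` in an aggregate. If `Weight` is also listed in the `GROUP BY`, every distinct (Name, Weight) pair comes back, which may be why all rows were returned. The choice of `MAX` here is an assumption; `MIN` would work the same way.

```sql
-- One row per Name, keeping the largest Weight for that name.
SELECT Name, MAX(Weight) AS Weight
FROM Customer
GROUP BY Name;
```

With the sample data this returns Sanjeev 85, Rajeev 80, Sandy 35 and Harry 45. An arbitrary mix of high and low weights, as in the two example outputs, cannot be expressed with a single aggregate.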
I used the following select statement to get duplicate records on the case_number column:
select cases.distinct case_link, cases.case_number from cases group by case_link having case_number > 1
I got these error messages:
"'cases.warrant_number' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." and "'cases.case_number' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause."
Any ideas on a better statement to use? THANKS FOR YOUR HELP!
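A sketch of the usual shape for this kind of query, using only the table and column named in the post. Every non-aggregated column in the select list has to appear in the `GROUP BY`, and the `HAVING` clause tests an aggregate (the count) rather than the column value itself.

```sql
-- Case numbers that occur more than once, with how often each occurs.
SELECT case_number, COUNT(*) AS occurrences
FROM cases
GROUP BY case_number
HAVING COUNT(*) > 1;
```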
Hi, I have a table, and this is what I did to get the desired result:
Select A.col1,count(A.col1) from Tab1 A group by A.col1 having count(A.Col1) > 1
I tried this, but it did not work (it returned col1 as blanks): Select A.col1,B.Col2,count(A.col1) from Tab1 A, Tab2 B where A.col1 = B.col1 group by A.col1 , b.col2 having count(A.Col1) > 1
I was looking for all the rows that appear more than once.
Now, the problem:
I have to join this table to another table, Tab2, to get the other details. Tab2 is the table from which I have to pull the customer details like name, address, etc. How should I write this query? Any thoughts? TIA
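One way this could be approached, as a sketch: find the duplicated values first in a derived table, then join that result to `Tab2`, so the join cannot change the counts. `Col2` stands in for whichever customer detail columns are needed.

```sql
-- Step 1 (inner query): values of col1 that appear more than once in Tab1.
-- Step 2 (outer query): pull the matching details from Tab2.
SELECT d.col1, d.occurrences, B.Col2
FROM (
    SELECT col1, COUNT(*) AS occurrences
    FROM Tab1
    GROUP BY col1
    HAVING COUNT(*) > 1
) AS d
INNER JOIN Tab2 AS B ON B.col1 = d.col1;
```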
Hi. I'm a SQL Server newbie, very experienced with Access, developing an ASP.NET database editor web app. I query the database with a statement more or less in the following form:
SELECT organisations.OrgID, organisations.Name, organisations.whatever FROM services INNER JOIN servicegrouping ON services.serviceID=servicegrouping.serviceID INNER JOIN organisations ON servicegrouping.OrgID = organisations.OrgID WHERE services.service=x OR services.service=y
In other words, I have a database of organisations. The services offered by the organisations are in a separate table, and I only want to return organisations that offer services X or Y.
Okay, now if I did this in Access, this query would return just one record for each organisation that meets the condition, unless I was to include a field from the services table in the SELECT clause, in which case of course I would get one record for each organisation and unique service offered.
But in MS SQL, the query returns duplicate rows if the organisation offers more than one service that meets the WHERE condition (=x or =y). Why is this, and what do I need to do to my SQL statement to ensure I only get unique rows?
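An inner join returns one row for every matching combination, so an organisation offering both services appears twice. A sketch of one way to get each organisation once is to test for a matching service with `EXISTS` instead of joining; `'x'` and `'y'` are placeholders for the real service values.

```sql
-- Each organisation appears once, however many of its services match.
SELECT o.OrgID, o.Name
FROM organisations AS o
WHERE EXISTS (
    SELECT 1
    FROM servicegrouping AS sg
    INNER JOIN services AS s ON s.serviceID = sg.serviceID
    WHERE sg.OrgID = o.OrgID
      AND s.service IN ('x', 'y')   -- placeholder values
);
```

Keeping the original joins and writing `SELECT DISTINCT` in front of the column list would be a shorter alternative with the same result.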
I've a query which gets a set of data from multiple tables -
select * FROM A inner JOIN q ON (RIGHT(q.name,CHARINDEX('-',REVERSE(q.name))-1)= a.id) inner JOIN t ON (t.id = q.id) inner JOIN s ON (q.name = s.name ) inner join l on (s.name = l.name and t.name = l.name)
WHERE A.id = 764 and s.name = '764'
I get repeated rows for each id. I have some 136 rows for each q.id (there are 6 q.ids, and hence I get 816 rows instead of 136). These 136 rows are actually divided among these q.ids as
Hello, I have a question: what does a statement look like that finds the duplicate rows and combines them? I have a table named PRODUCTS with 3 columns: Cost, Stock, Part_number. I need to find all Part_numbers that are duplicated, combine the rows into one, add their stock together in the new row, and use the average of their cost as the cost in the new row. Please help me, I am stalled. I have looked all over the internet and could not find anything, and I really need this for a project I cannot finish. I have the following SQL statement:
SELECT part_number, COUNT(part_number) AS NumOccurrences FROM Products GROUP BY Part_number HAVING COUNT(part_number) > 1
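A rough sketch of one way to do the merge, assuming `Products` really has only the three columns mentioned and nothing else references its rows. The temporary table name `#merged` is made up for the example.

```sql
BEGIN TRANSACTION;

-- 1. Build one merged row per part number in a temporary table.
SELECT Part_number,
       SUM(Stock) AS Stock,   -- stock of all copies added together
       AVG(Cost)  AS Cost     -- average cost of the copies
INTO #merged
FROM Products
GROUP BY Part_number;

-- 2. Replace the table contents with the merged rows.
DELETE FROM Products;

INSERT INTO Products (Part_number, Stock, Cost)
SELECT Part_number, Stock, Cost
FROM #merged;

DROP TABLE #merged;

COMMIT TRANSACTION;
```

Note that `AVG` over an integer column returns an integer in SQL Server, so if `Cost` is stored as an integer it may need a `CAST` to a decimal type first.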
I have a csv file that I need to import daily into a SQL Server 2005 table. Much of the table contents could just be overwritten with the new csv file; however, there is a set of rows within the table that needs to be appended to rather than overwritten. There is no primary key in the csv file that can be used. I'm not sure this is the best approach, but what I have been trying to do is append the entire csv file to the existing table, and then go back and delete the duplicates.
When I run the Delete, it does delete the majority of the records, but it leaves a couple hundred behind. The number left behind varies with each run, and I can't seem to identify a pattern. Running the Delete a second time does clean up the rows left behind by the first execution and gives the result I want. Any thoughts as to why this needs to be run twice? Or is a better approach available?
Here is my code:
SELECT [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time], dupcount=count(*) INTO temppkgactions FROM pkgactions GROUP BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time] HAVING count(*) > 1
DELETE TOP (SELECT COUNT(*) -1 FROM dbo.temppkgactions WHERE dupcount > 1 ) FROM dbo.pkgactions
DROP TABLE temppkgactions
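As far as I can tell, `DELETE TOP (n)` removes n arbitrary rows from `pkgactions`, with nothing tying them to the duplicated groups, which may explain the varying leftovers. A sketch of an alternative that works on SQL Server 2005: number the rows inside each group of identical values and delete everything after the first. The CTE name `ranked` is made up; the column names are taken from the post.

```sql
-- Rows that share all five column values form one group;
-- rn = 1 is the copy that is kept, rn > 1 are the duplicates.
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY [Pkg ID], [Elm (s)], [Type Name (s)],
                            [End Exec Date], [End Exec Time]
               ORDER BY [Pkg ID]
           ) AS rn
    FROM dbo.pkgactions
)
DELETE FROM ranked
WHERE rn > 1;
```

This needs no helper table and should give the same result in a single run.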
I have a large table that consists of the columns zip, state, city, county. The primary key "zip" has duplicates, but the rows are unique. How do I filter out only the duplicate zips?
Randy Garland
Hi, I am encountering a problem. There are lots of duplicate rows in the COBOL flat files (due to improper data entry and missing column values) from which I am transforming data to SQL 7.0 tables using DTS. After transformation, can I somehow mark the duplicate rows? It is not for the purpose of eliminating them, but to enter the missing values and make all the rows complete and unique. I have the transformed table as a temporary table. Can I add a column like 'status' and have its value set to '1' for the repeating rows? Can anyone suggest any possible way of implementing it?
Thanks,
Nisha
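A sketch of how the marking could look. The table name `ImportedRows` and the column `cust_no` (the value that repeats) are hypothetical, since the post does not name them.

```sql
-- Add the marker column; existing rows get NULL.
ALTER TABLE ImportedRows ADD status int NULL
GO

-- Flag every row whose cust_no occurs more than once.
UPDATE ImportedRows
SET status = 1
WHERE cust_no IN (
    SELECT cust_no
    FROM ImportedRows
    GROUP BY cust_no
    HAVING COUNT(*) > 1
)
```

The flagged rows can then be listed with `WHERE status = 1` and completed by hand.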
I have a problem deleting duplicate rows. I have an identity column in my table, and if I try to use a correlated subquery with the DELETE command, it gives an error.
The other problem is that I have a date column in my table, which I update with the current date and time. If I use a query to fetch the records for a particular day, it does not return any rows:
select * from rates where ch_date >='02/11/99' and ch_date<='02/11/99'
If I use CONVERT, there are other problems. Is there any way to force date comparisons to be done excluding the time?
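For the date question, a common pattern is a half-open range: from midnight of the day up to, but not including, midnight of the next day. The sketch below assumes '02/11/99' means 11 February 1999.

```sql
-- All rows whose ch_date falls on 11 February 1999, whatever the time part.
SELECT *
FROM rates
WHERE ch_date >= '19990211'
  AND ch_date <  '19990212';
```

The original query compares against midnight on both sides, so only rows stamped exactly 00:00:00 could match. The `yyyymmdd` format is used here because it does not depend on the server's date format setting.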
Can anybody answer the following questions? I want to delete duplicate rows in my table without using a transaction table. One more question: how do I get yesterday's date using the ISQL window?
This is an imaginary problem while discussing ROWID in ORACLE.
Consider a table without a primary key, unique key, or unique index. A row has been inserted into the table many times. I want to delete all but one of the duplicated rows. With any 'where' clause, all the duplicated rows will be deleted. In ORACLE I can achieve this using ROWID as follows:
Delete from Table_name where < all column values > and ROWID <> ( Select max(rowid) from Table_name where < all column values > )
How can this be achieved in MS SQL Server 6.5 ?
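SQL Server has no ROWID, but one workaround, sketched here, is to cap the number of rows a statement may affect with `SET ROWCOUNT`. The columns `col1` and `col2` and their values are hypothetical stand-ins for "all column values", and the example assumes the row is present three times.

```sql
-- The row (1, 'abc') exists 3 times; allow the DELETE to touch only 2.
SET ROWCOUNT 2

DELETE FROM Table_name
WHERE col1 = 1
  AND col2 = 'abc'

-- Remove the limit again so later statements are not affected.
SET ROWCOUNT 0
```

This handles one duplicated row at a time. For many duplicated rows, copying `SELECT DISTINCT *` into a new table is likely to be less work.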
According to Dr. Codd's Golden Rules for RDBMS, one should be able to reach each data value in the database by using the table name, a row identification value, and the column name.
Does MS SQL Server 6.5 satisfy this requirement ?
Also, how many of Dr. Codd's 13 Golden Rules for RDBMS does MS SQL Server 6.5 satisfy? Which ones does it not satisfy?
Hi everyone, I'm migrating some information for a client at the moment. They had everything in Excel files and I'm getting them into SQL Server. There are some differences in the way I am storing data and the way they were storing data.
For each client they stored, they had something like:

Rel1  Rel2  Rel3
100   101   102

Now, what I have is a separate row for each of Rel1, Rel2 and Rel3, so I would have 3 separate rows with identical information except for Rel1. So I would have:

Rel1
100
101
102

So one way I thought of doing it was inserting a new row specifying that the value for Rel2 should be stored in Rel1, and for the next row that the value for Rel3 should be stored in Rel1.
Now, I am able to do this, but SQL Server inserts an extra row with a NULL value in Rel1. Does anyone know why this would be happening? I think it is finding a NULL value in Rel3 after creating the two extra rows and is inserting that NULL. So I think I need to check for NULLs and not allow it to create a new row if, say, Rel3 is NULL.
Any pointers are gladly welcome. (I know it's complicated.)
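A sketch of one way to turn the three columns into rows while skipping the empty ones. The table names `ClientImport` (the Excel layout) and `ClientRel` (the target) and the column `ClientID` are hypothetical.

```sql
-- One SELECT per source column; the IS NOT NULL filters keep empty
-- Rel2/Rel3 cells from producing rows with a NULL in Rel1.
INSERT INTO ClientRel (ClientID, Rel1)
SELECT ClientID, Rel1 FROM ClientImport WHERE Rel1 IS NOT NULL
UNION ALL
SELECT ClientID, Rel2 FROM ClientImport WHERE Rel2 IS NOT NULL
UNION ALL
SELECT ClientID, Rel3 FROM ClientImport WHERE Rel3 IS NOT NULL;
```

A client with values only in Rel1 and Rel2 then ends up with two rows instead of three.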
Hello, I am fairly new to SQL, and having spent much time over the manual, I decided to ask for help. So here's my deal.
I've got a query that joins 5 tables together:
Code:
SELECT *
FROM Map
INNER JOIN ThreatCategory
    INNER JOIN Threat ON ThreatCategory.threatCategoryID = Threat.threatCategoryID
    INNER JOIN Threat_Map ON Threat.threatID = Threat_Map.threatID
  ON Map.mapID = Threat_Map.mapID
LEFT JOIN person on map.contentPersonID = person.personID
WHERE (((DATEDIFF(dd, Map.dataAcquisitionDate, GETDATE()) > map.goodForDays) and (map.expired = '1')) or (map.expired = '3'))
The problem is that the table Threat_Map is a many-to-many mapping between the Map table and the Threat table, e.g. a map can have more than one threat and a threat can have more than one map. I know this is not the best way to have a database set up, but changing the database is out of my hands. What I need help with is this.
My application checks whether a certain field in the Map table is expired or out of date (as in the query). If so, it gets some required information from the other tables using those joins. However, I don't want to get information twice for the same expired Map.mapID. I don't really care which ThreatID I get from the Threat_Map table; I just need to get one of them to meet the object's standards. So far this seemingly simple task has eluded me. I'd like to do this in SQL. Is there perhaps a way to do this? If not, I guess I'll just take care of it in the application.
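Since any one threat per map will do, a possible approach (a sketch, not tested against the real schema) is to reduce `Threat_Map` to a single row per `mapID` before joining. Picking `MIN(threatID)` is an arbitrary choice; the table and column names are taken from the posted query.

```sql
SELECT m.*, t.*, tc.*, p.*
FROM Map AS m
INNER JOIN (
    -- Exactly one (arbitrary) threat per map: the lowest threatID.
    SELECT mapID, MIN(threatID) AS threatID
    FROM Threat_Map
    GROUP BY mapID
) AS tm ON tm.mapID = m.mapID
INNER JOIN Threat AS t ON t.threatID = tm.threatID
INNER JOIN ThreatCategory AS tc ON tc.threatCategoryID = t.threatCategoryID
LEFT JOIN person AS p ON m.contentPersonID = p.personID
WHERE (DATEDIFF(dd, m.dataAcquisitionDate, GETDATE()) > m.goodForDays
       AND m.expired = '1')
   OR m.expired = '3';
```

Each expired map then comes back once, provided a map has only one matching `person` row.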
Please give the DML to SELECT the rows avoiding the duplicate rows. Since there is a text column in the table, I couldn't use aggregate function, group by (OR) DISTINCT for processing.
Table :
create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(4, 'dbc')
go
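One workaround, sketched under the assumption that the server is SQL Server 7.0 or later (where `varchar` can hold up to 8000 characters): cast the `text` column to `varchar` so that `DISTINCT` can compare it.

```sql
-- text cannot be compared directly, so compare a varchar copy instead.
SELECT DISTINCT col1, CAST(col2 AS varchar(8000)) AS col2
FROM test;
```

For the sample data this returns the three rows (1, 'abc'), (2, 'abc') and (4, 'dbc'). Values longer than 8000 characters are truncated by the cast, so two long texts that differ only after that point would be treated as duplicates.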
In my database, I have a table "tbl_c_extract" that consists of 4 columns that look like the following. I'm looking at a daily batch of around 4000 records, of which 150 are likely to be duplicates.
In the example above, I need to remove 2 of the entries, leaving only the one with the maximum leave date. In this case, those without a leave date have the 2099 entry.
Using a CTE works exactly as I want it to; however, SQL Server Agent doesn't seem to like the use of a CTE.
Code:
WITH CTE (Proprietary_ID, LeaveDate, RN) AS (
    SELECT Proprietary_ID, LeaveDate,
           ROW_NUMBER() OVER(PARTITION BY Proprietary_ID ORDER BY Proprietary_ID, LeaveDate) AS RN
    FROM tbl_c_extract
)
DELETE FROM CTE WHERE RN > 1
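Two things that might be worth checking, offered as guesses since the Agent error is not shown. First, when a CTE is not the first statement in a batch, the statement before it must end with a semicolon, which is why many people write `;WITH`. Second, ordering by `LeaveDate` ascending gives RN = 1 to the earliest date, so if the row with the maximum leave date is the one to keep, the order would need to be descending. A sketch with both changes:

```sql
-- Leading semicolon terminates whatever statement precedes the CTE.
;WITH CTE (Proprietary_ID, LeaveDate, RN) AS (
    SELECT Proprietary_ID,
           LeaveDate,
           -- RN = 1 is the latest LeaveDate for each Proprietary_ID.
           ROW_NUMBER() OVER (PARTITION BY Proprietary_ID
                              ORDER BY LeaveDate DESC) AS RN
    FROM tbl_c_extract
)
DELETE FROM CTE
WHERE RN > 1;
```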
Hi, I need to insert rows into table1 from table2 and table3 but I don't want to insert repeated combinations of col2, col3. So, table1 has the primary key col2, col3.
This is table1:
create table table1(
    col1 int not null,
    col2 int not null,
    col3 int not null,
    constraint PK_table1 primary key (col2, col3)
)
This is my "insert" code:
INSERT INTO table1 SELECT table2.col1,table2.col2, table3.col3 FROM table2, table3 WHERE table2.col1 = table3.col1
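A sketch of one way to avoid the primary key violation: collapse the source rows to one per (col2, col3) pair, and skip pairs that are already in `table1`. Taking `MIN(col1)` when several rows share a pair is an arbitrary choice made for the example.

```sql
INSERT INTO table1 (col1, col2, col3)
SELECT MIN(t2.col1), t2.col2, t3.col3
FROM table2 AS t2
INNER JOIN table3 AS t3 ON t2.col1 = t3.col1
-- Skip combinations that table1 already contains.
WHERE NOT EXISTS (
    SELECT 1
    FROM table1 AS t1
    WHERE t1.col2 = t2.col2
      AND t1.col3 = t3.col3
)
-- One row per (col2, col3) combination within this insert.
GROUP BY t2.col2, t3.col3;
```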
This may sound like a weird one, but I need to create duplicates of all rows that satisfy a condition.
Using ASP, I am able to select rows from a database using a recordset, only to insert them straight back into the database, thus assigning each a new unique id.
But is there any way to perform this action using just SQL?
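This can usually be done with a single `INSERT ... SELECT` that reads from the same table it writes to. The sketch below uses a hypothetical table `Items` with an identity column `ItemID`; leaving the identity column out of both column lists lets the server assign new ids.

```sql
-- Hypothetical table: Items(ItemID identity, Name, Price, Category).
INSERT INTO Items (Name, Price, Category)
SELECT Name, Price, Category
FROM Items
WHERE Category = 'demo';   -- placeholder for the real condition
```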
Hi there, I'm trying to concat the results of a select statement into a single row separated by commas without duplicates. The query and output is something like this:
SELECT Specification FROM Cars WHERE Model = 112
Which would return something like:
Specification
------------------
Model 1
Model 2
Model 2
Model 4
What I'd like is these to be combined into a single row such as: