How To Delete Duplicate Rows Retaining Only One Of Them From Every Set Of Duplicates.
Apr 10, 2008
I have a table employee_test having the sample data. The rows with EmployeeID=6 are duplicate rows. I want to delete the duplicates retaining one row for the employeeid=6.
Note :- I don't want to use a temporary table. I want to do this using a single query or at the most in a SP query batch. Please advise.
CAN ANYBODY REPLY FOLLOWING QUESTIONS. I WANT TO DELETE DUPLICATE ROWS IN MY TABLE WITHOUT USING TRANSACTION TABLE. AND ONE MORE QUESTION HOW TO GET YESTERDAY DATE BY USING ISQL WINDOW.
declare @a1 table (id int not null identity(1,1), phone decimal(18,0), adress nvarchar(100)) insert @a1 (phone,adress)
[Code] ....
id phone address ----------- --------------------------------------- ---------------------------------------------------------------------------------------------------- 1 111 new york 2 111 new york 3 111 new york 4 222 maxico 5 222 mexico
select phone,count(phone) as say from @a1 group by phone having count(phone)>1
phone say --------------------------------------- ----------- 111 3 222 2
How can I remove duplicate phone. For example
after delete select*from @a1 select*from @a1
id phone address ----------- --------------------------------------- ---------------------------------------------------------------------------------------------------- 1 111 new york 4 222 maxico
I wrote this to show for example in my real table have 50000 rows and 1751 duplicate rows.
I have a master table and i need to import the rows into the parent and child table.
Master table name is Flatfile_Inventory Parent Table name is INVENTORY Child Tables name are INVENTORY_AMOUNT,INVENTORY_DETAILS,INVENTORY_VEHICLE, Error details will be goes to LOG_INVENTORY_ERROR
I have 4 duplicate rows in the Flatfile_Inventory which i have already inserted in the Parent and child table.
Again when i run the query using stored procedure,its tells that all the 4 rows are duplicate and will move to the Log_Inventory_Error.
I need is if i have the duplicate rows in the flatfile_Inventory when i start inserting into the parent and child table the already inserted row have the unique ID i must identify it and delete that row in the both parent and chlid table.And latest row must get inserted into the Parent and child table from Flatfile_Inventory.
Hi All, I am writing a SP where I need to pass an value to maintain records of last n days. In this SP I am deleting a couple of tables based on the value passed to this SP. For e.g. If the SP is passed the value 10, then only TOP 10 records is maintained, the rest are deleted. I have formed the following logic, which I feel can be improved vastly. I create a temp table and
CREATE TABLE #TempAuditTbl (Rownum int PRIMARY KEY, Orderid uniqueidentifier)
INSERT INTO #TempAuditTbl
SELECT ROW_NUMBER() OVER (ORDER BY orderdate desc) AS rownum, Orderid FROM Orders
DELETE Orders FROM Orders INNER JOIN #TempAuditTbl adt ON adt.Orderid = Orders.Orderid AND rownum > @TopnRows
DROP TABLE #TempAuditTbl
OR
DELETE FROM Orders WHERE orderid NOT IN ( SELECT TOP @TopnRows OrderID FROM Orders ORDER BY OrderDate desc)
This way I am able to keep the top n records. Which of these two solutions is more efficient? Is there a more efficient way to achieve the same. Please help.
I have a table with 22 million Business records. I can see that there are duplicates when I group by BusinessName and Address and Phone. I'd like to place only the duplicates into a table, with a ranking, oldest business key gets a ranking of 1.
As a bonus I'd like each group to have a distinct group name (although not necessary, just want to know how to do this)
Later after I run more verifications to make sure these are not referenced elsewhere I'll delete everything with a matchRank > 1 out of the main Business table.
DROP TABLE [dbo].[TestBusiness]; GO CREATE TABLE [dbo].[TestBusiness]( [Business_pk] INT IDENTITY(1,1) NOT NULL, [BusinessName] VARCHAR (200) NOT NULL, [Address] VARCHAR(MAX) NOT NULL,
This probably has been addressed before but I was unable to get the search to work properly on this site. I am needing a script/way of deleting all rows from a DB with the exception of one record left for each row that has duplicate column data. Example : Row 1 Field1 = 12345 Field2 =xxxxx Field 3=yyyyy Field4=zzzzz etc. Row 2 Field1 = 12345 Field2 =zzzzzz Field 3=xxxxxx Field4=yyyyyy etc. Row3 Field1 = 12345 Field2 =20202 Field 3=11111 Field4=zzzzz etc. Row 4 Field1 = 54321 Field2 =xxxxx Field 3=yyyyy Field4=zzzzz etc. Etc. Etc.
I want to be able to find the duplicates for Field1 and then delete all but 1 of those rows.( I don't care which one I keep just so only one is left.) The data in the other fields may or may not be unique.
I know how to find the duplicates it's just the deleting part I am having problems with. Any help would be much appreciated. Thanks,
;WITH ctePreAgg AS ( select top 500 act_reference "ActivityRef", row_number() over (partition by act_reference order by act_reference) as rowno, t3.s_initials "Initials" from mytablestuff order by act_reference
[code]...
But what I would love to do next is take each of the above rows - and return the initials either in one column with all the nulls and duplicate values removed, separated by a comma ..
OR the above but using variable number of columns based on the maximum number of different initials for each row.this is not strictly required, but maybe neater for further work on the view
I am having problems with the "Duplicate key was ignored" warning message. The problem is that the message seems to happen randomly and cannot be reproduced. If i take the same set of data and run the stored procedure that causes the problem i don't get the warning message a second or subsequent time. Also all the SELECT statements have criteria set to remove duplicates before they are inserted into the tables.
I have a data feed that pulls data from a DB2 database to a SQL Server 2008 staging table as a flattened set of records. A stored procedure in SQL Server is run to load the data into the destination tables. The data feed is run hourly for new and updated records in DB2 Monday-Friday 09:00-17:00 and then there is a midnight run of all the records going back for the last 12 months.
The data feed was originally sent from DB2 as a CSV file and pulled into SQL Server using SSIS but is now an Informatica workflow that pulls the data directly from DB2.
It is the Informatica workflow that is returning the "duplicate key was ignored" warning message and this stops the workflow. The workflow is restarted and the data is always loaded the second time without the warning message. The warning does not happen every time the workflow is run - it can run for a number of days with no warnings and then one will come through I can see in Profiler that it is SQL Server that returns the Duplicate key was ignored warning message so it is not an issue with Informatica.
I cannot reproduce the problem to get to the root cause of the issue. I would expect that if i run the same set of data through the stored procedure i would get the warning message every time, but this is not the case. Even when i step through the stored procedure i do not get the message. As the midnight data feed returns the records from the last 12 months, so by definition would include duplicates, the warning message only appears randomly and is not consistent.
Some guy posted that the syntax: delete top 1 from some_table works for deleting duplicates. I am pretty sure that doesnt work but I wanted to check just in case it did because it would be a really easy to delete duplicates.
Hi i need a query to check a table and if any duplicates of the column called "MessageID" and if there are any duplicates then delete them leaving just the one unique MessageID
so i have
MessageID, Number, Text.
12,33333333333,hello 12,33333333333,hello - Delete this one 12,33333333333,hello - Delete this one 14,55555555555,new
I'm having trouble figuring out how to delete some _almost_ duplicaterecords in a look-up table. Here's the table:CREATE TABLE [user_fields] ([fKEY] [char] (16) NOT NULL ,[SEQUENCE] [char] (2) NOT NULL ,[FIELD_LABEL] [varchar] (20) NULL ,[FIELD_VALUE] [varchar] (50) NULL ,[EXPORT_DATE] [datetime] NULL ,CONSTRAINT [PK_user_fields] PRIMARY KEY CLUSTERED([fKEY],[SEQUENCE])CONSTRAINT [FK_USRFLD_INV_DOCID] FOREIGN KEY([fKEY]) REFERENCES [OTHER_TABLE] ([PKEY]))Some values:fKEY SEQUENCE FIELD_LABEL FIELD_VALUE----------------------------------------------------------8525645200692B8919Co. ID #8525645200692B8920Co. ID #8525645200692B8921Co. ID #8525645200692B8913Co/Div/Dept8525645200692B8914Co/Div/Dept8525645200692B8915Co/Div/Dept8525645200692B8916Division8525645200692B8917Division8525645200692B8918Division8525645200692B8910Group8525645200692B8911Group8525645200692B8912Group8525645200692B891 HR ContactJOHN NOVAK8525645200692B892 HR ContactJOHN NOVAK8525645200692B893 HR ContactJOHN NOVAK8525645200692B8924Job Location8525645200692B8922Job Location8525645200692B8923Job Location8525645200692B894 Manager8525645200692B895 Manager8525645200692B896 Manager8525645200692B897 Recruiter8525645200692B898 Recruiter8525645200692B899 Recruiter85256D740081C3A413Co. ID #85256D740081C3A414Co. ID #85256D740081C3A410Co/Div/Dept85256D740081C3A49 Co/Div/Dept85256D740081C3A411Division85256D740081C3A412Division85256D740081C3A48 Group85256D740081C3A47 Group85256D740081C3A42 HR ContactDiana Tarry85256D740081C3A41 HR ContactDiana Tarry85256D740081C3A415Job Location85256D740081C3A416Job Location85256D740081C3A43 Manager85256D740081C3A44 Manager85256D740081C3A45 Recruiter85256D740081C3A46 RecruiterNote that fKEY 8525645200692B89 has three of every FIELD_LABEL, andfKEY 85256D740081C3A4 has two. Both, however, should have only one.Unfortunately, when I do a slect ... having count(*) > 1, I have nearly900 different fKEYs with some variation of this problem.It's just not coming to me how to delete the duplicates (except forsequence). I don't care which of the sequence values I keep but as amatter of preference I tried to do something using max(sequence) but,so far, everything I've tried deletes all records for any given fKEY.Help?Thanks.Randy
There are many duplicate records on my data table because users constantly register under two accounts. I have a query that identify the records that have a duplicate, but it only shows one of the two records, and I need to show the two records so that I can reconcile the differences.The query is taken from a post on stack overflow. It gives me 196, but I need to see the 392 records.
How to identify the duplicates and show the tow records without having to hard code any values, so I can use the query in a report, and anytime there are new duplicates, the report shows them.
Hi everybody I need help on finding duplicates and deleting the duplicate record depending on name and fname , deleting the duplicates and leaving only the first one.
my PERSON table is this below:
ID name fname ownerid id2
1 a b 2 c c 3 e f 4 a b 1 10 5 c c 2 11
I have this query below that returns records 1 and 4 and 2 and 5 since they have the same name and fname
select * from ( Select name ,fname, count(1) as cnt from PERSON group by name,Fname ) where cnt > 1
ID name fname ownerid id2
1 a b 4 a b 1 10
2 c c 5 c c 2 11
With this result I need to delete the second record of each group but update the first records with the ownerid and id2 of the second record that would be deleted... I don't know how to proceed with this..
Hi.I have a "union" table which results of a union of two tables.Occasionally I could have duplicates, when the same PIN has been addedto both tables, albeit at different Datees/Times, such as:PINNameAdded Date100411A7/11/2007 10:12:58 AM100411A7/17/2007 10:54:23 AM100413B7/11/2007 10:13:28 AM100413B7/17/2007 10:54:39 AM104229C7/6/2007 2:34:13 PM104231D7/6/2007 2:34:25 PM104869E6/10/2007 11:59:12 AM104869E6/22/2007 2:40:18 PMThe question is - how can I delete by queries the first occurence(time-wise) of these duplicates - i.e. I would want to delete thefirst occurence of 100411 (A), the first occurence of 100413 (B), andthe first occurence of 104869 (E) in the example above - records C andD show only once, so they are fine.Is there a MsAccess solution ? Is there a SQL-server solution ?Thank you very much !Alex
I have a DELETE statement that deletes duplicate data from a table. Ittakes a long time to execute, so I thought I'd seek advice here. Thestructure of the table is little funny. The following is NOT the table,but the representation of the data in the table:+-----------+| a | b |+-----+-----+| 123 | 234 || 345 | 456 || 123 | 123 |+-----+-----+As you can see, the data is tabular. This is how it is stored in the table:+-----+-----------+------------+| Row | FieldName | FieldValue |+-----+-----------+------------+| 1 | a | 123 || 1 | b | 234 || 2 | a | 345 || 2 | b | 456 || 3 | a | 123 || 3 | b | 234 |+-----+-----------+------------+What I need is to delete all records having the same "Row" when there existsthe same set of records with a different (smaller, to be precise) "Row".Using the example above, what I need to get is:+-----+-----------+------------+| Row | FieldName | FieldValue |+-----+-----------+------------+| 1 | a | 123 || 1 | b | 234 || 2 | a | 345 || 2 | b | 456 |+-----+-----------+------------+A slow way of doing this seem to be:DELETE FROM XWHERE Row IN(SELECT DISTINCT Row FROM X x1WHERE EXISTS(SELECT * FROM X x2WHERE x2.Row < x1.RowAND NOT EXISTS(SELECT * FROM X x3WHERE x3.Row = x2.RowAND x3.FieldName = x2.FieldNameAND x3.FieldValue <> x1.FieldValue)))Can this be done faster, better, and cheaper?
Auto_ID Account_ID Account_Name Account_Contact Priority 1 3453463 Tire Co Doug 1 2 4363763 Computers Inc Sam 1 3 7857433 Safety First Heather 1 4 2326743 Car Dept Clark 1 5 2342567 Sales Force Amy 1 6 4363763 Computers Inc Jamie 2 7 2326743 Car Dept Jenn 2
I'm trying to delete all duplicate Account_IDs, but only for the highest priority (in this case it would be the lowest number).
I know the following would delete duplicate Account_IDs:
DELETE FROM staging_account WHERE auto_id NOT IN (SELECT MAX(auto_id) FROM staging_account GROUP BY account_id)
The problem is this doesn't take into account the priority; in the above example I would want to keep auto_ids 2 and 4 because they have a higher priority (1) than auto_ids 6 and 7 (priority 2).
How can I take priority into account and still remove duplicates in this scenario?
Dear Gurus,I have table with following entriesTable name = CustomerName Weight------------ -----------Sanjeev 85Sanjeev 75Rajeev 80Rajeev 45Sandy 35Sandy 30Harry 15Harry 45I need a output as followName Weight------------ -----------Sanjeev 85Rajeev 80Sandy 30Harry 45ORName Weight------------ -----------Sanjeev 75Rajeev 45Sandy 35Harry 15i.e. only distinct Name should display with only one value of Weight.I tried with 'group by' on Name column but it shows me all rows.Could anyone help me for above.Thanking in Advance.RegardsSanjeevJoin Bytes!
How do I only select rows with duplicate dates for each person (id)? (The actual table has approximately 13000 rows with approximately 3000 unique ids)
I'm using SqlDataSource and an Access database. Let's say I got two tables:user: userID, usernamemessage: userID, messagetextLet's say a user can register on my website, and leave several messages there. I have an admin page where I can select a user and delete all of his messages just by clicking one button.What would be the best (and easiest) way to make this?Here's my suggestion:I have made a "delete query" (with userID as parameter) in MS Access. It deletes all messages of a user when I type in the userID and click ok.Would it be possible to do this on my ASP.net page? If yes, what would the script look like?(yes, it is a newbie question)