Transact SQL :: Remove Duplicate Records But Keep At Least 1 Record
Dec 3, 2010
I have a table that "Geography" that has the following columns: city, state, zip
There are tons of duplicate cities in this table. I ran this query and it shows me the number of occurrences of each city. I want to delete all the duplicates except for 1. I don't want to do this manually as there are a lot of records.
What would the SQL look like to delete the duplicate records but keep at least one?
I have a requirement where i want to delete the records based on the Date column. I have table which contain the columns like machinename ,lasthardwarescandate
I want to delete the records based on the max(Lasthardwarescandate) i.e. latest one, column where the machine name is duplicate menace it repeats. So how would i remove the duplicate machine names based on the Lasthardwarescandate column(There are multiple entries for the Lasthardwarescandate so i want to fetch the latest date column).
Note: Duplication should be removed based on “Last Hardware Scan” date.
Only latest date should be considered from multiple records for the same system. "
EDITION PRODUCT INSERTDATE ---------- ------------ ---------------------- CNE TN-Town News 12/19/2007 12:00:00 AM TN TN-Town News 12/19/2007 12:00:00 AM
What i have to do is if there are multiple records for one product in any day, then i need to remove all those records. In this case i am getting two records for the PRODUCT 'TN-Town News' and for INSERTDATE = 12/19/2007 . So i need to remove these two records from the table.
I have been given a task to locate duplicate and report duplicate records and am trying to determine the best way to do this with databases that have 1 million records plus.
Say I have a table with 20 columns, I need to check to see if 3 of 10 specific columns match.
So if 2 columns are the same its no problem however if 3 or more match, they are considered duplicate.
How do I update a record that has duplicates. For example, I have 3612 orders some of these orders have multiple orderid's I want to update the record for each of these orders that was added most recently.
I have student table where duplicate student exist by name with there fathers name and mothers name. I need to search those duplicate records. I do not need ti count them but If there is 5 same student with name then the query will show 5 name then I will delete individually. Below I am trying to show the scenario.
Student_name _____________ Rocky Albert Rocky Williams Albert Robert
The query will show
Student_name ______________ Rocky Rocky Albert Albert
My query wants to insert new supplier if there is any. And it should ignore, if the supplier is already present in the table. But it is trying to insert the supplier which is already available. For example, I have PART A with 2 suppliers ABC and DEF. I am getting data from third party for PART A with supplier DEF. As per the condition, it should ignore the record because DEF is already available . But my query is trying to insert supplier DEF and following that, I am getting primary constraint error.
-- Inserting new preferred supplier into R5CATALOGUE
DECLARE @DATEPROCESS DATETIME; SET @DATEPROCESS = CAST(DATEADD(D, -((DATEPART(WEEKDAY, GETDATE()) + 1 + @@DATEFIRST) % 7), GETDATE()) AS DATE) INSERT INTO R5CATALOGUE(CAT_PART, CAT_SUPPLIER,CAT_GROSS,CAT_LEADTIME,CAT_PURUOM,CAT_REF,CAT_MULTIPLY,CAT_CURR,CAT_SUPPLIER_ORG, CAT_PART_ORG,CAT_DESC,CAT_MINORDQTY)
I'm trying to pull all records from one table and just a single record from another. I have this join, (see below). It works ok, but the problem is if a blog record doesn't have a corresponding image record it doesn't return. The end result should be the blog record and a single corresponding image record. But always a blog record.
The query repeats the Header row value for all children associated with the header.I need the output of the query in XML format such that..For every Header element in the XML, all its children should come under that header element//I am using -
SELECT Cols FROM Table Names FOR XML PATH ('Header'), root('root') , ELEMENTS XSINIL
This still repeats the header for each detail (in the XML) , but I need all children for a header under it.I basically want my output in this format -
I have a client who needs to copy an existing sale. The problem isthe Sale is made up of three tables: Sale, SaleEquipment, SaleParts.Each sale can have multiple pieces of equipment with correspondingparts, or parts without equipment. My problem in copying is when I goto copy the parts, how do I get the NEW sale equipment ids updatedcorrectly on their corresponding parts?I can provide more information if necessary.Thank you!!Maria
I want to remove duplicate records from my table based on nic number. I try to put primray key constraint. But there are many many duplicates so cannot do it can I have a query to remove duplicates..
I am a newb at ms sql and was hoping someone could help me eliminate duplicate PRODUCT.PRODUCT from this statement. I have tried using DISTINCT with the same results.The ProductImage table is causing this because the duplicates are from the PRODUCT.PRODUCT that have more than 1 image.
If anyone could rewrite this statement so I can learn from this, it would be most appreciated!
I have a query that for one reason or another produces duplicate information in the result set. I have tried using DISTINCT and GROUP BY to remove the duplicates but because of the nature of the data I cannot get this to work, here is an example fo the data I am working with
ID Name Add1 Add2 1 Matt 16 Nowhere St Glasgow 1 Matt 16 Nowhere St Glasgow, Scotland 2 Jim 23 Blue St G65 TX 3 Bill 45 Red St 3 Bill 45 red St London
The problem is that a user can have one or more addresses!! I would like to be able to remove the duplicates by keeping the first duplicate ID that appears and getting rid of the second one. Any ideas?
Below is a raw data with SQL script to create the temp table. This table is a scale down version of what I am working with. I am taking the addresses from several different sources and putting them in a temp table. Now, I need to remove the duplicate addresses for each ID. Duplicate addresses are okay for different IDs.
Raw Data IDAddr1 CityStateZip 26205 N Main STChicagoIL52147 26205 N Main STChicagoIL52147 2685 Park AveChicagoIL52147 3535 Main StAustinTX78715 35976 Ponco StDallasTX79757 359587 MopacAustinTX78715 4558741 Len LnDaytonFL74717 455518 Spring DrDaytonFL74717 45585 Park AveChicagoIL52147
Desired Result IDAddr1 CityStateZip 26205 N Main STChicagoIL52147 2685 Park AveChicagoIL52147 3535 Main StAustinTX78715 35976 Ponco StDallasTX79757 359587 MopacAustinTX78715 4558741 Len LnDaytonFL74717 455518 Spring DrDaytonFL74717 45585 Park AveChicagoIL52147
Hi everyone.How can I get the unique row from a table which contains multiple rowsthat have exactly the same values.example:create table test (c1 as smallint,c2 as smallint,c3 as smallint )insert into test values (1,2,3)insert into test values (1,2,3)i want to remove whichever of the rows but I want to retain a singlerow.TIADiego
I've got the following table data:116525.99116520.14129965.03129960.12129967.00And I need to write a query to return only rows 2 and 4, since theremaining rows have duplicate IDs. I've tried the Group By, but amhaving no luck.Thanks!
I have a table containing over 100,000 email addresses. This email table gets duplicates in it, and our customers don't want a second (or third or fourth) copy of our news letter. To prevent this, we run the following SQL to kill the duplicates:
Code Snippet
DELETE FROM _email WHERE _email.eid IN ( SELECT tbl1.eid FROM _email AS tbl1 WHERE Exists ( SELECT emailaddress, Count(eid) FROM _email WHERE _email.emailaddress = tbl1.emailaddress GROUP BY _email.emailaddress HAVING Count(_email.eid) > 1 ) ) AND _email.eid NOT IN ( SELECT Min(eid) FROM _email AS tbl1 WHERE Exists ( SELECT emailaddress, Count(eid) FROM _email WHERE _email.emailaddress = tbl1.emailaddress GROUP BY _email.emailaddress HAVING Count(_email.eid) > 1 ) GROUP BY emailaddress ); This query takes about 2hrs to run which is really hurting our server preformance. Is there any way to do this faster?
I have a CSV file which contains some duplicate record and i have to load this file in SQL server database using SSIS package .
What i have to do is read the file and if the same record entry is occur more than 10 times for a particular unique combination ( like ID , Date , Time ) then i need to take only one record for that occurance.
I am working SQL Server 2005 and One Table Which contain only one column without primary keyNow I want to remove all duplicate value from that table with only single query
I have a table with one column, and i want to remove those records from the table which are duplicate i meant if i have a records rakesh in table two time then one records should be remove... my tables is like that
I found some duplicate data as I was going thru the logic of a data pump. The entire row is not duplicated however.I would like to delete only the one row.
This is a sample of the data: DECLARE @SomeData TABLE ( FirstName varchar(25) , MiddleName varchar(25) , LastName varchar(25) , StreetAddress varchar(25) , Suite varchar(25) , City varchar(25) , [State] varchar(25) , PostalCode varchar(10)
[code]...
As you can see, Joe Smith has two rows, but only one of the rows is complete. I would like to delete only the row that has a NULL value in the phone and area code for Joe Smith. There are a few thousand rows that are like this. They have duplicates all but the area code and phone number.I am used to using a CTE to remove duplicates, but I am a little lost on this one. The things that I have tried, have not worked exactly as I planned.
I have a database being populated by hits to a program on a server.The problem is each client connection may require a few hits in a 1-2second time frame. This is resulting in multiple database entries -all exactly the same, except the event_id field, which isauto-numbered.I need a way to query the record w/out duplicates. That is, anyrecords exactly the same except event_id should only return one record.Is this possible??Thank you,Barry
There's some SQL below (T-SQL) & I'm wanting to have this result set grouped by Venue_ID in order to remove rows where there are duplicate values contained in just one column.
The columns BCOM_ID contain unique values, but Venue_ID can have duplicate values. I only want to get rows for one instance of the Venue_ID (per BCOM_ID) - doesn't matter which instance but basically, no duplicates.
Oh yes, one of the columns is a Bit column.
Any ideas would be welcome & appreciated!
Many thanks, Darren darren@darrenbrook.fsnet.co.uk
SQL:-
SELECT Booking_Header.BH_ID, Booking_Header.Booking_Header_Description, Booking_Header.BStat_ID, Booking_Header.BT_ID, Booking_Header.Tagged, Booking_Header.Status_Timestamp, Booking_Header.Start_Date, Booking_Header.Days_Qty, Proposal.PPL_ID, Proposal.PPL_Status, Booking_Component.BCOM_ID, Booking_Component.Component_Description, Booking_Component.Venue_ID, Venue.Venue_Code, Venue.Description, Address.Address_ID, Address.Town, Booking_Status.BStat_Description, Booking_Type.Type_Description FROM dbo.Booking_Header INNER JOIN dbo.Proposal ON dbo.Booking_Header.BH_ID = dbo.Proposal.BH_ID INNER JOIN dbo.Booking_Component ON dbo.Proposal.PPL_ID = dbo.Booking_Component.PPL_ID INNER JOIN dbo.Venue ON dbo.Booking_Component.Venue_ID = dbo.Venue.VE_ID INNER JOIN dbo.Address ON dbo.Venue.VE_ID = dbo.Address.VE_ID INNER JOIN dbo.Booking_Status ON dbo.Booking_Header.BStat_ID = dbo.Booking_Status.BStat_ID INNER JOIN dbo.Booking_Type ON dbo.Booking_Header.BT_ID = dbo.Booking_Type.BT_ID WHERE (dbo.Proposal.PPL_Status = 1) AND (dbo.Booking_Header.BH_ID = 10)
I have 4 tables (SqlServer2000/2005). In the select query, I have FULL JOINED all the four tables A,B,C,D as I want all the data. The result is as sorted by DDATE desc:- AID BID BNAME DDATE DAUTHOR 1 1 abcxyz 2008-01-20 23:42:21.610 c@d.com 1 1 abcxyz 2008-01-20 23:41:52.970 a@b.com 1 2 xyzabc 2008-01-21 00:17:14.360 c@d.com 1 2 xyzabc 2008-01-20 23:43:17.110 a@b.com 1 2 xyzabc 2008-01-20 23:42:43.937 a@b.com 1 2 xyzabc NULL NULL 2 3 pqrlmn NULL NULL 2 4 cdefgh NULL NULL Now, I want unique rows from the above result set like :- AID BID BNAME DDATE DAUTHOR 1 1 abcxyz 2008-01-20 23:42:21.610 c@d.com 1 2 xyzabc 2008-01-21 00:17:14.360 c@d.com 2 3 pqrlmn NULL NULL 2 4 cdefgh NULL NULL I want to remove the duplicate rows and show only the unique rows but contains all the data from the first table A. I have to bind this result set to a nested GridView.
I have a dataset in my report that pulls 10 fields. One of the tablixes on report should display only 8 of the 10 fields and this is causing the duplicate records to show up on the tablix.I tried hide duplicates option for the entire details row and have set the scope to "Details". It did not work. I am getting the data from a stored procedure and cannot do much on that.