I need to remove duplicated rows from a table that has text/ntext/image columns. The table does not have a PK or unique column. (I accept it's a bad data model.) Changing the data model is not possible right now, so I am making the changes in the application.
I couldn't use 'SELECT DISTINCT * FROM table' because the table has text columns. Although there is no PK constraint, I know that col1 and col2 together act as the key of the table. Is it possible to select the distinct rows from such a table?
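One possible approach, sketched here with Python's built-in sqlite3 module standing in for SQL Server: number the rows within each (col1, col2) group and keep only the first. The table name `docs` and the column `body` (standing in for the text column) are assumptions, not from the post. ROW_NUMBER() needs SQL Server 2005 or later (SQLite 3.25 or later for this sketch), and if rows with the same key carry different text, which one survives is arbitrary.

```python
import sqlite3

# Hypothetical table: col1 + col2 are the logical key, body stands in for a text/ntext column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (col1 INTEGER, col2 INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 1, "alpha"), (1, 1, "alpha"), (1, 2, "beta"), (2, 1, "gamma"), (2, 1, "gamma")],
)

# Number the rows inside each (col1, col2) group and keep row 1 of each group.
# The large column is selected but never compared, so DISTINCT is not needed.
distinct_rows = conn.execute(
    """
    SELECT col1, col2, body
    FROM (
        SELECT col1, col2, body,
               ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col1) AS rn
        FROM docs
    ) AS numbered
    WHERE rn = 1
    ORDER BY col1, col2
    """
).fetchall()
```

The inner SELECT should carry over to T-SQL largely unchanged, since the text column only appears in the select list.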
I wanted to remove duplicate records from an SSRS report, so I set "Hide Duplicates" to True. That works, but I now get blank space between the records, which I want to get rid of. How can I remove the extra space between two records? (Please find the details below.)
> This is a common problem with some solution

/*************************************************************************
 * Problem:
 * Determine the duplicated records in a table using a single SELECT.
 *
 * We shall be using the Northwind database, with some duplicate records added.
 *
 * Here we want to know if 2 columns (CompanyName, Phone) are duplicated in a table.
 *
 * ShipperID   CompanyName        Phone
 * ----------- ------------------ --------------
 * 1           Speedy Express     (503) 555-9831
 * 2           United Package     (503) 555-3199
 * 3           Federal Shipping   (503) 555-9931
 * 4           Federal Shipping   (503) 555-9931
 * 5           Speedy Express     (503) 555-9831
 * 6           Federal Shipping   (503) 555-9931
 *************************************************************************/

==================================================
SOLUTION 1: Gives me the IDs that are duplicated.
==================================================

SELECT
    ShipperID, CompanyName, Phone
FROM
    SHIPPERS
WHERE
    EXISTS (
        SELECT NULL
        FROM SHIPPERS b
        WHERE b.CompanyName = SHIPPERS.CompanyName
          AND b.Phone = SHIPPERS.Phone
        GROUP BY b.CompanyName, b.Phone
        HAVING SHIPPERS.ShipperID < MAX(b.ShipperID)
    )

/* Output results */

ShipperID   CompanyName        Phone
----------- ------------------ --------------
1           Speedy Express     (503) 555-9831
3           Federal Shipping   (503) 555-9931
4           Federal Shipping   (503) 555-9931

(3 row(s) affected)

==================================================
SOLUTION 2: Gives me the data which are duplicate but not the IDs
==================================================

SELECT
    CompanyName, Phone
FROM
    SHIPPERS
GROUP BY
    CompanyName, Phone
HAVING
    COUNT(*) > 1

/* Output results */

CompanyName        Phone
------------------ --------------
Speedy Express     (503) 555-9831
Federal Shipping   (503) 555-9931

(2 row(s) affected)
I have the query below, which joins these two tables and shows all the records.

SELECT DISTINCT B.BF_ORGN_CD, B.LEV5, A.BF_ACTY_CD
FROM BF_ORGN A
INNER JOIN BF_ORGN_CNSL_TBL B ON A.CD = B.BF_ORGN_CD
WHERE A.BF_ACTY_CD IS NOT NULL
ORDER BY B.BF_ORGN_CD, A.BF_ACTY_CD
My goal is only to show all the duplicate records.
BF_ORGN_CD   LEV5         BF_ACTY_CD
AC_21234_2   AC_21200_1   402
AC_21236_2   AC_21200_1   402
AC_21238_2   AC_21200_1   402
AC_29000_1   AC_29000_1   802   ---> DO NOT SHOW (ONLY 1 RECORD)
AC_29988_1   AC_29988_1   801   ---> DO NOT SHOW (ONLY 1 RECORD)
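A rough sketch of one way to get only the duplicates, using Python's built-in sqlite3 as a stand-in for SQL Server. It assumes that "duplicate" means a BF_ACTY_CD shared by more than one BF_ORGN_CD, which is my reading of the sample rows rather than something the post states. The column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BF_ORGN (CD TEXT, BF_ACTY_CD INTEGER);
CREATE TABLE BF_ORGN_CNSL_TBL (BF_ORGN_CD TEXT, LEV5 TEXT);
INSERT INTO BF_ORGN VALUES
    ('AC_21234_2', 402), ('AC_21236_2', 402), ('AC_21238_2', 402),
    ('AC_29000_1', 802), ('AC_29988_1', 801);
INSERT INTO BF_ORGN_CNSL_TBL VALUES
    ('AC_21234_2', 'AC_21200_1'), ('AC_21236_2', 'AC_21200_1'),
    ('AC_21238_2', 'AC_21200_1'), ('AC_29000_1', 'AC_29000_1'),
    ('AC_29988_1', 'AC_29988_1');
""")

# Same join as the original query, restricted to activity codes
# that occur for more than one organisation code.
duplicate_rows = conn.execute("""
    SELECT DISTINCT B.BF_ORGN_CD, B.LEV5, A.BF_ACTY_CD
    FROM BF_ORGN A
    INNER JOIN BF_ORGN_CNSL_TBL B ON A.CD = B.BF_ORGN_CD
    WHERE A.BF_ACTY_CD IN (
        SELECT A2.BF_ACTY_CD
        FROM BF_ORGN A2
        INNER JOIN BF_ORGN_CNSL_TBL B2 ON A2.CD = B2.BF_ORGN_CD
        WHERE A2.BF_ACTY_CD IS NOT NULL
        GROUP BY A2.BF_ACTY_CD
        HAVING COUNT(DISTINCT B2.BF_ORGN_CD) > 1
    )
    ORDER BY B.BF_ORGN_CD, A.BF_ACTY_CD
""").fetchall()
```

If LEV5 should also be part of the match, it could be added to the GROUP BY and compared in the outer query as well.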
I want the output like this; it should take only the minimum date created:

FormCode   RefCode   Serialno    DateTime
R1-196     H1-68     A1223213    8/6/2007 19:38:14
R1-205     H1-67     XS2312414   8/6/2007 19:36:08
R1-220     H1-66     F43336534   8/6/2007 19:30:27
R1-400     H1-64     ER5343664   8/6/2007 19:24:23
R1-408     H1-65     TE462626    8/6/2007 19:24:23
1 7530568 87143 OESCHD 1/5/2006 6:31:58 AM
1 7530568 87143 OESCHD 1/5/2006 7:02:36 AM

For each order number (7530568 here) there should be only one OESCHD status.
This is the query I'm using to insert the data sent to me.
INSERT INTO ORDER_EVENTS
SELECT d.division as division,
       dt.orderNum as orderNum,
       dt.poNum as poNum,
       dt.statusCode as statusCode,
       dt.statusChangeDate as statusChangeDate
FROM dt_Order_Events dt
INNER JOIN division d ON dt.division = d.divisionShort
INNER JOIN status s ON s.division = d.division AND s.statCode = dt.statusCode
WHERE directive <> 'C'
  AND dt.orderNum IN (SELECT orderNum FROM ORDER_HEADER)
This works fine when used within the hourly transactional update. But when I ran it for the bulk update (so we'd have historical data), it allowed orders to have the same status too many times.
I am not a SQL guru. I have no idea how to write a SQL statement or stored proc that will remove the duplicate records, or how to change what I have to prevent further ones.
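A minimal sketch of one way to stop the duplicates at insert time, using Python's built-in sqlite3 as a stand-in for SQL Server. It collapses the staging rows to one per order and status (keeping the earliest statusChangeDate, which is an assumption about which row should win) and skips combinations already present in ORDER_EVENTS. The joins to division and status and the directive filter are left out to keep it short, and the OESHIP row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dt_Order_Events (division INTEGER, orderNum INTEGER, poNum INTEGER,
                              statusCode TEXT, statusChangeDate TEXT);
CREATE TABLE ORDER_EVENTS (division INTEGER, orderNum INTEGER, poNum INTEGER,
                           statusCode TEXT, statusChangeDate TEXT);
-- The OESHIP row is hypothetical; the two OESCHD rows mirror the post.
INSERT INTO dt_Order_Events VALUES
    (1, 7530568, 87143, 'OESCHD', '2006-01-05 06:31:58'),
    (1, 7530568, 87143, 'OESCHD', '2006-01-05 07:02:36'),
    (1, 7530568, 87143, 'OESHIP', '2006-01-06 09:00:00');
""")

INSERT_ONCE = """
INSERT INTO ORDER_EVENTS (division, orderNum, poNum, statusCode, statusChangeDate)
SELECT dt.division, dt.orderNum, dt.poNum, dt.statusCode, MIN(dt.statusChangeDate)
FROM dt_Order_Events dt
WHERE NOT EXISTS (
    SELECT 1 FROM ORDER_EVENTS oe
    WHERE oe.orderNum = dt.orderNum AND oe.statusCode = dt.statusCode
)
GROUP BY dt.division, dt.orderNum, dt.poNum, dt.statusCode
"""

# Running the load twice simulates the bulk update being replayed.
conn.execute(INSERT_ONCE)
conn.execute(INSERT_ONCE)
events = conn.execute("SELECT * FROM ORDER_EVENTS ORDER BY statusCode").fetchall()
```

A unique constraint on (orderNum, statusCode) would additionally make the database reject any duplicate that slips through.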
Hey guys, I have a table full of data that has duplicate records except for two date columns (date1 and date2). What I would like to do is remove the duplicates while retaining the most recent record, how can I do this?
So record 1 looks like this:
Code:
John | Smith | 08/08/2000 | 10/10/2000
Record 2 looks like this:
Code:
John | Smith | 08/10/2005 | 10/10/2005
I'd like to remove the first instance and keep the second (most recent one).
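Here is a small sketch of the idea, run through Python's built-in sqlite3 rather than SQL Server: delete every row for which a newer row with the same name exists. The table and column names are invented, "most recent" is assumed to mean the latest date2, and the dates are stored in ISO format (the day/month order of the originals is a guess) so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, date1 TEXT, date2 TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", [
    ("John", "Smith", "2000-08-08", "2000-10-10"),
    ("John", "Smith", "2005-08-10", "2005-10-10"),
    ("Jane", "Doe", "2003-01-01", "2003-02-01"),  # hypothetical row with no duplicate
])

# A row is deleted when another row for the same person has a later date2.
conn.execute("""
    DELETE FROM people
    WHERE EXISTS (
        SELECT 1 FROM people AS newer
        WHERE newer.first_name = people.first_name
          AND newer.last_name = people.last_name
          AND newer.date2 > people.date2
    )
""")
remaining = conn.execute("SELECT * FROM people ORDER BY last_name").fetchall()
```

Rows that tie on date2 would all survive, so a tie-breaker column would be needed if that can happen.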
I have two tables containing customer invoices: one for the invoice headers (i.e. customer #, invoice date, ...; KEY: invoice # + invoice date) and another for the invoice details (i.e. the details of each invoice line; KEY: invoice # + line #). I need to periodically remove invoices older than a certain timeframe (e.g. all invoices older than 48 months).
I have regular work that requires me to extract a bunch of customer records from our database, and then remove duplicate address destinations (so we don't mail the same address more than once).
I can currently achieve this using a combination of my poor SQL skills and Excel, but it's really not working out for me, so I'm looking for the SQL wizardry necessary to do it in SQL alone.
Relevant fields: Member.AddressBarcode (This is a unique barcode (Text representation of a base-3 number) based on the customer address. So if there's more than one record in the pulled records with the same barcode, we then look at Member.MemberTypeID to determine whether to include this record in the results or discard it as a duplicate. Note that AddressBarcode may be blank if the mailing address couldn't be validated, if it is blank we don't discard it since there is no easy way to detect duplicate addresses without the barcode)
Member.MemberTypeID (This is the type of member account. We have 3 types - Single, Joint Primary, Joint Secondary, represented in this field by the numbers 1/2/3. This is also the order of preference of who to mail. So if there is a Joint Primary and Joint Secondary with the same mailing barcode, we want to discard the Joint Secondary from the results, so that the Joint Primary is the record we include in the results of who to mail.)
Member.ID (Unique numeric ID for each customer. Kind of irrelevant here, but it's a key)
So some pseudo code for what I'm trying to achieve is:
(Member.MemberTypeID = 1)
OR (Member.MemberTypeID = 2
    AND Member.AddressBarcode not in results of Member.MemberTypeID = 1)
OR (Member.MemberTypeID = 3
    AND Member.AddressBarcode not in results of Member.MemberTypeID = 2
    AND Member.AddressBarcode not in results of Member.MemberTypeID = 1)
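The pseudo code above might translate into something like the following sketch, shown with Python's built-in sqlite3 standing in for SQL Server. A member is kept when the barcode is blank, or when no other member at the same barcode has a better (lower) MemberTypeID. Breaking ties between two members of the same type by the lower ID is my own assumption, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Member (ID INTEGER PRIMARY KEY, AddressBarcode TEXT, MemberTypeID INTEGER)"
)
conn.executemany("INSERT INTO Member VALUES (?, ?, ?)", [
    (1, "0121", 1),  # Single
    (2, "2210", 2),  # Joint Primary
    (3, "2210", 3),  # Joint Secondary at the same address as 2: dropped
    (4, "", 3),      # blank barcode: always kept
    (5, "", 3),      # blank barcode: always kept
    (6, "1002", 3),  # Joint Secondary alone at its address: kept
    (7, "0121", 2),  # Joint Primary sharing an address with a Single: dropped
])

# Keep a member unless someone with a better type (or same type, lower ID)
# shares the same non-blank barcode.
mail_ids = [row[0] for row in conn.execute("""
    SELECT m.ID
    FROM Member m
    WHERE m.AddressBarcode IS NULL
       OR m.AddressBarcode = ''
       OR NOT EXISTS (
            SELECT 1 FROM Member better
            WHERE better.AddressBarcode = m.AddressBarcode
              AND (better.MemberTypeID < m.MemberTypeID
                   OR (better.MemberTypeID = m.MemberTypeID AND better.ID < m.ID))
       )
    ORDER BY m.ID
""")]
```

When only a subset of members is pulled, the same filter for that subset would need to be applied inside the NOT EXISTS subquery too.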
Hi all, I have to remove non-useful and duplicate records (containing NULLs, blanks, and extra spaces on the left or right side of the column values) from all the tables in my server's database XYZ. This should run weekly as a scheduled job with the help of a stored proc; I guess this is called purging the DB. Please help me with how I can do it in T-SQL.
I also have to find and remove all the duplicate DB objects (tables) from the DB, e.g. a table named TABLE_TEST or TABLE_DEBUG that exists alongside an original table TABLE, making sure that none of the base tables is dropped.
Hello, I have a table T1 with fields ID, F1, F2, F3, F4, F5, F6, ...
I need to find out whether there are duplicated rows based on the F1, F2, F3 columns. If there are, set F5 = 'minimum' on the row where ID is MIN(ID), so that the row with the smallest ID is marked as the minimum. How can I do this in a stored procedure?
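As a sketch only, here is the UPDATE that such a procedure could wrap, run through Python's built-in sqlite3 instead of SQL Server. The sample rows are invented, and F4 and F6 are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (ID INTEGER PRIMARY KEY, F1 TEXT, F2 TEXT, F3 TEXT, F5 TEXT)")
conn.executemany("INSERT INTO T1 (ID, F1, F2, F3) VALUES (?, ?, ?, ?)", [
    (1, "a", "b", "c"),
    (2, "a", "b", "c"),
    (3, "x", "y", "z"),  # no duplicate, so it stays unmarked
    (4, "a", "b", "c"),
    (5, "q", "q", "q"),
    (6, "q", "q", "q"),
])

# For every (F1, F2, F3) group with more than one row, mark the row with the lowest ID.
conn.execute("""
    UPDATE T1
    SET F5 = 'minimum'
    WHERE ID IN (
        SELECT MIN(ID)
        FROM T1
        GROUP BY F1, F2, F3
        HAVING COUNT(*) > 1
    )
""")
marked = conn.execute("SELECT ID, F5 FROM T1 ORDER BY ID").fetchall()
```

The UPDATE statement itself uses only standard SQL, so it should work as the body of a T-SQL stored procedure.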
Hi all, I have a problem with one table. The table was corrupted, so I dropped it and recreated it (after exporting the data, of course). After that I tried to add the primary key to the new table, but it won't allow me to. The error message says that a duplicate key exists. So I opened EM and took a look at the table. There is a key on that table, but not a primary key (I also checked from Query Analyzer using sp_help). My question is: how can I find this duplicated key and delete that row? I think the information is somewhere in the system tables, but I don't know where.
I have an integer IDENTITY column that is the primary key. It had been working fine until 2 days ago. Although the column is an IDENTITY, it produced the same number three times! How is that possible?
Has anyone come across this problem? If you have, how did you fix it?
I have a table that I truncate with an execute sql task, which uses the following statement: truncate table dbo.tablename.
Afterwards I fill the table with data. Sometimes it gives an error that there are duplicated keys in the table, but it's truncated before the fill... Is there anyone who has an idea on this matter?
I have just started developing in SQL Express in the last 2 months, so I am still learning. The problem I'm having with my stored procedure is that I get duplicate rows in my results. The rows are duplicates in terms of the column 'Job No': when the query runs in Access, only one instance of each 'Job No' is returned, but when I recreate the query in SQL Server I get a number of rows back for the same 'Job No'. How would I go about getting just 1 instance of each 'Job No' back, with the column 'Days to Date' showing the total 'Days to Date' for each Job No? Please see the MS Access results if unsure of what I'm asking.
A copy of the stored procedure is below, and a sample of the output with the MS Access results is at the very bottom.
ALTER PROCEDURE [dbo].[sl_DaysDonePerJob] AS
SELECT CASE WHEN [Job No] IS NULL THEN '' ELSE [Job No] END AS [Job No],
SUM([Actual Days]) AS [Days to Date],
CONVERT(nvarchar(10),MIN(SessionDate),101) AS [Start Date],
CONVERT(nvarchar(10),MAX(SessionDate),101) AS [End Date],
MAX(CASE WHEN DATEPART(MM,SessionDate)=1 THEN 'Jan'
WHEN DATEPART(MM,SessionDate)=2 THEN 'Feb'
WHEN DATEPART(MM,SessionDate)=3 THEN 'Mar'
WHEN DATEPART(MM,SessionDate)=4 THEN 'Apr'
WHEN DATEPART(MM,SessionDate)=5 THEN 'May'
WHEN DATEPART(MM,SessionDate)=6 THEN 'Jun'
WHEN DATEPART(MM,SessionDate)=7 THEN 'Jul'
WHEN DATEPART(MM,SessionDate)=8 THEN 'Aug'
WHEN DATEPART(MM,SessionDate)=9 THEN 'Sep'
WHEN DATEPART(MM,SessionDate)=10 THEN 'Oct'
WHEN DATEPART(MM,SessionDate)=11 THEN 'Nov'
WHEN DATEPART(MM,SessionDate)=12 THEN 'Dec' END) AS 'End Month'
I have a lot of data in a table in which some rows are duplicated. How can I, for all duplicated rows, delete the extra rows and leave only one? You may assume that checking one column is enough to tell whether a row is duplicated.
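A minimal sketch, using Python's built-in sqlite3 in place of SQL Server and assuming the table also has some unique `id` column to tell the copies apart (the post does not say so; `items`, `id` and `code` are made-up names): keep the lowest id per value of the checked column and delete the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, code TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [
    (1, "A"), (2, "B"), (3, "A"), (4, "A"), (5, "C"), (6, "B"),
])

# Keep the row with the smallest id for each code; delete every other copy.
conn.execute("""
    DELETE FROM items
    WHERE id NOT IN (SELECT MIN(id) FROM items GROUP BY code)
""")
remaining = conn.execute("SELECT id, code FROM items ORDER BY id").fetchall()
```

Without any unique column, numbering the rows per group with ROW_NUMBER() and deleting those numbered above 1 is a common alternative on SQL Server 2005 and later.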
I have a unique account number for every record in the table (hence the p-key); however, I have route numbers that are duplicated in the table. How do I find all records that have a duplicated ROUTE_NUMBER?
In other words, how do I query records where the route number occurs more than one time in the table.
I would like the following results....
ACCOUNT NUMBER   ROUTE NUMBER   NAME   STATE
12345            1              WMT    NY
48734            1              CBS    TX
3945857          1              NBC    LA
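Something along these lines should return those rows; the sketch uses Python's built-in sqlite3 as a stand-in for SQL Server. The table name `accounts` and the underscored column names are assumptions, and the row with route 2 is invented to show a route that is not duplicated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        ACCOUNT_NUMBER INTEGER PRIMARY KEY,
        ROUTE_NUMBER INTEGER,
        NAME TEXT,
        STATE TEXT
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?)", [
    (12345, 1, "WMT", "NY"),
    (48734, 1, "CBS", "TX"),
    (3945857, 1, "NBC", "LA"),
    (777, 2, "ABC", "CA"),  # hypothetical account whose route number is unique
])

# The subquery lists route numbers that occur more than once;
# the outer query returns every account on one of those routes.
duplicated_routes = conn.execute("""
    SELECT ACCOUNT_NUMBER, ROUTE_NUMBER, NAME, STATE
    FROM accounts
    WHERE ROUTE_NUMBER IN (
        SELECT ROUTE_NUMBER
        FROM accounts
        GROUP BY ROUTE_NUMBER
        HAVING COUNT(*) > 1
    )
    ORDER BY ACCOUNT_NUMBER
""").fetchall()
```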
I've got trouble with my query and left outer joins. I've got 2 tables, A and B, and both have a column called ID. I used this query: SELECT * FROM A LEFT OUTER JOIN B ON A.ID = B.ID
This works fine; my trouble comes when I have a duplicated ID in both A and B. Instead of returning 2 rows, it returns 4.
Is there some way to force SQL Server to return only the first match found in B, but both of the duplicated IDs from table A?
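One way to get that result is to collapse B to a single row per ID before joining, sketched here with Python's built-in sqlite3 as a stand-in for SQL Server. The columns `name` and `val` are hypothetical, and "first" is taken to mean the smallest `val`, since rows in a table have no inherent order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (ID INTEGER, name TEXT);
CREATE TABLE B (ID INTEGER, val TEXT);
INSERT INTO A VALUES (1, 'a1'), (1, 'a2'), (2, 'a3');
INSERT INTO B VALUES (1, 'b1'), (1, 'b2');
""")

# B1 holds exactly one row per ID, so each row of A matches at most once.
joined = conn.execute("""
    SELECT A.ID, A.name, B1.val
    FROM A
    LEFT OUTER JOIN (
        SELECT ID, MIN(val) AS val
        FROM B
        GROUP BY ID
    ) AS B1 ON A.ID = B1.ID
    ORDER BY A.name
""").fetchall()
```

When several columns must come from the same B row, SQL Server's OUTER APPLY with a TOP 1 subquery is another option.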
Hi, I have a table named "std_attn" where, through some bad coding, lots of duplicated rows have been created. The table doesn't have any PK. Please tell me how to remove the duplicates.
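With no key to tell identical rows apart, one rough option is to copy the distinct rows aside, empty the table, and copy them back. The sketch below uses Python's built-in sqlite3 instead of SQL Server; the three columns are invented because the post does not list them, and the approach only works when every column can be compared by DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE std_attn (student_id INTEGER, attn_date TEXT, status TEXT);
INSERT INTO std_attn VALUES
    (1, '2008-01-07', 'P'), (1, '2008-01-07', 'P'), (1, '2008-01-07', 'P'),
    (2, '2008-01-07', 'A'), (2, '2008-01-07', 'A'),
    (1, '2008-01-08', 'P');

-- Copy one instance of every row aside, then rebuild the original table from it.
CREATE TABLE std_attn_dedup AS
    SELECT DISTINCT student_id, attn_date, status FROM std_attn;
DELETE FROM std_attn;
INSERT INTO std_attn
    SELECT student_id, attn_date, status FROM std_attn_dedup;
DROP TABLE std_attn_dedup;
""")
rows = conn.execute(
    "SELECT * FROM std_attn ORDER BY student_id, attn_date"
).fetchall()
```

In T-SQL the copy step would be written as SELECT DISTINCT ... INTO, and the whole sequence is best run inside one transaction.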
I am currently working on a problem and just can't get to my desired solution. Below is the current db view.

emp_id   skill             level   years   remarks
2541     COMPUTER          BAIK    1       <null>
2541     WORD PROCESSING   BAIK    1       <null>
2541     EXCEL             BAIK    1       <null>
2541     POWERPOINT        BAIK    1       <null>

How do I get this repeated emp_id into a view like this?

emp_id   skill                                        level                 years     remarks
2541     COMPUTER,WORD PROCESSING,EXCEL,POWERPOINT    BAIK,BAIK,BAIK,BAIK   1,1,1,1   <null>,<null>,<null>,<null>

I just can't seem to get this working. Please kindly advise. Thanks all!
ColumnA   ColumnB   ColumnC
-------------------------------
Alice     Lukas     Alice.Lucas
James     Redford   James.Redford
James     Redford   James.Redford
Michael   Jackson   Michael.Jackson
John      Brown     John.Brown
John      Brown     John.Brown
John      Brown     John.Brown
George    Gotham    George.Gotham

I want to update the duplicated values in ColumnC like this:

Alice     Lukas     Alice.Lucas
James     Redford   James.Redford
James     Redford   James.Redford1
Michael   Jackson   Michael.Jackson
John      Brown     John.Brown
John      Brown     John.Brown1
John      Brown     John.Brown2
George    Gotham    George.Gotham

How can I do it?
Thanks in advance!
Note: Table is for creating email aliases from names...
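A possible sketch of the alias numbering, run through Python's built-in sqlite3 rather than SQL Server. It assumes the table has a unique `id` column that decides which duplicate counts as first (the post does not show one, and `aliases` is a made-up table name). Each row gets the number of earlier rows with the same name appended, and rows with no earlier twin are left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aliases (id INTEGER PRIMARY KEY, ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)"
)
conn.executemany("INSERT INTO aliases (ColumnA, ColumnB, ColumnC) VALUES (?, ?, ?)", [
    ("Alice", "Lukas", "Alice.Lucas"),
    ("James", "Redford", "James.Redford"),
    ("James", "Redford", "James.Redford"),
    ("Michael", "Jackson", "Michael.Jackson"),
    ("John", "Brown", "John.Brown"),
    ("John", "Brown", "John.Brown"),
    ("John", "Brown", "John.Brown"),
    ("George", "Gotham", "George.Gotham"),
])

# Matching is done on the name columns, which the UPDATE never changes,
# so the count of earlier twins is stable while rows are being rewritten.
conn.execute("""
    UPDATE aliases
    SET ColumnC = ColumnC || (
        SELECT COUNT(*) FROM aliases AS e
        WHERE e.ColumnA = aliases.ColumnA
          AND e.ColumnB = aliases.ColumnB
          AND e.id < aliases.id
    )
    WHERE EXISTS (
        SELECT 1 FROM aliases AS e
        WHERE e.ColumnA = aliases.ColumnA
          AND e.ColumnB = aliases.ColumnB
          AND e.id < aliases.id
    )
""")
result = [row[0] for row in conn.execute("SELECT ColumnC FROM aliases ORDER BY id")]
```

In T-SQL the concatenation would use + with the count cast to varchar instead of ||. A generated alias could still clash with one that already exists, so a uniqueness check afterwards is worth considering.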
Hi all. I have the function below, which I use to retrieve the order details from two tables, order detail and product. I have many rows with the same order id in order detail, and each of them has a unique product id which links to product to display the product information. However, when I run the function, it duplicates each product's info in the combo box. May I know what's wrong with my code?
Public Sub ProductShow()
    Dim myReader As SqlCeDataReader
    Dim mySqlCommand As SqlCeCommand
    Dim myCommandBehavior As New CommandBehavior
    Try
        connLocal.Open()
        mySqlCommand = New SqlCeCommand
        mySqlCommand = connLocal.CreateCommand
        mySqlCommand.CommandText = "SELECT * FROM Product P,orders O,orderdetail OD WHERE OD.O_Id='" & [Global].O_Id & "' AND P.P_Id=OD.P_Id "
        myCommandBehavior = CommandBehavior.CloseConnection
        myReader = mySqlCommand.ExecuteReader(myCommandBehavior)
        While (myReader.Read())
            cboProductPurchased.Items.Add(myReader("P_Name").ToString())
        End While
        myReader.Close()
    Catch ex As Exception
        MsgBox(ex.ToString)
    Finally
        connLocal.Close()
    End Try
End Sub
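One likely cause, assuming the orders table holds more than one row: the query lists orders O in the FROM clause without any condition tying it to the other tables, so every matching product is repeated once per order. The sketch below reproduces that with Python's built-in sqlite3 as a stand-in for SQL Server CE; the sample rows and the product names are invented. It also passes the order id as a parameter instead of concatenating it into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (P_Id INTEGER, P_Name TEXT);
CREATE TABLE orders (O_Id INTEGER);
CREATE TABLE orderdetail (O_Id INTEGER, P_Id INTEGER);
INSERT INTO Product VALUES (1, 'Pen'), (2, 'Ink');
INSERT INTO orders VALUES (10), (11), (12);
INSERT INTO orderdetail VALUES (10, 1), (10, 2), (11, 1);
""")

def product_names(sql, order_id):
    """Run a query that takes one order id and return the product names."""
    return [row[0] for row in conn.execute(sql, (order_id,))]

# Shape of the posted query: orders is never joined, so it multiplies the rows.
UNJOINED = """
    SELECT P.P_Name
    FROM Product P, orders O, orderdetail OD
    WHERE OD.O_Id = ? AND P.P_Id = OD.P_Id
    ORDER BY P.P_Name
"""

# Same query with every table tied together by an explicit join condition.
JOINED = """
    SELECT P.P_Name
    FROM orderdetail OD
    INNER JOIN orders O ON O.O_Id = OD.O_Id
    INNER JOIN Product P ON P.P_Id = OD.P_Id
    WHERE OD.O_Id = ?
    ORDER BY P.P_Name
"""

repeated = product_names(UNJOINED, 10)
joined = product_names(JOINED, 10)
```

If nothing from orders is actually displayed, dropping that table from the query altogether would have the same effect.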
Hi,

I've got a db table containing 5 columns (excluding id) consisting of
1.) First half of a UK postcode
2.) Town name to which the postcode belongs
3.) Latitude of postcode
4.) Longitude of postcode
5.) Second part of the postcode

I want to select columns 1, 2, 3 and 4, but once only. There are often several entries where 1 and 2 are the same but 3 and 4 are different, i.e.

WA1   Bewsey and Whitecross   53.386492   -2.596847
WA1   Bewsey and Whitecross   53.388203   -2.590961
WA1   Bewsey and Whitecross   53.388875   -2.598504
WA1   Fairfield and Howley    53.388455   -2.581701
WA1   Fairfield and Howley    53.396117   -2.571789

My current query is

SELECT DISTINCT Postcode, Town, latitude, longitude
FROM Postcode
WHERE Postcode.Postcode = 'wa1'
ORDER BY Postcode, Town

However, as latitude and longitude differ on each line, DISTINCT does not do what I'm looking for.

Can anybody suggest a way of changing the query to give just the first instance of each Postcode/Town combo? I.e.

WA1   Bewsey and Whitecross   53.386492   -2.596847
WA1   Fairfield and Howley    53.388455   -2.581701

Many thanks!
Drew
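A sketch of one way to pick the first instance, using Python's built-in sqlite3 in place of SQL Server and assuming "first" means the lowest id within each Postcode/Town pair. The fifth column is left out because the post gives no values for it, and the literal is written as 'WA1' since SQLite compares text case-sensitively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Postcode (
        id INTEGER PRIMARY KEY,
        Postcode TEXT,
        Town TEXT,
        latitude REAL,
        longitude REAL
    )
""")
conn.executemany(
    "INSERT INTO Postcode (Postcode, Town, latitude, longitude) VALUES (?, ?, ?, ?)",
    [
        ("WA1", "Bewsey and Whitecross", 53.386492, -2.596847),
        ("WA1", "Bewsey and Whitecross", 53.388203, -2.590961),
        ("WA1", "Bewsey and Whitecross", 53.388875, -2.598504),
        ("WA1", "Fairfield and Howley", 53.388455, -2.581701),
        ("WA1", "Fairfield and Howley", 53.396117, -2.571789),
    ],
)

# Keep only the row whose id is the smallest within its Postcode/Town pair.
first_rows = conn.execute("""
    SELECT p.Postcode, p.Town, p.latitude, p.longitude
    FROM Postcode AS p
    WHERE p.Postcode = 'WA1'
      AND p.id = (
            SELECT MIN(p2.id)
            FROM Postcode AS p2
            WHERE p2.Postcode = p.Postcode AND p2.Town = p.Town
      )
    ORDER BY p.Postcode, p.Town
""").fetchall()
```

If any representative point would do, grouping by Postcode and Town and taking an aggregate of latitude and longitude is simpler, but the two values could then come from different rows.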
This one's kind of hard to explain, so I've opted to post a simplified version of our view that prompted me to ask this question. The question is re-asked after the view...

create view MainView (PrimaryKeyID, SubTotal1, SubTotal2, GrandTotal)
as
select t.PrimaryKeyID, sum(t1.Total), sum(t2.Total), sum(t1.Total) + sum(t2.Total)
from SomeTable t
join CalculationTable t1 on ...
join AnotherCalculationTable t2 on ...
group by t.PrimaryKeyID

Notice how the last column, "GrandTotal", calls the function "sum" two more times. Common sense tells me that this is not necessary; in our case it's orders of magnitude worse... Is the query optimizer smart enough to only compute these sums once per row in "SomeTable"? Common sense tells me that if we were to break the view apart into two views, it would avoid this inefficiency:

create view InnerView (PrimaryKeyID, SubTotal1, SubTotal2)
as
select t.PrimaryKeyID, sum(t1.Total), sum(t2.Total)
from SomeTable t
join CalculationTable t1 on ...
join AnotherCalculationTable t2 on ...
group by t.PrimaryKeyID

create view OuterView (PrimaryKeyID, SubTotal1, SubTotal2, GrandTotal)
as
select iv.PrimaryKeyID, iv.SubTotal1, iv.SubTotal2, iv.SubTotal1 + iv.SubTotal2
from InnerView iv

Notice how it appears that we've tricked the optimizer into thinking there are fewer operations. So my question is: how do views handle this situation? Does the optimizer treat both versions the same? Or is one faster than the other? Or is there another, faster way? Does adding levels of views slow things down, or are views simply like macros that get expanded when the query is compiled (I think I've read the latter is true, actually)?

Thanks,
Dave