Suppose that I have a table that contains a lot of records that are
identical except for an id field and a date-time-stamp field. For
example
Id  Unit  Price  DTS
1   A     1.00   Date 1
2   A     1.00   Date 2
3   A     1.00   Date 3
4   B     1.25   Date 4
5   B     1.50   Date 5
6   B     1.50   Date 6
7   C     2.75   Date 7
8   C     2.75   Date 8
9   C     2.75   Date 9
10  C     3.00   Date 10
I want to cull out records that are duplicates in the unit and price
fields. I want to use the max DTS as the criterion for which record in
a set of "duplicates" will remain. So, if I get the right query, I
should get back
Id  Unit  Price  DTS
1   A     1.00   Date 1
4   B     1.25   Date 4
5   B     1.50   Date 5
7   C     2.75   Date 7
10  C     3.00   Date 10
Is this possible using a single query? If so, how? I am sure that I
can do this using code, but it will involve a bunch of loops and
process time. I would prefer a cleaner, more elegant way. Thanks for
any help.
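One way to get that result in a single query is to group on the duplicate columns and join back to the table. The sketch below runs the SQL through Python's sqlite3 module only so that it can be executed as-is; the table name `prices` and the ISO date strings are assumptions standing in for the real table and the "Date N" values. Note that the expected output in the question keeps the earliest row of each group, so the sketch uses MIN(DTS); swap in MAX(DTS) to keep the latest row instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the sample data in the question.
conn.execute("CREATE TABLE prices (Id INTEGER PRIMARY KEY, Unit TEXT, Price REAL, DTS TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, "A", 1.00, "2024-01-01"),
        (2, "A", 1.00, "2024-01-02"),
        (3, "A", 1.00, "2024-01-03"),
        (4, "B", 1.25, "2024-01-04"),
        (5, "B", 1.50, "2024-01-05"),
        (6, "B", 1.50, "2024-01-06"),
        (7, "C", 2.75, "2024-01-07"),
        (8, "C", 2.75, "2024-01-08"),
        (9, "C", 2.75, "2024-01-09"),
        (10, "C", 3.00, "2024-01-10"),
    ],
)

# One row per (Unit, Price): join every row to the earliest DTS of its group.
# Use MAX(DTS) in the derived table to keep the latest row instead.
keep_sql = """
    SELECT p.Id, p.Unit, p.Price, p.DTS
    FROM prices AS p
    JOIN (SELECT Unit, Price, MIN(DTS) AS KeepDTS
          FROM prices
          GROUP BY Unit, Price) AS k
      ON k.Unit = p.Unit AND k.Price = p.Price AND k.KeepDTS = p.DTS
    ORDER BY p.Id
"""
kept_ids = [row[0] for row in conn.execute(keep_sql)]
print(kept_ids)
```

If two rows in the same group can share an identical DTS, both would come back; in that case grouping on MIN(Id) instead of MIN(DTS) avoids the tie.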
I have a table with 100,000-plus records in it, and some are duplicates. Is there any way to delete one of them and not the other? For instance, if I duplicate the table I could run this query:
<cfquery name="query1" datasource="datasource">
DELETE DISTINCT FROM tablename
WHERE FirstName in (
    SELECT FirstName from tablename1
    where tablename1.FirstName = tablename.FIRST_NAME
    AND tablename1.LastName = tablename.LAST_NAME
    AND tablename1.State = tablename.STATE)
</cfquery>
However, it doesn't work. I know the DISTINCT is not correct. Does anyone know how to achieve this? I have looked all over, and everything I try deletes both records. I was thinking of using some kind of count statement, but it still deletes both of them. Please help. Thanks
Hello,
I have a stored procedure that deletes duplicates in one table:
.....
AS
BEGIN
DELETE FROM mytable
WHERE Id IN
(SELECT Max(id)
from mytable
group by date, idsens
having count(*) >1)
END
Sometimes it happens that I have >2 rows with duplicated values. How can I write a new stored procedure that deletes all rows with duplicated information (leaving only one row with those values)?
Thanks
M.A.
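Deleting MAX(id) per group removes only one row per pass, which is why groups of three or more survive. A common alternative is to invert the condition: keep MIN(id) of every group and delete everything else. The following is a minimal sketch of that idea, run through Python's sqlite3 so it is executable; the sample rows are made up, and only the column names `id`, `date` and `idsens` come from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT, idsens TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "x"),  # three copies of the same (date, idsens)
        (2, "2024-01-01", "x"),
        (3, "2024-01-01", "x"),
        (4, "2024-01-02", "x"),  # unique row
        (5, "2024-01-02", "y"),  # two copies
        (6, "2024-01-02", "y"),
    ],
)

# Keep the lowest id of every (date, idsens) group and delete all other rows,
# no matter how many duplicates a group contains.
conn.execute(
    """
    DELETE FROM mytable
    WHERE id NOT IN (SELECT MIN(id)
                     FROM mytable
                     GROUP BY date, idsens)
    """
)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM mytable ORDER BY id")]
print(remaining_ids)
```

The same DELETE shape should apply to the FirstName/LastName/State and CODE questions in this thread, with the GROUP BY list changed to the columns that define a duplicate. On a large table it is worth trying it inside a transaction first.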
I need to remove duplicate data from around 25 tables. I want to use a while loop to go through all the tables. If I list out all of the column names the query runs fine, but since there are 25 tables, some with 50-plus columns, I was hoping to use something like the following, which errors out because my subqueries return more than one result.
SELECT q.* from (Select ROW_NUMBER() OVER ( Partition BY (SELECT [name] AS [Name] FROM syscolumns WHERE id = (SELECT id FROM sysobjects WHERE type = 'U' AND [Name] = 'Orders') ) Order by (select top 1 [name] AS [Name] FROM syscolumns
Hello all. I have a table with two columns, CODE and DESCRITPION. Can anyone suggest how I can go about deleting the entire record where two or more codes are the same?
I know how to detect and delete dups in a test with a select clause. This works fine in a small table, but if the table has, say, a million rows, it sounds like a proc would be faster. My question is: how do I display those rows in a proc to detect what the problem is? The PRINT statement doesn't seem to work, and I wondered if I had to go through the process of building an output stream. The proc creates okay, but I'm stuck after that part.
thx
Kat -- very rough code below
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
create proc dupcount @count int as
set nocount on
select categoryID, CategoryName, Count(*) As Dups
from Categories
group by Categoryid, CategoryName
having count(*) >1
I have a table with 22 million Business records. I can see that there are duplicates when I group by BusinessName and Address and Phone. I'd like to place only the duplicates into a table, with a ranking, oldest business key gets a ranking of 1.
As a bonus I'd like each group to have a distinct group name (although not necessary, just want to know how to do this)
Later after I run more verifications to make sure these are not referenced elsewhere I'll delete everything with a matchRank > 1 out of the main Business table.
DROP TABLE [dbo].[TestBusiness];
GO
CREATE TABLE [dbo].[TestBusiness](
    [Business_pk] INT IDENTITY(1,1) NOT NULL,
    [BusinessName] VARCHAR (200) NOT NULL,
    [Address] VARCHAR(MAX) NOT NULL,
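Window functions are one way to produce both the rank and a group identifier in a single pass. The sketch below is an assumption-laden illustration rather than a tuned solution for 22 million rows: it runs in Python's sqlite3 (which needs SQLite 3.25 or later for window functions), it assumes a `Phone` column since the table definition above is cut off, and it treats the lowest `Business_pk` as the "oldest" key. `GroupId`, `MatchRank` and `GroupSize` are names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TestBusiness ("
    "Business_pk INTEGER PRIMARY KEY, BusinessName TEXT, Address TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO TestBusiness VALUES (?, ?, ?, ?)",
    [
        (1, "Acme", "1 Main St", "555-0001"),
        (2, "Bolt", "2 Oak Ave", "555-0002"),
        (3, "Acme", "1 Main St", "555-0001"),
        (4, "Cafe", "3 Elm Rd", "555-0003"),
        (5, "Cafe", "3 Elm Rd", "555-0003"),
        (6, "Acme", "1 Main St", "555-0001"),
    ],
)

# GroupId   : same number for every row sharing name, address and phone.
# MatchRank : 1 for the lowest Business_pk in the group, then 2, 3, ...
# GroupSize : used only to filter out rows that have no duplicate.
rank_sql = """
    SELECT Business_pk, GroupId, MatchRank
    FROM (SELECT Business_pk,
                 DENSE_RANK() OVER (ORDER BY BusinessName, Address, Phone) AS GroupId,
                 ROW_NUMBER() OVER (PARTITION BY BusinessName, Address, Phone
                                    ORDER BY Business_pk) AS MatchRank,
                 COUNT(*) OVER (PARTITION BY BusinessName, Address, Phone) AS GroupSize
          FROM TestBusiness) AS ranked
    WHERE GroupSize > 1
    ORDER BY GroupId, MatchRank
"""
ranked = conn.execute(rank_sql).fetchall()
for row in ranked:
    print(row)
```

The group numbers are distinct but not necessarily consecutive, because DENSE_RANK is computed before the non-duplicates are filtered out. The result set could be written to a holding table and later used to delete rows with MatchRank > 1.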
Hi. I need to insert a semicolon into one of my fields in a SQL Server 2000 database. I have a height column and I was trying to insert a height. I know I can't insert 4'5" because of the apostrophe and double quote, so I was trying to insert it like: 4'7"
Now I'm running into a problem with the semicolons. How can I insert the semicolons? Thanks!
Edit: apparently this is removing the same things. Here's what I'm talking about: http://www.thoughtreactor.com/img-0009.png
Hi. I'm trying to figure out this query and I'm just having the hardest time with it. I've gotten my entire query figured out except for this last part. Basically, I have an events table (ClientEvents) which has events listed for clients. I'm trying to find out which clients have not completed their service. The begin service event is EventID=27 and the end service event is EventID=28. I'm trying to find the clients who are "missing" (well, not really missing b/c they are technically in the process of their service) EventID=28. This events table keeps a record of service so I can't really use the NOT EXISTS b/c a lot of clients have been serviced before. Does anybody have any ideas? Thanks!
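Because clients can be serviced more than once, one approach is to compare each client's most recent begin event with their most recent end event: a service is still open when there is a begin with no end at all, or when the latest begin is newer than the latest end. This is only a sketch; the `ClientID` and `EventDate` column names are assumptions, since the post does not show the table layout, and it is run through Python's sqlite3 to make it executable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ClientID and EventDate are assumed column names.
conn.execute("CREATE TABLE ClientEvents (ClientID INTEGER, EventID INTEGER, EventDate TEXT)")
conn.executemany(
    "INSERT INTO ClientEvents VALUES (?, ?, ?)",
    [
        (1, 27, "2024-01-01"), (1, 28, "2024-01-05"),  # completed
        (2, 27, "2024-01-01"), (2, 28, "2024-01-05"),  # completed once...
        (2, 27, "2024-02-01"),                         # ...then started again
        (3, 27, "2024-03-01"),                         # started, never finished
        (4, 5, "2024-03-02"),                          # unrelated event only
    ],
)

# Per client: latest begin (27) versus latest end (28).
open_sql = """
    SELECT ClientID
    FROM ClientEvents
    GROUP BY ClientID
    HAVING MAX(CASE WHEN EventID = 27 THEN EventDate END) IS NOT NULL
       AND (MAX(CASE WHEN EventID = 28 THEN EventDate END) IS NULL
            OR MAX(CASE WHEN EventID = 27 THEN EventDate END)
             > MAX(CASE WHEN EventID = 28 THEN EventDate END))
    ORDER BY ClientID
"""
open_clients = [row[0] for row in conn.execute(open_sql)]
print(open_clients)
```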
I have a semi-additive measure in the report. This measure is additive across product groups (row) but not across the period (column). The period (column) can be drilled up to 1st Half (from Jan to June) and 2nd Half (from July to Dec).
How do I make the subtotal for 1st Half to retrieve from mth June and 2nd Half to retrieve from mth Dec?
Hi, I rewrote my working Sql statement to prevent Sql Injections. I copied some code I used in another project but this time I can't get it to work, possibly because it's an update statement and not an Insert one, which I used before. Sorry for the boring question, but does anyone have a clue what's wrong with the syntax?
Here's the original code (I changed the parameter names for clarity and security):
Dim conn As SqlConnection = New SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings("MyConnectionString").ConnectionString)
Dim strSQL As String = "Update MyTable Set " & typ & num & " = '" & pname & "' WHERE personID = " & fid
Dim cmd As SqlCommand = New SqlCommand(strSQL, conn)
Here's the code from codebehind:
Dim conn As SqlConnection = New SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings("MyConnectionString").ConnectionString)
Dim strSQL As String = "Update MyTable Set (" & typ & num & ") Values (@" & typ & num & ") WHERE personID = " & fid
Dim cmd As New SqlCommand(strSQL, conn)
With cmd.Parameters
    .Add(New SqlParameter("@" & typ & num, pname))
End With
TestLabel.Text = strSQL & " " & pname
cmd.Connection.Open()
cmd.ExecuteNonQuery()
cmd.Connection.Close()
Here's my test message; first the sql, then the new string to be inserted:
Update MyTable Set (picturename) Values (@pname) WHERE personID = 2 2_adin.jpg
Here's my error code:
Line 1: Incorrect syntax near '('.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Data.SqlClient.SqlException: Line 1: Incorrect syntax near '('.
Source Error:
Line 489: TestLabel.Text = strSQL & " " & pname
Line 490: cmd.Connection.Open()
Line 491: cmd.ExecuteNonQuery()
Line 492: cmd.Connection.Close()
Line 493: End If
I could understand it if I'd get an error in VWD Express for using dynamic variables, but they work correctly in the text message so I'm clueless. Any help is deeply appreciated!
Pettrer (VB, Sql Server 2000)
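The error most likely comes from the statement shape: UPDATE takes the form SET column = value and has no "(columns) VALUES (...)" form, which belongs to INSERT. Values can be bound as parameters, but a column name cannot, so a dynamic column name is usually checked against a fixed list before it is spliced into the SQL. The sketch below shows that pattern with Python's sqlite3 rather than VB and SqlClient, so it is an analogy and not a drop-in fix; `update_column` and `ALLOWED_COLUMNS` are names invented for the example.

```python
import sqlite3

# Hypothetical whitelist: the only column names the code may splice into SQL.
ALLOWED_COLUMNS = {"picturename"}

def update_column(conn, column, value, person_id):
    """Update one whitelisted column, binding the value and key as parameters."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: " + repr(column))
    # Correct UPDATE shape: SET <column> = <parameter> WHERE <key> = <parameter>.
    sql = "UPDATE MyTable SET " + column + " = :value WHERE personID = :person_id"
    conn.execute(sql, {"value": value, "person_id": person_id})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (personID INTEGER PRIMARY KEY, picturename TEXT)")
conn.execute("INSERT INTO MyTable VALUES (2, NULL)")

update_column(conn, "picturename", "2_adin.jpg", 2)
stored = conn.execute(
    "SELECT picturename FROM MyTable WHERE personID = 2"
).fetchone()[0]
print(stored)
```

In the VB code the equivalent statement text would presumably read "Update MyTable Set picturename = @pname WHERE personID = @fid", with both @pname and @fid added to cmd.Parameters.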
Hi, I have to delete the master table data without deleting the child table records. Is there any solution for this? The parent table has a relation with the child table. Regards, vinod.t.v
The Semi-Additive Measures feature of MS SQL Server 2005 is not supported in its Standard Edition SKU. How do I solve this issue? What editions support the feature?
/****** Object: StoredProcedure [dbo].[dbo.ServiceLog] Script Date: 07/18/2014 14:30:59 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [dbo].[ServiceLogPurge]
-- Purge records dbo.ServiceLog older than 3 months:
-- Purge records in small portions to avoid locking production tables
-- for a long time. The process takes longer, but can co-exist with
-- normal usage of the tables.
[Code] ...
*** Getting this error below when executing the code ***
Msg 102, Level 15, State 1, Procedure ServiceLogPurge, Line 45 Incorrect syntax near 'Failed:'.
I wrote a stored procedure that takes a string of values, separated by semicolons, as a parameter. The procedure is below:
Code:
ALTER PROCEDURE [dbo].[selectUserTabsByRoles] @var varchar(max)
AS
BEGIN
SELECT distinct * from tbl_tabs
where ( PATINDEX('%'+left(@var,1)+'%', roles) > 0
     or PATINDEX('%'+right(@var,1)+'%', roles) > 0 )
AND parent is null and tabstatus =1
ORDER BY tabposition
END
My problem is, when I pass a parameter like 1; it fetches all rows with roles having 1. But I realised that the last row in the sample data does not have 1 as roles, but rather 11.
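A substring search for 1 will always match 11 as well. A common workaround is to wrap both the stored list and the search value in the delimiter, so that the pattern looks for ";1;" rather than "1". The sketch below shows the idea with Python's sqlite3, where string concatenation is written ||; in T-SQL the same condition would be written with + instead. The sample rows and the `id` column are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_tabs (id INTEGER PRIMARY KEY, roles TEXT)")
conn.executemany(
    "INSERT INTO tbl_tabs VALUES (?, ?)",
    [(1, "1;2"), (2, "11"), (3, "3;1"), (4, "11;21")],
)

# Surround the stored list with delimiters, then search for ';<role>;'.
# ';11;' does not contain ';1;', so role 11 no longer matches role 1.
match_sql = """
    SELECT id
    FROM tbl_tabs
    WHERE ';' || roles || ';' LIKE '%;' || :role || ';%'
    ORDER BY id
"""
matching_ids = [row[0] for row in conn.execute(match_sql, {"role": "1"})]
print(matching_ids)
```

This checks one role at a time; a parameter holding several roles would first need to be split into individual values. Storing roles in a separate table with one row per tab and role would avoid the string matching altogether.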
I am using Excel 2013 64-bit with an English Excel version, but with a comma as the decimal separator and a semicolon as the list separator.
In Excel everything works fine, but PowerPivot does not properly recognize that I use a semicolon in formulas.
PowerPivot lets me write formulas with the semicolon and not the comma, so that is fine.
However, two issues appear:
The yellow smart formula help box that appears when you start typing a formula thinks I have to use commas, so when I use semicolons instead, it does not jump to the next parameter. This problem also means that parameters where I have to enter a table or field do not suggest tables and fields when I start typing.
Sometimes the formula validation even throws an exception saying that my formula syntax is incorrect because I did not use a comma. However, commas do not work either. I have to do some weird playing around until it finally accepts my formula.
I hope this buggy behaviour gets fixed, but is there a way I can work around this without changing my formula/list separator? I also do not want to use a German Excel version, because I am used to the English formula names.
I have a BOM table with all finished item recipes and semi-finished item recipes. I need to create a query where the semi-finished item materials are also listed in the finished item recipe.
I need some help. I have created a database that looks like the following: a FirstName table linked to a Main table. I have created a stored procedure that looks like this:
Create procedure dbo.StoredProcedure
(
@FirstName varchar(20)
)
AS
Declare @FirstNameID int
Insert Into Main Table ( FirstName ) Values ( @FirstName )
Select @FirstNameID = Scope_Identity()
How could I redesign this to check if a value exists and, if it exists, simply use that value instead of creating a new duplicate value?
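The usual shape for this is "look up first, insert only when missing", ideally backed by a unique constraint so that duplicates cannot slip in. The sketch below illustrates the flow in Python with sqlite3; the table and column names (`FirstName`, `FirstNameID`) and the helper `get_or_create_first_name` are assumptions made for the example. In a T-SQL procedure the same logic would typically be a SELECT into the variable followed by an IF ... IS NULL block containing the INSERT and SCOPE_IDENTITY().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical lookup table; UNIQUE guards against duplicates at the schema level.
conn.execute(
    "CREATE TABLE FirstName ("
    "FirstNameID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "FirstName TEXT NOT NULL UNIQUE)"
)

def get_or_create_first_name(conn, first_name):
    """Return the id of an existing name, inserting the name only if it is missing."""
    row = conn.execute(
        "SELECT FirstNameID FROM FirstName WHERE FirstName = ?", (first_name,)
    ).fetchone()
    if row is not None:
        return row[0]          # reuse the existing value
    cur = conn.execute("INSERT INTO FirstName (FirstName) VALUES (?)", (first_name,))
    return cur.lastrowid       # id of the row just created

id_ann = get_or_create_first_name(conn, "Ann")
id_bob = get_or_create_first_name(conn, "Bob")
id_ann_again = get_or_create_first_name(conn, "Ann")
row_count = conn.execute("SELECT COUNT(*) FROM FirstName").fetchone()[0]
print(id_ann, id_bob, id_ann_again, row_count)
```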
I have a dilemma. I have a database of about 60,000 users and I need to get rid of those users where there is a duplicate email address. I have written an ASP utility that works but is far too taxing on our little server, and I think it will kill it. What it does is, for each email address, compare it against all the others, so each of the 60,000 addresses is checked against 60,000 other records. You know what I mean; it's pretty awful. I tested it on just one record and it took about 5 minutes.
Anyway, I've been trying to do it in SQL with no luck.
I'm trying to get duplicates out of my database:
SELECT COUNT(*) AS Amount, Firstname, surname, Internalextension
FROM iac.dbo.sf_profil
GROUP BY FirstName, surname, internalextension
HAVING COUNT(*) > 1
order by firstname, surname
How do I alter the query to retrieve only records which have the same firstname and surname but different extension numbers?
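One way to read that requirement is: group by name only, and keep the groups that contain more than one distinct extension. A minimal sketch, run through Python's sqlite3 with made-up rows and a local table called `sf_profil`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sf_profil (Firstname TEXT, surname TEXT, Internalextension TEXT)")
conn.executemany(
    "INSERT INTO sf_profil VALUES (?, ?, ?)",
    [
        ("Ann", "Lee", "100"),
        ("Ann", "Lee", "101"),  # same name, different extension
        ("Bob", "Ray", "200"),
        ("Bob", "Ray", "200"),  # exact duplicate, same extension
        ("Cy", "Poe", "300"),
    ],
)

# Group by name only; COUNT(DISTINCT ...) ignores repeated extensions.
diff_ext_sql = """
    SELECT Firstname, surname, COUNT(DISTINCT Internalextension) AS Extensions
    FROM sf_profil
    GROUP BY Firstname, surname
    HAVING COUNT(DISTINCT Internalextension) > 1
    ORDER BY Firstname, surname
"""
same_name_diff_ext = conn.execute(diff_ext_sql).fetchall()
print(same_name_diff_ext)
```

Joining this result back to the table on Firstname and surname would list the individual rows with their extensions.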
Hi, this is the query which shows me the duplicates. Some of the records appear more than once. I would like to know how to delete the extra records so that I end up with one record per combination.
select Pricing_Source, VaR_Identifier, Price_Date, PX_Last, Count(*) as 'count'
from tblPricesClean
group by Pricing_Source, VaR_Identifier, Price_Date, PX_Last
having count(*) > 1
order by count desc
Is there a way to find duplicates in one field? For example, my query has person_nbr, and for each person_nbr on one day they could have used multiple payer_names. I want to be able to count each person_nbr one time, but I also want to group by description (which is the name of the provider) and by payer name to see how many persons the provider has seen with each payer. My problem is that if the person had more than one payer they are counted twice. Is there some type of aggregate function to use the first payer in the list?
With PersonMIA (person_id,person_nbr,first_name,last_name,date_of_birth) as
(
select distinct person_id,person_nbr,first_name,last_name,date_of_birth
from
    (select count(*) as countenc,a.person_id,a.person_nbr, a.first_name,a.last_name, a.date_of_birth
    from person a
    join patient_encounter b on a.person_id = b.person_id
    group by a.person_id,a.person_nbr,a.first_name,a.last_name,a.date_of_birth
    )tmp
where tmp.countenc <=1
)
select person_nbr,payer_name,first_name,last_name,description,year(create_timestamp),create_timestamp
from
(
select distinct c.description,tmp.person_id,tmp.person_nbr,tmp.first_name, tmp.last_name,tmp.date_of_birth,d.payer_name,b.create_timestamp
from PersonMIA tmp
join person a on a.person_id = tmp.person_id
join patient_encounter b on a.person_id = b.person_id
join provider_mstr c on b.rendering_provider_id = c.provider_id
join person_payer d on tmp.person_id = d.person_id
where c.description = 'Leon MD, Enrique'
group by c.description,tmp.person_id,tmp.person_nbr,tmp.first_name,tmp.last_name, tmp.date_of_birth,d.payer_name,b.create_timestamp
)tmp2
where year(create_timestamp) IN (2005,2006)
group by person_nbr,payer_name,first_name,last_name,description,create_timestamp
Hi, I'll see if I can explain this clearly. The query below selects rows from the "hdr_ctl_nbr_status" table if the value in the field "tcn" from that table is found in the table "temp_tcn". I want all fields from the "hdr_ctl_nbr_status" table to be selected BUT only one row. In other words for a tcn with a value "12345678" there are 10 rows returned from the hdr_ctl_nbr_status table, I want only 1. Is there a way I can use SELECT DISTINCT to do this ? I know this usually functions on one or more fields but I want the DISTINCT to be on tcn only BUT return all fields in the query.
Select h.*,'' from hdr_ctl_nbr_status as h WITH (NOLOCK) where h.tcn in (select tcn from temp_tcn)
I have two columns of int data in a table, as my example data shows below.
I want the data returned to be like the rows in #test3, but my question is this: how can I do it without using #test2 and #test3?
By the way, the business requirement doesn't care whether it's the min, the max or any other ID when one side has duplicated values.
Thanks!
Use tempdb
Go
if object_ID ('#test') is not null drop table #test
create table #test (col1 int, col2 int)
insert into #test
Select 123, 222 union
Select 124, 222 union
Select 125, 222 union
Select 111, 223 union
Select 111, 224
if object_ID ('#test2') is not null drop table #test2
create table #test2 (col1 int, col2 int)
Insert into #test2
Select distinct col1, min(col2) from #test group by col1
if object_ID ('#test3') is not null drop table #test3
create table #test3 (col1 int, col2 int)
Insert into #test3
Select min(col1), col2 from #test2 group by col2
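The two intermediate tables can be folded into one statement by using the first aggregation as a derived table for the second. Here is a small sketch of that, executed through Python's sqlite3 with a plain table named `test` standing in for #test:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(123, 222), (124, 222), (125, 222), (111, 223), (111, 224)],
)

# Inner query plays the role of #test2 (one col2 per col1);
# outer query plays the role of #test3 (one col1 per col2).
one_step_sql = """
    SELECT MIN(t2.col1) AS col1, t2.col2
    FROM (SELECT col1, MIN(col2) AS col2
          FROM test
          GROUP BY col1) AS t2
    GROUP BY t2.col2
    ORDER BY col1
"""
pairs = conn.execute(one_step_sql).fetchall()
print(pairs)
```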
I am attempting to execute the Stored Procedure at the foot of this message. The Stored Procedure runs correctly about 1550 times, but receive the following error three times:
Server: Msg 512, Level 16, State 1, Procedure BackFillNetworkHours, Line 68
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
I've done some digging, and the error message is moderately self-explanatory.
The problem is that there is no Line 68 in the Stored Procedure. It's the comment line:
-- Need to find out how many hours the employee is scheduled etc.
Also, there are no duplicate records in the Employee table nor the WeeklyProfile table. At least I assume so - if the following SQL to detect duplicates is correct!
SELECT E.*
FROM Employee E
join (select EmployeeID
      from Employee
      Group by EmployeeID
      having count(*) > 1) as E2
On (E.EmployeeID = E2.EmployeeID)
SELECT W.*
FROM WeekProfile W
join (Select WeekProfileID
      FROM WeekProfile
      GROUP BY EmployeeID, MondayHours, WeekProfileID
      HAVING COUNT(*) > 1) AS W2
ON W.WeekProfileID = W2.WeekProfileID
NOTE: In the second statement, I have tried for MondayHours thru FridayHours.
Anyone got any ideas?
The TableDefs are set up in this thread:
<http://groups-beta.google.com/group/comp.databases.ms-sqlserver/browse_frm/thread/fff4ef21e9964ab8/f5ce136923ebffc3?q=teddysnips&rnum=1&hl=en#f5ce136923ebffc3>
The Stored Procedure that causes the error is here:
--*************************************************************
CREATE PROCEDURE BackFillNetworkHours
AS
DECLARE @EmployeeID int
DECLARE @TimesheetDate DateTime
DECLARE @NumMinutes int
DECLARE @NetworkCode int
-- Get the WorkID corresponding to Project Code 2002
SELECT @NetworkCode = WorkID
FROM [Work]
WHERE (WorkCode = '2002')
-- Open a cursor on a SELECT for all Network Support Employees where any single workday comprises fewer than 7.5 hours
DECLARE TooFewHours CURSOR FOR
SELECT
    EmployeeID,
    CONVERT(CHAR(8), Start, 112) AS TimesheetDate,
    SUM(NumMins) AS TotalMins
FROM
    (SELECT
        TI.EmployeeID,
        W.WorkCode,
        TI.Start AS Start,
        SUM(TI.DurationMins) AS NumMins
    FROM TimesheetItem TI LEFT JOIN [Work] W ON TI.WorkID = W.WorkID
    WHERE EXISTS
        (SELECT *
        FROM Employee E
        WHERE ((TI.EmployeeID = E.EmployeeID) AND (E.DepartmentID = 2)))
    GROUP BY TI.EmployeeID, TI.Start, W.WorkCode) AS x
GROUP BY EmployeeID, CONVERT(char(8), Start, 112)
HAVING SUM(NumMins) < 450
ORDER BY EmployeeID, CONVERT(CHAR(8), Start, 112)
-- Get the EmployeeID, Date and Number of Minutes from the cursor
OPEN TooFewHours
FETCH NEXT FROM TooFewHours INTO @EmployeeID, @TimesheetDate, @NumMinutes
WHILE (@@FETCH_STATUS=0)
BEGIN
    DECLARE @NewWorkTime datetime
    DECLARE @TimesheetString varchar(50)
    DECLARE @Duration int
    DECLARE @RequiredDuration int
    -- Set the correct date to 08:30 - by default the cast from the cursor's select statement is midday
    SET @TimesheetString = @TimesheetDate + ' 08:30'
    SET @NewWorkTime = CAST(@TimesheetString AS Datetime)
    -- Need to find out how many hours the employee is scheduled to work that day.
    SET @RequiredDuration = CASE (DATEPART(dw, @NewWorkTime))
        WHEN 1 THEN (SELECT CAST((60 * SundayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 2 THEN (SELECT CAST((60 * MondayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 3 THEN (SELECT CAST((60 * TuesdayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 4 THEN (SELECT CAST((60 * WednesdayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 5 THEN (SELECT CAST((60 * ThursdayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 6 THEN (SELECT CAST((60 * FridayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
        WHEN 7 THEN (SELECT CAST((60 * SaturdayHours) AS int) FROM WeekProfile WHERE (EmployeeID = @EmployeeID))
    END
    IF @NumMinutes < @RequiredDuration
    BEGIN
        -- Set the Start for the dummy work block to 08:30 + the number of minutes the employee has already worked that day
        SET @NewWorkTime = DateAdd(minute, @NumMinutes, @NewWorkTime)
        -- Set the duration for the dummy work block to be required duration less the amount they've already worked
        SET @Duration = @RequiredDuration - @NumMinutes
        -- Now we have the correct data - insert into table.
        INSERT INTO TimesheetItem (EmployeeID, Start, DurationMins, WorkID)
        VALUES (@EmployeeID, @NewWorkTime, @Duration, @NetworkCode)
    END
    FETCH NEXT FROM TooFewHours INTO @EmployeeID, @TimesheetDate, @NumMinutes
END
CLOSE TooFewHours
DEALLOCATE TooFewHours
GO
--*************************************************************
Thanks
Edward
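One thing that may be worth checking: the scalar subqueries in the CASE expression filter WeekProfile by EmployeeID alone, so Msg 512 would be raised for any employee who has more than one WeekProfile row. The duplicate check in the post groups by WeekProfileID as well, and if WeekProfileID is unique per row (an assumption, since the table definition is only linked), every group has exactly one row and the check can never report anything. The sketch below contrasts the two checks on made-up data, using Python's sqlite3 to keep it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: WeekProfileID is unique, EmployeeID may repeat.
conn.execute(
    "CREATE TABLE WeekProfile ("
    "WeekProfileID INTEGER PRIMARY KEY, EmployeeID INTEGER, MondayHours REAL)"
)
conn.executemany(
    "INSERT INTO WeekProfile VALUES (?, ?, ?)",
    [(1, 10, 7.5), (2, 11, 7.5), (3, 10, 7.5)],  # employee 10 has two profiles
)

# Check from the post: grouping by the unique key means no group exceeds one row.
original_check = conn.execute(
    """
    SELECT WeekProfileID
    FROM WeekProfile
    GROUP BY EmployeeID, MondayHours, WeekProfileID
    HAVING COUNT(*) > 1
    """
).fetchall()

# Check that matches the subquery's WHERE clause: more than one row per employee.
per_employee_check = conn.execute(
    """
    SELECT EmployeeID, COUNT(*) AS Profiles
    FROM WeekProfile
    GROUP BY EmployeeID
    HAVING COUNT(*) > 1
    """
).fetchall()

print(original_check, per_employee_check)
```

If the second query returns rows on the real data, those employees would be candidates for the three failing iterations.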
I have a table, TEST_TABLE, with 6 columns (COL1, COL2, COL3, COL4, COL5, COL6).... I need to be able to select all columns/rows where COL3, COL4, and COL5 are unique....
I have tried using DISTINCT and GROUP BY, but both will only allow me to access columns COL3, COL4, and COL5..... I need access to all columns...
I just want to get rid of duplicate rows (duplicates of COL3, COL4, and COL5)...
Thanks in advance.
Joe
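DISTINCT and GROUP BY only see the columns they are given, so a different shape is needed to get whole rows back. One option is a NOT EXISTS filter: return a row only when no other row with the same COL3, COL4 and COL5 has a smaller key. The sketch assumes COL1 is a unique key, which the post does not state; if no unique column exists, a ROW_NUMBER() partitioned by the three columns is the usual substitute. It runs through Python's sqlite3 with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TEST_TABLE ("
    "COL1 INTEGER PRIMARY KEY, COL2 TEXT, COL3 TEXT, COL4 TEXT, COL5 TEXT, COL6 TEXT)"
)
conn.executemany(
    "INSERT INTO TEST_TABLE VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "a", "x", "y", "z", "p"),
        (2, "b", "x", "y", "z", "q"),  # duplicate of row 1 on COL3..COL5
        (3, "c", "x", "y", "w", "r"),
        (4, "d", "x", "y", "z", "s"),  # another duplicate of row 1
    ],
)

# Keep a row only if no row with the same COL3, COL4, COL5 has a smaller COL1.
# COL1 acting as a unique key is an assumption for this example.
unique_sql = """
    SELECT t.*
    FROM TEST_TABLE AS t
    WHERE NOT EXISTS (SELECT 1
                      FROM TEST_TABLE AS d
                      WHERE d.COL3 = t.COL3
                        AND d.COL4 = t.COL4
                        AND d.COL5 = t.COL5
                        AND d.COL1 < t.COL1)
    ORDER BY t.COL1
"""
unique_rows = conn.execute(unique_sql).fetchall()
for row in unique_rows:
    print(row)
```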