I have a members table and have added a few thousand extra members to it. Now I need to remove the duplicates.
It doesn't matter which duplicate I remove, as long as the email addresses end up unique.
Here is the format of the table:
id email firstname lastname datebirth
If I run:
SELECT COUNT(DISTINCT Email) AS Expr1 FROM Customer
it returns 21345
and
SELECT Count(Email) FROM Customer
returns 28987
I can get the unique email addresses into another table by going:
SELECT DISTINCT emailaddress INTO DistinctCustomer FROM Customer
but this only returns the unique email addresses. How do I select each distinct email address together with all the other fields into a new table? Or just remove the duplicates where an email address appears more than once?
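A minimal sketch of the "remove duplicates" variant, assuming the `id` column is unique. SQLite (through Python's `sqlite3` module) stands in for SQL Server here, and the sample rows are made up; the `DELETE` itself is plain SQL that keeps the lowest `id` per email.

```python
import sqlite3

# SQLite is only a stand-in for SQL Server; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, email TEXT, firstname TEXT);
    INSERT INTO Customer (email, firstname) VALUES
        ('a@example.com', 'Ann'),
        ('a@example.com', 'Anne'),
        ('b@example.com', 'Bob');
""")

# Keep the row with the lowest id for each email and delete all the others.
conn.execute("""
    DELETE FROM Customer
    WHERE id NOT IN (SELECT MIN(id) FROM Customer GROUP BY email)
""")

remaining = conn.execute("SELECT id, email FROM Customer ORDER BY id").fetchall()
print(remaining)
```

Since it does not matter which duplicate survives, `MAX(id)` would work just as well as `MIN(id)`.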
Hi

I have inherited a web app with the following table structure, and need to produce a table without any duplicates. Email seems like the best unique identifier - so only one of each e-mail address should be in the table.

Following http://www.sqlteam.com/item.asp?ItemID=3331 I have been able to get a duplicate count working:

select Email, count(*) as UserCount
from dbo.Members
group by Email
having count(*) > 1
order by UserCount desc

But the methods for creating a new table without duplicates fail. My code for the 2nd method is:

sp_rename 'Members', 'temp_Members'

select distinct *
into Members
from temp_Members

Table....

CREATE TABLE [dbo].[Members] (
[MemberID] [int] IDENTITY (1, 1) NOT NULL ,
[Username] [varchar] (10) COLLATE Latin1_General_CI_AS NOT NULL ,
[Password] [varchar] (10) COLLATE Latin1_General_CI_AS NOT NULL ,
[Email] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[Title] [varchar] (10) COLLATE Latin1_General_CI_AS NOT NULL ,
[FirstName] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[Surname] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[Address1] [varchar] (35) COLLATE Latin1_General_CI_AS NOT NULL ,
[Address2] [varchar] (35) COLLATE Latin1_General_CI_AS NOT NULL ,
[City] [varchar] (25) COLLATE Latin1_General_CI_AS NOT NULL ,
[Country] [varchar] (25) COLLATE Latin1_General_CI_AS NOT NULL ,
[Profession] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL ,
[Publication] [varchar] (40) COLLATE Latin1_General_CI_AS NOT NULL ,
[DateAdded] [smalldatetime] NOT NULL ,
[SendMail] [smallint] NOT NULL
) ON [PRIMARY]
GO

Thanks B.
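One likely reason `select distinct *` does not shrink the table: the identity column makes every row different, so nothing counts as a duplicate. The sketch below illustrates that on a cut-down, made-up version of the Members table, with SQLite standing in for SQL Server.

```python
import sqlite3

# Cut-down Members table with invented rows; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Email TEXT, FirstName TEXT);
    INSERT INTO Members (Email, FirstName) VALUES
        ('a@example.com', 'Ann'),
        ('a@example.com', 'Ann'),
        ('b@example.com', 'Bob');
""")

def count_rows(sql):
    """Count the rows produced by the given SELECT statement."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

# DISTINCT * includes MemberID, so all three rows stay distinct.
distinct_all = count_rows("SELECT DISTINCT * FROM Members")
# Leaving the identity column out lets the repeated row collapse.
distinct_without_id = count_rows("SELECT DISTINCT Email, FirstName FROM Members")
print(distinct_all, distinct_without_id)
```

Even without the identity column, two rows sharing an Email but differing in any other field would both survive `DISTINCT`, so picking one `MemberID` per Email (for example `MIN(MemberID)` with `GROUP BY Email`) is probably the safer route.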
I was wondering if someone could help me with the results of this query. At the moment I am getting repeated values, and I was wondering if it is possible to group some of the columns. I have tried adding grouping at the end of the query, but this still did not group the rows.
Thanks in advance for your answer - Sean
The structure that I'm trying to achieve is like the following, with each colour having multiple quantities for each size:
colourdesc | sizedesc | xs | s | m | l
-----------
black      | qoh      | 0  | 2 | 0 | 7
-----------
white      | qoh      | 0  | 0 | 0 | 0
-----------
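The original query is not shown, so this is only a sketch of the usual way to get one row per colour with a column per size: conditional aggregation (`SUM` over a `CASE`). The `stock` table, its columns and its rows are assumptions made for the example, and SQLite stands in for SQL Server.

```python
import sqlite3

# Hypothetical stock table: one row per colour/size delivery.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (colourdesc TEXT, sizedesc TEXT, qoh INTEGER);
    INSERT INTO stock VALUES
        ('black', 's', 2),
        ('black', 'l', 3),
        ('black', 'l', 4),
        ('white', 'm', 0);
""")

# One output row per colour; each size becomes its own summed column.
rows = conn.execute("""
    SELECT colourdesc,
           SUM(CASE WHEN sizedesc = 'xs' THEN qoh ELSE 0 END) AS xs,
           SUM(CASE WHEN sizedesc = 's'  THEN qoh ELSE 0 END) AS s,
           SUM(CASE WHEN sizedesc = 'm'  THEN qoh ELSE 0 END) AS m,
           SUM(CASE WHEN sizedesc = 'l'  THEN qoh ELSE 0 END) AS l
    FROM stock
    GROUP BY colourdesc
    ORDER BY colourdesc
""").fetchall()
print(rows)
```

Grouping only by the colour is what stops the rows from repeating; any other column in the select list has to sit inside an aggregate.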
The table has Name, Email, and Skill columns. Each name is distinct, but it is repeated because the person has several skills. I would like to remove the duplicate names and combine the corresponding skills into a single row.
From the stored procedure, combining 3 tables I got the output as:
Name    email     department   Skill
Arun    emailid   Tech team    Technical
Arun    emailid   Tech team    Leadership
Arun    emailid   Tech team    Decision Making
Binay   emailid   Marketing    Technical
Binay   emailid   Marketing    Decision Making
I would like to remove the duplicate Name rows and combine the skills into a single row, as the other fields are the same.
So the output should be
Name    email     department   Skill
Arun    emailid   Tech team    Technical, Leadership, Decision Making
Binay   emailid   Marketing    Technical, Decision Making
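A sketch of the "one row per name with a combined skill list" idea, using a string aggregate over a `GROUP BY`. It runs on SQLite, where the aggregate is `GROUP_CONCAT`; the table name `StaffSkills` is made up. On SQL Server the counterpart would be `STRING_AGG` (2017 and later) or the older `FOR XML PATH` trick, neither of which is exercised here.

```python
import sqlite3

# StaffSkills is a hypothetical name for the stored procedure's output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StaffSkills (Name TEXT, email TEXT, department TEXT, Skill TEXT);
    INSERT INTO StaffSkills VALUES
        ('Arun',  'emailid', 'Tech team', 'Technical'),
        ('Arun',  'emailid', 'Tech team', 'Leadership'),
        ('Arun',  'emailid', 'Tech team', 'Decision Making'),
        ('Binay', 'emailid', 'Marketing', 'Technical'),
        ('Binay', 'emailid', 'Marketing', 'Decision Making');
""")

# Group on the columns that repeat and concatenate the one that varies.
rows = conn.execute("""
    SELECT Name, email, department, GROUP_CONCAT(Skill, ', ') AS Skills
    FROM StaffSkills
    GROUP BY Name, email, department
    ORDER BY Name
""").fetchall()

for row in rows:
    print(row)
```

SQLite does not promise any particular order of the skills inside the combined string.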
Hi, I'm in the midst of an Access 2003 to SQL server 2000 upsizing project and have come across a table on Sql Server that has a field that looks like it's supposed to be the PK but it contains duplicates. What I'd like to do is to have a cursor start at the first value and increment the next value by 1. Could someone explain how I'd go about this?
I have a very large table that can contain 3 to 5 duplicates of a record. Every month around 100,000 new records come in. Sometimes a duplicate is an amended record, other times it is just duplicated by error.
Is it possible to keep the latest record dumped into the table and delete the others? Does SQL track the order of the data being dropped into the table?
The layout would look like this. There are 10-15 other columns in the table where adjustments can also be made.
Lease#   Year   Month   Production
12345    2008   10      1,231
12345    2008   10      1,250
12345    2008   10      1,250
I'm trying to pull records from a source/staging table that has a duplicate row in it. I don't need to remove it, as the requirement is garbage in / garbage out. When I query the mart and join the fact and dimension tables, I'm not getting this duplicate record because I'm using DISTINCT / GROUP BY. If I remove that, the query returns more than 3000 rows, which is not correct. Is there a way I can keep these duplicates without removing the GROUP BY? I'm using the correct joins and filters.
-- declared variables
declare @database_name varchar(100), @table_name varchar(100), @primary_key_field varchar(100)
declare @list varchar(8000)
-- set values to variables
set @list = ''
set @database_name = 'data200802_dan'
set @table_name = 'other02'
set @primary_key_field = 'callid'
use database
select @list = @list + column_name + ', '
from information_schema.columns
where table_name = @table_name --table name
and column_name != @primary_key_field --unique identifier
select @list = substring(@list, 1, len(rtrim(@list)) - 1)
--above 5 lines btw came from a helper in the msdn forum. thanks
SELECT DISTINCT @list INTO '#' + @table_name FROM @table_name
@table_name + ':'
IF (SELECT COUNT(*) FROM @database_name + '.dbo.' + @table_name) = 0
BEGIN
    INSERT INTO @database_name + '.dbo.' + @table_name + '(' + @list + ')'
    SELECT @list FROM '#' + @table_name
END
ELSE
BEGIN
    DELETE @database_name + '.dbo.' + @table_name +' ( ' + @list + ')'
    GOTO @table_name
END
DROP TABLE '#' + @table_name
The query above is basically selecting all the fields from a table in the database without their primary key, putting them in a temp table, deleting all the records in the original table, then pasting the records from the temp table back into the original table.
Is there a way for this to work? I don't know how to use the variables with this script. Please help me correct this query.
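Table and column names cannot be supplied as ordinary variables inside a statement, so the statement text has to be assembled first and then executed. The sketch below shows that flow in Python against SQLite, which stands in for SQL Server: `PRAGMA table_info` plays the role of `information_schema.columns`, and the sample columns `caller` and `duration` are invented. In T-SQL the same idea would mean building the statement in a varchar and running it with `EXEC` or `sp_executesql`, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other02 (callid INTEGER PRIMARY KEY, caller TEXT, duration INTEGER);
    INSERT INTO other02 (caller, duration) VALUES ('ann', 5), ('ann', 5), ('bob', 7);
""")

table_name = "other02"
primary_key_field = "callid"

# Build the column list without the primary key (row[1] is the column name).
columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table_name})")
           if row[1] != primary_key_field]
column_list = ", ".join(columns)

# Identifiers are spliced into the text, so they must come from trusted values.
conn.executescript(f"""
    CREATE TEMP TABLE dedup AS SELECT DISTINCT {column_list} FROM {table_name};
    DELETE FROM {table_name};
    INSERT INTO {table_name} ({column_list}) SELECT {column_list} FROM dedup;
    DROP TABLE dedup;
""")

remaining_rows = conn.execute(
    f"SELECT {column_list} FROM {table_name} ORDER BY caller").fetchall()
print(remaining_rows)
```

Note that the primary key values are regenerated on re-insert, so anything referencing the old `callid` values would break.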
I have a query which finds duplicate spec_items linked to a work order. What I want to do is remove the duplicates (in some cases there will be more than one), leaving only the record with the highest [sr.id].
select sr.id, sr.linked_to_worknumber, sr.spec_checklist_id
from spec_checklist_remind sr
inner join spec_checklist_remind sc on sc.linked_to_worknumber = sr.linked_to_worknumber
group by sr.id, sr.linked_to_worknumber, sr.spec_checklist_id
Having sr.spec_checklist_id = 30 and count(*) > 1
order by sr.linked_to_worknumber
I have an existing stored table with duplicate rows that I want to delete. Using a CTE gives me:
WITH CTE AS (
    SELECT rn = ROW_NUMBER() OVER (PARTITION BY employeeid, dateofincident, typeid, description ORDER BY Id ASC), *
    FROM dbo.TableName
)
DELETE FROM cte WHERE rn > 1
This is what I want to do, basically. But this is only deleting in my CTE. Is there any way I can update my existing table "TableName" with this, without using temp tables?
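As far as I know, on SQL Server a `DELETE` issued through a single-table CTE like the one above does remove the rows from the underlying table, so it may already be doing the job. For engines that do not allow that, the same `ROW_NUMBER()` logic can be moved into a subquery; the sketch below shows that form on SQLite (3.25 or later for window functions) with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableName (
        Id INTEGER PRIMARY KEY, employeeid INTEGER, dateofincident TEXT,
        typeid INTEGER, description TEXT);
    INSERT INTO TableName (employeeid, dateofincident, typeid, description) VALUES
        (10, '2020-01-01', 1, 'slip'),
        (10, '2020-01-01', 1, 'slip'),
        (10, '2020-01-01', 1, 'slip'),
        (11, '2020-01-02', 2, 'fall');
""")

# Number the rows inside each duplicate group, then delete everything but rn = 1.
conn.execute("""
    DELETE FROM TableName
    WHERE Id IN (
        SELECT Id FROM (
            SELECT Id,
                   ROW_NUMBER() OVER (
                       PARTITION BY employeeid, dateofincident, typeid, description
                       ORDER BY Id) AS rn
            FROM TableName)
        WHERE rn > 1)
""")

kept_ids = [row[0] for row in conn.execute("SELECT Id FROM TableName ORDER BY Id")]
print(kept_ids)
```

No temp table is involved in either form; the row numbering is computed on the fly.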
I currently have two tables called Book and JournalPaper, both of which have a column called Publisher. Currently the data in the Publisher column is the Publisher name that is entered straight into either table and has been duplicated in many cases. To tidy this up I have created a new table called Publisher where each entry will have a unique ID.
I now want to remove the Publisher columns from Book and JournalPaper, replace it with an ID foreign key column and move the Publisher name data into the Publisher table. Is there a way I can do this without duplicating the data as some publishers appear several times on both tables?
Any help with this will be greatly appreciated as my limited SQL is not up to this particular challenge!!! Thanks!
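A rough sketch of the migration, assuming the new Publisher table has `PublisherID` and `Name` columns (the key column names and the sample titles are made up). `UNION` without `ALL` collapses publishers that appear several times or in both tables. SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (BookID INTEGER PRIMARY KEY, Title TEXT, Publisher TEXT);
    CREATE TABLE JournalPaper (PaperID INTEGER PRIMARY KEY, Title TEXT, Publisher TEXT);
    CREATE TABLE Publisher (PublisherID INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    INSERT INTO Book (Title, Publisher) VALUES
        ('B1', 'Acme Press'), ('B2', 'Acme Press'), ('B3', 'Delta Books');
    INSERT INTO JournalPaper (Title, Publisher) VALUES
        ('P1', 'Acme Press'), ('P2', 'Omega Journals');

    -- UNION (not UNION ALL) yields each publisher name exactly once.
    INSERT INTO Publisher (Name)
        SELECT Publisher FROM Book
        UNION
        SELECT Publisher FROM JournalPaper;

    -- Add the foreign key columns and fill them by matching on the name.
    ALTER TABLE Book ADD COLUMN PublisherID INTEGER REFERENCES Publisher(PublisherID);
    ALTER TABLE JournalPaper ADD COLUMN PublisherID INTEGER REFERENCES Publisher(PublisherID);
    UPDATE Book SET PublisherID =
        (SELECT PublisherID FROM Publisher WHERE Name = Book.Publisher);
    UPDATE JournalPaper SET PublisherID =
        (SELECT PublisherID FROM Publisher WHERE Name = JournalPaper.Publisher);
""")

publisher_names = [r[0] for r in conn.execute("SELECT Name FROM Publisher ORDER BY Name")]
book_links = conn.execute("""
    SELECT Title, Name FROM Book
    JOIN Publisher ON Publisher.PublisherID = Book.PublisherID
    ORDER BY Title
""").fetchall()
print(publisher_names, book_links)
```

Once the new columns are verified, the old Publisher text columns can be dropped. Names that differ only in spelling or spacing will still end up as separate publishers and would need cleaning by hand.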
I have a table here. I want to find a way of getting the latest date when the code is the same. If the Declined date is null, then I still want the latest date, e.g. ID 3.
If the declined date is filled in, then I want to get the row only when the Datein column value is greater than the declined date.
I tried grouping it by max date, but I got an error message when trying this out, against the code
WHERE MAX(Datein) > Declined
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference. What do I need to do to get both my outputs working?
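The table itself is not shown, so the following is a sketch under assumptions: a table called `Applications` with `ID`, `Code`, `Datein` and `Declined` columns and invented rows. The idea is to do the row-level comparison (`Datein > Declined`, or `Declined` is null) in the `WHERE` clause without any aggregate, and only then take `MAX(Datein)` per code. SQLite stands in for SQL Server.

```python
import sqlite3

# Applications and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Applications (ID INTEGER PRIMARY KEY, Code TEXT, Datein TEXT, Declined TEXT);
    INSERT INTO Applications (Code, Datein, Declined) VALUES
        ('A', '2015-01-01', NULL),
        ('A', '2015-02-01', NULL),
        ('B', '2015-03-01', NULL),
        ('C', '2015-01-10', '2015-02-01'),
        ('C', '2015-03-05', '2015-02-01'),
        ('D', '2015-01-01', '2015-06-01');
""")

# Filter individual rows first, then aggregate what is left.
latest = conn.execute("""
    SELECT Code, MAX(Datein) AS LatestDatein
    FROM Applications
    WHERE Declined IS NULL OR Datein > Declined
    GROUP BY Code
    ORDER BY Code
""").fetchall()
print(latest)
```

Code D drops out entirely because its only row was received before the declined date. If a condition really has to be tested against the aggregate itself, it belongs in a `HAVING` clause instead.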
I have some ADO.NET code that inserts 7 parameters in a database (a date and 6 integers). I also use a self-incrementing ID, but the date is set as primary key because for each series of 6 numbers of a certain date there may only be 1 entry. Moreover, only 1 entry of 6 integers is possible for 2 days of the week (Tue and Fri). I manage to insert a row of data in the database, where the date is set as smalldatetime and displays as follows: 1/05/2007 0:00:00 in the table. I want to retrieve the series of numbers for a certain date that has been entered (without taking into account the hours and seconds). A where clause seems to be needed, but I don't know the syntax or can't find the right function. I use the following code to insert the row:
and the following code to get the row back (to put in arraylist):
"SELECT C1, C2, C3, C4, C5, C6 FROM Series WHERE (LDate = Today())"
WHERE LDate = '" + DateTime.Today.ToString() + "'"
Which is the correct syntax? Is there a better way to insert and select based on the date?
I don't get any error messages and the code executes fine, but I only get an empty datatable in my dataset (the table isn't looped for rows, I noticed while debugging). Today's date is in the database but isn't found by my T-SQL code, I think.
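Concatenating `DateTime.Today.ToString()` into the SQL text produces a culture-dependent string, which may not match what the server expects. A common alternative is to pass the date as a parameter and compare against a whole-day range. The sketch below shows that pattern in Python with SQLite standing in for SQL Server; the stored numbers are invented, and in ADO.NET the counterpart would be a `SqlParameter` rather than the `?` placeholders used here.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Series (
        ID INTEGER PRIMARY KEY,
        LDate TEXT UNIQUE,          -- one entry per date
        C1 INTEGER, C2 INTEGER, C3 INTEGER, C4 INTEGER, C5 INTEGER, C6 INTEGER
    )
""")
conn.execute(
    "INSERT INTO Series (LDate, C1, C2, C3, C4, C5, C6) VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("2007-05-01 00:00:00", 3, 11, 19, 24, 33, 40),
)

def numbers_for(day):
    """Return the six numbers stored for the given day, or None if absent."""
    start = day.isoformat() + " 00:00:00"
    end = (day + timedelta(days=1)).isoformat() + " 00:00:00"
    # Half-open range: from midnight of the day up to, not including, the next midnight.
    return conn.execute(
        "SELECT C1, C2, C3, C4, C5, C6 FROM Series WHERE LDate >= ? AND LDate < ?",
        (start, end),
    ).fetchone()

found = numbers_for(date(2007, 5, 1))
missing = numbers_for(date(2007, 5, 2))
print(found, missing)
```

The range comparison keeps working even if a time of day ever sneaks into the stored value.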
rowID   PersonID   Start Date   End Date
=====   ========   ==========   ==========
001     6575556    19/06/2013   09/07/2013
001     6575556    20/06/2013   12/07/2013
001     6575556    21/06/2013   12/07/2013
002     9478522    15/05/2013   18/05/2013
003     7753423    22/08/2013   01/09/2013
A person can have more than one start/end date, so I get the same row ID and Person ID several times when looking at their dates.
I want to display the most recent end date and associated data if there is more than one start/end date for the same person. I decided to do a self join with max Date aggregate using this against a main select from the Table1:
SELECT PersonID, MAX([End Date]) AS MaxEndDate FROM Table1 GROUP BY PersonID
And join it this way:
select RowID, PersonID, [End Date] FROM Table1 INNER JOIN ( SELECT PersonID, MAX([End Date]) AS MaxEndDate
[Code] ....
When I run the sub-query on its own it gives me a single row per PersonID with the max date, but on self-joining with Table1 I still get the duplicate values.
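The join is cut off in the post, so this is only a guess at the cause: if the join matches on PersonID alone, every row for the person comes back. Matching on the end date as well helps, but person 6575556 has two rows sharing the latest end date, so a tie-breaker is still needed. The sketch below keeps a row only when no "better" row exists for the same person, using the later start date as the (assumed) tie-breaker. SQLite stands in for SQL Server and the dates are stored in ISO format so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (RowID TEXT, PersonID INTEGER, [Start Date] TEXT, [End Date] TEXT);
    INSERT INTO Table1 VALUES
        ('001', 6575556, '2013-06-19', '2013-07-09'),
        ('001', 6575556, '2013-06-20', '2013-07-12'),
        ('001', 6575556, '2013-06-21', '2013-07-12'),
        ('002', 9478522, '2013-05-15', '2013-05-18'),
        ('003', 7753423, '2013-08-22', '2013-09-01');
""")

# Keep a row only if the same person has no row with a later end date,
# or the same end date and a later start date.
latest = conn.execute("""
    SELECT t.RowID, t.PersonID, t.[Start Date], t.[End Date]
    FROM Table1 AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM Table1 AS newer
        WHERE newer.PersonID = t.PersonID
          AND (newer.[End Date] > t.[End Date]
               OR (newer.[End Date] = t.[End Date]
                   AND newer.[Start Date] > t.[Start Date])))
    ORDER BY t.RowID
""").fetchall()

for row in latest:
    print(row)
```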
If I run the following select statement against the appropriate table, it returns the duplicate records in the result set. However, I want to embed an additional select statement in the query so that it returns only the records with the most current syscreated date.
Example of script I'm using---
select cmp_fadd1, syscreated, cmp_name, cmp_code
from cicmpy
where cmp_fadd1 in (select cmp_fadd1 from cicmpy group by cmp_fadd1 having count(1) = 2)
order by cmp_fadd1, syscreated desc
The result is:
Address               Syscreated date           Customer
1622 ONTARIO AVENUE   2005-06-15 22:19:45.000   RELIABLE PARTSLTD
1622 ONTARIO AVENUE   2004-01-22 18:10:05.000   RELIABLE PARTS LTD
PEI CENTER            2006-01-05 22:03:50.000   P.G. ENERGY
PEI CENTER            2004-01-22 17:57:56.000   P.G. ENERGY
From this I want to be able to select ONLY those records with the most current syscreated date or those records with 2005-06-15 and 2006-01-05
I have a table containing typed log entries. One log entry is supposed to be created every twelve hours, but sometimes there are gaps. I need to create a report showing the time of entry, and the actual log entry. I can't just list the contents of the log table, because if I do that there will be dates missing. Instead, when there isn't a log entry for a date, I need to print the date, and then just leave the log entry blank.

The SQL below shows what the output should look like. HOWEVER, the code below makes use of a temp table containing all possible dates. My question is, is there a better way to do this - one that doesn't involve the temp table? Thanks in advance.

create table StationLog (LogDate datetime, LogText char(11))
insert StationLog values ('1/1/2005 00:00:00','entry one')
insert StationLog values ('1/1/2005 12:00:00','entry two')
insert StationLog values ('1/2/2005 00:00:00','entry three')
insert StationLog values ('1/3/2005 00:00:00','entry four')

create table Date_List (TempDate datetime)
insert Date_List values ('1/1/2005 00:00:00')
insert Date_List values ('1/1/2005 12:00:00')
insert Date_List values ('1/2/2005 00:00:00')
insert Date_List values ('1/2/2005 12:00:00')
insert Date_List values ('1/3/2005 00:00:00')
insert Date_List values ('1/3/2005 12:00:00')

select TempDate, LogText
from Date_List
left outer join StationLog on Date_List.TempDate = StationLog.LogDate

drop table StationLog
drop table Date_List
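One way to avoid the helper table is to generate the twelve-hour slots on the fly with a recursive CTE and left join the log onto them. The sketch below does that on SQLite, which stands in for SQL Server; the dates are stored in ISO format and the start and end of the range are hard-coded for the example. SQL Server supports recursive CTEs from 2005 onwards (written without the `RECURSIVE` keyword), so this would not help on older versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StationLog (LogDate TEXT, LogText TEXT);
    INSERT INTO StationLog VALUES
        ('2005-01-01 00:00:00', 'entry one'),
        ('2005-01-01 12:00:00', 'entry two'),
        ('2005-01-02 00:00:00', 'entry three'),
        ('2005-01-03 00:00:00', 'entry four');
""")

# Generate one row every 12 hours, then attach a log entry where one exists.
report = conn.execute("""
    WITH RECURSIVE slots(slot) AS (
        SELECT '2005-01-01 00:00:00'
        UNION ALL
        SELECT datetime(slot, '+12 hours') FROM slots
        WHERE slot < '2005-01-03 12:00:00'
    )
    SELECT slot, LogText
    FROM slots
    LEFT JOIN StationLog ON StationLog.LogDate = slot
    ORDER BY slot
""").fetchall()

for slot, text in report:
    print(slot, text or "")
```

A permanent calendar or numbers table is the other common answer; it trades a little storage for simpler queries.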
Hi -- What is the proper way to select results with a date in the where statement when the datatype of the column is datetime? select * from table where date_field = '3-23-2006' does not get any results.
Hello,

I have a couple of tables: the client table and the contacted table. I am not sure how to start on this. What I need is a way to query all my clients, then show any client whose last visit and/or last call was more than 30 days ago.

Now it gets confusing. Suppose the client was visited more than 30 days ago but was called only 10 days ago; I really would like to have this appear in the same query.

So the report would look similar to this below.

           Visit Date   Called Date
ClientA    2006-11-02   2006-12-16
ClientB    2006-12-17   2006-10-30
ClientC    2006-10-15   2006-10-16
ClientD

Fields (Simplified)
Clients: Name, Address, Phone.
Contacted: Name, Date, Visit, Call.

I need to query all names, but I only need the last visit and last phone call. Then determine if either date is more than 30 days ago; if so, display the last date of each type of contact. And if there is nothing for the client in the contacted table, this needs to show also (ClientD).

Any tips, ideas would be greatly appreciated....

Thanks
Ice
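A sketch of one way to get both "last" dates in a single query: left join the contacts onto the clients, take a conditional `MAX` per contact type, then filter on the result. It assumes Visit and Call are 0/1 flags, which the post does not say, and it fixes "today" at 2006-12-20 so the 30-day cutoff is 2006-11-20. ClientE and the extra contact rows are invented. SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clients (Name TEXT PRIMARY KEY);
    CREATE TABLE Contacted (Name TEXT, [Date] TEXT, Visit INTEGER, [Call] INTEGER);
    INSERT INTO Clients VALUES ('ClientA'), ('ClientB'), ('ClientC'), ('ClientD'), ('ClientE');
    INSERT INTO Contacted VALUES
        ('ClientA', '2006-10-01', 1, 0),
        ('ClientA', '2006-11-02', 1, 0),
        ('ClientA', '2006-12-16', 0, 1),
        ('ClientB', '2006-12-17', 1, 0),
        ('ClientB', '2006-10-30', 0, 1),
        ('ClientC', '2006-10-15', 1, 0),
        ('ClientC', '2006-10-16', 0, 1),
        ('ClientE', '2006-12-18', 1, 0),
        ('ClientE', '2006-12-19', 0, 1);
""")

cutoff = "2006-11-20"  # 30 days before the assumed report date

# Inner query: last visit and last call per client (NULL when there is none).
# Outer query: keep clients where either date is missing or older than the cutoff.
report = conn.execute("""
    SELECT Name, LastVisit, LastCall
    FROM (
        SELECT c.Name AS Name,
               MAX(CASE WHEN k.Visit = 1 THEN k.[Date] END) AS LastVisit,
               MAX(CASE WHEN k.[Call] = 1 THEN k.[Date] END) AS LastCall
        FROM Clients AS c
        LEFT JOIN Contacted AS k ON k.Name = c.Name
        GROUP BY c.Name
    )
    WHERE LastVisit IS NULL OR LastVisit < ?
       OR LastCall IS NULL OR LastCall < ?
    ORDER BY Name
""", (cutoff, cutoff)).fetchall()

for row in report:
    print(row)
```

ClientE is left out because both of its contacts are recent; ClientD appears with two empty dates because of the left join.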
I'm having a mental block on a select statement.

I have a table with the following columns:

KitId int
LotNo varchar(10)
DateReceived smalldatetime

It is possible to have many rows with the same LotNo but a differing DateReceived. I need to write a select statement that returns the KitId for a given LotNo with the earliest DateReceived.

So if I had rows:

KitId = 1, LotNo = 123, DateReceived = 11th May 2008
KitId = 2, LotNo = 123, DateReceived = 28th May 2008
KitId = 3, LotNo = 125, DateReceived = 28th May 2008
KitId = 4, LotNo = 127, DateReceived = 28th May 2008
KitId = 5, LotNo = 123, DateReceived = 12th June 2008

I would want to retrieve KitId = 1 if I provided LotNo 123 as a parameter.

Whilst it should always be the case that the LotNo with the earliest date will have the lowest KitId, I cannot guarantee that will be the case, so going for the lowest KitId isn't an option.

Can one of you SQL gurus provide me with the statement I need?

Thanks
Neil
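A small sketch of the "earliest row for a given lot" lookup: filter on the lot, sort by the date, and take the first row. The table name `Kits` is made up, and SQLite stands in for SQL Server, where `SELECT TOP 1 ... ORDER BY DateReceived` plays the role of `LIMIT 1`.

```python
import sqlite3

# Kits is a hypothetical table name; dates are stored in ISO format.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Kits (KitId INTEGER PRIMARY KEY, LotNo TEXT, DateReceived TEXT);
    INSERT INTO Kits VALUES
        (1, '123', '2008-05-11'),
        (2, '123', '2008-05-28'),
        (3, '125', '2008-05-28'),
        (4, '127', '2008-05-28'),
        (5, '123', '2008-06-12');
""")

def earliest_kit(lot_no):
    """Return the KitId with the earliest DateReceived for the lot, or None."""
    row = conn.execute(
        "SELECT KitId FROM Kits WHERE LotNo = ? ORDER BY DateReceived ASC LIMIT 1",
        (lot_no,),
    ).fetchone()
    return row[0] if row else None

kit_id = earliest_kit("123")
print(kit_id)
```

Sorting on the date rather than on KitId means the answer does not depend on the ids having been issued in date order.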
I have a table with a Date Of Birth column whose datatype is smalldatetime. I am looking for a SQL statement that takes a from date and a to date as parameters and selects the rows whose date of birth (day and month) falls between them.
So I have this query where I need to get the average of about five different dates. Is there any way to do this, or am I screwed? I looked at using the AVG function, but SQL Server 2005 did not like that.
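Averaging dates generally means converting them to numbers, averaging those, and converting back. The sketch below shows the idea on SQLite, where `julianday()` gives a day number and `date()` turns it back; the `Events` table and its dates are invented. On SQL Server the usual counterpart is to cast the datetime to float before `AVG` and cast the result back, which is not exercised here.

```python
import sqlite3

# Events is a hypothetical table holding the five dates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events (EventDate TEXT);
    INSERT INTO Events VALUES
        ('2005-01-01'), ('2005-01-02'), ('2005-01-03'), ('2005-01-04'), ('2005-01-10');
""")

# Dates become day numbers, the numbers are averaged, the mean becomes a date again.
average_date = conn.execute(
    "SELECT date(AVG(julianday(EventDate))) FROM Events"
).fetchone()[0]
print(average_date)
```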
Hello all. I am using the following query to pull back data. The MergeHistory table has a column named DateMerged. I am looking to pull back the one record with the most recent DateMerged. I have managed to get the query as far as below, but I'm not sure how to select the most recent one. Can anyone help with this? I was told it may be along the lines of SELECT TOP 1 or something?
INSERT INTO @List (IndexID, IndexName, MergeSystem, Status, DateCreated, CreatedBy, DataTag, MergedDate)
SELECT DISTINCT RT.IndexId,
    isnull(dbo.ufn_GetBestIdentifier(RT.IndexId), dbo.ufn_GetBestVirtualIdentifier(RT.IndexId)),
    dbo.ufn_GetEntitySystemName(RT.IndexId),
    RT.Status,
    CONVERT(varchar, RT.DateCreated, 106) as DateCreated,
    RT.CreatedBy,
    RT.DataTag,
    MH.MergedDate
FROM @resulttable AS RT, MergeHistory AS MH
WHERE RT.IndexId = MH.EntityID
I've been wrestling with this problem for a while, but my newbie SQL skills are no match for it, so I'm hoping somebody here can point me in the right direction.
I have the following table, called AccountPayments:
I would like to select all the entries where the payment date is, at the latest, the 7th day of the month following the one in which the invoice was issued.
In other words: If the invoice date is in January, I would like to select all the entries where the payment date is February 7th at the latest. If the invoice date is in February, I would like to select all the entries where the payment date is March 7th at the latest.
So, for the above table, I would like to get the following result:
Does anybody know if it's possible to do this? I'm working with SQL Server 2000 and have been playing around with dateadd, but I can't seem to figure it out.
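The sample table did not survive in the post, so this sketch assumes `InvoiceDate` and `PaymentDate` columns and invents the rows. The cutoff is built by going to the first day of the invoice month, adding one month, then adding six days. SQLite stands in for SQL Server here; in T-SQL the same cutoff can probably be written as `DATEADD(month, DATEDIFF(month, 0, InvoiceDate) + 1, 6)`, which is untested in this sketch.

```python
import sqlite3

# Column names and rows are assumptions; the original table is not shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountPayments (
        PaymentID INTEGER PRIMARY KEY, InvoiceDate TEXT, PaymentDate TEXT);
    INSERT INTO AccountPayments (InvoiceDate, PaymentDate) VALUES
        ('2005-01-15', '2005-02-07'),
        ('2005-01-31', '2005-02-08'),
        ('2005-02-10', '2005-03-01'),
        ('2005-12-20', '2006-01-07');
""")

# Cutoff = 7th day of the month after the invoice month.
on_time = conn.execute("""
    SELECT PaymentID, InvoiceDate, PaymentDate
    FROM AccountPayments
    WHERE PaymentDate <= date(InvoiceDate, 'start of month', '+1 month', '+6 days')
    ORDER BY PaymentID
""").fetchall()

for row in on_time:
    print(row)
```

Starting from the first of the month before adding a month avoids surprises with invoices dated the 29th to 31st, and the December invoice rolls over into January of the next year.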
Good Morning,

I have a view that contains rate information, contractIDs, and effective dates. I need to select the rate info based on contractID and date. I can provide a date and contractID, and I need to select the rate info for that contract where the effective date is <= date provided. I need the 1 record that is closest to that date. I am thinking something with max() perhaps. Any ideas? The <= effective date will return several rows, I just need the one closest to the date I provide.

Thanks for any advice,
CK
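Following the `max()` hunch, here is a sketch: a subquery finds the latest effective date on or before the supplied date for the contract, and the outer query fetches the row carrying that date. The view is represented by a made-up `Rates` table with invented rows, and SQLite stands in for SQL Server.

```python
import sqlite3

# Rates stands in for the view; names and values are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rates (ContractID INTEGER, EffectiveDate TEXT, Rate REAL);
    INSERT INTO Rates VALUES
        (7, '2006-01-01', 10.0),
        (7, '2006-06-01', 12.5),
        (7, '2007-01-01', 15.0),
        (8, '2006-03-01', 9.0);
""")

def rate_as_of(contract_id, as_of):
    """Return the rate row in force for the contract on the given date, or None."""
    return conn.execute("""
        SELECT ContractID, EffectiveDate, Rate
        FROM Rates
        WHERE ContractID = ?
          AND EffectiveDate = (
              SELECT MAX(EffectiveDate) FROM Rates
              WHERE ContractID = ? AND EffectiveDate <= ?)
    """, (contract_id, contract_id, as_of)).fetchone()

rate_row = rate_as_of(7, "2006-09-15")
print(rate_row)
```

If a contract could have two rows with the same effective date, this would return more than one candidate and a further tie-breaker would be needed.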