T-SQL (SS2K8) :: Delete Duplicates From Table Based On Two Columns?
May 20, 2015
Assuming I have a table similar to the following:
Auto_ID Account_ID Account_Name Account_Contact Priority
1 3453463 Tire Co Doug 1
2 4363763 Computers Inc Sam 1
3 7857433 Safety First Heather 1
4 2326743 Car Dept Clark 1
5 2342567 Sales Force Amy 1
6 4363763 Computers Inc Jamie 2
7 2326743 Car Dept Jenn 2
I'm trying to delete all duplicate Account_IDs, but only for the highest priority (in this case it would be the lowest number).
I know the following would delete duplicate Account_IDs:
DELETE FROM staging_account
WHERE auto_id NOT IN
(SELECT MAX(auto_id)
FROM staging_account
GROUP BY account_id)
The problem is this doesn't take into account the priority; in the above example I would want to keep auto_ids 2 and 4 because they have a higher priority (1) than auto_ids 6 and 7 (priority 2).
How can I take priority into account and still remove duplicates in this scenario?
I am using CROSS APPLY instead of UNPIVOT to unpivot > one column. I am wondering if I can dynamically replace column names based on different tables? The example code that I have working is based on the "Allergy" table. I have thirty more specialty tables to go. I'll show the working code first, then an example of another table's columns to show differences:
select [uplift specialty], [member po],[practice unit name], [final nomination status] ,[final uplift status], [final rank], [final uplift percentage] ,practiceID=row_number() over (partition by [practice unit name] order by Metricname) ,metricname,Metricvalue, metricpercentilerank
[code]....
Rheumatology Table:The columns that vary start with "GDR" and [GDR Percentile Rank] so I'm just showing those:
I have a patient record and emergency contact information. I need to find duplicate phone numbers in emergency contact table based on relationship type (RelationType0 between emergency contact and patient. For example, if patient was a child and has mother listed twice with same number, I need to filter these records. The case would be true if there was a father listed, in any cases there should be one father or one mother listed for patient regardless. The link between patient and emergency contact is person_gu. If two siblings linked to same person_gu, there should be still one emergency contact listed.
Below is the schema structure:
Person_Info: PersonID, Person Info contains everyone (patient, vistor, Emergecy contact) First and last names Patient_Info: PatientID, table contains patient ID and other information Patient_PersonRelation: Person_ID, patientID, RelationType Address: Contains address of all person and patient (key PersonID) Phone: Contains phone # of everyone (key is personID)
The goal to find matching phone for same person based on relationship type (If siblings, then only list one record for parent because the matching phones are not duplicates).
I have a table with 22 million Business records. I can see that there are duplicates when I group by BusinessName and Address and Phone. I'd like to place only the duplicates into a table, with a ranking, oldest business key gets a ranking of 1.
As a bonus I'd like each group to have a distinct group name (although not necessary, just want to know how to do this)
Later after I run more verifications to make sure these are not referenced elsewhere I'll delete everything with a matchRank > 1 out of the main Business table.
DROP TABLE [dbo].[TestBusiness]; GO CREATE TABLE [dbo].[TestBusiness]( [Business_pk] INT IDENTITY(1,1) NOT NULL, [BusinessName] VARCHAR (200) NOT NULL, [Address] VARCHAR(MAX) NOT NULL,
;WITH ctePreAgg AS ( select top 500 act_reference "ActivityRef", row_number() over (partition by act_reference order by act_reference) as rowno, t3.s_initials "Initials" from mytablestuff order by act_reference
[code]...
But what I would love to do next is take each of the above rows - and return the initials either in one column with all the nulls and duplicate values removed, separated by a comma ..
OR the above but using variable number of columns based on the maximum number of different initials for each row.this is not strictly required, but maybe neater for further work on the view
create table #myfirst (id int, city varchar(20)) insert into #myfirst values (500,'Newyork') insert into #myfirst values (100,'Ediosn') insert into #myfirst values (200,'Atlanta') insert into #myfirst values (300,'Greenwoods') insert into #myfirst values (400,'Hitchcok') insert into #myfirst values (700,'Walmart') insert into #myfirst values (800,'Madida')
-- My Second Data
create table #mySecond (id int, city varchar(20),Sector varchar(2)) insert into #mySecond values (1500,'Newyork','MK') insert into #mySecond values (5500,'Ediosn','HH') insert into #mySecond values (5060,'The Atlanta','JK') insert into #mySecond values (7500,'The Greenwoods','DF') insert into #mySecond values (9500,'Metro','KK') insert into #mySecond values (3300,'Kilapr','MK') insert into #mySecond values (9500,'Metro','NH')
--Third Second Data
create table #myThird (id int, city varchar(20),Sector varchar(2)) insert into #myThird values (33,'Walmart','PP') insert into #myThird values (20,'Ediosn','DD') select f.*,s.Sector from #myfirst f join #mySecond s on f.city = s.city /* idcitySector 500NewyorkMK 100EdiosnHH */
i have doubt on two things
1) How Can i compare the City names, by eliminating 'The ' at the beginning (if there is any in second tale city) between first and second
2) after comparing first and second if there is no match found in second them want to compare with third table values for those not found
--i tried below to solve first doubt, it is working but want to know any other wasys to do it
select f.*,s.Sector from #myfirst f join #mySecond s on replace (f.city, 'THE ','')= replace (s.city, 'THE ','')
--Expected results wull be
create table #ExpectResults (id int, city varchar(20),Sector varchar(2)) insert into #ExpectResults values (200,'Atlanta','JK') insert into #ExpectResults values (100,'Ediosn','HH') insert into #ExpectResults values (300,'Greenwoods','DF') insert into #ExpectResults values (500,'Newyork','MK') insert into #ExpectResults values (700, 'Walmart','PP') insert into #ExpectResults values (800, 'Madidar','')
I am trying to pivot data based on columns value in year column... but results are not showing up correctly. I want to see all columns after pivot.I want to Pivot based on year shown in the data but it can be dynamic as year can go for last 3 years
I am also using an inner join as i have two amount columns in my code and i want to show both amount columns for all displayed year.I am able to pivot but I need in output all the columns like this Id,MainDate, Year1,Year2,Year3(if any), AMT1 for YR1, AMT2 for Yr1, , AMT1 for YR2, AMT2 for Yr2, AMT1 for YR3, AMT2 for Yr3,
Here is some data:
-- CREATE TABLE [dbo].[TEMP]( --[FileType] [varchar](19) NOT NULL, --[dType] [char](2) NOT NULL, --[dVersion] [char](2) NOT NULL, --[Id] [char](25) NOT NULL, --[MainDate] [char](40) NULL,
I have two tables that are related by keys. For instance,Table employee {last_name char(40) not null,first_name char(40) not null,department_name char(40) not null,age int not null,...}Employee table has a primary key (combination of last_name and first_name).Table address {last_name char(40) not null,first_name char(40) not null,street char(200) not null,city char(100) not null,...}Address table has a primary key (combination of last_name, first_name andstreet in which (last_name, first_name) reference (last_name, first_name) inemployee table.Now I want to delete some rows in Address table based on department_name inEmployee table. What is sql for this delete?I appreciate your help. Please ignore table design and I just use it for myproblem illustration.Jim
with actividades_secundarias as ( select a.*, r.Antigo, r.Novo, rn = row_number() over (PARTITION BY a.nif_antigo, r.novo ORDER BY a.nif_antigo) from ACT_SECUNDARIAS a inner join ((select
[Code] ....
I want to make a delete statement like this:
select * into #table1 from actvidades_secundarias where rn>1 Delete from act_secundarias where act_secundarias.nif_antigo = #table1.nif_antigo and act_secundarias.cod_cae = #table1.cod_cae
Table Master --------------------------- EmpID EffectiveDateFr Group 00001 1/1/2014 A 00001 1/5/2014 B 00001 1/9/2014 C 00001 2/1/2014 B 00001 2/20/2014 A ....
I want to create query the output should be:
EmpID TransDate Group 00001 1/1/2014 A 00001 1/2/2014 A 00001 1/3/2104 A 00001 1/4/2014 A 00001 1/5/2014 B 00001 1/6/2014 B 00001 1/15/2014 C 00001 2/1/2014 B 00001 2/2/2014 B 00001 2/20/2004 A
I am getting runtime error for the below simple execution of sp even I tried casting the @dt to varchar still getting the same error. I want to delete the table based on the 2 parameter ...
1. Table Name first parameter 2. InCondition is the column name which of type datetime
CREATE PROCEDURE dbo.[DeleteTable](@InTblName NVARCHAR(250),@InCondition NVARCHAR(250)) AS BEGIN DECLARE @DeleteSQL NVARCHAR(250) DECLARE @Dt DATETIME SET @Dt = GETDATE() SELECT @DeleteSQL = N'DELETE FROM ' + @InTblName +' WHERE '+@InCondition+ '=''' + @Dt+ '''' SELECT(@DeleteSQL) EXECUTE sp_executesql @DeleteSQL END
I have corrected the code now, I am not getting the output it is throwing error at run time.
EXEC [DeleteTable] 'TABLE_NAME','COLUMN_NAME'
Where column_name is of datatype datetime
Msg 241, Level 16, State 1, Procedure spDeleteTable, Line 8 Conversion failed when converting date and/or time from character string.
Delete records from a table; keying off of a colum (expire date) thats data-type is datetime.
I would like to use command getdate less 2 years. So - I want to delete all records from a table where the expire date is 2 years older than the current date.
I need to retrieve the pts from table #test1 based on the fields when mapped..
Name & Type columns should match ..Based on the data below the output should be
id NameTypeCode CityIType BPS 1EGFN432 HY F2 10 2 EG FN 432 ON F1 20 3 EG FN 433 On F1 30
It has order of priority to get points
1.When all of the field matches then get the respective pts . 2.When name and Type matched and rest fields doesnt match but is null in #test2 table then retrieve those pts
--drop table #test --drop table #test1 Create table #test ( id int identity(1,1) not null, Name varchar(100) NULL,
For example we've got a row from [Formula_Calc] table 'F1+F3' as a string that needs to be transformed as 240+160=400
The below code works for the above example but if I pick 'F11+F3' instead , returns 2561 which comes from 2401+16. Probably replaces F1 value instead of F11 and adds 1st digit (1) if I got it right ...
DECLARE @formula NVARCHAR(100); DECLARE @Total NVARCHAR(100); SET @formula = 'F11+F3';
SELECT @formula = REPLACE(@formula,RowNo,Total) FROM [Totals]
I have around 3 tables having around 20 to 30gb of data. My table A related to table B by a FK and same way table B related to table C by FK. I would like to delete all rows satisfying certain condition from table A and all corresponding related records from table B and C. I have created a query to delete the grandchild first, followed by child table and finally parent. I have used inner join in my delete query. As you all know, inner join delete operations, are going to be extremely resource Intensive especially on bigger tables.
What is the best approach to delete all these rows? There are many constraints, triggers on these tables. Also, there might be some FK relations to other tables as well.
I have one table say A and in which 4 columns are there. Out of 4 , one columns stores the queries like 'select * from table xyz' etc(Only select queries). I am writing a procedure in which I have to fetch this column and execute the query and wants to check whether query i.e. "select * from table xyz" contains any record or not. If yes , I am updating the table B with value as Pass , else Fail.
I used execute @queryfromvariable but it does not gives me count..
I have 4 rows below in file tblTEST, and I want to be able to transfer the CODE from the MAIN location to the INT location (replacing all existing "A" codes), preceeded by an "I".
ID LOC CODE -- ----- ------ 11 MAIN B 11 INT A 22 MAIN C 22 INT A
I want the result to be:
ID LOC CODE -- ----- ------ 11 MAIN B 11 INT IB 22 MAIN C 22 INT IC
I am stumped as to how to do this - any help or advice would be appreciated.
The only thing I've come up with is:
UPDATE S SET s.code = B.code FROM tbltest B LEFT OUTER JOIN tbltest S ON B.id = S.id WHERE (S.loc = 'INT')
My requirement is below.enhancing the T- sql query as I was told not to use SSIS.
Insert data from Table A to Table B with conditions:
1. Truncate gender to one character if necessary.
2. Convert character fields to uppercase as necessary.
3. For systems that supply both residential and mailing addresses, use the residential address if available (both street_address and zip fields have value), mailing address otherwise.
In SSIS I took conditional split with 'ISNULL(res_street_address) == TRUE || ISNULL(res_zip) == TRUE '
default outputname :Consider Res Address; Outputname:Consider mail address.
and mapped as:
(Table A) mail_street_address---street address(Table B)
(Table A) mail_city----------------City(Table B)
(Table A) mail_Zip----------------Zip(Table B)
(Table A) mail_state-------------state(Table B)
(Table A) res_street_address--street address(Table B)
(Table A) res_city---------------City(Table B)
(Table A) res_Zip----------------Zip(Table B)
(Table A) res_state--------------state(Table B)
I want to do the same with T-sql code too:
I came up with below T-SQl but unable to pick(street,city,state,zip columns as I have take combination of street and zip from Table A not individual columns as I wrote in below query) based on above condition(3):
Insert into TABLE B SELECT Stats_ID ,UPPER(first_name) first_name ,UPPER(middle_name )middle_name ,UPPER(last_name) last_name ,UPPER(name_suffix) name_suffix
lets say i have 100 employees on Table A and I have 10 employees on Table B. I want to delete all the employees in Table A that are in Table B. Delete From TableA Where EmpNum = (Select EmpNum From TableB)The above SQL wont work but i am looking for something similar. any ideas?