Need To Find Instances Of Duplicates Within A Column; Joining 2 Tables.
Aug 22, 2007
My basic situation is this - I ONLY want duplicates, so the opposite
of DISTINCT:
I have two tables. Ordinarily, Table1ColumnA corresponds in a one to
one ratio with Table2ColumnB through a shared variable. So if I query
TableB using the shared variable, there really should only be on
record returned. In essence, if I run this and return TWO rows, it is
very bad:
select * from TableB where SharedVariable = 1234
I know how to join the tables on a single record to see if this is the
case with one record, but I need to find out how many, among possibly
millions of records this affects.
Every record in Table1ColumnA (and also the shared variable) will be
unique. There is another column in Table1 (I'll call it
Table1ColumnC) that will be duplicated if the record in Table2 is a
duplicate, so I am trying to use that to filter my results in Table1.
I am looking to see how many from Table1 map to DUPLICATE instances in
Table2.
I need to be able to say, in effect, "how many unique records in
Table1ColumnA that have a duplicate in Table1ColumnC also have a
duplicate in Table2ColumnB?"
Every sunday, new data will be loaded from temptable to main table. I have to make sure that, duplicates does not get loaded from temptable to maintable.
For example, if last sunday a record gets loaded from temp to main. If this sunday also the same record is present then it means that is a duplicate.
The duplicate is decided on below scenario
select 'CodeChanges: ', count(*) from CodeChanges a, CodeChanges_Temp b where a.AccountNumber = b.AccountNumber and a.HexaNumber = b.HexaNumber and a.HexaEffDate = b.HexaEffDate and a.HexaId = b.HexaId and
[Code] ...
Yesterday (Sunday) , data from temp got loaded onto maintable but with duplicates.
There is a log which just displays number of duplicates.
Yesterday the log displayed 8 duplicates found. I need to find out the 8 duplicates which got loaded yesterday and delete it off from main table.
There is a column in both tables which is 'creation date and time'. Every Sunday when the load happens this column will have that day's date .
Now i need to find out what are all the duplicates which got loaded on this sunday.
The total rows in temp table is : 363 No of duplicates present is : 8
I used below query to find out the duplicates but it is returning all the 363 rows from the maintable instead of the 8 duplicates.
Select 'CodeChanges: ', * from CodeChanges a where exists ( Select 1 from CodeChanges_Temp b where a.HexaNumber = b.HexaNumber and a.HexaEffDate = b.HexaEffDate and
[Code] ...
Need finding the duplicate records which has creation date time as '2015-11-01 00:00:00.000' and all the above columns mentioned in the query matches.
If you need to inner join 2 tables that have some columns names that are the same, how can you have those columns be named differently in the query result without aliasing them individually?
Tried select a.*,b.* from tbldm a,tblap b where a.id=b.id hoping the col names in the result would have the a.s and b.s in front of them but they didn't.
I am attempting to run the following select statement joining multiple tables but in the end result I would like only Distinct/Unique values to be returned in the invlod.lodnum column.
[select pw.schbat, adrmst.adrnam, adrmst.adrln1, adrmst.adrcty, adrmst.adrstc, adrmst.adrpsz, invlod.lodnum, shipment.host_ext_id, shipment_line.ordnum, car_move.car_move_id from aremst join locmst on (aremst.arecod = locmst.arecod) and (aremst.wh_id = locmst.wh_id)
I am trying to join two tables and looks like the data is messed up. I want to split the rows into columns as there is more than one value in the row. But somehow I don't see a pattern in here to split the rows.
This how the data is
Create Table #Sample (Numbers Varchar(MAX)) Insert INTO #Sample Values('1000') Insert INTO #Sample Values ('1024 AND 1025') Insert INTO #Sample Values ('109 ,110,111') Insert INTO #Sample Values ('Old # 1033 replaced with new Invoice # 1544') Insert INTO #Sample Values ('1355 Cancelled and Invoice 1922 added') Select * from #Sample
This is what is expected...
Create Table #Result (Numbers Varchar(MAX)) Insert INTO #Result Values('1000') Insert INTO #Result Values ('1024') Insert INTO #Result Values ('1025') Insert INTO #Result Values ('109') Insert INTO #Result Values ('110')
[Code] ....
How I can implement this ? I believe if there are any numbers I need to split into two columns .
I am trying to query for a common value in a column called "file_auth_nbr" in 30 different tables. I was going to try something like this (see below) but wasn't sure if this was the most efficient, fastest, or correct, way to get what I'm looking for:
Select distinct a.file_auth_nbr from table1 as a join table2 as b on a.file_auth_nbr = b.file_auth_nbr join table3 as c on a.file_auth_nbr = c.file_auth_nbr join table4 as d on a.file_auth_nbr = d.file_auth_nbr join table5 as e on a.file_auth_nbr = e.file_auth_nbr ......etc., etc.
I want to write SQL that will search the tables in a database for a specific column, like this. For instance, I have a column "Unique_ID" that is in many of our tables (hundreds) but not in others and want to find out the tables it is in. It is always the first column.
I tried to find a system stored procdure to do this but couldn't and tried to create a script using the sysobjects and syscolumns tables in the Master db, but came to a roadblock because they don't seem to be related at all.
I would surely appreciate if someone else has already done this!
Hi , I have two tables within a SQL database. The 1st table has an identified column and column which lists one of more email identifers for a second table, e.g. ID Email -- ---------- 1 AS1 AS11 2 AS2 AS3 AS4 AS5 3 AS6 AS7
The second table has a column which has an email identifier and another column which lists one email address for that particular identifier, e.g. ID EmailAddress --- ------------------ AS1 abcstu@emc.com AS2 abcstu2@emc.com AS3 abcstu3@emc.com AS4 abcstu4@em.com AS5 abcstu5@emc.com AS6 abcstu6@emc.com AS7 abcstu7@emc.com AS11 abcstu8@emc.com I need to create a stored procedure or function that: 1. Selects an Email from the first table, based on a valid ID, 2. Splits the Email field of the first table (using the space separator) so that there is an array of Emails and then, 3. Selects the relevant EmailAddress value from the second table, based on a valid Email stored in the array Is there any way that this can be done directly within SQL Server using a stored procedure/function without having to use cursors?
I have Two Database that exist on Two seperate servers. The two database contain same schema and contains tables and columns of same name. Some tables have slight differences in terms of data types or Data type lenght.
For example if a Table on ServerA has a column named - CustomerSale with Varchar (100, Null) and a table on ServerB has a column named CustomerSale with Varchar (60, Null), how can i find if other columns have similar differences in all tables with the same name and columns in the two servers.
I am using SQL Server 2005. And the Two Servers are Linked Servers
What Script can i use to accomplish this task. Thanks
Hi, I have table which stores the fund name and its data. We get quarterly information from the fund co. Suppose if the user wants to add a fund thats not in our database we let then add a ClientFundId and a FundName. But may be after sometime the fund company may add that fund in the next quarter.. So how do i get rid of Duplicated Data.. In the ClientFundId column we can a 9 letter Aplhanumeric or a 5 letter character but if the fund co.. provides those values the 5 letter characters are stored in Ticker column and the 9 letter words are stored in Cusip column.. So i just wrote this query hoping i could retrieve the duplicate values but it didnt list any..but i found one this is my query.. Select FundId, Cusip, Ticker, ClientFundId, FundName, ShortName From Fund Where
ClientFundId = Ticker or ClientFundId = Cusip Any help will appreciated Thanks Karen
I am trying to find when a name has been entered more than once into 1 database table.
I'm currently doing something like this (can't remember exactly, not at work)
SELECT COUNT(*) AS Cnt, Name FROM tblTable GROUP BY Name ORDER BY Cnt Desc
This brings back all the Names in the database and tells me which are duplicates but I want to just have the results of the duplicate values and not the single values.
does someone have a querry to display the duplicate records in a table.
Table: zipcode dma
My data upload is failing because there is a primary key on zipcode and the source data (42k records) has about 50 duplicate zipcode records in it. It is possible that there is a unique combo of zipcode / dma but I need to identify the duplicate records to determine that.
I have this script bellow which does what it is supposed to. However it only outputs the cust_id. I want it to show all the columns in the table. How would I do this?
SELECT cust_id FROM cust_table WHERE cust_name in ('Billy','John') and rownum < 100 GROUP BY cust_id HAVING COUNT(*) > 1;
For our ETL process, we maintain a TransformationList table that has the source view and the destination table. Data is copied from the view into the table (INSERT INTO). I am trying to find column names in the Views that are not column names in the associated Table.
In the below example, want to end up with three records:
I have it almost working, except that there is a table, ChangeColPrefix table, that is used by the ETL process to change some of the view's column name prefixes. Some of the source views have column names with prefixes that do not match the destination table column names. Say view SouthBase has all the column names prefixed with SB - like SBAcct, SBName. And the Destination table of Area District has ADAcct, ADName. There would be a row in the ChangeColPrefix for SouthBase, SB, AD, 1, 2 that would be used by the ETL process to create the INSERT INTO Area District From SouthBase.
I need to use this ChangeColPreifx to find my unmatching columns between my source views and destination tables. With out that table SBAcct and SBName from SouthBase will not appear to match the columns of ADAcct and ADName, but they do match.
I want to end up with these three records as non-matching:
View1, Column4 View2, Column4 View2, Column5
View1 has Salumn2 and View2 has Salumn5, and they must be changed to Column2 and Column5 as per the ChangeColPrefix table before running the Select from INFORMATION_SCHEMA.COLUMNS EXCEPT Select from INFORMATION_SCHEMA.COLUMNS looking for unmatched columns.
/***** Set Up Test Data *****/ -- Create 2 test views IF EXISTS(SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[View1]')) DROP VIEW dbo.[View1] GO CREATE VIEW View1 AS SELECT '1' AS Column1 , '2' AS Salumn2 , '4' AS Column4;
Hi everybody I need help on finding duplicates and deleting the duplicate record depending on name and fname , deleting the duplicates and leaving only the first one.
my PERSON table is this below:
ID name fname ownerid id2
1 a b 2 c c 3 e f 4 a b 1 10 5 c c 2 11
I have this query below that returns records 1 and 4 and 2 and 5 since they have the same name and fname
select * from ( Select name ,fname, count(1) as cnt from PERSON group by name,Fname ) where cnt > 1
ID name fname ownerid id2
1 a b 4 a b 1 10
2 c c 5 c c 2 11
With this result I need to delete the second record of each group but update the first records with the ownerid and id2 of the second record that would be deleted... I don't know how to proceed with this..
Serial Count 001 2 the count is 2 because Serial 001 has an MSDSID of 20 and 22 002 1 the count is 1 because Serial 002 only has MSDSID 21 003 2 the count is 2 because Serial 003 has an MSDSID of 21 and 22 004 1 the count is 1 because Serial 002 only has MSDSID 23
It would be even better if the results just showed where the count is greater than 1.
I have this query below that I created to do a count, but I don't think this is what I needed.
I need to find the duplicates. Example, if
CLI_ID1 12345 has 4 CLIP records, each CLIP record should have a different CLIP rank. I need to find scenarios where 2 (or more) of the CLIP records have the same CLIP RANK. If there are duplicate CLIP_RANKs within the same CLI_ID,
Select Distinct cli_id1, count(clip_rank) countrank FROM impact.dbo.CLI LEFT JOIN impact.dbo.CLIO ON CLI.CLI_ID1 = CLIO.clio_id1
left join impact.dbo.clip ON cli_id1 = clip_id1 Where (clio_trm = '' or clio_trm = NULL or clio_trm is null) group by cli_id1 order by cli_id1
I need to make a selection on join datasets with 2 conditions and populate the results in another dataset(Report).It is working with the fist condition "AccountingTypeCharacteristicCodeId = 3"...
I have a table that is made up of the sum of medical, mental health and pharmacy claims. I would like to query that to find instances when the sum of the three claims types are greater than a predetermined threshold.
For example: Patient 1 Medical = 10,000 (could be 10 records at 1,000 each) Patient 1 Mental Health = 5,000 Patient 1 Pharmacy = 15,000 Patient 2 Medical = 1,000 Patient 2 Mental Health = 0 Patient 2 Pharmacy = 500
Threshold is 25,000
If I queried the above sample table I would get one record: Patient 1 30,000 - because 10,000+5,000+15,000 = 30,000 and is greater than the threshold.
I am not sure that a having clause would work though.
I have a patient record and emergency contact information. I need to find duplicate phone numbers in emergency contact table based on relationship type (RelationType0 between emergency contact and patient. For example, if patient was a child and has mother listed twice with same number, I need to filter these records. The case would be true if there was a father listed, in any cases there should be one father or one mother listed for patient regardless. The link between patient and emergency contact is person_gu. If two siblings linked to same person_gu, there should be still one emergency contact listed.
Below is the schema structure:
Person_Info: PersonID, Person Info contains everyone (patient, vistor, Emergecy contact) First and last names Patient_Info: PatientID, table contains patient ID and other information Patient_PersonRelation: Person_ID, patientID, RelationType Address: Contains address of all person and patient (key PersonID) Phone: Contains phone # of everyone (key is personID)
The goal to find matching phone for same person based on relationship type (If siblings, then only list one record for parent because the matching phones are not duplicates).
I want to find a way to get partition info for all the tables in all the databases for a server. Showing database name, table name, schema name, partition by (maybe; year, month, day, number, alpha), column used in partition, current active partition, last partition (for date partitions I want to know if the partition goes untill 2007, so I can add 2008)
all I've come up with so far is:
Code Block
SELECT distinct o.name From sys.partitions p inner join sys.objects o on (o.object_id = p.object_id) where o.type_desc = 'USER_TABLE' and p.partition_number > 1
I am trying to find all instances of a string that contain the letters FSC. While I am able to find most of them, I am unable to find the ones wrapped in double quotes.
Query example:
Select * from myTable Where item like '%FSC%'
This works great with the exception of when FSC is surrounded by double quotes.
I have created 3 views, which I then want to join to produce an overall result. The first view returns customer details, along with payment information. The next two views return values only when the customer has purchased extras outside our standard product i.e. if there is no purchase of an extra, then nothing is written to the extra's table. When I join the views together they only return values where data has been matched in all 3 views i.e. extra's have been purchased. Any data that did not match in all 3 view (i.e. no extra's purchased) is either ignored or dropped from the results. So I need my script to return all values even if no data exists in the two extra views.
My scripts are as follows: Main View SELECT CUSTOMER_POLICY_DETAILS.POLICY_DETAILS_ID, CUSTOMER_POLICY_DETAILS.HISTORY_ID, CUSTOMER_POLICY_DETAILS.AUTHORISATIONUSER, CUSTOMER_POLICY_DETAILS.AUTHORISATIONDATE, ACCOUNTS_TRANSACTION.TRANSACTION_CODE_ID, CUSTOMER_INSURED_PARTY.SURNAME, SYSTEM_INSURER.INSURER_DEBUG, SYSTEM_SCHEME_NAME.SCHEMENAME, CUSTOMER_POLICY_DETAILS.POLICYNUMBER, --TotalPayable IsNull(SUM(CASE LIST_TRAN_BREAKDOWN_TYPE.IncludeInTotal WHEN 1 THEN ACCOUNTS_TRAN_BREAKDOWN.AMOUNT ELSE 0 END), 0) AS TotalPayable, --NetPremium IsNull(SUM(CASE ACCOUNTS_TRAN_BREAKDOWN.Tran_Breakdown_Type_ID WHEN 'NET' THEN ACCOUNTS_TRAN_BREAKDOWN.AMOUNT ELSE 0 END), 0) AS NetPremium, --IPT IsNull(SUM(CASE WHEN SubString(ACCOUNTS_TRAN_BREAKDOWN.Premium_Section_ID, 1, 3) = 'TAX' THEN ACCOUNTS_TRAN_BREAKDOWN.AMOUNT ELSE 0 END), 0) AS IPT, --Fee IsNull(SUM(CASE ACCOUNTS_TRAN_BREAKDOWN.Tran_Breakdown_Type_ID WHEN 'FEE' THEN ACCOUNTS_TRAN_BREAKDOWN.AMOUNT ELSE 0 END), 0) AS Fee, --TotalCommission IsNull(SUM(CASE WHEN SubString(ACCOUNTS_TRAN_BREAKDOWN.Tran_Breakdown_Type_ID, 4, 4) = 'COMM' THEN ACCOUNTS_TRAN_BREAKDOWN.AMOUNT ELSE 0 END), 0) AS TotalCommission
FROM ACCOUNTS_CLIENT_TRAN_LINK INNER JOIN ACCOUNTS_TRANSACTION ON ACCOUNTS_CLIENT_TRAN_LINK.TRANSACTION_ID = ACCOUNTS_TRANSACTION.TRANSACTION_ID INNER JOIN ACCOUNTS_TRAN_BREAKDOWN ON ACCOUNTS_TRANSACTION.TRANSACTION_ID = ACCOUNTS_TRAN_BREAKDOWN.TRANSACTION_ID INNER JOIN LIST_TRAN_BREAKDOWN_TYPE ON ACCOUNTS_TRAN_BREAKDOWN.TRAN_BREAKDOWN_TYPE_ID = LIST_TRAN_BREAKDOWN_TYPE.TRAN_BREAKDOWN_TYPE_ID INNER JOIN CUSTOMER_POLICY_DETAILS ON CUSTOMER_POLICY_DETAILS.POLICY_DETAILS_ID = ACCOUNTS_CLIENT_TRAN_LINK.POLICY_DETAILS_ID AND CUSTOMER_POLICY_DETAILS.HISTORY_ID = ACCOUNTS_CLIENT_TRAN_LINK.POLICY_DETAILS_HISTORY_ID INNER JOIN SYSTEM_INSURER ON CUSTOMER_POLICY_DETAILS.INSURER_ID = SYSTEM_INSURER.INSURER_ID INNER JOIN SYSTEM_SCHEME_NAME ON CUSTOMER_POLICY_DETAILS.SCHEMETABLE_ID = SYSTEM_SCHEME_NAME.SCHEMETABLE_ID INNER JOIN CUSTOMER_INSURED_PARTY ON ACCOUNTS_CLIENT_TRAN_LINK.INSURED_PARTY_HISTORY_ID = CUSTOMER_INSURED_PARTY.HISTORY_ID AND ACCOUNTS_CLIENT_TRAN_LINK.INSURED_PARTY_ID = CUSTOMER_INSURED_PARTY.INSURED_PARTY_ID WHERE CUSTOMER_POLICY_DETAILS.AUTHORISATIONDATE = '2007-08-17' AND ACCOUNTS_TRANSACTION.TRANSACTION_CODE_ID <> 'PAY'
GROUP BY CUSTOMER_POLICY_DETAILS.POLICY_DETAILS_ID, CUSTOMER_POLICY_DETAILS.HISTORY_ID, CUSTOMER_POLICY_DETAILS.AUTHORISATIONUSER, CUSTOMER_POLICY_DETAILS.AUTHORISATIONDATE, ACCOUNTS_TRANSACTION.TRANSACTION_CODE_ID, CUSTOMER_INSURED_PARTY.SURNAME, SYSTEM_INSURER.INSURER_DEBUG, SYSTEM_SCHEME_NAME.SCHEMENAME, ACCOUNTS_TRANSACTION.Transaction_ID, CUSTOMER_POLICY_DETAILS.POLICYNUMBER
Add on View 1 CREATE VIEW TOPCARDPA AS select policy_details_id, History_id, Selected from customer_addon where product_addon_id = 'TRPCAE01'
Add on View 2 CREATE VIEW TOPCARDRESC AS select policy_details_id, History_id, Selected from customer_addon where product_addon_id = 'HICRESC01'
Join Result Script SELECT TOPCARD.AUTHORISATIONUSER, TOPCARD.AUTHORISATIONDATE, TOPCARD.TRANSACTION_CODE_ID, TOPCARD.SURNAME, TOPCARD.INSURER_DEBUG, TOPCARD.SCHEMENAME, TOPCARD.POLICYNUMBER, TOPCARD.TotalPayable, TOPCARD.NetPremium, TOPCARD.IPT, TOPCARD.Fee, TOPCARD.TotalCommission, TOPCARDPA.SELECTED, TOPCARDRESC.SELECTED FROM dbo.TOPCARD TOPCARD INNER JOIN dbo.TOPCARDPA TOPCARDPA ON TOPCARD.POLICY_DETAILS_ID = TOPCARDPA.POLICY_DETAILS_ID AND TOPCARD.HISTORY_ID = TOPCARDPA.HISTORY_ID INNER JOIN dbo.TOPCARDRESC TOPCARDRESC ON TOPCARD.POLICY_DETAILS_ID = TOPCARDRESC.POLICY_DETAILS_ID AND TOPCARD.HISTORY_ID = TOPCARDRESC.HISTORY_ID
I have included all the scripts I have used, as others may find them useful, in addition to anyone that is able to provide me with some assistance. Thanks in advance for for the help.
Hi. I'm new to SQL, and need to join 2 tables... any hints??? table1:id (int)title(varchar(50))body(text) table2:id (int)title(varchar(50))body(text) somehow need to get the id, which table the record is from, and the title and body... so if the tables had the information: table1:id title body1 "first title" "first body"2 "second title" "second body"3 "third title" "third body" table2:id title body1 "first title" "first body"2 "second title" "second body"3 "third title" "third body" I would like to get... id table title body3 1 "third title" "third body"3 2 "third title" "third body"2 1 "second title" "second body"2 2 "second title" "second body"1 1 "first title" "first body"1 2 "first title" "first body" Does anyone know how to get this? I am fairly flexible if i need to change things... cheers, eh!
Hello everyone,I'm starting a new project right now and am trying to cut down on the number of stored procedures and tables I'm gonna have to use and I have run into a dead end.Up till now I have been doing the following: Say I had a PRODUCTS table with a DesignId column and ColorId column. I would then create a DESIGN table (Name, Id) and a COLOR table (Name, Id) to INNER JOIN with the two columns in my PRODUCTS table. And the same goes for all my other tables: ORDERS, CUSTOMERS, LINKS etc...... And in the end I would have a lot of tables and stored procedures for these category columns. So I thought, it would be nice to just have a Categories and Subcategories table for all my category columns for the whole website. That way every time I need to define a category column for any table I can simply just add the values to my Categories and Subcategories table instead of having to create a new table for every category column. Everything is fine and dandy except for trying to INNER JOIN these two tables with more than one column. To get values for one column is no problem:<code> SELECT *, _SubCategories.SubCategoryNameFROM _ProductsINNER JOIN _SubCategoriesON _Products.DesignId = _SubCategories.SubCategoryIdWHERE DesignId = COALESCE (@DesignId, DesignId)</code> But how do you INNER JOIN the ColorId column as well. Both DesignId and ColorId values are in my _SubCategories table. In a stored procedure: Is there any way to create a table and columns. Run a loop statement, with one INNER JOIN . Rerun another loop statement with a new INNER JOIN statement? Would that work or does any one else have an idea what would?Thank you guys for the help. It is much appreciated. Alec
Hello all, I have two datatables "customersReached " and "customersGuessed " and I want to combine them into one table only, the problem is that one table exeeded to the other by two fields, so what can I do??????? Mahmoudona
I've been trying to think about how I can do this. I have forums that I have written built around SQL Server. Basicly you have:
-A users Table -A Posts Table -A Replies Table.
Posts and replies have very similar structures. I'd like to be able to merge them and pick out the earliest post for said forum.
1 - is there a way to merge them so that the post date for both the replies and posts tables is contained in 1 column. If not is there a better alternative.
I'd also like to add indexing to the posts so I can do paging. Is there a way for me to add an index number to them while I can sort them anyway i want.