Deduplication Issue
Dec 28, 2007
Here is my scenario:
1. I have millions of transactional rows in the target table (no business key).
2. Source data (100 columns) needs to be verified against the target data column by column to prevent inserting duplicate data into the target.
3. A T-SQL script with an equi-join comparing all 100 columns takes too long to process.
I have read articles about using RANK, SCD, and table diff, but they did not really help me.
We are not using CHECKSUM because of the ~1% data collision rate.
Is there an efficient way to handle the column-by-column comparison?
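One common approach is to let EXCEPT do the all-columns comparison instead of a 100-column equi-join. This is only a sketch; the table names (stg_source, tgt_fact) and column list are placeholders for your actual schema. Note that EXCEPT treats NULL as equal to NULL, which a plain equi-join does not, so it avoids wrapping every column in ISNULL:

```sql
-- Sketch only: stg_source / tgt_fact and the column list are placeholder names.
-- EXCEPT compares every selected column and treats NULL = NULL as a match,
-- so only rows not already present in the target survive.
INSERT INTO tgt_fact (col1, col2, col3 /* ... all 100 columns ... */)
SELECT col1, col2, col3 /* ... all 100 columns ... */
FROM stg_source
EXCEPT
SELECT col1, col2, col3 /* ... all 100 columns ... */
FROM tgt_fact;
```

If the concern with CHECKSUM is its collision rate, HASHBYTES with SHA-1 or similar over a concatenation of the columns has a far lower collision probability and can serve as a narrow comparison key, at the cost of computing and storing the hash.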
Apr 13, 2006
Hi, everyone,
I have a fact table DEVICE with the following structure:
DEVICE_NAME VARCHAR(50)
DEVICE_DATE DATETIME
DEVICE_NUMBER INT
Where DEVICE_NAME and DEVICE_DATE form a PRIMARY KEY
I would like to import a text file with the same information into this table.
My problem is that the text file contains records which will violate my primary key constraint. In that case, I would only insert the record whose DEVICE_NUMBER is not equal to zero, and discard and log the others.
In the case where more than one of the records violating the primary key constraint has DEVICE_NUMBER not equal to zero, discard them all and log it.
Does anyone have a good suggestion on this?
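One way to express that rule is to load the text file into a staging table first, then use window aggregates to decide which rows are unambiguous. This is a sketch under the assumption of a staging table named stg_device (an invented name) with the same three columns; the log step is shown as a second insert with the inverted predicate:

```sql
-- Sketch only: stg_device and DEVICE_LOG are assumed/placeholder tables.
-- A row is safe to insert when its (DEVICE_NAME, DEVICE_DATE) key is unique
-- in the file, or when it is the single row in its key group whose
-- DEVICE_NUMBER is non-zero.
;WITH grouped AS (
    SELECT DEVICE_NAME, DEVICE_DATE, DEVICE_NUMBER,
           COUNT(*) OVER (PARTITION BY DEVICE_NAME, DEVICE_DATE) AS key_count,
           SUM(CASE WHEN DEVICE_NUMBER <> 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY DEVICE_NAME, DEVICE_DATE) AS nonzero_count
    FROM stg_device
)
INSERT INTO DEVICE (DEVICE_NAME, DEVICE_DATE, DEVICE_NUMBER)
SELECT DEVICE_NAME, DEVICE_DATE, DEVICE_NUMBER
FROM grouped
WHERE key_count = 1
   OR (nonzero_count = 1 AND DEVICE_NUMBER <> 0);
```

The discarded rows (duplicate keys where the row is zero-numbered, or where several rows in the group are non-zero) can be captured by selecting from the same CTE with the opposite WHERE clause into a log table. In SSIS, the same split could be done with an Aggregate plus Conditional Split before the destination.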