Deduplication Issue

Dec 28, 2007

Here is my scenario:

1. I have millions of transactional rows in the target table (no business key).

2. The source data (100 columns) needs to be verified against the target data column by column, so that no duplicate rows are inserted into the target.

3. A T-SQL script with an equi-join comparing all 100 columns takes a long time to process.

I have read articles about using RANK, SCD, and tablediff, but they did not really help me.
We are not using CHECKSUM because of the 1% data collision.

Is there an efficient way to handle column-by-column comparison?
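
One idea that may fit this scenario, sketched in Python purely for illustration (the thread itself is about T-SQL): hash every row with a strong digest, use the digest only as a lookup key, and run the full column-by-column comparison only on rows whose digests match. A digest collision then costs one extra comparison instead of a wrongly skipped row. The names `row_digest` and `find_new_rows` are hypothetical, and the sketch assumes each column holds values of a single type. On SQL Server the same idea could probably be built on `HASHBYTES` with an indexed hash column, but that is an untested assumption here.

```python
import hashlib
from collections import defaultdict


def row_digest(row):
    """Return a SHA-256 digest of a row given as a tuple of column values."""
    h = hashlib.sha256()
    for value in row:
        # Mark NULLs and length-prefix every value so that
        # ("ab", "c") and ("a", "bc") do not produce the same digest.
        marker = b"N" if value is None else b"V"
        data = b"" if value is None else str(value).encode("utf-8")
        h.update(marker + len(data).to_bytes(4, "big") + data)
    return h.digest()


def find_new_rows(source_rows, target_rows):
    """Return the source rows that have no exact duplicate in the target."""
    # Index the target rows by digest (the cheap pre-filter).
    index = defaultdict(list)
    for row in target_rows:
        index[row_digest(row)].append(row)

    new_rows = []
    for row in source_rows:
        digest = row_digest(row)
        # Confirm column by column, but only against rows sharing the digest.
        if any(row == candidate for candidate in index[digest]):
            continue
        new_rows.append(row)
        # Register the row so later duplicates inside the source are skipped too.
        index[digest].append(row)
    return new_rows
```

With this shape the expensive 100-column comparison runs only for the few candidate rows that share a digest, rather than for every source/target pair.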


How To Do Deduplication On A Fact Table

Apr 13, 2006

Hi, everyone,

I have a fact table DEVICE with the following structure:

DEVICE_NAME VARCHAR(50)
DEVICE_DATE DATETIME
DEVICE_NUMBER INT
where DEVICE_NAME and DEVICE_DATE form the PRIMARY KEY.

I would like to import a text file with the same information into this table.

My problem is that the text file contains records which will violate my primary key constraint. In that case, I would insert only the record with DEVICE_NUMBER not equal to zero, and discard and log the others.

If the records that violate the primary key constraint both have DEVICE_NUMBER not equal to zero, discard both and log them.


Does anyone have a good suggestion on this?
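
The rules above could be sketched as a pre-load filter, shown here in Python for illustration only. The function name `split_import` and the tuple layout `(DEVICE_NAME, DEVICE_DATE, DEVICE_NUMBER)` are hypothetical. Two things are assumptions because the post does not cover them: key conflicts are checked only among the records in the file (not against rows already in DEVICE), and a group of conflicting records that are all zero is discarded and logged as a whole.

```python
from collections import defaultdict


def split_import(records):
    """Split (name, date, number) records into (to_insert, to_log)."""
    # Group the incoming records by the primary key (DEVICE_NAME, DEVICE_DATE).
    groups = defaultdict(list)
    for rec in records:
        name, date, _number = rec
        groups[(name, date)].append(rec)

    to_insert, to_log = [], []
    for recs in groups.values():
        if len(recs) == 1:
            # No key conflict: insert as-is.
            to_insert.append(recs[0])
            continue
        nonzero = [r for r in recs if r[2] != 0]
        if len(nonzero) == 1:
            # Exactly one non-zero record: keep it, log the zero ones.
            to_insert.append(nonzero[0])
            to_log.extend(r for r in recs if r[2] == 0)
        else:
            # Several non-zero records (or, by assumption, none): log them all.
            to_log.extend(recs)
    return to_insert, to_log
```

The `to_insert` list can then be loaded without tripping the primary key, while `to_log` goes to whatever reject table or log file is in use.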
