VERY Challenging Question
May 30, 2006
input: a table of 1.5 million user records with 4 nvarchar
fields: A, B, C, D
the problem: there are many records with duplicate A's, or duplicate
B's, or duplicate A+B's, or duplicate B+C+D's & so on. Mathematically
there are 16-1 = 15 possible field combinations on which records can be
duplicated.
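To make the count concrete, here is a small sketch in Python (not part of the original setup; the name field_combinations is just for illustration) that lists every non-empty combination of the four fields. There are 2^4 = 16 subsets, minus the empty one.

```python
from itertools import combinations

FIELDS = ("A", "B", "C", "D")

def field_combinations(fields=FIELDS):
    """Return every non-empty subset of the fields, smallest subsets first."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]

combos = field_combinations()
print(len(combos))
for combo in combos:
    print("+".join(combo))
```

Note that two records which match on a multi-field combination such as A+B necessarily match on A alone as well, so every one of these combinations is already covered by the four single-field checks.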
aim: find the duplicates & filter them, leave only the unique users
which don't have ANY duplication.
We can do it with a simple select query that logically checks for
duplication using the OR operator.
But it takes about 16 days on a very fast PC.
The DB is in SQL Server; converting it to Oracle might bring that down
to 8 days.
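For reference, the kind of query meant here might look roughly like the sketch below. This is an assumption about its shape, not the exact query: the table name users, the id column and the function name unique_users_naive are made up for the example, and it runs on an in-memory SQLite database through Python's sqlite3 module instead of SQL Server. Every row is compared against every other row, which is the likely reason it scales so badly to 1.5 million records.

```python
import sqlite3

def unique_users_naive(rows):
    """Return rows that share no A, B, C or D value with any other row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, A TEXT, B TEXT, C TEXT, D TEXT)"
    )
    conn.executemany("INSERT INTO users (A, B, C, D) VALUES (?, ?, ?, ?)", rows)
    # A match on any single field also covers the multi-field combinations,
    # so four comparisons joined by OR are enough.
    query = """
        SELECT u.A, u.B, u.C, u.D
        FROM users AS u
        WHERE NOT EXISTS (
            SELECT 1 FROM users AS v
            WHERE v.id <> u.id
              AND (v.A = u.A OR v.B = u.B OR v.C = u.C OR v.D = u.D)
        )
        ORDER BY u.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Made-up sample data: rows 1 and 2 share A, rows 2 and 3 share C.
sample = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d2"),
    ("a3", "b3", "c2", "d3"),
    ("a4", "b4", "c4", "d4"),
]
print(unique_users_naive(sample))
```

On the sample data only the last row survives, since it is the only one that shares no value with any other record.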
How can I do it in a few hours?
Remember that filtering the users first by parameter A & then by
parameter B & so on will give a wrong final result, because it
will lose the information about the users already filtered out - maybe in
parameter C they are equal to other users in the table...
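A tiny illustration of this pitfall, as a Python sketch on made-up data (the function names are only for this example). The first function filters field by field and recounts on the survivors, which is the mistake described above. The second counts each field over the full table before removing anything, which is one possible way around it and needs only one pass per field instead of comparing every pair of records.

```python
from collections import Counter

# Made-up sample data: rows 1 and 2 share A, rows 2 and 3 share C.
rows = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d2"),
    ("a3", "b3", "c2", "d3"),
    ("a4", "b4", "c4", "d4"),
]

def filter_sequentially(rows):
    """Flawed: drops duplicates one field at a time, recounting on survivors."""
    survivors = list(rows)
    for i in range(4):
        counts = Counter(r[i] for r in survivors)
        survivors = [r for r in survivors if counts[r[i]] == 1]
    return survivors

def filter_against_full_table(rows):
    """Counts every field over the full table before removing any row."""
    counts = [Counter(r[i] for r in rows) for i in range(4)]
    return [r for r in rows if all(counts[i][r[i]] == 1 for i in range(4))]

print(filter_sequentially(rows))
print(filter_against_full_table(rows))
```

The sequential version wrongly keeps the third row: its C value "c2" looks unique only because the second row was already removed in the A step. Counting against the full table keeps just the last row.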
THANK YOU