Dupes In My Db

Feb 5, 2000

Greetings!

I have a database with several million records. I have found dupes and need to get rid of them while keeping one original copy of each in the db, i.e. delete all but one. Any ideas for an easy way to do this?


Thanks,
Jimmy Ipock, MCSE, MCP+I
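
One common way to do "delete all but one" is to number the rows inside each group of duplicates and remove everything numbered above 1. The sketch below is only an illustration under assumptions: dbo.Records, ColA and ColB are hypothetical names (the post does not give the schema), and ROW_NUMBER() requires SQL Server 2005 or later.

```sql
-- Hypothetical names: dbo.Records is the table, ColA and ColB are the
-- columns that together define what counts as a duplicate.
-- Requires SQL Server 2005 or later (ROW_NUMBER and CTEs).
WITH Ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY ColA, ColB   -- rows equal on these columns are dupes
               ORDER BY ColA             -- arbitrary order: any one copy survives
           ) AS rn
    FROM dbo.Records
)
DELETE FROM Ranked
WHERE rn > 1;   -- keep row 1 of each group, delete the rest
```

Replacing the DELETE with SELECT * FROM Ranked WHERE rn > 1 (after adding the columns you want to see to the CTE) is a cheap way to preview what would be removed before running it against several million rows.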

Dupes From A Join :(

Dec 9, 2001

I have 3 tables, and I'm doing a join like:
select top 10 thits.fhits as hits, tmain.fheadline as rubrik, tmain.fpubfile as pub
from thits
join tmain on tmain.postid=tHits.postid
join tkeyscat on tkeyscat.postid=tmain.postid
where tkeyscat.fkeycat=60 order by hits desc

Which works great (almost).

The problem is when an article in tmain is categorized in more than
one category, so the join tkeyscat on tkeyscat.postid=tmain.postid
might return more than one row for it.

I'm trying to select the 10 most read articles from tmain/thits where the article is in keycat 60.

How can I solve this?

tia
/frax
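
A possible way around the extra rows is to stop joining tkeyscat and only test whether a matching row exists, since an EXISTS test cannot multiply the result. This is a sketch, not a verified fix: it assumes thits holds one row per postid, which the post does not state.

```sql
-- Same tables and columns as the query in the post.
-- Assumption: thits has exactly one row per postid.
SELECT TOP 10
       thits.fhits     AS hits,
       tmain.fheadline AS rubrik,
       tmain.fpubfile  AS pub
FROM thits
JOIN tmain ON tmain.postid = thits.postid
WHERE EXISTS (SELECT 1                      -- true once, however many rows match
              FROM tkeyscat
              WHERE tkeyscat.postid  = tmain.postid
                AND tkeyscat.fkeycat = 60)
ORDER BY hits DESC;
```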

Urgent - Dupes..

Apr 6, 2001

Hi,

I need to delete duplicate rows in a table. I'd like sound logic and an example for solving this. Please help me with this.

urs
vj
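
The usual logic has two steps: first find the groups that occur more than once, then delete every row except one chosen survivor per group. The sketch below assumes hypothetical names (dbo.MyTable, ColA, ColB) and, for the delete, a unique non-null Id column; none of these come from the post.

```sql
-- Step 1: list each duplicated combination and how many copies it has.
-- dbo.MyTable, ColA, ColB are hypothetical names.
SELECT ColA, ColB, COUNT(*) AS copies
FROM dbo.MyTable
GROUP BY ColA, ColB
HAVING COUNT(*) > 1
ORDER BY copies DESC;

-- Step 2: keep the row with the lowest Id in each group, delete the others.
-- Assumes Id is a unique, non-null column (hypothetical).
DELETE FROM dbo.MyTable
WHERE Id NOT IN (SELECT MIN(Id)
                 FROM dbo.MyTable
                 GROUP BY ColA, ColB);
```

Step 2 uses only GROUP BY and a subquery, so it should also run on versions that predate ROW_NUMBER().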

Help Eliminate Dupes

Nov 30, 2006

I am VERY new to SQL and I am having a heck of a time building a script to find and remove duplicate entries.

Here is the table structure.


CREATE TABLE [dbo].[SecurityEvents](
[EventLog] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[RecordNumber] [int] NULL,
[TimeGenerated] [datetime] NULL,
[TimeWritten] [datetime] NULL,
[EventID] [int] NULL,
[EventType] [int] NULL,
[EventTypeName] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[EventCategory] [int] NULL,
[EventCategoryName] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[SourceName] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Strings] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ComputerName] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[SID] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Message] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Data] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]

GO

This small script seems to eliminate the dupes, but I can't seem to figure out how to properly replace the table with the output of the script, with all the dupes gone.


select distinct * from dbo.SecurityEventsTest where recordnumber IN
(select recordnumber from dbo.SecurityEvents)
order by recordnumber

Could someone help??

Thank You,

John Fuhrman
http://www.titangs.com
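
One way to replace a table's contents with its distinct rows is to copy them into a working table, empty the original, and copy them back. The sketch below assumes the goal is to de-duplicate dbo.SecurityEvents itself; dbo.SecurityEvents_Dedup is a hypothetical working-table name.

```sql
-- Copy one instance of every distinct row into a working table.
-- dbo.SecurityEvents_Dedup is a hypothetical name and must not exist yet.
SELECT DISTINCT *
INTO dbo.SecurityEvents_Dedup
FROM dbo.SecurityEvents;

-- Swap the contents inside a transaction so a failure leaves the original intact.
BEGIN TRANSACTION;
    TRUNCATE TABLE dbo.SecurityEvents;       -- use DELETE if foreign keys reference it
    INSERT INTO dbo.SecurityEvents
    SELECT * FROM dbo.SecurityEvents_Dedup;  -- same column order as the original
COMMIT TRANSACTION;

DROP TABLE dbo.SecurityEvents_Dedup;
```

Comparing SELECT COUNT(*) on both tables before the swap shows how many duplicate rows would disappear.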

Deleting Dupes In Special Cases

Feb 7, 2005

I need to delete all rows that match at least one of the account_id values of another row *and* that have the same email address. However, if rows have the same email address but none of the same account_id values, then I need to keep them. I've attached a sample dataset along with the expected results.

I have this:
DELETE [acctID_emailAddress_tmp] FROM [acctID_emailAddress_tmp]
JOIN
(select emailaddress, account_id, max(contact_id_tmp) max_cid
from [acctID_emailAddress_tmp]
group by emailaddress, account_id) AS tempImportTable
ON tempImportTable.[emailaddress] = [acctID_emailAddress_tmp].[emailaddress]
WHERE [acctID_emailAddress_tmp].[contact_id_tmp] < tempImportTable.[max_cid]
AND tempImportTable.[account_id] = [acctID_emailAddress_tmp].[account_id];

but it doesn't work since it's keeping the subset of the dupe row(s).

Can someone shed some light?

TIA

Help Finding And Updating Dupes In 2 Tables

Oct 13, 2005

Being fairly new to SQL and SQL scripting, I am at a loss on how to proceed on my issue.

I have an MSDE database with 2 tables that need to be modified. I am changing to a standard 12-digit code in my PATIENTS table for the field sChartCode (nvarchar). That code will be in the form 110012345678: 1100 will precede the actual 8-digit chart code.

In the PATIENTS table, the same person may be duplicated many times using variations such as 123456, 12345678, 012345678, 12345678 SMITH, 012345678 SMITH. Each of these records is linked to the RECORDS db through the field lPatientId (int).

I have already manually updated about 20K records in the RECORDS db, which takes way too many hours. New records will be imported at about 10K a week or so, and the total will be over 100K soon. By the way, the SQL server is on the way.

What I am looking for is an easier way to find the records that have not been converted in the PATIENTS db and see if they match one that has already been converted. If one does, it would need to update all records in the RECORDS db with the correct updated lPatientId and then delete the duplicate record(s) from the PATIENTS db. If not, it would only need to add '1100...' to the lPatientId field.

Any help or guidance that anybody can give will be most appreciated.

Dale
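
The steps described above could be sketched roughly as follows. Everything about the matching rule is an assumption, not something the post specifies: a row counts as converted when sChartCode is '1100' plus 8 digits, an unconverted row matches when the last 8 characters of its leading token (the part before any space) equal those 8 digits, converted codes are unique, and lPatientId identifies a row in PATIENTS. #PatientMap is a hypothetical temp table.

```sql
BEGIN TRANSACTION;

-- 1. Map each unconverted patient (u) to its already converted twin (c).
SELECT u.lPatientId AS lOldId, c.lPatientId AS lNewId
INTO #PatientMap
FROM PATIENTS AS u
JOIN PATIENTS AS c
  ON c.sChartCode = '1100'
     + RIGHT(LEFT(u.sChartCode, CHARINDEX(' ', u.sChartCode + ' ') - 1), 8)
WHERE c.sChartCode LIKE '1100[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
  AND u.sChartCode NOT LIKE '1100[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';

-- 2. Re-point RECORDS rows from the duplicate patient to the converted one.
UPDATE r
SET lPatientId = m.lNewId
FROM RECORDS AS r
JOIN #PatientMap AS m ON m.lOldId = r.lPatientId;

-- 3. Remove the duplicate patients that are no longer referenced.
DELETE p
FROM PATIENTS AS p
JOIN #PatientMap AS m ON m.lOldId = p.lPatientId;

-- 4. Unmatched rows that are exactly 8 digits just get the prefix.
UPDATE PATIENTS
SET sChartCode = '1100' + sChartCode
WHERE sChartCode LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';

COMMIT TRANSACTION;
DROP TABLE #PatientMap;
```

Codes that fit none of these patterns, such as the 6-digit 123456, are left untouched for manual review, and running the script again after step 4 should pick up variants like 12345678 SMITH that only now have a converted twin. Selecting from #PatientMap before committing is a reasonable check that the assumed matching rule pairs the right people.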

Copyrights 2005-15 www.BigResource.com, All rights reserved