Queries :: Remove Duplicates - If 2 Fields Are Equal
Nov 19, 2013
I have a large table with many fields and many rows. There is no primary key. I'll call one field ParentPN, and another field ChildPN. There are many other fields as well. I want to identify all rows where BOTH the ParentPN and ChildPN occur more than once. I know how to create a query to identify duplicates of ONE field in the table, but not two. I can solve this with VBA: I will read the two fields of interest in the first row, then compare both values with every other row. If it finds another row with BOTH ParentPN and ChildPN identical to the first, that's a "hit". Then, repeat with all the other rows. I could find ways to make this run faster, but I was wondering if there are any built-in functions to accomplish this. I looked at the Find Duplicates query builder, and all I see is that I can select ONE field to search for dupes, not two.
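A minimal sketch of the two-field check, run against SQLite from Python rather than Access so that it can be executed as-is. The table name Parts, the Qty column and the sample rows are made up for illustration; only ParentPN and ChildPN come from the question. The statement is plain GROUP BY / HAVING, which I would expect to carry over to an Access query, but that is not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: only ParentPN and ChildPN are named in the question.
conn.execute("CREATE TABLE Parts (ParentPN TEXT, ChildPN TEXT, Qty INTEGER)")
conn.executemany(
    "INSERT INTO Parts VALUES (?, ?, ?)",
    [("A", "X", 1), ("A", "X", 2), ("A", "Y", 1),
     ("B", "X", 1), ("B", "Z", 3), ("B", "Z", 4)],
)

# Group on BOTH fields; a pair is a duplicate when its group holds more than one row.
dupe_pairs = conn.execute(
    """
    SELECT ParentPN, ChildPN, COUNT(*) AS n
    FROM Parts
    GROUP BY ParentPN, ChildPN
    HAVING COUNT(*) > 1
    ORDER BY ParentPN, ChildPN
    """
).fetchall()
print(dupe_pairs)
```

Joining this result back to the table on both fields would list the full duplicate rows instead of just the pairs.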
I am working on a database to manage newsletter subscriptions. Each subscriber record has the option of having up to four email addresses registered to his/her name.
Is there a way to check for duplicate email addresses in the entire database? It would have to compare all values in all four fields of all records.
Any ideas on how to implement such a thing? I'm clueless...
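One possible approach, sketched in SQLite via Python: stack the four email columns into a single column with UNION ALL, then count each address. The table name Subscribers and the column names Email1 to Email4 are assumptions, since the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per subscriber, up to four addresses.
conn.execute(
    "CREATE TABLE Subscribers (ID INTEGER PRIMARY KEY, "
    "Email1 TEXT, Email2 TEXT, Email3 TEXT, Email4 TEXT)"
)
conn.executemany(
    "INSERT INTO Subscribers VALUES (?, ?, ?, ?, ?)",
    [(1, "a@x.org", "b@x.org", None, None),
     (2, "c@x.org", "a@x.org", None, None),
     (3, "d@x.org", None, None, "b@x.org")],
)

# Stack all four columns into one, drop empty slots, then count each address.
duplicate_emails = conn.execute(
    """
    SELECT Email, COUNT(*) AS n
    FROM (
        SELECT ID, Email1 AS Email FROM Subscribers
        UNION ALL SELECT ID, Email2 FROM Subscribers
        UNION ALL SELECT ID, Email3 FROM Subscribers
        UNION ALL SELECT ID, Email4 FROM Subscribers
    ) AS AllEmails
    WHERE Email IS NOT NULL
    GROUP BY Email
    HAVING COUNT(*) > 1
    ORDER BY Email
    """
).fetchall()
print(duplicate_emails)
```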
I have 4 or 5 tables. Most of the fields are exactly the same, but each table has at least 1 and possibly 5 or 6 fields that are not in the other tables. Additionally, there are some duplicates within the individual tables as well as across tables.
i.e.
I have a
Student Table - with all the info on the student as well as a column called student that identifies them as such however it does not have the columns parent, donor, appeal, designation.....
Parent Table - with all the info on the student as well as a column called parent that identifies them as such however it does not have the columns student, donor, appeal, designation.....
Donor Table - with all the info on the student as well as a column called donor that identifies them as such however it does not have the columns student, parent, appeal, designation.....
Appeal Table - with all the info on the student as well as a column called appeal that identifies them as such however it does not have the columns student, parent, donor, designation.....
- A person can be within one of these tables more than once but with all the same information.
- A person can also fall into all of these parameters so they could be on every table with the same information in addition to the missing columns.
Question 1: What is the best way to dedupe and delete the individual tables (they all have account numbers)?
Question 2: I was thinking create a new table with all the columns available, however how do I dedupe across tables while populating the additional columns from each?
I have a table called Stock Levels which contains 3 fields (ID, ProductID, StockLevel). ID is the primary key, ProductID contains duplicates, and StockLevel contains different stock levels.
I am trying to remove the duplicates and retain the data so that I am left with the correct stock number.
What I have done is the following, but I am still getting duplicate values in ProductID and StockLevel:
SELECT DISTINCTROW id, productid, stocklevel into mynewtable from stocklevels
I have a report with 2 access tables (1 Master table and another a daily feed table)
The Master table keeps a log of all incoming records (once a record is appended to this table, it should not show in future reporting).
The Daily Feed table holds information from the last 48 hours (uploaded from an Excel report into an Access temporary table).
When the Daily Feed table is complete, I append its records to the Master table to avoid duplication.
When I upload the daily feed table and I match it against the Master table to find duplicates, how can I delete the duplicates from the Daily Feed table?
This is my code to find duplicates:
SELECT CMPreport.ID, CMPreport.MbrName, tblMaster.ID FROM CMPreport LEFT JOIN tblMaster ON CMPreport.ID = tblMaster.ID WHERE (((tblMaster.ID) Is Not Null));
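A hedged sketch of the delete step, using SQLite from Python so it can be run directly. The table and field names are the ones in the query above; the sample rows are invented. The idea is to delete from the feed table every row whose ID already exists in the master table, using an IN subquery instead of a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblMaster (ID INTEGER PRIMARY KEY, MbrName TEXT);
    CREATE TABLE CMPreport (ID INTEGER, MbrName TEXT);
    INSERT INTO tblMaster VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CMPreport VALUES (2, 'Bob'), (3, 'Cy'), (4, 'Di');
    """
)

# Remove feed rows that the master table already knows about.
conn.execute("DELETE FROM CMPreport WHERE ID IN (SELECT ID FROM tblMaster)")

remaining = conn.execute(
    "SELECT ID, MbrName FROM CMPreport ORDER BY ID"
).fetchall()
print(remaining)
```

After this step only the genuinely new feed rows are left to append.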
I have a query which gets information from 2 tables, where the ID on one table is the reference number on the second table. I would like to know how I can remove the duplicates on my reference number field.
I have a query linked to a mainframe database. One of the fields is [significance] and gets a number 1-20. Usually when this data is entered, it gets multiple significance numbers. This causes my query to return separate records for each significance number. For example, if case number 123 is given significance codes 1, 5 and 12, then my query returns 3 records.
I need a query that will show all records one time that have a significance code other than 12. This would be easy if there were not duplicate entries for the same case number because I could simply say "Not 12". So in the example above, my query returns 2 records showing significance codes 1 and 5. But I don't want to see the record for case number 123 because it also has a 12 significance code.
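One way to express "no code 12 anywhere on the case" is to exclude every case number that appears with a 12 and then list the rest once. The sketch below uses SQLite from Python; the table name Cases and the column names CaseNumber and Significance are assumptions, as is the sample data beyond case 123.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (case, significance code).
conn.execute("CREATE TABLE Cases (CaseNumber INTEGER, Significance INTEGER)")
conn.executemany(
    "INSERT INTO Cases VALUES (?, ?)",
    [(123, 1), (123, 5), (123, 12), (124, 1), (124, 5), (125, 12), (126, 3)],
)

# Exclude any case that has a 12 on ANY of its rows, then list each case once.
cases_without_12 = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT CaseNumber
        FROM Cases
        WHERE CaseNumber NOT IN (
            SELECT CaseNumber FROM Cases WHERE Significance = 12
        )
        ORDER BY CaseNumber
        """
    )
]
print(cases_without_12)
```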
I have a table with billing occurrences, and one of the fields is "business name". How would I set up a make-table query which lists the unique occurrences within that data?
Essentially, I want to remove duplicates. If ABC Pet Store has 5 billing occurrences and XYZ Pet Store has 1... I want both to be listed only once.
I hope that makes sense. Thanks for any help as to how to set this up!
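A small sketch of the idea in SQLite via Python: SELECT DISTINCT on the business name, written into a new table. The names Billing, BusinessName, Amount and UniqueBusinesses are hypothetical. SQLite uses CREATE TABLE ... AS SELECT for this; an Access make-table query would use the SELECT ... INTO form instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical billing table with repeated business names.
conn.execute(
    "CREATE TABLE Billing (BillingID INTEGER PRIMARY KEY, "
    "BusinessName TEXT, Amount REAL)"
)
rows = [("ABC Pet Store", 10.0)] * 5 + [("XYZ Pet Store", 25.0)]
conn.executemany("INSERT INTO Billing (BusinessName, Amount) VALUES (?, ?)", rows)

# DISTINCT collapses the five ABC rows into one before the new table is written.
conn.execute(
    "CREATE TABLE UniqueBusinesses AS SELECT DISTINCT BusinessName FROM Billing"
)

unique_names = [
    row[0]
    for row in conn.execute(
        "SELECT BusinessName FROM UniqueBusinesses ORDER BY BusinessName"
    )
]
print(unique_names)
```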
I would like to remove duplicates from the following query. I would like it to display only one record depending on the InvoiceID. So only show one unique record based on Invoices.InvoiceID. Thanks!
SELECT DISTINCT Invoices.InvoiceID, Invoices.CustomerID, InvoiceDetails.InvoiceDetailID, Invoices.InvoiceDate FROM InvoiceDetails INNER JOIN Invoices ON InvoiceDetails.InvoiceID = Invoices.InvoiceID WHERE (((Invoices.InvoiceDate) Between #8/1/2004# And #8/31/2004#) AND ((InvoiceDetails.DeliverBy)=0)) ORDER BY Invoices.InvoiceID;
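DISTINCT cannot collapse these rows because InvoiceDetailID differs on every detail line. A possible fix, sketched in SQLite via Python, is to group by the invoice fields and aggregate the detail ID (here MIN, an arbitrary choice). Table and field names follow the query above; the sample rows and the ISO date strings are assumptions made for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Invoices (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER, InvoiceDate TEXT);
    CREATE TABLE InvoiceDetails (InvoiceDetailID INTEGER PRIMARY KEY, InvoiceID INTEGER, DeliverBy INTEGER);
    INSERT INTO Invoices VALUES (1, 10, '2004-08-05'), (2, 11, '2004-08-20'), (3, 12, '2004-09-02');
    INSERT INTO InvoiceDetails VALUES (100, 1, 0), (101, 1, 0), (102, 2, 0), (103, 2, 1), (104, 3, 0);
    """
)

# One row per invoice: group on the invoice fields, aggregate the varying detail ID.
one_per_invoice = conn.execute(
    """
    SELECT i.InvoiceID, i.CustomerID, MIN(d.InvoiceDetailID) AS FirstDetailID, i.InvoiceDate
    FROM InvoiceDetails AS d
    INNER JOIN Invoices AS i ON d.InvoiceID = i.InvoiceID
    WHERE i.InvoiceDate BETWEEN '2004-08-01' AND '2004-08-31'
      AND d.DeliverBy = 0
    GROUP BY i.InvoiceID, i.CustomerID, i.InvoiceDate
    ORDER BY i.InvoiceID
    """
).fetchall()
print(one_per_invoice)
```

If the detail ID is not needed at all, dropping it from the SELECT list would let the original DISTINCT do the job.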
Hi all, total newb here with a question I can't find answered anywhere (you are my only hope). This database contains over 70k records and is a collection of user registrations over the years. Here's the issue.
After running the Find Duplicates query and getting my list (over 8000 dupes), I get a sample like below. I cannot now run the append query as every other site or article says to. I can't set the primary key to surname because there are so many records that have the same last name but a different first name. I need it to remove the dupes based on the EditDate, keeping the newest record.
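A sketch of "keep the newest record per person", in SQLite via Python. Only EditDate is named in the question; the table name Registrations, the ID, FirstName and Surname columns, and the choice of first name plus surname as the duplicate key are all assumptions. The delete removes any row for which a newer row with the same name exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; dates are stored as ISO text so they compare correctly.
conn.execute(
    "CREATE TABLE Registrations (ID INTEGER PRIMARY KEY, "
    "FirstName TEXT, Surname TEXT, EditDate TEXT)"
)
conn.executemany(
    "INSERT INTO Registrations VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Lee", "2010-01-01"),
     (2, "Ann", "Lee", "2012-05-01"),
     (3, "Bob", "Lee", "2011-03-03"),
     (4, "Ann", "Lee", "2011-01-01")],
)

# Delete a row when another row for the same person has a later EditDate.
conn.execute(
    """
    DELETE FROM Registrations
    WHERE EXISTS (
        SELECT 1
        FROM Registrations AS newer
        WHERE newer.FirstName = Registrations.FirstName
          AND newer.Surname = Registrations.Surname
          AND newer.EditDate > Registrations.EditDate
    )
    """
)

kept_ids = [row[0] for row in conn.execute("SELECT ID FROM Registrations ORDER BY ID")]
print(kept_ids)
```

Rows that tie on EditDate are both kept by this version, so a tie-breaker (for example the ID) would be needed if exact ties can occur.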
I have the below code but I want to add a grouping to it so that if there is a duplicate building number it will not list it. Is there a way to add a grouping into the code to do this?
I have a client database that has recently had multiple duplicate entries. I need to reduce or negate this erroneous activity. I have a client table where I record amongst others, the following;
key [christian_name] [family_name] [dob] ......
To prevent duplicate entries via the form, I have created an additional field called "unique" with a unique index, which I want to populate with the joined fields first_name & last_name & dob (i.e. johndoe01/01/90), so that when a user enters a new client it won't allow a duplicate.
However, I need to fill in this field for all the existing customers (3600+) from their existing data. If I create an expression, I can join the fields in a select query, but when I try to make an update query, the same syntax comes up with empty fields.
select query sql that worked to show field ...
SELECT divers.christian_name, divers.family_name, divers.dob, [christian_name] & [family_name] & [dob] AS Expr1 FROM divers;
update query that was empty ..
UPDATE divers SET divers.[unique] = [christian_name] & [family_name] & [dob];
I have a table of 50,000 records. The table has 8 fields, and I want to remove duplicates based on 4 of those fields in a query. Is there a way I can do this and keep the unique identifier, so that after the duplicates are removed based on those 4 fields I can match the rows back up with the other four fields?
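One possible shape for this, sketched in SQLite via Python: group on the four fields and carry the smallest identifier of each group, which can then be joined back to the full table. The names Records, ID, F1 to F4 and Other are placeholders, and keeping the lowest ID is an arbitrary choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Placeholder schema: ID is the unique identifier, F1..F4 define a duplicate.
conn.execute(
    "CREATE TABLE Records (ID INTEGER PRIMARY KEY, "
    "F1 TEXT, F2 TEXT, F3 TEXT, F4 TEXT, Other TEXT)"
)
conn.executemany(
    "INSERT INTO Records VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "a", "b", "c", "d", "x"),
     (2, "a", "b", "c", "d", "y"),
     (3, "a", "b", "c", "e", "z")],
)

# One row per distinct F1..F4 combination, keeping the lowest ID as the handle.
deduped = conn.execute(
    """
    SELECT MIN(ID) AS KeepID, F1, F2, F3, F4
    FROM Records
    GROUP BY F1, F2, F3, F4
    ORDER BY KeepID
    """
).fetchall()
print(deduped)
```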
I have the following set up on a form to pull together a query (by form) and resulting report:
Publication Sector Product Region
Each publication can have multiple sectors/products/regions. The resulting query therefore duplicates the publication, for example:
Publication 1 Sector 1 Product 1 Region 1 Publication 1 sector 2 product 1 region 1 publication 1 sector 2 product 2 region 1
etc etc...
The report ONLY has publication on it, but as it is based upon the query utilising the 4 factors above, the resulting report is: Pub 1 Pub 1 Pub 1 Pub 2 etc
I want the report to list each publication only once. I have tried "hide duplicates" from the form field properties; however, this hides the text but leaves a big gap on the generated report where the duplicate would be if it were not hidden.
There must be a simpler way to achieve this than getting another query to create a table based on the first query which (the table) only includes the publication name, and is filtered to remove duplicates through a primary key...
I am trying to design a delete query that has an additional criteria needed.
I want to delete identical IDs in one column only if there are identical values in an adjacent column.
So for example, in the table below I want to delete the last row where the duplicate ID is "2700023" because the Code field has identical values, "LRAC". I do not want to delete the second row because the Code field is different for the row.
I have two tables, one is of departments, and one is of people (with a FK denoting what department this person is in). Now consider the fact that there are duplicates in the departments table, and I would like to remove these duplicates. However, the duplicates have related records (in the people table). So, before removing the duplicates, I must update the FKs in the table of people (this is the step I'm having trouble with).
Here's an example:
As you can see, the "Sales" department is there twice. And both have a related record. What I want to do is:
1. Update all DepartmentIDs (in tblPeople) to not point to duplicate records. In this example, that would be PersonID 2; Joe. His DepartmentID should update to "1" (as both "1" and "2" are "Sales").
2. Delete the duplicates in tblDepartments (in this case, DepartmentID 2, "Sales").
The second step is no problem, it is only the first I am struggling with.
Also, the example posted here is just an example, the data I actually need to do this for is significantly more complex and there are many more records! In the attached database:
qry1: Simple query to find all duplicates (just used the query wizard)
qry2: Just the first row of each duplicated departments (duplicates that shouldn't be deleted). In the example above, this would be the "2", "Sales" row in the tblDepartments table.
qry3: Basically all qry1 rows that don't appear in qry2
qry4: All qry3 values, and their respective qry2 value.
This is what each of the (soon to be deleted) duplicate values' related records' DepartmentID should be updated to... There's no simpler way to phrase that, so using the example above, qry4 would return "2","1". This indicates that all people with a DepartmentID of "2" should be changed to "1" (so we can subsequently erase the department with the ID of 2).
This is as far as I have gotten. My next step is: Update all FKs in tblPeople based on qry4 (You can't set an update query's criteria to pull from another query, nor can you use the second query for the update value... or maybe you can, but I don't know about it).
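A sketch of both steps in SQLite via Python, without the intermediate queries: repoint each person to the lowest DepartmentID that shares their department's name, then delete the departments that are no longer the lowest ID for their name. The columns DepartmentName and PersonName and the sample rows are assumptions; tblPeople, tblDepartments, PersonID and DepartmentID come from the question. Access is stricter about subqueries inside UPDATE, so the statement may need restructuring there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblDepartments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    CREATE TABLE tblPeople (PersonID INTEGER PRIMARY KEY, PersonName TEXT, DepartmentID INTEGER);
    INSERT INTO tblDepartments VALUES (1, 'Sales'), (2, 'Sales'), (3, 'IT');
    INSERT INTO tblPeople VALUES (1, 'Ann', 1), (2, 'Joe', 2), (3, 'Sue', 3);
    """
)

# Step 1: point every person at the lowest ID among same-named departments.
conn.execute(
    """
    UPDATE tblPeople
    SET DepartmentID = (
        SELECT MIN(keeper.DepartmentID)
        FROM tblDepartments AS keeper
        INNER JOIN tblDepartments AS assigned
            ON assigned.DepartmentName = keeper.DepartmentName
        WHERE assigned.DepartmentID = tblPeople.DepartmentID
    )
    """
)

# Step 2: delete departments that are not the lowest ID for their name.
conn.execute(
    """
    DELETE FROM tblDepartments
    WHERE DepartmentID NOT IN (
        SELECT MIN(DepartmentID) FROM tblDepartments GROUP BY DepartmentName
    )
    """
)

people = conn.execute(
    "SELECT PersonID, PersonName, DepartmentID FROM tblPeople ORDER BY PersonID"
).fetchall()
departments = conn.execute(
    "SELECT DepartmentID, DepartmentName FROM tblDepartments ORDER BY DepartmentID"
).fetchall()
print(people, departments)
```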
Ref#  Rev
97    b
98    c
99    c
99    e
100   c
100   b
101   a
102   b
I need to create a simple report but remove the duplicates (e.g. Ref# 99, 100). I need to delete the older Revs (e.g. Ref# 99 Rev C, Ref# 100 Rev B). Is this done through Recordsets? Will an SQL query do the trick?
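A plain SQL query looks sufficient for the report, at least as a sketch: take the highest Rev per Ref#. The example below runs in SQLite via Python with the sample data from the question. The table name Revisions and the column name RefNo (standing in for Ref#) are assumptions, and it relies on revisions being single letters of the same case so that alphabetical order matches revision order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# RefNo stands in for the Ref# column; the rows are the sample from the question.
conn.execute("CREATE TABLE Revisions (RefNo INTEGER, Rev TEXT)")
conn.executemany(
    "INSERT INTO Revisions VALUES (?, ?)",
    [(97, "b"), (98, "c"), (99, "c"), (99, "e"),
     (100, "c"), (100, "b"), (101, "a"), (102, "b")],
)

# MAX(Rev) keeps the alphabetically latest revision letter for each reference.
latest_revs = conn.execute(
    """
    SELECT RefNo, MAX(Rev) AS LatestRev
    FROM Revisions
    GROUP BY RefNo
    ORDER BY RefNo
    """
).fetchall()
print(latest_revs)
```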
I've been working on a procedure to step through the recordset and add the data one record at a time so I can get rid of the duplicates.
I've tried a few approaches, but this is where I'm at now.
Code:
Dim rs As DAO.Recordset
Dim rsHH As DAO.Recordset
Dim rsPhone As DAO.Recordset
Dim rsEmail As DAO.Recordset
Dim rsAddress As DAO.Recordset
Dim rsPerson As DAO.Recordset
Dim db As DAO.Database
Set db = CurrentDb
The first is ClientList, which contains typical contact and biographical information (name, address, citizenship, etc), and unique ClientID# for each client. The primary key for this table is the default Autonumber ID that comes with each new table.
The second table is WillInfo, which contains information specific to drafting the client's Will (e.g., spouse name, spouse address, spouse citizenship, similar data on beneficiaries, similar data on executors, etc). The primary key for this table is ClientID#.
I then created a One-to-One relationship between ClientList and WillInfo, binding by Client ID. All this appears to work.
My question arises because I have two clients who are married to each other, which means much of the spouse info I require for the WillInfo table in respect of these particular clients is already accurately recorded as client info in the ClientList table. So for these specific clients (but not generally!), I want the spouse information in the WillInfo table (e.g., SpouseAddress, SpouseCitizenship for ClientID# 12.001) to EQUAL specific values provided in the ClientList table (i.e., ClientAddress, ClientCitizenship for ClientID# 12.002).
I read and understand this is the best approach, following the principle that data should not be entered twice, so as to increase efficiency and avoid mistakes and future problems.
My question is: How do I do this? In Excel, if the client info I wanted to replicate was in cells B4-B9, I would enter =B4, or =B5, or =B6 and so on in the cells for spouse info. What is the equivalent expression for replicating specific client info from a different table.
I have a few fields that are the same across a couple of forms and sub-forms (each form/sub-form being represented by a different table). I would like data entry into one field to autofill the other, i.e. if I type 'ENG' into field 1 on form 1, it will autofill the equivalent field in sub-form 2 as 'ENG' so that I do not have to type the same thing twice. These entries are not unique or in any order, as they vary depending on the entry, so they can't be linked as primary keys and foreign keys. So how would I do this? I would like to avoid VBA if possible.
Is there an easy way to auto-populate a Junction table [in access 2010] given the following two tables with a many-to-many relationship for Tasks? The two tables are
I have a database with 100s of values for a field. What I would like to do is specify a value via a form and have the query return all records that equal the specified value.
I have two date columns in my table called "End date" and "Closing date".
An example could be 14-06-2015 and 13-04-2017.
I need to make a query which is checking if the two dates are equal to the last day of their respective month. I don't have two columns in the table with the last day of month, so I first need to find out what the last day in the month is.
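The underlying check can be sketched in plain Python: work out the last day of the date's own month and compare it with the day of the date. The function name is_month_end is made up, and the DD-MM-YYYY text format is taken from the example dates above. In Access itself, as far as I know, the usual idiom is DateSerial with day 0 of the following month.

```python
import calendar
from datetime import datetime


def is_month_end(text: str) -> bool:
    """Return True when a DD-MM-YYYY date is the last day of its month."""
    d = datetime.strptime(text, "%d-%m-%Y").date()
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.day == last_day


# The two example dates from the question, plus two real month ends.
for sample in ("14-06-2015", "13-04-2017", "30-06-2015", "29-02-2016"):
    print(sample, is_month_end(sample))
```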
I am trying to delete records in tblinclude where a record in tblexclude has the same ClientID and CodeID.
Here is the SQL: DELETE tblinclude.ClientID FROM tblexclude INNER JOIN tblinclude ON (tblexclude.ClientID = tblinclude.ClientID) AND (tblexclude.CodeID = tblinclude.CodeID) WHERE (((tblinclude.ClientID)=1));
I get the error "Specify the table containing the records you want to delete." I've searched for this but I am just not getting it today.
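A sketch of the same delete written without a join, in SQLite via Python: an EXISTS subquery matches on both ClientID and CodeID. Table and field names are from the question; the sample rows are invented. In Access, the usual suggestion for this error is to name the table after DELETE as tblinclude.* rather than a single field, though that is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblinclude (ClientID INTEGER, CodeID INTEGER);
    CREATE TABLE tblexclude (ClientID INTEGER, CodeID INTEGER);
    INSERT INTO tblinclude VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO tblexclude VALUES (1, 10), (2, 10);
    """
)

# Delete client 1's rows that have a matching (ClientID, CodeID) in tblexclude.
conn.execute(
    """
    DELETE FROM tblinclude
    WHERE ClientID = 1
      AND EXISTS (
          SELECT 1
          FROM tblexclude
          WHERE tblexclude.ClientID = tblinclude.ClientID
            AND tblexclude.CodeID = tblinclude.CodeID
      )
    """
)

left_in_include = conn.execute(
    "SELECT ClientID, CodeID FROM tblinclude ORDER BY ClientID, CodeID"
).fetchall()
print(left_in_include)
```

The row for client 2 survives because of the ClientID = 1 filter carried over from the original statement.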