Queries :: Deleting Duplicates In Non-unique Records
Feb 27, 2014
I have imported a large number of emails into a table tbl_requests.
I had intended to have unique file tbl_requests.date_opened unique, but have ended up with a lot of duplicate records (i.e. tbl_requests.date_opened is not unique !). How to delete any duplicates? I have 15,000 records...
I have a table that has multiple records (count >1). I used the find duplicate records and then made it a delete query, however, this resulted in deleting all the records that had count >1. I need to retain distinct record, and delete the extra records. Using select distinct.. I do not want to use VBA to achieve this, but at the same time be able to incorporate the steps in a module that would execute the queries in a sequential order and achieve the same results.
I bounced into a big problem with Access 2010 where I cannot seem to be able to remove duplicates from a table containing millions of entries midway through a series of queries.
As you can see, there are some rows which contain exact duplicates and some are a bit different. I wish to remove those fields with exact duplicates and only the duplicated fields. Running "Find duplicates" in Access gives me about 250,000 rows with such data.
I've tried a few options already: - I cannot use the date as primary key as there are several serial numbers. I cannot use serial numbers as primary key, because there are several dates. Using reading value as primary isn't an option either. - Microsoft says I should mark all duplicate values with an x and then make a delete query to get rid of all the x-marked rows. For 250,000 duplicates, that's a bit too much manual 'x-ing'. - If I do a delete query using the Find Duplicates query as a base, it removes all 250,000 entries from that table, instead of just the 125,000 which had a duplicate elsewhere in the table. - If I make a query which identifies duplicate data and gives me just one row for each duplicate, the delete query still deletes both entries from the original table. - I could make a new query which would have only unique values using Totals as criteria (for instance, using First for the Date-column). However, this still leaves the duplicate values in the original table. Note that this database is already 800 MB large and new data is imported once a week, for the next decade or so. I cannot have a table get duplicates every week and leave it there. - If I make a macro which would create the unique values table first and then deletes the old table, what happens next week when I try to import new readings? I would need to make a new macro each time I try to import new data as the table names change. Or is there a way to first run the unique numbers out, then replace the original table with the new one? With a 800MB database, this would put me dangerously close to the 2GB size limit. I wouldn't be able to use this as a part of a macro, as the database would have to do Compact & Repair each time it deletes the original table, midway through a longer series of queries. - Having duplicates removed from the original import isn't an option either, as it comes in as a overriding excel sheet for the past 3 months, once a week.
As you can see, it's quite a pickle getting the duplicates out of the original table. This is just a small part of a very long macro, which takes about 15 minutes to complete, but due to duplicates the database is getting way too large.
I have a unique query which lists all the films that we are screening over the next 3 months. I have added a COUNT field so that I can see how many of each films we are screening.
The problem is that i get duplicates of some films - and this may be because we may hold several copies of some films. I have attached two images which might explain this better!
What I could do with is knowing how to make it so that i get a list of films booked and how many of each, regardless of which copy of the film is used.
The SQL is:
Code: SELECT DISTINCTROW dbo_Films.[film name], Count(dbo_Films.[film name]) AS [CountOffilm name] FROM ((dbo_Films INNER JOIN dbo_filmCopies ON dbo_Films.ID = dbo_filmCopies.tblFilms_ID) INNER JOIN dbo_EventsFlicks ON dbo_filmCopies.ID = dbo_EventsFlicks.filmCopyID) INNER JOIN dbo_Venues ON dbo_EventsFlicks.venueID = dbo_Venues.ID WHERE (((dbo_EventsFlicks.datefield)>=#8/1/2015# And (dbo_EventsFlicks.datefield)<#1/1/2016#)) GROUP BY dbo_Films.[film name], dbo_Venues.southhub, dbo_Venues.northhub, dbo_Films.Specilaised ORDER BY dbo_Films.[film name];
I'm attempting to create an append query that will add new records only if there isn't an equivalent record already existing. Typically I would add the existing table to the query, and only add new records if the same do not exist. In this case, the table is maintaining records over time (start date and end date).
I'm checking if [t_temp_employees].[effective_date] <> [t_city_assignment].[start_date]. However, if the employee has historical entries it will still add a record (in fact, it'll add multiple records).
How can I append a new record only if one for the same time period does not exist?
I have duplicate data in a cell, I want to hide duplicate data and display only non-duplicate data.. I changed the property sheet to only show unique values, but it keeps showing data I don't want to see...
I have a client database that has recently had multiple duplicate entries. I need to reduce or negate this erroneous activity. I have a client table where I record amongst others, the following;
key [christian_name] [family_name] [dob] ......
I believe that to prevent duplicate entrie via form I have created an additional field called "unique" given it as a unique index which I want to have populated with the joined fields first_name & last_name & dob (IE johndoe01/01/90), and then as user enters a new client it wont allow a duplicate.
However I need to fill all the existing customers (3600+) with the relevant joined existing data. If I create an expression I can cajoin the fields in a select query but when I try to make an update query the same syntax comes up with empty fields.
select query sql that worked to show field ...
SELECT divers.christian_name, divers.family_name, divers.dob, [christian_name] & [family_name] & [dob] AS Expr1 FROM divers;
update query that was empty ..
UPDATE divers SET divers.[unique] = [christian_name] & [family_name] & [dob];
I have created a table for product codes and prices. After importing data from Excel I have a LOT of duplicated records. So I set up a query to search for 'distinct' records and then set up a delete query to delete anything that is not distinct. The query seems to be setup correctly (I have followed the instructions from Access Help) - but then when I try and run the query I get the message 'Could not delete from specified tables' - Is this a permissions issue? And if it is, where do I set up the permissions for the database? Thank you any ideas would be greatly appreciated!
my problem is this: i have a table that has three data feilds id, amount1, and, ammount2 i want to allow the user to imput duplicates, but i want my database to delete all but one of the duplicate feilds, and add the ammounts of the deleted records to the ammounts of the non deleted duplicate...
I 've created a query using the access wizard that finds duplicate records. It works fine. But then I would like to delete the duplicate records and keep just one(the first) . The problem is that the duplicate query returns the amount of duplicates found, not WHICH are they so I can delete them. Is there any way to alter the duplicate query , or to create a delete query using the duplicate query to achieve this? Or any other idea? Thank you in advance
My Membership Database includes 3 fields in the main Table which I need to clear prior to the new subscription year, but on a selective basis.The fields are R/N for (Renewal or New), a subscription date and an amount field (Currency).
At the due date I need to clear all 3 fields except where N/R=N AND date >01/01/2014, when data in all 3 fields is to be retained.I have tried several SQL codings to achieve this but have been unsuccessful so far.
I am trying to delete a specific part from multiple BOMs in my database.
I have a table of the BOMs that I want to look in. I called this table PartTable. I also linked my database table SYSADM.REQUIREMENT which contains all the requirement parts for all of our BOMs.
So I am wanting to delete only part number 123XX from each of the BOMs in my PartTable.
I am able to select the records with:
Code: SELECT SYSADM_REQUIREMENT.* FROM SYSADM_REQUIREMENT INNER JOIN PartTable ON SYSADM_REQUIREMENT.WORKORDER_BASE_ID = PartTable.PART_ID WHERE (((SYSADM_REQUIREMENT.WORKORDER_TYPE)="M") AND ((SYSADM_REQUIREMENT.WORKORDER_BASE_ID)=[PartTable].[PART_ID]) AND ((SYSADM_REQUIREMENT.WORKORDER_LOT_ID)="0") AND ((SYSADM_REQUIREMENT.PART_ID)="123XX"));
Now how do I delete these same records.
I am getting error saying I have to select a table to delete from....
I have a problem with getting the right result with a query.
To this topic I attached a part of my database including 2 queries. The queries are almost the same, except the first field. In query 1 is the total 'Group By' and in query 2 'Count'.The other fields are parameters, which are the same in both queries.
My problem: If I run query 1 then the result is 31 rows with unique ID's. When I run query 2 the result is 35, because the query counts 4 duplicates.
The correct result I want is 31. That means count the unique ID's. I am able to show the ID's with query 1, but I am not able to count them correctly. I tried to add 'DISTINCT' in SQL view but that didn't help. Also other solutions written in this forum didn't work.
Thank you if someone (with more query/SQL experience) is able to help me... :o
The main table for the database I am working on contains the following fields:
ID Mfr Control Number Initial or Follow-Up Follow-up Number Suspect Date of Initial Email Date Received Date Submitted Date of Report Serious Brief Description Causality Notes
With some additional qualifications I wanted to find records that had an intial report but no follow-up. Which translates too I want records that are unique in the Mfr Control Number field (no duplicates).
I am trying to build a query and keep getting hung up on the unique aspect of fields. I started by trying to query only "Mfr Control Number" fields that are unique (no duplicates). As best I can figure for some reason I can not add any additional fields to that query. My current query is set up in the query build table such that I have added "Mfr Control Number" in two columns. The first column in Total I have "Group By". In the next column I have set Total to "count" with a criteria of 1. If I try to add any other fields from my table than I seem to lose those unique results. But I need to further filter to get the exact information I need.
I want criteria on the "Initial or Follow-up" field to only bring "initial" I want criteria on the "Serious" field to only bring "serious" I want criteria on the "Date of This Report" field of "<Date()-"15""
Is there some way to take the results of that initial query to then build a another query based just on those records? I could then apply the further criteria and run my report. Or is there a way to do this in one step?
I've tried to make clear my intention but know it can be difficult to get this kind of stuff down in writing in a clear fashion. I have to be careful to keep information confidential also so some of the details are vague.
I have a form which allows the user to add new records to a table. After the user had entered all the information into the form, they click a command button to add the record. In addition to adding the new record, my command button runs an query which is supposed to generate a random number between 1 & 1,000,000,000 and update the record ID field with that number.
Here is the formula I have been using in the "update To" now of my query: Int((1000000000-1+1)*Rnd()+1)
My problem is that I keep getting duplicates. You would think that the chances of getting a duplicate number would be pretty small with this large of a range, but I get a duplicate almost every time.
I have tried indexing (No duplicates) the field in the table, but that did not work. When my query generated a duplicate number, the record was just not added to the table.
I also tried a two step approach: 1-Make a table of all in use record ID numbers from my table (tblIdNo) 2-Update new record with a random number that is not in tblIdNo
This was a no-go too
How to build an update query that will update each new record added to the table with a random number between 1 & 1,000,000,000 without any duplicates? This seems like it should be so simple, and I am starting to get really frustrated.
I would prefer to accomplish this through a query/queries (if possible) rather than with 100 lines of code. This database is not for me, it's for another group, and the individuals in this group are totally freaked out by code.
I have created a linked Excel table in Access 2010 called 'tblExcelLinked' and I have a form called 'ASB Log Form' for the purposes of presenting the data in a more readable manner that is easier to view, plus link other fields of data that are not directly related to the 'tblExcelLinked'.
Because there is no unique ID in the 'tblExcelLinked' to create a relationship, I have created a table called 'tblASB', which allows me to add other table data linked from same d/b.
I now want to update the 'tblASB' with data from the 'tblExcelLinked', but only append new records from 'tblExcelLinked', but my inadequate append query is duplicating the records each time I run it, rather than just adding the new ones.
Once sorted my next challenge is a macro so that this runs automatically rather than being manually triggered.
1. Persons (list of persons) 2. Job history (list of jobs)
each person have their own job history. all these jobs are stored in the job history table. when i delete a person i would like the job history for this person deleted as well. each job stored in the job table have a field with person name, so that it is linked to this person.
how can i do this? vba or simple properties options?
I have been working on a simple data base for some time now (beginner level) and am still trying to improve it. I would like to do something but before that I would like to have your opinion to know if it is even possible?I have a query QryMainReport:
Start Date/Time End Date/Time Employee
At the moment this is what the format of my report looks like (I removed other unnecessary fields):
StartTime----------EndTime---------------Employee 12/06/2014 01:00--12/06/2014 03:00------John Smith 12/06/2014 04:00--12/06/2014 06:00------Jane Doe 13/06/2014 02:00--13/06/2014 05:00------John Smith 13/06/2014 08:00--13/06/2014 08:00------Jane Doe
I would like to do as a report. (Dates would always be from Sunday to Saturday). I am not sure it is possible to do that. I suppose first it would mean:I would have to do a query to separate the times from the dates?I would have to find a way for Access to find the unique dates and unique names?Does it mean I have to use cross tab queries?
I have two tables (same data but slightly different attribute structure) with a one-to-one relationship (the join field is "ID"). There are 69 matching records in these tables. How can I delete these matching records from table A, while leaving them alone in table B?
I'm confused because I brought both tables into a select query, created a join from ID to ID (within in the query's design view), and then added all fields from table A to the query. I then ran the select query and saw the 69 records from table A in the query's data view that I wanted to delete, highlighted all records and clicked delete. However, this action deleted the 69 matching records from TABLE B, not Table A!!! How is this possible? What should I do instead? Thanks.
A strange request but I hope someone can help with this one
I have a table (tbl_Econ) where I have to delete a specified number of records from a table. It does not matter which records as long as I delete the exact number
e.g On a form text box I enter the number or records to be deleted (e.g.6000).
The table (tbl_Econ) has 8000 records, so I have to delete 6000 records. I need to be able to do this automatically :eek:
how would i go about deleting a set of records? i can get a list of records together in a query taken from 4 tables and would need to, if necessary, delete a single line. not all information needs to be deleted from all 4 tables though? the info to be deleted would only be deleted from 1 or 2 tables being the last 2 in the relationship.
i guessed it might be an append query but im not too familiar with them.
Hi there, i have a master reset to delete all the data in the database. Although, as there is a username and password entry to get into the admin module, i wish for one entry in the table participantTable to not get deleted (to save one password/username so its possible to log into the admin module after the reset). The code below will delete everything, how can i change it so it keeps the first record in the table, and then deletes all the other records after.