Is This Possible? - "Fuzzy Grouping"
Feb 2, 2005
Hello,
My problem is rather complicated and I am not sure if Access is even capable of addressing it. As I said, this is a bit tricky and I understand if no one is willing to tackle it, however, I would really appreciate it if someone could tell me it is impossible if that is the case. Thanks in advance.
I have attached a table to better explain my dilemma.
I would like to use the information in the “Category”, “Range Start” and “Range Stop” fields to generate new identifiers for each record in the table. The simplest criteria would be to assign the same novel identifier to two records if they have the same values in all of “Category”, “R. Start”, “R. Stop” (This is the case for the first two records.).
I am able to use this approach but would much rather use a more sophisticated set of criteria. Specifically, to be assigned the same ID two(or more) records must:
A) Belong to the same category.
B) Their ranges must overlap by more than (x)(Where x is some amount of overlap). E.g. Records 3 and 4 should be assigned the same ID because their categories are the same and their ranges overlap by 333 (13222-12889).
C) Finally, if A is satisfied and B is not then the then records could be assigned the same identifier if the difference between their ranges is less than some value (y). E.g. Records 5 and 6 should be grouped because A) is satisfied and their ranges are only 5 apart -> (119-300)…(305-700)
There is no limit to the number of records that may be assigned the same identifier, provided they satisfy criteria A+B or A+C.
Many thanks,
Matt
View Replies
ADVERTISEMENT
Mar 4, 2006
Hello All,
For those who are interested in Approximate String Matching or those who could use these algorithms; I have a complete suite of Approximate String Matching algorithms written in Visual Basic in an Access database.
In 2004 I decided to jump into the world of Fuzzy Matching with both feet.
As it is, I am working for a company that deals with names, addresses, etc. very intensely. It is a fair sized company that
uses Access on a grand scale. Since I am an Access programmer, I work in an Access gold mine!
I knew that if I could get a good handle on Fuzzy Matching, that when I hit the right person at the right time, the company could greatly benefit from my research on Fuzzy Matching. The right time and the right person are not here yet.
Nevertheless, since I have reaped much free source code and information from the Web, it is now time to return the favor.
I developed a package that is sort of a demo/tutorial on Approximate String Matching algorithms in Access that is very
robust in Fuzzy Matching. It would overtax the post in this forum for me to include it in a post.
To summarize, it works with the basic name - Last, First, and Middle. It has a user interface that allows a user to type in
what would be a good name and what would be a questionable name to resemble the good name. The weighted results of all the various algorithms can be chosen, or an individual algorithm can be chosen to display how closely the names match.
In addition, it has a table of 17,295 known good names with unique ID numbers as a reference table, and table of 1200
morphed names that are typical of names entered in a database with no input conventions. These morphed names have typos, transpositions, variations on maiden names, etc. 1200 good names were selected for alteration and the unique ID of each original good name was stored in the table with the altered names to determine the accuracy of the matching process.
The morphed names were compared to the known good names in a query with an approximate join using the suite of algorithms to determine match percentage. The altered names, the ID number of the original good name, the ID number of the name it matched to, and the match percentage were stored in a results table to determine the results of the matching run.
These tables were used to test and tweak the algorithms by comparing the morphed names with the known good names. The results of 1322 names were saved to a results table with match scores.
The matching process was executed in a query with an approximate join using the suite of algorithms.
The match results:
Total Approximate Matches: 1188
(Recall) Precision Pct: 99.00%
Total Unmatched Names: 12
Unmatched Pct: 1.00%
Total Other Matches: 134
Other Matches Pct: .77%
The tables are accessible in the database, so anyone can run their own tests. The interface is set up to accommodate this
as well.
The algorithms used: Dice coefficient as a threshold algorithm, Levenshtein Distance algorithm, Longest Common Subsequence, and the DoubleMetaphone. The names were passed to the algorithms by way of the bigram model.
I will email it to anyone who requests it.
It is in two platforms, Office 97 and Office 2000 as FuzzyMatching97.zip (692 KB) and FuzzyMatching2k.zip (721 KB).
The zip files include ApprxStrMatchingEngine97.pps or ApprxStrMatchingEngine2k.pps respectively, StrMatching97.mde or StrMatching2k.mde respectively, IEEESoundexV5.pdf, and VBAlgorithms.txt.
IEEESoundexV5.pdf is an abstract about Approximate Sting Matching that fired my curiosity about the subject, and pertains to the package.
VBAlgorithms.txt contains the entire suite of algorithms in Visual Basic extracted from the MDB modules.
The PowerPoint presentations describe the workings of the MDE and give a good overview of Fuzzy Matching.
To match is divine....
View 10 Replies
View Related
Aug 22, 2006
Hi all,
I've got two vast tables of data which I need to link, however the field unique to each was, at it's source, a typed field, and as such both have errors, typos, formatting problems, known deviations etc.
An example would be something like this:
Table1: SFOC0912JB3
Table2: F0CO9I2JB3
(These are harware serial numbers for what it's worth). I could do with creating a link between the two tables which would return a true based on a number of possibilities, such as:
Match if:
- String matches with prepended 'S', 'C' and/or
- String matches with substituted 'I' and '1' in any or all positions and/or
- String matches with substituted 'O' and '0' and/or
etc.
I think Levenshtein had the right answer from what I've been reading, but I haven't yet found an implementation for access (freely) available.
Any ideas?
Thanks,
Alex
View 1 Replies
View Related
Jun 1, 2005
I have a form that has code tied to the 'on open' event that is going to be accessed by users where we want them to only have access to certain fields which we want them to fill out. The fileds that will be locked will change based on the field called 'Item Number'. The code will be long because there are 30 different Item Numbers and about 10 to 20 fields that we will disable based on the Item Number. The code is like:
Dim Item_Number As String
If Me.Item_Number = "32000" Then
Me.Batch_Lot_Number.Enabled = False
End If
This is all great except that the disable makes the field kind of obscure by the color it gives it. I don't want to use the lock property because that doesn't give you a visual clue that its locked.
Is there a way to change the color of the field background using VBA?
View 5 Replies
View Related
Jan 7, 2007
Many have had questions on how to use this for their own purposes.
Link to the original post:
http://www.access-programmers.co.uk/forums/showthread.php?t=103279
Download at: http://www.kdkeys.net/forums/thread/6450.aspx
Here is how you can use it - I provide this example:
Tables and queries can be created in the MDE database.
Create a table with known good reference strings. I created this one - REF_LIST.
It has one field, REF_STRING (Text) with a length of 50, and indexed (No Duplicates). The field length can be set to a length that suits your requirements.
This is the content:
REF_STRING
Claw Hammer
Cold Chisel
Monkey Wrench
Nail Gun
Create another table with strings to match. I created this one - TEST_LIST.
It has one field, TEST_STRING (Text) with a length of 50, and indexed (Either No Duplicates or Duplicates Ok depending on the data). The field length can be set to a length that suits your requirements.
This is the content:
TEST_STRING
Claw Hamer
Claw Hammr
Clew Hammer
Clw Hammer
Cold Chisil
Cold Chisle
Cold Chissel
Cole Chisel
Monkey Wrnech
Monkie Wrench
Monky Rench
Nail Gn
Nail Gunn
Naill Gun
Nial Gun
Then create another table for the results. I created this one - RESULTS.
It has four fields, REF_STRING (same properties as in table REF_LIST), TEST_STRING (same properties as in table TEST_LIST), MATCH_VALU (Single, Fixed, 2 decimal places), and GOOD_MATCH (True/False).
This is the content from the results of the Match_Lists query.
REF_STRINGTEST_STRINGMATCH_VALUGOOD_MATCH
Nail GunNial Gun0.92No
Monkey WrenchMonky Rench0.94No
Monkey WrenchMonkie Wrench0.94No
Claw HammerClew Hammer0.94No
Cold ChiselCold Chisle0.94No
Cold ChiselCold Chisil0.94No
Cold ChiselCole Chisel0.94No
Nail GunNail Gn0.95No
Monkey WrenchMonkey Wrnech0.95No
Nail GunNail Gunn0.96No
Nail GunNaill Gun0.96No
Claw HammerClaw Hamer0.96No
Claw HammerClaw Hammr0.96No
Claw HammerClw Hammer0.96No
Cold ChiselCold Chissel0.97No
SQL from the Match_Lists query:
INSERT INTO RESULTS ( REF_STRING, TEST_STRING, MATCH_VALU )
SELECT REF_LIST.REF_STRING, TEST_LIST.TEST_STRING, IsSimilar([REF_STRING],[TEST_STRING]) AS Expr1
FROM REF_LIST, TEST_LIST
WHERE (((IsSimilar([REF_STRING],[TEST_STRING]))>0.79));
Using this example you can populate the two tables, REF_LIST and TEST_LIST with strings that you need to compare and run the Match_Lists query.
The GOOD_MATCH field in the RESULTS table is for you or another human to determine if anything questionable is a good match for your purposes.
If it is found that any match with a value of at least .95 is a good match then an update query could be created to update the GOOD_MATCH field with true for all those with a value of >= .95.
Then a select query could be created to look at those matches that do not have a GOOD_MATCH to determine if they may be good matches.
Naturally the two tables may need a unique ID for the strings for better tracking and comparing.
If so, create them and have them appended to the RESULTS table as well in the Match_Lists query.
OpnSeason
View 4 Replies
View Related
Apr 14, 2005
I need to link two tables on the Name Field. The trouble is that the names are not enterred the same in each table, so I can't do a direct = comparison.
For instance, one table might have "The Heart Center of Indiana", while the other has "Heart Center of Indiana" Or one might have "St. John's Medical Center" and the other has "St Johns Medical Center" (or, god help me, "St John's Hospital")
My only thoughts are somehow building a matching rank by saying that 85% of the characters in "The Heart Center of Indiana" match "Heart Center of Indiana". There are thousands of names in each list, and I would very much not like to have to manually try to spot them.
I doubt there is a direct solution to my problem, so any tips on how I can make a translation table is aoppreciated.
Thanks,
David
View 4 Replies
View Related
Mar 5, 2006
Hello All,
For those who are interested in Approximate String Matching or those who could use these algorithms; I have a complete
suite of Approximate String Matching algorithms written in Visual Basic in an Access database.
In 2004 I decided to jump into the world of Fuzzy Matching with both feet.
As it is, I am working for a company that deals with names, addresses, etc. very intensely. It is a fair sized company
that uses Access on a grand scale. Since I am an Access programmer, I work in an Access gold mine!
I knew that if I could get a good handle on Fuzzy Matching, that when I hit the right person at the right time, the
company could greatly benefit from my research on Fuzzy Matching. The right time and the right person are not here yet.
Nevertheless, since I have reaped much free source code and information from the Web, it is now time to return the
favor.
I developed a package that is sort of a demo/tutorial on Approximate String Matching algorithms in Access that is very
robust in Fuzzy Matching. It would overtax the post in this forum for me to include it in a post.
To summarize, it works with the basic name - Last, First, and Middle. It has a user interface that allows a user to
type in what would be a good name and what would be a questionable name to resemble the good name. The weighted results of all the various algorithms can be chosen, or an individual algorithm can be chosen to display how closely the names match.
In addition, it has a table of 17,295 known good names with unique ID numbers as a reference table, and table of 1200
morphed names that are typical of names entered in a database with no input conventions. These morphed names have typos, transpositions, variations on maiden names, etc. 1200 good names were selected for alteration and the unique ID of each original good name was stored in the table with the altered names to determine the accuracy of the matching
process.
The morphed names were compared to the known good names in a query with an approximate join using the suite of
algorithms to determine match percentage. The altered names, the ID number of the original good name, the ID number of the name it matched to, and the match percentage were stored in a results table to determine the results of the matching run.
These tables were used to test and tweak the algorithms by comparing the morphed names with the known good names. The results of 1322 names were saved to a results table with match scores.
The matching process was executed in a query with an approximate join using the suite of algorithms.
The match results:
Total Approximate Matches: 1188
(Recall) Precision Pct: 99.00%
Total Unmatched Names: 12
Unmatched Pct: 1.00%
Total Other Matches: 134
Other Matches Pct: .77%
The tables are accessible in the database, so anyone can run their own tests. The interface is set up to accommodate this as well.
The algorithms used: Dice coefficient as a threshold algorithm, Levenshtein Distance algorithm, Longest Common
Subsequence, and the DoubleMetaphone. The names were passed to the algorithms by way of the bigram model.
I will email it to anyone who requests it.
It is in two platforms, Office 97 and Office 2000 as FuzzyMatching97.zip (692 KB) and FuzzyMatching2k.zip (721 KB).
The zip files include ApprxStrMatchingEngine97.pps or ApprxStrMatchingEngine2k.pps respectively, StrMatching97.mde or StrMatching2k.mde respectively, IEEESoundexV5.pdf, and VBAlgorithms.txt.
IEEESoundexV5.pdf is an abstract about Approximate Sting Matching that fired my curiosity about the subject, and
pertains to the package.
VBAlgorithms.txt contains the entire suite of algorithms in Visual Basic extracted from the MDB modules.
The PowerPoint presentations describe the workings of the MDE and give a good overview of Fuzzy Matching.
View 14 Replies
View Related
May 2, 2006
I have created an invoicing system for a CD shop
There is a transaction table which has each individual cd sold as a seperate transaction. Each transaction has an order number, so there can be more then one cd sold per order, but they all still have their own record in the table.
im now invoicing each order by mail merge based on a query that finds all the details on every transaction but wht ive found is that the mail merge puts each transaction onto a different page even if its from the same order number as another.
does anyone know how to group each order in the query so that all the items in one order come in a single invoice?
View 3 Replies
View Related
Jun 21, 2005
Hi, I have a query which numerous fields, and I need to make a report based on the query. However I need to group 3 fields in the query and press the sum button on 2 of them, Qty and Total Price (which is qty*price)
I need to do this so in the report when a particular wine is purchased more than once, instead of repeating the peoples name who bought the wine, it will only show 1 and automatically add the rest to the total price.
I dont know how to group within a query, can someome please tell me how? Thanks.
View 1 Replies
View Related
Jul 6, 2005
Hi to all;
I have one code of 6 digits; each digit refer to a group; first 2= product group; beverage; live animals,….itc (24 product groups), the 3ed digits= food product, the 4th-6th digit= detail product group; vegetables, fruits,….,the rest of digits refers to product name, carrot, apple,…itc , example 070511
How can I use the query to sum the product value at different group level; example the first 2= product group, ..?
Do I have to split my code to 6 cods to do my calculation?
Thank in advance for help
majed
View 1 Replies
View Related
May 2, 2006
I have created an invoicing system for a CD shop
There is a transaction table which has each individual cd sold as a seperate transaction. Each transaction has an order number, so there can be more then one cd sold per order, but they all still have their own record in the table.
im now invoicing each order by mail merge based on a query that finds all the details on every transaction but wht ive found is that the mail merge puts each transaction onto a different page even if its from the same order number as another.
does anyone know how to group each order in the query so that all the items in one order come in a single invoice?
View 7 Replies
View Related
Sep 21, 2004
Hi all
I am trying to find a way of finding the number of a group of sessions as a percentage figure. e.g. total number of clients attending 1-3 sessions = 20%, 4-6 =15%, 7-12 =21% 1-2years = 8%, etc. and entering this calculation in a report.hope you can help.
Dave
View 9 Replies
View Related
Jul 18, 2005
Question on grouping within Access Reports:
A simplistic view of the report I'm trying to generate is as such:
Company Name - Company Description - Employee
I am grouping by Company name, and I am hiding duplicates of Company Description because they can be long. I also have the Company Description as Allowing to Grow.
The problem is that the first record gives the company, the description, and the employee name on the first line just fine, but the second employee name won't be listed until the Company description ends. When the description is about a paragraph long, the other employees look very seperated from the initial record.
Any way to fix this?
View 1 Replies
View Related
Mar 9, 2008
Hi,
can form objects be grouped? i currently have numerous buttons on a form that are shown according to a button selection. my current code makes all buttons visible / invisible singularly but i wiondered if they could be grouped/ named and the get the code to make the group visible / invisible?
many thanks,
NS
View 4 Replies
View Related
Sep 19, 2005
Can you help me.
I have about 2000 records with Part Number XXXCOGxxxxx. This Should be C0G. Ex:
SPN
RPE122COG101K
RPE122COG681K
RPE122COG101K
RPE122COG681K
GRM40COG100J
GRM40COG100K
GRM40COG101J
GRM40COG150J
GRM40COG220J
GRM40COG221J
GRM40COG271J
GRM40COG470J
GRM40COG471J
GRM42-6 COG100J
GRM42-6COG220J
GRM42-6COG472J
What to write a query to change just the COG portion. Can you tell me the formula?
Thanks
Chris
View 1 Replies
View Related
Oct 23, 2005
I am trying to group time entries so that I can say, between 8AM and 10AM there were this number of calls. I have a field called Time_Assigned with numberous time entries (such as 08:15:33 AM) corresponding to another field called Incident_Type (such as medical). The data spans a whole year so I have several thousand time entries and I would like to show how many incident types occur between such and such hours. Thanks again for everyone's generous help.
View 2 Replies
View Related
Dec 6, 2005
Hi,
I would like to know how i can group numbers into bands.
e.g
Col 1
1
2
3
4
5
6
How do I group the above into groups of i.e. 1 - 3 and 4 - 6, or group in regular intervals like 1-2, 3-4, 5-6?
Regards
View 3 Replies
View Related
Feb 24, 2006
Hi all,
I have a db for logging meeting action points. Each action point has a description and owner. I'd like a query/report which will produce a list of action points grouped by owner (easy), but with a particular owner's action points at the top.
Essentially, rather than do a GroupBy and ascending sort, i need to specify the first group that is displayed. I don't care in what order the other groups appear.
Eg.
Action Point Owner
1. Task 1 DC
6. Task 6 DC
8. Task 8 DC
3. Task 3 AG
4. Task 4 AG
So, above if i just did an ascending sort, the DC records wouldn't be at the top.
any ideas?
El.
View 3 Replies
View Related
May 16, 2006
How do i group the following records
ID Surname Pack
29679Minican 1
29679Minican 2
27818Oliver 1
27818Oliver 2
27818Oliver 3
so its will show ...
27818Oliver
29679Minican
... on a spreadsheet
View 1 Replies
View Related
Jul 6, 2006
I have a table which stores meeting room booking information in half hour slots.
If someone books a 2 hour meeting then 4 records would be produced – one for each half hour. I want to produce a query which will group the data by room (ScheduleID) showing the max and min times (ie the initial start time and final end time) for each event and the event details. The table structure is as follows
ScheduleDetailsID, ScheduleID, CustomerID, ScheduleStartTime, ScheduleEndTime, meeting purpose
And the query I have tried is as follows:
SELECT [Schedule Details].ScheduleDetailsID, [Schedule Details].ScheduleID, [Schedule Details].CustomerID, Min([Schedule Details].ScheduleStartTime) AS MinOfScheduleStartTime, Max([Schedule Details].ScheduleEndTime) AS MaxOfScheduleEndTime, [Schedule Details].[meeting purpose]
FROM [Schedule Details]
GROUP BY [Schedule Details].ScheduleDetailsID, [Schedule Details].ScheduleID, [Schedule Details].CustomerID, [Schedule Details].[meeting purpose];
Can anyone tell me where I have gone wrong. It does not group the data as I want it to ie by room, then time, with only the initial start and final end times. Thanks in advance. Peter
View 3 Replies
View Related
Aug 1, 2006
I have a query which has many sums and counts on things like "Company name", "region name" and "Development Name".
I'm using this query for a report to do lots of percentages with, but now i need to filter this also by a date period.
So the user choses "Alex Homes" as the company name and then "July 2006" as the reporting month, and i need all the sums and counts to stay the same and only count/sum the records in the chosen month.
I can't seem to think of a way to do this.
If you need more info then ask.
View 4 Replies
View Related
Oct 10, 2006
I have a set of evaluation response data. There are a number of question (like 20), and each respondent answered the question on a scale of 1 to 5. Thus I have a table Responses where each column is a question and each row is a Respondent's answers. Each cell has a value between 1 and 5 (inclusive), or is possibly NULL.
What I want is to be able to list out the number of people who responded with a certain evaluation for each question. So it would list the number of people who gave a 1, a 2, a 3, and so on for each question. So I'd get something like:
---------Q1 Q2 Q3 Q4 Q5 ...
-1- x x x x x ...
-2- x x x x x ...
-3- x x x x x ...
-4- x x x x x ...
-5- x x x x x ...
Ideally I'd like the questions to be on the rows and the numbers as columns, but I can just do the transposition when I make this as a report.
Any ideas of a SQL Query that will get me such a table? I'd really like to avoid VBA if possible (I'm writing this database to be used/maintained by a non-programmer), and I'd like to not have to develop like 20 different subreports in order to print out this information.
Thanks.
View 6 Replies
View Related
Jan 8, 2007
I am having a problem that I hope has a very simple solution that I am somehow overlooking.
I have simplified my query for the purposes of this question. I have a query that only get the date and amount of a transaction. I want to group the information by date and have the transaction amount summed. For some reason it will not group by date.
Here is my query displayed in SQL:
SELECT Transactions.Date, Sum(Transactions.TransTotal) AS SumOfTransTotal
FROM Transactions
GROUP BY Transactions.Date
ORDER BY Transactions.Date;
And here is a sample of the data returned:
DateSumOfTransTotal
12/13/20055.12
12/13/20055.12
12/13/20055.12
12/13/20055.12
12/15/20051.15
12/15/20051
12/15/20050.12
12/15/20056.56
12/16/200519.14
12/16/200512
12/16/20058.16
12/17/200511.11
Why will it not group all the 12/13 or 12/15's together? I have done other queries like this and not had this problem. What am I overlooking?
Any help is very much appreciated. Thanks,
View 13 Replies
View Related
Jul 31, 2007
Hi
I have this table
date, error code, user
i need the output to be :
User, Error Code, Month/Week1 Count of error code, Month/Week2 Count of error code .....
basically how do i make error code field as count with each week 1-4 for the month?
Format([AuditDate],"mm-ww") create week but in 1-52 weeks not 1-4 for the month. Also need something like ' count if week = 1'
Hope this makes sense.
Thanks
View 1 Replies
View Related
Aug 10, 2007
I need to create a report that groups duplicates and totals the quantity.
Here's an example:
Current Report:
Product# Type Qty Cost Total
i123 pen 2 1.00 2.00
a987 paper 3 1.00 3.00
i123 pen 1 1.00 1.00
a987 paper 4 1.00 4.00
What I need:
i123 pen 3 1.00 3.00
a987 paper 7 1.00 7.00
Please help......
View 2 Replies
View Related
Feb 22, 2008
I have a table GIS_Subs with following fields:
Force_No ( Foreign Key)
Subs_Dt
Block Yr
Subs_Amt (number)
Running_Total
I wish to update Running Total for each record based on previous pay sorted in Ascending.
I am able to make a running sum but then it clubs all PersonNo of same date like:
[COLOR="Blue"]SELECT GIS_Subs.Force_No AS FN, DatePart("yyyy",GIS_Subs!Subs_Dt) AS AYear, DatePart("m",GIS_Subs!Subs_Dt) AS AMonth, DSum("Subs_Amt","GIS_Subs","DatePart('m', [Subs_Dt])<=" & [AMonth] & " And DatePart('yyyy', [Subs_Dt])<=" & [AYear] & "") AS RunTot
FROM GIS_Subs
GROUP BY GIS_Subs.Force_No, DatePart("yyyy",GIS_Subs!Subs_Dt), DatePart("m",GIS_Subs!Subs_Dt)
ORDER BY DatePart("yyyy",GIS_Subs!Subs_Dt), DatePart("m",GIS_Subs!Subs_Dt);COLOR]
How can I calculate it for each Person seperately?:rolleyes:
View 3 Replies
View Related