Fuzzy Grouping Transform Corrupts Pass-through Data
Aug 2, 2005
We are working with a client and are using the Fuzzy Grouping transform for de-duping and hierarchy creation for a national account list.
Dear Friends,
I think Fuzzy Lookup compares the mapped columns by spelling (it will reject a match if even one letter differs in a mapped column of any record), e.g. RAVI != REVI.
What is Fuzzy Grouping? Please explain.
regards
koti
I've seen one other post on this topic from October 2005 and I thought I'd bring it up again. I've a Fuzzy Grouping component in my data flow. The output data from it appears to be the result of records spliced into other records. This includes pass-through columns, not merely "clean" or similarity columns. For example (I've added the suffixes for illustrative purposes):
AddressLine1_in: 162 OAKMONT
AddressLine1_out: 162 OAKMONTLAMINATION INC
CityStateZip_in: Alexander, AR 72002-8539
CityStateZip_out: Alexander, AR 72002-8539116-7066
These are just pass-through columns, although "used" columns are seeing something similar (below). Has anyone else experienced this?
City_in: Alexander
City_out: Alexandertle Rock
I am using the Sept CTP and am running Fuzzy Grouping on 1.5 million records.
I managed to get Fuzzy Grouping working. The relevant output columns (_key_in and _key_out) are stored in a new table that is a copy of the old table plus the Fuzzy Grouping columns.
How do I get SSIS to store _key_in and _key_out in the original table?
The new matching column _key_out refers to the new key, _key_in. How can I get SSIS to translate that into a matching column that refers to my original key?
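One way to think about the translation: since _key_out holds the _key_in value of the canonical row, a lookup from _key_in to the original key resolves it. Below is a minimal Python sketch of that idea, assuming the original key was passed through Fuzzy Grouping as an ordinary column (the name customer_id and the sample values are hypothetical). In a package, the equivalent would be a self-join of the output table on _key_out = _key_in.

```python
def map_canonical_to_original(rows, original_key="customer_id"):
    """Return {original key of a row: original key of its canonical row}."""
    # _key_in is unique per row, so it can index the original key directly.
    key_in_to_original = {row["_key_in"]: row[original_key] for row in rows}
    # _key_out holds the _key_in of the canonical row; translate it back.
    return {
        row[original_key]: key_in_to_original[row["_key_out"]]
        for row in rows
    }


# Hypothetical Fuzzy Grouping output with the original key passed through.
rows = [
    {"_key_in": 1, "_key_out": 1, "customer_id": "C-900"},
    {"_key_in": 2, "_key_out": 1, "customer_id": "C-417"},
    {"_key_in": 3, "_key_out": 3, "customer_id": "C-112"},
]
canonical_of = map_canonical_to_original(rows)
```

The resulting mapping can then be written back to the original table as a "canonical key" column, keyed on the original key.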
Hi folks,
What is the use of Fuzzy Grouping in SSIS?
Please give me an example.
regards
koti
Hi - we have been evaluating using Fuzzy Grouping and Lookup for maintaining our large list of customer records. Initial testing with Grouping on about 300K records went great but now with a larger sample of 7.3 million records we are running into problems. It doesn't appear to be system limitation - the index is built reasonably quickly and without errors but when it starts the matching we get these errors:
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The ProcessInput method on component "Fuzzy Lookup" (86) failed with error code 0x8000FFFF. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread0" has exited with error code 0x8000FFFF.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
[Fuzzy Grouping Inner Data Flow : OLE DB Source [1]] Error: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" has exited with error code 0xC0047039.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "SourceThread0" has exited with error code 0xC0047038.
One thing we did find is that our test server didn't have SP1 installed, and installing it seemed to help a lot (we were getting buffer errors prior to SP1). One other note: the destination table is populated with all the data, but no scoring has been applied to it.
Does anyone have any ideas what could be causing this?
Thanks!
Keith Doyle
Hello,
I have created a project to de-duplicate addresses.
I understand that Fuzzy Grouping takes less time when it has less data to process.
My source feed file is sometimes huge, so I am splitting the input into multiple branches based on
the first letter of the city. There are 7 branches in the process.
Source File Feed
|
Split data into 7 groups
|
------------------------------------------------------------------------------------------------------------------------------------------
| | | | | | |
FzGrpg FzGrpg FzGrpg FzGrpg FzGrpg FzGrpg FzGrpg
| | | | | | |
Split Split Split Split Split Split Split
| | | | | | |
------------- -------------- -------------- -------------- -------------- -------------- --------------
| | | | | | | | | | | | | |
<- - - - - - - Write the Canonicals and Dupes from each of these splits into database - - - - - - - - ->
When I designed this I was hoping that the Fuzzy Grouping transforms would execute in parallel.
But in reality they are processed one after the other.
Is there any way to make them execute in parallel?
Appreciate your help.
Thanks
KM
Hi,
I have an Oracle table called "Party" which has Party_Id as its primary key and Party_Name, Party_Addr, etc. as fields. The table contains many duplicate party details (party_name and party_addr). We are trying to avoid duplicates using the fuzzy logic in SSIS.
1. Can anybody suggest how to create a package that avoids duplicates using fuzzy logic for this scenario? (Step-by-step instructions would help me understand SSIS.)
2. Could you please provide some samples for the fuzzy transforms? (Please send a sample to my email.)
I was running a Fuzzy Grouping task on SQL Server Enterprise Edition SP1 without any issues. I then applied SP2 and now that same Fuzzy Grouping is causing a minidump and terminating the process.
First, does anybody know anything about this kind of issue?
Second, I tried to run the minidump file in Visual Studio but I cannot actually run the dump file in Visual Studio as I keep getting the following exception:
Debugging information for 'DtsDebugHost.exe' cannot be found or does not match. No symbols loaded.
Finally, I did obtain a random error on the server itself that displayed the GUID: 58FC39EB-9DBD-4EA7-B7B4-9404CC6ACFAB.
This GUID appears to be tied to a Dr. Watson error but, again, I cannot figure out what process is breaking.
Can somebody please help?
Hi,
We do not have any address cleansing tools, and the requirement is to cleanse the data by finding the best possible record (the one that has all the info) and updating the other records accordingly.
I am not sure whether we can do this with the Fuzzy Grouping transformation.
Example:
I have Source table with following info.
Customer_id  Location_Address  Location_City  Location_State  Location_Zip  Location_County
TT101        252 HARVARD RD    ATLANTA        GA              30340         FULTON
TT101                                                         30340
TT101        252 HARVARD RD    ATLANTA
TT102        125TEST           CUMMING
TT102        125 TEST DR       CUMMING        GA              30040         FORSYTH
TT102                                         GA              30040
Please let me know the solution
Thanks in advance.
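Because the sample rows already share a Customer_id, the "best record" part can be sketched without any fuzzy matching: for each customer, take the most complete value seen in each column and apply it to all of that customer's rows. The Python sketch below uses "longest non-blank value wins" as the completeness rule, which is an assumption (it happens to prefer 125 TEST DR over 125TEST). The sample rows are one reading of the flattened data in the post.

```python
from collections import defaultdict

COLUMNS = ["Location_Address", "Location_City", "Location_State",
           "Location_Zip", "Location_County"]


def best_values_per_customer(rows, columns=COLUMNS):
    """Per Customer_id, keep the longest non-blank value seen in each column."""
    best = defaultdict(lambda: dict.fromkeys(columns, ""))
    for row in rows:
        current = best[row["Customer_id"]]
        for column in columns:
            value = (row.get(column) or "").strip()
            # Heuristic (assumption): a longer value is the more complete one.
            if len(value) > len(current[column]):
                current[column] = value
    return dict(best)


def make_row(customer_id, *values):
    """Build a row dict from positional values given in COLUMNS order."""
    return {"Customer_id": customer_id, **dict(zip(COLUMNS, values))}


rows = [
    make_row("TT101", "252 HARVARD RD", "ATLANTA", "GA", "30340", "FULTON"),
    make_row("TT101", "", "", "", "30340", ""),
    make_row("TT101", "252 HARVARD RD", "ATLANTA", "", "", ""),
    make_row("TT102", "125TEST", "CUMMING", "", "", ""),
    make_row("TT102", "125 TEST DR", "CUMMING", "GA", "30040", "FORSYTH"),
    make_row("TT102", "", "", "GA", "30040", ""),
]
best = best_values_per_customer(rows)
```

If the customer ids themselves were unreliable, Fuzzy Grouping could supply the grouping key (_key_out) in place of Customer_id, and the same per-group merge would apply.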
Hello,
I have been struggling with this for quite a while, so any help would be appreciated.
I need to know if there is a way to populate the Fuzzy Grouping control dynamically. I know you can programmatically design a package and customize it in C#, but for our purposes we would like to control the SSIS package via database settings, so that when the settings change the package acts differently. It's a simple package consisting of an input, Fuzzy Grouping, a conditional split, and an output. The connections are set up dynamically using parameters, expressions and a script task. Is there any way I could do a similar thing for Fuzzy Grouping?
Hello All,
We have an SSIS package which includes Fuzzy Grouping in its Data Flow. It takes two columns from a source table and saves the output in a different table with match scores etc. This is how we are doing it:
1. Load required data from table using OLEDB connection (source)
2. Sort the data
3. Apply Fuzzy Grouping (using a dedicated database instead of tempdb, and MinSimilarity = 0.6)
4. Send to destination table using OLEDB connection (destination)
The input table has millions of records. The package takes too long to execute and sometimes even fails after running for 12 hours. Any suggestions for performance improvement are welcome.
Appreciate your help.
Thanks and regards,
Ashish Basran
I have a few questions about the amount of resources used by the Fuzzy Grouping transformation. I am running a little under 5 million records through a Fuzzy Grouping that exact-matches one column and fuzzy-matches another. The server executing the package is a dual-core Xeon with 2 GB of RAM, running a default instance of SQL Server 2005 Enterprise.
I have been attempting to execute this package for a while now, but it keeps erroring out for various reasons. At first, it was from a lack of available memory. I limited the memory usage of SQL Server to 256 MB and set the buffer temp storage path, which alleviated those errors. However, now my tempdb transaction log is growing significantly. It failed once for not being able to grow and reallocate quickly enough, but enlarging the auto-growth factor fixed that. Then it filled up the volume the tempdb log was on, so now I have moved it to the SAN and am about to try again.
I was wondering, does anyone have a general idea of the approximate resource usage of Fuzzy Grouping? Specifically, is there an approximate relation between the number of records grouped and the amount of RAM/pagefile required? Also, on the database back end, how big can I expect the tempdb data/log files to get?
Hi,
I need some advice on fuzzy lookup / grouping design.
I have a requirement that, I think, is between lookup and grouping transformations.
In one of our applications, users can manually enter a label for some information in the database.
Every month, I will store all the new data in our OLAP DB, and I want to group these labels using fuzzy logic.
Historical data (already loaded) has to be grouped, as well as the new data coming in every month.
I have no predefined canonical data, so Fuzzy Lookup does not seem suited to my problem.
Fuzzy Grouping seems OK, but it would require feeding historical data as well as new data into the Fuzzy Grouping transformation to build the groups. This does not seem efficient to me.
Any clues?
M.D
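One middle ground between the two transforms is to treat the canonicals produced so far as the reference set: each new label is matched against the existing canonicals, and only labels with no close match start a new group. The Python sketch below illustrates that flow. It uses difflib.SequenceMatcher from the standard library as a stand-in similarity measure (not the algorithm SSIS uses), and the 0.8 threshold and sample labels are assumptions.

```python
from difflib import SequenceMatcher


def assign_groups(new_labels, canonicals, min_similarity=0.8):
    """Attach each new label to the closest known canonical, or start a new group."""
    canonicals = list(canonicals)  # copy so the caller's list is not modified
    assignments = {}
    for label in new_labels:
        scored = [
            (SequenceMatcher(None, label.lower(), c.lower()).ratio(), c)
            for c in canonicals
        ]
        best_score, best = max(scored, default=(0.0, None))
        if best is not None and best_score >= min_similarity:
            assignments[label] = best
        else:
            # No close match: the label becomes a canonical for later labels.
            canonicals.append(label)
            assignments[label] = label
    return assignments, canonicals


existing = ["Marketing", "Finance"]
monthly = ["Marketting", "Logistics", "Logistic"]
assignments, updated_canonicals = assign_groups(monthly, existing)
```

In package terms this would roughly correspond to a Fuzzy Lookup against a table of existing canonicals, with only the unmatched rows going through Fuzzy Grouping and being appended to that table, so the historical data is not regrouped every month.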
Hi all,
My question is how to calculate similarity using a SQL query, for example with LIKE %, ORDER BY, etc. I'm writing a function that works like Fuzzy Grouping, but I do not know how it arrives at its answer, that is, how it matches the selected rows of data.
I hope my question is clear. How do I write the correct query? What should I do? I'm a newbie in Integration Services, so I need a step-by-step explanation if there are corrections.
I look forward to hearing from you, and thanks a lot in advance.
Thanks!
rgds,
xuenly
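A LIKE pattern can only say whether a value matches, not how close it is, so a similarity score needs some string-distance measure. As an illustration only, the Python sketch below scores pairs with difflib.SequenceMatcher from the standard library. This is a generic measure, not the one Fuzzy Grouping uses internally, and the 0.6 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_matches(value, candidates, min_similarity=0.6):
    """Return (score, candidate) pairs at or above the threshold, best first."""
    scored = [(similarity(value, candidate), candidate) for candidate in candidates]
    return sorted((pair for pair in scored if pair[0] >= min_similarity), reverse=True)


score = similarity("RAVI", "REVI")
matches = best_matches("ron", ["ronn", "john", "matt"])
```

Here "RAVI" and "REVI" get a score below 1 but well above 0, which is the kind of graded answer an exact comparison or LIKE cannot give.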
I have recently decided to de-dupe my data, but after running Fuzzy Grouping I am having a problem with the update query that decides which duplicate to keep.
_key_in is unique and _key_out identifies the duplicate group, for example:
_key_in , _key_out , name , score , dedupe
1 , 1 , ron , 10 , purge
2 , 1 , ronn , 15 , keep
3 , 3 , john , 5 , keep
4 , 4 , matt , 15 , keep
5 , 4 , mat , 10 , purge
6 , 4 , matt , 15 , purge
I want to keep the row with the highest score in each _key_out group by setting the field de_dupe to 'keep' and the remainder to 'purge'. The score can also be the same within a duplicate group; in that case I just need to keep one, and it doesn't matter which. The query I have below nearly works, but it marks all duplicates with the same score as keep.
Code:
UPDATE b
SET b.dedupe_result = 'keep'
FROM
[BusinessListings].[dbo].[MongoOrganisationACTM1Destination] b
INNER JOIN
[Code] ....
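The tie problem usually goes away once the ordering inside each group is made total: rank by score descending, then break ties on the unique _key_in, and keep only the first row. The Python sketch below shows that rule on the sample data from the post. The choice of the lowest _key_in as tie-breaker is an assumption, since any unique column would do. In T-SQL the same rule is commonly written with ROW_NUMBER() OVER (PARTITION BY _key_out ORDER BY score DESC, _key_in).

```python
from itertools import groupby


def mark_keep_or_purge(rows):
    """Mark exactly one row per _key_out group as 'keep' and the rest as 'purge'."""
    # Order by group, then highest score first, then lowest _key_in as tie-breaker.
    ordered = sorted(rows, key=lambda r: (r["_key_out"], -r["score"], r["_key_in"]))
    decisions = {}
    for _, group in groupby(ordered, key=lambda r: r["_key_out"]):
        for position, row in enumerate(group):
            decisions[row["_key_in"]] = "keep" if position == 0 else "purge"
    return decisions


rows = [
    {"_key_in": 1, "_key_out": 1, "name": "ron", "score": 10},
    {"_key_in": 2, "_key_out": 1, "name": "ronn", "score": 15},
    {"_key_in": 3, "_key_out": 3, "name": "john", "score": 5},
    {"_key_in": 4, "_key_out": 4, "name": "matt", "score": 15},
    {"_key_in": 5, "_key_out": 4, "name": "mat", "score": 10},
    {"_key_in": 6, "_key_out": 4, "name": "matt", "score": 15},
]
decisions = mark_keep_or_purge(rows)
```

Rows 4 and 6 tie on score, and only row 4 is kept because it has the lower _key_in.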
Hello,
I was wondering how Fuzzy Grouping deals with and handles first name similarities. Is there a way to configure it so that Anthony = Tony, Bill = William, etc.? I created a simple package with several rows containing similar first names and ran the fuzzy grouping on the first name column. I received only one possible duplicate of Will = William which was at 56%. I lowered the threshold down to 1% and still only one match.
Now I understand and appreciate the reasons for this but was wondering if this type of situation was considered and a way of dealing with it is available.
Thanks,
Beac
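Names like Tony and Anthony share too few characters for a string-similarity measure to relate them, so a common workaround is to map first names through a synonym table before the fuzzy step and match on the mapped value. The Python sketch below shows that pre-processing step. The NICKNAMES table is a tiny hypothetical example, and a real one would have to be sourced or maintained separately.

```python
# Hypothetical nickname table: nickname -> formal name (all lowercase).
NICKNAMES = {
    "tony": "anthony",
    "bill": "william",
    "will": "william",
}


def canonical_first_name(name, nicknames=NICKNAMES):
    """Return the formal form of a first name, or the cleaned name if unknown."""
    cleaned = name.strip().lower()
    return nicknames.get(cleaned, cleaned)


names = ["Tony", "Anthony", "Bill", "William", "Beatrice"]
canonical = [canonical_first_name(n) for n in names]
```

In a package this could be a Lookup against a nickname reference table that adds a "canonical first name" column, which is then used as the Fuzzy Grouping input in place of the raw first name.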
Hi All,
Is there a way the fuzzy lookup or grouping can be trained so that similarities and confidence values rely on previously matched strong links?
For example: I can link 80% of my two datasets using one strong identifier (say phone #) which I trust. My goal then, is to use the probability of matching of the rest of my linking fields (say Name,Address,Gender,DOB) in a "matched by phone number" pair to train a fuzzy lookup task to be done on the unlinked 20% of the datasets.
This "training set" would in theory influence the similarity and confidence values of the fuzzy output since each linking column would carry a different weight or contribution towards a confident match.
Does anyone out there know how to do this in practice in SSIS?
I have tried to process more than 3 million records with Fuzzy Grouping on two different servers with no success. 3 million works, but anything above 4 million doesn't. Some background:
We are trying to de-dup our customer table on: name (.5 min), address1 (.5 min), city (.5 min), state (exact). .8 overall record min score.
Output includes additional fields: customerid, sourceid, address2, country, phonenumber
Without SP1 installed I couldn't even get a few hundred thousand records to process
Two different servers - same problems. Note that SSIS and SQL Server are running locally on both
The higher end server has 4GB RAM, the other 2.5 GB RAM. Plenty of free disk space on both
SQL Server is configured to use 2 GB of RAM max
The page file is currently at 15GB
After running a number of tests on both servers, trying different batch sizes etc., the one thing I noticed is that it seems to always error out when SSIS takes over and starts chewing up all the available RAM. This happens after the index is created and SSIS starts "warming caches". On both servers SQL Server uses up about 1.6GB of RAM at this point, while SSIS keeps taking over RAM until all physical RAM is used up.
Some questions:
Has anyone been able to process more than 3 million records, and if so, what is your hardware configuration?
Should we try running SSIS from a different server so it has access to the full amount of physical RAM? (so it doesn't have to fight for RAM with SQL Server)
Should we install Win 2003 Enterprise Server so we can add more RAM?
Any ideas why switching to the page file might be causing errors?
Thanks!!
Keith Doyle
I am having trouble programmatically creating a Fuzzy Lookup package. I have successfully built 90% of it, along with a different Fuzzy Grouping package, but have hit a wall with regard to the pass-through columns of the Fuzzy Lookup component.
The last line of code below always fails. Prior to the code below, I've set up my Fuzzy Lookup component, instantiated it (the instance variable), and attached its input to the output of an OLE DB source. At this point, the only part that I haven't been able to figure out is the code below -- this is where I'm trying to add pass-through columns to the output of my Fuzzy Lookup component. ImportId and ImportRowId are columns that are in my OLE DB source and thus in the input of my fuzzy component. Below I try to get them to pass through so that they're in the output, and the last line fails. When I step through the code, I see that outputColumn.LineageID is in fact the correct value (I compared it with a package I created manually, and the value when debugging is exactly the same as the value in the XML of the manually built version).
Code Block
IDTSOutput90 fuzzyLookupOutput = this.FuzzyLookup.OutputCollection["Fuzzy Lookup Output"];
IDTSOutput90 sourceOutputCollection = this.OleDbSource.OutputCollection["OLE DB Source Output"];
IDTSOutputColumnCollection90 sourceOutputCols = sourceOutputCollection.OutputColumnCollection;
foreach (IDTSOutputColumn90 outputColumn in sourceOutputCols)
{
    // pass through columns
    IDTSOutputColumn90 col = null;
    if (outputColumn.Name == "ImportId" || outputColumn.Name == "ImportRowId")
    {
        col = instance.InsertOutputColumnAt(
            fuzzyLookupOutput.ID, fuzzyLookupOutput.OutputColumnCollection.Count, outputColumn.Name, "");
        col.SetDataTypeProperties(
            outputColumn.DataType, outputColumn.Length, outputColumn.Precision, outputColumn.Scale, outputColumn.CodePage);
        instance.SetOutputColumnProperty(
            fuzzyLookupOutput.ID, col.ID, "SourceInputColumnLineageId", outputColumn.LineageID);
    }
}
Any thoughts???
Will the fuzzy grouping task match a null value to an empty string (or spaces)? I've got 5 columns I'm matching on, and one of them may be null for certain rows but an empty string for others. Given the 4 other columns may match, will this difference stop similar columns being grouped together?
(Someone's modified my grouped data since it was deduped, which takes a while, and I'm hoping for a quick answer on this).
Thanks in advance.
Ben
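Whatever the transform itself does with NULL versus an empty string, the difference can be removed before the data reaches it by normalizing blanks in the match columns. The Python sketch below shows that normalization. The column names are hypothetical, and note that it also trims surrounding spaces from non-blank values.

```python
def normalize_blank(value):
    """Map None, '' and whitespace-only strings to a single '' value."""
    if value is None:
        return ""
    return value.strip()


def normalize_row(row, match_columns):
    """Return a copy of the row with blanks normalized in the match columns."""
    normalized = dict(row)
    for column in match_columns:
        normalized[column] = normalize_blank(row.get(column))
    return normalized


# Hypothetical rows that differ only in how the missing address2 is stored.
row_a = {"name": "Acme Ltd", "address2": None}
row_b = {"name": "Acme Ltd", "address2": "   "}
clean_a = normalize_row(row_a, ["name", "address2"])
clean_b = normalize_row(row_b, ["name", "address2"])
```

In a data flow the same step would typically be a Derived Column ahead of Fuzzy Grouping, for example replacing NULL with an empty string and trimming the value.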
I have a table in which I need to identify similarities, so I'm running a Fuzzy Grouping process. I'm getting the following errors and I can't identify the problem, since all the fields are varchar except the first, which is int but is not used in the fuzzy matching.
select
MSSEndCustomerTPID
, orgname
, address1
, cityname
, statename
, countryname
from [sales].[vw_Fact_VolumeSales] a
inner join [GMOFBI].[dbo].[vw_Dim_MSS_Organization] b
on a.EndCustomerOrganizationKey=b.MSSOrganizationKey
[code]...
Hello Experts,
In my Data Flow Task I have a Fuzzy Lookup transformation. In the Columns tab of the Fuzzy Lookup Transformation Editor, if I attempt to select a field for pass through that is a DT_TEXT data type, I get the error:
Validation error. Data Flow Task: Fuzzy Lookup [3532]: The data type of column 'event_list' is not supported.Package.dtsx
BOL says, "Only input columns with the DT_WSTR and DT_STR data types can be used in fuzzy matching...." But I'm not doing fuzzy matching on the DT_TEXT column, I'm just trying to pass it through to the transformation's output. BOL doesn't say anything about this data type being incompatible with passing through to the output.
Any thoughts on how I may work around this issue? I was thinking I would need to perform the lookup on a subset of the columns without the DT_TEXT field and then merge the data back together at the end. But if there's a setting or some other way, please let me know.
The documentation on the Fuzzy Lookup transform mentions that only columns of type DT_WSTR and DT_STR can be used in fuzzy matching. I interpreted this as meaning that you could not create a mapping between an input column of type DT_NTEXT and a column from the reference table. I assumed that you could still have a DT_NTEXT column as part of the input and mark it as a pass-through column so that its value could be inserted in the destination, together with the result of the lookup operation. Apparently this is not the case. Validation fails with the following message: 'The data type of column 'fieldname' is not supported.' First, I'd like to confirm that this is really the case and that I have not misinterpreted this limitation.
Finally, given the following situation
- A data source with input columns
Field_A DT_STR
Field_B DT_NTEXT
- A fuzzy lookup is used to match Field_A to a row in the reference table and obtain Field_C.
- Finally, Field_B and Field_C must be inserted into the destination.
Can anyone suggest how this could be achieved?
Fernando Tubio
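One possible shape for this is a split-and-rejoin: send only a row key and Field_A through the lookup, keep Field_B aside, and join the two back together on the key afterwards. The Python sketch below shows that data flow only. The row_id key, the toy_lookup function and the sample values are hypothetical stand-ins, and the sketch says nothing about how the transform itself validates column types.

```python
def lookup_then_rejoin(rows, lookup, key="row_id"):
    """Run the lookup on narrow rows only, then re-attach Field_B by key."""
    # Keep the wide column aside, indexed by the row key.
    field_b_by_key = {row[key]: row["Field_B"] for row in rows}
    # Only the key and the match column go through the lookup step.
    narrow = [{key: row[key], "Field_A": row["Field_A"]} for row in rows]
    matched = [{**row, "Field_C": lookup(row["Field_A"])} for row in narrow]
    # Re-join on the key so Field_B and Field_C land in the same output row.
    return [
        {key: row[key], "Field_B": field_b_by_key[row[key]], "Field_C": row["Field_C"]}
        for row in matched
    ]


def toy_lookup(value):
    """Stand-in for the fuzzy lookup: exact match on a cleaned, lowercased value."""
    reference = {"acme": "REF-1", "globex": "REF-2"}
    return reference.get(value.strip().lower())


rows = [
    {"row_id": 1, "Field_A": "Acme", "Field_B": "long text one"},
    {"row_id": 2, "Field_A": "Globex ", "Field_B": "long text two"},
]
output = lookup_then_rejoin(rows, toy_lookup)
```

In a package this would roughly be a Multicast with one branch through Fuzzy Lookup, followed by a Merge Join on a unique key, which assumes the source has (or can be given) such a key.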
I am trying to interpret some of the results I observe when trying to match similar records using a fuzzy lookup transform, but it's not entirely clear how the overall row similarity score is calculated. In particular, sometimes rows with lower individual column similarity scores will achieve a higher similarity and confidence score than a matching row with higher individual column scores.
The transform is configured with 6 text fields set to fuzzy mapping and a minimum similarity of 0, and 3 additional numeric fields with an exact mapping. It is set to return a maximum of 2 matches per lookup and to do an exhaustive search of the reference table.
For example, in the following pair of matching records, Match 1 is picked over Match 2 even though its individual scores are lower.
                      Match 1            Match 2
                      -----------------  -----------------
_similarity_author    1.0                1.0
_similarity_title     0.85344648         1.0
_similarity_headline  0.0125             0.0125
_similarity_summary   0.0125             0.0125
_similarity_picture   1.0                1.0
_similarity_caption   1.0                1.0
_similarity           7.8429267E-2       7.3196657E-2
_confidence           0.55728668         0.44271332
In another case both matching records have *identical* scores for every mapped column and yet their similarity and confidence scores are different.
Clearly there are other factors involved in calculating the overall row score. Anybody know what these are?
Fernando Tubio
Hi all,
I'm new to the forum, and fairly new to SQL, so please forgive me if I say something that sounds stupid at any time. I have been looking around this forum (and others) while trying to diagnose my current issues, and there seems to be a really good amount of knowledge about here so I thought I'd try my luck with a post.
We've been running SQL 2000 for around 2 years now, and had very few problems until we suffered a torn page error around 6 months ago (just before the time I started looking after the server).
Since then we've been getting intermittent database corruption (I implemented a nightly DBCC job as part of DB maintenance), and it seems to occur after our weekly index maintenance has run (also a scheduled DBMaint job). I have just confirmed this by restoring nightly backups to a test database: the current round of corruption occurred sometime between the Saturday and Sunday night backups (indexing runs at 1 am Sunday morning and is the only activity between these two backups).
We have 9 small databases (all less than 200MB) which are used fairly lightly by our salesforce. The corruption is occurring on different databases each time.
We are running SQL 2000 (SP4) on a Windows 2003 clustered server (2 x DL380s and an MSA500 G1) running in failover mode (active-passive). The RAID 5 array does utilise disk caching, and I have wondered if this is involved somehow.
Our hardware support company has checked the server thoroughly for HW issues, but has come up blank.
Here is the DBCC output from one of our corrupt databases from this morning:
dbcc checkdb (casestatus) WITH ALL_ERRORMSGS, NO_INFOMSGS
-----------------------
Server: Msg 2533, Level 16, State 1, Line 1
Table error: Page (1:20040) allocated to object ID 117575457, index ID 0 was not seen. Page may be invalid or have incorrect object ID information in its header.
Server: Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 117575457, index ID 1. Page (1:20040) was not seen in the scan although its parent (1:19808) and previous (1:20039) refer to it. Check any previous errors.
Server: Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 117575457, index ID 1. Page (1:20041) is missing a reference from previous page (1:20040). Possible chain linkage problem.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 1, text ID 1190656606208 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 3, text ID 1190656671744 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 5, text ID 1190656737280 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 7, text ID 1190656802816 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 9, text ID 1190656868352 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 11, text ID 1190656933888 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 13, text ID 1190656999424 is not referenced.
Server: Msg 8964, Level 16, State 1, Line 1
Table error: Object ID 117575457. The text, ntext, or image node at page (1:12449), slot 15, text ID 1190657064960 is not referenced.
CHECKDB found 0 allocation errors and 11 consistency errors in table 'Raw:LifeStatusFP' (object ID 117575457).
CHECKDB found 0 allocation errors and 11 consistency errors in database 'CaseStatus'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (CaseStatus ).
-----------------------
The funny thing is that this particular database isn't even in use any more - I've had it in single user mode for about 2 months (and the users don't have any access to the back end stuff).
If anyone can shed any light on what's happening, I'd really appreciate this.
Thanks
Dave
What if the master db gets corrupted? Can we still recover our user databases?
Thanks
This one is a needle in a haystack. I have a fairly robust Dell server running Win 2003 with 16 GB of memory and SQL SP3a, using log shipping to fail over to a local server and then back to my location across a WAN. It serves a VB6 app with about 60 users in a 24x7 shop, Mon-Sat. Twice in the past month the production server has created a corrupt log-ship file, which gets applied to my failover server and then corrupts the failover DB on both failover servers. Of course it is my 20 GB DB, and the only time I can recreate the failover is Saturday night. The production DB is not corrupt and is fine. Has anyone ever run into this before? I am currently working with MS and Dell, but hoping someone has had experience with this!
I have an FTP task that grabs files from a remote dir (*.csv). However, it seems that the FTP task is corrupting some of my files. Has anyone else seen this? Is there something I can do? (These are grabbed as binary.)
For example, here is the original on the remote server:
25316,<ACTUAL>,296,917,48,10,1,2006-03-29,UPLOADER
25319,<ACTUAL>,63060,106,64,10,1,2006-03-29,UPLOADER
25300,<ACTUAL>,63060,206,64,10,1,2006-03-29,UPLOADER
29743,<ACTUAL>,56060,106,96,10,1,2006-03-29,UPLOADER
29744,<ACTUAL>,56060,206,96,10,1,2006-03-29,UPLOADER
25315,<ACTUAL>,261,607,48,10,1,2006-03-29,UPLOADER
29749,<ACTUAL>,45030,103,96,10,1,2006-03-29,UPLOADER
29750,<ACTUAL>,45030,203,96,10,1,2006-03-29,UPLOADER
29751,<ACTUAL>,63030,303,64,10,1,2006-03-29,UPLOADER
But here is the result on the local machine. You can see that the last 4 lines are duplicated (plus the last character of the preceding line):
25316,<ACTUAL>,296,917,48,10,1,2006-03-29,UPLOADER
25319,<ACTUAL>,63060,106,64,10,1,2006-03-29,UPLOADER
25300,<ACTUAL>,63060,206,64,10,1,2006-03-29,UPLOADER
29743,<ACTUAL>,56060,106,96,10,1,2006-03-29,UPLOADER
29744,<ACTUAL>,56060,206,96,10,1,2006-03-29,UPLOADER
25315,<ACTUAL>,261,607,48,10,1,2006-03-29,UPLOADER
29749,<ACTUAL>,45030,103,96,10,1,2006-03-29,UPLOADER
29750,<ACTUAL>,45030,203,96,10,1,2006-03-29,UPLOADER
29751,<ACTUAL>,63030,303,64,10,1,2006-03-29,UPLOADER
R
25315,<ACTUAL>,261,607,48,10,1,2006-03-29,UPLOADER
29749,<ACTUAL>,45030,103,96,10,1,2006-03-29,UPLOADER
29750,<ACTUAL>,45030,203,96,10,1,2006-03-29,UPLOADER
29751,<ACTUAL>,63030,303,64,10,1,2006-03-29,UPLOADER
Hi friends,
Can somebody tell me how to do this?
How can we analyze existing code used to transform data into the Operations Data Warehouse, and make changes to correspond to upcoming changes in the SAP data sources?
Thanks
sk
I have installed an application that uses MSDE on a server running Windows 2000 with SQL 7 already installed; my application, MSDE and SQL7 all continue to function properly after install.
The problems start when I uninstall MSDE; the uninstall of MSDE seems to corrupt the SQL 7 installation. After using Control Panel/Add-Remove Programs to uninstall MSDE I get an error message box with the heading SQL DMO and the error message Error 340 - General Error. I can no longer see the SQL Server registration and cannot re-register the server either!
My question is this: is there a process for uninstalling MSDE that will leave the SQL 7 installation in the same state as it was prior to installing MSDE? The uninstall of MSDE seems to corrupt the SQL 7 installation; how can I ensure that this does not happen? Does the uninstall of MSDE remove or re-register COM objects that are required by SQL 7, and if so, how can we get around this?
Hi,
On my MS SQL Server 2000, I am trying to create a generic way to load tables into my datawarehouse.
As input to the process I have a large number of table definitions stored individually as files on my server, and ASCII delimited data files in various locations, mostly accessible via NFS mounts.
I created two DTS packages in MSSQL2K that in theory represent what I want to do:
package1
... invoke package2 with global variables to load a system of related tables
package2
... check for a trigger file
... set the "Execute SQL Task" statement to my first file
... run the "Execute SQL Task" which drop/add's a table
... set a "Connection" to a data source file that I want to use
... run the transformation
and, with that my package starts to fall apart
... set the "Execute SQL Task" statement to the next file, and
...... go back and execute it
I can't figure out how to set the table in the transformation section to the table I want to use. And I assume the transformation links between the source and the new table would then have to be relinked.
The source files contain in the first row the column names as found in the tables I just created.
thanks,
Dave Rowsome