Single Source, Multiple Lookups Against Same Table
Nov 16, 2006
Let's say I have 4 columns coming from my OLE DB source.
Column1
Column2
Column3
Column4
I also have a table that I'll be using in a lookup, LUPTable. In LUPTable, I have two fields: LUPField and ReplaceField.
In my data flow, I need to take columns 2-4 and look them up against LUPField in LUPTable. I then need to add the value of ReplaceField (when a match is found) into the data flow.
The problem that I'm running into is that I don't want to sequentially do the lookups in the dataflow, because that's just a waste of time/memory. I only need to build the in-memory lookup table once, because that exact same data (it is static, for the most part) will be used for the remaining lookups.
What is the best way to achieve this?
The goal is to have the following columns remaining in the dataflow:
Column1
NewColumn2 (containing value from ReplaceField)
NewColumn3 (containing value from ReplaceField)
NewColumn4 (containing value from ReplaceField)
Column2-4 can be dropped from the dataflow after the lookups.
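For reference, a set-based sketch of the desired result, with LUPTable joined once per input column (SourceQuery is a stand-in for the OLE DB source):

SELECT src.Column1,
       l2.ReplaceField AS NewColumn2,
       l3.ReplaceField AS NewColumn3,
       l4.ReplaceField AS NewColumn4
FROM SourceQuery src
LEFT JOIN LUPTable l2 ON l2.LUPField = src.Column2
LEFT JOIN LUPTable l3 ON l3.LUPField = src.Column3
LEFT JOIN LUPTable l4 ON l4.LUPField = src.Column4;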
I need to do a 4 column lookup against a large table (1 million rows) that contains 4 different record types. The first lookup will match on columns A, B, C, and D. If no match is found, I try again with columns A, B, C, and '99' in column D. If no match, try again with columns A, B, D, and '99' in column C. Finally, if no match in any of the above, use column A, '99' in B, '99' in C, '99' in D. I will retrieve 2 columns from the lookup table.
My thought is that breaking this sequence out into 4 different tables/lookups would be most efficient. The other option would be to write a script that handled this logic in a single transform with an in-memory table. My concern is that the size of the table would be too large to load into memory.
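The whole fallback chain can also be expressed as one set-based pass, sketched here with assumed names (Ret1 and Ret2 stand in for the two retrieved columns); COALESCE takes the first match in priority order:

SELECT s.A, s.B, s.C, s.D,
       COALESCE(m1.Ret1, m2.Ret1, m3.Ret1, m4.Ret1) AS Ret1,
       COALESCE(m1.Ret2, m2.Ret2, m3.Ret2, m4.Ret2) AS Ret2
FROM SourceRows s
LEFT JOIN BigLookup m1 ON m1.A = s.A AND m1.B = s.B  AND m1.C = s.C  AND m1.D = s.D
LEFT JOIN BigLookup m2 ON m2.A = s.A AND m2.B = s.B  AND m2.C = s.C  AND m2.D = '99'
LEFT JOIN BigLookup m3 ON m3.A = s.A AND m3.B = s.B  AND m3.C = '99' AND m3.D = s.D
LEFT JOIN BigLookup m4 ON m4.A = s.A AND m4.B = '99' AND m4.C = '99' AND m4.D = '99';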
I am using DTS and VBScript in DataPump tasks in order to transfer large amounts of data from text files to an SQL database.
As the database uses a normalized schema, there is often a need to insert multiple records into a destination table from various fields of the same record of the source text file.
For example, the source record may contain information about goods sold (date, customer, item code, item name and total amount) for a maximum of 3 goods per sale (row), and therefore has the structure:
I have tried using a DataPump task and VBScript, and I guess it has to do with the DTSTransformStat_**** constants, but none of the ones I used seems to work.
In many of my packages I have to translate an organizational code into a surrogate key. The method for translating is rather convoluted and involves a few lookup tables. (For example, lookup in the OrgKey table. If there is a match, use that key; if not, do a lookup of the first 5 characters in the BUKey table. If there is a match, use that key; if not, do a lookup of the first 2 characters... You get the idea.)
Since many of my packages use this same logic, I would like to consolidate it all into one custom transformation. I assume I can do this with a script transform, but then I'd lose all the caching built into the lookup transforms.
Should I just bite the bullet and copy and paste the whole Rube Goldberg contraption of cascading lookup transforms into each package? Or is there a better solution I'm overlooking?
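A set-based sketch of the same cascade (column names are assumptions); COALESCE preserves the priority order, taking the first lookup that matched:

SELECT s.OrgCode,
       COALESCE(ok.SurrogateKey, bu5.SurrogateKey, bu2.SurrogateKey) AS OrgSurrogateKey
FROM SourceRows s
LEFT JOIN OrgKey ok  ON ok.OrgCode = s.OrgCode
LEFT JOIN BUKey  bu5 ON bu5.BUCode = LEFT(s.OrgCode, 5)
LEFT JOIN BUKey  bu2 ON bu2.BUCode = LEFT(s.OrgCode, 2);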
I am pretty new to SSIS. I am trying to create a package which can accept data in any of several formats (CSV, Excel, or a SQL Server database/table) and import the data into my destination database.
So far I've managed to get this working OK. However, I am now totally stuck. I'm currently trying to concentrate on the data sources being a CSV file (using a Flat File data source) and/or an Excel spreadsheet.
I can get the data in and to my destination using a UNION ALL component and mapping the data sources to it, so long as both the CSV file and the Excel spreadsheet exist.
My problem is that I need the package to handle the possibility that only the CSV file exists and there is no Excel spreadsheet, in which case I'd like the package to ignore the Excel data source completely. Currently, if either of my data sources does not exist, I get errors and the package terminates.
Is there any way in SSIS that I can check all my data sources to see which ones exist (i.e. are valid)? If a source exists I want to use it; if it doesn't, I'd like to discard it without error (as long as there is at least one valid data source, the package should run).
I've tried using the AcquireConnection method in a script task on each of my connections, hoping that it would fail if the file/data source did not exist. It doesn't, though (in the case of an Excel data source it just creates an empty Excel file for me).
The only other option I can come up with is to have separate packages for each type of source data and then run a particular package depending on the format. This seems a bit long-winded. I am pretty sure I must be able to do what I want to achieve, but I can't work out how.
I'd be grateful to anyone who can send me any tips/hints/links on how I can achieve this.
There are 3 tables, Property, PropertyExternalReference, and PropertyAssesmentValuation, which are common to 60 business rules.
SELECT PE.PropertyExternalReferenceValue AS [BAReferenceNumber],
       PA.DescriptionCode                AS [PSDCode],
       PV.ValuationEffectiveDate         AS [EffectiveDate],
       PV.PropertyListAlterationDate     AS [ListAlterationDate]
[code]....
Can we push the data for the above query into a physical table and create an index on it to make the query fast, rather than hitting the same set of tables multiple times?
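A hedged sketch of materializing the shared result set once into an indexed staging table (the join keys are assumptions, since the original FROM clause is not shown above):

SELECT PE.PropertyExternalReferenceValue AS BAReferenceNumber,
       PA.DescriptionCode                AS PSDCode,
       PV.ValuationEffectiveDate         AS EffectiveDate,
       PV.PropertyListAlterationDate     AS ListAlterationDate
INTO dbo.PropertyCommonStage             -- hypothetical staging table
FROM Property PA
JOIN PropertyExternalReference PE ON PE.PropertyId = PA.PropertyId   -- assumed key
JOIN PropertyAssesmentValuation PV ON PV.PropertyId = PA.PropertyId; -- assumed key

CREATE INDEX IX_PropertyCommonStage ON dbo.PropertyCommonStage (BAReferenceNumber);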
I have implemented a single lookup and would like to know the optimal approach to implementing multiple lookups within a single "data flow task", i.e. when I have to look up multiple reference tables to obtain surrogate keys. I am oversimplifying for illustration purposes...
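A set-based sketch of what the chained lookups amount to (dimension and key names are assumptions); in SSIS each join maps to one fully cached Lookup transform, chained in the data flow:

SELECT f.*,
       dc.CustomerSK,
       dp.ProductSK,
       dd.DateSK
FROM StagingFact f
LEFT JOIN DimCustomer dc ON dc.CustomerCode = f.CustomerCode
LEFT JOIN DimProduct  dp ON dp.ProductCode  = f.ProductCode
LEFT JOIN DimDate     dd ON dd.CalendarDate = f.TransactionDate;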
Hi, we are building an online system for people to place ads for selling various used items like cars, electronics, houses, books, etc. If someone is selling a car, he can fill out headline, year, make, model, mileage, transmission, condition, color, price, description, contact, etc. Similarly, if someone is selling a digital camera he will fill out headline, memory, zoom, megapixel, maker, model, color, battery, description, etc.
Option 1: Have one main table to hold the common attributes of all the different types of ads (headline, images, contact, price, color, condition, description), plus one table to store the string values of all ads (car: maker, model; house: square feet; camera: memory, megapixel; etc.), plus one table to store the droplist select values (car: transmission, door, seat, etc.; house: year_built).
Pros: a single table for all ads; unique IDs for all ads; easy to extend, as new attributes can be added easily.
Cons: lots of physical reads of the 2nd and 3rd tables from the join (about 10 times the physical reads of option 2 when reading 5000 records).
Option 2: Have a different set of tables for each ad type. Car will have its own main table plus one table to store multi-select list box values; similarly, housing will have its own set of tables.
Pros: 10% fewer physical reads than option 1.
Cons: hard to add new attributes (we have to modify the main table by adding a column), and queries go to a different table based on the category.
Do you have any suggestions on which way to go? Thanks
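A minimal DDL sketch of the Option 1 layout (all names and types are assumptions):

CREATE TABLE Ad (
    AdID        INT IDENTITY PRIMARY KEY,
    Headline    NVARCHAR(200),
    Price       MONEY,
    Color       NVARCHAR(50),
    Condition   NVARCHAR(50),
    Description NVARCHAR(MAX),
    Contact     NVARCHAR(200)
);

-- string-valued attributes that vary by ad type (maker, model, megapixel, square feet, ...)
CREATE TABLE AdAttribute (
    AdID           INT NOT NULL REFERENCES Ad(AdID),
    AttributeName  NVARCHAR(100) NOT NULL,
    AttributeValue NVARCHAR(400)
);

-- droplist selections (transmission, door, seat, year_built, ...)
CREATE TABLE AdListValue (
    AdID        INT NOT NULL REFERENCES Ad(AdID),
    ListName    NVARCHAR(100) NOT NULL,
    ListValueID INT NOT NULL
);

A composite index on AdAttribute (AdID, AttributeName) is what usually keeps the extra reads of the attribute table tolerable.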
I'm trying to update a checkbox from "False" to "True" for multiple records within a single table. I can update a single record using the script below; however, I'm having trouble applying additional IDs to the string.
(Works) - Update Name_Demo set KEY_CONTACT = 'true' where ID = 225249
(doesn't work) - Update Name_Demo set KEY_CONTACT = 'true' where ID = '225249, 210014, 216543'
It says the query executed successfully but returned no rows.
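The quoted string is compared as one single value, so no ID matches it; a list of IDs needs IN. A sketch, assuming ID is a numeric column:

UPDATE Name_Demo
SET KEY_CONTACT = 'true'
WHERE ID IN (225249, 210014, 216543);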
Folks, while I still have some hair left, can someone help me with this query?
I have a table "TestRunInfo". Amongst other fields there are "TestRunIndex" (primary key), "TesterID", "Duration", and "Status". The Status field links to a Status table, which maps the index value to a more meaningful label: "Pass", "Fail", etc. As you may have guessed, there is a record for each test that an individual tester runs, and with that record is a duration and a status (1, 2, 3, etc.).
What I'm trying to do is create a datasheet view with a single row for each TesterID, summarising that tester's work as follows: TesterID, total duration, count of passed tests, count of failed tests. So far I have:
Select TesterID, sum(Duration), count(Status) FROM TestRunInfo GROUP BY TesterID
But this of course purely gives the total number of tests run by that engineer as the count. I need to break it down by status. Help? Someone? Please?!?!?
TIA, Steve
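A hedged sketch using conditional aggregation, assuming Status 1 maps to "Pass" and 2 to "Fail" in the Status table:

SELECT TesterID,
       SUM(Duration) AS TotalDuration,
       SUM(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS PassedTests,
       SUM(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS FailedTests
FROM TestRunInfo
GROUP BY TesterID;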
Hello everybody. If anyone has a solution, please help me. Thanks in advance. I do the following steps:
I have created two publications on my SQL Server for merge replication:
Publication A, which returns all rows from Table1.
Publication B, which returns all rows from Table1 where the field MANAGER = 'ABC'.
I have two clients who have MSDE. Client 1 is subscribed to Publication A and Client 2 is subscribed to Publication B. All works fine till now, and I am able to make transfers from the two clients and get all the changed data.
However, if I now change the filter rule for Publication B so that it returns all rows from Table1 where the field MANAGER = 'DEF', SQL Server tells me that I have to reinitialize all snapshots for all subscriptions. If I don't do this, it doesn't allow me to make a transfer from Client 1. As you can see in my example, I have not made any changes to Publication A, to which Client 1 is subscribed, so this seems illogical.
In our case it would not be possible to reinitialize the subscriptions for all publications when the rules for only one publication are changed, because we may have a lot of clients connected to our server, and if we reinitialize the subscriptions then all the data is sent again. If anybody knows how to prevent the other publications from being reinitialized, please tell me how.
To anyone that is able to help: what I am trying to do is this. I have two tables (Orders and OrderDetails), and my question is on the order details. I would like to set up a stored procedure that inserts the order into the Orders table and then inserts multiple OrderDetails rows within the same transaction. I also need to do this in SQL 2000. Right now I have "x" variables for all columns in my Orders table and all columns in my OrderDetails table, i.e. @OColumn1, @OColumn2, @OColumn3, @ODColumn1, @ODColumn2, etc. I would like to create a stored procedure to insert into Orders, and have that call another stored procedure to insert all the order details associated with that order. The only way I can think of doing it is for the program to pass me a string of data per column for order details, and parse the string via T-SQL. I would like to get away from the string format and go with something else. If possible I would like the application to submit a single value per variable multiple times, but if I do it that way it will be running the entire SP again and again. Any suggestions on the best way to solve this would be greatly appreciated. If anyone can come up with a better way, feel free; my only requirement is that it be done in SQL.
Thank you
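A hedged SQL 2000 sketch of the string-parsing fallback described above (all names and the pipe delimiter are assumptions):

CREATE PROCEDURE dbo.InsertOrderWithDetails
    @OColumn1 VARCHAR(100),
    @Details  VARCHAR(8000)   -- e.g. 'detail row 1|detail row 2|detail row 3'
AS
BEGIN
    SET NOCOUNT ON
    DECLARE @OrderID INT, @Row VARCHAR(500), @Pos INT

    BEGIN TRANSACTION
    INSERT INTO Orders (Column1) VALUES (@OColumn1)
    SET @OrderID = SCOPE_IDENTITY()

    -- walk the delimited string, one detail row per iteration
    WHILE LEN(@Details) > 0
    BEGIN
        SET @Pos = CHARINDEX('|', @Details)
        IF @Pos = 0 SET @Pos = LEN(@Details) + 1
        SET @Row = LEFT(@Details, @Pos - 1)
        INSERT INTO OrderDetails (OrderID, DetailData) VALUES (@OrderID, @Row)
        SET @Details = SUBSTRING(@Details, @Pos + 1, LEN(@Details))
    END
    COMMIT TRANSACTION
END

SQL 2000 has no table-valued parameters, so the realistic choices are a delimited string like this or repeated calls to a detail procedure inside one client-side transaction.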
I have 3 tables with the following schema:
Table <Category> {
    UniqueID,
    LastDate DateTime
}
Assume the following tables with data following the above schema:
Table Cat1 { 1, D1   2, D2   3, D3 }
Table Cat2 { 2, D4   3, D5   4, D6 }
Table Cat3 { 1, D7   3, D8   5, D9 }
I have a master table and its schema is as follows:
Table Master {
    UniqueId,
    Cat1 DateTime,  -- this is the same as the table name
    Cat2 DateTime,  -- this is the same as the table name
    Cat3 DateTime   -- this is the same as the table name
}
After inserting the data from all these 3 tables, I want my master table to look like this: Table Master {
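One hedged way to populate Master from the three category tables (a UniqueID missing from a category simply leaves that date NULL):

INSERT INTO Master (UniqueId, Cat1, Cat2, Cat3)
SELECT COALESCE(c1.UniqueID, c2.UniqueID, c3.UniqueID),
       c1.LastDate, c2.LastDate, c3.LastDate
FROM Cat1 c1
FULL OUTER JOIN Cat2 c2 ON c2.UniqueID = c1.UniqueID
FULL OUTER JOIN Cat3 c3 ON c3.UniqueID = COALESCE(c1.UniqueID, c2.UniqueID);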
How do I import multiple text files (residing in a single folder) into a SQL Server table? I know how to import a single file, but I'm not sure how multiple files can be loaded. Please guide.
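A hedged T-SQL sketch using xp_cmdshell to list the folder and BULK INSERT per file (the folder, target table, and delimiters are assumptions; xp_cmdshell must be enabled):

CREATE TABLE #Files (FName NVARCHAR(260));
INSERT INTO #Files EXEC master.dbo.xp_cmdshell 'dir /b C:\Import\*.txt';

DECLARE @f NVARCHAR(260), @sql NVARCHAR(4000);
DECLARE FileCursor CURSOR FOR SELECT FName FROM #Files WHERE FName LIKE '%.txt';
OPEN FileCursor;
FETCH NEXT FROM FileCursor INTO @f;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'BULK INSERT dbo.TargetTable FROM ''C:\Import\' + @f
             + N''' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'')';
    EXEC (@sql);
    FETCH NEXT FROM FileCursor INTO @f;
END
CLOSE FileCursor;
DEALLOCATE FileCursor;

An SSIS Foreach Loop container over the folder, feeding a single data flow, is the other common route.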
I have resulting rows from a query similar to the following:
The data is coming from a single table that contains only one coverage code column and one coverage date column, but the end user wants the two coverage code types and dates combined into a single row. So the SELECT looks something like this:
SELECT [Employee ID]     = emp.employee_id,
       [Coverage Code 1] = enr.coverage_code,
       [Coverage Date 1] = enr.coverage_date,
       [Coverage Code 2] = CASE WHEN enr.product_type = 'Accident.Accident'
                                THEN enr.coverage_code
                                ELSE NULL END,
[Code] ....
I basically want to merge the rows with the same Employee ID into a single row like the following:
I know I have done this before and it is probably pretty simple.
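A hedged sketch of collapsing the rows with GROUP BY and MAX (the employee and enrollment table names are assumed from the aliases, and 'Accident.Accident' is assumed to mark the type-2 coverage):

SELECT emp.employee_id AS [Employee ID],
       MAX(CASE WHEN enr.product_type <> 'Accident.Accident'
                THEN enr.coverage_code END) AS [Coverage Code 1],
       MAX(CASE WHEN enr.product_type <> 'Accident.Accident'
                THEN enr.coverage_date END) AS [Coverage Date 1],
       MAX(CASE WHEN enr.product_type = 'Accident.Accident'
                THEN enr.coverage_code END) AS [Coverage Code 2],
       MAX(CASE WHEN enr.product_type = 'Accident.Accident'
                THEN enr.coverage_date END) AS [Coverage Date 2]
FROM employee emp
JOIN enrollment enr ON enr.employee_id = emp.employee_id
GROUP BY emp.employee_id;

MAX simply picks the one non-NULL value per group, assuming at most one row of each product type per employee.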
I have multiple databases on the server, and all my databases have the tables stdVersions and stdChangeLog. The stdVersions table has a field called DatabaseVersion, which stores the version of the database. The stdChangeLog table has a field called ChangedOn, which stores the date of any change made in the database.
I need to write a query/stored procedure/function that will return all the database names, version and the date changed on. The results should look something like this:
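A hedged sketch using the undocumented sp_MSforeachdb, which substitutes each database name for ? (the column types are assumptions):

CREATE TABLE #Results (DatabaseName SYSNAME, DatabaseVersion NVARCHAR(50), LastChangedOn DATETIME);

EXEC sp_MSforeachdb N'
IF OBJECT_ID(''[?].dbo.stdVersions'') IS NOT NULL
    INSERT INTO #Results (DatabaseName, DatabaseVersion, LastChangedOn)
    SELECT ''?'',
           (SELECT TOP 1 DatabaseVersion FROM [?].dbo.stdVersions),
           (SELECT MAX(ChangedOn) FROM [?].dbo.stdChangeLog)';

SELECT * FROM #Results ORDER BY DatabaseName;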
We have a control table which determines whether we need to start the job or not. If we are starting the job, we set the flag to 1.
Below is the Table Structure.
Table Name    IN_USE_FG
CUST_D        0
PROD_D        0
GEO_D         0
DATE_D        0
Now we have 4 different packages for loading the data into these 4 tables. These 4 packages start at the same time. Before loading the data we have to set the flag to 1, so each package contains an UPDATE statement to set its own row's value to 1.
Now we are getting deadlocks because we are all hitting the same table at the same time, even though we are updating different records.
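A hedged sketch of keeping each package's update to a single row-level lock (the control table and key column names are assumptions):

UPDATE dbo.LoadControl WITH (ROWLOCK)
SET IN_USE_FG = 1
WHERE TableName = 'CUST_D';   -- each package names only its own row

With a primary key or unique index on TableName, each UPDATE seeks exactly one row instead of scanning the table, which is usually what removes this kind of deadlock.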
I just can't find a way to perform this SELECT query in my ASP.NET page. I want to find the sales for a certain period [startDate - endDate] for each region that is selected in the checkbox list. Table: Sales. Fields: SalesID | RegionID | Date | Amount. This is how the interface looks. Thank you.
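A hedged sketch of the query (the IN list would be built from the checked regions, ideally as query parameters rather than concatenated text):

SELECT RegionID, SUM(Amount) AS TotalSales
FROM Sales
WHERE [Date] >= @startDate
  AND [Date] <= @endDate
  AND RegionID IN (1, 2, 3)   -- example selections
GROUP BY RegionID;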
I have a data flow task in which there is an OLE DB source, a Derived Column component, and an OLE DB destination. My source is a SQL command that returns some values. I have some values that I define in the derived columns, with default values set under the expression column. My question is: I also have some destination columns that need to be fed by another SQL command. How would I do that? Can I attach two or more OLE DB sources to one destination? How would I accomplish that? Thanks
I am new to SQL 2005. I am using DTS to transfer data from my source to the warehouse. I have a couple of tables in my source where I have to join the two tables' fields and insert the result into the warehouse fact table. I have used a join query in my OLE DB source component; what other component needs to be used to insert the data into the fact table? I also need to extract the same data with aggregation and insert it into another fact table.
I'm having a little difficulty speeding up the process below.
I have two tables. Table A contains 300,000 rows which are unique; however, the identifier can appear more than once. The fields I'm interested in are the identifier and a date/time field.
Table B also contains an identifier and a date field, and again can contain multiple instances of the identifier with a variety of dates.
For each row in Table A I want to take the date and look up in Table B the first date greater than or equal to it, linking by the identifier.
I have managed to do this via the code below; however, it takes 45 minutes and I want to speed it up.
SELECT a.*,
       (SELECT MIN(b.DateB)
        FROM #TableB b
        WHERE a.Identifier = b.Identifier
          AND b.DateB >= a.DateA) AS DateB
FROM #TableA a
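Two things usually speed this pattern up: an index that supports the correlated seek, and rewriting the subquery as a join with GROUP BY. A hedged sketch:

CREATE INDEX IX_TableB ON #TableB (Identifier, DateB);

SELECT a.Identifier, a.DateA, MIN(b.DateB) AS DateB
FROM #TableA a
LEFT JOIN #TableB b
       ON b.Identifier = a.Identifier
      AND b.DateB >= a.DateA
GROUP BY a.Identifier, a.DateA;

Any other columns from Table A would need to be carried through the GROUP BY or joined back on afterwards.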
I have created a trigger that fires every time a new item is added to TableA. The trigger then inserts 4 rows into TableB, which contains two columns (item, task type).
Each row will have the same item, but with a different task type, i.e.:
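Something like the following hedged sketch, where the four task-type names are placeholder assumptions; selecting from inserted keeps it correct for multi-row inserts:

CREATE TRIGGER trg_TableA_AfterInsert ON dbo.TableA
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.TableB (Item, TaskType)
    SELECT i.Item, t.TaskType
    FROM inserted i
    CROSS JOIN (SELECT 'Review' AS TaskType
                UNION ALL SELECT 'Approve'
                UNION ALL SELECT 'Ship'
                UNION ALL SELECT 'Invoice') t;
END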
I have some SQL experience, but nothing past basic commands. I'm trying to take some data held by one application and use it as a CSV import into another application. I have two tables from the application; one holds references to values defined in the other. The first table holds details about a person:
field1=name field2=age field3=country
Joe,50,1
Country is held as a number, and there is another table that holds all the countries: