I am building a health care application that marries transaction-level data (health care services provided) with person-level characteristics that have a time-dimension. The person-level characteristics are diseases that the person has (these disease all have a start and some have an end date). The diseases are stored in a table in which the foreign keys are a person-identifier, a time identifier (month/year) and a surrogate for the disease. Persons can have more than one disease at a time (the diseases are NOT mutually-exclusive). There are no measures in this table. The transaction table has a foreign key for person and time (month/day/year), a procedure code (the type of service rendered) and money (the cost of the services).
How do I answer the following questions:
What is the total cost of care (the sum of all service costs) last year for persons with "disease A"?
What is the total cost of care last year for persons with "disease A" AND "disease B"?
What is the total cost of care last year for persons with "disease A" OR "disease B"?
I've tried a factless fact table but can't get it to work. If anyone has the right solution and can communicate to me before I slit my wrists, I would be greatly appreciative!!!
I built my first tabular model and see that my fact tables are also appearing as dimensions. In Multi dimensional mode i could choose which are the dimensions. How do i do that in tabular model.
Hi, I use lookups to map surrogate of level 1 dimensions to my fact tables in SSIS. But how to handle a level 2 dimension with a ValidFrom and a ValidUntil date field? I do not use an IsCurrent column, because this could problem with late arriving facts.
- In dts I used an SQL statement like this:
update SA SET SA.DimProdRef = Dim.RecordID FROM SAWarenEingang SA, DimProd Dim where SA.ProduktNumber = Dim.ProduktNumber and SA.ArtikelkontoBewegungsdatum between Dim.ValidFrom and Dim.ValidUntil
Now in SSIS I want to handle the whole thing in the data flow without using a staging table: - Using Lookups: I would have to pass the date column for each inside the fact table into the lookup. That does not work. - Using Execute SQL in the data flow: would be very slow, because the statement will be executed for any line in the dataflow
Hi, all experts here, Do we always have to use SCD component for the loading of data into data warehouse to handle changes of rows? I am looking forward to hearing from you and thank you very much in advance for your help. With best regards,
I am relatively new to SSIS/SSAS. I have searched the forums but cannot find an answer to my question.
I created a cube in SSAS and have deployed it. Now I am trying to use SSIS to populate the cube. I have setup a DS that points to the SSAS instance - it uses OLEDB Provider for Analysis services 9.0.
When I try to use a data flow task OLE DB source to truncate the dimension/cubes I do not see the DS in the list to select?
I am finding it hard to get into the SSIS way of organizing the processing.
Is there a way to define measure group on fact1-details-table using TimeDim. Date info is only in fact1 table and not in details table.
Is namedquery joining the 2 fact tables the best solution in this case? There is so much redundancy using one fact table, so underlying sqldb uses 2 tables.
What are the general guildlines for choosing these settings, like paralle vs. sequential, different error configurations? The default selection for processing multiple dimensions is paralle and use default error configuration. But the default for processing multiple partitions is sequential. I cannot find anything helpful other than the definitions from the online help. TIA.
I am modelling two fact tables of Actuals and Budget which are at different granularity, Actuals are at day, customer and product sub category level. Budgets are at month, Region and Product category level.
Month, Region and Product Category is present in Date, Region and Product Category dimension respectively. I have only three dimensions as Customer, Product and Date. Linking those dimensions to Actual Fact table is not an issue, what is the best way and options are there to link budget fact table to those three dimensions.
1. 2 Dimensional tables(Parent), DIM1 with 50000 rows, and DIM2 with 1000 rows
2. Fact 1 with 50 columns, 25 Million rows and with FK to DIM1 and DIM2
3. Fact 2 with 40 columns, and 25 Million rows and with FK to DIM1 & DIM2 tables.
Actually the fact 1 and fact 2 have same related data but since our Analysis cube person wanted the fact table not to have more than 50 columns we divided the tables into 2, but they have the same compound key.
Above said, I have a situation where I have to select all the columns, in both fact tables, and do a group by. I wrote the query and ran "Analyze Query in the Database Engine Tuning Advisor" for it. It gave bunch of recomendations about the statistics and indexes which I created. When I executed the query the result came up in matter of seconds, which was good.
In the query I had a condition having MarketName='Bridgeview' and DateID = 344 (FK of today-1).
When I wanted the data for last 30 days I changed to DateID in ( > FK of today -32 and < FK of today), the query responded and worked fine.
But when I changed the query to get MarketName='Aurora' (other than I used when I ran Tuning Advisor), the result returned is empty set. When I removed the MarketName condition, it is supposed to return all markets' data, but it returns only Bridgeview data.
I know the data is in the table for all markets, since reports are rendered from these fact tables for all of these markets(also ran queries to check the fact table data).
I am unable to point out the reason why the query behaves like this. It responds to the date change, but not to the MarketName change.
I really appreciate if anyone can help me point out the problem.
I have delta loaded all the dimension tables now and each dimension table is related to fact table through a surrogate key, How do i further load a fact table. Please tell me I am stuck up here.. :( .
If any one has an example to refer please do tell me
I have to load my fact table using data from my stage table and joining it against dimesnsions, most of my dimesions are straight joins that I can implement using Lookups , howevere in one join I am using something like this
SELECT T.Dim_Time_ID ,
SUM (Measure)
FROM [dbo].[Stage_Table] S
INNER JOIN Dim_Time T
ON SUBSTRING(T.MonthYear,1,3) = SUBSTRING(S.Time,1,3)
AND SUBSTRING(T.MonthYear,6,2) = SUBSTRING(S.Time,6,2)
Currently I am using a copy column transformation to make a copy of this column, and then using substring function in my SQL code and the Lookup Transformation joining the data to get the desired output, was wondering if there is any other (better) way of accomplishing this inside the data flow.
Hi I am strugglinh since last 2 days .SSIS is giving me torrid time
I am getting error while loadding the fact table
[Destination Fact Table [1099]] Error: An OLE DB error has occurred. Error code: 0x80040E21. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E21 Description: "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done.". [Destination Fact Table [1099]] Error: The "input "OLE DB Destination Input" (1112)" failed because error code 0xC020907B occurred, and the error row disposition on "input "OLE DB Destination Input" (1112)" specifies failure on error. An error occurred on the specified object of the specified component. [DTS.Pipeline] Error: The ProcessInput method on component "Destination Fact Table" (1099) failed with error code 0xC0209029. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
Please help
I have already googled for this error and appied whatever tips were given.
I'm currently setting up a Tabular Model to do some research between several fact tables. In this example i have two fact tables (table 1 and table 2) which I've created a 1 to 1 relationship on phone number. Typically I create a relationship between these tables to find common data between the two. However, in this case I am trying to figure out the best way to model the data so that I can easily surface data from one table that does not exist in the other. I would liken this to a LEFT JOIN or a WHERE NOT EXISTS in SQL.
Table 1 has all of the data and Table 2 Only has a subset of the data from Table 1. What I'm trying to do here is display what attributes in Table 1 may play a part in records not existing in Table 2. What is the best way to model this?
I'm loading a fact table that has several geographic attributes - some are at the state level, some are at the county level, and then some are drilled farther in that that. I understand the basic concept of the dimension with the ragged hierarchy, but unsure of how to load to the fact table using lookups based on these geographic units. For example, if my geographic dimension contains 200 records for the state of Wyoming, basically a record for each fine-grain place (i.e. city/town), then how do I go about doing a county lookup. Wyoming only has 23 counties, but because of the repetitive nature of the dimension attributes that are not at the finest grain, I'll get more records in the lookup than I need. This activity repeats of course while I move up the geographic scale to state, then country. How do I configure/fill my dimension to handle these differing scales of data?
I have got an xml file with size more than 2 GB. I have to load this file into tables. With 32 bit platform, I am unable to load this file using SSIS. Ram is 8 GB, but it is still bombing out. As I know it uses XML DOM Parser and tries to shred the file in memory and because of memory limition, it fails. Although I have already written code in C# using XmlTextReading object(implemetation of SAX Parser) to load data in tables, but I want to keep this loading process within the limits of DBAs.
I am stuck. Can someone guide me through the situation?
I'm having a very irritating time trying to migrate data from a COBOL system to SQL Server.
One of the A/R Master files has approx. 200 columns.
I can export this file any number of ways that will (sometimes) load partially into my database, but always when the load succeeds, columns 75 through N simply contain NULL, even though there is data in the file. When the load fails in DTS, the error is always missing column delimiter. Using BULK INSERT the error is always something like data too long at column 75.
Putting all this together, I have deduced that something isn't working if I try to load a staging table with more than 74 columns of data. This seems like a way-too-low threshold for a robust database, especially since SQL Server is supposed to be able to handle up to 1,024 columns per table.
Hi I have a question how to load data to tables linked by Foreign Keys in MSDE/SQL server. Example: If I have 2 tables linked (by Foreign Key): One table:
ITEM idITEM NAMEITEM CATEGORY (FK) 1cheese 2
And another:
Category IDCATEGORY NAME 1 household 2 food 3 general
How do I enter the load of data Do I have to enter it as 1cheese2 or is there some way of entering it as 1cheesefood
TDS wizard does not allow me to transfer to views/querries what I thought would be a normal way as I would enter data to view(relevant to Access's form) and it would update related tables . When I wrote sql to do it it said I can not update my view table as too many tables would be affected(I had lookup tables empty then though) I am doing it by number using TDS wizard to transfer it directly to the main table but there must be a better way
I am trying load data from multiple Foxpro tables which are under a folder. I can have multiple folders with 17 foxpro tables. I was able to do it in DTS using ActiveX script. Here is the ACtiveX script.
'********************************************************************** ' Visual Basic ActiveX Script '************************************************************************ Option Explicit Dim conObj,DSNGosfbill,comObj,objRs,HostServer Dim sFolder,sFileFolder, Details,subFolderoccur,sFileFolderDBF,sFileFolderFPT,CheckFile,dFiles,Fil Dim fso, folderObj,subFolderList,dFolderObj Dim objPackage,oStep,objPackage_1,oStep_1,ConnObj_001,ConnObj_004,ConnObj_031,ConnObj_032,ConnObj_033 Dim ConnObj_Hclaimb, ConnObj_HProv, ConnObj_Hids, ConnObj_HCodes, ConnObj_HSpan, ConnObj_002, ConnObj_HCHGB Set conObj = CreateObject("ADODB.Connection") HostServer =DTSGlobalVariables("gvServer").Value
set comObj=CreateObject ("adodb.command") set comObj.ActiveConnection =conObj Function Main() Dim Dir_Name,DirFlag Dir_Name = "" DirFlag = "N" Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(checkFile) Then Else Details = "***** Success.Lst file is missing in Batch folder. BATCH job may not be successfull or there are no folders in UNZIP directory to process. Check the batch run.*****" Call Write_Log Main = DTSTaskExecResult_Failure Exit Function End If Set folderObj = fso.GetFolder(sFolder) Set subFolderList = folderObj.SubFolders For Each subFolderOccur in subFolderList DirFlag = "Y" Dir_Name = subFolderOccur.Name Call Process_Dir(1,subFolderOccur.Name) Next If DirFlag = "N" Then Details = "***** No directories to process in SSI UNZIP folder*****" Call Write_Log End If If DirFlag = "Y" Then Call Process_Dir(2,Dir_Name) If objRs.Eof Then Details = "***** No directories to process in SSI UNZIP folder*****" Call Write_Log End If While not objRs.EOF set sFileFolder = fso.GetFolder(sFolder & objRs("zip_file_name")) Details = "***** Start-Time " & sFileFolder & " " & Date & " " & Time & "*****" Call Write_Log Call Update_Process_Flag("L",objRs("zip_file_name")) '*******Execute the package for each directory****************' '********* Call the Package**************' Set objPackage = CreateObject("DTS.Package") Set objPackage_1 = CreateObject("DTS.Package")
Set ConnObj_001 = objPackage.Connections("SSIPATH001") ConnObj_001.DataSource = sFileFolder
Set ConnObj_002 = objPackage.Connections("SSIPATH002") ConnObj_002.DataSource = sFileFolder
Set ConnObj_004 = objPackage.Connections("SSIPATH004") ConnObj_004.DataSource = sFileFolder Set ConnObj_031 = objPackage.Connections("SSIPATH031") ConnObj_031.DataSource = sFileFolder Set ConnObj_032 = objPackage.Connections("SSIPATH032") ConnObj_032.DataSource = sFileFolder Set ConnObj_033 = objPackage.Connections("SSIPATH033") ConnObj_033.DataSource = sFileFolder
Set ConnObj_Hclaimb = objPackage.Connections("SSIPATHCLAIMB") ConnObj_Hclaimb.DataSource = sFileFolder
Set ConnObj_HProv = objPackage.Connections("SSIPATHPROV") ConnObj_HProv.DataSource = sFileFolder Set ConnObj_Hids = objPackage.Connections("SSIPATHHIDS") ConnObj_Hids.DataSource = sFileFolder Set ConnObj_HCodes = objPackage.Connections("SSIPATHCODES") ConnObj_HCodes.DataSource = sFileFolder Set ConnObj_HSpan = objPackage.Connections("SSIPATHSPAN") ConnObj_HSpan.DataSource = sFileFolder
Set ConnObj_HCHGB = objPackage.Connections("SSIPATHCHGB") ConnObj_HCHGB.DataSource = sFileFolder
objPackage.Execute For Each oStep In objPackage.Steps If oStep.ExecutionResult = DTSStepExecResult_Failure Then Details = "***** GOSFBILL_SSI_Staging_Load failed. " & Date & " " & Time & "*****" Call Write_Log Main = DTSTaskExecResult_Failure Exit Function End If Next
For Each oStep_1 In objPackage_1.Steps If oStep_1.ExecutionResult = DTSStepExecResult_Failure Then
Details = "***** GOSFBILL_SSI_Update_FileSource failed. " & Date & " " & Time & "*****" Call Write_Log Main = DTSTaskExecResult_Failure Exit Function End If Next
'********************************************' Details = "***** End-Time " & sFileFolder & " " & Date & " " & Time & "*****" Call Write_Log objPackage.Uninitialize objPackage_1.Uninitialize Set objPackage = Nothing Set objPackage_1 = Nothing sFileFolder = "" sFileFolderDBF = "" sFileFolderFPT = "" objRs.MoveNext Wend objRs.Close End If Call Close_Conn Main = DTSTaskExecResult_Success End Function Sub Process_Dir (Para_cntl,Dir_Name) comObj.CommandText ="dbo.Usp_Process_Dir" comObj.commandtype = 4 comobj.parameters.Refresh comobj.parameters("@Para_Cntl")= para_cntl comobj.parameters("@Dir_Nm")= Dir_Name comobj.parameters("@File_Type")= "SSI" If (Para_Cntl = 1)Then comObj.Execute() Else If Para_Cntl = 2 Then Set objRs = comObj.Execute() End If End If
End Sub Sub Update_Process_Flag(P_Flag,Dir_Name) comObj.CommandText ="dbo.Usp_Process_Flag" comObj.commandtype = 4 comObj.parameters.Refresh comObj.parameters("@Process_Flag")= P_Flag comobj.parameters("@Dir_Nm")= Dir_Name comObj.Execute() End Sub Sub Write_Log comObj.CommandText ="dbo.usp_etl_write_log" comObj.commandtype = 4 comobj.parameters.Refresh comobj.parameters("@Text")= Details Comobj.parameters("@NDC_SSI_IND")= "SSI" Comobj.parameters("@Process_Stage")= "Staging" comObj.Execute() End Sub
Sub Close_Conn Set comObj = Nothing Set objRs = Nothing conObj.Close Set conObj = Nothing Set fso = Nothing Set folderObj = Nothing Set subFolderList = Nothing End Sub
When I migrated this code to SSIS, its not working. How can I achive this functionality in SSIS. Any one pls help me.
I have Table A . we already have 80 columns . we have to add 65 more columns.
we are populating this table from oracle .and we need to populate those 65 columns again from the same table.
Is it a better idea to add those new 65 columns to the same table or new table.
If we go for the same table then loading time will be double, If I go for new table and If i am able to run both the packages which loads table data from same oracle server to difffrent sql tables then we should be good. But if we run in to temp space issues on oracle server . Then i have to load the two tables separately which consumes the same time as earlier one.
I was thinking if there is a way in SSIS where I can pull data from same oracle table in to two diff sql tables at same time?
Hi I have a fact table in which i have a String column now i want to show it in the mdx queries..when i add it as a measure it shows some numerical values in the cube i cannot even add it as a member propeties due to some datwarehouse design constraints..so can anyone out there help me please....it urgent Thankx in advance regards Hemant
What is the best way to move data from Online system tro data warehouse? I have created 3 dimension tables(product,date and customer tables) and I wanna create fact table and get foreign keys from dimension tables. What is the best method to do that in SSIS?
I am new at SSIS and I am trying to create a Datawarehouse using SSIS. I have the data files as flat files I have the Dimensional Model ready on Paper and Now I need to use the SSIS for the ETL process.
I am trying to figure out how to make dimension tables in SSIS? I mean I want to create the 5 Dimension tables and then create a Fact table out of it but I cant understand where to start? Can any one tell me how we create Dimesion tables in SSIS. Like one of the dimesion tables I need to create uses 2 flat files and is like a flattened dimension, How would I create this in SSIS?
Even if there is any tutorial which shows this step by step do let me know. I would really appreciate any guidance on this.
Hi All, I want you an urgent respones, I want to add a column to a Dim_table and Fact_table, the Dim_table is sourcing from different database table and now the new column is going to source from another DB, thus without changing the structure of the table relationship can i add a column? To make it clear Dim_table has got Clomun1 from table1, Clomun2 from table2, Clomun3 From table3, Clomun4 From table4. Now I want Clomun5 From table5. How do i add and let it source from table5. Thank you for your qiuck response.
i have a fact table which loads through package,when i m trying to load this table by running the package,i m truncating the fact table and loading the fresh data,instaed of this without truncating the fact table i have to implement the incremental logic in this.
For this i can use SCD or Conditional split,but problem here is i have many source tables to load this fact table,so its very difficult to trace the changes in different source tables.
Hi there, my question is really simple. I want to setup an automatic task in SSIS that drops the tables in the target database and substitutes them with tables from the source database. We are talking about two or three dimension tables and one fact table. The dimension tables are pretty small. The fact table will contain, at maximum, 300,000 rows and 12 columns. I do not use delta or flag historisation btw. What tasks in SSIS would you suggest to use?
I have situation where I get data from SRC Flat file and have to load Dimensional table and also fact table, using same data flow(have no other choice since I have to unpivot some src data). Since I have to load both tables in same data flow, I have to have a way to put load ordering constraint (I know informatica allows that). Does any one have any idea on how this can be done in SSIS?
Iam new to SQLl2005. Iam using DTS to transfer data from my source to the warehouse. I have a couple of tables in my source whein I have to join these to tables fields and insert the same in teh warehouse fact table. I have used a Join query in my Oledb source component, What other component needs to be used to insert the data into the fact table. I also need to extract same data with aggregation and insert the same into an another Fact table.
I need to create a package that updates the dimensions and cube data from a data warehouse on daily basis. I was going to create a Data Flow that takes the data from the DW source then put it as input to a Process Dimension destination to update the dimensions and use a Process Partition destination in the same manner to update the cube, but then I came across the Analysis Services Processing Task which seems to do the job as well. I am kinda confused which way to go. Any recommendations?
I am modelling cube in SSAS. Cube has around 20 dimensions and 6 fact tables. Some of the dimensions are common among the fact tables. e.g. Time dimension. Fact_PNL has 3 date columns for those we have 3 role playing dimensions in the dimension usages.
Another fact table has 5 date columns for them as well we have separate role playing dimensions in dimension usage tab. We have a common dimension Company which is foreign key in all fact tables. We might need to combine the data from multiple facts to get final output.
Should i create 6 role playing dimension for each of the fact table or use the same dimension for all fact tables?
Role playing dimensions should be created when we have multiple columns pointing to the same dimension ?