SQL Server 2008 :: How To Load Dimension And Facts
Apr 15, 2015
I have some source tables: Customer, Order, Ship, Item and Invoice. From these source tables I have to create 5 dimension tables and 1 fact table called OrderFact, using plain SQL Server queries, just to test the data. So I have created the 5 dimensions, pulled the dimension keys from each dimension and loaded them into the fact using joins. For the measures, I joined those 5 sources and created a RawFact table which holds all the measures.
Now, when loading the fact, I join RawFact with all the dimensions to get the keys, and the measures I pull directly from RawFact. Is this process right, or can it be done some other way?
I also want to avoid any Cartesian product in the queries below. What can I do to avoid that?
Dimensions:
DimCustomer, DimOrder, DimShip, DimItem, DimInvoice; the fact is FactOrder.
Loading Rawfact:
select o.ord_id, o.full_order_value, o.open_order_value, o.div_code, o.order_type_code, o.order_status, o.order_date,
s.num_of_pallets, s.num_of_cartons, s.shipment_value, s.ppd_coll, s.ship_status,
i.invoice_amt,
it.net_weight, it.gross_weight, it.warranty_days, it.item_type, it.item_num,
c.terr_code, c.largest_bal, c.last_amt_pay, c.last_inv_amt, c.num_invoice_paid, c.cust_num
from [order] o
join ship s on s.ord_id = o.ord_id            -- join predicates below are placeholders; replace with the real foreign keys
join invoice i on i.ord_id = o.ord_id
join item it on it.item_num = o.item_num
join customer c on c.cust_num = o.cust_num
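The fact load itself then joins RawFact back to the dimensions to pick up the surrogate keys, along the lines of this hedged sketch (the surrogate-key and business-key column names are assumptions; the point is that every table is joined on a key column, which is what prevents a Cartesian product):

insert into FactOrder (CustomerKey, OrderKey, ShipKey, ItemKey, InvoiceKey, full_order_value, shipment_value, invoice_amt)
select dc.CustomerKey, dor.OrderKey, dsh.ShipKey, dit.ItemKey, din.InvoiceKey,
       rf.full_order_value, rf.shipment_value, rf.invoice_amt
from RawFact rf
join DimCustomer dc  on dc.cust_num  = rf.cust_num    -- business-key joins; assumed column names
join DimOrder    dor on dor.ord_id   = rf.ord_id
join DimShip     dsh on dsh.ship_id  = rf.ship_id     -- assumes RawFact also carries ship_id
join DimItem     dit on dit.item_num = rf.item_num
join DimInvoice  din on din.inv_id   = rf.inv_id      -- assumes RawFact also carries inv_id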
I need to load a customer database onto our SQL 2008 server. I always use the Restore Database option in Management Studio and create a new database from device (the customer's database backup file); it used to work just fine. But when I do the same now, I get the error below:
CREATE DATABASE permission denied in database 'master'. RESTORE HEADERONLY is terminating abnormally. (Microsoft SQL Server, Error: 262)
I also tried the create new database option in Management Studio, but I get the same error. I did run Management Studio with 'Run as administrator'.
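A quick way to see whether the login has the rights the restore needs is to check its server roles; this is only a sketch, and the login name in the comment is a placeholder:

-- run while connected as the login that gets the error
select is_srvrolemember('dbcreator') as has_dbcreator,
       is_srvrolemember('sysadmin')  as has_sysadmin;
-- if both return 0, a sysadmin must grant a role, e.g.:
-- exec sp_addsrvrolemember 'DOMAIN\YourLogin', 'dbcreator';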
I have a table on Server A with 5 columns, including Address, ID and CreateDatetime.
I need to transfer data from this table on Server A to Server B for a reporting purpose. The Address column in some rows holds two addresses. Here is an example:
Address
Houston, Dallas
Redmond
Sacramento
New Jersey, New York
I want to skip the rows where the address holds two values (Houston, Dallas and New Jersey, New York) when loading into the destination table on Server B, and I need to do incremental loads.
How should I proceed with this?
The version we are using is SQL Server 2008 R2 Standard Edition.
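A hedged sketch of the kind of transfer query that could do both things at once; apart from Address and CreateDatetime, the table and column names (and the linked-server path) are placeholders. Rows whose Address contains a comma (i.e. two addresses) are skipped, and only rows newer than what is already on Server B are loaded:

insert into ServerB_DB.dbo.AddressReport (ID, Address, CreateDatetime)
select s.ID, s.Address, s.CreateDatetime
from ServerA.SourceDB.dbo.SourceTable s                 -- e.g. via a linked server
where charindex(',', s.Address) = 0                     -- skip rows that hold two addresses
  and s.CreateDatetime > (select isnull(max(CreateDatetime), '19000101')
                          from ServerB_DB.dbo.AddressReport);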
I am running SQL 2008 Enterprise Edition with SP1 on Windows 2008. I am trying to set up replication. I have completed the following:
1. Created the distribution database
2. Created the publisher
3. Granted the SQL Agent account access to the ...MSSQL\100\COM folder to execute the agent_exe files
4. Granted the SQL Agent account access to ...MSSQL\Binn, where the subsystem_dll files are located
5. Granted the SQL Agent account write permission to ...MSSQL\repldata in order to write the bcp files
Each time I try to initialize the snapshot, I get the following errors in the SQL Agent Log
1. Log Step....... cannot be run because the LogReader subsystem failed to load. The job has been suspended.
2. Log Step....... cannot be run because the Snapshot subsystem failed to load. The job has been suspended.
I found posts where the records in the msdb.dbo.syssubsystems pointed to different folders than where the dll and exe files are located. So, I checked that, but they are correct.
The SQL Agent has sysadmin on the SQL Server and is using a windows service account.
I believe it is a security issue because I can run the executables from the command prompt to generate the snapshot for the publication. Have I missed the forest for the trees?
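For completeness, the metadata check mentioned above can be repeated with a query like this (filtered here to the relevant replication subsystems):

select subsystem, subsystem_dll, agent_exe
from msdb.dbo.syssubsystems
where subsystem in ('LogReader', 'Snapshot', 'Distribution');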
I'm putting together a Kimball method SSIS package. My factSales table has an OrderRep key. If a match isn't found in the dimRep table I am inserting a dummy dimRep row and going on. That seems to be working.
My question is what to do when the OLTP sales row has NULL for the OrderRep. This is possible; every sale does not have to have an order rep. My package is seeing that as a non-match and trying to create a dummy row in the dimRep table for every NULL. I really don't want to do this. I can trap for the NULL rep and convert it to "unknown" or something, but then the package would still create a single row in the dimRep table for unknown. Is that the best way to handle this? Or is there a way to trap for NULL and skip the entire lookup? A conditional split before every key lookup?
I have about 5 or 6 other dimension tables that will have the same NULL possibility.
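One common pattern, shown here only as a sketch with assumed column names, is to seed each dimension once with an "Unknown" member (surrogate key -1) and map NULL business keys to it before the lookup, so the lookup always matches and no extra dummy rows get created:

-- one-time seed per dimension (assumes RepKey is the identity surrogate key)
set identity_insert dbo.dimRep on;
insert into dbo.dimRep (RepKey, RepBusinessKey, RepName)
values (-1, 'N/A', 'Unknown');
set identity_insert dbo.dimRep off;

-- in the source query (or a Derived Column before the lookup), map NULLs to the dummy business key
select isnull(s.OrderRep, 'N/A') as OrderRepBusinessKey
from dbo.StagingSales s;

The same mapping could be done with a Derived Column expression in the data flow, which avoids needing a conditional split in front of each key lookup.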
I have a situation where I get data from a source flat file and have to load a dimension table and also a fact table using the same data flow (I have no other choice, since I have to unpivot some source data). Since I have to load both tables in the same data flow, I need a way to enforce a load-ordering constraint (I know Informatica allows that). Does anyone have an idea of how this can be done in SSIS?
I have an SSIS package which runs daily. This package has a couple of Execute SQL Tasks which load data for yesterday's transactions. Example:
INSERT INTO Shipped (Div_Code, shipment_value, ship_l_id, shipped_qty, shipped_date, whse_code, ord_id, ship_id, ship_l_ord_l_id, Created_date)
select ord.DIV_CODE as div_code, ship.SHIPMENT_VALUE as shipment_value, ship_l.SHIP_L_ID as ship_l_id, ship_l.SHIPPED_QTY as shipped_qty,
       ship.SHIPPED_DATE as shipped_date, ship.WHSE_CODE as whse_code, ord.ORD_ID as ord_id, ship.SHIP_ID as ship_id,
       ship_l.ord_l_id as ship_l_ord_l_id, GETDATE() as Created_date
from SHIP ship, ORD ord, SHIP_L ship_l
where ship.SHIPPED_DATE = dateadd(day, -1, CONVERT(VARCHAR(10), GETDATE(), 120))
  and ship.WHSE_CODE = 'WPP'
  and ord.ORD_ID = ship.ORD_ID
  and ship.SHIP_ID = ship_l.SHIP_ID
Every Execute SQL Task has a query like the one above, and some of the queries have a date filter that loads data for yesterday, e.g. ship.SHIPPED_DATE = dateadd(day, -1, CONVERT(VARCHAR(10), GETDATE(), 120)) in one query and ord.trans_date = dateadd(day, -1, CONVERT(VARCHAR(10), GETDATE(), 120)) in another. The package runs daily through a SQL Server Agent job, so it always loads data for yesterday. Now, if I want to run it for a particular date, how can we achieve that from SSIS?
I have an SSIS package which runs on a date parameter. If we don't specify the date it should always load data for yesterday, and if we give a specific date such as '2015-05-10' it should load for that date. How can we achieve this dynamically (using package configuration)? Once we have loaded a specific date, the package should be set back to yesterday's date dynamically. How do I achieve this? I am new to SSIS.
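A hedged sketch of one way to drive the filter from a variable: map an SSIS datetime variable (for example User::LoadDate, set through a package configuration or a job step) into each Execute SQL Task as a parameter, and fall back to yesterday when it is NULL. The variable name and parameter mapping are assumptions:

declare @LoadDate date;
set @LoadDate = ?;                                        -- ? mapped to the SSIS variable User::LoadDate
if @LoadDate is null
    set @LoadDate = dateadd(day, -1, convert(date, getdate()));

-- ...the existing INSERT, with the hard-coded filter replaced by:
-- where ship.SHIPPED_DATE = @LoadDate and ship.WHSE_CODE = 'WPP' ...

Leaving the variable at NULL keeps the daily behaviour, while setting it to a date such as '2015-05-10' before execution reruns that day.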
When I add a dimension to the cube without any relation to a measure group in Dimension Usage, my units go down. However, when I remove the dimension from the cube I get the correct values.
I made an illustration that describes the scenario. There is one master table, one dimension and a second fact table. The master table holds all the objects. Sometimes I have additional information for some of the objects in the master table; it is stored in extra tables such as Optional_facts_for_villas.
My question: how can I combine these two fact tables in Cube Designer without doing a full outer join in my database?
The reason I want it this way is that not all of the optional fact tables are needed for OLAP now, but maybe later. So I thought the easiest approach would be to add them to the cube later, without changing the database. Sorry for my English, it's not the best :)
This is my first attempt at populating a fact and dimension table from SSIS. I have a fact table Sales and dimension tables Customer and Location. The data I am getting to fill this structure is in one file, where each record contains the sales information as well as the customer information and location details on the same row.
I am using SSIS to fill this structure, with the Slowly Changing Dimension transformation for the Customer dimension.
While filling the Customer dimension through the Slowly Changing Dimension, if I have 2 records with the same business key but different first names, where first name is set as a changing attribute, it creates the customer twice in the table. Shouldn't it create one record with the most recent first name, or am I misusing the SCD?
I also have a conceptual question: I am not sure of the best way to fill my fact and dimensions. Should I fill the Customer and Location dimensions first, through 2 separate passes over the data, and then fill the fact table and map it to the corresponding dimensions? Or should I loop over each record and fill the dimensions and the fact simultaneously?
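On the loading order, a common approach (shown only as a sketch with assumed key names) is to load the Customer and Location dimensions from the file first, then load the fact in a second pass that looks the surrogate keys back up, rather than touching the fact and the dimensions row by row:

insert into dbo.FactSales (CustomerKey, LocationKey, SalesAmount, OrderDate)
select dc.CustomerKey, dl.LocationKey, s.SalesAmount, s.OrderDate
from dbo.StagingSales s
join dbo.DimCustomer dc on dc.CustomerBusinessKey = s.CustomerId   -- assumed business keys
join dbo.DimLocation dl on dl.LocationBusinessKey = s.LocationId;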
I'm defining a mining structure against an OLAP dimension. The continuous value that I'm using both as input and for forecasting represents the time to complete a certain process.
There's something that strikes me as if it could be a problem, but I'm not sure. Our fact table has multiple columns (with multiple corresponding measures in the cube). The "time-to-complete" measure is only populated on some of the fact rows - the rows that represent completion information. Other rows represent other information, and the "time-to-complete" value is set to 0. This works fine for cumulative time-to-complete and average time-to-complete, but it seems like it could mess up data mining. Will those 0-value facts skew the mining results? I'm not seeing a way to filter out those entries and only include the non-zero facts in the mining processing.
Or perhaps I'm totally misunderstanding something, which is quite possible. :)
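One possible workaround, sketched here with placeholder object names, is to base the mining structure (or the measure group it reads from) on a view that keeps only the completion rows, so the zero-valued facts never reach the model:

create view dbo.vCompletionFacts
as
select f.CaseKey, f.TimeToComplete
from dbo.FactProcess f
where f.TimeToComplete > 0;   -- keep only rows that represent actual completions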
We have a few customers dropping files in Amazon S3. How can we load this data into a SQL Server 2008 R2 database using SSIS? We are on a 2008 R2 BIDS environment.
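SSIS 2008 R2 has no built-in S3 source, so one hedged approach is to download the files first (for example with the AWS command-line tools in an Execute Process Task) and then load the local copies; a minimal load sketch with a placeholder table name and path:

bulk insert dbo.CustomerStage
from 'D:\S3Drop\customer_file.csv'
with (fieldterminator = ',', rowterminator = '\n', firstrow = 2);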
I have a simple SSIS package that reads a flat file and copies it into a SQL Server table.
When the flat file is on the C: drive I have no problem running this package from SQL Server Agent, but as soon as I update the path to a network location the package only works when I run it manually; it fails when executed via the SQL Server Agent job.
The error says "cannot open the datafile", while the datafile location is valid.
Is this a kind of limitation of SQL Server Agent, that only local files can be processed?
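This is usually not an Agent limitation but a permissions issue: under the job, the package executes as the SQL Server Agent service account (or a proxy), which may have no rights to the network share. A quick way to check which account that is, shown as a sketch (sys.dm_server_services needs 2008 R2 SP1 or later; otherwise look in SQL Server Configuration Manager):

select servicename, service_account
from sys.dm_server_services;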
There are a few features in the new SQL Server - Reporting Services that I really need in production. I have tested everything and it works great. I am running the CTP version since Microsoft is saying they aren't releasing the release version until 3rd quarter 2008.
Since Microsoft won't sell SQL 2008 until 3rd quarter, can I run the CTP in production until the release and then purchase SQL 2008?
Hello - does anyone have experience w/SQL Server 2005 in a virtual environment? I'm considering this for a production environment but not sure if performance will suffer. Our databases will have a lot of writing but not too much reading. A SSRS solution is currently the only app. connecting to the SQL db. Max users to server at any given time will be very low (~10 users max). But the databases are pulling in data from other, outside multiple data sources on a daily basis.
Hello! Recently I set up a server with Windows Web Server 2008 RC1, SQL 2008 Express beta, .NET 3.5 and IIS 7. I'm running an ASP.NET web application with a SQL database. Everything works fine until the first application state on the server expires. After that, any postback that starts a new application state on the server and connects to the database results in the following error: "Failed to generate a user instance of SQL Server due to a failure in starting the process for the user instance. The connection will be closed." Is this a bug that will be fixed in the release of Windows / SQL, or am I doing something wrong? Many thanks for help, Jan
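If the application really needs user instances, they can at least be confirmed as enabled on the Express instance (a sketch; the simpler alternative is often to drop User Instance=True from the connection string and use the service instance directly):

exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'user instances enabled', 1;
reconfigure;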
Here I will describe my problem.
1. We are loading a large amount of data from the database on a background thread which is started in the Application_Start event in Global.asax.cs. The data is then cached for subsequent requests to improve performance.
2. When we put the application on a web farm/garden, it is not able to load the application.
3. We are sending the requests to the servers through a router kind of application.
4. This application works fine in a single-server environment.
Need some help building a query that does the following :
I have 2 time dimensions: Time (Transdate) and ClosedDate (ClosedDate).
In my report/query, if [Time].CurrentMember = [Time].[YMD].[YMD].[2006].[200610].[20061031] I want to FILTER out all ClosedDate < [ClosedDate].[YMD].[YMD].[2006].[200610].[20061031]
Both Time Dimensions are Year -> Month -> Day and have the same Members.
I have every option available, using calculated Members and/or Measures to do this.
The report I'm creating is Aging of Receivables : Balance / 30 days / 60 days / etc.. But for the Aging, I need to filter like explained above.
I downloaded "Microsoft SQL Server 2008 Express CTP, February 2008" from http://www.microsoft.com/downloads/details.aspx?FamilyId=749BD760-F404-4D45-9AC0-D7F1B3ED1053&displaylang=en
I simply replaced the 2005 file "SQLEXPR.EXE" with the 2008 file "", recompiled the installation and tested, only for it to fail. I then read the 2008 Books Online and noted the change in command-line options.
I then changed the command line to suit the Microsoft 2008 books online, recompiled the installation and tested only for it to fail once more.
Interestingly, I tested the install from the default GUI and at the point of adding the "sa" login credentials it fails to allow the installation to proceed. Strangely, by selecting the Windows authentication credentials, then clicking "next" and "back", it now allows me to add the "sa" login credentials and continues to install correctly as required.
I hope I have explained this clearly enough.
1. Is this a bug in the "Microsoft SQL Server 2008 CTP, February 2008" installation?
2. If so, is this what is causing the command-line install options to fail?
3. How do I obtain a version of "Microsoft SQL Server 2008 Express" that will work when installing from the command line?
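On question 3: the 2008 Express setup does support unattended installs, but with different switches than 2005. A hedged example of the kind of command line described in the 2008 Books Online (the instance name, accounts and password are placeholders):

setup.exe /Q /ACTION=Install /FEATURES=SQLEngine /INSTANCENAME=SQLExpress /SECURITYMODE=SQL /SAPWD="StrongPassw0rd!" /SQLSYSADMINACCOUNTS="BUILTIN\Administrators"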
I would like to pass a dimension such as int A[5] to a stored procedure, where A[0]=10, A[1]=20, ... Can I do that in SQL Server 2005? If so, please give me a sample. Thanks!
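SQL Server 2005 has no array parameter type (table-valued parameters only arrived in 2008), so a common workaround is to pass the values as one delimited string and split it inside the procedure. A hedged sketch:

create procedure dbo.usp_TakeValues
    @Values varchar(max)                 -- e.g. '10,20,30,40,50'
as
begin
    declare @x xml;
    set @x = cast('<v>' + replace(@Values, ',', '</v><v>') + '</v>' as xml);

    select n.v.value('.', 'int') as Val  -- one row per "array" element
    from @x.nodes('/v') as n(v);
end;
go

exec dbo.usp_TakeValues @Values = '10,20,30,40,50';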
I have just worked through the SSIS example in the tutorial document included with the SQL 2005 Enterprise installation. My problem is that whenever I run a test, the package loads all the data from the source without checking the destination (I mean it loads all the data to the destination). I have done this several times and it continues to load everything without checking, which means the data gets duplicated when the schedule runs.
I think there should be a parameter or something like that to make the engine load only the new data to the destination. Could you help please?
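There is no single switch for this in the tutorial package; one hedged way to load only new rows (table and key names are placeholders) is to let the source query, or a Lookup against the destination, skip rows that already exist:

insert into dbo.DestTable (BusinessKey, Amount, CreateDatetime)
select s.BusinessKey, s.Amount, s.CreateDatetime
from dbo.SourceTable s
where not exists (select 1
                  from dbo.DestTable d
                  where d.BusinessKey = s.BusinessKey);   -- only rows not loaded yet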
In our application we have created an SSIS package which extracts data from a staging table and places it in the destination table. We have created a Slowly Changing Dimension for this. The Slowly Changing Dimension uses a composite business key of two columns to decide whether a record is old or new.
Problem: on execution of the package it inserts duplicate records with the same business keys instead of updating them. Also, this does not happen for all records; for a few records the update works fine, but for others it inserts a new duplicate record.
I would appreciate it if anybody could point out where I am doing something wrong.
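One thing worth ruling out first, sketched here with placeholder names for the two business-key columns, is whether the staging data itself repeats the same composite key, or repeats it with trailing spaces or different casing, since either will make the SCD component treat the row as new:

select rtrim(KeyCol1) as KeyCol1, rtrim(KeyCol2) as KeyCol2, count(*) as rows_per_key
from dbo.StagingTable
group by rtrim(KeyCol1), rtrim(KeyCol2)
having count(*) > 1;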
I have an existing SSIS package that uses the SCD component on a rather large table. I am also migrating to SQL Server 2014 Enterprise. My question is: is it possible to use the SCD and a columnstore (CS) index together? I know that I can drop the CS index, add a business key and insert the data, then drop the business key and add the CS index back.
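If it is a nonclustered columnstore, an alternative to dropping it (sketched with placeholder index and table names) is to disable it for the load and rebuild it afterwards, which keeps the index definition in place; a clustered columnstore in 2014 is updatable, so the SCD inserts and updates could also run against it directly, though updates are relatively expensive:

alter index IX_FactSales_CS on dbo.FactSales disable;
-- run the SCD load here
alter index IX_FactSales_CS on dbo.FactSales rebuild;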
Hi, how can I install the report server on a load-balanced server? My SQL Server is on a different machine and we have 2 web servers where we need to install Reporting Services. Any help will be appreciated.
Please help. We have no idea what is wrong. The error message occurs at the end of the load: "Error at Destination for Row number 6218607. Errors encountered so far in this task: 1. SqlDumpExceptionHandler: Process 11 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process."
Is it possible to take a backup of a database and load it onto another server? I know the user IDs will be messed up, but I don't want to do a detach or a DTS task. I just want to verify that the backup is good.
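A hedged sketch of the verification restore on the other server (the paths and logical file names are placeholders; RESTORE FILELISTONLY shows the real logical names first):

restore filelistonly from disk = 'D:\Backup\CustomerDb.bak';

restore database CustomerDb_Verify
from disk = 'D:\Backup\CustomerDb.bak'
with move 'CustomerDb_Data' to 'D:\Data\CustomerDb_Verify.mdf',
     move 'CustomerDb_Log'  to 'D:\Data\CustomerDb_Verify.ldf',
     stats = 10;

RESTORE VERIFYONLY FROM DISK can also confirm that the backup file is readable, though it is less thorough than an actual restore.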