I have tried the following and would appreciate feedback from experienced users on whether it is a good or bad approach:
After bringing all the data into my Data Mart, I created a view that presents all the data as one big flat table (totally unnormalized). Then, based on this BIG FLAT UNNORMALIZED VIEW :), I created my various dimensions using the 1st option, i.e. a Star Schema.
Based on the little testing I have done, I seem to be getting correct results across the various dimensions... However, can someone kindly comment on this approach and its pros/cons?
We have an OLTP database and operational reporting is carried out on a replica server / database. We have plans to build a new data warehouse and an analysis services cube.
Question 1: Should a cube be designed to extract data from a physical star schema rather than a logical one (3NF relational (ODS?) using a data source view to derive the star)? I'm guessing that for performance it's better to pull data from similar structures (physical facts and dimensions, as required by Analysis Services), but is the difference significant?
Question 2: Depending on the answer to Q1, is it bad practice to ETL data from a staging database (replica > staging) directly to a star schema (multiple data sources and cleansing / business rules required)? Or should it be processed from staging to an ODS and only then to a star schema (physical or logical)? I still don't know whether an ODS is required, but I guess the consideration for this decision is whether the business would require daily operational (or ad hoc) reporting on the consolidated data sources (without needing historical DW functionality).
I am cutting my teeth on star schema design. I have a simple star schema I am building for headcount analysis at work. I have a factless fact table where a row represents a head in the company. Each head is tied to a particular week in a Date dimension table. There are additional dimensions for things like gender, ethnicity, marital status, age, etc. Now, my department dimension is hierarchical: in DimDepartment, a department belongs to a company, and companies belong to divisions. Now the fun part. Each division has a headcount target for each year. Up to this point I am in a perfect star schema (no snowflaking). How would I integrate this concept of a headcount target for each division for a given year?
We are using Cognos on top of this star schema to provide reporting and analysis services, if that is relevant. From the star schema design standpoint... any thoughts?
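A common pattern here (the standard Kimball-style answer, sketched with hypothetical names) is a second fact table at the division/year grain rather than snowflaking the department dimension:

-- Hypothetical names: a separate fact table at the Division/Year grain.
-- DivisionKey points at a division-level dimension (often a "shrunken"
-- rollup of DimDepartment), so the base star stays snowflake-free.
CREATE TABLE dbo.FactDivisionHeadcountTarget
(
    DivisionKey     int NOT NULL,  -- FK to the division-level dimension
    TargetYear      int NOT NULL,  -- or a FK to DimDate at year grain
    HeadcountTarget int NOT NULL
);

To compare actuals with targets, aggregate the headcount fact up to division/year and drill across to this table through the shared dimension keys; each star stays a clean star.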
My clients are not interested in using auto-generated keys. They also get data from many sources and would like to use something like CustomerID. Does anybody know of any reason why we should not do that? They are also concerned about SQL Server not being able to handle the data warehouse in the future, because they are expecting it to grow into terabytes. Does anyone have advice on that? I was thinking of putting the fact table on a different filegroup; I don't know if it will help.
I am re-engineering the data warehouse, and my client is currently using auto-generated keys. Their concern is that after a certain number of keys (can't remember the figure) SQL Server starts having problems. Does anyone know how I should handle this in my design? Thanks, any input will be appreciated.
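If it helps, the figure your client may be half-remembering is the int identity ceiling: 2,147,483,647. The usual fix is simply to declare surrogate keys as bigint, whose upper bound (about 9.2 quintillion) is beyond any realistic fact table. A minimal sketch with hypothetical names:

-- int IDENTITY runs out at 2,147,483,647 rows; bigint effectively never does.
CREATE TABLE dbo.FactSales
(
    SalesKey    bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerKey int    NOT NULL,  -- surrogate FK to DimCustomer
    SalesAmount money  NOT NULL
);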
I'm designing a DW, and I have some doubts about distributed transactions when modeling a star schema.
My problem is: I have a main dtsx package in which I call all the child packages in order to create the fact and dimension tables.
(1) First I have several child packages that create and populate all the dimension tables (with the latest values from the relational DB).
(2) Then I have several child packages that create all the fact tables; in this process I use the surrogate keys from the dimension tables (obtained in step 1).
The problem here is: how do I use multiple transactions? If I put a "Required" TransactionOption on the parent package, then after calling the child packages that create the dimension tables, the values are not committed, so they are not available when I later execute the child packages related to the fact tables.
How can I use transactions when modeling a star schema, in order to get a full rollback or a full commit across all tables (dimensions and fact tables)?
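Two patterns are commonly suggested. First, set TransactionOption to Required on the parent and Supported (not Required) on every child, so all children enlist in the parent's single DTC transaction and should see each other's uncommitted work. Second, skip DTC entirely and manage the transaction yourself with Execute SQL Tasks on one connection manager whose RetainSameConnection property is True. A sketch of the statements in that second pattern (package wiring assumed, not shown):

-- Execute SQL Task at the start of the parent package:
BEGIN TRANSACTION;

-- ...child packages load the dimension tables, then the fact tables,
-- all on the same retained connection, so the new surrogate keys are
-- visible even though nothing is committed yet...

-- Execute SQL Task after the last child succeeds:
COMMIT TRANSACTION;

-- Execute SQL Task in the package's OnError event handler:
ROLLBACK TRANSACTION;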
We are just starting to do some testing on SQL Server EE with dimensional models... we have had one or two problems, which we have been able to solve using the new performance dashboards etc.
However, as is inevitable, we are seeing strange behaviour in a query... in a star join it seems to be doing an eager spool and trying to spool the entire fact table to tempdb... hhmmm...
Rather than ask one question at a time... we have DBAs who went to classes etc. at MSFT, and the client is some level of MSFT partner.
Could anyone point me to the best documentation for understanding the optimiser and how to influence it to do the right thing when optimising plans for star joins?
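While you hunt for documentation, query hints are one way to experiment with the plan shape. A hedged sketch (table and column names hypothetical, not a recommendation to leave hints in production) of the kind of variants worth comparing against the eager-spool plan:

SELECT d.CalendarYear, SUM(f.SalesAmount) AS Sales
FROM dbo.FactSales AS f
JOIN dbo.DimDate    AS d ON f.DateKey    = d.DateKey
JOIN dbo.DimProduct AS p ON f.ProductKey = p.ProductKey
WHERE p.Category = 'Bikes'
GROUP BY d.CalendarYear
OPTION (HASH JOIN, FORCE ORDER);  -- compare the plan with and without hints

Checking estimated vs. actual row counts on the fact table scan usually shows why the optimiser chose to spool.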
So this has got to be considered a major, major flaw in how SSRS interacts with Oracle. I'm using the "Oracle" data provider, but I've also tried Microsoft's OLE DB data source and some others, and in no case does SSRS hand Oracle a query that does NOT have bind variables. In other words, query parameters always get passed to Oracle as bind variables.
The incredibly major problem this causes is that it disallows Oracle's use of star transformation queries, which is the primary method of getting fast responses from a data warehouse/star schema. In fact, a prime authority on the subject (Bert Scalzo, Oracle DBA Guide to Data Warehouse and Star Schemas, p. 86 -- obviously "don't use Oracle 7.x" was the first) lists it as, in effect, the #1 consideration.
So what gives? In effect, SSRS cannot be used against large-scale Oracle data warehouses? I've had success getting Business Objects to use Oracle star transformations.
So I guess my question is: how the heck can I use SSRS in a big, Oracle-based data warehouse?
For star_transformation join plans, the following parameters must also be considered: ... No BIND VARIABLE in SELECT statement
http://www.orafaq.com/usenet/comp.databases.oracle.server/2003/09/28/2305.htm Star transformation is not supported for tables with any of the following characteristics: * Queries that contain bind variables
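To make the difference concrete, this is what the optimizer sees in each case (star schema names hypothetical). One hedged workaround is to build the dataset's SQL as an SSRS expression so parameter values are inlined as literals instead of binds, accepting the SQL-injection and plan-reuse trade-offs:

-- What SSRS sends: the parameter arrives as a bind variable,
-- so star transformation is disallowed.
SELECT SUM(f.sales_amt)
FROM   fact_sales f
JOIN   dim_region r ON r.region_key = f.region_key
WHERE  r.region_name = :region;

-- What star transformation needs: the value inlined as a literal.
SELECT SUM(f.sales_amt)
FROM   fact_sales f
JOIN   dim_region r ON r.region_key = f.region_key
WHERE  r.region_name = 'WEST';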
Locally I develop in SQL Server 2005 Enterprise. Recently I recreated my DB on my hosting company's server (SQL Server 2005 Express): I basically recreated the tables and copied the data into them. I now receive the following error when I hit the DB: "The 'System.Web.Security.SqlMembershipProvider' requires a database schema compatible with schema version '1'. However, the current database schema is not compatible with this version. You may need to either install a compatible schema with aspnet_regsql.exe (available in the framework installation directory), or upgrade the provider to a newer version." I heard something about running aspnet_regsql.exe, but I don't have that kind of access to the DB. Also, I don't know whether this command does anything more than create the membership tables and fill them with some default data... Any other solutions/thoughts on what this can be? Thanks!
Hello everybody! I'm using ASP.NET 3.5 and MSSQL 2005. I bought virtual web hosting. On new user registrations I get an error =( "The 'System.Web.Security.SqlMembershipProvider' requires a database schema compatible with schema version '1'. However, the current database schema is not compatible with this version. You may need to either install a compatible schema with aspnet_regsql.exe (available in the framework installation directory), or upgrade the provider to a newer version." On my virtual machine it works fine, but on the web hosting I get this error =( What can you propose?
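In both of these cases the usual fix is to run aspnet_regsql.exe from your own machine against the hosted database; the tool only needs a SQL connection, not access to the server itself. A hedged example (server, login, and database names are placeholders; -A mrp installs the membership, role, and profile schema):

aspnet_regsql.exe -S sqlserver.examplehost.com -U dbuser -P dbpassword -d mydatabase -A mrp

If the host blocks direct connections, ask whether their control panel can run the scripts aspnet_regsql generates. The tool does essentially create the aspnet_* tables, views, and stored procedures plus their version registration, which is what the "schema version '1'" check is looking for.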
I would like to use the SSIS tool to move data from one database schema to another database schema.
For example:
Source table has
1. UserName (varchar 20) (no null)
2. Email (varchar 50) (can be null)
Destination table has
1. UserID (uniqueidentifier - GUID)
2. UserName (varchar 50) (no null)
3. EmailAddress (nvarchar 50) (can be null)
4. DateTime
Questions:
1. What controls do I use in my Data Flow to move data between databases with different data types, and to populate UserID with a new GUID and DateTime with the load date (GETDATE)?
OLE DB Source, OLE DB Destination, Data Conversion, and ...?
How do I insert the GUID and the date at the same time? (One option is sketched below.)
2. I have many tables to move. Any suggestions? How do I architect my project? If I create a separate data flow for each table, it will look complicated.
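For question 1, one option that keeps the Data Flow simple (a sketch, with a hypothetical source table name) is to generate both new values in the OLE DB Source query itself, leaving only the varchar/nvarchar length conversions for a Data Conversion transformation or the destination mappings:

-- OLE DB Source query: GUID and load date produced per row at the source.
SELECT
    NEWID()                        AS UserID,       -- new uniqueidentifier
    CAST(UserName AS varchar(50))  AS UserName,
    CAST(Email    AS nvarchar(50)) AS EmailAddress,
    GETDATE()                      AS [DateTime]    -- load timestamp
FROM dbo.SourceUsers;  -- hypothetical name

Inside the Data Flow, a Derived Column can supply GETDATE(), but generating a GUID there needs a Script Component, which is why pushing both into the source query is attractive. For question 2, the usual compromise is one package (or a few, grouped by subject area) with one simple data flow per table; a single giant data flow is what gets complicated.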
I used SSEUtil to add a schema to my database but I am having problems. Used these steps:

SSEUtil -c
> USE "c:\Rich.mdf"
> GO
> !RUN Resume.SQL
(indicates success)
> SELECT * FROM SYS.XML_SCHEMA_COLLECTIONS
> GO
(schema not shown in list)
> USE master
> GO
> SELECT * FROM SYS.XML_SCHEMA_COLLECTIONS
> GO
(schema is shown in the query)

It appears that the schema is not added to the desired database, so when I try to use the schema in Visual Studio, it does not appear when I connect to the Rich.mdf database. Any ideas on what I am doing wrong or why this might be happening? Thanks, Kevin
I want to set up a DB for an e-commerce site. I need to know how to set up the DB correctly without getting anything mixed up, because I have never done an e-commerce DB before:
I need to know where and how to store the passwords, the product pictures, and the customers' delivery addresses, which can differ from the billing address.
The tables are as follows...
--Customer--
CustID
Email (email & password are used for login)
Password (will this be secure here?)
Name
Address (should it all be in one column, or FirstLine/SecondLine/ZipCode columns?)
DeliveryAddress

--Product--
ProductID
ItemName
Category
Price
SellingPrice
Quantity
ItemsPicture... (not sure where to link the pics to)
DistributorsID (the warehouse that dispatches the item)

--Distributors--
DistributorsID
Name
Company
Address
It would be much appreciated if you could share some info and tips on how to set it all up correctly.
Tools I'll be using: ASP.NET (C#), ADO.NET, MSSQL (stored procedures).
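A minimal sketch of one common shape for this, with hypothetical names: store only a salted hash of the password (never the plain text), keep pictures on disk and store just the path in the DB, and move addresses to their own table so billing and delivery are simply two rows per customer:

CREATE TABLE Customer
(
    CustID  int IDENTITY(1,1) PRIMARY KEY,
    Email   nvarchar(100) NOT NULL UNIQUE,  -- used for login
    PwdHash varbinary(64) NOT NULL,         -- salted hash, never plain text
    PwdSalt varbinary(16) NOT NULL,
    Name    nvarchar(100) NOT NULL
);

CREATE TABLE Address
(
    AddressID int IDENTITY(1,1) PRIMARY KEY,
    CustID    int NOT NULL REFERENCES Customer(CustID),
    Line1     nvarchar(100) NOT NULL,
    Line2     nvarchar(100) NULL,
    ZipCode   nvarchar(10)  NOT NULL,
    AddrType  char(1)       NOT NULL        -- 'B' = billing, 'D' = delivery
);

CREATE TABLE Product
(
    ProductID      int IDENTITY(1,1) PRIMARY KEY,
    ItemName       nvarchar(100) NOT NULL,
    Category       nvarchar(50)  NOT NULL,
    Price          money         NOT NULL,
    SellingPrice   money         NOT NULL,
    Quantity       int           NOT NULL,
    PictureUrl     nvarchar(255) NULL,      -- path to the image file, not a blob
    DistributorsID int           NOT NULL   -- FK to Distributors
);

The hashing itself is best done in your C# code before the value is passed to the stored procedure.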
I know that in SQL Server the terms Database and Catalog are used interchangeably. But a table is also assigned a schema, as seen in the INFORMATION_SCHEMA.TABLES view. I don't get what this schema qualifier is all about -- like when a table has a schema of dbo. Can someone explain what a schema is and how it relates to tables? Thanks.
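Short version: a catalog is the database; a schema is a namespace inside the database that owns objects, and dbo is just the default one. A small illustration (database name hypothetical):

-- Two tables with the same name can coexist, namespaced by schema.
CREATE SCHEMA Sales;
GO
CREATE TABLE dbo.Orders   (OrderID int);
CREATE TABLE Sales.Orders (OrderID int);
GO
-- The schema is the middle part of the three-part name:
-- database.schema.table
SELECT * FROM MyDatabase.Sales.Orders;

(In SQL Server 2000 a schema was effectively the object's owner; from 2005 on, schemas are separated from users.)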
I have set up merge replication between two machines, and the data seems to update fine when changed on either side. However, I am now looking at how to handle schema changes. I have found on MSDN (although not tried) stored procedures that will add and remove columns, but what happens if I want to delete a whole table, for example?
At the moment I can deselect the table being deleted from the publication and then delete or modify it in Management Studio, but that means I need to create a new snapshot, which can take some time before the change reaches the subscriber.
Is there a better way to tackle schema updates that I have not found yet? Also, how will the subscriber cope if someone starts modifying data while synchronisation to the new snapshot is going on?
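For the drop-a-table case specifically, the usual sequence is to drop the article from the publication first and only then drop the table. A hedged sketch (publication and table names hypothetical; check the exact parameters for your version):

-- Remove the table from the merge publication, then drop it.
EXEC sp_dropmergearticle
     @publication = N'MyMergePub',
     @article     = N'tblObsolete',
     @force_invalidate_snapshot = 1;
GO
DROP TABLE dbo.tblObsolete;

As for users modifying data during synchronisation: changes at the publisher continue to be tracked and reconciled on the next merge, but whether subscriber-side changes survive a reinitialisation depends on whether they are uploaded first, so that scenario deserves testing in your environment.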
I wrote a website that hits a SQL Server DB using queries that reference table names with no schema prefix (e.g. SELECT * FROM tblHelpNotes;). I also backed up the file and somehow (I have forgotten how I managed it) imported it into SQL Server Express to run at home. The problem I am having is that in order to query at home I HAVE to use the schema or it won't work (e.g. SELECT * FROM db123456.tblHelpNotes;). How can I change this so I can just hit the table directly? I have tried miserably to set up a user whose default schema is db123456, but I can't seem to get it to work. Any ideas to point me in the right direction?
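If the tables really live in that schema, two things to try (the user name webuser is hypothetical):

-- Option 1: make unqualified names resolve to the schema that owns the tables.
ALTER USER webuser WITH DEFAULT_SCHEMA = db123456;

-- Option 2: move the table into dbo so no prefix is ever needed.
ALTER SCHEMA dbo TRANSFER db123456.tblHelpNotes;

Note that DEFAULT_SCHEMA applies to the database user you actually connect as; if the site connects under a different login, set it on that login's user.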
Here is my problem: I have to rebuild a database after a crash, and there is no backup. I did a bcp to get the data out, but I do not have the original database, so my question is how to get the full schema of the database: tables, columns, stored procedures, etc.
I come from a MySQL background and have been having some trouble finding the MSSQL command that corresponds to MySQL's "DESCRIBE". I Googled for around half an hour but can't seem to find it.
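The closest equivalents (table name hypothetical; note that both need a live database to query, so this answers the DESCRIBE question more than the crash-recovery one):

-- Roughly MySQL's DESCRIBE: full metadata for one table.
EXEC sp_help 'dbo.MyTable';

-- Or just the column list:
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_NAME = 'MyTable';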
I am trying to modify a survey DB schema created by OOP (object-oriented programming) snobs who had their heads so far up their butts (in my opinion). This is the current schema created by them.
Actually, there aren't several foreign keys as shown in the picture, but that's what they (the OOP snobs) are doing to the "Response" table: they are inserting question_collection_id and its name, as well as question_id and its name, into the Response table. However, it was not ill-intended: "We just wanted the report to load quickly" (OOP snobs). But I really recommend my own schema because of its simplicity.
Yet I haven't proven that my schema will produce the report just as quickly as the OOP snobs' version. As mentioned earlier, since they don't have the corresponding foreign key constraints, I find quite a number of discrepancies between the values of the "question" columns in the "Question" and "Response" tables, and the "question_collection" columns in the "question_collection" and "Response" tables. I could set up a scheduled task to eliminate those problems, but I really prefer my schema to theirs. So far, I can only argue that mine is better because of its simplicity and a DBA's gut instinct. I'd like to hear expert opinions on which schema is better and why.
Attached are also zip files of the schema images in case you can't see them via those URLs.
I have created a very simple DTS. The DTS is used to transfer data from Oracle to MSSQL.
E.g.: SELECT empno, empname FROM schemaname.employee WHERE isactive = 'Y'
I want to transfer the above data from Oracle to MSSQL. I am able to create the DTS and run it successfully.
Right now the schema name is hardcoded, but in the real scenario the schema name is only known at run time. I want to pass the schema name at run time. How do I do this?
I am using a Transform Data task to transfer the data from Oracle to MSSQL.
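Since a schema name cannot be a query parameter, the usual DTS approach is to hold it in a global variable and let a Dynamic Properties Task (or an ActiveX Script step) assign the Transform Data task's source SQL before it runs. The statement that ends up executing is just the hardcoded one with the variable's value spliced in, e.g. when the variable holds 'HR' (placeholder value):

SELECT empno, empname
FROM   HR.employee      -- schema name substituted at package run time
WHERE  isactive = 'Y';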
I have a 3 GB SQL database that currently has all indexes, logs, and data on one drive. We have a new server that will be put in place on Wednesday, and thankfully we will be putting the indexes and transaction logs on different drives from the actual data. Does anyone have a recommendation on his/her preferred way of doing this, and what are some of the advantages/disadvantages you may have encountered? This isn't homework. I am finally getting the hang of manipulating and working with our database and will need to accomplish the server switch in about two weeks. Or if you have recommended reading on this, please point me to it and I'll get to practicing!
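On the mechanics, a minimal sketch of the filegroup side (database name, paths, and index names hypothetical): put a filegroup on the index drive, then build nonclustered indexes on it. Note that the clustered index *is* the table's data, so only nonclustered indexes belong on the separate drive, and the log file location is set when the database is created or moved later with ALTER DATABASE ... MODIFY FILE.

-- Add a filegroup on the index drive and point a file at it.
ALTER DATABASE MyDB ADD FILEGROUP INDEXES;
ALTER DATABASE MyDB ADD FILE
(
    NAME     = MyDB_idx1,
    FILENAME = 'E:\SQLIndexes\MyDB_idx1.ndf',
    SIZE     = 500MB
) TO FILEGROUP INDEXES;

-- Create (or rebuild) nonclustered indexes on the new filegroup.
CREATE NONCLUSTERED INDEX IX_Orders_CustID
    ON dbo.Orders (CustID)
    ON INDEXES;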