I wonder what would be the best (at to be honest - how to do it at all) to perform data normalization with SSIS. The scenario is as follows:
I got plain table with several columns in it.Some of columns can be copied straight into destination tableSome columns (String) should be lookup in another table to get IDOn success just replace string with IDOn fail - create new record in lookup table and return newly created ID
Thanks for any ideas and maybe short samples
Hi All Please guide me in the following situation. I am new in programming I have a master table tblCompany with fields: Company Name, Address, Phone number Second table is tblUsers with Company Name, User Name , Password Third table is tblDealing with field Company Name , Dealer Name, Dealer Address According to the normalization rules I shoud put a column named Company_Id in tblCompany(master table) and use it in other two tables instead of CompanyName colum to reduce the data retundancy. But my question is accessing data from master detail tables with join quries will take more processing time(taking the company name against the company ID). On the other hand memory wise its same to store the company ID(like 0012786) and company name (like somecompany Ltd). So should I go for normalization or simply store the Company name in each table. Thanks
How to normalization would be use 3 tables ,one for tags_id and tags, and another relating the tags_id to ID from the table above...but how does the code already know there is a tag called repeating?
I want to encrypt certain data like password, ssn, credit card info etc before saving in database. Also, this encrypted data can be queried using standard SQL statements like:
select * from users where userid=454 and password = 'encrypted data'
The mechanism to encrypt data could be in a .net application. The code that does encryption/decryption should also be protected so that it doesnt work if it falls in wrong hands.
Can anyone suggest what would be the best way to accomplish above?
I need to record in a table: Who, When, What Field and New Value of Fields When changes occur to an existing record.
The purpose is for users to occassionally view the changes. They'll want to be able to see the history of the record - who changed what and when.
I figured I'd add the needed code to the stored procedure that's doing the update for the record.
When the stored procedure is called to do the update, the PK and parameters are sent.
The SP could first retain the current state of the record from the disk, then do the update, then "spin" thru the fields comparing the record state prior to the update and after. Differences could be parsed to a "Changes string" and in the end, this string is saved in a history record along with a few other fields:
Name, DateTime, Changes
FK to Changed Record: some int value Name: Joe Blow Date: 1/1/05 12:02pm Changes: Severity: 23 Project: Everest Assigned Lab: 204
How does the above approach sound?
Is there a better way you'd suggest?
Any sample code for a system that spins thru the fields comparing 1 temporary record with another looking for changes?
Hi , I am loading the Data into the Tables with the constraints on and redirecting the error rows into a seperate table is there a way to capture the error rows from a execute sql task by directly loading data without constraints and later adding them with the execute sql task and redirecting them to error table as this approach would make the loads quicker. the approach now that i am using is on a row by row basis ..... and if i drop constraints and load data and then add constraints will this deposit the same error rows as in case of the current approach please send me ur suggestions
In an approach of building an ETL tool, we are into a situation wherein, a table has to be loaded on an incremental basis. The first run all the records apporx 100 lacs has to be loaded. From the next run, only the records that got updated since the last run of the package or newly added are to be pulled from the source Database. One idea we had was to have two OLE DB Source components, in one get those records that got updated or was added newly, since we have upddate cols in the DB getting them is fairly simple, in the next OLEDB source load all the records form the Destination, pass it onto a Merge Join then have a Conditional Split down the piple line, and handle the updates cum insert.
Now the question is, how slow the show is gonna be ? Will there be a case that the Source DB returns records pretty fast and Merge Join fails in anticipation of all the records from the destination ?
What might be the ideal way to go about my scenario.. Please advice...
My Input is a flat file source and it has spaces in few columns in the data . These columns are linked to another table as a foreign key and when i try loading them in a relational structure Foreigh key violation is occuring , is there a standard method to replace these spaces .
what approach should i take so that data gets loaded in a relational structure.
for example
Name Age Salary Address dsds 23 fghghgh
Salary description level 2345 nnncncn 4
here salary is used in this example , the datatype is char in real scenario
what approach should i take to load the data in with cleansing the spaces in ssis
It's well known issue, that one can't use any dataset fields in a report header/footer directly. One of the approach is to create query-based parameter that basically equals =First(Fields!@FieldName@.Value, "@DataSetName@") and use that parameter value instead. But it doesn't work in my case!
My report displays some entity description and is parametrized with EntityID param. Its header contains entity name that, according to the approach, is queried from the data source through the EntityName report parameter. There's important issue: the report is displayed in ReportViewer control, that is embedded into my application and entity ID parameter isn't ser by user in ReportViewer parameters area. Its default value is changed by the application with SetReportParameters() web method every time a user wants to view the report according to the entity the user is exploring in the application. But after the report has been rendered, its header always contains not actual (outdated) entity name. Nevertheless, the report body contains actual data (including entity name). If I alter entity ID parameter in ReportViewer or in web-based Report Manager and refresh report, header displays correct entity name.
Is it possible to normalize a database using SQL statement? I have a huge duplicated records on a certain fields and need to do some normalization on it. For example, the raw data below
I'm working on a normalization for one of my classes and I've been sick and I feel lost now, could one of you please look at my database statements and tell me if/what is wrong with it?
Hi everyone.. Well i have my tables ready to build the database on to the sql server... My probs is normalization of the tables being used in the database.... Is there any best possible way / [short-cut.. very weird to ask this :) ] using the sql server...?
I have a web app which is used to do normal insert/update of employee info. Connected to each employee that is entered is some data that is imported from an outside source for each employee. The question I have is currently my database is very normalized and importing data from this outside source will be quite a pain because of this. Is it bad practice to denormalize a specific table if no user will every insert/update it beside DTS?
What's my question is ? Whether there are any tools available for normalization produced by Database vendors or any third party. If yes, Can u kindly give me clear documentation.
Does this show "poor" design? It has been suggested to me to do a "Logical Model" of my data base and that will make it easier to "normalize" the tables. I tried this and come up with the following but I don't know if I am stretching it too thin. One rule of the 2NF is to ensure all tables have a primary key, and as you can see, my tbProjectTeam has a primary key, but that is made up of the entire row. Same goes for the tbDepartmentActivities.
tbEstimatedProjects Reference (PK) | Name | City | Postal |... ----------------------------------------------------------- 1 | Some Project | Niagra Falls | N8E7J5 | ....
Hi,I am using MS-SQL server to store my database.My problem is that I have around 150+ database files in DBF format.Each database file consists of fields ranging from 2 to 33 in number.Also, there are some fields which have just one entry and rest areNULL.This database will be accessed by a printing software.Please advice as to how I should proceed to normalize this database.Regards,Shwetabh
I've come up with this issue in several apps now. There are things that, fromone perspective, are all handled the same, so it would be desirable that theyall be handled in the same table with some field as a type specification.From other perspective of foreign key relationships, however, they aredifferent things and can't be stored in the same table.For example, I have a scheme for indicating mappings between dimension recordsat one time period to new dimension records at another time period. I coulduse one set of tables for all mappings since they all work exactly the sameway, but then I can't set up DRI between the mapping tables and the dimensiontables. If I just make separate mapping tables for each dimension table, thenI'm creating 4 new tables per dimension table, all identical with respect towhat fields they contain, what kinds of unique constraints they have, and whatrelationships they have to each other with the sole distinction that they eachmap to the integer-type key of a different dimension table. I would not lookforward to doing maintenance on this schema!Is there any strategy for having the cake and eating it, too?
create table cd_fiq_a ( fiq_id int not null primary key, fiq_name varchar(50) not null)
create table cd_sal_b ( sal_id int not null primary key, sal_name varchar(50) not null)
create table cd_rak_c ( rak_id int not null primary key, rak_name varchar(50) not null)
insert into cd_fiq_a values (1, 'Fiq1')
insert into cd_fiq_a values (2, 'Fiq2')
insert into cd_sal_b values (1, 'Sal1')
insert into cd_sal_b values (2, 'Sal2')
insert into cd_sal_b values (3, 'Sal3')
insert into cd_sal_b values (4, 'Sal4')
insert into cd_sal_b values (5, 'Sal5')
insert into cd_rak_c values (1, 'Rak1')
insert into cd_rak_c values (2, 'Rak2')
insert into cd_rak_c values (3, 'Rak3')
insert into cd_rak_c values (4, 'Rak4')
Now there is a relationship between cd_fiq_a, cd_sal_b and cd_rak_c. For a given Faq there can be one or more records of Sal. For a given Fiq and a given Sal there can be one or more records of Rak. I am thinking that i can do it one table or two tables:
One Table Solution ----------------------------
create table relation_d ( relation_id int not null primary key, fiq_id int not null foreign key REFERENCES cd_fiq_a (fiq_id), sal_id int not null foreign key REFERENCES cd_sal_b (sal_id), rak_id int not null foreign key REFERENCES cd_rak_c (rak_id), sort_order int not null )
Two Table Solution ---------------------------
create table relation_header_d ( relation_header_id int not null primary key, fiq_id int not null foreign key REFERENCES cd_fiq_a (fiq_id), sal_id int not null foreign key REFERENCES cd_sal_b (sal_id) )
create table relation_detail_e ( relation_detail_id int not null primary key, relation_header_id int not null foreign key REFERENCES relation_header_d (relation_header_id), rak_id int not null foreign key REFERENCES cd_rak_c (rak_id), sort_order int not null )
Which solution is more normalized and will result in better execution of Sql? Or is there any other solution which is more better?
My question concerns the amount of normalization i require for my specific needs. I realize this is a difficult question without knowing alot more about my database. I am hoping with some information I could get advice from those more experienced than I.
- My database consists of 9 tables with the maximim number of columns being 11 which is the contacts table.
- the largest data type is nvarchar 125.
- number of rows in the largest table will eventually grow to hundreds of thousands.
- users access the database online
This is large to me but I expect that to some of you this not.
My application would be easier to setup if the contacts table were to include address info.
So my question is, for a database of this size could I create a contacts table similar to the customer table in the Microsoft Northwind sample data base with the address included, or should I model something more like the contact table in the Microsoft Adventureworks db with the address and State/province split to separate tables.
Any help you could provide with this scetchy info would be greatly appreciated.
Hai everybody recently i came across this article and i have tried to answer all the follwoing questions. But i am not sure its correct or not..so you peoples can comment on the follwoing questions.
Hi all, This is actually a pretty stupid question, but somehow I need an answer from you experts. We are currently building a web application using ASP.NET, and it simply manages contact information, like outlook. Contact information include first name, last name, birthday, etc. It also tracks address, phone number, and email. Here come the problem. We allow only one address, 4 phone numbers and an email for each contact. When we building the database table, should we create 6 fields to contain all the information or should we create address, phone, and email table and then create the relationship between them. Will the first method speed up the performance? Or the second method is the proper way? I need some pros and cons on each Thanks so much for your opinion Sam
helloI've a denormalized table PRODUCTS with following fields:ProductNo ,OrderNo ,SerialNo ,OrderDate ,PromiseDate ,ManufacturerID ,......DistributorID ,DealerID ,......ReceiptDate ,......I have to denormalize this table, so I created 3 tables:Table Name : ProductOrdersFields:+ProducrOrderID,ProductNo ,OrderNo ,SerialNo ,OrderDate ,PromiseDate ,ManufacturerID ,-------------------------------Table Name: ProductsOrdersDetailsFields:+ProductsOrdersDetailsID,ProductOrdersID,DistributorID ,DealerID......-----------------------Table Name: ProductsOrdersReceiptsFields:+ProductsOrdersReceiptsID,ProductsOrdersDetailsID,ReceiptDate ,......----------------------------DistributorID and DealerID appear in Details table because aparticular order number for a specific product number can come fromdifferent distributor and dealer.What I need is a query to populate these normalized tables from theoriginal denormalized Products table.Can any one please help?Thanks
I have decided to use Wufoo online forms so collect evaluation data from my clients. Each form will have different type of data.
For example
Form 1 might have: Client id, name, user id, comments
Form 2 might have: Client id, name, user id, address, comments, telephone number
Form 3 might have: Client id, name, user id, comments, date of birth, email address
Each form can have over 20+ data types which can be all different to other forms.My question is: What is the best way to store all the data into sql server without creating new columns of new data types everytime a new form is created in wufoo.
Normalization Question - Please bear with me, hopefully things will beclear at the end of the question.Given a treaty table containg treaties; Treaty1, Treaty2 etc and abenefit table; Benefit1, Benefit2 etc. A treaty can only have certainbenefits:-For example Treaty 1 can process Benefit1 and Benefit2.To maintain this relationship a new table TreatyBenefit has beencreated: -Treaty1Benefit1Treaty1Benefit2A further table called policy has been added. A treaty can have manypoliciesTreaty1Policy1Treaty1 Policy2A Policy can contain 1 or more benefitsPolicy1 Benefit1Policy1 Benefit2 etc.Giving structure as follows:-T - TB - B|PGiven the above, should there be a constraint between policy andTreatyBenefit or Policy and Benefit to enforce referential integrity.If so which constraint, if not what form of constraint / checkingshould I be using, to ensure that a Policy will contain the correctbenefits.ThanksAdrian
We need to store land title information about properties in variousAustralian states, but each state maintains it's own land titleregistry and use different columns (well actually differentcombinations of the same columns). For example:Victoria store:TorrensUnitTorrensVolumeTorrensFolioQueensland store:TorrensCountyTorrensLotTorrensPlanTorrensParishTorrensUnitTorrensVolumeTorrensTitleRefThere are 11 different columns and they are used in 8 differentcombinations depending on the state.Since we need to store information about land in different states I seetwo possible solutions:1. A sparse table containing the 11 columns with a CHECK constraint toenforce the valid combinations.2. A table for each state containing only the columns relevant to thestate with a foreign key relationship to the table containing thecommon columns.I'm not sure if the data type and length is consistent between statesyet (waiting to find this out) but assuming that it is which of theseapproaches is going to be the most rigorous? I'm leaning towards (2)but I don't like the feel of a table per state.
I have a form that I am building and I've run into this problem. There is a section on the form that checks if a person wants to sign up for a AM shift or PM shift for a Show. What I need to know is how do I normalize the form to reflect that in the database? Here is what the user would see on the web form.
Show AM Shift PM Shift -------------------------------------------------- Show #1 checkbox checkbox Show #2 checkbox checkbox Show #3 checkbox checkbox
Here is what I have so far in the table design:
Table Shows -------------------- ShowID int PK Name nvarchar(50) Not Null
Table Person -------------------- PersonID int PK FirstName nvarchar(50) Not Null LastName nvarchar(50) Not Null
Table Shift ---------------- ShiftID int PK Name bit Not Null