DB Design :: How To Create A Surrogate Key (workID) In New Table
Jun 3, 2015
I want to change the work table name to work_version2 and later drop the work table. First, I created the table (work_version2) along with the data structure seen below and later inserted data from the work table. As I tried to make workID a surrogate key in work_version2 using SSMS, I got the below error message when I try to save the changes. Is there a way to do this?
Saving changes is not permitted. The changes you have make requires the following tables to be dropped and re-created. You have either make changes to a table that cant't be recreated or enabled the option that prevent saving changes that requires the table to be recreated. Work_version2.
CREATE
TABLE WORK(
WorkID Int NOT
NULL IDENTITY (500,1),
Title Char(35) NOT
NULL,
Copy Char(12) NOT
I want to create an import table for daily rows with an integer column like 20150430 for the date, called DayKey. This table would do one date per day. It would then be imported into a STAGE table which would have the same columns and would have all of the import rows for every day.My question would be this: I want to be able to have an integer Primary Key unless there is a better idea. I could make the STAGE table use an auto-incremented value for the key. Then, when I load the import table which is truncated every day, I could take the NEXT value of the key from the STAGE table and increment by 1.
Let's say the last value in STAGE is 1000, then the next value that would be in IMPORT would be 1001 and incrementing up. Then these would be added to the STAGE table with the associated keys. There is no chance of anyone or anything else adding to the STAGE table any other way.
I am trying a create views that would join 2 tables:
Table 1: Has all the columns need by a view ( Name: Product Structure: ID, Attribute 1, Attribute 2, Attribute 3, Attribute 4, Attribute 5 etc Table 2: Is a lookup table that provides the names of columns Name: lookupTable Structure: tableName, ColumnName, columnValue Values: Product, Attribute1, Color Product, Attribute2, Size Product, Attribute3, Flavor Product, Attribute4, Shape
I desire to have a clustered index on a column other than the Primary Key. I have a few junction tables that I may want to alter, create table, or ...
I have practiced with an example table that is not really a junction table. It is just a table I decided to use for practice. When I execute the script, it seems to do everything I expect. For instance, there are not any constraints but there are indexes. The PK is the correct column.
CREATE TABLE [dbo].[tblNotificationMgr]( [NotificationMgrKey] [int] IDENTITY(1,1) NOT NULL, [ContactKey] [int] NOT NULL, [EventTypeEnum] [tinyint] NOT NULL,
Hi, I use lookups to map surrogate of level 1 dimensions to my fact tables in SSIS. But how to handle a level 2 dimension with a ValidFrom and a ValidUntil date field? I do not use an IsCurrent column, because this could problem with late arriving facts.
- In dts I used an SQL statement like this:
update SA SET SA.DimProdRef = Dim.RecordID FROM SAWarenEingang SA, DimProd Dim where SA.ProduktNumber = Dim.ProduktNumber and SA.ArtikelkontoBewegungsdatum between Dim.ValidFrom and Dim.ValidUntil
Now in SSIS I want to handle the whole thing in the data flow without using a staging table: - Using Lookups: I would have to pass the date column for each inside the fact table into the lookup. That does not work. - Using Execute SQL in the data flow: would be very slow, because the statement will be executed for any line in the dataflow
My question here is as the dimension has SCD type 2 on it and every time when there is a change the persn_key gets a new key value but the fact table still points to oldest key.how to update the surrogate key on fact table to the current key value? As per the requirement fact surrogate key must be pointing to current active record on the dimension.
I would like to create a table called product. My objective is to get list of packages available for each product in data grid view column while selecting each product. Each product may have different packages type (eg:- Nos, CTN, OTR etc). Some product may have two packages and some for 3 packages etc. Quantity in each packages also may be differ ( for eg:- for some CTN may contain 12 nos or in other case 8 nos etc). Prices for each packages also will be different that also need to show. How to design the table..
Product name : Nestle milk | Rainbow milk packages : CTN,OTR, NOs |
CTN, NOs Price: 50,20,5 | 40,6
(Remarks for your reference):CTN=10nos, OTR=4 nos | CTN=8 Nos
I have table which having clustered index based on column (A,B,C,D,E,F).Now my query based on B,D,F. e.g: where b='Test1' and D='test2' and F='test3' Now Execution plan ask to create non clustered index with (B,D,F) column.is it make any sense to create non clustered index where clustered already available.
Is it possible to create diagram from existing database? Need to understand the schema of database which has no documentation. So would like to create diagram for that database.
Is it possible to create a database which design or data is not needed to update and i want to set to it as a READ ONLY. what are the steps to create such type of database?
I want to create a stored procedure for displaying working Saturdays. For example... last working Saturday 08/08/2015 (dd/mm/yyyy) is working day, I want to get next working Saturday as 22/08/2015 (dd/mm/yyyy). Likewise, I want to show for particular year
We have a vendor created database with 9000+ tables, one of which has about 6 billion rows. The vendor redesigned the database recently and ever since we've had terrible performance.
What the vendor did was increase any and all varchar columns (tens of thousands of columns) to 256.
Before the upgrade we had no problems creating an index on the 6billion row table, it would take 2 hours.
Now after the upgrade we've let the index creation command run for 5 days and killed it because it was consuming terabytes of logspace.
The previous design had combined column width of 1049 to what is now over 4000. The primary key itself is 1283 characters (SQL limit is 900).
There is no additional data, just wider columns. Why we are unable to create the index?
What is happening inside SQL Server? Does SQL make "room" in memory for the index for the entire width of the potential max row length?
hi! when i read some reference books about the SQL7.0, i often met 'surrogate key'. what's the surrogate key? what's its funtion? could you give me a good example? thanks very much!
I have a text file which needs to be created into a table (let's call it DataFile table). For now I'm just doing the manual DTS to import the txt into SQL server to create the table, which works. But here's my problem....
I need to extract data from DataFile table, here's my query:
select * from dbo.DataFile where DF_SC_Case_Nbr not like '0000%';
Then I need to create a new table for the extracted data, let's call it ExtractedDataFile. But I don't know how to create a new table and insert the data I selected above into the new one.
Also, can the extraction and the creation of new table be done in just one stored procedure? or is there any other way of doing all this (including the importation of the text file)?
The orininal design of my db (part of it...) is the following
A JOB has a Number and a Description. Each JOB can have one or two TASKS (min one, max two). Each TASK is identified by the JOB it belongs to and an Index (unique only for the same JOB). Each TASK has one an only one set of INFO1, one and only one set of INFO2, one and only one set of INFO3 etc.
(There is a reason to keep INFO1, 2 and 3 separate, because eachof them will be linked to different table. This might influence the answer to my real question.)
First of all, I wouldn't add any surrogate key for TASK, not to loose the logic behind; plus I'd put an ined on JonMum only, being Index equal to 1 or 2 only, so not selective.
The real question is about INFO1 (and 2, 3 etc.) table: should I leave JobNum and Index as PK (consider that the PK of INFo1 will be used as FK for another table), or should I use a surrogate key, like for eaxmple
C: INFO1 (Info1ID [PK], JobNum [FKb], Index [FKb], ...)
I don't really like this solution. Actually I'd prefer the following
C: INFO1 (Info1ID [PK], ...)
where Info1ID = JobNum + Index (+ = string concatenation).
We need to Insert/Update a Fact Table from staging Table. currently we are using a SP which update Fact Table for Each region. this process is schedule, every 5 min job is run and Update fact table.but time of Insert and Update too long from staging to Fact, currently we are using merge statement for Insert and update.in my sp we are looping number how many region we need to update and at a time single Region we are updating using while loop in current SP.
I have a requirement of table partitioning. we have 10 years of data on a table which is 30 billion up rows on 2005 server we are upgrading it to 2014. we have to keep 7 years of data. there is no keys on table or date column. since its a huge amount of data and many users its slow down the process speed. we are thinking to do partition on 7 years for Quarterly based. but as i said there is no date column on table we have to use reference table to get date. is there a way i can do the partitioning with out adding date column on table? also does partition will make query faster?
I have think three ways to do it. 1. leave as it is. 2. 7 years partition on one server 3. 3 years partition on server1 and 4 years partition on server2 (for 4 years is snapshot better?)
Hey All, I'm trying to decide what's the 'best' to use. I've been designing and creating database for a while and have pretty much always used a surrogate key and not a normal one. I've finally had some free time to start studying more so in my spare time and read up and come accross a lot of guides, articles and stories that tout that normal keys should be used whenever possible as they're a better identifier and that surrogate keys should only be used when there is not a readily available normal key. Now perhaps I'd be open to accepting that but absolutely every database I come across tends to only use surrogate keys. For example I'm doing an authentication system from scratch and am looking at the User table. Now of course the user name has to be unique, should that be the primary key or should I have a seperate column with a guid or an incrementing int or the like as the primary key? I can certainly see that username could be used. I can also see how it may be easier when looking through the data tables to identify who/what a table is refering to with a surrogate key. However it still seems sort of sloppy, for lack of a better word, to me. Where now I could have somebody's username (or any other piece of data used for this purpose) spread accross a lot of other tables. And while writting this I just thought of the scenario that perhaps somebody needs their username changed, with this method now the ids need to be changed on all the related rows of all the other tables whereas with a surrogate key it wouldn't matter. Anyways I'm mostly looking for opinions on which way to go (not just with the user sample, but more in general).Thanks.
Hello I'm looking for a way of generating the next key value that works in MS and Sybase SQL Servers. Sybase identity columns are a bit dodgy, so...
If I have a separate table NextKey (NextKey int) with one row that I update as follows...
declare @NextKey int update NextKey set NextKey = NextKey + 1, @NextKey = NextKey + 1 insert into myTable (PrimaryKeyCol, ....) values (@NextKey, ....)
are there any problems with concurrency ? As I see it the update will lock the row so different connections will always come up with a different @NextKey value....
My previous post was not really clear, so I'll try again with a (hopefully) better (even if longer) example...
Consider the following...
A JOB describes the processment of a document. Each document can exist in two versions: English and French. A JOB can have 1 or 2 TASK, each describing the processement of either the English or French version. So we have the following:
that is there is an identifying 1:M (where maxium allowed for M is 2) relationship between JOB and TASK; TASK being identified by JobNum and Version (where the domain for Version is {E, F}).
Each TASK may require a TRANSLATION sub_task. Each TASK may require a TYPING sub_task. Each TASK may require a DISTRIBUTION sub_task.
For example, for a given doc, the English TASK requires TRANSLATION and DISTRIBUTION, while the French only DISTRIBUTION.
That is, there is a 1:1 not-required relationship between TASK and TRANSLATION, TYPING and DISTRIBUTION. So we have the following:
C: TRANSLATION (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...) D: TYPING (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...) E: DISTRIBUTION (JobNum [PK] [FKb], Version [PK] [FKb], Copies, ...)
As you can see I am using the PK of TASK as FK and PK for each of the three SUB_TASKs.
To complicate things, each SUB_TASK has one or more assignments. The assignments for each SUB_TASK records different information from the others. So we have...
C: TRANSLATION (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...) D: TYPING (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...) E: DISTRIBUTION (JobNum [PK] [FKb], Version [PK] [FKb], Copies, ...)
F: TRA_ASSIGN (JobNum [PK] [FKc], Version [PK] [FKc], Index [PK], Translator, ...) G: TYP_ASSIGN (JobNum [PK] [FKd], Version [PK] [FKd], Index [PK], Typyst, ...) H: REP_ASSIGN (JobNum [PK] [FKe], Version [PK] [FKe], Index [PK], Pages, ...)
that is there is an identifying 1:M relationship between each SUB_TASK and its ASSIGNMENTs, each ASSIGNMENT being identified by the SUB_TASK it belongs to and an Index.
I wish I could send a pic of the ER diagram...
Maybe there is another and better way to model this: if so, any suggestion?
Given this model, should I use for TRANSLATION, TYPING and DISTRIBUTION a surrogate key, instead of using the composite key, like for example:
C: TRANSLATION (TranslationID [PK], JobNum [FKb], Version [FKb], DueDate, ...) D: TYPING (TypingID [PK], JobNum [FKb], Version [FKb], DueDate, ...) E: DISTRIBUTION (DistributionID [PK], JobNum [FKb], Version [FKb], Copies, ...)
this will "improve" the ASSIGNMENTs tables:
F: TRA_ASSIGN (TranslationID [PK] [FKc], Index [PK], Translator, ...) G: TYP_ASSIGN (TypingID [PK] [FKd], Index [PK], Typyst, ...) H: REP_ASSIGN (DistributionID [PK] [FKe], Index [PK], Pages, ...)
I could even go further using a surrogate key even for TASK, which leads me to the following:
F: TRA_ASSIGN (TaskID [PK] [FKc], Index [PK], Translator, ...) G: TYP_ASSIGN (TaskID [PK] [FKd], Index [PK], Typyst, ...) H: REP_ASSIGN (TaskID [PK] [FKe], Index [PK], Pages, ...)
I don't really like this second solution, but I'm still not sure about the first solution, the one with the surrogate key only in the SUB_TASks tables.
I am performing a Select Into from a #table into a real table that has a surrogate key. If this is in a transaction (or not in one) am I guaranteed that the records inserted will be sequential surrogate key ids?
Select * into REALTABLE from MYPOUNDTABLE --40 rows
Can I assume that if the first one inserted is id 32 that the last one is 72?
{CREATE TABLEs and INSERTs follow...}Gents,I have a main table that is in ONE-MANY with many other tables. For example, ifthe main table is named A, there are these realtionships:A-->BA-->CA-->DA-->EWith one field in Common (Person). The tables B, C, D and E are History tables,with Start and End dates. Each person has a Program history (table B, ie), anExperience history (table C, ie), and so on...many differernt types ofhistories, and it may grow from here....table F, G, etc.The included CREATE TABLEs and INSERTs contain tables A, B and C.The problem: Each tblCase (table A) record has a date. When joining all of thehistory tables to tblCase on Person, obviously you get a cross-product of eachhistory unless you specify a WHERE clause that extracts one single record fromeach of the histories (duh...that's the point...to extract a single record fromeach history, because there can only be one value in effect at the time of theCase.)QUESTION: From a performance standpoint, would it behoove me to maintain thesurrogate ***HistoryID from each history table in tblCase, or, assuming theindexes are set up properly, would a WHERE condition for each history besufficient? For example, the following select works as expected:SELECT CasePerson, CaseDate, ProCode, ExpYearFROM tblExperienceHistory INNER JOIN (tblCase INNER JOIN tblProgramHistory ONtblCase.CasePerson = tblProgramHistory.ProPerson) ON tblCase.CasePerson =tblExperienceHistory.ExpPersonWHERE CaseDate BETWEEN ProStartDate and ProEndDateAND CaseDate BETWEEN ExpStartDate and ExpEndDateIt extracts the single record from each history for each person for each case.But I'm afraid of performace with such a scenario.Instead, I could store each ***HistoryID in the table tblCase, and then justjoin on that...no WHERE needed. But the trade-off is that I'd have to buildprocesses to maintain that. ("Hey, when you insert a record into tblCase, makesure to go get each HistoryID from the History tables!" or "If the user changesthe date ranges in one of histories, make sure to update tblCase to match thenew historyID!")Maybe a clustered index on each ***History table on Person/StartDate combinedwith the WHERE clause should perform as well as a real JOIN on surrogateintegers.It seems cheesey to have to resort to surrogate IDs...but the performanceincrease might be worth it. Also, if I go that route, whenever I add a newhistory table, I'd have to change the design of tblCase AND any SPs thatreference it. With the WHERE solution, I'd only have to change the SPs.Comments are welcome! (tblCase grows at 250,000 records per year; the historytables will increase about 1000 records per year)DCMFANCREATE TABLE [dbo].[tblCase] ([CaseID] [char] (5) CONSTRAINT [PK_tblCase] PRIMARY KEY CLUSTERED NOT NULL ,[CaseDate] [smalldatetime] NOT NULL ,[CasePerson] [char] (5) NOT NULL) ON [PRIMARY]GOCREATE TABLE [dbo].[tblExperienceHistory] ([ExperienceHistID] [int] IDENTITY (1, 1) NOT NULL ,[ExpPerson] [char] (5) NOT NULL ,[ExpStartDate] [smalldatetime] NOT NULL ,[ExpEndDate] [smalldatetime] NOT NULL ,[ExpYear] [int] NOT NULL) ON [PRIMARY]GOCREATE TABLE [dbo].[tblProgramHistory] ([ProgramHistID] [int] IDENTITY (1, 1) NOT NULL ,[ProPerson] [char] (5) NOT NULL ,[ProStartDate] [smalldatetime] NOT NULL ,[ProEndDate] [smalldatetime] NOT NULL ,[ProCode] [int] NOT NULL) ON [PRIMARY]GOINSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])VALUES('12345', '3/1/03', '00000')INSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])VALUES('A1G34', '4/23/03', '00001')INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],[ExpYear])VALUES('00000', '1/1/03', '5/19/03', 1)INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],[ExpYear])VALUES('00000', '5/20/03', '12/31/03', 2)INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],[ExpYear])VALUES('00001', '4/20/03', '11/1/03', 0)INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],[ProCode])VALUES( '00000', '2/1/03', '9/30/03', '55555')INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],[ProCode])VALUES( '00000', '10/1/03', '5/1/04', '55555')INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],[ProCode])VALUES( '00001', '1/1/03', '12/31/03', '55555')
I am performing a Select Into from a #table into a real table that has a surrogate key. If this is in a transaction (or not in one) am I guaranteed that the records inserted will be sequential surrogate key ids?
Select * into REALTABLE from MYPOUNDTABLE --40 rows
Can I assume that if the first one inserted is id 32 that the last one is 72?
I'm trying to create a proc for granting permission for developer, but I tried many times, still couldn't get successful, someone can help me? The original statement is:
Can I dynamically (from a stored procedure) generatea create table script of all tables in a given database (with defaults etc)a create view script of all viewsa create function script of all functionsa create index script of all indexes.(The result will be 4 scripts)Arno de Jong,The Netherlands.
Now, I like #1 because the implementation is simple -- the calling code simply passes an author name, and a country id and an INSERT INTO statement is called with those parameters
INSERT INTO authors( @authorName, @countryId )
I like #1, because it hides the surrogate "id" key from the application calling code. But on the downside, it has more overhead work, because you have to first a) verify a country with that name exists, and b) select that id into a variable.
DECLARE id INT; IF EXISTS (select * from countries where country_id = @countryId ) THEN SELECT country_id INTO id FROM countries WHERE country_name = @countryName; END IF;
(Sorry I may have the SQL syntax wrong up there, but I was just trying to demonstrate the extra overhead involved).
I am in the process of building a fact table in a staging area. The data in the host system has numerous composite keys, so I have replaced all the composite keys in the dimensions with surrogate keys (integer) which are generated using an identity at load time. When I load the staging (fact) table, I have set the default value of all the foreign keys to 0. What I must do now is update all the foreign key values with the surrogate key values from the dimensions. I'm using an update command and the original gid values from the source system in the where clause...i.e. UPDATE X SET x.key_1 = y.key_1 FROM TableA X WITH (NOLOCK) INNER JOIN TableB Y WITH (NOLOCK) ON x.org_id = y.org_id AND x.bus_id = y.bus_id AND x.prov_gid = y.prov_gid AND x.log_gid = y.loc_gid;
This seems to work fine for most tables. However, I am now trying to update a table that has over 10 million rows and approximately 30 foreign keys. The script runs for hours. I ususally stop it after about 8 hours when it still hasn't completed. Since the keys are dynamic and they could possibly change during each load process, I can't add them during the load process.
Is there a better way to update these keys. I need to regenerate the fact tables every night and taking this much time to reload a fact table is just not practicle. I've indexed the alternate keys on all the dimensions and have also indexed the gids on the target fact table. Am I doing something wrong? Have I over indexed the target table? Please help! Thanks Jerry