INNER JOIN Using Surrogate ID, Or [Date] BETWEEN?

Jul 20, 2005

{CREATE TABLEs and INSERTs follow...}

Gents,

I have a main table that is in ONE-MANY with many other tables. For example, if
the main table is named A, there are these realtionships:

A-->B
A-->C
A-->D
A-->E

With one field in Common (Person). The tables B, C, D and E are History tables,
with Start and End dates. Each person has a Program history (table B, ie), an
Experience history (table C, ie), and so on...many differernt types of
histories, and it may grow from here....table F, G, etc.

The included CREATE TABLEs and INSERTs contain tables A, B and C.

The problem: Each tblCase (table A) record has a date. When joining all of the
history tables to tblCase on Person, obviously you get a cross-product of each
history unless you specify a WHERE clause that extracts one single record from
each of the histories (duh...that's the point...to extract a single record from
each history, because there can only be one value in effect at the time of the
Case.)

QUESTION: From a performance standpoint, would it behoove me to maintain the
surrogate ***HistoryID from each history table in tblCase, or, assuming the
indexes are set up properly, would a WHERE condition for each history be
sufficient? For example, the following select works as expected:

SELECT CasePerson, CaseDate, ProCode, ExpYear
FROM tblExperienceHistory INNER JOIN (tblCase INNER JOIN tblProgramHistory ON
tblCase.CasePerson = tblProgramHistory.ProPerson) ON tblCase.CasePerson =
tblExperienceHistory.ExpPerson
WHERE CaseDate BETWEEN ProStartDate and ProEndDate
AND CaseDate BETWEEN ExpStartDate and ExpEndDate

It extracts the single record from each history for each person for each case.
But I'm afraid of performace with such a scenario.

Instead, I could store each ***HistoryID in the table tblCase, and then just
join on that...no WHERE needed. But the trade-off is that I'd have to build
processes to maintain that. ("Hey, when you insert a record into tblCase, make
sure to go get each HistoryID from the History tables!" or "If the user changes
the date ranges in one of histories, make sure to update tblCase to match the
new historyID!")

Maybe a clustered index on each ***History table on Person/StartDate combined
with the WHERE clause should perform as well as a real JOIN on surrogate
integers.

It seems cheesey to have to resort to surrogate IDs...but the performance
increase might be worth it. Also, if I go that route, whenever I add a new
history table, I'd have to change the design of tblCase AND any SPs that
reference it. With the WHERE solution, I'd only have to change the SPs.

Comments are welcome! (tblCase grows at 250,000 records per year; the history
tables will increase about 1000 records per year)

DCMFAN

CREATE TABLE [dbo].[tblCase] (
[CaseID] [char] (5) CONSTRAINT [PK_tblCase] PRIMARY KEY CLUSTERED NOT NULL ,
[CaseDate] [smalldatetime] NOT NULL ,
[CasePerson] [char] (5) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[tblExperienceHistory] (
[ExperienceHistID] [int] IDENTITY (1, 1) NOT NULL ,
[ExpPerson] [char] (5) NOT NULL ,
[ExpStartDate] [smalldatetime] NOT NULL ,
[ExpEndDate] [smalldatetime] NOT NULL ,
[ExpYear] [int] NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[tblProgramHistory] (
[ProgramHistID] [int] IDENTITY (1, 1) NOT NULL ,
[ProPerson] [char] (5) NOT NULL ,
[ProStartDate] [smalldatetime] NOT NULL ,
[ProEndDate] [smalldatetime] NOT NULL ,
[ProCode] [int] NOT NULL
) ON [PRIMARY]
GO

INSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])
VALUES('12345', '3/1/03', '00000')

INSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])
VALUES('A1G34', '4/23/03', '00001')

INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00000', '1/1/03', '5/19/03', 1)

INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00000', '5/20/03', '12/31/03', 2)

INSERT INTO [tblExperienceHistory]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00001', '4/20/03', '11/1/03', 0)

INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00000', '2/1/03', '9/30/03', '55555')

INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00000', '10/1/03', '5/1/04', '55555')

INSERT INTO [tblProgramHistory]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00001', '1/1/03', '12/31/03', '55555')

View 4 Replies

INNER JOIN Using Surrogate ID, Or [Date] BETWEEN?

What's Surrogate Key?

Surrogate Key

Surrogate Or Composite Key?

Surrogate Key Generation

Surrogate Key Generation

Normal Vs Surrogate/artificial Key?

Generating Surrogate Key Without IDENTITY

Surrogate Or Composite Primary Key?

Surrogate Key Population On A Select Into

Surrogate Key Population On A Select Into

Limit Join By Date

JOIN Onto Itself And Contains The Lower Date

How To Get Max Date With Join Tables

Duplicates Using MAX Date Self-Join

Surrogate Key As Parameter In Stored Procedure?

Need Help Updating Foreign Surrogate Keys

Updating Dimension With Foreign Surrogate Key

Scrpting To Genrate Surrogate Keys

How Can I Reset A Database Surrogate PK Using SSIS?

ReUse Common Surrogate Key Pipeline

Transact SQL :: Get Min Date Row From Join Table

Help With Sample Code For Ssis Surrogate Key Transform

DB Design :: How To Create A Surrogate Key (workID) In New Table

TABLE JOIN! DOES DATE FORMAT MATTER?

TABLE JOIN! DOES DATE FORMAT MATTER?

T-SQL (SS2K8) :: Join 2 Table Based On Date

Transact SQL :: Filtering Join Tables By Date

SQL Server 2014 :: Composite PK / Surrogate And Sequence Identity

Populating A Surrogate Key Inside Data Flow Task

DB Design :: Import And Stage Table Surrogate Keys

Select Unique Country / Date Pairs With Self Join

Data Warehousing :: Populating Fact Tables With Surrogate Key From Dimension Table?