Extracting And Joining Header From Denormalized Table
Dec 1, 2005
Hello,
I am currently working on a monthly load process with a datamart. I
originally designed the tables in a normalized fashion with the idea
that I would denormalize as needed once I got an idea of what the
performance was like. There were some performance problems, so the
decision was made to denormalize. Now the users are happy with the
query performance, but loading the tables is much more difficult.
Specifically...
There were two main tables, a header table and a line item table. These
have been combined into one table. For my loads I still receive them as
separate files though. The problem is that I might receive a line item
for a header that began two months ago. When this happens I don't get a
header record in the current month's file - I just get the record in
the line items file. So now I have to join the header and line item
tables in my staging database to get the denormalized rows, but I also
may have to get header information from my main denormalized table
(~150 million rows). For simplicity I will only include the primary
keys and one other column to represent the rest of the row below. The
tables are actually very wide.
CREATE TABLE dbo.Claim_Lines (
CLM_ID BIGINT NOT NULL,
LN_NO SMALLINT NOT NULL,
CLM_LN_DATA VARCHAR(100) NULL )
Target database:
CREATE TABLE dbo.Target (
CLM_ID BIGINT NOT NULL,
LN_NO SMALLINT NOT NULL,
CLM_DATA VARCHAR(100) NULL,
CLM_LN_DATA VARCHAR(100) NULL )
I can either pull back all of the necessary header rows from the target
table to the claims table and then do one insert using a join between
claims and claim lines into the target table OR I can do one insert
with a join between claims and claim lines and then a second insert
with a join between claim lines and target for those lines that weren't
already added.
Some things that I've tried:
INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT DISTINCT T.CLM_ID, T.CLM_DATA
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID = CL.CLM_ID
INNER JOIN Target.dbo.Target T ON T.CLM_ID = CL.CLM_ID
WHERE C.CLM_ID IS NULL
INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT T.CLM_ID, T.CLM_DATA
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID = CL.CLM_ID
INNER JOIN Target.dbo.Target T ON T.CLM_ID = CL.CLM_ID
WHERE C.CLM_ID IS NULL
GROUP BY T.CLM_ID, T.CLM_DATA
INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT DISTINCT T.CLM_ID, T.CLM_DATA
FROM Target.dbo.Target T
INNER JOIN (SELECT CL.CLM_ID
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID =
CL.CLM_ID
WHERE C.CLM_ID IS NULL) SQ ON SQ.CLM_ID = T.CLM_ID
I've also used EXISTS and IN in various queries. No matter which method
I use, the query plans tend to want to do a clustered index scan on the
target table (actually a partitioned view partitioned by year). The
number of headers that were in the target but not the header file this
month was about 42K out of 1M.
So.... any other ideas on how I can set up a query to get the distinct
headers from the denormalized table? Right now I'm considering using
worktables if I can't figure anything else out, but I don't know if
that will really help me much either.
I'm not looking for a definitive answer here, just some ideas that I
can try.
I have a program where I have to do a weekly data upload of approx 1,500 records. I've written the sql, but need to know how I can add static header text to the SQL statement.the text I need in the header is Reportname(Static string) + #rows (Rowcount of the sql, int) + department (String static) I'm planning on writing it and saving it as a package, then schedule it to run every Friday at 5:00. But I have been reading a little on Reporting Services and wonder if people think it' might be the way to go? Is it a whole new area of security issues? Or is it worth installing and learning? I will have more and more reports I need to write that will increase in complexity, and want to know if I want to keep writing querys by hand, or use reporting services. Thanks in advance! Dan
I have a de-normalized table that I need to export to XML using For XML, but put all of the related rows under the same node.The table is alot more complicated than the example below, but for proof of concept purposes, i'll keep it really simple:
Is there an existing option that deals with this automatically, or do I essentially need to do a group by to output the campaign element, and then union an ungrouped select to output the price element?
I'm building a system that imports data from several source, Excel files, text files, Access databases, etc. using DTS. The entire process revolved around MS SQL Server, by the way.
I figured I would create denormalized tables that mirror the Excel and flat files, for example, in structure, import data to those, clean up and remove duplicates there, then break those out into my normalized table structure later.
Now I've finished the importing part (though this is going to happen once a week) and I'm onto breaking up the denormalized tables.
I'm hesitating because I'm not sure I've made the best decisions in terms of process, etc.
I've decided to use cursors to loop over the denormalized tables and use batch insert statements to push data out to the appropriate tables.
Any comments? Suggestions? All is welcome.
I'm specifically interested in hearing back on the way I've set up the intermediate, denormalized tables and how I'm breaking them up using cursors (step 2 of the process below). Still, all comments are welcome. As are suggestions for further reading.
Thanks again...
simplified example (my denormalized tables are 20 - 30 colums wide)
denormalized table: =================== name, address, city, state, cellphone, homephone
I'm breaking up the denormalized tables like this (*UNTESTED*): =================================================
DECLARE @vars.... (one for each column in my normalized table structure, matching size and type)
DECLARE myCursor CURSOR FAST_FORWARD FOR SELECT name, address, city, state, cellphone, homephone FROM _DNT_myWideTable INTO
WHILE @@Fetch_Status = 0 BEGIN -- grab the next row from the wide table FETCH NEXT FROM myCursor INTO @name, @address, @city, @state, @cellphone, @homephone
-- create the person first and get the ID with @@IDENTITY INSERT INTO tblPerson (name) VALUES (@name)
SET @personID = @@IDENTITY
-- use that ID to coordinate inserts across other tables INSERT INTO tblAddress (FK_person, address, city, state, addressType) VALUES(@person, @address, @city, @state, 'HOME')
INSERT INTO tblContact (FK_person, data, contactType) VALUES(@person, @cellphone, 'CELLPHONE')
INSERT INTO tblContact (FK_person, data, contactType) VALUES(@person, @homephone, 'HOMEPHONE')
I am pretty new to SSIS, so please excuse me if this is a trivial question.
I have a denormalized database table in an Access database that I need to import into several different tables in a SQL 2005 database. You can think of the Access table as a CustomerOrders table. For example customer related information (i.e. CustomerName, CustomerID, etc...) is repeated with each record in the Access table. When this data gets moved to the SQL 2005 database, I need to insert one record for each distinct CustomerName/Customer ID record into a Customers table. I then need to insert and link every "Order" record into an "Orders" table.
I am sure that this is probably a pretty common task, but I have not found any examples or articles explaining this particular situation. What ways can this be done?
I was thinking I need to loop through each DISTINCT Customer record in the Access (source) table and insert a Customer record into the destination database's Customer table. I would then need to iterate through each row of the Access (source) table and "Lookup" the appropriate CustomerID/Key Field and insert an "Order" record.
The Access table contains over 75,000 rows of data. I am looking for the most appropriate way of doing this with SSIS (so that I don't have to write a custom application to do this!). Any help, input, links, articles, etc. is appreciated!!
create table temp ( Â Â date1 datetime, Â Â category varchar(3), Â Â amount money
[code]....
In above format:15-10 means Oct 15, 15-09 means Sept 15I am getting issues while formatting this in SSRS.I need SSRS table headers to be dynamic. I need that it should be according to the sql table header...The table header should change according to the sql query
I have a report that I created and the report was working until I added some fields to a group footer row in a table.
My table has 5 group levels. I had information displaying in the 5th level header group and detail. It was working fine. Then I added some fields to the 4th level group footer. Now it displays only the Page header, Table header, and the 4th level group footer data.
What happened to the rest of the data?
All the cells and rows I want to display have the Visibility Hidden set to false. I tried removing the objects I added (to the 4th level group footer) and it still does not work. Is this a bug or did I set something that is hiding the data.
I have two tables - a stock table and a transaction table. The stock table is called stock and the transaction table is called trans. The table make a point of sale database. Both table have a joining field called prod_no.
I want to extract the stock line by querying stock.prod_no that are not present in the trans table for a period of time specified.
Basically, I want to find dead stock lines that have not been sold and, therefore, are not in the transaction table but that have a stock figure. I have to enter in a start and end date paramater using the ddate field in the trans table. I have looked at left, right and outer joins but with no luck and don't even know if this is the correct query type to use.
I want to see all the product line provided they are there in the table or not, i.e.
Product Line Product Line Description Total Usage Quantity
10000 Raw Material 0
20000 Intermediate Product 0
30000 Bearing 2713321
32000 High Strength 4258197
34000 High Temp 0
36000 Corrosion Resistant 639492
50000 High Speed 1452153
52000 Die Steel Hot Work 1614727
54000 Die Steel Cold Work 464943
88000 Misc 0
90000 Conversion 0 This is the result i am looking out for now the problem is that those record are not there in table so naturally they will not be shown in the result. Can any one help me modifying below query? so that i can get the aforesaid result
Code Block
SELECT [PRODL] as 'Product Line' , (case when prodl = 10000 then 'Raw Material' when prodl = 20000 then 'Intermediate Product' when prodl = 30000 then 'Bearing' when prodl = 32000 then 'High Strength' when prodl = 34000 then 'High Temp' when prodl = 36000 then 'Corrosion Resistant' when prodl = 50000 then 'High Speed' when prodl = 52000 then 'Die Steel Hot Work' when prodl = 54000 then 'Die Steel Cold Work' when prodl = 88000 then 'Misc' when prodl = 90000 then 'Conversion' end) as 'Product Line Description'
,sum(case when [PRODL] = 10000 then [Usage Qty] when [PRODL] = 20000 then [Usage Qty] when [PRODL] = 30000 then [Usage Qty] when [PRODL] = 32000 then [Usage Qty] when [PRODL] = 34000 then [Usage Qty] when [PRODL] = 36000 then [Usage Qty] when [PRODL] = 50000 then [Usage Qty] when [PRODL] = 52000 then [Usage Qty] when [PRODL] = 54000 then [Usage Qty] when [PRODL] = 88000 then [Usage Qty] when [PRODL] = 90000 then [Usage Qty] end) as 'Total Usage Quantity'
FROM [LATCUBDAT].[dbo].[RMU] group by prodl order by prodl
Result of The Aforesaid Query
Product Line Product Line Description Total Usage Quantity
My ERP system (SAP Business One) does not have a "payment type" field, instead it has a field for the amount of each type of payment. It makes for annoying reporting.
If I have the following data in the Incoming Payments table (ORCT)
SELECTT0.DocEntry, CASE WHEN T0.CashSum <> 0 THEN 'Cash' WHEN T0.CreditSum <> 0 THEN 'Credit' WHEN T0.CheckSum <> 0 THEN 'Check' WHEN T0.TrsfrSum <> 0 THEN 'Transfer' END AS [Type], CASE WHEN T0.CashSum <> 0 THEN T0.CashSum WHEN T0.CreditSum <> 0 THEN T0.CreditSum WHEN T0.CheckSum <> 0 THEN T0.CheckSum WHEN T0.TrsfrSum <> 0 THEN T0.TrsfrSum END AS [Total] FROMORCT T0
I think I've just been staring at this too long and wanted some fresh eyes to think outside of the box on this one.
I am using this to report on our customer's monthly statements so that I can display "12/13/06 Wire Transfer $100". There are many more complications to this query such as currencies and which document number or reference to display, but I have figured out most of those on my own.
I am using the following to extract the column names of a table. I would like to do this for the whole database. Currently I am cutting the results into an excel spread. Is there a better way of doing this? Here is the query
SELECT name FROM syscolumns WHERE [id] = OBJECT_ID('tablename')
My data source has some columns I have to 'translate' first and then insert into my destination table.
Example Source data: key size height 1 'Small' 'Tall' 2 'Big' 'Short' has to become 1 'Y' 'P' 2 'N' 'D'
I thought of creating a lookup table (I'm talking about the real table, not a lookup transformation table) that would have these columns: column name, value_source, value_dest Example: col_name vl_source vl_dest size 'Small' 'Y' size 'Big' 'N' height 'Tall' 'P' ... and so on, I believe you get the point
How would you extract the needed values? Can I use a select statement in a derived column expression? Any ideas? I'm not really fond of the idea to create n lookups, one for each column I have to translate, is there a slicker solution?
I have just inherited a new project consisting of data imported into sql 2005 from a multi-dimensional database. After finding the correct ODBC and importing the data I believed that I was done, but after reviewing the resulting structure I discovered why this was called a €œmulti-dimensional€? database. The resulting imported data is completely de-normalized and resembles an excel spreadsheet more than a relational database. An example of this structure is the persons table. The table has multiple columns, some of which contain the multi-dimensional fields. These fields contain multiple values which are separated with a tilde, €œ~€?. I need to parse out the data and normalize it. In the specific sample of data below I attempting to take the personid, associates, and assocattrib and insert them into a sql table, associates. This table references the persons table where the personid and the associates references the personid in the persons table.
CREATE TABLE [dbo].[associates]( [associd] [int] NOT NULL REFERENCES persons(personid), [namepkey] [int] NOT NULL REFERENCES persons(personid), [assocattribute] [varchar](20) NULL )
The purpose of normalizing this data will be to show the realationship(s) between people as it has been documented in the previous data structure, i.e. person 1 is an associate of person 336 and the attribute is WIT.
My problem lies in attempting to parse out the associates and assocattrib columns and relate them to the appropriate personid. The personid relates to each associate and assocattrib and the tilde, ~, separates the values ordinal position which, in sql, would be separate rows. For example the single row: personid associates assocattrib 58201 252427~252427~252427 VICT/SUSP~WIT~RP Should be: 58201 252427 VICT/SUSP 58201 252427 WIT 58201 252427 RP
The imported data can have no associates: personid associates assocattrib 152683 NULL NULL
or up to 69 associates, I am not even going to try to paste all of that here.
There are over 400,000 rows that I am looking at parsing, so any way of doing this in t-sql or the clr would be GREAT. This data is stored in SQL 2005 standard SP2. The specific machine is our test/reporting server, so I am not necessarily concerned with the best performing solution, I am leaning more towards providing some free time for me.
Any help or suggestions, including better ideas on the table structure, would be greatly appreciated.
I run into a problem when asking to show a query of employee vacation days.
table 1: column1 Â is dates e.g. 2015-01-01 2015-01-02 2015-01-03Â . . . 2015-12-31
table2: employeeID vacation_date Tom 2015-01-03 Tom 2015-01-04 David 2015-01-04 John 2015-01-08 Mary 2015-01-012
My query output need to be:
2015-01-01 2015-01-02 2015-1-03 Tom 2015-01-04 Tom 2015-01-04 David 2015-01-05 2015-01-06 2015-01-07 2015-01-08 John 2015-01-09 2015-01-10 2015-01-11 2015-11-12 Mary
... etc... all the way to 2015-12-31
when i use left outer join, i only record one employee per date.
Hello. I have a somwhat simple question. I have a table inside my database where i have some columns. Id like to extract distinct values from one column and inser them into another. My table is named article and id like to slect all the destinct values from the column naned type inside my article table. Then i need to insert my values into a new table called type but i would also like the to have 2 columns inside my type table. 1 called counter witch is an auto increment PK, and another one named type where the results from my query would be inserted. Iv tried a veriety of querys but none of them have managed to do this. Could anyone help me construct this query?
I have been placed in the position of administering our SQL server 6.5 (Microsoft). Being new to SQL and having some knowledge of databases (used to use Foxpro 2.6 for...DOS!) I am faced with an ever increasing table of incoming call information from our Ascend MAX RAS equipment. This table increases by 900,000 records a month. The previous administrator (no longer available) was using a Visual Foxpro 5 application to archive and remove the data older than 60 days. Unfortunately he left and took with him Visual Fox and all of his project files.
My question is this: Is there an easy way to archive then remove the data older than 60 days from the table? I would like to archive it to a tape drive. We need to maintain this archive for the purposes of searching back through customer calls for IP addresses on certain dates and times. We are an ISP, and occasionally need to give this information to law enforcement agencies. So we cannot just delete it.
I am trying to extract what I believe is called a Node from a column in a table that contains XML. What's the best way to do this? It is pretty straightforward since I'm not even looking to include a WHERE clause.
What I have so far is:
SELECT CCH.OrderID, CCH.CCXML.query('/cc/auth') FROM tblCCH AS CCH
Hello,I'm creating an audit table and associated triggers to be able to captureany updates and deletes from various tables in the database. I know how tocapture the records that have been updated or deleted, but is there any waythat I can cycle through a changed record, look at the old vs new values andcapture only the values that have changed?To give you a better idea of what I'm trying to do, instead of creating acopy of the original table (some tables have many fields) and creating awhole record if a type or bit field has been changed, I'd like to onlycapture the change in a single audit table that will have the followingfields;AuditID int INDENTITY(1,1)TableName varchar(100)FieldName varchar(100)OldValue varchar(255)NewValue varchar(255)AuditDate datetime DEFAULT(GetDate())Any direction would be greatly appreciated.Thanks!Rick
I want to calculate the target based on Flag value if Flag value is "Y" ....than MAX(Customer Target) else MAX(SLA target).Flag column contains "Y" , "N" and some blank values . Flag, Customer Target and SLA target are the columns in Table1. I have used the below formulas
I am looking for an alternate logic for below-mentioned code where I am inserting into a table and having left join with the same table
insert TABLE1([ID],[Key],[Return]) select distinct a.[ID],cat1,cat2Â from (select ID,[Key] Cat1 ,[Return] cat2 from @temp as temp) a left join TABLE1 oon a.ID= o.ID and a.Cat1 = o.[Key] and a.cat2 = o.[return] where [key] is null order by ID
I wanted to know that when I create a table using SQL Server, does the table metadata like schemaname and catalogname is stored in memory or in hard-disc. I am trying to extract schemaname and catalogname when the database is down or correctly when I have no connection string. I know about sysindexes, systables etc, but i am not able to understand whether it stores the data which is useful to me like catalogname.
sql query to extract the column names from sysindexes?
I have tables Companies, CompaniesDetails (the company branches), Addresses and Companies_Addresses.
The addresses table contain street and city while the Companies_Addresses has the keys for both companies and branches ,i.e., they are linked to Companies and CompaniesDetails via CompanyID and CompanyDetailID and to Addresses via addressID.
Companies_Addresses id (PK) companyID (FK) companyDetailID (FK) addressID (FK)
I am able to get the branch address at the moment with this code but I would like to get the company address as well using a single select statement.
Code: SELECT DISTINCTAddresses.city as branchCity, Addresses.street as branchStreet FROMCompanies LEFT JOINCompaniesDetails AS cd ON companies.companyID = cd.companyID
LEFT JOINCompanies_Addresses AS c ON c.companyDetailID = cd.companyDetailID LEFT JOINAddresses ON c.addressID = Addresses.addressID
WHERE Companies.name LIKE 'abc' ANDCompanies.status_indicator like 'Current'
If a table gets joind with itself and then joined again with itself, is it possible to perform one kind of outer join from the third table to the second tabe and having the range of records in the second table limited to the joins between the first and the second table?
Join on RegionID Join on RegionID and City Show records not qualifying for the joint but limited to the scope of of records established by the first joint _______________ ________________ _______________ | | | | | | | RegionID One | | RegionID One | | RegionID One | |_______________| | City New York | | City New York| |________________| |_______________| _______________ ________________ _______________ | | | | | | | RegionID One | | RegionID One | | RegionID One | |_______________| | City New York | | City New York | |________________| |_______________| _______________ ________________ _______________ | | | | | | | RegionID One | | RegionID One | | RegionID One | |_______________| | City Buffalo | | City New York | |________________| |_______________|
This is regarding the joins...for instance I have three Tables A, B, C. And they are related to each other. If I am joining A->B,B->C then is it necessary to join A->C too? what happens if I omit join A->C? and what happens if I include join A->C too?
Hello allLet's say I have 1 table "contract" containing the following data:id year sales45 2005 10045 2004 9589 2005 25089 2004 27512 2005 42I want to make a table with one unique row for each id and then a column for2004 sales and 2005 sales, like this:select a.id, a.sales, b.salesfrom contract a, contract bwhere a.contract=b.contract(+)and a.year=2005and b.year=2004The rows for id 45 and 89 are shown perfectly. But id 12 is not shown at allbecause it doesn't have a record for 2004!! I don't know why 'cause Iouterjoined the tables.It works perfectly when I have two distinct tables for each year (forinstance contract_2005 and contract_2004). So the problem seems to be in thefact I like to join one table with itself.Someone has a solution for this?thanks!Maarten
I'm trying extract a column from the table based on certain Conditions: This is for PowerPivot.
Here is the scenario:
I have a table "tb1" with (project_id, month_end_date, monthly_proj_cost ) and table "tb2" with (project_id, key_member_type, key_member, start_dt_active, end_dt_active).
I would like to extract Key_member where key_member_type="PM" and active as of tb1(month_end_date).