Pointers To The Best Documentation On Star Joins And The Optimiser
May 16, 2007
Hi All,
We are just starting to do some testing on SQL Server EE with dimensional models. We have had one or two problems that we were able to solve using the new performance dashboards etc.
However, as is inevitable, we are seeing strange behaviour from a query: in a star join it seems to be doing an eager spool and trying to spool the entire fact table to tempdb. Hmm.
Rather than ask one question at a time... we have DBAs who went to classes etc. at MSFT, and the client is some level of MSFT partner.
Could anyone point me to the best documentation for understanding the optimiser and how to influence it to get it to do the right thing in optimising plans for star joins?
Thanks
Peter
Jul 23, 2005
Hi, I have a table like this:
CREATE TABLE dbo.test
(
num int NOT NULL,
ename char(80),
eadress char(200),
archived char(1),
PRIMARY KEY CLUSTERED (num)
)
create index i_archived on dbo.test(archived)
There are 500000 rows in this table, and the archived field contains 15000 'Y' and 485000 'N'.
When I issue a select * from test where archived='Y', the path chosen is the clustered index scan and not the index i_archived.
The stats are updated every day.
Did I miss something?
thx
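One way to see why a scan can win here is to put rough numbers on it. The sketch below is only back-of-envelope arithmetic in Python, not the optimiser's actual cost model: the per-row overhead is an assumed figure, and the 8096 usable bytes per 8 KB page ignores fill factor. It compares the number of pages in the table with the number of row lookups that `select *` would need if i_archived were used (the index holds only archived and the clustering key, so every other column has to be fetched from the clustered index).

```python
import math

TOTAL_ROWS = 500_000
MATCHING_ROWS = 15_000             # rows with archived = 'Y'
ROW_DATA_BYTES = 4 + 80 + 200 + 1  # int + char(80) + char(200) + char(1)
ASSUMED_ROW_OVERHEAD = 11          # assumption: header, null bitmap, slot entry
USABLE_PAGE_BYTES = 8096           # 8 KB page minus the 96-byte page header


def estimate():
    """Return rough figures for scanning the table vs. looking up each match."""
    selectivity = MATCHING_ROWS / TOTAL_ROWS
    rows_per_page = USABLE_PAGE_BYTES // (ROW_DATA_BYTES + ASSUMED_ROW_OVERHEAD)
    table_pages = math.ceil(TOTAL_ROWS / rows_per_page)
    return {
        "selectivity": selectivity,
        "rows_per_page": rows_per_page,
        "table_pages": table_pages,
        "lookups_via_index": MATCHING_ROWS,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(name, value)
```

Under these assumptions the whole table is on the order of 18,500 pages, while the index route needs 15,000 separate lookups, each touching at least one page and usually more than one. The two plans are the same order of magnitude, so a sequential scan being chosen is not surprising. A query that only needs num and archived should be answerable from i_archived alone.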
Jul 20, 2005
There is something very strange going on here. Tested with ADO 2.7 and MSDE/2000. At first, things look quite sensible.
You have a simple SQL query, let's say
select * from mytab where col1 = 1234
Now, let's write a simple VB program to do this query back to an MSDE/2000 database on our local machine. Effectively, we'll
rs.open sSQL
rs.close
and do that 1,000 times. We won't bother fetching the result set, it isn't important in this example.
No problem. On my machine this takes around 1.6 seconds, and modifying the code so that the column value in the where clause changes each time (i.e. col1 = nnnn) doesn't make a substantial difference to this time. Well, that all seems reasonable, so moving right along...
Now we do it with a stored procedure
create procedure proctest(@id int)
as
select * from mytab where col1 = @id
and we now find that executing
proctest nnnn
1,000 times takes around 1.6 seconds whether or not the argument changes. So far so good. No obvious saving, but then we wouldn't expect any. The query is very simple, after all.
Well, get to the point!
Now create a table-returning UDF
create function functest(@id int) returns table as
return (select * from mytab where col1 = @id)
try calling that 1,000 times as
select * from functest(nnnn)
and we get around 5.5 seconds on my machine if the argument changes, otherwise 1.6 seconds if it remains the same for each call.
Hmm, looks like the query plan is discarded if the argument changes. Well, that's fair enough I guess. UDFs might well be more expensive... gotta be careful about using them. It's odd that discarding the query plan seems to be SO expensive, but hey, waddya expect? (Perhaps the UDF is completely rebuilt, who knows.)
Last test, then. Create an SP that calls the UDF
create procedure proctest1(@id int)
as
select * from functest(@id)
OK, here's the $64,000 question. How long will this take if @id changes each time? The raw UDF took 5.5 seconds, remember, so this should be slightly slower.
But... IT IS NOT.
It takes 1.6 seconds whether or not @id changes.
Somehow, the UDF becomes more than THREE TIMES faster when wrapped in an SP.
My theory, which I stress is not entirely scientific, goes something like this:
I deduce that SQL Server decides to reuse the query plan in this circumstance but does NOT when the UDF is called directly. This is counter-intuitive, but it may be because SQL Server's query parser is tuned for conventional SQL, i.e. it can say
well, I've got
select * from mytab WHERE [something or other]
and now I've got
select * from mytab WHERE [something else]
so I can probably re-use the query plan from last time. (I don't know if it is this clever, but it does seem to know when two textually-different queries have some degree of commonality.)
Whereas with
select * from UDF(arg1)
and
select * from UDF(arg2)
it goes... hmm, mebbe not... I better not risk it.
But with
sp_something arg1
and
sp_something arg2
it goes... yup, I'll just go call it... and because the SP was already compiled, the internal call to the UDF already has a query plan.
Anyway, that's the theory. For more complex UDFs, by the way, the performance increase can be a lot more substantial. On a big complex UDF with a bunch of joins, I measured a tenfold increase in performance just by wrapping it in an SP, as above.
Obviously, wrapping a UDF in an SP isn't generally a good thing; the idea of UDFs is to allow the column list and where clause to filter the rowset of the UDF. But if you are repeatedly calling the UDF with the same where clause and column list, this will make it a *lot* faster.
Jul 23, 2007
I have discovered what looks like a bug in the optimiser. I've posted it at https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=288243 but I wonder if any of you with SQL 2005 RTM, 2005 SP1 or 2008 CTP could confirm when this was introduced and whether it is still an issue?
Code Snippet
-- Bug report
-- 2007/07/19
-- Alasdair Cunningham-Smith
-- alasdair at acs-solutions dot co dot uk
set nocount on
go
-- example date is in British date format
set dateformat dmy
go
use tempdb
go
create table foo( bar varchar( 30 ) not null )
go
insert into foo( bar ) values ( 'fishy' )
insert into foo( bar ) values ( '19/07/2007' )
go
-- this works fine in all versions - only valid dates are passed to the convert function
select
convert( smalldatetime, bar, 103 ) as bardate
from
foo
where
bar like '__/__/____'
go
-- this works on SQL 2000, but fails on SQL 2005 SP2 (I've not tried other SPs of SQL 2005):
-- Msg 295, Level 16, State 3, Line 2
-- Conversion failed when converting character string to smalldatetime data type.
--
-- I believe the query is rewritten as if the derived table query contained
-- "and convert( smalldatetime, bar, 103 ) < getdate()"
-- which would expose the convert to the invalid data
select
*
from
(
select
convert( smalldatetime, bar, 103 ) as bardate
from
foo
where
bar like '__/__/____'
) as derived
where
bardate < getdate()
go
-- Workaround:
-- Use a case statement to protect the convert operator from the invalid data
select
*
from
(
select
case when bar like '__/__/____' then
convert( smalldatetime, bar, 103 )
else
null
end as bardate
from
foo
where
bar like '__/__/____'
) as derived
where
bardate < getdate()
go
drop table foo
go
The workaround I discovered is simple but ugly. I invite your comments...
alasdair.
Nov 25, 2004
hmm ok here is an outline:
I have a view which is a combination of TblHorses and TblOwners has fields:
Form from Horses
StableHands from owners
linked by Horses.OwnedBy = Owners.OwnerID
OK, it goes through all the horses and creates a number, either 0, 1 or 2,
based on the formula
Round((int(RSFormUpd("StableHands"))/22) + (rnd * 2),0)
Then if the number = 0 the Form goes down 1, unless it is already 1, in which case it stays the same;
if = 1 then it stays the same;
if = 2 then the Form goes up 1, unless it is already 5, in which case it stays the same.
This is how I coded it in ASP, as I can do ASP/VB programming lol but T-SQL is a mystery:
Randomize()
Set RSFormUpd = Server.CreateObject("ADODB.Recordset")
RSFormUpd.open "Select * From ViewWeeklyFormUpdate", Conn, 3, 3
Do While Not RSFormUpd.EOF
    UpdForm = Round((int(RSFormUpd("StableHands"))/22) + (rnd * 2),0)
    If UpdForm = 0 Then
        If int(RSFormUpd("Form")) - 1 = 0 Then
            RSFormUpd("Form") = int(RSFormUpd("Form"))
        Else
            RSFormUpd("Form") = int(RSFormUpd("Form")) - 1
        End If
    End If
    If UpdForm = 1 Then
        RSFormUpd("Form") = int(RSFormUpd("Form"))
    End If
    If UpdForm = 2 Then
        If int(RSFormUpd("Form")) + 1 = 6 Then
            RSFormUpd("Form") = int(RSFormUpd("Form"))
        Else
            RSFormUpd("Form") = int(RSFormUpd("Form")) + 1
        End If
    End If
    Response.write RSFormUpd("Form") & " "
    RSFormUpd.update
    RSFormUpd.movenext
Loop
RSFormUpd.close
Set RSFormUpd = Nothing
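Before porting the loop to T-SQL it may help to pin down the rule on its own. The Python sketch below restates it as two small functions; the names `form_change` and `next_form` are made up for illustration, and the 1 to 5 range is taken from the description above. It is a restatement of the ASP logic, not a T-SQL answer.

```python
import random

MIN_FORM, MAX_FORM = 1, 5  # bounds taken from the description in the post


def form_change(stable_hands, rnd):
    """Mirror Round(StableHands/22 + rnd*2, 0); rnd is a float in [0, 1)."""
    # Python's round(), like VBScript's Round, rounds halves to the even number.
    return round(stable_hands / 22 + rnd * 2)


def next_form(form, change):
    """Apply the rule: 0 lowers form, 2 raises it, anything else leaves it."""
    if change == 0:
        return max(MIN_FORM, form - 1)
    if change == 2:
        return min(MAX_FORM, form + 1)
    return form


if __name__ == "__main__":
    # Hypothetical sample rows: (current form, stable hands)
    for form, hands in [(1, 0), (3, 5), (5, 10)]:
        change = form_change(hands, random.random())
        print(form, "->", next_form(form, change))
```

Written this way the rule is just a clamp, which suggests the whole recordset loop could probably be replaced by one UPDATE with a CASE expression. Note that with enough stable hands the formula can produce 3 or more, and the original code (like this sketch) then leaves the form unchanged.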
Thanks in advance
Dagaz
Jun 12, 2006
Hello everyone, I'm in need of help.
I'm using Microsoft SQL Server 2005 along with Microsoft Visual Studio 2005. I have 2 questions:
1) In the database server, there is an "image" datatype. I need to know how to use that because I need to display images on my webform.
2) I read somewhere that pointers can be used to point the file path. So, is it possible for me to store images / audios in a file and use the database to point to the file path? If it is possible, how can it be done?
Thanks.
Jun 11, 2006
Hello, I have a few questions here which I hope someone can help me with.
1. How do I go about using the image data type?
2. How do I use pointers to point to a specific file? For example, if I want to point to a music/image file, how do I go about doing that?
I'd appreciate it if anyone can help me.
thx! :) .
Jul 23, 2005
Hi
I'm currently having to design a database for a recruitment agency that's just started up and have one area where I'm a little unsure where to go.
Basically I've implemented the 'standard' Customer, Contacts tables linked on CustomerID, and also have CallRecords (for phone calls etc made to contacts) linked on ContactID.
My difficulty is that they want to be able to store names/details of people looking for work (candidates) BUT these people may also be a contact (i.e. the agency could be dealing with a contact at a company who is also looking for a new job themselves). They would also like to (naturally) have these candidates' details held against 'current employer' customer details, so there may be situations where a candidate is JUST a candidate (i.e. not currently working and therefore not associated to a company), OR they may be a candidate AND a contact, and you may have contacts who are JUST contacts (i.e. not actively looking for work at the moment).
I'm basically just trying to figure out the options I have for storing the contact details and candidate details.
FYI I need to store the same details for Contacts and Candidates (i.e. name, job title, contact numbers etc) but Candidates require extra information to be stored about them (work experience, qualifications etc).
Any help/pointers would REALLY be appreciated!!
Thanks in advance
Martin
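One option among several is to hold the shared details once in a person table and treat "contact" and "candidate" as optional roles hanging off it. The sketch below is only an illustration of that idea: it uses Python's sqlite3 with an in-memory database as a stand-in for SQL Server, and every table and column name except Customer and CustomerID is hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL
);
-- Details common to contacts and candidates are stored once.
CREATE TABLE Person (
    PersonID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    JobTitle TEXT,
    Phone    TEXT
);
-- A row here means the person is a contact at a customer.
CREATE TABLE Contact (
    PersonID   INTEGER PRIMARY KEY REFERENCES Person(PersonID),
    CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID)
);
-- A row here means the person is looking for work; employer is optional.
CREATE TABLE Candidate (
    PersonID          INTEGER PRIMARY KEY REFERENCES Person(PersonID),
    CurrentEmployerID INTEGER REFERENCES Customer(CustomerID),
    WorkExperience    TEXT,
    Qualifications    TEXT
);
"""


def build_demo():
    """Create the schema and three made-up people covering each case."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Customer VALUES (1, 'Acme Ltd')")
    conn.executemany(
        "INSERT INTO Person (PersonID, Name) VALUES (?, ?)",
        [(1, "Ann"), (2, "Bob"), (3, "Cat")],
    )
    conn.executemany("INSERT INTO Contact VALUES (?, ?)", [(1, 1), (2, 1)])
    conn.executemany(
        "INSERT INTO Candidate (PersonID, CurrentEmployerID) VALUES (?, ?)",
        [(1, 1), (3, None)],
    )
    return conn


def roles(conn):
    """Return (name, is_contact, is_candidate) for every person."""
    return conn.execute(
        """
        SELECT p.Name,
               c.PersonID IS NOT NULL AS is_contact,
               k.PersonID IS NOT NULL AS is_candidate
        FROM Person AS p
        LEFT JOIN Contact   AS c ON c.PersonID = p.PersonID
        LEFT JOIN Candidate AS k ON k.PersonID = p.PersonID
        ORDER BY p.PersonID
        """
    ).fetchall()


if __name__ == "__main__":
    print(roles(build_demo()))
```

In this layout Ann is both a contact and a candidate, Bob is only a contact, and Cat is only a candidate with no current employer. CallRecords could keep pointing at the contact row. Whether the extra joins are worth it compared with a single table carrying nullable candidate columns is a judgment call.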
Jul 12, 2007
We currently have a standard star schema warehouse that contains clickstream data from our web server farm. We use a home grown ETL process that is a combination of java code and shell scripts to process these logs on a daily basis. The clickstream data represents both our dimensional data as well as measurements. We are currently processing 22GB of compressed data daily and are currently on a 50% growth rate year over year.
My question is does anyone have experience/pointers on using SSIS to process a stream of data that contains both the dimensions and fact data? Our current architecture pulls out dimensional attributes, processes them separately, and then substitutes the dimensional keys back into the fact stream. I have to believe there is a more efficient way to do this via SSIS.
Any advice would be appreciated.
Thanks
--sean
Dec 27, 2004
Hi,
I have tried the following out and would appreciate feedback from experienced users regarding if the following is a good/bad approach:
After bringing all the data into my data mart, I have created a view which has all the data in a big flat table (totally unnormalized). Then, based on this BIG FLAT UNNORMALIZED VIEW :) I have created my various dimensions using the 1st option, i.e. Star Schema.
Based on the little testing that I have done, I seem to be getting the correct results across various dimensions... However, can someone kindly comment on this approach and the pros/cons.
Thanks
Jul 20, 2005
Is there a way to convert an image pointer to a page ID that could be used in DBCC page?
i.e.
select TEXTPTR(document) FROM testdocs where id = 1
returns
0xFEFF3601000000000800000003000000
select convert(int, TEXTPTR(document)) FROM testdocs where id = 1
returns
50331648
dbcc page (9,3,8,1)
dumps the first page of the image.
I am trying to map 0xFEFF3601000000000800000003000000 -> page number 8
thanks
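The numbers in the post are at least consistent with one reading of the 16-byte pointer: the last eight bytes look like a little-endian page number (4 bytes), file number (2 bytes) and slot number (2 bytes). The Python sketch below decodes the pointer on that assumption. The layout is inferred from this single example, not from documentation, so treat it as a guess to verify against more rows; `decode_textptr` is a made-up helper name.

```python
import struct


def decode_textptr(hex_ptr):
    """Split a 16-byte text pointer using the ASSUMED layout described above."""
    text = hex_ptr[2:] if hex_ptr.lower().startswith("0x") else hex_ptr
    raw = bytes.fromhex(text)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte text pointer")
    # Assumed: bytes 8-11 page id, 12-13 file id, 14-15 slot id, little-endian.
    page_id, file_id, slot_id = struct.unpack_from("<IHH", raw, 8)
    return {"file_id": file_id, "page_id": page_id, "slot_id": slot_id}


def rightmost_int(hex_ptr):
    """Read the last 4 bytes as a big-endian integer."""
    text = hex_ptr[2:] if hex_ptr.lower().startswith("0x") else hex_ptr
    return int.from_bytes(bytes.fromhex(text)[-4:], "big")


if __name__ == "__main__":
    ptr = "0xFEFF3601000000000800000003000000"
    print(decode_textptr(ptr))
    print(rightmost_int(ptr))
```

For the pointer in the post this gives file 3, page 8, slot 0, which matches the file and page arguments of dbcc page (9,3,8,1). The second function reproduces 50331648, which suggests that convert(int, ...) simply keeps the rightmost four bytes (0x03000000) and so never sees the page number.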
Oct 13, 2000
My clients are not interested in using auto-generated keys. They also get data from many sources, and they would like to use something like customerID. Does anybody know of any reason why we should not do that?
Also, they are concerned about SQL Server not being able to handle the data warehouse in future, because they are expecting it to grow to terabytes.
Does anyone have advice on that?
I was thinking of putting the fact table on a different filegroup; I don't know if it will help.
Dec 18, 2002
Does anyone know how to build a star schema with DTS?
Oct 11, 2000
I am re-engineering the data warehouse and my client is currently using auto-generated keys. Their concern is that after a certain number of keys (I can't remember the figure) SQL Server starts having problems. Does anyone know how I should handle this when I am doing the design?
Thanks, any input will be appreciated.
Nov 23, 2007
I'm designing a DW, and I have some doubts about distributed transactions when modelling a star schema.
My problem is: I have a main dtsx package in which I call all the child packages in order to create the fact and dimension tables.
(1) First I have several child packages that create and populate all the dimension tables (with the latest values from the relational DB).
(2) Then I have several child packages that create all the fact tables; in this process I use the surrogate keys from the dimension tables (obtained in step 1).
The problem here is: how do I use the transactions? If I put a "Required" TransactionOption on the parent package, then after calling the child packages that create the dimension tables, the values are not committed, so they are not available when I later execute the child packages related to the fact tables.
How can I use transactions when modelling a star schema, in order to have a full rollback or a full commit across all tables (dimension and fact tables)?
Thanks
Apr 26, 2015
I have an extremely annoying problem when debugging stored procedures in SQL Server 2014 with SSDT or SSMS. When calling a SP thru EXECUTE in Debug mode, 9 out of 10 SPs are traced with a wrong yellow arrow-pointer to the line currently reached.
The offset is between 6 to 15 lines downward. Tracing itself and update of the "Locals"-view works as expected. All SPs contain comments also before the Create Procedure statement. The SP shown when tracing show exactly the same content as the stored SQL in the SSDT project under work incl. Create procedure and all comments.
The picture here shows the first line selected after the debugger has traced into the SP. The first line really executed with "Next" will be SET NOCOUNT ON.
If this does not turn out to be my fault, and some of you would support it, I would like to post this to SQL Connect.
Jul 8, 2014
In SQL Server I can select a specific column in front of * like so:
Code:
select test_column_1,* from testtable1
I've been googling around and cannot seem to be able to find a definitive answer.
Apr 30, 2008
So this has got to be considered a major, major flaw in how SSRS interacts with Oracle. I'm using the "Oracle" data provider, but I've also tried using Microsoft's OLE DB data source, and some others, and in no case does SSRS hand off to Oracle a query that does NOT have bind variables. In other words, typically query parameters get passed off to Oracle as bind variables.
The incredibly major problem that this causes is that it disallows Oracle's use of star transformation queries, which is the primary method by which to get fast responses from a data warehouse/star schema. In fact a prime authority on this subject (Bert Scalzo, Oracle DBA Guide to Data Warehouse and Star Schemas, p.86 -- obviously not using Oracle 7x was the first) lists it as in effect the #1 consideration.
So what gives? In effect SSRS cannot be used against large scale Oracle data warehouses? I've had success with Business Objects being able to access Oracle star transformations.
So I guess my question is: how the heck can I use SSRS in a big, Oracle-based data warehouse?
http://www.dba-oracle.com/oracle10g_tuning/t_star_transformations_sql.htm
For star_transformation join plans, the following parameters must also be considered: ... No BIND VARIABLE in SELECT statement
http://www.orafaq.com/usenet/comp.databases.oracle.server/2003/09/28/2305.htm
Star transformation is not supported for tables with any of the following characteristics:
* Queries that contain bind variables
Jan 19, 2007
Hi all,
Our star schema design has one fact table and 3 dimensions.
The FK's in the fact do not necessarily make up the primary key. So I have an identifier in the fact table as PK. Here is my index assignment:
Fact Table - Clustered Index on PK
Non Clustered Index 1 on FK1
Non Clustered Index 2 on FK2
Non Clustered Index 3 on FK3
Each Dimension Table - Clustered Index on PK
Non Clustered Index on Attribute. This is the attribute that will be used in reports / cubes.
Is the above design good to start with?
Thanks,
V
Oct 22, 2015
We have an OLTP database and operational reporting is carried out on a replica server / database. We have plans to build a new data warehouse and an analysis services cube.
Question 1: Should a cube be designed to extract data from a physical star schema rather than a logical one (3NF relational (ODS?) using a data source view to derive the star)? I'm guessing for performance it's better to pull data from similar structures (physical facts and dimensions as required by analysis services) but is the difference significant?
Question 2: Depending on the answer to q1, is it bad practice to ETL data from a staging database (replica > staging) directly to a star schema (multiple data sources and cleansing / business rules required)? Or should it be processed from staging to an ODS and only then to a star schema (physical or logical)? I still don't know if an ODS is required but I guess the consideration for this decision is whether the business would require daily operational (or ad hoc) reporting on the consolidated data sources (without needing historical DW functionality).
Nov 3, 2000
We find that a delete command on a table where the rows to be deleted involve an inner join between the table and a view formed with an outer join sometimes works, sometimes gives error 625.
If the delete is recoded to use the join keyword instead of the = sign, then it always gives error 4425.
625 21 0 Could not retrieve row from logical page %S_PGID by RID because the entry in the offset table (%d) for that RID (%d) is less than or equal to 0. 1033
4425 16 0 Cannot specify outer join operators in a query containing joined tables. View '%.*ls' contains outer join operators.
The delete with a correlated subquery instead of a join works.
The text of error 4425 would imply that joins with views formed by outer joins should be avoided.
Any ideas on the principles involved here?
Aug 11, 2005
SQL Server 2000
Howdy All.
Is it going to be faster to join several tables together and then select what I need from the set, or is it more efficient to select only those columns I need in each of the tables and then join them together?
The joins are all on integer primary keys and the tables are all about the same.
I need the fastest, most efficient method to extract the data, as this query is one of the most used in the system.
Thanks,
Craig
Nov 16, 2007
I am cutting my teeth on star schema design. I have a simple star schema I am building for headcount analysis at work. I have a factless fact table where a row represents a head in the company. Each head is tied to a particular week in a Date dimension table. There are additional dimensions for things like gender, ethnicity, marital status, age, etc. Now my department dimension is hierarchical. In DimDepartment there is a department which belongs to a company. Companies belong to divisions. Now the fun part. Each division has a headcount target for each year. Up to this point I am in a perfect star schema (no snowflaking). How would I integrate this concept of a headcount target for each division for a given year?
We are using Cognos on top of this star schema to provide reporting and analysis services, if that is relevant. From the star schema design standpoint... any thoughts?
Christian Loris
Jul 28, 2004
What information should I be documenting about a server (SQL) and about the databases found on that server? I was tasked with this job. Please help.
Thanks
Lystra
Nov 13, 2004
I'd like to go through and document the databases I'm responsible for... it's probably a good practice and I'm sure there are several approaches to doing so.
Anybody have comments, recommendations, or possibly a nice word template that links everything up?
Thanks for the input.
Alex
Mar 9, 2004
hello everyone,
I have to document about four databases, has anyone got any specific format or tool for documenting a database?
Any inputs would be very helpful
regards,
Harshal.
Jul 26, 2007
Will there ever be xml code documentation support inside of Sql Server projects?
Oct 12, 1999
Hi,
Why is it that SQL joins (*=) run a little faster as opposed to ANSI joins(LEFT JOIN...)? Aren't they supposed to be almost identical?
The issue is this: we are promoting using ANSI syntax for the obvious reason (future versions of SQL Server may not support SQL Server syntax; portability, etc.)
However, the problem is the speed. What have others done about this? Do you use ANSI syntax or SQL Server syntax? How true is it that future SQL Server versions may discontinue support for the '*=' and '=*' join operators?
Angel
Feb 29, 2008
I have four tables which I want to return results for an advanced search function, the tables contain different data, but can be quite easily joined,
Table A Contains a Main Image, this image is displayed in the results
Table B Contains an Icon, this image is displayed in the results
Table C doesn't have an image in it but has a child table with a number of images associated to the table, in the UNION ALL statement I would Like to do a Join to get say the top Image from this child and print it for the row associated with table C.
Select title, description, image from tableA
UNION ALL
Select title, description, icon as image from tableB
UNION ALL
title, description, ( inner Join SELECT top(1)
from imageTableC where imagetableC.FK = tableC.PK)
as image from tableC
Could someone show me the syntax to do this, I have all the information printing to the screen, bar this table C image.
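One common shape for this is a correlated scalar subquery in the third SELECT that picks a single image per tableC row. The sketch below shows that shape running against an in-memory SQLite database from Python, purely so that it can be executed. It is not T-SQL: SQLite spells "first row" as LIMIT 1, where SQL Server would use SELECT TOP 1. The `id` and `image` columns of imageTableC and all sample rows are assumptions, since the post does not name them.

```python
import sqlite3

SETUP = """
CREATE TABLE tableA (title TEXT, description TEXT, image TEXT);
CREATE TABLE tableB (title TEXT, description TEXT, icon TEXT);
CREATE TABLE tableC (PK INTEGER PRIMARY KEY, title TEXT, description TEXT);
-- id and image are hypothetical column names for the child table.
CREATE TABLE imageTableC (id INTEGER PRIMARY KEY, FK INTEGER, image TEXT);
INSERT INTO tableA VALUES ('a1', 'from A', 'main.png');
INSERT INTO tableB VALUES ('b1', 'from B', 'icon.png');
INSERT INTO tableC VALUES (10, 'c1', 'from C');
INSERT INTO imageTableC VALUES (1, 10, 'first.png');
INSERT INTO imageTableC VALUES (2, 10, 'second.png');
"""

QUERY = """
SELECT title, description, image FROM tableA
UNION ALL
SELECT title, description, icon AS image FROM tableB
UNION ALL
SELECT title, description,
       -- One image per tableC row; ORDER BY decides which one is "top".
       (SELECT i.image
        FROM imageTableC AS i
        WHERE i.FK = tableC.PK
        ORDER BY i.id
        LIMIT 1) AS image
FROM tableC
"""


def search_rows():
    """Run the combined search and return all rows as tuples."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in search_rows():
        print(row)
```

A tableC row with no child images would come back with a NULL image rather than being dropped, which is usually what a search page wants. Without an ORDER BY inside the subquery, which image counts as the top one is not defined.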
May 5, 2008
Hi all
Could anyone tell me where I can find complete documentation for the SQLDMO object, and how I can initialize and use it in my programs?
Kind Regards.
Dec 3, 2005
Does anyone know a good FREE tool for documenting a DB?
So far I'm with ApexSQL but it's not free; wondering if anyone knows a free tool.
Thanks,
Nov 9, 2004
Hi guys,
Can any one tell me about any Database Documentation Tool for SQL server.
Please reply asap.
Junaid
Oct 16, 1998
I am looking for good utilities that document MS SQL table schema, indexes, layouts etc. I am currently looking at SQL auditor, but this product does not give me table schema or any kind of device revisions. I recently was given several SQL servers that have not been documented in any way.
thanks
ps SQL auditor gets a B+ in my book. Let me know what you think