Look,merge And Sort
Apr 11, 2008Can anyone please just give me an example of the above transformations to be understood in a lay mans language. Please.
View 1 RepliesCan anyone please just give me an example of the above transformations to be understood in a lay mans language. Please.
View 1 RepliesDoes this look correct for MERGE SYNTAX?I want to INSERT INTO the target table,new service contracts sold with the start date,from historical table by month, only if they don't exist in the target table, doing nothing if matched. . . but the source table is indexed Fiscal Period descending and I have multiple renewal versions of the same contract number in following years. . .
MERGE dbo.SERVICE_CONTRACTS_NEW AS Target
USING (SELECT r.Customer_Acct, r.[Contract], r.Start_Period FROM dbo.ACTIVE CONTRACTS r ORDER BY r.FiscalPeriod ASC) AS Source
ON Target.Account_Nbr = Source.Customer_Acct AND Target.[Contract] = Source.[Contract]
WHEN NOT MATCHED BY TARGET
THEN INSERT (Account_Nbr, [Contract], FiscalPeriod_Start)
VALUES (
Source.Customer_Acct
,Source.[Contract]
,Source.Start_YYYYMM
);
Just wondering if I am an isolated case or just doing stuff other people don't. Ive created around a dozen packages for importing various types of data into a datawarehouse and some of the packages have random crashes. Over time I have been trying to tie together what the crashing ones have in common and now believe I have completely isolated it: Sort + Merge inside a dataflow in a for each loop. The more sorts and merges in the flow the more likely the problem is to occur.
This week I created 2 identical packages to import some XML... one used some gnarly complicated substrings and such to extract some data and add it as derived columns while the other used the XML correctly that produced 2 streams (one being meta data - other being detail data) and sorted + merged them. The gnarly substring package could run all night (processes about a file every 2 or 3 seconds so you can do the math) while the merged data flows crashes somewhere between 1 and 4 hrs of running - usually right around 3 hrs (also processes about 1 file every 2 seconds). The file it crashes on wil be completly random and not related to file size - I've had crashes on files with as little as 1 row of data and 1 row of metadata and as big as 1 row of metadata and 100,000+ rows of data. Another more complicated package I created has 4 merges (and thus 8 sorts) and it will crash anywhere between 5 minutes and 20 minutes of running.
The crash is usually one where the whole process just shuts down with a memory dump but occasionally I have gotten errors about out of memory or warnings about threads leaking buffers.
I am trying to do some "research" to see if others have this problem. I've spent 1.5 months writing all ETL jobs in SSIS and its gotten to the point management has asked me to explore other platforms. I do have a case open with Microsoft on the package with 4 merges - but we've spent weeks now just trying to get debugging tools to get them some useful info when the process crashes. I am running SP1 and kb91822.
Anyone else?
I have encountered an annoying problem which causes the Merge Joins to lose records in the dataflow. The problem is caused by 2 unusual behavoirs.
1/ Sort of SSIS is not sorting the same as ORDER BY in SQL
example:
Code Snippet
CREATE TABLE [dbo].[table_2](
[test] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
With data as following:
Code Snippet1000
1-1.00.00.00
2000
When select this data with an order by like: select test from table_2 order by test
The result will be:
Code Snippet
test
1-1.00.00.00
1000
2000
If you sort the data by the SORT block of the SSIS the result will be:
Code Snippet
test
1000
1-1.00.00.00
2000This is annoying and dangerous, because it causes the next bug.
2/ Two datasources sorted by ORDER BY clause can give problems in a Merge Join.
If you have 2 data sources both correctly sorted by an order by in the query. When you join these 2 datasources with a Merge Join, you can lose some records in the dataflow.
This happens with larger datasets than examples above.
http://img340.imageshack.us/my.php?image=strangebehavior2tw1.png
When I join the datasources (see image ) inside SQL I will get a correct result of 15271 records.
Is this a bug which I should report? or is there a flaw in my logic?
Setup:
I need to run an Insert query which pulls data from a table located on server A database AA Table AAA conditional on (or JOINED with) Table BBB in database BB sever B. In SQL 2000 it could be done as:
From Server A:
sp_addlinkedserver B
INSERT dbo.ResultsTable
SELECT SourceTable.* FROM B.BB.dbo.BBB SourceTable
INNER JOIN A.AA.dbo.AAA ConditionTable ON SourceTable.RecID = ConditionTable.RecID
sp_dropserver B
In SSIS one of the possible solutions is to use a package which does the following:
OPEN A + OPEN B-> SORT A + SORT B->MERGE JOIN A and B->OUTPUT RESULT
The problem with this approach is that it's extremely slow for large datafiles (50M records each)
Questions:
1) In the procedure above could the SORT step be avoided?
2) Is there another approach to run cross-servers JOIN in SSIS?
Thank you
In the first image as can be seens i have 2 different data sources and then they are being joined using "Merge Inner Join". The "sort" is on BusinessEntityID column of Person table and "Sort1" is on "PersonID" of Customer table. The merge join of these 2 result in 19,119 rows.
On the other hand, if i use single data source and use a query with inner join on tables used in the first image (ie. 2 tables being used in 2 different data sources) as depicted in second image. Also, since merge cannot operate without SortKey i have defined TerritoryID as sort key in the advanced editor. The number of rows i get after this is "10,274". My select query was :
SELECT
P.BusinessEntityID,
P.PersonType,
P.Title,
P.FirstName,
P.MiddleName,
P.LastName,
P.Suffix,
C.TerritoryID
FROM stg.Person AS P
INNER JOIN stg.Customer AS C ON C.CustomerID = P.BusinessEntityID
ORDER BY C.TerritoryID;
According to me, it should have been the same as in first case i am using merge inner join and in second case i am using SELECT query with inner join. Upon drilling down i found that in the first case , my sort keys are BusinessEntityID and PersonID, if i modify this to CustomerID and BusinessEntityID as this is my join condition (in ithe inner join query shown above), i get the desired output. What i was wondering was, how the sort order change the Join Condition?
I am trying to set sorting up on a DataGrid in ASP.NET 2.0. I have it working so that when you click on the column header, it sorts by that column, what I would like to do is set it up so that when you click the column header again it sorts on that field again, but in the opposite direction. I have it working using the following code in the stored procedure: CASE WHEN @SortColumn = 'Field1' AND @SortOrder = 'DESC' THEN Convert(sql_variant, FileName) end DESC,
case when @SortColumn = 'Field1' AND @SortOrder = 'ASC' then Convert(sql_variant, FileName) end ASC,
case WHEN @SortColumn = 'Field2' and @SortOrder = 'DESC' THEN CONVERT(sql_variant, Convert(varchar(8000), FileDesc)) end DESC,
case when @SortColumn = 'Field2' and @SortOrder = 'ASC' then convert(sql_variant, convert(varchar(8000), FileDesc)) end ASC,
case when @SortColumn = 'VersionNotes' and @SortOrder = 'DESC' then convert(sql_variant, convert(varchar(8000), VersionNotes)) end DESC,
case when @SortColumn = 'VersionNotes' and @SortOrder = 'ASC' then convert(sql_variant, convert(varchar(8000), VersionNotes)) end ASC,
case WHEN @SortColumn = 'FileDataID' and @SortOrder = 'DESC' THEN CONVERT(sql_variant, FileDataID) end DESC,
case WHEN @SortColumn = 'FileDataID' and @SortOrder = 'ASC' THEN CONVERT(sql_variant, FileDataID) end ASC And I gotta tell you, that is ugly code, in my opinion. What I am trying to do is something like this: case when @SortColumn = 'Field1' then FileName end,
case when @SortColumn = 'FileDataID' then FileDataID end,
case when @SortColumn = 'Field2' then FileDesc
when @SortColumn = 'VersionNotes' then VersionNotes
end
case when @SortOrder = 'DESC' then DESC
when @SortOrder = 'ASC' then ASC
end and it's not working at all, i get an error saying: Incorrect syntax near the keyword 'case' when i put a comma after the end on line 5 i get: Incorrect syntax near the keyword 'DESC' What am I missing here? Thanks in advance for any help -Madrak
I have database on SQL Server 2000 set up with a merge publication.This publication is configured with a number of dynamic filters toreduce the amount of data sent to each client. Each client has ananonymous pull subscription. The merge process can be triggered by thewindows sync manager and my application.To improve performance I have created some helper tables to hold themapping between user login and primary keys of selected entities.For the replicated data to be correct the contents of the helper tablesneeds to be up to date.I need to fire off a stored procedure on the publisher beforereplication starts to verify that this data is up to date. I can notsee any documented way of doing this however I have been experimentingwith some unorthodox systems.Firstly has anyone any ideas?I have been considering adding a trigger to some of the tables used bythe Microsoft replication code - yes I know this is very nasty.My problems arise because executing this stored procedure will causesome data to be updated. In updating data we could create a newgeneration in the database. I must therefore run my stored procedurebefore any the Microsoft code makes any generation checks / updates.Anyone done anything similar, Anyone have any better ideas?Any comments would be gratefully received.
View 1 Replies View RelatedHi,
I'm using merge replication to maintain a backup copy of my main (publisher)MSDE database. A push subscription periodically (1 per minute) updates the backup DB.
It's intended that if the main db goes down then the backup (subscription) db can be configured as a publisher. This must all be performed via scripting.
The initial configuration of the main publisher and subscription is controlled via scripting, which works fine.
The problems occur when I try to configure the subsciber to become a publisher. A script is executed on the subscriber but fails at the point when it's configuring the publisher detail. The error is something like "unable to configure a publication for a database setup as an anonymous subscription".
I'm guessing that there are subscritpion artifacts added to the database which need to be removed before it can be configured as a new publisher.
Please help,
Jez W
i encounter this error..
Cannot sort a row of size 8107, which is greater than the allowable maximum
of 8094.?
why this error occur? can someone explain? how to avoid this? thanks!!
We are using a modeling technique called Anchor Modeling in our data warehouses. You can read more about the technique itself at our homepage http://www.intellibis.se, where we have published a fact sheet and a recently held presentation (TDWI European conference). One of the features with this technique is its simple way to historize data. This is done by having a fromDate column which together with the surrogate key will yield a unique combination. On the tables that has this kind of historization we add a primary key, which in turn will create a clustered index, with the following specification (surrogateKey asc, fromDate desc). This will physically order data on the storage media according to the specificed columns and ordering. Now I move on to create a "latest view" of this table which does a subselect to find the latest version for every surrogateKey using max(fromDate). Should not the optimizer now figure out that data is ordered so that the latest version always comes first for every surrogateKey, hence any sorting would be unneccessary? If I look at the actual execution plan after running a query that uses the view there is a sort in the plan, but the cost is always 0%. Does this mean that it did not sort the data, or that it did call a sorting routine, but it actually took very little time to do the sorting? If so, is there a reason that is has to do the sorting or could it have been left out by an even smarter optimizer?
I would also like to applaud the people behind the optimizer, since it will figure out which tables are in fact necessary to query and eliminate others, even if I have left joined them into the view I am using. This speeds up performance and makes anchor modeling feasible. Unfortunately optimizers from other vendors seem to have trouble doing this...
Regards,
Lars
I've been racking my brain all day and I finally decided to ask for help. I've got two tables with rows from the first that need to be sorted by the second. The problem is that the rows don't always exist in the second table. I've tried various forms of INNER, LEFT, RIGHT, OUTER, LEFT OUTER, CROSS, etc., etc., etc. and nothing (oh yeah UNION too). Every time I get close, I lose the records that don't have matches.
Something close-
SELECT A.IDDoc, B.First
FROM A
LEFT JOIN B
ON A.IDDoc = B.IDDoc
WHERE B.Dept = 'A'
ORDER BY B.First
Example Data
Table A
IDDoc Document
---------------------------
1 1467.doc
2 8722.doc
3 A47F.doc
4 88DQ.doc
5 ABCD.doc
Table B
IDDoc Dept First
----------------------------
1 A John
2 A Bob
3 A Ralph
4 A Diane
Results I Want
IDDoc First
-------------------
5 NULL
2 Bob
4 Diane
1 John
3 Ralph
Any help is appreciated. If I've posted in the wrong forum, please feel free to direct me to a better one.
Thanks in advance!
Jim
SELECT
LEFT(CONVERT(CHAR(11),convert(datetime,task_date),109),3) + ' ' +
RIGHT(CONVERT(CHAR(11),convert(datetime,task_date),109),4) as Date,SUM(CASE a.status_id WHEN 1000 THEN b.act_point ELSE 0 END) as Programming,SUM(CASE a.status_id WHEN 1016 THEN b.act_point ELSE 0 END) as Design,SUM(CASE a.status_id WHEN 1752 THEN b.act_point ELSE 0 END) as Upload,SUM(CASE a.status_id WHEN 1032 THEN b.act_point ELSE 0 END) as Testing,SUM(CASE a.status_id WHEN 1128 THEN b.act_point ELSE 0 END) as Meeting,SUM(CASE a.status_id WHEN 1172 THEN b.act_point ELSE 0 END) as OthersFrom
task_table a,act_table b where a.status_id=b.act_id and
a.user_id=(select user_id from user_table where user_name='Raghu') and
a.task_date like '%/%/2006' GROUP BYLEFT(CONVERT(CHAR(11),convert(datetime,task_date),109),3) + ' ' + RIGHT(CONVERT(CHAR(11),convert(datetime,task_date),109),4)Output :Aug 2006 294 0 0 80 0 0 Jan 2006 14 0 0 0 0 0 Oct 2006 336 0 0 0 0 0 Sep 2006 3262 20 24 8 16 0 How to sort the date in ascending Order ?Jan 2006Aug 2006Sep 2006Oct 2006
I have: 4 tables and 1 table variable.
CCenters (ID, Name)
Campaigns (ID, Name)
Rel (ID, CCenterID, CampaignID) - [many to many]
and @SCampaigns (ID, CampaignID) - represents the selected campaigns by the user
performing the commands below I would get the centers associated with the campaigns selected.SELECT CCenterID
FROM Rel
INNER JOIN @Campaigns ON @SCampaigns.CampaignID = Rel.CampaignID
But what I really want are the common centers to the selected campaigns.
Thanks
I am trying to select a record from a table where it has the smallest priority
how would you go about doing this
is there a cool sort command or is there a select command syntax that can do this
thanks
I've made this example and it loads a picture into a database. (MsSql )Take a look at the code, it works just fine however it leaves a process in sleeping mode "avaiting command" in Enterprise manager under "Management/current Activity/Process Info"Is it supposed to be like this or is it supposed to be reemoved after .net is finished??Code snip_______________________________________________________
Dim conn As New SqlConnection("Data Source = (local);Initial Catalog = " & "test;User ID = NAME; Password=PASSWORD;")
Dim cmd As New SqlCommand("Select * from tab_bild", cnn)
Try
conn.Open()
Dim myDatareader As SqlDataReader
myDatareader = cmd.ExecuteReader(CommandBehavior.CloseConnection)
Do While (myDatareader.Read())
Response.ContentType = myDatareader.Item("PersonImageType")
Response.BinaryWrite(myDatareader.Item("PersonImage"))
Loop
conn.Close()
Response.Write("Picture info succesfully retrieved")
Catch SQLexc As SqlException
Response.Write("Read failed, Reason: " & SQLexc.ToString())
End Try
End Sub________________________________________________________________Please can someone explain this for me or sort this out for me.All help is welcome even if its only points me too a direction.RegardsTombola
When sorting records in a table how do you sort by the occurence of a field.
So, if the table contained: 1,1,2,2,2,3,3 how would you get it to sort using the desc syntax to give 2,2,2,3,3,1,1
When sorting records in a table how do you sort by the occurence of a field.
So, if the table contained: 1,1,2,2,2,3,3 how would you get it to sort using the desc syntax to give 2,2,2,3,3,1,1
Hi,
I am trying to restore .DAT file from dump. Its giving me error ..saying that the sort order id used for dumping was 42 not the default value 52.
How can i change the sort order to 42.
I am using sql server 6.5
Thanks
Srinivas
We have a vendor who insists that sql server 7 be set to a binary sort order. Is there any real advantage to this as opposed to a dictionary sort?
View 2 Replies View RelatedI have a table that most of the data has the same value, but there are only a few that do not match that value. I want to populate a listbox with all values from the table, but I'd like to have the majority listed first, followed by the others (the few that don't matach). What's the best way to approach this with SQL?
Thanks.
HOW CAN I CHECK THE SORT ORDER OF MY OLD SERVER?
HELP
Is there a query you can run against a 7.0 server to return the chracter set and sort order.
View 1 Replies View RelatedI'm trying to setup a duplicate of an old SQL Server 4.2 server to put in place while we upgrade the server, but I can't get the sort-order right. I know the existing server uses sort order id 40, but I can't find which sort-order that corresponds to during the install process. If anyone can give me a system table that lists all the sort orders names and id's, or can tell me what the text name for sort order 40 is, I would be very grateful.
Thanks,
Rob.
Hi All,
I have a table with 3 columns. Product, Location and Value. The data looks like this:
NULL NULL 100
Atlanta NULL 50
Atlanta Cookie1 30
Atlanta Cookie2 20
Dallas NULL 120
Dallas Cookie1 80
Dallas Cookie2 40
This table gets filled with a Groupby with Rollup option. The NULLS show subtotals/total. Is there a way to build a query that returns the results with NULLs at the bottom of each section like:
Atlanta Cookie1 30
Atlanta Cookie2 20
Atlanta NULL 50
Dallas Cookie1 80
Dallas Cookie2 40
Dallas NULL 120
NULL NULL 100
Thanks,
Shab
I need to copy the structure and data of an existing SQL 6.5 server to one with a different sort order. Normally, I would use the transfer tool to accomplish this, but the servers are on different networks. My question is, is BCP the answer? In other words, will the data copied via BCP from the sending server be able to be copied on the recieiving server. Also, is there a way to automatically generate the BCP statements for all tables? What I would really like is to be able to get at the scripts and data files created by the transfer tool.
View 1 Replies View RelatedHow can I set the sort order to 42, nocase when I install sql server 6.5
does Setup gives you some option to check to set sort order ?
For SQL 2000,
Can I assign different sort order on the database level?
thanks!
Hi,
Can any one pls tell me what this sort order id 42 corresponds to and how its different from 52 ?
What options i need to check during installation for sort oreder id. ?
Thanks
Srinivas
Hi All,
I have a table with 3 columns. Product, Location and Value. The data looks like this:
NULL NULL 100
Atlanta NULL 50
Atlanta Cookie1 30
Atlanta Cookie2 20
Dallas NULL 120
Dallas Cookie1 80
Dallas Cookie2 40
This table gets filled with a Groupby with Rollup option. The NULLS show subtotals/total. Is there a way to build a query that returns the results with NULLs at the bottom of each section like:
Atlanta Cookie1 30
Atlanta Cookie2 20
Atlanta NULL 50
Dallas Cookie1 80
Dallas Cookie2 40
Dallas NULL 120
NULL NULL 100
Thanks,
Shab
Trying to get data from result of MDX query in dec order:
iLoopFrom = cst.Axes(1).Positions.Count - 1
iLoopTo = 0
iLoopStep = -1
im using these lines to read the data from bottom to top but what exactly does Position.Count do? and is the -1 there because the result has headers??
Thanks
Hello,
Is there a way to re sort data in a table? based on ID.
The
SORT TABLE tlb_Name ON ID Asc
is not working?
Thank you very much.
Hi There i have two windows 2000 servers which are both running SQL and i would like to restore a backup from one server to the other. Which in my opinion should be an easy task but when i go into the restore option and point it at the file i would like to restore i get the follwoing error
"The database you are attempting to restore was backed up under a different sort order ID (52) than the one you are currently using on this server (50) and at least one of them is a non binary sort order. Backup or restore operation operation terminating abnormally."
The server that i am trying to restore to already has databases on this so i cannot just reinstall SQL and change the sort order not that id know how to do that but this is what i have read.
Is htere anyway that i can put insome script for the database to fix this ???
Im using Enterprise manager with SQL server 7
Thanks in advance