I have a problem with a Merge Join providing no output (when it should have 1890 rows). My Data Flow Task has 4 OLE Data Sources, 3 Multicasts, and 1 OLE Data Destination. I am experiencing the problem near the end of my data flow where two Multicasts create two parallel flows of data (see Level 1 below). I have two Merge Joins which join one leg from each multicast with a leg from the other multicast (see Level 2 below). Then the two remaining legs use a Merge to get my destination output (see Level 3 below).
I am experiencing my problem with the Merge Join (input A2, B2) --> (output C2) transformation. The Merge Join providing output C1 appropriately outputs 1890 rows, but C2 outputs 0 rows. Both Merge Joins are identical. The data is identically sorted prior to entering the problematic Merge Join and a DataViewer (Grid) verified that the data is appropriately entering in. Merge Join (input A2, B2) --> (output C2) has 667 rows as input A2 and 1890 rows as input B2 (using an inner join, just like the other merge join), but C2 baffles me with 0 rows of output (when it too should have 1890). I receive no Ouput errors and the execution completes showing all green.
I read about mysterious behavior with Merge Joins and have attempted modifying my EngineThreads property to values between 2 and 10, with no luck. Any help/ideas would be appreciated.
I created a package that seems to work fine with a small amount of data. When I run the package however with more data (as in production) the merge join output is limites to 9963 rows, no matter if I change the number of input rows.
Situation as follows.
The package has 2 OLE DB Sources, in which SQL-statements have been defined in order to retrieve the data.
The flow of source 1 is: retrieving source data -> trimming (non-key) columns -> sorting on the key-columns.
The flow of source 2 is: retrieving source data -> deriving 2 new columns -> aggregating the data to the level of source 1 -> sorting on the key columns.
Then both flows are merged and other steps are performed.
If I test with just a couple of rows it works fine. But when I change the where-clause in the data source retrieval, so that the number of rows is for instance 15000 or 150000 the number of rows after the merge join is 9963.
When I run the package in debug-mode the step is colored green, nevertheless an error is displayed:
Error: 0xC0047022 at Data Flow Task, DTS.Pipeline: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "Merge Join" (4703) failed with error code 0xC0047020. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
To be honest, a few more errormessages appear, but they don't seem related to this issue. The package stops running after some 6000 rows have been written to the destination.
I am using SSIS in SQL Server Enterprise 2005. I have two OLE DB data sources from two disparate databases (IBM DB2 and Microsoft SQL Server), some columns from each of which are to be included in the merged output results. I have noted the various requirements in the forum postings with regard to sorting the OLE DB sources and specifying the output source columns as being sorted, as well as the requirement that the join fields in the two sources be close/exact matches. Yet, when I run this in VS, while the work area reflects the expected number of rows being input into the Merge Join transformation, no count is reflected as output from that transformation into the final destination table.Specifically, my two data sources (IBM DB2 and MS SQL) are configured as follows:
IBM DB2 contains an SQL statement that uses Cast operations to create the result columns.and an ORDER BY clause to ensure that the output is sorted by the desired two columns.. The OLE DB source property setting for IsSorted is set to true; the Output Columns folder column definitions for "key_ source_dtsy" and "key_source_dtrt" have their SortKeyPosition properties set to 1 and 2, respectively. Those field are both defined as data type DT_STR, with lengths of 4 and 2, respectively. Below is the Path metadata from the Data Flow Path editor from the path from this source:
IBM DB2 source"Name"Â "Data Type"Â "Precision"Â "Scale"Â "Length"Â "Code Page"Â "Sort Key Position"Â "Comparison Flags"Â "Source Component""ID_CODE"Â "DT_STR"Â "0"Â "0"Â "10"Â "1252"Â "0"Â ""Â "Source F0005 User Defined Codes""CODE_DESCR_1"Â "DT_STR"Â "0"Â "0"Â "30"Â "1252"Â "0"Â ""Â "Source F0005 User Defined Codes""CODE_DESCR_2"Â "DT_STR"Â "0"Â "0"Â "30"Â "1252"Â "0"Â ""Â "Source F0005 User Defined Codes""key_source_dtsy"Â "DT_STR"Â "0"Â "0"Â "4"Â "1252"Â "1"Â ""Â "Source F0005 User Defined Codes""key_source_dtrt"Â "DT_STR"Â "0"Â "0"Â "2"Â "1252"Â "2"Â ""Â "Source F0005
User Defined Codes:
MS SQL contains an SQL statement that takes the columns as they are in the MS SQL table (no Cast operations needed); it also uses an ORDER BY clause to ensure the output is sorted by the join columns. The OLE DB source property setting for IsSorted is set to true; the Output Columns folder columns for "key_source_dtsy" and "key_source_dtrt" have their SortKeyPosition properties set to 1 and 2, respectively. Those field are both defined as data type DT_STR, with lengths of 4 and 2, respectively. Below is the Path metadata from the Data Flow Path editor from the path from this source:
MS SQL source"Name"Â "Data Type"Â "Precision"Â "Scale"Â "Length"Â "Code Page"Â "Sort Key Position"Â "Comparison Flags"Â "Source Component""id_code_name"Â "DT_I2"Â "0"Â "0"Â "0"Â "0"Â "0"Â ""Â "Source CodeName in db dwVdFY""key_source_dtsy"Â "DT_STR"Â "0"Â "0"Â "4"Â "1252"Â "1"Â ""Â "Source CodeName in db dwVdFY""key_source_dtrt"Â "DT_STR"Â "0"Â "0"Â "2"Â "1252"Â "2"Â ""Â "Source CodeName in db dwVdFY"
The Merge Join transformation specifies an INNER JOIN using the columns named "key_source_dtsy" and "key_source_dtrt" from the respective data sources.I know there are alternative ways of accomplishing my intent (Lookup, port MS SQL table to IBM DB2 so join can occur in SELECT statement, etc.; however, I'd like to use this functionality and assume that it should work.Â
I've run into something that looks like a bug to me but I wanted to run it by the board:
Merge join 2 sorted tables.
Table1: ColumnA : Sort Order 1, ColumnB Sort Order 2
Table2 : ColumnA: Sort Order 1, ColumnB Sort Order 2, ColumnC not sorted
Merge Join the two tables on ColumnA and ColumnB...
Choose the following as output columns
A + B + C = works
C = works
A + C = works
B + C = NOT work.. error message: The column with the SortKeyPosition value of 0 is not valid. It should be 2.
Basically if you choose one or more of the sorted columns in the output at least one of them has to be the column with Sort position 1 or you'll get that error.
Is this a bug or intentional? If you do not have sort column 1 in the output that output could no longer be considered sorted... so perhaps the error is related to that (instead of error I'd expect some warning about the sorting). Interesting that it lets you choose C only becuase that also makes the output unsorted.
I have a sql statement that joins two tables and I get back a few thousand records when I run it in query tool in management studio.
But when I use SSIS merge join to join the two tables my output is 0 records.
I did sort the key column in both tables by setting 'sortkeyposition' property to 1 in advanced editor for output of both tables.
however the merge join returns nothing to my destination tables. Also I am doing a inner join. The task runs without error but returns nothing as well.. any ideas?
I'm doing a data conversion with one of my fields (SUMDWK) from one of the tables that will be used in a merge join. With the new, converted field, I do a look up. From this look up, I want to take a new field FiscalWeekOfYear, and replace the original field, SUMDWK. This is necessary because SUMDWK is one of the sorted fields. In the look up, it is not possible to change the Output Alias. Does anybody know a way around this? Thanks.
In the first image as can be seens i have 2 different data sources and then they are being joined using "Merge Inner Join". The "sort" is on BusinessEntityID column of Person table and "Sort1" is on "PersonID" of Customer table. The merge join of these 2 result in 19,119 rows.
On the other hand, if i use single data source and use a query with inner join on  tables  used in the first image (ie. 2 tables being used in 2 different data sources) as depicted in second image. Also,  since merge cannot operate without SortKey i have defined TerritoryID as sort key in the advanced editor. The number of rows i get after this is "10,274". My select query was :
SELECT P.BusinessEntityID, P.PersonType, P.Title, P.FirstName, P.MiddleName, P.LastName, P.Suffix, C.TerritoryID FROM stg.Person AS P INNER JOIN stg.Customer AS C ON C.CustomerID = P.BusinessEntityID ORDER BY C.TerritoryID;
According to me, it should have been the same as in first case i am using merge inner join and in second case i am using SELECT query with inner join. Upon drilling down i found that in the first case , my sort keys are BusinessEntityID  and PersonID, if i modify this to CustomerID  and BusinessEntityID as this is my join condition (in ithe inner join query shown above), i get the desired output. What i was wondering was, how  the sort order change the Join Condition?
How to summarise the data in this table to a single row output per site (2 records for every SiteID). I am based in the UK so the data copied from SQL is in the US format. I can convert this to UK date without any issues.
CREATE TABLE [dbo].[MRMReadings]( [SiteIncomeID] [int] IDENTITY(1,1) NOT NULL, [SiteID] [nchar](5) NOT NULL,
[Code] ....    Â
Is it possible to return the data in the following format merging 2 lines of data into one output:
 SiteID  ReadStartDate ReadEndDate    ReadStartIncome    ReadEndIncome L0020    19/05/2015 05:00 20/05/2015 05:00   85.98    145.98 L0101    19/05/2015 22:07   20/05/2015 22:14         1,936.08      1,438.89 L0102    20/05/2015 21:16   19/05/2015 21:48  143.65 243.5
I need to transform Foxpro table to SQL Server table with merging all rows into one where all column values are the same except one . For this the only column with the different values , I want them also to be merged as coma or space delimited string. The question whether SSIS is a good candidate for this kind of data munging and also would be interested to know knowing as many as possible ways of doing that. Surely I may produce Foxpro script in 5 minutes which wil do that and be a pre-processor action before SSIS starts.
OLEDB source 1 SELECT ... ,[MANUAL DCD ID] <-- this column set to sort order = 1 ... FROM [dbo].[XLSDCI] ORDER BY [MANUAL DCD ID] ASC
OLEDB source 2 SELECT ... ,[Bo Tkt Num] <-- this column set to sort order = 1 ... FROM ....[dbo].[FFFenics] ORDER BY [Bo Tkt Num] ASC
These two tasks are followed immediately by a MERGE JOIN
All columns in source1 are ticked, all column in source2 are ticked, join key is shown above. join type is left outer join (source 1 -> source 2)
result of source1 (..dcd column) ... 4-400-8000119 4-400-8000120 4-400-8000121 4-400-8000122 <--row not joining 4-400-8000123 4-400-8000124 ...
result of source2 (..tkt num column) ... 4-400-1000118 4-400-1000119 4-400-1000120 4-400-1000121 4-400-1000122 <--row not joining 4-400-1000123 4-400-1000124 4-400-1000125 ...
All other rows are joining as expected. Why is it failing for this one row?
I have a merge join (full outer join) task in a data flow. The left input comes from a flat file source and then a script transformation which does some custom grouping. The right input comes from an oledb source. The script transformation output is asynchronous (SynchronousInputID=0). The left input has many more rows (200,000+) than the right input (2,500). I run it from VS 2005 by right-click/execute on the data flow task. The merge join remains yellow and the task never finishes. I do see a row count above the flat file destination that reaches a certain number and seems to get stuck there. When I test with a smaller file on the left it works OK. Any suggestions?
Is there a way to do a super-table join ie two table join with no matching criteria? I am pulling in a sheet from XL and joining to a table in SQLServer. The join should read something like €œfor every row in the sheet I need that row and a code from a table. 100 rows in the sheet merged with 10 codes from the table = 1000 result rows.
This is the simple sql (no join on the tables):
select 1.code, 2.rowdetail from tblcodes 1, tblelements 2
I read that merge joins work a lot faster than hash joins. How would you convert a hash join into a merge join? (Referring to output on Execution Plan diagrams.) THANKS
We have two queries that run nightly and we'd like to combine them and only have one result set instead of two. What's the best way to combine these? The only difference is the Table the information is being pulled from.
Query 1:
set nocount on select case when datalength(MICRACCTNUMBER) = 4 then convert(char(20),('001 000000000000'+MICRACCTNUMBER)) when datalength(MICRACCTNUMBER) = 5 then convert(char(20),('001 00000000000'+MICRACCTNUMBER)) when datalength(MICRACCTNUMBER) = 6 then convert(char(20),('001 0000000000'+MICRACCTNUMBER))
[code].....
Again, the only difference is the Table the info is coming from...
I used the MERGE function for the first time. Now I have to create a pipe-delimited delta file for a 3rd party client of any deltas that may exist in our database.
What is the best way to do this? I have OUTPUT to a result set of the deltas...but I have to send over the entire table to the 3rd party via a pipe-delimited file.
Does anyone know how i can go about merging preexisting pdf files and SQL server reporting services output. Can this be done in reporting services? For example, I have 5 pages from a pdf files which is created from another 3rd party software provider. I then i have output from sql reporting services. How can i merge these two outputs and deliver it over .Net/ ASP framework?
I have read a few other posts about how to merge multiple report output files into a single document e.g. a single pdf. There are a few approaches: 1) Generate post script files and then merge and oprint via a post script driver. 2) Generate seperate pdf files and then merge them into single document using a custom library.
I have a third idea and would appreciate any input:
Dynamically generate a RDL file that contains sub-reports, one for each report required in the final document, publish and run this as the final report. This could happen way before actually running the report i.e. the user has a tool where they select reports for a pack, tool then generates new RDL file and publishes it to sql reporting services, gets run at some later point in time.
Some challenges: -Generate a table of contents with page numbering? -Layout of sub-reports, not sure how they would be rendered across multiple pages? -Managing parametes across sub-reports at run time.
I have got a query in which a merge join is 99% of the cost .... and I am confused ... is not merge join supposed to be the fastest ??? Anyone seen this before ???
Any ideas why this could be happening ... and sorry ... do not ask me to post the code coz I will not be able to ...
I need to use Merge Join transformation to join two sources. One is from a PIVOT transformation and one of the output columns is ISSORTED, the other is from an OLE DB Source using a query. The Merge Join transformation requires both input source have to be sorted. I cannot find the ISSORTED property on the OLE DB Source!! I tried to use Derived/ copy transformations but cannot find the property also. How can set the OLE query sorted in order to use the MergeJoin?
I have a package where I use merge join for two sorted inputs and the output is stored in a raw file.
In another package, the raw file from above package is again merge joined with another sorted input. Now my question is....do we need to sort again the raw file from first package? or is it OK to set the isSorted property to True and define the sort keys?
I am new to this SSIS. I have a simple join query like this select a.id from tbl_a a, tbl_b b where a.id = b.id and I want insert the result to my temp table. the query results is 1500 rows. but when I use merge join in SSIS, it only inserts to my temp table 4 rows. I use inner join and I already set the IsSorted to true and specify the sort position for the columns in both source tables In tbl_a, there are one million rows, in tbl_b, there are 2000 rows. I don't know why the merge join cannot work out my task.Is there other way that I can just run this simple join query in SSIS to copy the data? Please help, thanks in advance.
Hello, I have a Merge Join transformation and when i sort values in OLEDB source the merge join fails, but if i use a sort transformation it works! Why?? Best regards, Fred
I'm trying to use the Merge component. When i attach a datasource to the the component, the Select Input/Output dialog box should popup.. It does, but VS.NET is hanging and i can only shutdown the procesess...
Any idea how i should solve this? how can i re-register this component?
Hi guys! I'm trying to figure out how to join 3 tables, but I can't seem to find a solution. What I want to do is to put table 1, table 2 and table 3 into table_merged. table_merged = table 1 + table 2 + table 3 Is it possible to merge tables even if they have different fields?
i'm merge joining 2 data sources, one is oracle and the other is excel...the problem is in the oracle source, it's a sql statement like:
select hdr.div_ord_no, hdr.mtr_no, hdr.prod_cd from qctrl_div_ord_header hdr, (select max(sub.eff_dt_from) min_eff_dt_from, div_ord_no from qctrl_div_ord_header sub group by div_ord_no ) tmp where hdr.eff_dt_from = tmp.min_eff_dt_from and hdr.div_ord_no = tmp.div_ord_no
having that sql statement, merging will come out with 0 rows
however, having a simple query like:
select hdr.div_ord_no, hdr.mtr_no, hdr.prod_cd from qctrl_div_ord_header hdr
merging will come out with 2 rows
you may think that the data in the first sql statement is not there for the merge, which causing the 0 rows, however, the data is there, i'm only joining by one column and definitely the data is there, the merge result should be 2 rows for both query statements
i believe this is a problem with SSIS, anyway around this?
I am working on an ssis package and i find an problem while using the merge join for merging 2 OLEDB Data sources .
data source 1 is : - The table formed my an sal server comand , that out put is given to a multicast since i want to sare that output amoung 2o other tables.
So the the left input for the merge join is OLEDB source , which contains direct data from source table
I am usong Inner join on one column
The problem is i am not getting the expected rows as out put of merge .
I tried to join the two tables in sqlserver query window and i am getting expected result
What could be the problem
The first table is
Reservations.ReservationManual
second table is Out put of the following query
Select Distinct B.ReservationID as R
from Property.Main A ,Reservations.Reservations B ,Reservations.ReservationRooms C
Where
A.propertyID = B.PropertyId And
C.ReservationID = B.ReservationID And
getdate() >=C.Until +A.ReservationOffLineDays
i am not getting the expected result here in SSIS package merge join
But if i try to execute the following in query editer in management studio i am getting the expected result !!
declare @temp as table
(ResID Varchar(50)
)
Insert into @temp
(ResID)
Select Distinct B.ReservationID as R
from Property.Main A ,Reservations.Reservations B ,Reservations.ReservationRooms C
Where
A.propertyID = B.PropertyId And
C.ReservationID = B.ReservationID And
getdate() >=C.Until +A.ReservationOffLineDays
select * from Reservations.ReservationManual A , @temp b
I am trying to normalize data using the unpivot transform. I have to unpivot using more than one key so I have a multicast feeding into two unpivot transforms then into a sort transform. This is where my problem starts - I have tried using a Merge Join (inner Join) transform but dont get the expected result.
My original data looks like this:
Pk_ID
Choice1
Choice2
Feedback1
Feedback2
10
a
b
x
y
After the mulitcast - unpivot - Merge Join, the expected result is: (pk_newID is an identity)
Pk_newID
fk_ID
Choice
Feedback
563
10
a
x
564
10
b
y
However with a Merge-Join (inner join on pk_ID) I get