Problem: Long Running Validation When Pointing OLE DB Source To A View
Jan 24, 2007
One of our developers has written a view that executes completely (returns ~38,000 rows) in approximately 1 minute from SQL Server Management Studio (results start arriving at 20 seconds and finish by 1:10 consistently).
However, if he adds a Data Flow task in SSIS, adds an OLE DB Source, sets the data access mode to "Table or view" and then selects the same view, it consistently takes over 30 minutes (at which point we've been killing it). I can see the activity in Activity Monitor: it is running a SELECT * against that view and is runnable the whole time.
If we modify the view to SELECT TOP 10, it returns in a short time.
Has anyone run into this problem? Any suggestions? It is very problematic, because whenever the views change we have to hack around this problem.
Thanks for any responses.
Jeff
Aug 30, 2006
I have a table that contains approx 200 thousand records that I need to run validations on. Here's my stored proc:
[code]
CREATE PROCEDURE [dbo].[uspValidateLoadLeads]
@sQuotes char(1) = null, @sProjectId varchar(10) = null, @sErrorText varchar(1000) out
AS BEGIN
DECLARE @ProcName sysname, @Error int, @RC int, @lErrorCode bigint
DECLARE @SQL varchar(8000)
--1. Copy the raw phone value into sPhone
IF @sQuotes = '0'
BEGIN
UPDATE dbo.prProjectDiallingList_staging
SET sPhone = RTrim(LTrim(Convert(varchar(30), Convert(numeric(20, 1), phone))))
END
ELSE
BEGIN
UPDATE dbo.prProjectDiallingList_staging
SET sPhone = phone
END
--2. Remove quotes
UPDATE dbo.prProjectDiallingList_staging
SET sphone = REPLACE(sphone,'"' , '')
--3. Remove decimal, comma, dashes, parenthesis
UPDATE dbo.prProjectDiallingList_staging
SET sphone = replace(replace(replace(replace(replace(replace(sphone,'.',''),',','' ),'-',''), ' ',''), '(', ''), ')', '')
--4. Update failed Validation column if not 10 digits
UPDATE dbo.prProjectDiallingList_staging
SET sFailedValidation = 'X'
WHERE(Len(RTrim(LTrim(sPhone))) <> 10)
--5. Dedup
UPDATE a
SET a.sFailedValidation = 'X'
FROM dbo.prProjectDiallingList_staging a (nolock)
INNER JOIN dbo.prProjectDiallingList_staging b
ON a.sPhone= b.sPhone
WHERE(a.iList_StagingID > b.iList_StagingID)
--6. Update failed Validation column if not numeric
UPDATE dbo.prProjectDiallingList_staging
SET sFailedValidation = 'X'
WHERE(IsNumeric(RTrim(LTrim(sPhone))) = 0)
--7. Update time zones
UPDATE s
SET s.sTimeZone =z.sTimeZone
FROM dbo.prProjectDiallingList_staging s (nolock)
LEFT OUTER JOIN dbo.prPhoneTimeZone z
ON left(rtrim(ltrim(s.sphone)),3) = z.sPhoneAreaCode
--8. Insert into dialing table only records that have not failed the validation
INSERT dbo.prProjectDiallingList(iPrProjectId, sPhoneNumber, sTimeZone)
SELECT @sProjectId,sPhone, sTimeZone
FROM dbo.prProjectDiallingList_staging
WHERE ISNULL(sFailedValidation,'1') = '1'
UPDATE d
SET d.bProcessReporting = 1
FROM dbo.prProjectDialling d
WHERE d.iPrProjectId = @sProjectId
END
[/code]
When I execute this stored proc it runs for more than 5 minutes. Is there anything I can do to speed it up? Maybe there is a faster way of writing these queries?
Thanks,
Ninel
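One possible direction, offered as a sketch rather than a tested rewrite: much of the cost here is probably the number of full passes over the staging table, since each UPDATE touches all of the roughly 200 thousand rows. Steps 2, 3, 4 and 6 can be folded into one pass, and step 5 can use ROW_NUMBER() instead of a self-join. This assumes SQL Server 2005 or later (for CROSS APPLY and ROW_NUMBER), and it swaps IsNumeric for a stricter digits-only LIKE test, which is a deliberate change in behaviour. Table and column names are taken from the procedure above.

[code]
-- Steps 2, 3, 4 and 6 in a single pass (sketch; run after step 1 has filled sPhone).
UPDATE s
SET sPhone = c.CleanPhone,
    sFailedValidation =
        CASE
            WHEN c.CleanPhone IS NULL            -- IsNumeric(NULL) = 0 in the original
              OR LEN(c.CleanPhone) <> 10         -- step 4: must be exactly 10 characters
              OR c.CleanPhone LIKE '%[^0-9]%'    -- step 6, stricter than IsNumeric
            THEN 'X'
            ELSE s.sFailedValidation
        END
FROM dbo.prProjectDiallingList_staging AS s
CROSS APPLY (
    -- Strip quotes, dots, commas, dashes, spaces and parentheses in one expression.
    SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
               s.sPhone, '"', ''), '.', ''), ',', ''), '-', ''), ' ', ''), '(', ''), ')', '')
           AS CleanPhone
) AS c;

-- Step 5: flag every row after the first (lowest iList_StagingID) for each phone number.
WITH ranked AS (
    SELECT sFailedValidation,
           ROW_NUMBER() OVER (PARTITION BY sPhone ORDER BY iList_StagingID) AS rn
    FROM dbo.prProjectDiallingList_staging
)
UPDATE ranked
SET sFailedValidation = 'X'
WHERE rn > 1;
[/code]

An index on sPhone may also help step 5 and the time zone join. It is worth comparing execution plans and timings before and after on a copy of the data, because the gain depends on the table's existing indexes.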
Feb 18, 2008
Hi,
I have an Integration Services project which creates a flat file report from Analysis Services. I'm using an OLE DB source and running an OPENQUERY in the SQL statement.
The problem is that Integration Services runs the query twice before getting the data into the flat file. I know this because the query runs two times in Profiler, and because the same query takes half the time when run in Management Studio.
Integration Services is running the whole query when validating. How can I disable this validation or, better, make it validate properly?
Thanks
Jan 3, 2007
Hi,
I'm trying to get data from an AS400 using an OLE DB Source as my connection. I'm using the IBM OLE DB provider for iSeries and working on the Standard edition of SQL Server 2005.
In the OLE DB Source, when I set the access mode to 'Table or view' and try to see the list of available libraryname.tablenames, I do not get any tables,
whereas when I use the data access mode 'SQL Command' I can get data (I can only see a preview of the data) from the AS400 but am not able to insert it into my destination table. At run time the task fails with the error mentioned below.
I have also configured the Data Links tab inside the OLE DB connection manager, but when I tried to set a default library it gave me an error: "Error code: CWBZZ5042" (catalog is invalid), although the catalog does exist.
Are there some settings that need to be made on the AS400 side or the SQL Server side to view the available libraries and their tables?
Can someone help me with this?
Thanks in advance
Shah
Error Message received when executed with SQL command:
Error: 0xC0202009 at Data Flow Task, OLE DB Source [1]: An OLE DB error has occurred. Error code: 0x80040E00.
Error: 0xC0047038 at Data Flow Task, DTS.Pipeline: The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC0202009. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "SourceThread0" has exited with error code 0xC0047038.
Error: 0xC0047039 at Data Flow Task, DTS.Pipeline: Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "WorkThread0" has exited with error code 0xC0047039.
Task failed: Data Flow Task
Apr 3, 2007
Thanks to anyone who can help.
I am trying to transfer some tables' data from one database server to another. I created a package in SSIS, and I use a variable to pass each table name. In the Data Flow, I use an OLE DB Source, but I cannot set the data access mode to 'Table name or view name variable'. Every time, I get the following error info: "===================================
Error at Data Flow Task [OLE DB Source [31]]: A destination table name has not been provided.
(Microsoft Visual Studio)
===================================
Exception from HRESULT: 0xC0202042 (Microsoft.SqlServer.DTSPipelineWrap)
------------------------------
Program Location:
at Microsoft.SqlServer.Dts.Pipeline.Wrapper.CManagedComponentWrapperClass.ReinitializeMetaData()
at Microsoft.DataTransformationServices.DataFlowUI.DataFlowComponentUI.ReinitializeMetadata()
at Microsoft.DataTransformationServices.DataFlowUI.DataFlowAdapterUI.connectionPage_SaveConnectionAttributes(Object sender, ConnectionAttributesEventArgs args)".
Can someone tell me what the reason is, or give me some examples?
Thanks in advance.
Oct 10, 2006
Hi all,
I get an error when I use an OLE DB Source pointing to a SQL 2000 database and execute a SQL query inside the OLE DB Source. The source feeds an OLE DB Destination, which is a SQL 2005 database.
But I got the error below:
Error at Data Flow Task [OLE DB Destination [245]]: the column firstname cannot be processed because more than one code page (936 and 1252) are specified for it.
Error at Data Flow Task [DTS.Pipeline]: "component "OLE DB destination" (245)" failed validation and returned validation status "VS_ISBROKEN".
Error at Data Flow Task [DTS.Pipeline]: One or more component failed validation.
Error at Data Flow TaSK: There were errors during task validation.
(Microsoft.DataTransformationServices.VsIntegration)
Feb 1, 2008
I would like to know how to disable SSIS validation when a package runs on the server.
SSIS spends about 20% of the run time validating the environment.
Is there any configuration in SSIS to tell it that the environment is stable and no validation is required?
Thanks.
Sep 1, 2006
If I start a long running query running on a background thread is there a way to abort the query so that it does not continue running on SQL server?
The query would be running on SQL Server 2005 from a Windows Forms application using the BackgroundWorker component. So the query would have been started from the BackgroundWorker's DoWork event using ADO.NET. If the user clicks an abort button in the UI, I would want the query to die so that it does not continue to use SQL Server resources.
Is there a way to do this?
Thanks
Apr 19, 2006
I have SSIS Projects taking a long time to open with packages with a large number of data flows. Is there a way to turn off validation of metadata when a package opens? Turn off validation during execution on SSIS Service (after previously validated in dev)? Or be able to control when validation takes place in general?
In one package (1 of 5) I have 43 data flows (each with a single source-to-target mapping) in 4 sequence containers, and it takes approximately 2-3 seconds per source-to-target mapping and sequence container to validate, which translates to 1 ½ to 2 ½ minutes to open. When the project with all 100+ tables for the data warehouse goes through validation, I can make coffee in the time it takes to open the project. I have to delete the *.suo file (or verify all packages are closed in the designer and save the project file), and when I open the project, I have to jump immediately to SSIS > Work Offline to stop it validating the metadata so that I can work in a timely fashion. DelayValidation=TRUE does not help much.
Running in debug mode has the effect of causing packages that were not open and validated to go through validation even though I am not running those packages. I would prefer to validate once during design and run forever.
Even if I re-open a package that I just closed from designer and had gone through validation, it will go through the validation process again.
It would be great if there could be an on-demand option off the menu bar to allow one to control when validation can take place for a project, or a more granular validation option for a specific data flow or container.
Apr 13, 2007
Hi all,
I have an XML Source which seems properly configured on design mode. But when I try to execute the DataFlow I have the following Error Message:
"The Output "AccidentData" (outputID) contains a RowsetID with a value of AccidentData that does not reference a data table in the schema"
"VS_NEEDNEWMETADATA"
I don't understand what this error message means, I have another dataflow which uses the same XML Data file and XSD file which works properly, the difference is that the working data flow does not include all columns.
Any help will be appreciated.
Nov 8, 2007
Dear all,
I am trying to execute a package so that it loads data from the Excel file to the SQL Server database.
When I execute it, it reports the following error message and 1 warning.
The Excel file has three columns: Week, Item and Value.
Error 4 Validation error. Data Flow Task: OLE DB Source [94]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E14. An OLE DB record is available. Source: "Microsoft OLE DB Provider for Oracle" Hresult: 0x80040E37 Description: "ORA-00942: table or view does not exist ". Test - GET NW PERF 1.dtsx 0 0
Warning
Warning 1 Validation warning. Data Flow Task: OLE DB Destination [36]: The external metadata column collection is out of synchronization with the data source columns. The column "DAY" needs to be added to the external metadata column collection. The column "TCH_AVAIL" needs to be added to the external metadata column collection. The column "PDROP" needs to be added to the external metadata column collection. The column "P_HR" needs to be added to the external metadata column collection. The column "SFAIL" needs to be added to the external metadata column collection. The "external metadata column "VALUE" (90)" needs to be removed from the external metadata column collection. The "external metadata column "ITEM" (89)" needs to be removed from the external metadata column collection. Not in use - GET NW STATS.dtsx 0 0
Could someone give me a hand here?
Regards,
Ronald
Dec 13, 2007
I was trying to load data using an SSIS Data Flow Task with an OLE DB Source (the source was a view) into an OLE DB Destination (SQL Server). This view returns 420,591 rows from Query Analyzer in 21 seconds. Row length is 925. When I tried to execute the Data Flow Task from SSIS, I had to stop the process after 30 minutes, because only 2,000 rows had been retrieved. I modified the view to return TOP 440,000 and reran. This time all 420,591 rows were retrieved and written in 22 seconds. Next, I tried TOP 100 PERCENT. Again, only 2,000 rows were returned after 30 minutes. TempDB is on a separate SAN RAID group with 200 GB free, and the databases are on a separate drive with 200 GB free. The server has 13 GB of memory and no other processes were executing.
The only way I could populate the table from SSIS was by using an Execute SQL Task with a hard-coded INSERT into the table selecting data from the view (35 seconds).
Has anyone else experienced this or a similar issue? Does anyone have a solution or explanation?
Dec 10, 2007
I have a package that hangs in the designer after I change the sql statement in a DataReader Source from a 'select' to a 'call stored procedure'. The stored procedure takes 2 date parameters. I use an expression to build the 'call stored proc' statement and the 2 date strings. The data reader source uses an ADO.Net connection manager. The ADO.Net connection manager uses the provider for MySQL (Connector/.Net 5.1) which I installed from MySQL.com (http://dev.mysql.com/downloads/connector/net/5.1.html). Before creating the stored procedure I had been using an expression to build a 'select' statement with two date variables as follows:
select ...
where ads.last_seen >= "" + (DT_STR,10,1252) Year(@[User:: StartDate] ) + "-" + (DT_STR,10,1252) Month(@[User:: StartDate] ) + "-" + (DT_STR,10,1252) Day(@[User:: StartDate] )
+ "" and ads.first_seen <= "" + (DT_STR,10,1252) Year(@[User::EndDate] ) + "-" + (DT_STR,10,1252) Month(@[User::EndDate] ) + "-" + (DT_STR,10,1252) Day(@[User::EndDate] )+ "" group by sm.service_provider_id,lm.location_id,lm.web_sublocation_id;"
The sql for the data reader source is set via the sql command property of the data flow component.
After testing the sql, I created a stored proc from this sql and then changed the expression (using the sql command property of the data flow component) to build the 'call stored proc' statement, like this.
"call usp_SEL_Rollup ("" + (DT_STR,10,1252) Year(@[User:: StartDate] ) + "-" + (DT_STR,10,1252) Month(@[User:: StartDate] ) + "-" + (DT_STR,10,1252) Day(@[User:: StartDate] ) + "","" +(DT_STR,10,1252) Year(@[User::EndDate] ) + "-" + (DT_STR,10,1252) Month(@[User::EndDate] ) + "-" + (DT_STR,10,1252) Day(@[User::EndDate] ) +"");"
then when I tried to switch to the data flow tab, the editor froze, with the status bar saying "validating datareader source". The data flow tab says "Loading...". I don't know how to troubleshoot this. Each time I have tried I have had to kill the application. Any ideas/suggestions?
Thanks,
Al
Jan 15, 2008
I had a package which used a source query in an OLE DB Source to fetch the records, followed by a couple of lookups and then an OLE DB Command that inserted/updated the records in the table using an SP. I changed the source query (in fact the package), removed a lookup, and called a different SP similar to the old one. My problem is that the package, which previously took only minutes to update 50,000 records, now takes more than 2 hours. The number of records it fetches from the source each time is very small: it fetches hardly 500 records at a time compared to nearly 2,500 before. Where am I going wrong? Any suggestion greatly appreciated.
Feb 12, 2008
Hi,
I am running an MDX query in SSIS but I don't know the best way of doing this, performance-wise.
I know I can run the MDX query through an OPENQUERY in the OLE DB Source, and also run it through a DataReader Source, with no OPENQUERY needed.
I know the DataReader is normally slower because of .NET, but in this case the OLE DB Source is running an OPENQUERY against a linked server, which won't be as fast as running regular SQL.
If anyone knows which of these two runs faster in this scenario, I'd appreciate it if you let me know.
May 11, 2007
Hi,
Firstly, I am new to SSIS, so if my approach could be improved then I welcome suggestions.
Scenario: I have a large SSIS package that consolidates / summarizes work week information from several data sources. Currently each data flow task in the control flow calculates the from and to date that is filtered on, for example:
DECLARE @FromDT AS DATETIME
SET @FromDT = CAST(FLOOR( CAST( DATEADD(D, -7, GETDATE()) AS FLOAT ) ) AS DATETIME)
DECLARE @ToDT AS DATETIME
SET @ToDT = CAST(FLOOR( CAST( GETDATE() AS FLOAT ) ) AS DATETIME)
I would like to remove these statements that appear in most steps and replace them with a global variable that is used throughout the package. This statement would only appear once & it would make the package much easier to run after failure etc.
Problem: I am using Data Reader Source with the 'SQLCommand' property specified. It looks like parameters are only supported if an OleDB connection is used?
So I switched to an OleDB connection and no parameters are recognised in the string - a forum search reveals that parameters in sub queries are not always found properly. The solution to this problem appears to be, to set 'Bypass Prepare' to True but this is a property for the Execute SQL task, not the Data Flow Task source.
Questions:
Does the Data Reader Source control from Data Flow Source toolbox section support parameters?
Can anyone suggest a fix to the OleDB Source issue with Parameters?
Is there a better way to solve my problem e.g. Using Execute SQL Task instead of Data Flow tasks etc
Example SQL:
This SQL is an example of the SQL for the OleDB Data Source (within a Data Flow task)
------------------------------
--RADIUS LOGINS
------------------------------
DECLARE @FromDT AS DATETIME
SET @FromDT = CAST(FLOOR( CAST( DATEADD(D, -7, GETDATE()) AS FLOAT ) ) AS DATETIME)
DECLARE @ToDT AS DATETIME
SET @ToDT = CAST(FLOOR( CAST( GETDATE() AS FLOAT ) ) AS DATETIME)
DECLARE @Attempts AS BIGINT
SET @Attempts =
(SELECT COUNT(*)
FROM dbo.Radius_Login_Records
WHERE LoggedAt BETWEEN @FromDT AND @ToDT)
DECLARE @Failures AS BIGINT
SET @Failures =
(SELECT COUNT(*)
FROM dbo.Radius_Login_Records
WHERE LoggedAt BETWEEN @FromDT AND @ToDT
AND Authen_Failure_Code IS NOT NULL)
DECLARE @Successes AS BIGINT
SET @Successes = @Attempts - @Failures
DECLARE @OcaV1Hits AS BIGINT
SET @OcaV1Hits = (SELECT COUNT(DISTINCT LoginName)
FROM dbo.Radius_Login_Records
WHERE LoggedAt BETWEEN
@FromDT AND @ToDT
AND EAPTypeID = 25)
DECLARE @OcaV2Hits AS BIGINT
SET @OcaV2Hits = (SELECT COUNT(DISTINCT LoginName) AS OcaV2Hits
FROM dbo.Radius_Login_Records
WHERE LoggedAt BETWEEN
@FromDT AND @ToDT
AND EAPTypeID = 13)
SELECT
@Attempts AS ConnectionAttempts,
@Failures AS ConnectionFailures,
(CAST(@Successes AS DECIMAL(38,2)) / CAST(@Attempts AS FLOAT) * 100) AS SuccessRate,
@OcaV1Hits AS OcaV1Hits,
@OcaV2Hits AS OcaV2Hits
Please remember, I'm new to SSIS - so be detailed in your response. Thanks for your help!
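On the OLE DB Source question, here is a sketch that may sidestep the parameter-detection problem. The OLE DB Source maps ? markers to package variables through its Parameters dialog, and that mapping tends to be most reliable when the command is a single SELECT with no DECLARE statements or subqueries. The query below folds the five counts from the example SQL above into one pass using conditional aggregation. The two ? markers are assumed to be mapped, in order, to package variables holding the from and to dates (for example User::FromDT and User::ToDT, names chosen here purely for illustration).

[code]
SELECT
    COUNT_BIG(*) AS ConnectionAttempts,
    SUM(CASE WHEN Authen_Failure_Code IS NOT NULL THEN 1 ELSE 0 END) AS ConnectionFailures,
    -- Successes are the rows with no failure code.
    -- NULLIF avoids a divide-by-zero error when no rows fall in the range.
    100.0 * SUM(CASE WHEN Authen_Failure_Code IS NULL THEN 1 ELSE 0 END)
          / NULLIF(COUNT_BIG(*), 0) AS SuccessRate,
    -- The CASE yields NULL for other EAP types, and COUNT(DISTINCT ...) ignores NULLs.
    COUNT(DISTINCT CASE WHEN EAPTypeID = 25 THEN LoginName END) AS OcaV1Hits,
    COUNT(DISTINCT CASE WHEN EAPTypeID = 13 THEN LoginName END) AS OcaV2Hits
FROM dbo.Radius_Login_Records
WHERE LoggedAt BETWEEN ? AND ?;
[/code]

Two differences from the original are worth knowing about: the output column types are not identical (SuccessRate comes back as a numeric rather than a float, for instance), and when no rows match, ConnectionFailures and SuccessRate come back as NULL instead of 0 or an error. As far as I know the DataReader Source has no parameter mapping of its own; the usual route there is a property expression on the Data Flow task for the source's SqlCommand property.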
Jun 21, 2006
I am migrating data from one database to another. I want to supply the source and target database names through globally declared user variables (@sourcename, @targetname).
How can I map the variables in the OLE DB Source? I didn't find any option for that.
Can somebody help ?
Thanks
Kumar
Aug 22, 2007
I have a pretty complex query that aggregates lots of data and inserts multiple rows of that data into a reporting table. When I call this SPROC from SQL Server Management Studio, it executes in under 3 seconds. When I try to execute the same SPROC using .NET's SqlCommand object the query runs indefinitely until the CommandTimeout is reached. Why would this SPROC behave differently with the same inputs, but being called from .NET? Thanks for your help!
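One thing that may be worth ruling out, since it is a common cause of this exact symptom: Management Studio and ADO.NET connect with different SET options (notably ARITHABORT, which Management Studio turns on by default and SqlClient leaves off), so the two callers can end up with different cached plans for the same procedure. The sketch below reproduces the application's setting in a query window. The name dbo.usp_BuildReport and its parameter are placeholders, not taken from the post.

[code]
-- Mimic the ADO.NET default in a Management Studio window.
SET ARITHABORT OFF;

-- Placeholder procedure and parameter; substitute the real call and inputs.
EXEC dbo.usp_BuildReport @ReportDate = '20070822';

-- If the call is now slow here too, a poorly fitting cached plan is a likely suspect.
-- Marking the procedure for recompilation makes the next call build a fresh plan.
EXEC sp_recompile N'dbo.usp_BuildReport';
[/code]

If the recompile helps only temporarily, the underlying issue is probably parameter sniffing, and the procedure itself would need attention.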
Aug 20, 2001
Hi everyone... I'm trying to execute this update statement. It takes an eternity. Any ideas on how to rewrite it or speed it up?
It's a several-step process. Below is everything that I run, one step at a time. The final update statement is what takes so long. It should only affect about 2600 rows out of a potential 9000. That's why I'm confused by the response time.
select d.olddevicename, de.device, d.newdevicename into #temp9
from dns d, devices de
where de.device = d.olddevicename
update #temp9 set device = newdevicename where olddevicename = device
update devices set device = #temp9.device from #temp9, devices where #temp9.device in
(select #temp9.device from #temp9, devices where #temp9.olddevicename = devices.device)
Thank you!
J.
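A likely reason for the slow final statement: its outer FROM lists #temp9 and devices with no condition linking them, so every devices row is paired with every #temp9 row that passes the IN test (a Cartesian product on the order of 2600 x 9000 rows), and the value each row ends up with is arbitrary. A minimal sketch of a direct join, assuming each olddevicename appears at most once in dns:

[code]
-- Update devices straight from dns; no temp table needed.
-- Each devices row is matched to the single dns row that renames it.
UPDATE de
SET de.device = d.newdevicename
FROM devices AS de
INNER JOIN dns AS d
    ON de.device = d.olddevicename;
[/code]

If dns can hold more than one row for the same olddevicename, the result is still ambiguous and the duplicates would need resolving first.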
Jan 10, 2001
Hi all,
I have three scheduled jobs: one runs once a day, one runs once per hour, and another runs every 17 minutes. It is a NetIQ application. I scheduled a SQL Server maintenance job last night, which ran at 2:00 AM and 4:00 AM. This morning, I came into the office and found all my jobs were still running, and they were all blocked by the first 3 jobs. I had to kill all of them. This afternoon, I kicked off one of my many DTS packages, which usually runs for about 40 minutes, but it failed. I tried several times with no luck. I suspected that one of the user tables or one of the stored procedures was corrupted. After I recycled the server, dropped the table and the stored procedure, and recreated them, the package ran fine. The stored procedure involves many updates and inserts.
The question I have is: could killing the unfinished jobs (especially the SQL maintenance job) have caused this problem?
NOTE: the SQL maintenance job does not include backups of the database or the transaction log.
Feb 7, 2003
When I execute the following stored procedure it runs for about a minute.
CREATE PROCEDURE EquipmentListByProduct
(
@iProdTypeId int
)
AS
SET NOCOUNT ON
DECLARE @iError int, @iRows int
SELECT pn.prodTypeId, pn.prodId, pn.prodName
FROM prodNames pn
WHERE pn.prodTypeId = @iProdTypeId
SELECT @iError = @@ERROR, @iRows = @@ROWCOUNT
IF ( @iError <> 0 )
BEGIN
RETURN @iError
END
IF ( @iRows = 0 )
BEGIN
RETURN -1
END
RETURN @iError
GO
The table only has 22 records.
Do I need to index the table? If so how do I do this?
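For reference, creating an index on the filter column looks like the sketch below (the index name is made up). With only 22 rows, though, a missing index is unlikely to explain a one-minute run time, so it may be more productive to check whether the procedure is being blocked by another session while it runs.

[code]
-- Nonclustered index on the column used in the WHERE clause.
CREATE NONCLUSTERED INDEX IX_prodNames_prodTypeId
    ON prodNames (prodTypeId);

-- Run this from a second window while the procedure is executing:
-- a value in the BlkBy column means that session is waiting on another one.
EXEC sp_who2;
[/code]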
Mar 11, 2003
My backups are running 5-6 hours on SQL2000. I'm sure they only used to take 1 hour or so. On another server, backing up the same database (both about 50 gig), the backup only takes 45 min - 1 hour. What can I look at to see why it's taking so long ?
(edit) I'm using a maint plan to backup to disk
Jul 30, 2004
Trying to come up with a way to monitor (without profiler, hopefully with a job and a select statement) a specific sql job that may cause a problem if the duration is too long. It seems that there is an sp called sp_sqlagent_log_jobhistory that shoves a record in sysjobhistory, but only after all the job steps run. Anyone tried this before?
Apr 3, 2008
1. Could anyone tell me in a simple way how to troubleshoot long-running queries?
2. What is the default recovery model?
May 9, 2008
Hello,
I want to find long-running queries.
Can anyone help me?
Thanks
Prashant Hirani
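A minimal sketch of one way to list what is running right now, longest first. It assumes SQL Server 2005 or later, where the dynamic management views exist; on SQL Server 2000 this will not work.

[code]
-- Currently executing requests, ordered by elapsed time.
SELECT r.session_id,
       r.status,
       r.start_time,
       r.total_elapsed_time / 1000 AS elapsed_seconds,  -- column is in milliseconds
       r.blocking_session_id,                           -- non-zero means it is blocked
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID                            -- leave out this query itself
ORDER BY r.total_elapsed_time DESC;
[/code]

This only shows statements in flight at the moment it runs. For queries that have already finished, a Profiler trace filtered on duration is the usual alternative.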
Jun 4, 2008
Hello Gurus
I am using SQL 2005 and one job shows a status of Executing in Job Activity Monitor. How can I check how long this job has been running?
Please advice
Nitin
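One way to read this from msdb on SQL Server 2005, sketched under the assumption that the job was started by the current SQL Server Agent session: sysjobactivity records a start time for each run, and a row with a start time but no stop time is still executing.

[code]
-- Jobs that have started but not yet stopped in the current Agent session.
SELECT j.name AS job_name,
       ja.start_execution_date,
       DATEDIFF(MINUTE, ja.start_execution_date, GETDATE()) AS minutes_running
FROM msdb.dbo.sysjobactivity AS ja
INNER JOIN msdb.dbo.sysjobs AS j
    ON j.job_id = ja.job_id
WHERE ja.start_execution_date IS NOT NULL
  AND ja.stop_execution_date IS NULL
  -- Restrict to the most recent Agent session to skip stale rows from earlier restarts.
  AND ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions);
[/code]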
Jul 20, 2005
I've got a server (SQL 2K, Win2K) where the backups have started running long.
The database is a bit largish -- 150GB or so. Up until last month, the backups were taking on the order of 4 to 5 hours -- depending on the level of activity on the server.
I'm using a T-SQL script in the SQLAgent to run the backups. Native SQL backup to an AIT tape drive.
Now, for no apparent reason, the backups are taking on the order of 24 to 26 hours. The backups complete successfully -- no errors, just taking an outrageously long time to complete. DBCCs check out AOK, no problems with the database.
No changes to the machine. No hardware changes. No software changes. Weird.
Multiple tape media have been tried -- it's not a case of a tape going bad.
We've had no problems with this box for almost 4 years. Now it's gettin' jiggy with us!
Any ideas on where to start with this one?
Thanks in advance.
Jul 6, 2007
Hi. I'm fairly new to SSRS, and very new to this forum. I have a report based on a stored procedure. I've optimized the procedure so that it runs from 2-4 minutes (previously over half an hour). However, when I run the report that calls the sp, it runs forever (well over 45 minutes in some cases), and the users basically give up on it. Any ideas of why this happens and what steps I can take to improve performance?
Thanks,
Marianne
Dec 21, 2006
I need to execute a long-running package (it takes about 16 hours to finish) to load a data warehouse for the first time with all historical data. This package is a master package that executes other packages; I log the start time and the finish time of the package in a table to manage future incremental loads.
I executed the package on the SQL Server where it is saved, but after it had been running for 8 hours, a new package was started automatically. Then two more packages started, one every two hours.
I set MaxConcurrentExecutables = 4. Could this cause this strange behavior?
Can anyone imagine what happened?
Thanks
Cosimo
Jun 2, 2006
Hi,
I'm trying to optimize a long-running (several hours) query. This query is a cross join on two tables. Table 1 has 3 fields: ROWID, Lat and Long. Table 2 has Name, Addr1, Addr2, City, State, Zip, Lat, Long.
Both tables have LatRad (Lat in radians) and LonRad (Lon in radians). The Sin and Cos values of Lat and Lon are calculated and stored to be used in the distance formula.
What I'm trying to do here is find the nearest dealer (Table 2) for each of the Table 1 rows. The SELECT statement takes a long time to execute as there are about 19 million records in table 1 and 1250 rows in table 2. I ran into log issues (filling the transaction log), so I'm currently using table variables and have split the process into 100000 records at a time. I cross join and calculate the distance (@DistValues), then find the minimum distance (@MinDistance) for each row ID, and then the result is inserted into another table (ResultTable).
My code looks like this:
Declare @DistValues table (DataSeqno varchar(10),Lat2 numeric(20,6),Lon2 numeric(20,6),StoreNo varchar(60), Lat1 numeric(20,6),Long1 numeric(20,6),Distance numeric(20,15))
Declare @MinDistance table (DataSeqno varchar(10) primary key,distance numeric(20,15))
Insert into @DistValues
Select DataSeqno,T.Lat Lat2,T.Lon Lon2,S.StoreNo,S.Lat Lat1,S.Long Long1,
distance=3963.1*Case when cast(S.SinLat * T.SinLat + S.CosLat * T.cosLat * cos(T.Lonrad - s.Lonrad) as numeric(20,15)) not between -1.0 and 1.0 then 0.0 else acos(cast(S.SinLat * T.SinLat + S.CosLat * T.cosLat * cos(T.Lonrad - s.Lonrad) as numeric(20,15))) end
from dbo.TopNForProcess T , dbo.Table2 S where Isnull(T.Lat,0) <> 0 and Isnull(T.Lon,0)<> 0
Insert into @MinDistance
Select DataSeqno,Min(distance) From @DistValues Group by DataSeqno
Insert into ResultTable (DataSeqno,Lat2,Lon2,StoreNo,LAt1,Long1,distance)
Select D.DataSeqno, D.Lat2, D.Lon2, D.StoreNo, D.LAt1, D.Long1, M.distance from @DistValues D Inner Join @MinDistance M on D.DataSeqno = M.DataSeqno and D.Distance = M.Distance
I created a View called TopNForProcess which looks like this. This cut down the processing time compared to when I had this as the Subquery.
SELECT TOP (100000) DataSeqno, lat, Lon, LatRad, LonRad, SinLat, cosLat, SinLon, CosLon FROM Table1 WHERE (DataSeqno NOT IN (SELECT DataSeqno FROM dbo.ResultTable)) AND (ISNULL(lat, 0) <> 0) AND (ISNULL(Lon, 0) <> 0)
I have indexes on Table1 (one on Rowid and another on Lat and Lon) and on Table2 (Lat and Long).
Is there any way this can be optimized/improved? This is already in a stored procedure.
Thanks in advance.
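One idea that may cut the work per batch, offered as an untested sketch: the great-circle distance shrinks as the cosine of the central angle grows, so the nearest store is simply the one with the largest cosine value. That lets CROSS APPLY pick one store per row with TOP (1) and call ACOS only once for the winner, instead of materialising every pair in a table variable and scanning it again. Table, view and column names are taken from the code above; SQL Server 2005 or later is assumed.

[code]
INSERT INTO ResultTable (DataSeqno, Lat2, Lon2, StoreNo, Lat1, Long1, distance)
SELECT T.DataSeqno, T.Lat, T.Lon, n.StoreNo, n.Lat, n.Long,
       -- Clamp to [-1, 1] before ACOS to guard against rounding drift.
       3963.1 * ACOS(CASE WHEN n.CosAngle >  1.0 THEN  1.0
                          WHEN n.CosAngle < -1.0 THEN -1.0
                          ELSE n.CosAngle END)
FROM dbo.TopNForProcess AS T
CROSS APPLY (
    -- Largest cosine of the central angle = smallest distance.
    SELECT TOP (1) S.StoreNo, S.Lat, S.Long,
           S.SinLat * T.SinLat
             + S.CosLat * T.cosLat * COS(T.LonRad - S.LonRad) AS CosAngle
    FROM dbo.Table2 AS S
    ORDER BY CosAngle DESC
) AS n;
[/code]

Two behaviours differ from the original: when several stores are tied for nearest, this returns one of them rather than all, and an out-of-range value below -1 is clamped rather than reported as a distance of 0. The number of distance evaluations is unchanged (every row is still compared with all 1250 stores), so any gain comes from avoiding the intermediate table variable and the second pass over it.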
Aug 8, 2006
Hi all,
I am new to SSIS. Does anyone know how to verify the number of records that I load from a CSV file into a SQL database table?
For example, the source file is called product.csv and the target table, in a database named DSS, is named PRODUCT. I load data from the flat file into the table, and then I need a verification step: if the counts for source and target do not match, send an e-mail to me.
Thanks.
Grace
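One common pattern, sketched here with several assumptions: a Row Count transformation in the data flow stores the number of source rows in a package variable, and an Execute SQL Task that runs after the load compares that number with the table. The ? marker is assumed to be mapped to that variable, PRODUCT is assumed to be emptied before each load, and Database Mail is assumed to be configured. The mail profile name and the recipient address are placeholders.

[code]
DECLARE @SourceRows int, @TargetRows int, @Body varchar(200);

-- Mapped to the SSIS variable filled by the Row Count transformation.
SET @SourceRows = ?;

SELECT @TargetRows = COUNT(*) FROM dbo.PRODUCT;

IF @SourceRows <> @TargetRows
BEGIN
    SET @Body = 'product.csv rows: ' + CAST(@SourceRows AS varchar(12))
              + ', PRODUCT rows: ' + CAST(@TargetRows AS varchar(12));

    EXEC msdb.dbo.sp_send_dbmail
         @profile_name = 'ETL Alerts',           -- hypothetical Database Mail profile
         @recipients   = 'grace@example.com',    -- placeholder address
         @subject      = 'PRODUCT load: row count mismatch',
         @body         = @Body;
END
[/code]

If the table is appended to rather than reloaded, the comparison would need to count only the rows added by this run, for example by filtering on a load date column.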
Jan 10, 2012
I am putting the below query in a OLEDB SOURCE through a variable (it is a select statement with a where clause from one date to another).
"select TestRecordtype, request_id from department
where LOAD_TMSTP between
(select max(END_TMSTP) LOG) and
(TO_DATE("+RIGHT("0" + (DT_STR,4,1252)DATEPART( "dd" , @[System::StartTime] ), 2) + "-"
+RIGHT("0" + (DT_STR,4,1252)DATEPART( "mm" , @[System::StartTime] ), 2) + "-"
+RIGHT("0" + (DT_STR,4,1252)DATEPART( "yy" , @[System::StartTime] ), 2) + " "
+RIGHT("0" + (DT_STR,4,1252)DATEPART( "hh" , @[System::StartTime] ), 2) + "."
+RIGHT("0" + (DT_STR,4,1252)DATEPART( "mi" , @[System::StartTime] ), 2) + "."
+RIGHT("0" + (DT_STR,4,1252)DATEPART( "ss" , @[System::StartTime] ), 2) +",'DD-MM-YY HH24.MI.SS')) "
Apr 29, 2008
I am debugging a Data Flow task in my SSIS package. When I run the package in debug mode, one of the OLEDB Data Sources turns red. I have rerouted all Error Output to a flat file, and put a Data Viewer on that path: no rows get sent. When I click the Preview button on this component in Design mode, I see the expected data and get no error messages. The connection does a simple table access...no SQL command. I don't see anything different between this component and other OLEDB sources in the same package that don't trigger any errors. I've tried dropping and re-creating the component with the same results.
What else can I do to debug this?