I read another post. I'm hoping I'm just doing something wrong, and that the SSIS team wouldn't have done this:
I read in a batch of records that should cause changes in an SCD in some cases. The table is empty originally. In the wizard, its handling changing attibutes ,fixed and historical.
3 records in ( all with the same business key) = 3 records out?
In this case, it should only produce 1 record.
Are you kidding me? There must be a way to set the batch size to 1, right?
I've created a store dimension. Within the store dimension is a CityID column that is related to a City dimension. Currently this city dimension has the following columns:
CityID (primary key --- foreign key for the StoreDim) CityName CountyName StateName CountryName ...
My SCD's source is being loaded from a query to a Zip table that contains ZipCodes along with cityID, CountyID, StateID, and CountryID that can be joined out to their respective tables to find the information needed to populate my SCD. However, cities can have several number of zipcodes and several cities in different states can have the same names. This causes a problem with eliminating duplicates for my dimension as well as finding a suitable business key.
For example, let's take a case: Washington city, District of Columbia (county), District of Columbia (state), USA (Country) Now, this city has SEVERAL zip codes so when I create my joins:
SELECT ... FROM Zip Z INNER JOIN City C ON C.CityID = Z.CityID INNER JOIN County Cty ON Cty.CountyID = Z.CountyID
So this returns several CityIDs for the same city... which when I try to use for my SCD causes problems because i have several business keys based off my CityIDs so my dimension loaded would look like
CityID CityDesc CountyDesc StateDesc CountryDesc 1 Washington District of Columbia District of Columbia USA 2 Washington District of Columbia District of Columbia USA 3 Washington District of Columbia District of Columbia USA 4 Washington District of Columbia District of Columbia USA
I want to have just ONE record in my dimension for this city.... How would I do this and what would I use for a business key? Also any recommendations on which of these attributes should be historical attributes? I would appreciate any input you guys have.
I have to update a field within a table of 60 records or so. Each record has a different field value. it's type varchar. i was given an excel file with the field values and was thinking of a bulk update like bulk insert, but i don't recall that it's possible that way.
Is the only way to create a table, bulk insert, then merge the two tables together with UPDATE?
Just wanted to see if there was an easier way to do it, otherwise i'll take the latter route. Thanks!
I have a table containing 8 million records. I need to replace 2 million of these records with a scaled down query that goes something like: SELECT 1, ShareholderID, Assets1 FROM MyTable (Yields appx. 200,000 recods) SELECT 2, ShareholderID, Assets2 FROM MyTable (Yields appx. 200,000 recods) . . . SELECT 10, ShareholderID, Assets1 + Assest2 + Assets3 + ... + Assets9 FROM MyTable (Yields appx. 200,000 recods)
Updates and cursors just seem to be too slow.
So far I have done the following, but was wondering if anyone could think of a better way. SELECT 6 million records that don't need to be deleted into a #TempTable Use statements above to select into same #TempTable DROP and recreate Original Table SELECT 6 + 2 million records INTO original table.
This seems rather convoluted. Is there a better approach? Would it be worth while to dump data to a file and use bcp / Bulk Insert
I'm trying to use Bulk insert for the first time and getting the following error. I think it might have something to do with my Format File and from the error msg there's a conversion error for the first column. In my database the Field is nvarchar(6) so my best guess is to use SQLNChar for the first column. I've checked the end of each line is CR LF therefore the is correct for line 7 right?
Msg 4863, Level 16, State 1, Line 1 Bulk load data conversion error (truncation) for row 1, column 1 (ASXCode). Msg 7399, Level 16, State 1, Line 1 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
BULK INSERTtbl_ASX_Data_temp FROM 'M:DataASXImportTest.txt' WITH (FORMATFILE='M:DataASXSQLFormatImport.Fmt')
Before implementing memory based bulk copy insert with IRowsetFastLoad interface of SQL Server 2005 OLE DB provider, I want to know some considerations.
- performance : compared with T-SQL's "BULK INSERT ..." and bcp utility
- SQL Server's resource usage : when running memory based bulk copy, server resource's influence
- server side action(behavior) : when server is busy, delayed-update means IRowsetFastLoad::Commit(true) method can insert right after?
- row-count : The rowcount limitation can be inserted by IRowsetFastLoad::InsertRow() method before IRowsetFastLoad::Commit
I'm just learning SSIS and I've hit my first bump. I am doing a bulk import from a tab delimited text file to an empty sql table that has a Idendity column defined. How do I tell the bulk insert task to skip that column when inserting from the text file. If I remove the identity column it imports the data fine, but I want to create the indentity column in the table too.
Hi~, I have 3 questions about memory based bulk copy.
1. What is the limitation count of IRowsetFastLoad::InsertRow() method before IRowsetFastLoad::Commit(true)? For example, how much insert row at below sample?(the max value of nCount) for(i=0 ; i<nCount ; i++) { pIFastLoad->InsertRow(hAccessor, (void*)(&BulkData)); }
2. In above code sample, isn't there method of inserting prepared array at once directly(BulkData array, not for loop)
3. In OLE DB memory based bulk copy, what is the equivalent of below's T-SQL bulk copy option ? BULK INSERT database_name.schema_name.table_name FROM 'data_file' WITH (ROWS_PER_BATCH = rows_per_batch, TABLOCK);
------------------------------------------------------- My solution is like this. Is it correct?
// CoCreateInstance(...); // Data source // Create session
I have a DTS that imports data from an orcle database into SQL Server. Doesn't the processing mostly occur on the SQL Server, not on the oracle database from which the data is being imported? The oracle database is vendor provieded and they are saying our SQL Server DTS package is killing their server. Any insight is appreciated. Thanks
I've got a process that creates records in my database based on XML input that I've gotten. What I am doing it giving this XML to a stored procedure to handle a specific task, then modify the XML and send it to the next stored procedure.
For instance, the XML could hold header records with detail records, I would first send the XML to a stored procedure that creates the header records, then updates the XML so the XML now knows the identity values of the header records I have just created, and then send the XML to the next stored procedure to create the details for those headers.
All works great and fine, but I have a problem with writing the identity values back to the XML. It seems I can only change one item in the XML at a time and thus need to loop this. For many records this really takes a long time.
Here is some sample code of what I'm doing (please excuse any typos, this is a simplified version of the code) :
declare @lvSeq numeric(15) declare @lvRowNo int declare @lvNumRows int
insert into myHeaderTable ( recid, recdesc ) select ref.value('@recid', 'nvarchar(25)') recid, ref.value('@recdesc', 'nvarchar(250)') recdesc from @pXML.nodes('//headers/header') R(ref)
select @lvRowNo=1, @lvNumRows = @pXML.value('count(//headers/header)', 'int') while (@lvRowNo<=@lvNnumRows) begin select @lvSeq = recseq from myHeaderTable where recid = @pXML.value('//headers/header[position()=sql:variable("@lvRowNo")]/@recid)
set @pXML.modify('replace value of (//headers/header[position()=sql:variable("@lvRowNo")]/@recseq with sql:variable("@lvSeq")')
select @lvRowNo=@lvRowNo+1 end
Obviously I am looking for a better way to update the XML with the sequences. The insert takes a second, the loop takes minutes with large XML sets. I guess MSSQL is searching the whole XML to find the item to update.
It would be nice if I didn't have to loop through the XML. One solution I was thinking off is to store the XML in a temporary table with a single record per header item. Then I could do the modify in one go and recreate the XML by simply selecting the contents of the temporary tabel. I have no idea if this is possible.
So something like this:
select ref.value('@recid','nvarchar(25)') recid, ref.value('.','XML') XMLData -- this gives an error into #TMP_XML from @pXML.nodes('//headers/header') R(ref)
insert into myHeaderTable ( recid, recdesc ) select recid, ref.value('@recdesc', 'nvarchar(250)') recdesc from #TMP_XML CROSS APPLY XMLData.nodes('/header') R(ref)
update #TMP_XML set XMLData.modify('replace ....') from myheadertable where #TMP_XML.recid = myheadertable.recid
Hello friends, I needed a suggestion, I am currently working on a reporting website that generates reports and i need to store all the reports in the database.
I usually go by row wise processing as it can be easily controlled but the problem is there will be a lot of reports, that is an estimation of 30,000 rows in a month and i m not sure if sql server can hold more than 2 billion rows per table.
I will just explain whole scenario what I m facing in tricky problem..
We have xml files coming at regular interval by some other source into sql server 2000…daily having records near @10000 to 70000…we have job scheduled to run it regular interval…we doing this by some filter criteria… suppose the flow is like staging table into secondary table and then final into primary table…. We design DTS package accordingly means take the records from staging table put into secondary table and then into primary table…(near @ 8 task involved in it…) Suppose xml file came at 8:30 am and our DTS package will run at 9:00 am…and then 11:00 am and the 1:00 pm like that….what I observing from many days is that after running job at 9:00 am successfully some good data still pending in secondary table not processed into primary table. But when again job ran at 11:00 am it processed that pending good records into primary table…some times when I ran this job manually through DTS design level the good data that pending in secondary table processed!!!
My question is that why this job not processed all the good records in single shot????
Hello everyone. I need help regarding the following:Given the following table:CREATE TABLE T1 (C1 nvarchar(10), C2 money)INSERT INTO T1 VALUES ('A',1)INSERT INTO T1 VALUES ('B',2)INSERT INTO T1 VALUES ('C',3)let's say that i have this table in a local server and i want to uploadit to a remote server and in the remote server upload it to a databasethat contains the same table.the uploading part can be done by another application in the remoteserver, but i want i need is a way to transfer the data at the fastestpossible way.what steps do i need to follow?tia,Rey Guerrero
Which component should be used to process dataset as a whole, and not on per row basis? I have need to process dataset conditionally (condition based on dataset), e.g. if a special row is present in dataset than dataset should be processed in a special way. Should I maybe use one Script Transformation to determine if dataset satisfies condition (and store that result into a variable), and then based on that condition (variable value) perform or not processing (using Conditional Split and Script Transformation)?
I am having trouble trying to construct the following process in SSIS/SQL 2005:
1. Grab a set of unprocessed rows (ProcessDT = null) in an 'Action' table 2. For each of these rows, execute multiple stored procedures base on the action type If actiontype = 1, exec spAct1a @param1, @parm2 exec spAct1b @param1, @parm2, @param3, @param4 If actiontype = 2, exec spAct2a @param1, @parm2, @param3 exec spAct2b @param1, @parm2, @param3 etc.... 3. Update ProcessDT so it's not processed again 4. Repeat until all rows are processed
Note - all sp params are contained in additional columns in the Action table. Basically the Action table is a store for post-event processing of sorts but is order dependent, hence the row by row processing. And some of my servers are 2000 so Service Broker is not an option (yet).
I first attempted to do this totally within the control flow - using an ado recordset/foreach loop control, but I could not figure out how to run conditional process paths based on the ActionTypeID. I then tried to do this within the dataflow using on OLEDB data source, a conditional split, and an oledb command control which almost got me there - the problem being for each row I need to execute multiple sp's and it appears as if the oledb command only gives me one sp.
I'm populating a new table based on information in an existing table. The new table is a list of all "items" and contains a primary key. The old table is a database of receipts where items can appear many times in any order.
I have put together the off-the-shelf components to do this, using a lookup transformation to see if the item is already in the new table. Problem is, because there's so much repetition in the old table I need to process the old table one row at a time. Batch processing is generating errors because the lookup doesn't detect duplicates within the buffer.
I tried setting the "DefaultBufferMaxRows" property of the task to 1, but that doesn't seem to have any effect.
To get the data from the old table, I'm using an OLE DB source. To get the data into the new table, I'm using the OLE DB Command transformation with parameters to execute an INSERT statement.
This is a job I have to do exactly once, so I don't care if I have to run it overnight. I'd rather have a simple, easy to understand but inefficient script so I understand what it's doing completely.
Any help on forcing SSIS to process one row at a time?
I am verifying my reports processing time. I get the information from the Reporting Service DB - [ExecutionLogs] table. I have the following information:
[TimeEnd] €“ time that reports generation ends.
[TimeStart] - time that reports generation starts.
[TimeDataRetrieval] - amount of time spent running the data sources.
[TimeProcessing] - time spent processing the report.
[TimeRendering] - time spent generating the output format.
If this information is correct the following statement should be true:
When using the AS processing task with a connection to "an Analysis Services project in this solution", only some processing options are available for processing dimensions. For instance, it is not possible to select "Process Update". Once I change the connection manager to point to the deployed cube database, I can choose from all the options. Is this by design?
I need help from sql experts for the following problem
DOCTYPE DATE QTY PRD LOT Purchase 1 jan 20+ AA 2007FW Purchase 4 jan 50+ AA 2007SS Purchase 9 jan 10+ AA 2007FW Sale 3 jan 10- AA Sale 4 jan 20- AA Returned Good 4 feb 10 AA
As you can see I don't have the LOT code in sales records, so I must update these records with the following logic:
I have to assign LOT code to sales in FIFO order: from the table above the 3rd of January I find the first sale of 10, take the first LOT (ascending date order), check for qty on hand (20+), consume 10 from LOT (10 remaining), update sale record with 2007FW LOT code.
Then I find next sale of 20, as before I take the first LOT with qty on hand to consume, again it's the first record with only 10 remaining so I set LOT qty on hand to zero but I have 10 more to allocate to a LOT code. So I find next available LOT of 50 and consume 10, with 40 remaining.
It's important to remember that I can sell part of a "lot", have to track the remaining goods per LOT
Is it possible in SQL to do that in batch mode? How? If I have to split sale record consuming 2 or more LOTS, how can I do? Can you show me SQL or good hints?
I have 3 cubes in a single SSAS database and these cubes should be processed using the following schedule
Cube 1 - Every Day Cube 2 - Every Week Cube 3 - Every Month Cube 4 - Every Day
The issue that I face is that these cubes share the dimensions and so I cant do a FULL process of these SHARED Dimensions as it will affec other cubes.
I can expect additions and deletions to my dimension data , but the structure remains the same. It would be great if someone can suggest how to go about processing the dimensions. I am confused with the number of options(Process Incremental, Process Update etc.,) available for processing the dimensions.
I will creating a SSIS package to automate the processing. One more question is say, if Cube 2 fails during a day and Cube 1 has succesfully processed on the same day earlier, how do I revert back to the old state of Cube 2? Does this mean that I need to do a back up of the SSAS database before processing each cube?
I am in the midst of designing a new Data Warehouse system. As we get further into the design of the system, the more we are realising how complex our ETL is going to be and that the amount of time it will take to run could be significant i.e. a few days! My question is obviously I don't want to have a down time in my relational system for this long and prevent my users from accessing the data for days at a time each month. So what functionality should I be looking at to allow me to maintain a working copy of the data that users can query whilst I perform database updates and then perform a quick promotion of the updated data to users for querying?
If you can point me in the direction of the right functionality in SQL Server 2005 and possible some relevant white papers that cover this sort of scenario I would be grateful.
I receive the following error message when I try to use the Bulk Insert Task to load BCP data into a table:
Error: 0xC002F304 at Bulk Insert Task, Bulk Insert Task: An error occurred with the following error message: "Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.The bulk load failed. The column is too long in the data file for row 1, column 4. Verify that the field terminator and row terminator are specified correctly.Bulk load data conversion error (overflow) for row 1, column 1 (rowno).".
Task failed: Bulk Insert Task
In SSMS I am able to issue the following command and the data loads into a TableName table with no error messages: BULK INSERT TableName FROM 'C:DataDbTableName.bcp' WITH (DATAFILETYPE='widenative');
What configuration is required for the Bulk Insert Task in SSIS to make the data load? BTW - the TableName.bcp file is bulk copy file as bcp widenative data type. The properties of the Bulk Insert Task are the following: DataFileType: DTSBulkInsert_DataFileType_WideNative RowTerminator: {CR}{LF}
Any help getting the bcp file to load would be appreciated. Let me know if you require any other information, thanks for all your help. Paul
Stored procedure works: PROCEDURE dbo.VerifyZipCode@CZip nvarchar(5),@CZipVerified nvarchar(5) outputAS IF (EXISTS(SELECT ZIPCode FROM ZipCensus11202006 WHERE ZIPCode = @CZip)) SET @CZipVerified = 'Yes' ELSE SET @CZipVerified = 'Not Valid Zip'RETURN Need help calling and processing sp information:
protected void C_Search_btn_Click(object sender, EventArgs e){ SqlConnection con = new SqlConnection(ConfigurationManager.ConnectionStrings["localhomeexpoConnectionString2"].ConnectionString); SqlCommand cmd = new SqlCommand("VerifyZipCode", con); cmd.CommandType = CommandType.StoredProcedure; cmd.Parameters.AddWithValue("@CZip", UserZipCode_tbx.Text); . . . how do you set up output parameter, call the SP and capture output. . . ? if (@CZipVerified == "Not Valid Zip") { TextBox5.Text = "Zip code not valid please re-enter zip"; } else { Continue processing page } }
How would I go about checking incoming e-mails? For example, on a certain e-mail address, I would get e-mails formatted in a certain way. According to the response, some scripts need to run/ some sql tables updates etc. How can one do this in (ASP) .NET with SQL Server? Anyone did this kind of stuff before?
When I process a new cube I recieve an error "Error (211): Unknown dimension member ' 8'; Time: 8/11/00 1:09:45 PM". Any ideas about what this error is about ?
Is there anybody out there that can help me on how can I know the processing time taken for one transaction by using SQL Analyzer??
1)For example, I want to update using Analyzer and I would like to know time taken to do this update???
2) How to reduce processing time by using Store Procedures that using cursor?? I have add in some commit statement in my update statement.. Is there any other ways??
I've noticed lately that my queries through ADO/VB are taking alot longer to process at certain times. The query and the result information never changes, only that at certain times the query takes alot longer than usual. I thought that I possibly need more licenses, or it might be network traffic. I currently use MS SQL Server 2000 small business Ed(5 Cals).
Has anyone any information about performance problems due to licensing issues?