I have three sources for an IS data flow: a flat file, a DB table, and a table of rejected (bad) output data. Because records come from all three sources, I could end up with duplicate primary keys. Can anyone suggest how to handle the duplicate primary key problem in this situation?
I want to import a data file into a sql table. The table has a primary key but the data could have a duplicate value in the PK column (error in the source data). How can I "trap" for this type of error in SSIS?
Hi All, I'm using BCP to import ASCII text data into a table that already has many records. BCP failed because of a 'Duplicate primary key' error. Is there any way, using BCP, to find out precisely which record's primary key caused the 'violation of inserting duplicate key'? I already used the option -O to write errors to an 'error.log', but it doesn't help much, because the error log contains only the same error message mentioned above and does not tell me exactly which record is at fault, so I cannot pull that duplicate record out of my import data file. TIA and have a great day. David Nguyen.
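One way to narrow this down before running BCP is to scan the data file yourself for key values that occur more than once. Below is a minimal Python sketch, assuming a comma-delimited file with the primary key in the first field. The function name, the delimiter and the column position are assumptions for illustration, not anything BCP provides. It only finds duplicates inside the file; keys that clash with rows already in the table would have to be checked against the table separately.

```python
import csv
import io
from collections import defaultdict

def find_duplicate_keys(lines, key_index=0, delimiter=","):
    """Return {key: [row numbers]} for every key that occurs more than once."""
    seen = defaultdict(list)
    for line_no, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if row:  # skip blank lines
            seen[row[key_index]].append(line_no)
    return {key: nums for key, nums in seen.items() if len(nums) > 1}

# Hypothetical sample data standing in for the real import file.
sample = io.StringIO("101,Ann\n102,Bob\n101,Carl\n103,Dee\n102,Eve\n")
duplicates = find_duplicate_keys(sample)
for key, line_numbers in duplicates.items():
    print(f"key {key} appears on rows {line_numbers}")
```

For a real file, pass an open file object instead of the `io.StringIO` sample.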
I have a Transform Data Task that copies a lot of data from my source system. Unfortunately, I cannot use a DISTINCT in the SQL from the source system, due to a very poor ODBC driver! So, when I am creating my primary key, I am trying to do a lookup on the PK column before I insert the record to see if it exists. If it does, then I skip the row. The lookup references the target database of the task.
The problem I have is that the lookup doesn't find any duplicates loaded from the database. It allows them through and causes the database to throw a primary key error.
Has anyone experienced this, or think they know what I'm doing wrong?
I get the following error when I try to insert a stored procedure in an SQL-server.
Violation of PRIMARY KEY constraint 'PK_Login'. Cannot insert duplicate key in object 'Login'.
My question is whether this is the real problem or a symptom of something else. I find it hard to believe that I am trying to insert duplicate key values: the Login table doesn't contain any rows.
How can I avoid a duplicate primary key error when inserting through a DetailsView where the field is one of the primary key columns? Thanks in advance! stephen
I am putting my problem as an example, as I feel that will be clearer.
Assume my table PEOPLE has 4 columns and 6 rows, with SlNo as the primary key.

SlNo  Name  LastName  birthdate
1     A     B         x          (1st pair: A, B, x)
2     C     B         x
3     D     E         y          (2nd pair: D, E, y)
4     A     E         y
5     A     B         x          (1st pair: A, B, x)
6     D     E         y          (2nd pair: D, E, y)

In this scenario, I need to find the SlNo values of rows that have the same values in the other columns. The output for the above must be: 1 5 0 3 6 0 (the 0 needs to be included in the output to separate the sets).
(a) Is this possible to do in one SELECT statement, and how? (b) If I create another temp table tempPEOPLE, select the distinct row information of the 2nd, 3rd and 4th columns from the PEOPLE table, and then select the SlNo values where the information matches, I am able to get the output 1 5 3 6 without the 0, and I cannot make out the distinct sets in it. How do I find the distinction between the sets?
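The grouping logic being asked for can be sketched outside SQL first: group the rows on the three non-key columns, keep only groups with more than one member, and emit a separator after each group. The Python below is only an illustration of that logic on the sample rows from the question; the function name and the choice of a list as output are assumptions. In SQL the same idea would typically start from a GROUP BY on the three columns with HAVING COUNT(*) > 1.

```python
from collections import defaultdict

# Sample rows from the PEOPLE example: (SlNo, Name, LastName, birthdate).
people = [
    (1, "A", "B", "x"),
    (2, "C", "B", "x"),
    (3, "D", "E", "y"),
    (4, "A", "E", "y"),
    (5, "A", "B", "x"),
    (6, "D", "E", "y"),
]

def duplicate_sets(rows, separator=0):
    """List SlNo values of rows sharing all non-key columns, one set at a time."""
    groups = defaultdict(list)
    for sl_no, *rest in rows:
        groups[tuple(rest)].append(sl_no)
    output = []
    for members in groups.values():
        if len(members) > 1:          # only sets with at least two rows
            output.extend(sorted(members))
            output.append(separator)  # marks the end of one set
    return output

result = duplicate_sets(people)
print(result)  # [1, 5, 0, 3, 6, 0]
```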
When I save this table after making pubid and pubcode the primary key, the following error is displayed...
Unable to create index 'PK_PUBS3'. CREATE UNIQUE INDEX terminated because a duplicate key was found for index ID 1. Most significant primary key is '51'. Could not create constraint. See previous errors. The statement has been terminated.
As I understand it, duplicates are not allowed in the primary key. How could I allow them?
I have one table that stores log messages generated by a web service. I have a second table where I want to store just the distinct messages from the first table. This second table has two columns one for the message and the second for the checksum of the message. The checksum column is the primary key for the table.
My query for populating the second table looks like:

INSERT INTO TransactionMessages ( message, messageHash )
SELECT DISTINCT message, CHECKSUM( message )
FROM Log
WHERE logDate BETWEEN '2008-03-26 00:00:00' AND '2008-03-26 23:59:59'
  AND NOT EXISTS (
    SELECT * FROM TransactionMessages
    WHERE messageHash = CHECKSUM( Log.message )
  )
I run this query once per day to insert the new messages from the day before. It fails when a day has two messages that have the same checksum. In this case I would like to ignore the second message and let the query proceed. I tried creating an instead of insert trigger that only inserted unique primary keys. The trigger looks like:
IF( NOT EXISTS(
    SELECT TM.messageHash
    FROM TransactionMessages TM, inserted I
    WHERE TM.messageHash = I.messageHash ) )
BEGIN
    INSERT INTO TransactionMessages ( messageHash, message )
    SELECT messageHash, message
    FROM inserted
END
That didn't work. I think the issue is that all the rows get committed to the table at the end of the whole query. That means the trigger cannot match the duplicate primary key because the initial row has not been inserted yet.
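One possible way around this is to collapse the rows to one per hash inside the SELECT itself, so the statement never tries to insert two rows with the same key. The sketch below shows the shape of such a query using Python's sqlite3 module rather than SQL Server, so it is only an illustration. `weak_hash` is a hypothetical, deliberately collision-prone stand-in for CHECKSUM(), the date filter is omitted, and picking MIN(message) as the surviving message is an arbitrary choice.

```python
import sqlite3

def weak_hash(text):
    # Hypothetical stand-in for CHECKSUM(); collides on purpose for same-length text.
    return len(text)

conn = sqlite3.connect(":memory:")
conn.create_function("weak_hash", 1, weak_hash)
conn.executescript("""
    CREATE TABLE Log (message TEXT);
    CREATE TABLE TransactionMessages (
        message TEXT,
        messageHash INTEGER PRIMARY KEY
    );
    INSERT INTO Log VALUES ('abc'), ('xyz'), ('hello'), ('abc');
    INSERT INTO TransactionMessages VALUES ('world', 5);
""")

# Group by the hash so each new hash yields exactly one row, and skip
# hashes that are already present in the target table.
conn.execute("""
    INSERT INTO TransactionMessages (message, messageHash)
    SELECT MIN(message), weak_hash(message)
    FROM Log
    WHERE weak_hash(message) NOT IN (SELECT messageHash FROM TransactionMessages)
    GROUP BY weak_hash(message)
""")

rows = conn.execute(
    "SELECT messageHash, message FROM TransactionMessages ORDER BY messageHash"
).fetchall()
print(rows)
```

Here 'abc' and 'xyz' share a hash, so only one of them is inserted, and 'hello' is skipped because its hash already exists.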
We have a SQL Server 6.5 table with a composite primary key that contains a duplicate entry for the key. I wonder how it got in there? Now, when we try to import this table into SQL2K, it fails with a duplicate row error. Any help?
I'm trying to import a text file, but the primary key column contains duplicates (this turns out to be the nature of the legacy data). How can I kick out all duplicates and keep, say, just a single row for each primary key value?
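If pre-processing the text file before the import is an option, a keep-the-first-row-per-key filter is one simple approach. The Python sketch below assumes a comma-delimited file with the key in the first column; the function name and sample data are hypothetical, and "first row wins" is just one possible rule.

```python
import csv
import io

def keep_first_per_key(lines, key_index=0, delimiter=","):
    """Split rows into (kept, rejected), keeping the first row seen for each key."""
    seen = set()
    kept, rejected = [], []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # ignore blank lines
        key = row[key_index]
        if key in seen:
            rejected.append(row)  # later duplicates are kicked out
        else:
            seen.add(key)
            kept.append(row)
    return kept, rejected

# Hypothetical sample data standing in for the legacy text file.
sample = io.StringIO("1,alpha\n2,beta\n1,gamma\n3,delta\n2,epsilon\n")
kept, rejected = keep_first_per_key(sample)
print(kept)
print(rejected)
```

The rejected rows can be written to a separate file for review instead of being discarded.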
I have a small table with two fields, ID (guid) as the primary key and RAF (char), and the table is empty. When I add a new row I receive this exception: Violation of PRIMARY KEY constraint 'PK_tblType'. Cannot insert duplicate key in object 'dbo.tblType'. I have found no way to solve the problem. Thanks in advance.
I have a table variable into which I am inserting data from a SQL Server database. I have made one of the columns, called repaidID, a primary key so that a clustered index will be created on the table variable. When I run the stored procedure used to insert the data, I get this error message: Violation of Primary key Constraint. Cannot insert duplicate primary key in object. The value that is causing this error is (128503).
I have queried the repaidID 128503 in the database to see if it is a duplicate but could not find any duplicate. The repaidID is a unique ID normally used by my company and does not have duplicates.
I have a CSV file which contains some duplicate records, and I have to load this file into a SQL Server database using an SSIS package.
What I have to do is read the file, and if the same record occurs more than 10 times for a particular unique combination (like ID, Date, Time), then I need to take only one record for that occurrence.
I am having a problem where duplicate log statements are being written to a log file (as defined by a log provider).
I believe that this is because in the logging dialog box, I have ticked the checkbox next to a child task to override the logging functionality. I need to do this because it is a script task and I want to capture "ScriptTaskLogEntry" events (something that I cannot do at the parent level). However by doing this I seem to get the script events written at the parent, as well as at the Script Task level.
Is there any way of avoiding this, but still capturing the log events from the script task?
Another issue that is possibly linked is that I am getting an error from the log provider:
The SSIS logging provider "SSIS log provider for Text files" failed with error code 0x800700EA ((null)). This indicates a logging error attributable to the specified log provider.
Could this be because the parent and child tasks are both attempting to write to the same log provider?
public static void CreateDestDFC1()
{
    destinationDataFlowComponent1 = dataFlowTask.ComponentMetaDataCollection.New();
    destinationDataFlowComponent1.Name = "SQL Server Destination 1";
    destinationDataFlowComponent1.ComponentClassID = "{5244B484-7C76-4026-9A01-00928EA81550}";
    managedOleInstance1 = destinationDataFlowComponent1.Instantiate();
    managedOleInstance1.ProvideComponentProperties();
    managedOleInstance1.SetComponentProperty("BulkInsertTableName", "Employee");
    managedOleInstance1.AcquireConnections(null);
    managedOleInstance1.ReinitializeMetaData();
    managedOleInstance1.ReleaseConnections();
}

// Second one here..
public static void CreateDestDFC2()
{
    destinationDataFlowComponent2 = dataFlowTask.ComponentMetaDataCollection.New();
    destinationDataFlowComponent2.Name = "SQL Server Destination 2";
    destinationDataFlowComponent2.ComponentClassID = "{5244B484-7C76-4026-9A01-00928EA81550}";
    managedOleInstance2 = destinationDataFlowComponent2.Instantiate();
    managedOleInstance2.ProvideComponentProperties();
    managedOleInstance2.SetComponentProperty("BulkInsertTableName", "Customer");
    managedOleInstance2.AcquireConnections(null);
    managedOleInstance2.ReinitializeMetaData();
    managedOleInstance2.ReleaseConnections();
}

And it's giving an error. Can anyone say why, or can anyone change this? The package contains two objects with the duplicate name of "component "SQL Server Destination" (50)" and "component "SQL Server Destination" (22)".
First post here. Anyway, I have a question regarding SSIS. I'm currently given a task that requires reading a flat file, applying duplicate removal as well as invalid data removal, processing it, and finally writing it to a SQL Server 2005 DB.
Part of the processing requires checking for partial duplicates in the batches of records provided in the text file. For example, a record contains a phone number, a status, a creation timestamp and various other entries. If a phone number is repeated (meaning a duplicate entry), a column called 'Status' must be checked, and only entries with a status of 'C' are allowed through.
Another part of the processing requires that if the phone number is repeated along with various other entries including status, the timestamp of creation is checked and only the entry with the most recent timestamp is accepted.
I would like to know how to implement this in SSIS without using table objects and scripts, as my experience tells me that doing this in a script can really take a hit on system performance. The task is expected to handle tens of thousands of records in a day.
I've a dtsx package which runs nightly to do following:
1. select data from a SQL replicated table
2. do some lookups (Lookup, Derived Column, Multicast, Conditional Split, etc.)
3. insert into another SQL table on another server using "Table or view - fast load", rows per batch = 10000, maximum insert commit size = 10000, and "redirect row" on the destination's error output to an error log text file.

Once in a while, I find duplicate records in the error log; these rows cannot be inserted into the destination table due to the primary key constraint. For example, transaction_id=111000 appears twice in the error log, but it is a unique key in the source table.
My questions: 1. What could be the cause of duplicated rows during ETL in SSIS? I've asked this before and have spent a lot of time researching, but still could not find the reason. This link is from my previous post:
2. For a daily data extract of millions of rows, what would be the best settings for rows per batch, maximum insert commit size, etc.? I've read some posts on this forum and decided to use 10000 for both, but once in a while a single duplicate row causes the whole batch of 10000 rows not to be committed.
I recently encountered an error when I created several copies of one package.
It's always nearly the same package with small modifications. I call these packages from a parent package which is part of our data warehousing framework.
The problem is that when copying a package or using a package as a template, the package IDs and task IDs are the same. And this isn't only an issue for logging:
When the parent package calls one of the copied packages, the first task is executed in every package in parallel. Furthermore, when I set a breakpoint, for example, on a data transformation task in one of the packages, the breakpoint is set on the same task in all packages! This results in strange errors because the task states and variable values seem to get mixed up.
Unfortunately it is only possible to change the package's ID; the IDs of tasks are read-only!
One solution is to create a new package and copy all the tasks into it, which creates new IDs, but then I have to manually recreate a long list of variables, all the configurations and all the connection managers. Furthermore, I lose the layout of the tasks.
I found some posts about it here
http://groups.google.de/group/microsoft.public.sqlserver.dts/browse_thread/thread/6f85a31ea190608a/0eae312aa8440cf8?lnk=gst&q=pitfall&rnum=1&hl=de#0eae312aa8440cf8 or
In my SSIS package, i have a field test_method_number coming from OLE DB Source. I used Derived transformation to trim test_method_number: TRIM(test_method_number)
Now in the next Derived Transformation, i see duplicate test_method_number. How to get rid of this duplicate?
I have one SSIS package moving data from staging to the destination. The staging table contains duplicate data, but the destination table has a primary key on 4 columns. How do I handle the duplicate records in the OLE DB source?
I'm doing a group by in an aggregate transformation. I have say 6 columns in the output and I'm grouping on all of them - how can I get duplicate rows in the output? If I do the same select and group by in SQL on the source data I don't get any duplicate rows. In fact out of 6000+ rows I only get 2 duplicates.
We have a scenario like this: the source table has a composite primary key on columns c1, c2, c3, c4, c5, c6. When we move the records to the destination, we have to check whether the combination (c1 + c2 + c3 + c4 + c5 + c6) exists in the destination. If the combination exists, we should do an update; otherwise we need to do an insert. How can we achieve this? We have tried using a Conditional Split, which works only for a single-column primary key. Can anyone help us?
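The update-else-insert logic on a composite key comes down to matching on every key column at once. Below is a minimal sketch of that pattern using Python's sqlite3 module, with a three-column key instead of six to keep it short. The table names `src` and `dest` and the `amount` column are hypothetical, and this is plain SQL rather than an SSIS data flow. In SSIS itself, the Lookup transformation can map more than one column, which may be a way to get the same matched/unmatched split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src  (c1 INT, c2 INT, c3 INT, amount INT, PRIMARY KEY (c1, c2, c3));
    CREATE TABLE dest (c1 INT, c2 INT, c3 INT, amount INT, PRIMARY KEY (c1, c2, c3));
    INSERT INTO src  VALUES (1, 1, 1, 100), (1, 1, 2, 200), (2, 1, 1, 300);
    INSERT INTO dest VALUES (1, 1, 1, 10), (9, 9, 9, 90);
""")

# Step 1: update rows whose full key combination already exists in dest.
conn.execute("""
    UPDATE dest
    SET amount = (SELECT s.amount FROM src s
                  WHERE s.c1 = dest.c1 AND s.c2 = dest.c2 AND s.c3 = dest.c3)
    WHERE EXISTS (SELECT 1 FROM src s
                  WHERE s.c1 = dest.c1 AND s.c2 = dest.c2 AND s.c3 = dest.c3)
""")

# Step 2: insert rows whose key combination is not yet in dest.
conn.execute("""
    INSERT INTO dest (c1, c2, c3, amount)
    SELECT s.c1, s.c2, s.c3, s.amount FROM src s
    WHERE NOT EXISTS (SELECT 1 FROM dest d
                      WHERE d.c1 = s.c1 AND d.c2 = s.c2 AND d.c3 = s.c3)
""")

rows = conn.execute("SELECT * FROM dest ORDER BY c1, c2, c3").fetchall()
print(rows)
```

Running the update before the insert means the freshly inserted rows are not touched a second time.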
In my SSIS package I am loading data from 1 source database into 2 target databases; both targets have the same structure and data types.
My package works fine for the 1st target database, but for the 2nd database it gives the error "Violation of PRIMARY KEY constraint", even though the primary key constraint is not actually being violated.
I am using the Import/Export wizard to import data from an ODBC data source. This can only be done from a query to specify the data to transfer.
When I try to create the tables, for the query, I am getting the following error:
Msg 2714, Level 16, State 4, Line 12
There is already an object named 'UserID' in the database.
Msg 1750, Level 16, State 0, Line 12
Could not create constraint. See previous errors.
I have duplicated this error with the following script:
USE [testing]
IF OBJECT_ID ('[testing].[dbo].[users1]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users1]
CREATE TABLE [testing].[dbo].[users1] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
IF OBJECT_ID ('[testing].[dbo].[users2]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users2]
CREATE TABLE [testing].[dbo].[users2] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
IF OBJECT_ID ('[testing].[dbo].[users3]', 'U') IS NOT NULL
DROP TABLE [testing].[dbo].[users3]
CREATE TABLE [testing].[dbo].[users3] (
[UserID] bigint NOT NULL,
[Name] nvarchar(25) NULL,
I have searched the "2714 duplicate error msg," but have found references to duplicate table names, rather than multiple field names or column name duplicate errors, within a database.
I think that the schema is only allowing a single UserID primary key.
Uma writes "Hi Dear, I have a table whose primary key consists of 6 columns. The total number of columns in the table is 16. Now I want to convert my composite primary key into a simple primary key. There are already 2200 records in the table and no referential integrity (foreign key) exists.
Can I convert the composite primary key into a simple primary key in the table like this?