There is one report to identify potential duplicate in a table and it is performing poor.I'm now tuning the existing SP and got struck in modifiying it. rewrite the query in a best way. I just pasted below an example of query which is now in a report.The report will be run every week currently the table has 10 million records, and every week there will 5k to 10k will be added up so with that 5k to 10 k we have to check all the 10 miilion rows that if it is duplciated the logic is (surname = surmane or forename = forename or DOB =DOB )
Create table #employee ( ID int, empid varchar(100), surname varchar(100), forename varchar(100), DOB datetime, empregistereddate datetime, Createdate datetime
I have run into what I think is a bug in SQL Server Express version 2005.90.3042.0
To reproduce it, create a simple table as described below:
CREATE TABLE [dbo].[test]( [priority] [tinyint] NOT NULL, CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED ( [priority] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] OK, a simple table with one tinyint column. It doesn't actually matter if you create the primary key or not.
Then I populated the table with 10 entries, with the numbers 1 - 10 in the priority column.
So the query SELECT * FROM test
produces
priority 1 2 3 4 5 6 7 8 9 10
Now I created a view using the following SQL
CREATE VIEW [dbo].[showtestd] AS SELECT TOP (100) PERCENT dbo.test.* FROM dbo.test ORDER BY priority DESC
If then run the following
SELECT * FROM showtestd
I would expect the see the result ordered by descending index. However, the DESC in the view is ignored, it comes out sorted ascending, So I still see 1-10.
So I am hoping someone can tell me if this is a real bug, or if I misunderstand views.
This procedure when accepting a table name as parameter will produce a Create Table statement, INsert INto statement, and produce the top 10 records from a table in a suitable format for posting.
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO
CREATE Procedure [dbo].[pGenerateDDL](@Tablename varchar(128)) AS Declare@Structure varchar(8000), @colstr varchar(8000)
Set NOCOUNT on
Select @colstr = '' If exists (select * from sysobjects where name = 'cols') Drop Table cols
Create Table cols ( ColInfo varchar(500) null ) Insert Into Cols (ColInfo) Select '[' + RTRIM(C.name) + '] ' + Case When isComputed = 0 then LEFT(CASE WHEN (T.name IN ('char', 'varchar', 'nchar','nvarchar')) THEN T.name + '(' + LTRIM(RTRIM(STR(C.length))) + ')' When t.name in ('numeric','decimal') then t.name + '(' + Cast(c.prec as varchar) + ','+ cast(c.scale as varchar) + ')' else t.name END,30) else 'AS ' end +
Case when isnullable = 1 and iscomputed = 0 then ' NULL' When isnullable = 0 and iscomputed = 0 then ' NOT NULL' end --Cast((SELECT TOP 1 value -- FROM ::fn_listextendedproperty(null, 'user', @user, @type, @tablename, 'column', default) -- WHERE objname = C.name) as varchar) + Case When c.colid = (Select max(c.colid) maxid FROM sysobjects o left JOIN syscolumns c ON (o.id = c.id) left JOIN systypes t ON (c.xusertype = t.xusertype) WHERE o.name = @tablename ) then ')' else ',' end FROM sysobjects o inner JOIN syscolumns c ON (o.id = c.id) inner JOIN systypes t ON (c.xusertype = t.xusertype) WHERE o.name = @tablename
--- Declare colcur Cursor READ_ONLY FOR Select Cast(Colinfo as varchar(500)) FROM cols
OPEN ColCur FETCH colcur into @structure IF (@@FETCH_STATUS <> 0) BEGIN -- No matching objects CLOSE TableCursor DEALLOCATE TableCursor END
WHILE (@@FETCH_STATUS = 0) BEGIN Select @colstr = @colstr + ' ' + cast(@structure as varchar(500)) FETCH colcur INTO @structure END CLOSE colcur DEALLOCATE colcur
If exists (select * from sysobjects where name = 'cols') Drop Table cols PRINT: '--TABLE STRUCTURE BELOW' Print: 'Create Table ' + @TableName + '(' + @colstr
DECLARE @SQLstring varchar(8000) DECLARE @SELString varchar(8000) DECLARE @firstTime bit
DECLARE getColumnsCursor CURSOR READ_ONLY FOR SELECT c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c WHERE c.TABLE_SCHEMA = 'dbo' AND c.TABLE_NAME = @Tablename
DECLARE @columnName nvarchar(128) OPEN getColumnsCursor
FETCH NEXT FROM getColumnsCursor INTO @columnName WHILE (@@FETCH_STATUS <> -1) BEGIN IF (@@FETCH_STATUS <> -2) BEGIN IF (@firstTime = 0) SELECT @SQLstring = @SQLstring + ', '
-- append our column to the UPDATE statement SELECT @SQLstring = @SQLstring + '[' + @columnName + ']' Select @SELString = @SELString + 'Cast('+ '[' + @columnName + '] as varchar) +'',''+' SELECT @firstTime = 0 END FETCH NEXT FROM getColumnsCursor INTO @columnName END
CLOSE getColumnsCursor DEALLOCATE getColumnsCursor
Select @SELString = 'Select TOP 10 ''Select '' +' + Left(@SELstring,len(@SELstring)-4) + ' +'' UNION ALL'' FROM ' + @Tablename
Print (@SQLstring) Print ''
Exec (@SELstring)
GO
SET ANSI_NULLS OFF GO SET QUOTED_IDENTIFIER OFF GO
In my case, I tested on a small table and it produced this in the messages window:
--TABLE STRUCTURE BELOW Create Table ALI_ExpectedLR( [ALI_Score] int NULL, [ALIExpectedLR] numeric(10,5) NULL, [ALIExpectedLR_Unit] numeric(10,5) NULL) --BELOW IS INSERT STATEMENT. Insert Into ALI_ExpectedLR ([ALI_Score], [ALIExpectedLR], [ALIExpectedLR_Unit])
Select 141,0.04086,0.07329 UNION ALL Select 142,0.04013,0.07217 UNION ALL Select 143,0.03940,0.07106 UNION ALL Select 144,0.03868,0.06996 UNION ALL Select 145,0.03797,0.06888 UNION ALL Select 146,0.03727,0.06782 UNION ALL Select 147,0.03658,0.06676 UNION ALL Select 148,0.03589,0.06572 UNION ALL Select 149,0.03522,0.06470 UNION ALL Select 150,0.03455,0.06368 UNION ALL
I am not sure if it is "handy" or not, but was an interesting thing to work on. It is an extension of script added to madhivans post on Generating script (as an alternative). Without writing another whole cursor, I didn't see an easy way to eliminate the last "UNION ALL"
If I am missing some fundamental logic or best practice that can be improved on, please make whatever changes are best.
If it is useful, let me know that as well.
Note: removed a bit that identified calculated columns for this purpose, as for the intended use, it isn't necessary.
Poor planning on your part does not constitute an emergency on my part.
Hi everyone.I have a question that maybe some of you may be able to shed some lighton. My set up involves three SQL Server Databases. First one, used bythe application. Second, copy of first via log shipping. Third,contains data aggregation from the Second. Now, my issue is that Ihave a DTS packagage that (COPY SQL SERVER OBJECTS) that copys selecttables from the second DB, into the third. The issue lies when the DTSpackage executes, everyonce in a while it errors out. I'm assumingit's an error due to the log shipping restore on the second database.Are there any solutions to this?TIA
I am using SQL Server 2005. I am trying to import data from CSV files into an SQL Server table using the Import wizard. The text qualifier is double quotes ("), column delimeter is a comma (,), first row has column names. One of the field name is "id", which is a GUID, whose datatype in SQL Server is uniqueidentifier. It looks like this in the file: ..."data","data","dbf7edf8-0ca8-4e53-91e3-5901cdc1819a","data"... As you can see, there are no enclosing curly braces for the guid value. The DTS chokes on this and throws this error: The value could not be converted because of a potential loss of data If I add curly braces like this {dbf7edf8-0ca8-4e53-91e3-5901cdc1819a}, it imports with no problem. Is there way to import this type of data, because there is no way I can edit these files, and I would prefer not changing the datatype of the id field? Or is this a limitation of SQL Server? Thanks,Bullpit
I have a openquery query like this Select * from openquery([db2],'Select * from tableA')
To only return and process records for a given date range I changed it to be something like this
Select * from openquery([db2],'Select * from tableA') where datefield > '06/06/2001'
While this works fine, my question is that does it copy all the records from the db2 server to the sql server before filtering them. I think it does. The db2 table will have over 1,000,000 records eventually, and the sql server will use records for a given day/date range only.
Hello, I've been unable to find any reports of these two problem when using the SQL Native Client driver (9.00.3042.00), so I'm hoping you can direct me.
Firstly check constraint definitions accessed by the following TSQL only returns data with the SQL Server driver, ie nothing with SQL Native Client. Permissions are the same, just the driver in the (file) DSN changed.
TSQL is: Select sys.check_constraints.Name, Definition from sys.check_constraints inner join sysobjects on sys.check_constraints.parent_object_id = sysobjects.id where sysobjects.name = 'MYTABLE' and sys.check_constraints.type = 'C'
Secondly, When using SQLNCL instead of the SQL Server driver we are finding that ado(2.8) recordsets are returning more fields than the view has Select Statement. It turns out that these fields are inputs to the view, and potentially not for external viewing. Eg if we have: Table1 with Columns T1A, T1B, T1C and; Table2 with Columns T2A, T2B, T2C and create a view "Select T1A, T2A from Table1 inner join Table2 on .....etc.. " we can also see the other columns in the ado recordset when accessing this view via ado and SQL Native Client!
Any help much apprecated. We want to move to SQL Native Client to allow better connectivity to a 64bit SQL Server Thanks, Warwick
During import from CSV i am getting following error "The value could not be converted because of a potential loss of data". My CSV file contains a column "years" and i have selected datatime in the import wizar. Can I scape from this error and import all my data.
I've searched the threads and didn't see anything which seemed to fit this specific issue....
I have a Data Flow task which reads from an OLE DB Source (SQL Server 2005), uses a Data Conversion transformation to convert some field values, and finally outputs the result to distinct tabs of an Excel workbook. The task is failing with the following error:
There was an error with input column "oBBCompanyName" (2162) on input "Excel Destination Input" (57). The column status returned was: "The value could not be converted because of a potential loss of data.".
Using the Advanced Editor for the Excel Destination component, I examined the datatype of oBBCompanyName (ID = 2162) in the Input Columns list of the "Excel Destination Input" (identified with ID = 57). The Data Type is defined as DT_WSTR with Length = 255. The ExternalMetaDataColumnID = 203.
Also in the Advanced Editor for the Excel Destination, I examined the datatype of BBCompanyName (ID = 203) in the Extranl Columns list of the Excel Destination Input. The Data Type is defined as "Unicode String [DT_WSTR] with Length = 255.
What could I be overlooking that might be the root cause of this issue? The same error is occurring for different Excel Destination tasks in the data flow.
I am getting the error below in my Flat File Source.
I've seen this error many times before, and have successfully resolved this problem in the past.
However, this time it's a little different. It's complaining about row 7 of myFile.csv, column 20. I have column 20 defined as a Numeric(18,6). It also maps to the Price field in the table, which is also a Numeric(18,6).
The problem is, on row 7 of myFile, column 20 is blank. That is, there's no data for row 7, column 20.
So, why should it care about this?? If it's blank, then how can you lose any data?? I have several other blank columns in this file, but they aren't throwing any errors. Just this one.
Thanks
Errors:
[Flat File Source - myFile [1]] Error: Data conversion failed. The data conversion for column "Column 20" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
[Flat File Source - myFile [1]] Error: The "output column "Price" (333)" failed because error code 0xC0209084 occurred, and the error row disposition on "output column "Price" (333)" specifies failure on error. An error occurred on the specified object of the specified component.
[Flat File Source - myFile [1]] Error: An error occurred while processing file "d:myDirmyFile.CSV" on data row 7.
I am having a bit of a difficult trying to understand this one. I had imported two tables around 2-300 rows each ran this in 64 bit scheduled it and it ran okay. I now introduced a 3rd table which if I change the true64bit to false it will run however if left true it keeps mentioning the output column of descr with a loss of data.
I did move it to 32bit and then ran the package it comes up as below. If I rememeber I can run this in 32bit mode which I'm sure will work hmm maybe!! but what I can't understand is why it works for two tables? is it something to do with the translation of the table or do I need alter the select statement?
Currently it is a select *
Executed as user: jvertdochertyr. ...on 9.00.3042.00 for 64-bit Copyright (C) Microsoft Corp 1984-2005. All rights reserved. Started: 10:30:11 PM Error: 2007-12-06 22:30:21.02 Code: 0xC020901C Source: st_stock st-stock out [1] Description: There was an error with output column "descr" (56) on output "OLE DB Source Output" (11). The column status returned was: "The value could not be converted because of a potential loss of data.". End Error Error: 2007-12-06 22:30:21.02 Code: 0xC0209029 Source: st_stock st-stock out [1] Description: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "output column "descr" (56)" failed because error code 0xC0209072 occurred, and the error row disposition on "output column "descr" (56)" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure. End Error Erro... The package execution fa... The step failed.
I have created a program using SQL-DMO to replace an SQL Server 2000 database by setting it OFFLINE, replacing the .MDF and .LDF and then setting the database back ONLINE.
SQL Server does permit replacing the files while offline. And when the database is back online, everything seems to have worked perfectly with the new database files.
My questions is ... What potential problems (if any) exist with this process replacing a database?
Other methods I have considered: - Detach, replace DB files, re-attach - Delete old database, attach-using-new-DB-files
Lets say I have table A and table B and another table AB where each row in AB refereces a row in A and a row in B. Furthermore, I set both relationships to cascade upon delete.
Then one user deletes a row from A which cascades to two rows in AB. But another user has deleted a row in B which is also trying to delete the same two rows in AB. The first transaction deletes one of the AB rows, the second deletes the other and then both transactions cannot get the other row in AB to delete because its locked. So this is a deadlock! Is it really that easy to get a dead lock?
any useful SQL Queries that might be used to identify lists of potential duplicate records in a table?
For example I have Client Database that includes a table dbo.Clients. This table contains various columns which could be used to identify possible duplicate records, such as Surname | Forenames | DateOfBirth | NINumber | PostalCode etc. . The data contained in these columns is not always exactly the same due to differences caused by user data entry; so some records may have missing data from some of the columns and there could be spelling differences too. Like the following examples:
1 | Smith | John Raymond | NULL | NI990946B | SW12 8TQ 2 | Smith | John | 06/03/1967 | NULL | SW12 8TQ 3 | Smith | Jon Raymond | 06/03/1967 | NI 99 09 46 B | SW12 8TQ
The problem is that whilst it is easy for a human being to review these 3 entries and conclude that they are most likely the same Client entered in to the database 3 times; I cannot find a reliable way of identifying them using a SQL Query.
I've considered using some sort of concatenation to a new column, minus white space and then using a "WHERE column_name LIKE pattern" query, but so far I can't get anything to work well enough. Fuzzy Logic maybe?
the results would produce a grid something like this for the example above:
ID | Surname | Forenames | DuplicateID | DupSurname | DupForenames 1 | Smith | John Raymond | 2 | Smith | John 1 | Smith | John Raymond | 3 | Smith | Jon Raymond 9 | Brown | Peter David | 343 | Brown | Pete D next batch of duplicates etc etc . . . .
With my SSIS package, I want to import data from a flat file (TXT- delimited with ?) to a table in my database in sql server 2005. The problem is that I have a column of type datetime in my table. But as you know, the data in txt is string. First I created my package through importing data and using import/export wizard in management studio to my database and selecting flat file connection. There, I selected my txt file and column delimiter as ?, then suggested types for the columns. There it selects 8 byte signed integer type for the datetime column in my table. After these steps I create my package and execute it. But it does not put data in my table in the database. It gives the error of "The value could not be converted because of a potential loss of data" or "cast conversion failed" . I tried other types of date, timestamp, string but none of them was successful. What should I do to put data in my table from txts. Please can you help me urgently!!!
I have a flat file that has a date column where the date fields look like 20070626, for example. No quotes.
The problem is that several of the date values are missing, and instead of the date value the field looks like this , ,
That is, there are several blank spaces where the date should be. The number of blank spaces between the commas doesn't appear to be a set number (and it could even be 8 blank spaces, I don't know, in which case I don't know if checking for the Len will produce the correct results, but that's another issue...)
So, similar to the numeric field blanks problem, I wrote a script to convert the field to null. This is the logic I used:
If Not Len(Row.TradeDate) = 8 Then
Row.TradeDate_IsNull = True
End If
The next step in my data flow after the script is a derived column where I convert TradeDate from 20070625 to 06/25/2007. So the exact error message I am receiving is this:
[OLE DB Destination [547]] Error: There was an error with input column "TradeDate - derived" (645) on input "OLE DB Destination Input" (560). The column status returned was: "The value could not be converted because of a potential loss of data.".
Do I need to add a conditional split after the script and BEFORE the derived column to redirect bad rows so they don't go to the derived column?
I'm getting a very strange potential loss of data error on my flat file source in the data flow. The flat file is fixed width and the column in question is defined as numeric [DT_NUMERIC]. The transform runs great if this column IS NOT A ZERO. As soon as a zero value is found, I get the error. It errors on the flat file source, so I haven't been able to use a data viewer to see what's going on.
Why would I get these errors: " SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E21. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E21 Description: "Invalid character value for cast specification".
There was an error with input column "UniqueID" (3486) on input "OLE DB Command Input" (3438). The column status returned was: "The value could not be converted because of a potential loss of data.".
Error: 0xC0209029 : SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "input "OLE DB Command Input" (3438)" failed because error code 0xC0209069 occurred, and the error row disposition on "input "OLE DB Command Input" (3438)" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure."
I read related posts but could not figure out the error?
The following error is encountered when importing a delimited flat file with date of fomat "dd.mm.yyyy"
Error: 0xC02020A1 at Data Flow Task, Source - DCDtest_xpt [1]: Data conversion failed. The data conversion for column "value date" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
This was when I manually built the package.
I get the same error when using the Import/Export wizard
I am even using the "suggest types" button and have tried sampling the default number of rows (?200) and also 2000.
The type it suggests is DT_DATE.
But reading the BOL, this would appear to be the wrong type:
http://msdn2.microsoft.com/en-us/library/ms141036.aspx (obviously the right hand doesn't know what the left hand is doing)
seems to indicate that
DT_DBTIMESTAMP
is the correct value
(I cannot believe that they have different datatypes in SSIS than in SQL. I can't believe for a minute this is for all the hundreds of thousands of Oracle users who obviously switched to SSIS when they saw what a high quality product it is.)
I tried other DT_... values but no dice.
Can anyone help?
I always thought that Classic ASP was the worst product I've ever worked with from the Microsoft stable, but I was wrong.
I am fed up of having to post on this board (no wonder it is so 'popular')
Talking to peers, reading books, googling nearly always enables me to figure out a problem with any application I have ever used, but SSIS breaks the mould in sheer crapness and the weirdnes and unfathomability of its cryptic errors,.
I have SQL Server Enterprise set up with Merge replication, it is being used to Sync with Windows Mobile Devices as the subscribers.
All the Windows Mobile subscribers have been synchronising fine up until recently, when 1 of the Subscribing Windows Mobile devices received this error. The day after, another device Synchronised fine the first time (note that the Synchronisation was a long one and the SDF grew in size dramatically while the synchronisation occured, but the Sync worked fine and the SDF dropped back to its initial size) The 2nd sync 5 mins later failed however so this Mobile device has the same error also.
Any suggestions as to the below are much appreciated
a) Why this error occurs? b)How do you resolve this error? c)What is the Repair utility mentioned in the error, and is it of use to me?
Thanks in advance, I am very greatful for any advice.
I have an odd problem that is driving me nutz. I have a very simple SSIS package that imports a 5 colum flatfile into a sql Server 2005 Table.
When I created this package with the wizzard, it will execute perfectly fine and processes all rows into the destination table.
But when I hit F5 to execute it manually it will fail before inserting a single row.
The error it generates is (Spalte 5 is a Datetime in the format DD.MM.YYYY) :
Error: 0xC02020A1 at Datenflusstask, Source - Daten_NC_1_txt [1]: Data conversion failed. The data conversion for column "Spalte 5" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
Error: 0xC0209029 at Datenflusstask, Source - Daten_NC_1_txt [1]: The "output column "Spalte 5" (25)" failed because error code 0xC0209084 occurred, and the error row disposition on "output column "Spalte 5" (25)" specifies failure on error. An error occurred on the specified object of the specified component.
Error: 0xC0202092 at Datenflusstask, Source - Daten_NC_1_txt [1]: An error occurred while processing file "C:WorkDaten_NC_1.txt" on data row 177.
Edit: Modified the Title so it properly reflects the Problem & the Solution