CHECKSUM and CHECKSUM_AGG in T-SQL
Sep 24, 2007
Hi,
I recently researched the CHECKSUM and CHECKSUM_AGG functions in T-SQL and found them really useful. However, I suspect there is a chance of these functions returning the same value for non-identical inputs. I looked on the forums and found more than one unhappy user writing about their experience with these functions.
I am designing a large database (warehouse) and am tempted to use these functions as follows:
using CHECKSUM for
- indexing long character fields
- combining multiple columns of the same table that are involved in a join, and joining on the new checksum field instead
using CHECKSUM_AGG for
- I bulk copy flat file source data into a character field of a table, and to ensure that I am not loading the same file multiple times, I plan to use CHECKSUM_AGG( CHECKSUM( [FlatFileRecord] ) ) and verify that no two loads have the same output.
Can somebody suggest whether I can trust these methods for my purpose?
Many thanks in advance!!
Thanks,
Harish
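A minimal sketch of the "index a long character field through its checksum" idea raised above. The table, column, and index names are hypothetical. Because CHECKSUM can return the same value for different inputs, the query below uses the checksum only to narrow the search and still compares the real column, so a collision costs an extra row read rather than a wrong answer.

```sql
-- Hypothetical staging table; names are for illustration only.
CREATE TABLE dbo.StagedRecord
(
    RecordID       INT IDENTITY(1,1) PRIMARY KEY,
    FlatFileRecord VARCHAR(4000) NOT NULL,
    -- 4-byte hash of the long column, exposed as a computed column
    RecordChecksum AS CHECKSUM(FlatFileRecord)
);

-- A narrow index on the hash instead of the wide character column
CREATE INDEX IX_StagedRecord_Checksum
    ON dbo.StagedRecord (RecordChecksum);

DECLARE @Wanted VARCHAR(4000);
SET @Wanted = 'some long record text';

-- The first predicate can use the narrow index; the second predicate
-- discards any rows that merely collide on the checksum.
SELECT RecordID
FROM   dbo.StagedRecord
WHERE  RecordChecksum = CHECKSUM(@Wanted)
  AND  FlatFileRecord = @Wanted;
```

The same caution applies to the CHECKSUM_AGG plan for duplicate file detection: two different files can produce the same aggregate, so an equal value is a hint to look closer, not proof of a duplicate.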
Feb 29, 2008
Do you have to store the checksum from the task in order to detect changes? Here is what I did, and it seems it's not going to work:
- a Lookup transformation that selects the matching fields and returns the T-SQL binary_checksum(fields)
- a Checksum transformation that returns the checksum of the same inbound fields
- a Conditional Split that passes changed records on to the update
I am getting all the records every time, which should not be happening.
Is there a way to make this work, or do I have to store the checksum?
Jul 23, 2005
Gentlemen,
I am using the following query to get a list of grouped checksum data.
SELECT CAST(Field0_datetime AS INT),
CHECKSUM_AGG(BINARY_CHECKSUM(Field1_bigint, Field2_datetime, Field3_datetime, Field4_bigint, Field5_bigint, CAST(Field6_float AS Decimal(38,6)), Field7_datetime))
FROM Table1
WHERE Field0_datetime BETWEEN '2003-01-01' AND '2003-01-20'
GROUP BY CAST(Field0_datetime AS INT)
Please notice the filter used: from January 1 to January 20.
That query takes about 6 minutes to return the data. The result is 18 records.
However, when I execute the same query filtering BETWEEN '2003-01-01' and '2003-01-10', it takes only 1 second to return data.
When I execute the query filtering BETWEEN '2003-01-10' and '2003-01-20', the query takes another 1 second to return data.
So why 6 minutes to process them together?
The table has an index on Field0_datetime.
It contains about 1.5 million records in total, using around 1.7 GB of disk space, indexes included.
Between 2003-01-01 and 2003-01-20, there are 11401 records selected. That doesn't look like much.
The situation is repeatable. I mean, if I execute the queries again and again, they take about the same amount of time to execute, so I don't think this problem is related to cache or something like that.
I would appreciate any advice about what might be wrong with my situation.
Thanks a lot and kind regards,
Orly Junior
IT Professional
Jul 23, 2005
Hi,
I can see that by using the object ID rather than the object name, the following SQL query works. Has anybody got any idea what is causing the error?
-- Works OK
select o.id, checksum_agg(binary_checksum(m.text))
from sysobjects o, syscomments m
where o.id = m.id
and o.xtype in ('FN','IF','P','TF','TR','V')
group by o.id
-- Error
-- Server: Msg 1540, Level 16, State 1, Line 1
-- Cannot sort a row of size 8096, which is greater than the
-- allowable maximum of 8094.
select object_name(o.id), checksum_agg(binary_checksum(m.text))
from sysobjects o, syscomments m
where o.id = m.id
and o.xtype in ('FN','IF','P','TF','TR','V')
group by object_name(o.id)
-- Error
-- Server: Msg 1540, Level 16, State 1, Line 1
-- Cannot sort a row of size 8096, which is greater than the
-- allowable maximum of 8094.
select o.name, checksum_agg(binary_checksum(m.text))
from sysobjects o, syscomments m
where o.id = m.id
and o.xtype in ('FN','IF','P','TF','TR','V')
group by o.name
-- Workaround
select getdate(), object_name(x.id), check_sum
from (
select m.id, checksum_agg(binary_checksum(m.text)) as check_sum
from syscomments m
inner join sysobjects o
on m.id = o.id
where o.xtype in ('FN','IF','P','TF','TR','V')
group by m.id
) as x
Regards
Liam
Jun 11, 2008
Hi
What is the purpose of checksum function???
RKNAIR
Feb 6, 2004
Hi,
Can anyone provide me with the syntax for comparing rows of two tables using binary checksum? The tables A and B have 8 & 9 columns respectively. The PK in both cases is Col1 & Col2. I want checksum on Columns 1 to 8.
Thanks
Feb 6, 2004
Please execute the script below to understand the problem -
---
create table test(id int, col1 int,col2 varchar(5),col3 datetime)
create table test2(id int, col1 int,col2 varchar(5),col3 datetime)
--id & col1 make up the PK.
insert test values(4,4,'d','02/06/2004')
insert test values(4,4,'e','02/06/2004')
insert test2 values(4,4,'d','02/06/2004')
insert test2 values(4,4,'e','02/06/2004')
select *
from test
select *
from test2
--The rows are identical.
--Script A
select t.*
from test t
join test2 t2 on t2.id=t.id
where CHECKSUM(t.col2,t.col3)<>CHECKSUM(t2.col2,t2.col3)
--The purpose of the above script is to check for any updates in the two tables. It returns two rows. But as you can see both these rows were present in the table before. So I modify the script to -
--SCRIPT B
select t.*
from test t
join test2 t2 on t2.col2=t.col2
where CHECKSUM(t.col3)<>CHECKSUM(t2.col3)
-- In this case no row is returned. This is exactly what I need. The problem: now execute the script below.
TRUNCATE TABLE TEST
TRUNCATE TABLE TEST2
insert test values(4,4,'d','02/06/2004')
insert test values(4,4,'d','02/01/2004')
insert test2 values(4,4,'d','02/06/2004')
insert test2 values(4,4,'d','02/01/2004')
--Now when I execute script B, two rows are returned, which is not what I want. Since the rows are identical, no row should be returned. So depending on which column changes (col2 or col3), I have to alter the script. I seek advice on the method to calculate the checksum. Again, the PK is ID and Col1 only.
Thanks
drop table test
drop table test2
go
--
Jun 7, 2006
Hello All,
How to use CHECKSUM function and how it is useful?
Thanks
Sanjeev
Jul 23, 2005
I'm developing a stored procedure to run an update based on values entered into a .Net web form. I want to capture the checksum of the row when it is displayed on the form, then validate that when the update is exec'd. Simple enough logic, eh? The problem is when I try to use the checksum(*) function, SQL Server yells at me and says that it isn't recognized. I'm using SQL Server 7, so wtf? I am not the admin of the server and I'm skirting around SQL Server Enterprise Manager and using any free utils, MS Access, and Visual Studio to maintain this db.
Thanks
Alex Jamrozek
May 29, 2007
Hi,
I'd like advice about an idea I had to resolve a problem. Thanks in advance for your answers.
I have a database with tables that I load from flat files. The size of each table is 600 MB. The flat files are an image of an application, and there is no updated date or created date on any table. So my tables are just a copy of the data from the flat files.
Now I'd like to create a history table. So I have to determine which lines changed and which ones didn't.
As I don't have any date on my rows, the only answer I had until now was to check each column on each row to see if any data changed. If the data changed, I add a new line in my history table.
My idea is to add a checksum column, computed over all columns, in both tables. To know whether any data changed, I just have to check my PK + my checksum column.
Do you think that is a good idea? Is checksum a quick function or not?
Thanks.
--
K
Jan 24, 2008
Hi,
I heard that enabling page checksum will report errors in the log. That's good.
Currently we run DBCC CHECKDB WITH PHYSICAL_ONLY on its own and CHECKTABLE on groups of tables on different days. One person suggested turning off DBCC CHECKTABLE, running it only when checksum reports an error, and continuing to run CHECKDB WITH PHYSICAL_ONLY as before.
Is this suggestion a best practice? Please also write a few lines to say why it is wrong or right.
Thanks and best regards
Priw
Feb 18, 2008
it sounds like a column can be added to each row in a table that is the checksum or binary_checksum of an expression. How many bytes do each of these occupy? Does the answer depend on the number and/or length of items in the expression?
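A quick way to check the size question above: CHECKSUM and BINARY_CHECKSUM both return an int, so the stored value is 4 bytes regardless of how many expressions are passed in or how long they are. The literal arguments below are arbitrary.

```sql
-- Each DATALENGTH call reports the size of the int result, not of the inputs.
SELECT DATALENGTH(CHECKSUM('a'))                           AS OneShortArgument, -- 4
       DATALENGTH(CHECKSUM('a', 'b', REPLICATE('c', 500))) AS ThreeArguments,   -- 4
       DATALENGTH(BINARY_CHECKSUM('a', 12345, GETDATE()))  AS MixedTypes;       -- 4
```

The flip side is that everything is squeezed into 32 bits, which is why collisions are unavoidable once inputs get numerous or long.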
Dec 18, 2007
Hi:
I am using checksum in my ETL process. For this I have a checksum field that is calculated over the values in my table.
The column is a computed column, and it has a property for persistence.
Which should I choose: should I make it persisted or not? What is the industry standard?
Can you please explain how this property would affect the behaviour of the column?
Will this property affect anything else, such as indexes? Please let me know what step I should take: should I make the column persisted or not?
Please let me know.
Thanks,
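To make the persisted question concrete, here is a small sketch with a hypothetical table that declares the same checksum twice, once as an ordinary computed column and once as PERSISTED (SQL Server 2005 or later is assumed). The non-persisted column is evaluated when it is read; the persisted one is stored with the row and maintained when the source columns change.

```sql
-- Hypothetical ETL target; names are for illustration only.
CREATE TABLE dbo.EtlCustomer
(
    CustomerID           INT PRIMARY KEY,
    CustomerName         VARCHAR(100) NULL,
    City                 VARCHAR(100) NULL,
    -- computed on read, takes no space in the row
    RowChecksumVirtual   AS CHECKSUM(CustomerName, City),
    -- stored in the row (4 bytes), maintained on insert and update
    RowChecksumPersisted AS CHECKSUM(CustomerName, City) PERSISTED
);

-- Shows which of the two computed columns is physically stored
SELECT name, is_persisted
FROM   sys.computed_columns
WHERE  object_id = OBJECT_ID('dbo.EtlCustomer');
```

Which one suits an ETL load is a trade-off rather than a standard: persisting costs a little space and write time, while the non-persisted form pays the computation on every read unless the column is indexed.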
May 11, 2006
Morning Campers,
I have two tables, src_monthly_terrrier and src_weekly_terrier. Both of these tables consist of 10+ columns. As the table names probably suggest, I import weekly data into one and monthly data into the other.
All the source data comes from an Excel spreadsheet via a straight Import Data procedure. The only guaranteed change on a weekly and monthly basis is that one of the columns in each table, named src_date, will obviously hold the date value for whichever month's or week's data it relates to.
I understand that through 'SQL Server Business Intelligence Development Studio' I can create an 'Integration Services' package that will import the spreadsheet details for me. I might be going the long way around this, but it was my intention to bring in all the data and then run a couple of 'INSERT INTO' stored procedures.
My biggest issue / vulnerability is that there is no error checking of the data on the way in to ensure that it has not already been imported. What I was thinking I could do to resolve this was to create a checksum field comprising a number of different columns (incl. src_date) and then somehow write something that will look at the values of each row to be imported, work out whether a duplicate checksum is found in the target table, and if so reject the import routine as Duplicate Data Found (or something similar) and move on to the next stored procedure.
My problem is twofold: one, I have no idea how to create said checksum, and two, I have no idea where to begin on coding a procedure that looks to see if the value already exists.
I have looked up checksum creation on the net and there appear to be plenty of resources explaining how to create one, so I guess my main question is: where do I start when it comes to writing some code that will check the checksum before the import routine begins (or at least before the INSERT INTO procedures)?
I would truly appreciate anyone's help on this. In the meantime I am off to learn how to create them.
I would like to add, if anyone sees this as a bad idea, then please speak up.
Thanks in Advance
Mar 19, 2004
Hello,
I need to generate a hash of text values for my app. I can generate hash values for normal fields using the CHECKSUM and BINARY_CHECKSUM functions, but they do not support text, ntext, image, and cursor, as well as sql_variant.
How can I generate checksums of such data types?
Karam
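One possible direction for hashing large text values, offered as a sketch rather than an answer for every version: HASHBYTES is available from SQL Server 2005 onward (so not on the SQL Server 2000 systems current when this was posted), and a text column would first need a CAST to VARCHAR(MAX). The variable below is a stand-in for such a column. Note the assumption that the input stays within 8000 bytes; as far as I know, versions before SQL Server 2016 reject longer inputs.

```sql
-- Stand-in for a value read from a text column via CAST(... AS VARCHAR(MAX))
DECLARE @Doc VARCHAR(MAX);
SET @Doc = REPLICATE('abc', 1000);   -- 3000 characters

-- SHA1 returns a 20-byte VARBINARY digest
SELECT HASHBYTES('SHA1', @Doc)             AS Sha1Digest,
       DATALENGTH(HASHBYTES('SHA1', @Doc)) AS DigestBytes;  -- 20
```

A 20-byte digest makes accidental collisions far less likely than a 4-byte CHECKSUM, at the price of a wider column and more CPU per row.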
Feb 11, 2015
What is the use of checksum in SQL Server?
ex: RESTORE DATABASE [XXX] FROM DISK = 'XX.BAK' WITH CHECKSUM
Aug 13, 2007
How much of a performance impact will using Checksum have over Torn Page Detection for Page Verify Recovery? Thanks
Mar 13, 2007
Looking for some clarification on the CHECKSUM option of the BACKUP command.
If the CHECKSUM option is specified in the backup, will the backup fail if CHECKSUM finds bad values (or at least raise an error)? Or is it only reported when doing a RESTORE VERIFYONLY?
Thank you.
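For reference, a sketch of the statements involved; the database name and file paths are hypothetical. As documented for BACKUP, WITH CHECKSUM makes the backup verify page checksums as it reads pages, and the default error behaviour is STOP_ON_ERROR, meaning the backup itself fails on a bad page unless CONTINUE_AFTER_ERROR is requested.

```sql
-- Hypothetical database name and backup paths.

-- Verifies existing page checksums while reading and writes a checksum
-- over the backup; stops with an error on a bad page (STOP_ON_ERROR default).
BACKUP DATABASE SalesDW
TO DISK = 'D:\Backup\SalesDW.bak'
WITH CHECKSUM;

-- Same check, but the backup carries on past a bad page.
BACKUP DATABASE SalesDW
TO DISK = 'D:\Backup\SalesDW_continue.bak'
WITH CHECKSUM, CONTINUE_AFTER_ERROR;

-- Re-checks the checksums stored in the backup without restoring it.
RESTORE VERIFYONLY
FROM DISK = 'D:\Backup\SalesDW.bak'
WITH CHECKSUM;
```

Page checksums can only be verified on pages that were written while PAGE_VERIFY CHECKSUM was in effect, so a clean backup says nothing about older pages that never received one.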
Aug 22, 2006
Following this discussion http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=70328
I started to think about how Microsoft really calculates the checksum value.
This code is 100% compatible with the MS original. That is, the result is identical.
You can use it "as is", or you can use it to see that the MS function does not produce values as unique as one could expect.
With text/varchar/image data, call with SELECT BINARY_CHECKSUM('abcdefghijklmnop'), dbo.fnPesoBinaryChecksum('abcdefghijklmnop')
With integer data, call with SELECT BINARY_CHECKSUM(123), dbo.fnPesoBinaryChecksum(CAST(123 AS VARBINARY))
I haven't figured out how to calculate checksum for integers greater than 255 yet.
CREATE FUNCTION dbo.fnPesoBinaryChecksum
(
@Data IMAGE
)
RETURNS INT
AS
BEGIN
    DECLARE @Index INT,
            @MaxIndex INT,
            @SUM BIGINT,
            @Overflow TINYINT
    SELECT  @Index = 1,
            @MaxIndex = DATALENGTH(@Data),
            @SUM = 0
    WHILE @Index <= @MaxIndex
        SELECT  @SUM = (16 * @SUM) ^ SUBSTRING(@Data, @Index, 1),
                @Overflow = @SUM / 4294967296,
                @SUM = @SUM - @Overflow * 4294967296,
                @SUM = @SUM ^ @Overflow,
                @Index = @Index + 1
    IF @SUM > 2147483647
        SELECT @SUM = @SUM - 4294967296
    RETURN @SUM
END
Actually this is an improvement of MS function, since it accepts TEXT and IMAGE data.
CREATE FUNCTION dbo.fnPesoTextChecksum
(
@Data TEXT
)
RETURNS INT
AS
BEGIN
    DECLARE @Index INT,
            @MaxIndex INT,
            @SUM BIGINT,
            @Overflow TINYINT
    SELECT  @Index = 1,
            @MaxIndex = DATALENGTH(@Data),
            @SUM = 0
    WHILE @Index <= @MaxIndex
        SELECT  @SUM = (16 * @SUM) ^ ASCII(SUBSTRING(@Data, @Index, 1)),
                @Overflow = @SUM / 4294967296,
                @SUM = @SUM - @Overflow * 4294967296,
                @SUM = @SUM ^ @Overflow,
                @Index = @Index + 1
    IF @SUM > 2147483647
        SELECT @SUM = @SUM - 4294967296
    RETURN @SUM
END
Peter Larsson
Helsingborg, Sweden
Jul 25, 2007
Hi,
We are using binary_checksum in some of our INSTEAD OF UPDATE triggers. The problem came to our attention when an update failed without raising any error. After some research we found that checksum returns the same number for two different inputs, and that is why the update failed.
We are using the following type of statement inside the trigger.
UPDATE [dbo].[Hospital]
SET
[HospitalID]= I.[HospitalID],
[Name]= I.[Name],
[HospitalNumber]= I.[HospitalNumber],
[ServerName] = I.[ServerName],
[IsAuthorized]= I.[IsAuthorized],
[IsAlertEnabled]= I.[IsAlertEnabled],
[AlertStartDate]= I.[AlertStartDate],
[AlertEndDate]= I.[AlertEndDate],
[IsTraining]= I.[IsTraining],
[TestMessageInterval]= I.[TestMessageInterval],
[DelayAlertTime]= I.[DelayAlertTime],
[IsDelayMessageAlert]= I.[IsDelayMessageAlert],
[IsTestMessageAlert]= I.[IsTestMessageAlert],
[IsUnAuthorizedMessageAlert]= I.[IsUnAuthorizedMessageAlert],
[IsWANDownAlert]= I.[IsWANDownAlert],
[IsWANUpAlert]= I.[IsWANUpAlert],
[CreateUserID]= Hospital.[CreateUserID],
[CreateWorkstationID]= Hospital.[CreateWorkstationID],
[CreateDate]= Hospital.[CreateDate] ,
/* record created date is never updated */
[ChangeUserID]= suser_name(),
[ChangeWorkstationID]= host_name(),
[ChangeDate]= getdate() ,
/* Updating the record modified field to now */
[CTSServerID]= I.[CTSServerID]
FROM inserted i
WHERE
i.[HospitalID]= Hospital.[HospitalID]
AND binary_checksum(
Hospital.[HospitalID],
Hospital.[Name],
Hospital.[HospitalNumber],
Hospital.[ServerName],
Hospital.[IsAuthorized],
Hospital.[IsAlertEnabled],
Hospital.[AlertStartDate],
Hospital.[AlertEndDate],
Hospital.[IsTraining],
Hospital.[TestMessageInterval],
Hospital.[DelayAlertTime],
Hospital.[IsDelayMessageAlert],
Hospital.[IsTestMessageAlert],
Hospital.[IsUnAuthorizedMessageAlert],
Hospital.[IsWANDownAlert],
Hospital.[IsWANUpAlert]) !=
binary_checksum(
I.[HospitalID],
I.[Name],
I.[HospitalNumber],
I.[ServerName],
I.[IsAuthorized],
I.[IsAlertEnabled],
I.[AlertStartDate],
I.[AlertEndDate],
I.[IsTraining],
I.[TestMessageInterval],
I.[DelayAlertTime],
I.[IsDelayMessageAlert],
I.[IsTestMessageAlert],
I.[IsUnAuthorizedMessageAlert],
I.[IsWANDownAlert],
I.[IsWANUpAlert]) ;
Here is a checksum example that produces the same result for two different inputs.
DECLARE @V1 VARCHAR(10)
DECLARE @V2 VARCHAR(10)
SELECT @V1 = NULL, @V2=NULL
SELECT binary_checksum('KKK','San Jose','1418','1418SVR ',0,1,@V1,@V2,0,30,180,1,0,1,1,1),
binary_checksum('KKK','San Jose','1418','1418SVR ',1,1,@V1,@V2,0,30,180,1,1,1,1,1)
Look at the two binary_checksum calls above: the inputs are different and should not match, but they both return the same value.
Can someone please provide some info on this?
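One way to sidestep the collision in a trigger like the one above is to compare the columns themselves instead of their checksums. The sketch below is a cut-down, hypothetical version (demo table, three columns, invented trigger name), assuming SQL Server 2005 or later for EXCEPT. The EXISTS ... EXCEPT test is true when any listed column differs, and it treats two NULLs as equal, which plain <> comparisons do not.

```sql
-- Hypothetical, trimmed-down stand-in for the Hospital table.
CREATE TABLE dbo.HospitalDemo
(
    HospitalID   INT PRIMARY KEY,
    Name         VARCHAR(50) NULL,
    IsAuthorized BIT NULL,
    ChangeDate   DATETIME NULL
);
GO

CREATE TRIGGER dbo.trHospitalDemo_Update
ON dbo.HospitalDemo
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE h
    SET    h.Name         = i.Name,
           h.IsAuthorized = i.IsAuthorized,
           h.ChangeDate   = GETDATE()
    FROM   dbo.HospitalDemo AS h
    JOIN   inserted AS i
           ON i.HospitalID = h.HospitalID
    -- Row-by-row comparison of the actual values: returns a row
    -- (so EXISTS is true) only when at least one column differs.
    WHERE  EXISTS (SELECT i.Name, i.IsAuthorized
                   EXCEPT
                   SELECT h.Name, h.IsAuthorized);
END;
GO
```

This costs a comparison of every listed column per row, but it cannot silently skip a real change the way a 32-bit checksum match can.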
Mar 22, 2007
For detecting delta records, I'm a big fan of SQLIS' checksum transform. I'm having difficulty with its installation on my current machine, however. After the installation, once the new transform is added to my DataFlow toolbox, I can't open the UI for the transform to define the checksum. Instead, I get the following error:
===================================
Could not load file or assembly 'Microsoft.ExceptionMessageBox, Version=9.0.242.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. The system cannot find the file specified. (Microsoft Visual Studio)
------------------------------
Program Location:
at Konesans.Dts.Pipeline.ChecksumTransform.ChecksumTransformUI.Edit(IWin32Window parentWindow, Variables variables, Connections connections)
at Microsoft.DataTransformationServices.Design.DtsComponentDesigner.StartComponentUI(Boolean startGenericUI)
Anyone have any suggestions? Thanks in advance.
Jul 20, 2007
I am using the Konesans Checksum transformation ( http://www.sqlis.com/21.aspx ) to detect changes in my big (many columns, type 2 SCD) dimensional table.
But I am running into collisions.
The checksum transformation sometimes misses a small change in the record, for instance when a certain flag is set or unset. Is there a more robust checksum generator? Or any other suggestions on how to solve this?
thx
Aug 30, 2006
Does anyone know how to detect the CHECKSUM setting of the PAGE_VERIFY database option (2005 only)?
BOL (ALTER DATABASE) includes the following statement:
PAGE_VERIFY { CHECKSUM | TORN_PAGE_DETECTION | NONE }
The current setting of this option can be determined by examining the page_verify_option column in the sys.databases catalog view or the IsTornPageDetectionEnabled property of the DATABASEPROPERTYEX function.
However, there is no column named page_verify_option in the view sys.databases, and DATABASEPROPERTYEX('IsTornPageDetectionEnabled') does not discriminate between the settings CHECKSUM and NONE (it returns 0 for both)!
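For comparison with the report above, this is the query the Books Online passage describes. It assumes the sys.databases columns page_verify_option and page_verify_option_desc as documented for SQL Server 2005 and later; if the column is reported as missing, the build and compatibility view in use are worth checking. The database name in the ALTER statement is hypothetical.

```sql
-- Documented values: 0 = NONE, 1 = TORN_PAGE_DETECTION, 2 = CHECKSUM
SELECT name,
       page_verify_option,
       page_verify_option_desc
FROM   sys.databases;

-- Switching a (hypothetical) database to checksum verification
ALTER DATABASE SalesDW SET PAGE_VERIFY CHECKSUM;
```

Unlike the IsTornPageDetectionEnabled property, the catalog view distinguishes all three settings.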
Aug 14, 2014
From what I've seen, the CheckSum_Agg function appears to return 0 for an even number of repeated values. If so, then what is the practical use of this function for implementing an aggregate checksum across a set of values?
For example, the following works as expected; it returns a non-zero checksum across (1) value or across (2) unequal values.
declare @t table ( ID int );
insert into @t ( ID ) values (-7077);
select checksum_agg( ID ) from @t;
-----------
-7077
declare @t table ( ID int );
insert into @t ( ID ) values (-7077), (-8112);
select checksum_agg( ID ) from @t;
-----------
1035
However, the function appears to return 0 for an even number of repeated values.
declare @t table ( ID int );
insert into @t ( ID ) values (-7077), (-7077);
select checksum_agg( ID ) from @t;
-----------
0
It's not specific to -7077, for example:
declare @t table ( ID int );
insert into @t ( ID ) values (-997777), (-997777);
select checksum_agg( ID ) from @t;
-----------
0
What's curious is that (3) repeated equal values will return a non-zero checksum.
declare @t table ( ID int );
insert into @t ( ID ) values (-997777), (-997777), (-997777);
select checksum_agg( ID ) from @t;
-----------
-997777
But a set of (4) repeated equal values will return 0 again.
declare @t table ( ID int );
insert into @t ( ID ) values (-997777), (-997777), (-997777), (-997777);
select checksum_agg( ID ) from @t;
-----------
0
Finally, a set of (2) unequal values repeated twice will return 0 again.
declare @t table ( ID int );
insert into @t ( ID ) values (-997777), (8112), (-997777), (8112);
select checksum_agg( ID ) from @t;
-----------
0
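Every result listed in this post matches what a plain bitwise XOR of the inputs would give. That is an inference from the observed outputs, not something I can point to in the documentation, but it is easy to reproduce with the ^ operator:

```sql
-- Bitwise XOR of the same values used in the examples above
SELECT (-7077) ^ (-8112)                   AS TwoUnequalValues,    -- 1035
       (-7077) ^ (-7077)                   AS SameValueTwice,      -- 0
       (-997777) ^ (-997777) ^ (-997777)   AS SameValueThreeTimes, -- -997777
       (-997777) ^ 8112 ^ (-997777) ^ 8112 AS TwoPairs;            -- 0

-- Pairing the aggregate with a row count is one cheap way to tell
-- "two identical rows" apart from "no rows at all".
DECLARE @t TABLE (ID INT);
INSERT INTO @t (ID) VALUES (-7077), (-7077);

SELECT COUNT(*)         AS RowsSeen,
       CHECKSUM_AGG(ID) AS AggChecksum
FROM   @t;
```

If the aggregate really is an XOR fold, it is also insensitive to row order and cancels any value that appears an even number of times, so it is better suited to spotting that something changed in a set than to proving two sets are equal.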
Aug 10, 2015
I'm trying to load data from an old SQL Server 2000 to a new SQL Server 2014. I need to do a checksum to check whether all the source data is loaded in the target database (SQL Server 2014). I've created the insert statement for this, and it works. I need to use checksum to make sure all the source rows are loaded in the target table. I haven't done checksum before.
Here is my insert statement:
INSERT INTO [Test].[dbo].[Order_tab]
([rec_id]
,[date_loaded]
,[Name1]
,[Name2]
,[Address1]
,[Address2]
[code]....
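A rough sketch of one way to compare the two sides, using only the columns visible in the truncated statement above (the rest of the column list is unknown and left out). The idea is to run the same query against the source table and the target table and compare the two result rows. The choice of columns is an assumption: date_loaded is left out on the guess that it is set at load time and would differ between servers.

```sql
-- Run once on the source and once on the target, then compare the output.
-- Column list is limited to what the post shows; extend it as appropriate.
SELECT COUNT(*) AS RowsLoaded,
       CHECKSUM_AGG(
           BINARY_CHECKSUM([rec_id], [Name1], [Name2], [Address1], [Address2])
       )        AS TableChecksum
FROM   [Test].[dbo].[Order_tab];
```

A mismatch in either number shows the load is incomplete or altered. Matching numbers are only weak evidence, since different data can produce the same checksum, and it is worth testing on a small sample that SQL Server 2000 and SQL Server 2014 produce the same BINARY_CHECKSUM for the data types involved before trusting the comparison.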