My Tech Learning: Checksum Vs. Hashbytes

Tuesday, March 23, 2010

Checksum Vs. Hashbytes

What is Checksum?
Checksum is a function available in sql server 2005 and sql server 2008. It is intended to build a hash index based on an expression or column list.

Determining if two rows or expressions are equal can be a difficult and resource intensive process. This can be the case where the update was conditional based on all of the columns being equal or not for a specific row. Without checksums it is a long process of using innerjoins to identify the exact row to update in the update statements.

When is Checksum used?
When you need to compare the unique characteristics of an expression, columns or a table consider using the Checksum function.

When you have to update a row where many columns are compared to determine if the data is unique, use CHECKSUM function to build a unique value using checksum function and then compare the CHECKSUM values.

Example:

Hashbytes

What is hashbytes function?
HashBytes function was introduced in SQL server 2005 to simplify creating hashes in the database. It can convert values to MD2, MD4, MD5, SHA, or SHA1 formats.

When can you use a hash function?
Hash functions are a useful option to improve the efficiency of particular queries on large volumes of data. For example, on long strings of text, you can build a hash-index to perform efficient lookups or to speed up aggregate operations.

Differences between checksum and Hashbytes

Checksum	Hashbytes
Faster but can produce lot of duplicate values	Slower but efficient than checksum
Returns an int value	Returns a varbinary(8000)
Microsoft does NOT recommend using CHECKSUM for change detection purposes	Use Hashbytes for change detection purposes

1 comment:

Rudrashis said...: Nice post..; January 28, 2014 9:49 PM

Tuesday, March 23, 2010

Checksum Vs. Hashbytes

1 comment:

How to download files from Microsoft Fabric Lakehouse ?