Quick T-SQL Regression Testing
Using CHECKSUM() and CHECKSUM_AGG()
Here’s something I’ve found useful in SQL Server, but it should apply in any DBMS with checksum functions. Next time you’re refactoring some database objects and you want to do some regression testing, give the
CHECKSUM (MSDN) and
CHECKSUM_AGG (MSDN) functions a try.
CHECKSUM and CHECKSUM_AGG
They behave pretty much as you would expect;
CHECKSUM returns 1 checksum given 1 row, and
CHECKSUM_AGG is an aggregate function that returns 1 checksum given multiple checksum rows. Between the two of them, you can get a checksum for any arbitrary collection of data:
-- Returns 2 rows with 2 columns SELECT Field1, Field2 FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2) -- Returns 2 rows with 1 checksum column SELECT CHECKSUM(Field1,Field2) FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2) -- Returns 1 row with 1 checksum column SELECT CHECKSUM_AGG(CHECKSUM(Field1,Field2)) FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2)
These help with a common problem: when you’re refactoring a database object, how can you be sure that you haven’t changed the existing functionality? Sometimes it’s enough to spot check a few inputs, but often the logic is complex enough that this does not provide a high level of confidence. With
CHECKSUM_AGG, you can quickly check large numbers of test cases.
For example, say we have a TVF called MyDateTVF that takes in a date parameter and returns columns Column1 and Column2. I want to refactor this function and then test that the functionality is unchanged for every date in 2018. If we have a table called DateList which contains every date (more useful than you might expect in a DB with a lot of temporal aspects), I can simply run the following before and after my change:
SELECT CHECKSUM_AGG(CHECKSUM(MDT.Column1, MDT.Column2)) FROM DateList DL CROSS APPLY ( SELECT Column1, Column2 from dbo.MyDateTVF(DL.Date) ) MDT WHERE DL.Date BETWEEN '2018-01-01' AND '2018-12-31'
If that query returns the same checksum before and after my change, we can rule out any regressions for those inputs. Nice and easy!