Importance of Data Reconciliation

John Shanks   •   03.20.2019

Since working in the Business Intelligence realm, I’ve worked on many projects, including new installations, maintaining existing systems, and consolidating different systems. There are many things that are common with these projects, but one thing that stands out to me is the importance of data reconciliation. A business intelligence system can be a very complex system and when you are pulling data from source system(s) there are many things that can happen that cause the data to be out of sync with the source.

Issues with Data Flows

  • Extracting data from multiple sources
  • Timing issues
  • Conversion issues
  • Consolidation issues
  • Manual intervention

The technical aspects of a Business Intelligence system can be working but without any sort of measurement showing the accuracy of your data, it is difficult to sell the system to end users.

Reasons for Reconciling

  • Help to sell a system when users are still gaining trust in the information.
  • Help developers stay in front of problems rather than having end users notify them of issues

Business Intelligence professionals should make checking data integrity reports a daily habit.

Working with a Business Intelligence environment, it is important to reconcile the data at the beginning of the process as well as making reconciliations an on-going process. It is best to reconcile the data at each step (staging, ODS, EDW) of the process through to the end.

  • Check points should be put in place to make sure data is flowing accurately from source to destination.
  • Alerts should be associated to these checkpoints to notify the proper people, so they can act accordingly.

In the past it could be a very daunting task to set up a reconciliation test because of all of the different sources of data plus the transformations that occur on the data. When voicing these concerns at my last project, Rob Long, VP of consulting services at Talavant, mentioned that he had developed a solution, using a SQL Server Database, SSIS, and SSRS, that helps alleviate some of the pain of setting up a reconciliation test, plus he had written a data integrity report to display the results of each test.

The remainder of this BLOG post will show how I was able to set up a reconciliation test, execute the test after the nightly ETL process, displayed the results of the reconciliation test, and set up an SSRS data driven subscription to alert the appropriate audience when the reconciliation test failed.

  • Set up a connection for each query that I would be running in the test

  • Setting up the reconciliation test


/* TEST TEMPLATE
Instructions:
1. Populate the Test Attributes (Name, Description, and Group)
2. Define the target server, database, and query for the resultset to be evaluated.
a. Repeat this section as many times as needed.
3. Execute the script (This can be run multiple times, and will make changes to queries.)
*/
--------------------------------------------------------------------------------------------------
-- 1.
-- CREATE Test Definition
--------------------------------------------------------------------------------------------------
DECLARE @TestDefinitionId INT;
EXEC data_testing.AddTest @Name = 'Talavant Reconciliation', -- Unique Test Name, this should be distinctive
@Description = 'Testing ReconDB_1, ReconDB_2, and ReconDB_3', -- A place holder to define this test. Make use of this, you'll appreciate it later
@Group = 'Talavant Staff', -- Put this test in a group to have it executed with other tests at the same time
@TestDefinitionId = @TestDefinitionId OUTPUT; -- Used later. No need to adjust this.

————————————————————————————————–
— ASSIGNING QUERY & CONNECTION COMBINATIONS TO THE TEST
————————————————————————————————–

————————————————————————————————–
1. Query 1
— Assigns a query and connection to the test defined above.
— This section will should be repeated for each result set to be evaluated in the test.
————————————————————————————————–
EXEC data_testing.AssignNewQuery @TestDefinitionId = @TestDefinitionId, — int
@ServerName = ‘Localhost’, — The remote server name
@Database = ‘ReconDB_1’, — The remote database name
@QueryName = ‘ReconDB_1 Staff Table’, — A unique query name, keep in mind that a single query may be run against multiple systems
@CommandText = ‘
select

Test1 As GroupName
,Test2 AS Value

from dbo.ReconTable1;
‘; — OPTIONAL : The SELECT Statement that meets the needs of the test to return results for evaluation

————————————————————————————————–
2. Query 2
— Assigns a query and connection to the test defined above.
— This section will should be repeated for each result set to be evaluated in the test.
————————————————————————————————–
EXEC data_testing.AssignNewQuery @TestDefinitionId = @TestDefinitionId, — int
@ServerName = ‘Localhost’, — The remote server name
@Database = ‘ReconDB_2’, — The remote database name
@QueryName = ‘ReconDB_2 Staff Table’, — A unique query name, keep in mind that a single query may be run against multiple systems
@CommandText = ‘
select

Test1 As GroupName
,Test2 AS Value

from dbo.ReconTable2;

‘; — OPTIONAL : The SELECT Statement that meets the needs of the test to return results for evaluation

————————————————————————————————–
3. Query 3
— Assigns a query and connection to the test defined above.
— This section will should be repeated for each result set to be evaluated in the test.
————————————————————————————————–
EXEC data_testing.AssignNewQuery @TestDefinitionId = @TestDefinitionId, — int
@ServerName = ‘Localhost’, — The remote server name
@Database = ‘ReconDB_3’, — The remote database name
@QueryName = ‘ReconDB_3 Staff Table’, — A unique query name, keep in mind that a single query may be run against multiple systems
@CommandText = ‘
select

Test1 As GroupName
,Test2 AS Value

from dbo.ReconTable3;

‘; — OPTIONAL : The SELECT Statement that meets the needs of the test to return results for evaluation

————————————————————————————————–
— No changes needed.
— Show the overall design of the Test
————————————————————————————————–
SELECT *
FROM data_testing.TestDesign
WHERE TestDefinitionId = @TestDefinitionId;

  • Running the reconciliation test


DECLARE @Execution_id BIGINT
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'TestDataAquisition.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'Reconciliation', @project_name=N'ReconciliationFramework', @use3
SELECT @execution_id
DECLARE @var0 SQL_VARIANT = N'Finance Metrics'
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=30, @parameter_name=N'TestGroupName', @parameter_value=@var0
DECLARE @var1 SMALLINT = 2
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_ID, @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var1
EXEC [SSISDB].[catalog].[start_execution] @execution_id
GO

Sample Data Integrity Report

You can see that the report below shows the results of query 1 versus query 2 versus query 3.

Data Reconciliation

Data Reconciliation

 

You can also drill into the details of the results.

Data Reconciliation 

 

 


John Shanks

Senior Consultant

Share This Article


At Talavant, our goal is to help you discover, ask and answer the right questions – so you get more value from your data to drive your business forward. You’ll gain a competitive edge with faster, more relevant analytics. It’s all part of our focused, holistic approach that goes beyond tools.

Ready to get started?

Contact Us