Skip to content

Technical Reconciliation

Technical reconciliation validates data movement from source to target. As part of Echelon, flow data (or record movement) is captured from point-to-point. Technical reconciliation uses this data in conjunction with a formula to validate if records were all accounted for.

Flow Labels

Storing the flow metrics in the flow table facilitates technical reconciliation.

In order to ensure technical reconciliation can be performed, the label column within the flow table must match the following standards:

Description Label
Number of records present in the source
entity that were selected.
source_count
Number of records inserted into the target entity. insert_count
Number of records that were found to be
identical in the source and the target entity.
match_count
Number of records that were not inserted into
the target entity because of an error.
reject_count
Number of records inserted into the target entity
as part of SCD Type 2.
update_insert_count
Number of records updated in the target entity
as part of SCD Type 2.
end_date_count
Number of records that are not accounted for.
If this value is greater than or less than zero
technical reconciliation has failed.
technical_reconciliation_variance

Reconciliation Formulas

The following formulas have been set up in the constant table to enable technical reconciliation through the Echelon. These formulas live in the constant table and are retrieved during the technical reconciliation job.

Labelling of reconciliation job depends on the target entity type. The results of reconciliation are stored in the flow table.

File

All

Formula

{source_count} - {insert_count}

Labels

Job Label used in flow table
Retrieve the number of records in the header (control file) or the file. source_count
Count the number of records loaded into the staging table. insert_count

Data Vault

Hub

Formula

{source_count} - {insert_count} - {match_count} - {filter_count}

Labels

Job Label used in flow table
Count the number of records in the staging table. source_count
Count the new records loaded into the hub (i.e., the new inserted because new business key identified). Technically a new hash_key has been identified. insert_count
Count the number of records not loaded into the hub because the business key already exists in the hub. Technically the hash_key already exists in the hub. match_count
Count the number of records not loaded into hub due to the hard rule being broken. reject_count
Satellite

Formula

{source_count} - {insert_count} - ({update_insert_count} + {end_date_count})/2 - {match_count}

Labels

Job Label used in flow table
Count the number of records in the staging table. source_count
Count new records loaded into satellite ( i.e., the new is inserted because new business key identified). Technically a new hash_key identified. insert_count
Count the number of records where the business key already exists in satellite but new descriptive information needs to be loaded. Technically hash_key already exists in the satellite but the new hash_diff_key needs to be inserted. update_insert_count
Count the number of records where business key already exists but needs to be end dated because new descriptive information has been identified and loaded. end_date_count
Count the number of records that already exist in satellite and have no update. Technically hash_key and hash_diff_key are the same. match_count
Count the number of records rejected due to the hard rule being broken. reject_count
Link

Formula

{source_count} - {insert_count} - {match_count} - {filter_count}

Labels

Job Label used in flow table
In source, count the number of records with a business key combination that defines the link. source_count
Count the number of new records loaded into the link ( i.e., the new inserted because the new business key combination was identified). Technically a new hash_key has been identified. insert_count
Count the number of records not loaded into the link because the business key combination already exists in the link. Technically the hash_key already exists in the link. match_count
Count the number of records not loaded into the link due to the hard rule being broken. reject_count

Dimensional Layer

Dimension Type 1

Formula

{source_count} - {insert_count} - {match_count} - {update_count}

Labels

Job Label used in flow table
Count the number of records in the source query. source_count
Count the number of new records loaded into the dimension ( i.e., the new is inserted because new business key identified). Technically a new hash_key identified. insert_count
Count the number of records where the business key already exists in dimension but new descriptive information needs to be loaded. Technically hash_key already exists in dimension but new hash_diff_key needs to be inserted. update_count
Count the number of records not loaded into the dimension because the business key combination already exists in the dimension. Technically the hash_key already exists in the dimension. match_count
Dimension Type 2

Formula

{source_count} - {insert_count} - ({update_insert_count} + {end_date_count})/2 - {match_count}

Labels

Job Label used in flow table
Count the number of records in the source query. source_count
Count new records loaded into dimension (i.e.the new is inserted because new business key identified). Technically a new hash_key identified. insert_count
Count the number of records where the business key already exists in the dimension but new descriptive information is needed to be loaded. Technically hash_key already exists in dimension but new hash_diff_key needs to be inserted. update_insert_count
Count the number of records where the business key already exists but needs to be end dated because new descriptive information has been identified and loaded. end_date_count
Count the number of records that already exist in dimension and have no update. Technically hash_key and hash_diff_key are the same. match_count
Count the number of records rejected due to business. reject_count
Accumulating Snapshot Fact

Formula

{source_count} - {insert_count} - {update_count}

Labels

Job Label used in flow table
Count the number of records in the source query. source_count
Count the number of records where the business key already exists in the fact but new descriptive information is needed to be loaded. Technically hash_key already exists in dimension but new hash_diff_key needs to be updated. update_insert_count
Count the number of new records loaded into the fact (i.e., the new is inserted because new business key identified). Technically a new hash_key identified. insert_count
Snapshot Fact

Formula

{source_count} - {insert_count} - {update_count}

Labels

Job Label used in flow table
Count the number of records in the source query. source_count
Count the number of records where the business key already exists in the fact but new descriptive information is needed to be loaded. Technically hash_key already exists in dimension but new hash_diff_key needs to be updated. update_insert_count
Count the number of new records loaded into the fact (i.e., the new is inserted because new business key identified). Technically a new hash_key identified. insert_count
Transaction Fact

Formula

{source_count} - {insert_count}

Labels

Job Label used in flow table
Count the number of records in the source query. source_count
Count the number of new records loaded into the fact (i.e., the new is inserted because new business key identified). Technically a new hash_key identified. insert_count