Part 8: This is the final of eight posts on how to measure data quality. This post describes why integrity is a good measure and how it’s used.
Integrity refers to the relationships between data entities. For example, companies are linked to contacts and addresses, contacts can be linked to job titles, and so on.
A common integrity problem is orphaned contacts, that is, contacts that are not linked to accounts. The reverse also occurs: companies with no contacts. Many such scenarios can arise, and typically each one requires different treatment to resolve.
Integrity is measured separately for each scenario (or business rule). Each scenario is given a percentage that highlights the severity of the problem, for example the share of contacts that are orphaned. If there are 5 scenarios to look at, 5 separate percentages are required.
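As a rough illustration, the two scenarios mentioned above (orphaned contacts and companies with no contacts) could be measured along these lines. This is only a sketch: the function name `integrity_report`, the `id` and `account_id` fields, and the sample records are all assumptions made for the example, not part of any particular system.

```python
def integrity_report(accounts, contacts):
    """Return the percentage of records that break each integrity rule.

    Assumes (hypothetically) that each account has an "id" and each
    contact has an "account_id" that should match an existing account.
    """
    account_ids = {a["id"] for a in accounts}
    linked_account_ids = {c["account_id"] for c in contacts}

    # Rule 1: a contact is orphaned if its account_id is missing or unknown.
    orphaned = [c for c in contacts if c["account_id"] not in account_ids]
    # Rule 2: an account has no contacts if no contact points at it.
    empty = [a for a in accounts if a["id"] not in linked_account_ids]

    def pct(part, whole):
        # Guard against empty tables to avoid dividing by zero.
        return round(100 * len(part) / len(whole), 1) if whole else 0.0

    return {
        "orphaned_contacts_pct": pct(orphaned, contacts),
        "accounts_without_contacts_pct": pct(empty, accounts),
    }


# Made-up sample data for illustration only.
accounts = [
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
    {"id": 3, "name": "Initech"},
    {"id": 4, "name": "Umbrella"},
]
contacts = [
    {"name": "Ann", "account_id": 1},
    {"name": "Bob", "account_id": 1},
    {"name": "Cy", "account_id": None},  # no account at all
    {"name": "Di", "account_id": 99},    # points at an unknown account
    {"name": "Ed", "account_id": 2},
]

print(integrity_report(accounts, contacts))
```

Each rule produces its own percentage, so adding another scenario means adding another rule and another entry in the result rather than folding everything into a single score.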
Which scenarios matter depends on the organisation and on the level of integrity it requires.
The other dimensions are:
- Completeness (Part 2 of 8)
- Accuracy (Part 3 of 8)
- Consistency (Part 4 of 8)
- Conformity (Part 5 of 8)
- Currency (Part 6 of 8)
- Duplication (Part 7 of 8)