A golden record is a unique record that has no duplicates and whose associated records are linked together correctly. Let's look at a couple of examples:
For B2B data, a company such as Acme Group Ltd may have two subsidiaries, say Acme Finance Ltd and Acme Insurance Ltd. Both subsidiaries should point to their parent company. If no other record exists for Acme Group Ltd (i.e. no duplicates exist), then the record for Acme Group Ltd is considered a golden record. We would also check that the companies are currently trading.
For B2C data, a house may have three people living there: Mr Steven Gardiner, Mrs Hilary Gardiner and Ms Emma Gardiner. The database will contain three records with the same address. As the surnames are the same, it is reasonable to assume that they belong to the same household. One of the residents can be marked as the primary contact, and that record can be considered the golden record. We would also check that these residents have not moved recently, that they are still alive, and that you have permission to contact them.
Creating business hierarchies or household records follows the same process as creating golden records. The following diagram illustrates the process:
Stage 1 – Standardisation
Before the data is searched for duplicates, it is important that it is cleansed to maximise the number of genuine duplicates found. Some common issues that arise are:
1. Data is in the wrong field, e.g. a telephone number is stored in a job title field
2. Incorrectly formatted data can affect deduplication results, e.g. in a surname field holding ‘Shelley (BSc.)’ the post-nominal letters can prevent a match
3. Data has not been verified against reference files, e.g. address matching against the Postal Address File (PAF), or matching against a consumer names or business names file to verify details
Once all the fields have been standardised and cleansed as far as possible, we are ready for the next stages.
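The first two issues in the list above can be sketched in a few lines of Python. This is a minimal illustration only: the field names (`surname`, `job_title`, `telephone`), the short list of post-nominal letters and the loose phone pattern are all assumptions made for the example, not a complete cleansing rule set.

```python
import re

# Hypothetical, deliberately short list of post-nominal letters to strip.
POST_NOMINALS = {"bsc", "ba", "msc", "ma", "phd", "mba"}

# Loose pattern: optional "+", then only digits, spaces, brackets or hyphens.
PHONE_LIKE = re.compile(r"^\+?[\d\s()\-]+$")


def clean_surname(raw: str) -> str:
    """Drop post-nominal tokens such as '(BSc.)' that follow the surname."""
    kept = []
    for position, token in enumerate(raw.split()):
        bare = re.sub(r"[^a-z]", "", token.lower())
        # Never drop the first token, so a genuine surname like 'Ma' survives.
        if position > 0 and bare in POST_NOMINALS:
            continue
        kept.append(token)
    return " ".join(kept)


def looks_like_phone(value: str) -> bool:
    """True if the value has phone-like characters and at least 7 digits."""
    value = value.strip()
    digits = sum(ch.isdigit() for ch in value)
    return bool(PHONE_LIKE.match(value)) and digits >= 7


def standardise(record: dict) -> dict:
    """Return a cleaned copy of one contact record."""
    cleaned = dict(record)
    cleaned["surname"] = clean_surname(record.get("surname", ""))
    job_title = record.get("job_title", "")
    if looks_like_phone(job_title):
        # Move the misplaced number, unless a telephone is already held.
        if not cleaned.get("telephone"):
            cleaned["telephone"] = job_title.strip()
        cleaned["job_title"] = ""
    return cleaned
```

For example, `standardise({"surname": "Shelley (BSc.)", "job_title": "01234 567890"})` would return a record with the surname `Shelley`, an empty job title and the number moved to `telephone`. Matching against reference files such as PAF is not shown, as it depends on the reference data and tooling available.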
Stage 2 – Data Enhancement
To further identify duplicates and create golden records, additional information may be needed to help the decision-making process. For example, with business data we can append a DUNS number, its Parent DUNS number and the Ultimate Parent DUNS number. This allows business hierarchies to be created, in addition to identifying the head company. For B2C, adding information such as the main income provider would help to decide who the primary contact for that household is.
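As a rough sketch of how appended parent links allow a hierarchy to be built, the Python below walks each company up to its head company. The record layout and the DUNS values are made-up placeholders, and deriving the ultimate parent by following parent links is an assumption for illustration; in practice the Ultimate Parent DUNS number would normally be appended directly from the data supplier.

```python
# Hypothetical records; the DUNS values are placeholders, not real numbers.
companies = [
    {"name": "Acme Group Ltd", "duns": "000000001", "parent_duns": None},
    {"name": "Acme Finance Ltd", "duns": "000000002", "parent_duns": "000000001"},
    {"name": "Acme Insurance Ltd", "duns": "000000003", "parent_duns": "000000001"},
]


def ultimate_parent(duns, parent_of):
    """Follow parent links upwards until a company with no parent is reached."""
    seen = set()
    while parent_of.get(duns) and duns not in seen:
        seen.add(duns)  # guards against a circular parent reference
        duns = parent_of[duns]
    return duns


parent_of = {c["duns"]: c["parent_duns"] for c in companies}

# Group every company under its head company to form the hierarchy.
hierarchy = {}
for company in companies:
    head = ultimate_parent(company["duns"], parent_of)
    company["ultimate_parent_duns"] = head
    hierarchy.setdefault(head, []).append(company["name"])
```

Here all three Acme records end up under the same head company, which is the record for Acme Group Ltd.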
Stage 3 – Deduplication
Once this extra information is populated we can deduplicate the database and assign a cluster identifier to each group of linked records, which potentially contains the golden record and its associated records. A cluster identifier is a value that is the same for all the records that are linked in some way.
The process of deduplication is very much a science. It often requires many steps to identify all the real duplicates in a file. Here is an example of a three-step deduplication process:
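One common way to turn pairs of matched records into cluster identifiers is a union-find (disjoint set) structure, sketched below in Python. The function name and the choice of the smallest record ID as the cluster identifier are assumptions for the example; any stable value shared by the linked records would do.

```python
def assign_clusters(record_ids, duplicate_pairs):
    """Map each record ID to a cluster ID shared by all linked records."""
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Walk up to the root of the cluster, shortening the path as we go.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: the smallest record ID becomes the cluster ID.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {rid: find(rid) for rid in record_ids}


clusters = assign_clusters([1, 2, 3, 4, 5], [(1, 2), (2, 3)])
```

Records 1, 2 and 3 are linked through the two pairs and share one cluster, while records 4 and 5 each stay in a cluster of their own.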
1. If Company Name, Address Line 1, Postcode are exactly the same then this is a duplicate
2. For the remaining records, if the Company Name and Postcode are the same, and Address Line 1 matches with a score of at least 70%, then this is a duplicate
3. For the remaining records, if the Company Name, Postcode and Address Line 1 match with a confidence value of at least 70% then this is a duplicate
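The three steps above can be sketched in Python using the standard library's `difflib.SequenceMatcher` as a stand-in similarity score. Several things here are assumptions: the field names, the use of `SequenceMatcher` rather than a dedicated matching engine, the reading of step 3 as "every field scores at least 70%", and the treatment of values as exactly the same once case and spacing are normalised.

```python
from difflib import SequenceMatcher

FIELDS = ("company_name", "address_line_1", "postcode")


def norm(value: str) -> str:
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 from difflib's ratio()."""
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def match_step(a: dict, b: dict, threshold: float = 0.70):
    """Return 1, 2 or 3 for the first step that flags a duplicate, else None."""
    same = {f: norm(a[f]) == norm(b[f]) for f in FIELDS}

    # Step 1: all three fields are the same.
    if all(same.values()):
        return 1

    # Step 2: name and postcode are the same, address is a fuzzy match.
    if (same["company_name"] and same["postcode"]
            and similarity(a["address_line_1"], b["address_line_1"]) >= threshold):
        return 2

    # Step 3 (assumed reading): every field scores at least the threshold.
    if min(similarity(a[f], b[f]) for f in FIELDS) >= threshold:
        return 3

    return None


base = {"company_name": "Acme Finance Ltd",
        "address_line_1": "1 High Street", "postcode": "AB1 2CD"}
short_address = dict(base, address_line_1="1 High St")
long_name = dict(base, company_name="Acme Finance Limited")
unrelated = {"company_name": "Bolt Hardware Ltd",
             "address_line_1": "99 Mill Lane", "postcode": "ZZ9 9ZZ"}
```

With this sample data, `short_address` is caught at step 2, `long_name` at step 3, and `unrelated` is not flagged. Running the steps in order means the strictest rule always gets the first chance to explain a match, which makes the results easier to review.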
For more details on how to find your duplicates, please view other posts or contact Acuate for more information.
Stage 4 – Automatic Merging
Deciding how to merge records can be a difficult task. It is important to understand the business requirements for merging these records. You can merge using just the basic data available, or you may want to take into consideration extra factors such as:
1. Frequency of purchases
2. Size of purchases
3. Gold, Silver or Bronze customer status
4. Age of record
Using commercial data to define the merging criteria is key to creating high-quality golden records. There are lots of merging criteria that can define a golden record.
Once this has been done, some or all of the merging criteria may be applied automatically, depending on the complexity of the underlying data structure and/or the complexity of the merging rules.
Any related information must also be merged. For example, if two company records are to be merged, then the contact records belonging to both companies will also need to be merged.
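A minimal Python sketch of an automatic merge is shown below. It picks a surviving record using the kinds of factors listed earlier, fills its blank fields from the other records in the cluster, and re-points contacts to the survivor. The field names, the ranking order and the preference for the most recently updated record are all assumptions for the example; the real criteria should come from the business requirements.

```python
from datetime import date

# Assumed ranking of customer status, highest first.
STATUS_RANK = {"Gold": 3, "Silver": 2, "Bronze": 1}


def survivor_key(record):
    """Order records by status, purchase frequency, purchase size, then recency."""
    return (
        STATUS_RANK.get(record["status"], 0),
        record["purchase_count"],
        record["total_spend"],
        record["last_updated"],
    )


def merge_cluster(records, contacts):
    """Merge one cluster of duplicate company records and re-point contacts."""
    golden = max(records, key=survivor_key)
    merged = dict(golden)
    for other in records:
        if other is golden:
            continue
        # Fill gaps in the survivor from the records being retired.
        for field, value in other.items():
            if merged.get(field) in (None, ""):
                merged[field] = value
    retired_ids = {r["id"] for r in records} - {golden["id"]}
    repointed = [
        {**c, "company_id": golden["id"]} if c["company_id"] in retired_ids else c
        for c in contacts
    ]
    return merged, repointed, retired_ids


records = [
    {"id": 1, "status": "Silver", "purchase_count": 5, "total_spend": 1000,
     "last_updated": date(2020, 1, 1), "telephone": "01234 567890"},
    {"id": 2, "status": "Gold", "purchase_count": 2, "total_spend": 500,
     "last_updated": date(2019, 1, 1), "telephone": ""},
]
contacts = [{"name": "A", "company_id": 1}, {"name": "B", "company_id": 2}]
golden, repointed, retired_ids = merge_cluster(records, contacts)
```

In this example record 2 survives because of its Gold status, picks up the telephone number from record 1, and both contacts end up linked to it.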
Stage 5 – Manual Merging
Often some records cannot be merged automatically and require human intervention to determine which is the golden record. The decision on which record becomes the golden record, and what to merge into it, should be made by a data steward or a marketing executive who has knowledge of the data.
Stage 6 – Deletion
Once records have been merged, there will be records that are no longer needed. These are often called orphan records, i.e. records that are not linked to anything. These records need to be deleted as they are redundant.
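Identifying orphans can be as simple as checking which records no longer link to a live parent record. The Python sketch below assumes a hypothetical layout in which each contact carries a `company_id`; it separates the orphans out so they can be reviewed before they are deleted.

```python
def split_orphans(companies, contacts):
    """Separate contacts linked to a live company from those linked to nothing."""
    live_ids = {company["id"] for company in companies}
    kept = [c for c in contacts if c["company_id"] in live_ids]
    orphans = [c for c in contacts if c["company_id"] not in live_ids]
    return kept, orphans


# Hypothetical data: company 1 was retired during merging, company 2 survived.
companies = [{"id": 2, "name": "Acme Group Ltd"}]
contacts = [
    {"name": "A", "company_id": 2},
    {"name": "B", "company_id": 1},
]
kept, orphans = split_orphans(companies, contacts)
```

Here contact B is returned as an orphan because the company it pointed to no longer exists.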
The above process is a brief overview of the procedures required to create golden records or a single customer view. It does not matter whether the data is in one file or in a complex database with hundreds of tables. The high-level process is the same, but the technical challenges will centre on unravelling the complexity of the structure.
It is vitally important that companies hold golden records for both B2B and B2C data. Having good data quality will maximise the effectiveness of marketing campaigns and allow the business to tactically manage accounts.