Having worked on a number of data migration projects over the years, I have my own list of the 11 must-dos for any successful data migration. These do, of course, apply to other activities that we undertake, including data integration; however, for me, they act as the perfect recipe for a trouble-free data migration.
These steps are not necessarily in the order that you would undertake them, some of these tasks will have dependencies, and some tasks will have a number of iterations throughout the entire data migration process.
Hopefully this is a check-list that ensures that nothing is missed.
For now, I’ll let someone else worry about Stake Holder Management, Programme Management, Project Management, Business/Systems analysis, and the Waterfall vs. Agile debate.
- Understand your data.
- What is it for?
- Where has it come from?
- Where is it going?
- Understand your Reporting and Auditing needs.
- Understand the Reporting and Auditing needs of the business.
- Qualify success.
- Define the process for data that fails migration.
- Acquire your data.
- Keep it simple.
- Minimum of fuss and impact on source systems.
- Identify your data.
- Identify the source.
- Identify the row number.
- Identify the row (add your own UUID).
- Profile your data.
- Up-front profiling.
- In-flight profiling.
- Quality check your data.
- Reject rows that do not meet the minimum quality standard.
- Minimise risk of failure.
- Mandatory fields.
- Constrained Values.
- Formats / Conversions.
- Length / Scale / Precision checks.
- Reject entire inputs that do not meet the minimum quality threshold (percentage).
- Prepare your data.
- How does it stand on its own merits?
- Apply cleansing rules.
- Create look-up keys (for Constrained Values).
- Finalise your data.
- Shape your data so that it is ready for loading.
- Minimise risk of failure.
- Apply Foreign Keys.
- Reject rows that fail Foreign Key validation.
- Constrained Value mapping.
- Apply Formatting / Conversion rules.
- Field Truncation.
- Reject inputs that do not meet the minimum finalisation success threshold (percentage).
- Load your data.
- Minimal risk of failure.
- Reject inputs that do not meet the minimum load success threshold (percentage).
- Checkpoint your migration.
- Allow for sensible restart points in the event of a failure.
- Restart from an appropriate point on resolution of an underlying issue.
- Test your data.
- Review output from each stage.
- Review finalised data (and then review again).
- Review data in target application.
Reporting / Auditing
- Collect Statistics.
- Stamp your target data with your own row identification (UUID).
- Retain data from intermediate steps in files or database.
- Report on your data migration.
- Audit your data migration.
- Be able to show data lineage.