Migrating Data Successfully


Businesses invest a lot of money (usually more than budgeted), time, and other resources to migrate their legacy data. The reasons for making such an investment range from upgrading to the latest version of an enterprise resource planning system to making old data compatible with new information systems. The success of these data migration projects depends on a number of critical factors. This article looks at a few of them.

Consider Data Migration as a Separate Project
In most organizations, data migration occurs either as part of the customization of an enterprise resource planning (ERP) package or during implementation of a new system. This approach has a few pitfalls. Usually the cost and time estimates for a project cover both customization and implementation, which includes data migration. In most cases these estimates understate the effort and the risks associated with data migration. Customization and implementation either take precedence over data migration or run in parallel with it. Yet a majority of the problems at the end of a project are due to poor data migration tools, techniques, or utilities. The newly implemented systems or customized code may work well independently, but several problems surface once they run on migrated data.

To resolve this issue, treat data migration separately from customization and implementation. Data migrations are most often performed by conversion utilities that are developed in-house for this specific task. Since conversion utilities are pieces of software, it is logical to treat their development and testing as a separate project. This allows for independent design, development, and testing of a utility before executing it on the existing system. Conversion, despite being a one-off process, has a huge impact on customization and implementation.

Review the Quality of the Data
Data quality can pose serious issues for data migration projects. Poor quality can lead to extreme delays and can even cause projects to fail. In any given organization, data is created or manipulated by people, processes, and technology. Lack of user knowledge, absence of stable and robust processes, and missing relationship linkages can lead to poor quality data in any system. The most common issues that emerge with data are:

  1. Incomplete Data--Data can be missing partially or completely. For example, if a record has six fields and some of them are empty, the data is deemed to be incomplete. Such data records cause problems during migration unless the utility is designed to handle these scenarios.
  2. Duplicate Data--Multiple instances of the same data are a big problem during data migration. A conversion utility is unlikely to ignore duplicate records on its own. Because the data format is different in each of the duplicate records, even though the information is the same, it is difficult to identify the duplicates and filter them out.
  3. Data Non-conformity--This refers to information stored in non-standard formats such as free text fields.
  4. Inconsistent Data--When various systems are merged, the data can lack consistency and misrepresent information.
  5. Inaccurate Data--Data deteriorates over time, which can cause a lot of difficulties during migration.
  6. Data Integrity--Missing relationship linkages can drastically degrade the quality of data and pose problems during migration.
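Several of the issues listed above can be checked mechanically before any conversion code is written. The following is a minimal sketch in Python, not a definitive implementation: the record layout, the field names (`id`, `name`, `email`, `customer_ref`), and the sample rows are all hypothetical, and the normalization used for duplicate detection (lowercasing and collapsing whitespace) is only one simple assumption about how duplicate records might differ in format.

```python
from collections import defaultdict

def is_blank(value):
    """Treat None and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def find_incomplete(records, required):
    """Return (record id, missing fields) for records with empty required fields."""
    report = []
    for rec in records:
        missing = [f for f in required if is_blank(rec.get(f))]
        if missing:
            report.append((rec.get("id"), missing))
    return report

def normalize(value):
    """Lowercase and collapse whitespace so differently formatted copies compare equal."""
    return " ".join(str(value).lower().split())

def find_duplicates(records, key_fields):
    """Group record ids whose normalized key fields are identical."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalize(rec.get(f, "")) for f in key_fields)
        groups[key].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

def find_broken_links(children, parents, fk="customer_ref", pk="id"):
    """Return ids of child records that point at a parent that does not exist."""
    known = {p[pk] for p in parents}
    return [c["id"] for c in children if c.get(fk) not in known]

# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "name": "Acme Corp", "email": "info@acme.example"},
    {"id": 2, "name": "  ACME   corp ", "email": "INFO@ACME.EXAMPLE"},
    {"id": 3, "name": "Globex", "email": ""},
]
orders = [
    {"id": 10, "customer_ref": 1},
    {"id": 11, "customer_ref": 99},
]

print(find_incomplete(customers, ("name", "email")))
print(find_duplicates(customers, ("name", "email")))
print(find_broken_links(orders, customers))
```

Checks like these only flag candidate problems. Deciding which duplicate to keep or how to fill a missing field remains a business decision.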

Prior to developing a conversion utility for data migration, it is worthwhile to research the types of data present in the system. This is similar to the requirements gathering phase of any project. Listing all the types of data that need to be converted reduces the risk of errors.
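One way to build such a list is to profile the legacy data and record the distinct value formats found in each field. The sketch below is an illustration under stated assumptions rather than a prescribed method: the `shape` and `profile` helpers, the field names, and the sample rows are hypothetical. Each value is reduced to a pattern (digits become `9`, letters become `A`), so fields that mix formats stand out.

```python
import re
from collections import Counter, defaultdict

def shape(value):
    """Reduce a value to a pattern: digits become 9, letters become A."""
    if value is None or str(value).strip() == "":
        return "<empty>"
    text = str(value).strip()
    text = re.sub(r"[0-9]", "9", text)
    text = re.sub(r"[A-Za-z]", "A", text)
    return text

def profile(records):
    """Count the value shapes seen in each field across all records."""
    shapes = defaultdict(Counter)
    for rec in records:
        for field, value in rec.items():
            shapes[field][shape(value)] += 1
    return shapes

# Hypothetical legacy rows, for illustration only.
rows = [
    {"order_date": "2009-03-14", "phone": "555-0100"},
    {"order_date": "14/03/2009", "phone": "555 0101"},
    {"order_date": "2009-04-01", "phone": None},
]

for field, counts in profile(rows).items():
    print(field, dict(counts))
```

In this sample, `order_date` appears in two different formats and `phone` has one empty value, so a conversion utility would need a rule for each case.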

Create a Test Database
The successful delivery of any development project relies on how effectively the user requirements are translated into a working application. Similarly, a successful data migration should convert all the data correctly, and all business processes should operate on the converted data without any errors. In order to achieve this, we should have a correct and realistic idea of the data and range of inconsistencies with which we are dealing. This is best achieved using a snapshot or a scaled-down version of the actual database. As the utility is developed, it can be executed repeatedly on this test



AgileConnection is a TechWell community.
