GxP Lifeline

Ensure Data Integrity With a Proactive Data Management Strategy


Two years after the U.S. Food and Drug Administration (FDA) published its draft guidance on data integrity, the final guidance remains in the “forthcoming” stage. Meanwhile, regulatory guidelines for ensuring the completeness, consistency and accuracy of data still must be upheld. We are all well aware that there are numerous regulations surrounding data integrity, so I’ll forgo the trite references relating to corralling felines. Instead, I will suggest that a good approach to achieving data integrity compliance is to develop a practical data management strategy around data storage and migration.

Perhaps we can agree that going digital with data management has been both beneficial and somewhat challenging. Certainly, the ability to automate data collection, storage and analysis, and to access any data immediately, in real time, has been positive.

On the other hand, the various, continuously changing technologies for storing and migrating data have made maintaining data integrity a bit unwieldy. For example, as data storage infrastructures evolve, data stored for 10 years or longer may not be readable on newer versions of operating systems or applications. Still, data integrity guidelines require that every 1 and 0 of original data remain intact.

Data migration technologies also evolve. Therefore, keeping up your end of the data integrity compliance agreement means you routinely need to move data to a new system while keeping it safe, undisturbed and accessible.

Data Storage

Data integrity guidelines do not prescribe a specific method or technology for data storage. Still, two areas of focus for data storage compliance are integrity and retention.

Data Integrity – Data integrity is established when the data is stored and managed in its original form. The guidelines for maintaining the integrity of stored data include:

  • Data must be the original (or a true copy) and kept secure from modification, corruption or loss.
  • Data must be retained throughout the data lifecycle.
  • Data records must be complete and contain all data history information.
  • Stored data must be accompanied by all metadata, as well as appropriate validation data.
  • Data must be stored in a way that prevents deterioration.
  • Data must be searchable and retrievable for audits or legal review purposes.
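The "original or true copy" requirement above is commonly enforced with cryptographic checksums: a copy counts as a true copy only if it is bit-for-bit identical to the original. As a minimal sketch (the function names here are illustrative, not from any regulation or standard):

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a SHA-256 fingerprint of a file, reading in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_true_copy(original: Path, copy: Path) -> bool:
    """A copy is a 'true copy' only if every 1 and 0 matches the original."""
    return sha256_of(original) == sha256_of(copy)
```

Storing the fingerprint alongside the data (for example, in the metadata) lets you detect modification, corruption or loss at any later point in the data lifecycle.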

Data Retention – Data is retained for both short-term and long-term purposes. Retention dictates how long data must remain in storage, which is determined by the data lifecycle. Short-term retention takes the form of data backups, while long-term retention refers to archived data. The terms data backup and data archive are often used interchangeably; however, each has a different role in the data integrity scheme.

  • Data Backup – Data backup refers to the process of copying data to a secondary site so that data can be restored if it is lost or corrupted, or recovered after a disaster. Backups are performed frequently in order to retain the most recent version of the data. Data backups do not satisfy the regulatory data integrity requirements for storing data and metadata: backup data can easily be overwritten, so backups serve more as temporary storage. Restoring from a backup usually involves entire data sets rather than specific files, because most data backup technology does not include search functionality.
  • Data Archive – Archiving data is the process of moving data that is no longer being added to or modified to a separate storage device. Data archives qualify for data integrity compliance because they are capable of maintaining data in its original form, are indexed and are searchable. The primary goals of a data archive are to preserve the integrity of data and to keep it easily accessible. It is recommended that you regularly test and validate your data archive system and functionality.
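The recommendation to regularly test and validate the archive can be made concrete with a fingerprint manifest: record a checksum for every archived file when it enters the archive, then periodically re-scan and report anything missing or altered. A minimal sketch, assuming a simple directory-based archive (the function names are hypothetical):

```python
import hashlib
from pathlib import Path


def build_manifest(archive_dir: Path) -> dict:
    """Record a SHA-256 fingerprint for every file in the archive."""
    manifest = {}
    for f in sorted(archive_dir.rglob("*")):
        if f.is_file():
            rel = str(f.relative_to(archive_dir))
            manifest[rel] = hashlib.sha256(f.read_bytes()).hexdigest()
    return manifest


def verify_archive(archive_dir: Path, manifest: dict) -> list:
    """Return names of files that are missing, added or changed since
    the manifest was built; an empty list means the archive is intact."""
    current = build_manifest(archive_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

Running a check like this on a schedule gives you documented evidence that archived data has not deteriorated or been modified.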

Data Migration

Data migration is the process of moving stored data and metadata from one storage system to another. Migration becomes necessary when storage technology grows obsolete and can no longer meet regulatory requirements, or when data stored in one location deteriorates over time. Unfortunately, data migration is not a risk-free endeavor. Migration risks include:

Data loss – When data is migrated to a new system, some of the data may not move over from the source system.

Inconsistency – Even when data migration is executed correctly, data in one column of the source database could end up in a different column of the new database.

Data corruption – Some of the data migrated from the source system may not be compatible with the new system software, which could result in errors or incomplete datasets.

Migration not performed in correct order – It is extremely important that data be migrated in the proper order as there are varied dependencies between the different processes. Skipping processes or performing them out of order could cause subsequent processes to fail.

System incompatibility – Some programs in the new system may not be compatible with the programs used to migrate data from the source system. This could lead to errors with the migrated data.

Data Migration Strategy

Risky or not, the need for data migration is unavoidable. Fortunately, many of those risks can be mitigated by creating a data migration strategy that includes testing and verification procedures to maintain the integrity of data during and after migration.

A common challenge with data migration is that companies have a large amount of data in storage. The first task of a migration strategy is to identify which data needs to be migrated. The following are suggestions for processes to include in your data migration strategy:

Data governance structure – When you have data moving from one location to another, it’s important to identify who has rights to access, edit or remove archived data. This information may be included in the metadata.

Understand the quality of existing data – Before you begin migrating data, spend some time assessing the quality of the data in the source system. Is the data complete? Does it comply with data integrity requirements? Will the data be readable on the new system?  

Identify the data that needs to move – Not all archived data needs to be migrated. You can ease the complexity of data migration by clearly identifying and migrating only the data you need to keep archived for data integrity compliance.

Protect data at rest and en route – To maintain data integrity and security, keep your data in read-only format throughout the retention timeframe and during migration.
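On a POSIX-style file system, "read-only throughout the retention timeframe" can be enforced at the permission level by stripping the write bits from archived files. A minimal sketch under that assumption (the function names are illustrative; an actual archive system may enforce this differently):

```python
import os
import stat

# Mask covering the owner, group and other write-permission bits.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def protect_at_rest(path: str) -> None:
    """Strip all write bits so the file cannot be modified in place."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~WRITE_BITS)


def is_read_only(path: str) -> bool:
    """True when no write-permission bit is set on the file."""
    return not (os.stat(path).st_mode & WRITE_BITS)
```

The same `is_read_only` check can be re-run after migration to confirm the protection survived the move.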

Test migrated data – You can alleviate many data migration risks with data migration testing procedures. To validate the completeness, consistency and correctness of the migrated data, perform a validation test, which involves comparing the data of the source system and the new system using predefined comparison criteria.
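The source-versus-target comparison can be sketched as a keyed record diff that reports, for each of the three criteria, exactly what failed. This is a minimal illustration, assuming records are dictionaries with an identifying key (the function name and report fields are hypothetical):

```python
def validate_migration(source_records, target_records, key="id"):
    """Compare source and target record sets and report:
    completeness (nothing lost), consistency (nothing unexpected),
    and correctness (matching records hold identical values)."""
    src = {r[key]: r for r in source_records}
    tgt = {r[key]: r for r in target_records}
    report = {
        "missing_in_target": sorted(set(src) - set(tgt)),
        "unexpected_in_target": sorted(set(tgt) - set(src)),
        "mismatched": sorted(k for k in set(src) & set(tgt) if src[k] != tgt[k]),
    }
    report["passed"] = not (
        report["missing_in_target"]
        or report["unexpected_in_target"]
        or report["mismatched"]
    )
    return report
```

A report like this doubles as the documented evidence of the validation test, which supports the record-keeping point below.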

Verify data if format changes – It’s common for the data format to change during migration. A format change is OK. You just need to verify that the data itself remained the same and is still readable in the new data storage system.
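Verifying that content survived a format change means normalizing both representations and comparing them, rather than comparing the raw files. As a minimal sketch, assuming a hypothetical migration that exported CSV records to JSON:

```python
import csv
import io
import json


def records_from_csv(text: str) -> list:
    """Parse CSV text into a list of row dicts, ordered by id."""
    return sorted(csv.DictReader(io.StringIO(text)), key=lambda r: r["id"])


def records_from_json(text: str) -> list:
    """Parse JSON records, coercing values to strings so they are
    directly comparable with CSV fields (which are always strings)."""
    return sorted(
        [{k: str(v) for k, v in rec.items()} for rec in json.loads(text)],
        key=lambda r: r["id"],
    )


def same_content(csv_text: str, json_text: str) -> bool:
    """True when both formats carry the same records, order aside."""
    return records_from_csv(csv_text) == records_from_json(json_text)
```

The format differs, but the check confirms the data itself remained the same, which is what the guidelines care about.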

When developing your data management strategy, it’s recommended that you devote significant attention to your data migration quality assurance and testing processes. It’s also important to fully document your data migration experience and outcome in the event you need tangible evidence to demonstrate data integrity compliance. Creating a well-structured plan as part of your strategy can be invaluable in your efforts to preserve the integrity of your data.


David Jensen is a content marketing specialist at MasterControl, where he is responsible for researching and writing content for web pages, white papers, brochures, emails, blog posts, presentation materials and social media. He has over 25 years of experience producing instructional, marketing and public relations content for various technology-related industries and audiences. Jensen writes extensively about cybersecurity, data integrity, cloud computing and medical device manufacturing. He has published articles in various industry publications such as Medical Product Outsourcing (MPO) and Bio Utah. Jensen holds a bachelor’s degree in communications from Weber State University and a master’s degree in professional communication from Westminster College.
