Business Benefits of Archiving

Jim Cook
Note: The views expressed in this article are those of the author and do not necessarily represent those of his employer, GxP Lifeline, its editor or MasterControl Inc.


The word “archive” is springing up all over the world of IT and data management. Why now, and what are the implications for an organization working within the highly regulated world of a GxP environment? This article sets out some of the key business drivers around long-term data storage, and offers some lifelines to IT professionals drowning under a deluge of data.

In a recent report, McKinsey projects 40% annual growth in the volume of data generated globally, while global IT spending will grow by only 5% a year. Against that backdrop, McKinsey believes that making use of big data could be worth $300 billion to the U.S. healthcare market, more than double the total annual healthcare spending in Spain.

So we know that data growth is enormous, that IT budgets will be hard pushed to cope, and yet there are huge benefits to those that are able to successfully ride this roller coaster.
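To make the gap concrete, the two McKinsey rates can be compounded side by side. The sketch below is only illustrative arithmetic: it assumes both rates hold steady and compound annually, which the report summary above does not state, and the function names are invented for this example.

```python
DATA_GROWTH = 0.40    # annual growth in data, as cited above
BUDGET_GROWTH = 0.05  # annual growth in IT spending, as cited above


def compound(initial: float, annual_rate: float, years: int) -> float:
    """Value after `years` of compound growth (0.40 means 40% a year)."""
    return initial * (1 + annual_rate) ** years


def data_per_budget_unit(years: int) -> float:
    """How much more data each unit of budget must cover, relative to today."""
    return compound(1.0, DATA_GROWTH, years) / compound(1.0, BUDGET_GROWTH, years)


for years in (1, 5, 10):
    print(
        f"after {years:>2} years: "
        f"data x{compound(1.0, DATA_GROWTH, years):.1f}, "
        f"budget x{compound(1.0, BUDGET_GROWTH, years):.2f}, "
        f"data per budget unit x{data_per_budget_unit(years):.1f}"
    )
```

Under those assumptions, ten years of growth leaves roughly 29 times as much data to be covered by a budget only about 1.6 times larger.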


Turning to the regulated world of GxP compliance, regulations such as the Medicines for Human Use (Clinical Trials) Regulations 2004 (SI 2004/1031) in the UK and a raft of regulations from the FDA and ICH stipulate extensive data retention and archiving requirements. For example, the FDA requires that sponsors and investigators retain “records and reports required by this part for two years after a marketing application is approved for the drug; or if an application is not approved for the drug, until two years after shipment and delivery of the drug for investigational use is discontinued and the FDA so notified.” All take a similar line: archiving has to ensure the integrity, authenticity, confidentiality and accessibility of documents for the required retention period.
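The FDA wording quoted above amounts to a simple date rule, which can be sketched as follows. This is a reading of the quoted sentence only, not legal advice: the function and parameter names are invented for illustration, and the handling of 29 February is an assumption.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 2  # "two years", from the FDA wording quoted above


def add_years(d: date, years: int) -> date:
    """Add whole years; 29 February falls back to 28 February (an assumption)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retain_until(
    approval: Optional[date] = None,
    discontinued_and_notified: Optional[date] = None,
) -> Optional[date]:
    """Date on which the quoted FDA minimum retention period ends.

    `approval` is the date the marketing application was approved.
    `discontinued_and_notified` is the date by which investigational use
    has been discontinued AND the FDA has been notified.
    Returns None when neither event has happened, so records must be kept.
    """
    if approval is not None:
        return add_years(approval, RETENTION_YEARS)
    if discontinued_and_notified is not None:
        return add_years(discontinued_and_notified, RETENTION_YEARS)
    return None


print(retain_until(approval=date(2012, 3, 15)))
```

This is only the FDA minimum. Where several regulations apply, the longest applicable period governs in practice, and other guidance asks for considerably longer.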

For all of these regulations, the word “access” is key, which has big implications given the timescales over which data has to be kept.

Examples of the key guidance from the MHRA’s GCP guidelines include:

  • 10.7.1 Retention Times
    • At least 5 years, but often 15 or 30 years
  • 10.7.2 Responsibility
    • Named individual at sponsor and investigator
  • 10.7.5 Tracking
    • Chain of custody
    • Retrieval and removal

Data Archiving

The most common approach to archiving data in corporate environments is to retain data on the same systems where it was first created, typically enterprise storage servers. The capacity of these servers then grows to match ever-increasing volumes of data and the need to keep ever more of it for compliance or reuse. This approach is expensive, difficult to manage, and can put data at risk, for example through hardware failures or accidental data deletion or modification. Keeping infrequently accessed and static archive data on expensive enterprise storage servers is a luxury in today’s challenging economic climate.

The MHRA guidance also sets out specific requirements for electronic archives:

  • 10.7.9 Electronic archiving requirements
    • More than one copy
    • More than one location
    • Different formats, media, manufacturers
    • Access controlled
    • Authenticity protected
    • Validated and auditable migrations
    • Periodic retrieval/restore to test access
    • Demonstrate no loss or corruption

These requirements mirror best practice in data preservation, which focuses on holding multiple copies of the data, kept in different locations. This approach must be supplemented by using diverse technologies to reduce the risk of multiple simultaneous failures, which effectively means not putting all your eggs in one basket.
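The "multiple copies, multiple locations, diverse technologies" rule can be expressed as a simple check over an inventory of copies. The sketch below is one possible reading: the `ArchiveCopy` record and its fields are hypothetical, and treating "different" as "at least two distinct values" is an assumption rather than something the guidance spells out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArchiveCopy:
    location: str      # site or data centre holding this copy
    media: str         # storage media type, e.g. "disk" or "tape"
    manufacturer: str  # vendor of the media or storage system


def diversity_problems(copies: List[ArchiveCopy]) -> List[str]:
    """Return the gaps found; an empty list means no gap was detected."""
    problems = []
    if len(copies) < 2:
        problems.append("fewer than two copies")
    if len({c.location for c in copies}) < 2:
        problems.append("all copies in one location")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies on one media type")
    if len({c.manufacturer for c in copies}) < 2:
        problems.append("all copies from one manufacturer")
    return problems


inventory = [
    ArchiveCopy(location="Site A", media="disk", manufacturer="Vendor X"),
    ArchiveCopy(location="Site A", media="tape", manufacturer="Vendor Y"),
]
print(diversity_problems(inventory))
```

A check like this covers only the diversity items in the list; access control, authenticity and validated migrations need their own evidence.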

Figure 1. Digital Preservation Best Practice

These multiple copies must be actively managed by migrating to new storage or formats to address obsolescence and by regularly checking and repairing any loss of data integrity. This is why having multiple copies is so important – if there’s a problem with one of the copies then it can be replaced by replicating one of the other good copies.
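A minimal sketch of that check-and-repair cycle is shown below, assuming each archived file has a SHA-256 checksum recorded at ingest. The function names are invented for this example, and a production archive would also log every check and repair so that the process is auditable.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_repair(copies: List[Path], expected_sha256: str) -> List[Path]:
    """Verify every copy and replace damaged or missing ones from a good copy.

    Returns the paths that were repaired. Raises RuntimeError when no
    intact copy remains, or when a repaired copy still fails verification.
    """
    good, bad = [], []
    for path in copies:
        intact = path.is_file() and sha256_of(path) == expected_sha256
        (good if intact else bad).append(path)
    if not good:
        raise RuntimeError("no intact copy left to repair from")
    for path in bad:
        path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(good[0], path)
        if sha256_of(path) != expected_sha256:
            raise RuntimeError(f"repair of {path} failed verification")
    return bad
```

Run periodically against every copy, a routine like this also serves as the regular retrieval test and as evidence that nothing has been lost or corrupted.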

Total Cost of Ownership

Most organisations underestimate the long-term total cost of ownership of archiving, especially where the stringent data retention requirements of GxP compliance need to be met. Long-term archiving requires specialist expertise, active data management, the procurement and migration of systems to address obsolescence, and regular auditing to make sure retention metrics are being met.

Figure 2. 20 years of keeping content alive

With a steady advance towards a fully digital document lifecycle, many companies focus on the initial parts of the data lifecycle rather than the whole-lifecycle costs. For instance, when a business builds a business case for a 21 CFR Part 11 compliant document management system, the focus is usually on putting proper digital signature processes in place to ensure authenticity. A deeper analysis, however, reveals that the main costs typically arise in the long term, where integrity, authenticity, confidentiality and availability need to be delivered over retention periods spanning multiple decades.
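A rough way to see how the long-term costs accumulate is to add up the recurring cost categories over the retention period. The model below is a deliberately simple sketch: it assumes flat annual costs, no data growth or inflation, and a migration at the end of each full interval before retention ends. All names and figures are placeholders, not real prices.

```python
from typing import Dict


def archive_tco(
    years: int,
    annual_storage_cost: float,
    annual_management_cost: float,
    annual_audit_cost: float,
    migration_cost: float,
    migration_interval_years: int,
) -> Dict[str, float]:
    """Total cost of keeping an archive alive for `years`, by category."""
    # Migrations happen every interval, but not at the very end of retention.
    migrations = (years - 1) // migration_interval_years if years > 0 else 0
    costs = {
        "storage": years * annual_storage_cost,
        "management": years * annual_management_cost,
        "audit": years * annual_audit_cost,
        "migration": migrations * migration_cost,
    }
    costs["total"] = sum(costs.values())
    return costs


# Placeholder figures for illustration only.
example = archive_tco(
    years=20,
    annual_storage_cost=10_000,
    annual_management_cost=5_000,
    annual_audit_cost=2_000,
    migration_cost=30_000,
    migration_interval_years=5,
)
for category, cost in example.items():
    print(f"{category:>10}: {cost:>10,.0f}")
```

Even in a toy model like this, the recurring management, audit and migration lines are the ones that a business case focused only on initial implementation tends to leave out.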

It is clear that the complex and potentially costly requirements associated with compliance and long-term data retention can have a very large impact on IT budgets. What’s worse is that, as McKinsey is telling us, we work in times when growth in data volumes is massively outstripping the growth in IT budgets.

Service-Oriented Approaches

What can a CIO do to bring these demands under control? One answer lies in a major trend being adopted across all industry sectors: the move to a service-oriented approach to delivering information services (IS) across the enterprise. A service-oriented approach exposes a set of fundamental services within the IS environment that deliver shared resources that are ubiquitous, scalable, reliable, sustainable, and cost-effective.

The service can be delivered by the enterprise, but with applications such as digital data preservation, outsourcing to a specialist provider can deliver significant benefits.

Regulators recognise the benefits that can be derived from contracting specialist services from external suppliers. For example, GCP guidelines specifically addressing the contracting out of archive facilities state:

  • It's OK to use a commercial vendor for storage;
  • The responsibility lies with the sponsor/investigator;
  • Integrity, confidentiality, quality and retrieval must be subject to satisfactory Service Level Agreements (SLAs);
  • The suitability of facilities must be assessed in advance;
  • It is critical to use a formal contract with an archive company; and
  • The location of the documents and records must be known at all times.


Life science companies are facing massive growth in data volumes, compounded by ever-increasing pressure on IT budgets and resources. GxP compliance adds an extra dimension of complexity, and when document and data retention periods can run to multiple decades, simply keeping data on enterprise storage is untenable.

Arkivum’s A-Stor Pharma provides a service-based data archive and, when coupled with MasterControl’s information lifecycle management (ILM) solutions, offers a cost-effective solution that can reduce the IT budget while simultaneously enhancing an organisation’s GxP compliance.

For further information about A-Stor Pharma, contact Arkivum: Tel +44 1249 405060

Jim Cook is CEO and co-founder of Arkivum, a company operating in the rapidly growing cloud storage sector. Arkivum offers a fresh take on how to retain archive data that tackles the cost, complexity and management challenges of long-term data retention. During a career spanning more than three decades, Jim has been instrumental in helping organisations both large and small to achieve their IT and business ambitions. With a skill set that ranges from deep technical understanding through business strategy to investor relations, Jim can provide unparalleled expertise to high technology businesses and their teams with a common aim to create value for themselves and their organisation.