Storing And Sharing Analytical Data For Regulatory Compliance And Maximum Commercial Benefit

Kevin Smith, Director of Electronic Record Management at Thermo LabSystems, describes the issues and introduces a new solution called eRecordManager™.
In February 2001, Thermo LabSystems announced its acquisition of Galactic Industries Corp., the spectroscopy software specialist. This led to the June 2001 launch of a new product for electronic record-keeping and knowledge management.
Why?
As laboratory throughput increases, the volume of data generated is growing exponentially, putting real pressure on science-based organizations to manage analytical data more effectively. The major challenge for lab managers is the long-term, secure storage of the raw data, method details and results that accumulate. Providing the means to easily search, explore and retrieve any piece of data for inspection, visualization and manipulation becomes the next hurdle, and this is where the real commercial benefits and competitive advantages lie. Such challenges can be almost insurmountable in laboratories where there are various types of instruments, data systems and file formats from many different manufacturers. eRecordManager is designed to relieve this burden.
As instruments get faster and the choice of instrument wider, managing the volumes of data produced can become a real headache for system managers. Much of the data must be stored for a lengthy, often indefinite, period to comply with FDA regulations regarding electronic records (21 CFR Part 11). Not only is there a need for effective data archiving; there is also a requirement for electronic record-keeping and knowledge management.
Much of the 'Knowledge' in a science-based organization is in the spectra and chromatograms it acquires from its laboratory instruments. Being able to share, compare, search and mine these instrument files is seen as vital in improving R&D productivity.
A key feature for the user will be the ability to read and understand multiple different data formats. This is a significant benefit for two reasons:
- in the future the original data system used to acquire the data may no longer be available, meaning it would not be possible to show the chromatogram peaks etc. to regulators
and
- without it, sharing the data amongst all scientists in the organization would require every application to be installed on every PC.
Much of an organization's analytical data may be required as evidence in patent infringement or intellectual property protection cases. Two key requirements follow: long-term secure storage of spectral and chromatographic data from multiple types of instrumentation (such as GC, LC, MS, FT-IR, NMR, UV-Vis, Raman, NIR) and in multiple data formats; and elimination of the reliance on the original instrument software, operating system and hardware to search, restore, view and manipulate the data.

Figure 1 - Typical workflow for archiving analytical data. Note that the metadata can be queried, but the analytical data generally cannot be accessed except from the original instrument workstation.
What is the target market for data management?
Quite literally, all science-based organizations form the market for electronic record-keeping and knowledge management solutions.
Highly regulated industries, such as pharmaceutical, biotech, environmental and food and beverage, will be particularly attracted by electronic records compliance features.
Regarding market size, estimates range from US$1 billion to as much as US$10 billion, depending upon which vendor you speak to. Based upon informal investigation with our own customer base, Thermo LabSystems believes the value of the market is closer to the former figure than the latter.
Instrument Life:
Industry observers are estimating that the five to seven year instrument product life cycle is set to reduce to two to three years. This means that over the 10 to 15 years of a drug development process, a company might go through three or more instrument models, each with its own data acquisition and control system, and often with its own file format.
Data that was created ten years ago might still be needed in ten or fifteen years' time to resolve questions of product liability or regulatory compliance. Imagine the headache of archiving so many data file formats and keeping them easily accessible to search and restore. It will be music to the ears of lab managers that they are no longer required to retain legacy computer and software systems just to access an archive of proprietary, binary data file formats.
This is all possible because of a unique library of over 150 powerful file converters in the new Thermo LabSystems product, eRecordManager, that automatically generate XML versions of the original data. Archived information can be viewed and reworked on virtually any platform long into the future, effectively future-proofing a customer's data.
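To give a feel for what such a conversion involves, the sketch below writes the XY points of a trace to a self-describing XML document using Python's standard library. The element and attribute names (`trace`, `point`, `technique`, `xUnits`, `yUnits`) and the function `trace_to_xml` are hypothetical, chosen purely for illustration; they are not the eRecordManager schema or its converter interface.

```python
import xml.etree.ElementTree as ET


def trace_to_xml(points, technique, x_units, y_units):
    """Serialize (x, y) pairs as a small, self-describing XML document.

    Tag and attribute names are illustrative assumptions, not a real schema.
    """
    root = ET.Element("trace", {
        "technique": technique,
        "xUnits": x_units,
        "yUnits": y_units,
    })
    for x, y in points:
        # repr() keeps the full precision of each float in its text form
        ET.SubElement(root, "point", {"x": repr(x), "y": repr(y)})
    return ET.tostring(root, encoding="unicode")


# Two made-up points from an LC trace (retention time in min, signal in mAU)
xml_text = trace_to_xml([(0.0, 1.5), (0.1, 2.25)], "LC", "min", "mAU")
print(xml_text)
```

Because the units and technique travel with the numbers as plain text, a reader with no knowledge of the originating software could still interpret the file.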
So, what's the regulator's view on file conversion?
In its comments on 21 CFR Part 11, the FDA recognized the financial burden that could be placed on companies if it forced them to maintain obsolete instrument and computer equipment in order to maintain records that are 'true' (see panel below). The conversion of data files into a standard, normalized format appears to have regulatory support.
What do you mean by the term normalized in this context?
By normalized we mean automated conversion to a platform-neutral format that is based upon XML. XML is acknowledged as the industry standard format for data storage and exchange and, as such, enjoys acceptance around the world and will have longevity. For example, it is anticipated that XML will last much longer than the Windows Metafiles that are used by vendors of other eRecordManager-type systems.
As part of 21 CFR Part 11, the US FDA indicates that it recognizes the long-term need for a normalized data format, provided it is an "accurate and complete" copy of the original raw data. As you would expect, eRecordManager also archives the raw data in its original format; customers can retrieve and use this data provided the original data system is still operable.
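One way to picture "accurate and complete" for the numeric content of a trace is a round-trip check: read the normalized file back and confirm that every point matches the original values exactly. The sketch below assumes a hypothetical `trace`/`point` XML layout and hypothetical helper names; it illustrates the idea only and does not describe how eRecordManager, or the FDA, verifies a copy.

```python
import xml.etree.ElementTree as ET


def to_xml(points):
    """Write (x, y) pairs using a hypothetical <trace>/<point> layout."""
    root = ET.Element("trace")
    for x, y in points:
        ET.SubElement(root, "point", {"x": repr(x), "y": repr(y)})
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text):
    """Read the (x, y) pairs back out of the XML text."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.findall("point")]


def is_faithful_copy(original_points, xml_text):
    """True only if the XML holds the same points, in the same order."""
    return from_xml(xml_text) == list(original_points)


# Made-up raw values standing in for detector output
raw = [(0.0, 0.001), (0.5, 12.75), (1.0, 0.3333333333333333)]
normalized = to_xml(raw)
print(is_faithful_copy(raw, normalized))
```

A real record contains far more than XY pairs (method details, results, audit information), so a check like this would cover only one part of the record.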

File Conversion Formats:
Why is Thermo LabSystems adopting XML as its converted file format? XML (eXtensible Markup Language) has become the industry standard format for data storage and exchange, mainly because it allows the accurate representation of any data structure. Such acceptance ensures XML will predominate for many years to come, regardless of the evolution of operating systems and computer hardware. XML files are ASCII text-based and therefore retain the 'knowledge' in the data. Many large organizations, each with much more at stake than Thermo LabSystems, have made significant commitments to XML.
There are many aspects in favor of XML, including:
- It is publicly available and managed by the not-for-profit World Wide Web Consortium (W3C)
- Being both self-describing and open ASCII text, XML-based files are essentially 'future proof'
- Public-domain schemas for specific data types guide its use, and documents can be externally validated against them
To satisfy the long-term record-keeping requirements, eRecordManager archives both the original raw data files from the instrument software and the normalized representation in XML. Users with access to the eRecordManager archive can view the normalized version of the data from any computer. In addition, the XML or the original data files can be retrieved for use with other software applications, though the latter relies on the original software and hardware being available.
A further problem with relying on mechanisms such as Windows Metafiles is that the same fonts and symbol sets must be available on both the original workstation where the image was created and the workstation where it is viewed. Unlike systems that rely on uniformity in computing environments, the application-independent data archived by eRecordManager is complete and self-describing, ensuring that the Electronic Record will not 'change' when it is viewed on different workstations. This is clearly a prerequisite for any system claiming to be compliant with 21 CFR Part 11, or to be relied upon during patent litigation.
Knowledge Management
As its name suggests, eRecordManager is a solution for the management of electronic records. The 'eRecord' aspect refers to 'Electronic Records,' as defined by the FDA in its 21 CFR Part 11 ruling that deals with the requirement for secure archiving and the ability to retrieve records in the future. The 'Manager' aspect refers to Knowledge Management and the ability to share all this information across the organization.
Spectra and chromatograms are the fundamental units of data upon which calculated results and subsequent conclusions are based. Putting this data into one place in a common format such as XML provides the ability to data-mine, compare and visualize instrument data that is so vital in improving R&D productivity. As global drug discovery and development projects become the norm, requiring interaction from specialists around the world, this is particularly important.
The ability to easily access data from throughout the organization also aids the development of new ways of analyzing samples and predictive models that are impossible when the data is scattered across the company in individual instrument workstations. Access to past research avoids redundancies such as the repeating of work on identical compounds. eRecordManager helps organizations to improve efficiency by providing a structured central repository of knowledge.

Figure 2 - There are trade-offs between the level of information content and transportability of data when using different storage representations for analytical instrument measurements.
Real Access to the Real Data
Due to the archival of XML-based files, Thermo LabSystems claims eRecordManager is unique in terms of its ability to free so many types of data files from the software applications that created them and to make them available for viewing and manipulation.
Using any workstation, the user is able to view the real data as acquired by the instrument, including 2D and 3D representations of complex data structures such as LCMS and FT-IR. eRecordManager allows the user to view right down to the individual XY data points of a trace as they came off the detector. It is possible to expand the signal to see precisely, for example, where the original chromatography data system positioned the baseline and peak characteristics. Such detail is imperceptible from a picture saved as a PDF or Windows Metafile, where expansion would merely 'thicken the curve'.
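The difference between stored data points and a stored picture can be sketched in a few lines. The example below reads a small, hand-written trace in a hypothetical `trace`/`point` XML layout (not the eRecordManager format) and "expands" a region by selecting the actual points inside a retention-time window; the function name `points_in_window` and all values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written example in a hypothetical <trace>/<point> layout
XML_TRACE = """
<trace technique="LC" xUnits="min" yUnits="mAU">
  <point x="1.00" y="0.2"/>
  <point x="1.05" y="0.9"/>
  <point x="1.10" y="4.8"/>
  <point x="1.15" y="1.1"/>
  <point x="1.20" y="0.3"/>
</trace>
"""


def points_in_window(xml_text, x_min, x_max):
    """Return the stored (x, y) points with x_min <= x <= x_max."""
    root = ET.fromstring(xml_text)
    selected = []
    for p in root.findall("point"):
        x, y = float(p.get("x")), float(p.get("y"))
        if x_min <= x <= x_max:
            selected.append((x, y))
    return selected


# "Zoom" into the region around the peak and find its highest point
window = points_in_window(XML_TRACE, 1.05, 1.15)
apex = max(window, key=lambda pt: pt[1])
print(window)
print(apex)
```

With an image file, the same zoom could only enlarge pixels; here every expansion returns the underlying measurements, which can then be re-integrated or compared with other traces.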
Thermo LabSystems contends that conventional data archiving systems only allow the restoration of data into the original data system application, or as a 'picture' of a report that, in fact, represents only a small subset of an electronic record, as defined by the FDA.
Organizations that rely on the archiving of a graphical representation of a final report would be advised to consider its limits. The content of their archived file is restricted to the data that the scientist included in the report. What if a colleague or regulatory inspector wishes to view other information not incorporated in the report? Furthermore, this type of picture file offers extremely limited ability to view, rework or manipulate the real data.
What evidence is there that instrument manufacturers will be willing to co-operate?
Early reaction at Pittcon in March 2001 confirmed that an atmosphere of co-operation exists, and many unsolicited calls have been received from major instrument vendors looking to work with Thermo LabSystems in this area.
In Thermo LabSystems' own market, Atlas controls instruments such as those from Agilent and Waters, and many of these developments have been led by the instrument vendor. Across Thermo, there is a great deal of evidence of co-operation. For example, Thermo Finnigan's mass specs have integrated HPLCs such as those from Waters and Agilent.
On the subject of integration, ultimately the customer will decide.
In summary
Thermo LabSystems' acquisition of Galactic Industries has resulted in the launch of a powerful single integrated application that looks set to position Thermo LabSystems as a formidable player in the electronic record management software sector. Compliance of the eRecordManager software itself is ensured via a series of security features including comprehensive audit trailing, electronic signature functionality and data security implemented through operator/role/group configuration.
Not only are all the issues surrounding 21 CFR Part 11 addressed, but the product also enables data-mining, viewing and comparison of data without any reliance on the original instrument data system, allowing customers to benefit from genuine knowledge management.
About the Author:
Having joined Thermo LabSystems in 1984, Kevin Smith is the Company's Director of Electronic Record Management with the specific role of managing business development for Thermo LabSystems' products and services in the areas of electronic record-keeping and knowledge management. Prior to being appointed to this position, Kevin held the role of R&D Director at Thermo LabSystems. He has a BSc in Physics and Computing from the University of Kent at Canterbury, UK. Previously, Kevin has worked with Ilford Limited and British Petroleum, managing scientific IT systems and developing bespoke packages. He is a frequent speaker at conferences concerning IT directions and scientific computing.
For more information please call Thermo LabSystems at +44-800-0185227 or e-mail info@thermolabsystems.com.
© 1995-2001 Thermo LabSystems. All rights reserved.