Providing Clean Datasets: The Benefits and Challenges of SDTM

There’s no doubt that the data standards body Clinical Data Interchange Standards Consortium (CDISC) has benefited both industry and regulatory authorities by standardizing the management of clinical trial data. But it’s fair to say that significant complexities are involved in converting data to the CDISC standards.

By Han Zou, Manager, Biometrics and CDISC, DXC

Let’s start at the beginning. To remove inconsistency in the way case report forms were designed, CDISC developed the Clinical Data Acquisition Standards Harmonization (CDASH) standard for use by sponsors and study sites. If CDASH were used across the board, then the next step, mapping the collected data to the Study Data Tabulation Model (SDTM), would be fairly seamless. In reality, however, that’s rarely the case, and life sciences companies are therefore left struggling to convert their data to SDTM.

Not surprisingly, companies have faced many problems and found that their SDTM datasets are full of holes and errors. To manage this, some companies have developed their own set of macros to transfer the raw case report form (CRF) data into the SDTM, but since each study is designed differently, the companies have to adjust these macros every time they receive new raw data.
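To make the macro problem concrete, here is a minimal sketch in Python of a per-study mapping from raw CRF columns to SDTM variables. It is an illustration only, not any company's actual macro: the raw column names, the study identifier and the helper `map_raw_to_dm` are all hypothetical, and only a few Demographics (DM) variables are shown.

```python
# Hypothetical mapping for one study: raw CRF column -> SDTM DM variable.
# The raw names are invented; a differently designed study needs its own map.
STUDY_A_MAP = {"subj_id": "SUBJID", "gender": "SEX", "site": "SITEID"}


def map_raw_to_dm(raw_record, column_map, study_id):
    """Return an SDTM-style DM record plus the raw columns left unmapped."""
    sdtm = {"STUDYID": study_id, "DOMAIN": "DM"}
    unmapped = []
    for raw_name, value in raw_record.items():
        target = column_map.get(raw_name)
        if target is None:
            # No rule for this column: someone has to extend the mapping.
            unmapped.append(raw_name)
        else:
            sdtm[target] = value
    return sdtm, unmapped


raw = {"subj_id": "1001", "gender": "F", "site": "01", "visit_dt": "2016-03-01"}
dm_record, leftovers = map_raw_to_dm(raw, STUDY_A_MAP, "STUDY-A")
print(dm_record)
print(leftovers)
```

Every column that lands in `leftovers` is a place where the mapping has to be adjusted by hand, which is exactly the rework that recurs with each new study design.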

It’s not simply about converting the raw data to SDTM. Companies also need to run a report on the data and do an analysis. Pinnacle 21, the commercial arm of OpenCDISC, developed a software solution that will tell companies what’s in the datasets: how many variables there are, how many variables there should be, what’s missing, what shouldn’t be there, what’s incorrect and so on.
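The kind of variable-level check described above can be sketched in a few lines of Python. This is not how Pinnacle 21 works internally; it is a simplified illustration, and the expected-variable list is an assumed subset of the DM domain rather than the full SDTM specification. The function name `check_variables` and the stray column `GENDER_RAW` are hypothetical.

```python
# Assumed, illustrative subset of variables expected in a DM dataset.
EXPECTED_DM = ["STUDYID", "DOMAIN", "USUBJID", "SUBJID", "SITEID", "SEX", "COUNTRY"]


def check_variables(dataset_columns, expected):
    """Compare the variables found in a dataset with the expected ones."""
    present = set(dataset_columns)
    wanted = set(expected)
    return {
        "n_found": len(present),        # how many variables there are
        "n_expected": len(wanted),      # how many there should be
        "missing": sorted(wanted - present),     # what's missing
        "unexpected": sorted(present - wanted),  # what shouldn't be there
    }


report = check_variables(
    ["STUDYID", "DOMAIN", "USUBJID", "SEX", "GENDER_RAW"], EXPECTED_DM
)
print(report)
```

A real validation run goes much further, checking values, controlled terminology and cross-dataset consistency, but the shape of the output is similar: a list of discrepancies that someone with CDISC knowledge has to interpret and correct.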

While this is a useful tool, there are two key considerations for companies: 1) a license to use Pinnacle 21 needs to be bought for each study, and 2) running Pinnacle 21 requires broader CDISC knowledge, such as the processes for getting the data into the various formats, how to analyze the Pinnacle 21 report and how to put corrective actions in place.

FDA takes a stricter line

Until recently, the U.S. Food and Drug Administration (FDA) gave companies some leeway with their legacy data, recognizing the challenges they face in converting their raw data. In the future, however, the agency is likely to be far stricter. Since the end of 2016, study data standards have been enforced for all new drug applications (NDAs), biologics license applications (BLAs) and abbreviated new drug applications (ANDAs), and starting at the end of 2017 the same will be required for commercial investigational new drug applications (INDs).

That’s going to create some serious headaches for companies. For example, in 2016, we worked with a company that had sought help from another vendor to create its SDTM datasets for a submission. The company wanted a second set of eyes to make sure its data was in order, but when we analyzed its datasets, we were shocked to find more than two pages’ worth of errors in the Pinnacle 21 report, less than two weeks before submission. Fortunately, because the submission included a lot of legacy data, the company was able to submit the datasets as they were, but that’s unlikely to happen in the future.

Furthermore, the FDA will require companies to submit Trial Summary (TS) datasets, which provide a high-level overview of a study in a structured format, for all submissions starting at the end of 2017. Although the agency has provided guidelines for constructing the TS, it’s far from straightforward for those without expertise in CDISC.

SDTM is just one aspect of CDISC. In addition, companies must be able to implement and understand the Analysis Data Model (ADaM), which specifies principles for analysis datasets and standards, as well as the TLF (tables, listings and figures).

Addressing all of these CDISC dataset standards requires deep domain expertise in biometrics, including extensive hands-on experience in working with the FDA. Ensuring high-quality datasets is therefore extremely difficult for most companies to manage in-house. Equally important, companies need to know that their vendor partner has the necessary depth of knowledge across all these standards (SDTM, ADaM and TLF) to meet FDA requirements and ensure a successful submission.

Learn more about CDISC and how to ensure quality datasets by downloading DXC’s white paper, Harnessing the Power of Clinical Data.

