Rabidwolff's Alehouse
"Your place for fun and knowledge."

Standards for data collection used in drug research and development. They were created July 21, 2004 for standard submissions to the FDA (Food and Drug Administration) or other regulatory agencies, and their use is expected to be required at some point in the future. Standards allow for increased efficiency and speed through uniform data collection, data structure, error checking, statistical analysis, and data review.

Current SDTM Version: v3.1.2

  • CDISC = Clinical Data Interchange Standards Consortium
  • SDTM = Study Data Tabulation Model
  • ODM = Operational Data Model
  • ADaM = Analysis Data Model
  • CRT = Case Report Tabulation
  • CDMS = Clinical Data Management System
  • CDISC = Organization that defines the standards.
  • SDTM = General framework to describe the organization of information to be submitted to the FDA.
  • ODM = XML-based format for exchanging clinical study data and metadata.
  • ADaM = Describes general structure, metadata, and content found in Analysis Datasets that are also sent to a regulatory agency.
    Three types:
    • AD (Analysis Dataset Metadata) = Statistical summaries of efficacy and safety. Examples would be change from baseline, last observation, etc.
    • Analysis Variable Metadata
    • Analysis Results Metadata
  • CDMS = Application used to store CRF (Case Report Form) information entered by a user in a relational database. Examples are Medidata Rave and Oracle Clinical.
  • Define.xml = metadata describing the data exchange structures (domains)

SAS Dataset Makeup

Overview of Data Organization
  1. Organized by domains (one domain generally corresponds to one SAS dataset or table, with some exceptions where data on the same topic is spread across domains)
  2. Each domain is made up of individual records or observations containing variables.
  3. Variable info is described in a metadata definition file called ‘Define.xml’ which is also submitted to the FDA.
  4. SDTM guidelines specify variables per domain as below:
    1. REQUIRED – Must be present with no missing values
    2. EXPECTED – Must be present but can have missing values
    3. PERMISSIBLE – Not required, should not be included if no data actually present
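The three categories above can be turned into a simple check. The following is a minimal Python sketch, not an official validator: the `CORE` table and the `check_core` function are hypothetical names, and the core designations shown for the four DM variables are illustrative only.

```python
# Hypothetical core designations for a few DM variables (illustrative only;
# the SDTM implementation guide is the authority for the real assignments).
CORE = {
    "STUDYID": "REQUIRED",
    "USUBJID": "REQUIRED",
    "AGE": "EXPECTED",
    "DMDTC": "PERMISSIBLE",
}


def check_core(records, core=CORE):
    """Return (level, message) findings for one domain's list of record dicts."""
    findings = []
    present = set()
    for rec in records:
        present.update(rec)  # collect every variable name that appears

    for var, level in core.items():
        # A value counts as missing when it is None or an empty string.
        missing = [rec.get(var) in (None, "") for rec in records]
        if level == "REQUIRED":
            if var not in present:
                findings.append(("ERROR", f"{var}: required variable absent"))
            elif any(missing):
                findings.append(("ERROR", f"{var}: required variable has missing values"))
        elif level == "EXPECTED":
            # Missing values are allowed, but the variable itself must exist.
            if var not in present:
                findings.append(("WARNING", f"{var}: expected variable absent"))
        elif level == "PERMISSIBLE":
            # Should be left out entirely when no data was collected.
            if var in present and all(missing):
                findings.append(("NOTE", f"{var}: permissible variable present with no data"))
    return findings
```

Calling `check_core` on a list of dictionaries (one per record) returns an empty list when nothing in the table is violated.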
Define.xml File Contents
  1. Variable Names per Domain (8 characters or fewer, following SAS and CDISC SDTM naming conventions)
  2. Variable Label describing purpose (40 characters or less, unique per dataset)
  3. Data Type (Character or Numeric)
  4. Controlled Terms or Format per variable
  5. Origin or source per variable
  6. Role per variable
  7. Comments or other relevant info per variable or its data
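The name, label, and data type limits listed above are easy to check mechanically. This is a rough Python sketch under stated assumptions: `check_variable_metadata` is a hypothetical helper, the input is a plain list of dictionaries rather than a parsed Define.xml file, and the `Char`/`Num` spellings are placeholders for Character and Numeric.

```python
import re

# SAS-style name of at most 8 characters: a letter or underscore first,
# then letters, digits, or underscores.
SAS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")
DATA_TYPES = ("Char", "Num")  # placeholder spellings for Character / Numeric


def check_variable_metadata(variables):
    """Check dicts with 'name', 'label', 'type' keys; return problem strings."""
    problems = []
    seen_labels = set()
    for var in variables:
        name, label, dtype = var["name"], var["label"], var["type"]
        if not SAS_NAME.fullmatch(name):
            problems.append(f"{name}: name is not a valid SAS name of 8 characters or fewer")
        if len(label) > 40:
            problems.append(f"{name}: label is longer than 40 characters")
        if label in seen_labels:
            problems.append(f"{name}: label duplicates another variable's label")
        seen_labels.add(label)
        if dtype not in DATA_TYPES:
            problems.append(f"{name}: data type must be one of {DATA_TYPES}")
    return problems
```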
Organizing Data by Domain
  • General Classes of Domains:
    1. Events
    2. Findings
    3. Interventions
    4. Other
      • Trial Design
      • Special Purpose
      • Special Purpose Relationships
  • Example Class Domains:
    1. Events
      • AE (Adverse Events)
      • DS (Disposition)
      • MH (Medical History)
      • DV (Protocol Deviations)
      • CE (Clinical Events)
    2. Findings
      • DA (Drug Accountability)
      • EG (ECG Tests)
      • IE (Inclusion/Exclusion Exceptions)
      • LB (Laboratory Tests)
      • MB (Microbiology Specimens)
      • QS (Questionnaires)
      • MS (Microbiology Susceptibility)
      • PE (Physical Examinations)
      • PC (Pharmacokinetic Concentrations)
      • SC (Subject Characteristics)
      • PP (Pharmacokinetic Parameters)
      • VS (Vital Signs)
      • FA (Findings About Events or Interventions)
    3. Interventions
      • EX (Exposure)
      • CM (Concomitant Medications)
      • SU (Substance Use)
    4. Other
      • Special Purpose:
        • DM (Demographics)
        • CO (Comments)
        • SE (Subject Elements)
        • SV (Subject Visits)
      • Special Purpose Relationships:
        • SUPPQUAL (Supplemental Qualifiers) – Non-standard variables that do not fit in a parent domain; also holds the overflow for values longer than 200 characters (the first 200 stay in the regular domain variable).
        • RELREC (Relate Records)
      • Trial Design:
        • TE (Trial Elements)
        • TA (Trial Arms)
        • TV (Trial Visits)
        • TI (Trial Inclusion/Exclusion Criteria)
        • TS (Trial Summary)
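For quick lookups, the class lists above can be stored as a small table. This is only a convenience sketch in Python; `DOMAIN_CLASSES`, `CLASS_OF`, and `domain_class` are made-up names, and the membership is copied from the lists on this page rather than from any CDISC file.

```python
# Class membership copied from the lists above.
DOMAIN_CLASSES = {
    "Events": ["AE", "DS", "MH", "DV", "CE"],
    "Findings": ["DA", "EG", "IE", "LB", "MB", "QS", "MS",
                 "PE", "PC", "SC", "PP", "VS", "FA"],
    "Interventions": ["EX", "CM", "SU"],
    "Special Purpose": ["DM", "CO", "SE", "SV"],
    "Special Purpose Relationships": ["SUPPQUAL", "RELREC"],
    "Trial Design": ["TE", "TA", "TV", "TI", "TS"],
}

# Invert the table into a domain -> class lookup.
CLASS_OF = {
    domain: cls
    for cls, domains in DOMAIN_CLASSES.items()
    for domain in domains
}


def domain_class(domain):
    """Return the class for a domain code, or None if it is not listed above."""
    return CLASS_OF.get(domain.upper())
```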
Variable Makeup
  1. Topic Variables
    1. Specify the focus of the record
      • --TEST
      • --TESTCD
      • --TRT
      • --TERM
  2. Identifier Variables
    1. Identify the study, subject, domain, record sequence number, etc. Examples:
      • STUDYID (Unique identifier for the study, such as the protocol name)
      • DOMAIN (Name of the dataset)
      • USUBJID (Unique identifier for the subject: STUDYID-[Site ID]-[Subject ID])
      • --SEQ
  3. Timing Variables
    1. Identify dates/times or other time points per record. Examples:
      • VISIT
      • VISITNUM
      • --TPT
      • --TPTNUM
      • VISITDY
  4. Qualifier Variables
    1. Grouping Qualifiers
      Names used to tie a group of topic variables together within a domain
      • --CAT (Category)
      • --SCAT (Subcategory)
      • --GRPID
      • --SPEC (Specimen)
      • --LOT (Drug Lot)
      • --NAM (Laboratory Name)
    2. Result Qualifiers
      • Describe the specific results per finding, answering the question raised by the topic variable
      • Include raw user-entered data and standardized results
      • Examples:
        1. --ORRES
        2. --STRESC
        3. --STRESN
    3. Synonym Qualifiers
      Alternative names for a variable in a record
      • --MODIFY (Modifier for --TRT)
      • --DECOD (Decoded Term for --TERM)
      • --LOINC (Equivalent term for --TEST and --TESTCD)
    4. Record Qualifiers
      Describes whole record
      • --REASND
      • AESLIFE (AE domain only)
      • --BLFL
      • --POS
      • --LOC
    5. Variable Qualifiers
      Modify or describe specific variable in record
      • --ORRES (Result)
        Qualifier Examples Below:
        • --ORRESU (Result Unit)
        • --ORNRHI (Reference Range Upper Limit)
        • --ORNRLO (Reference Range Lower Limit)
      • --DOSE
        Qualifier Examples Below:
        • --DOSU (Dose Unit)
        • --DOSFRM
        • --DOSFRQ (Dose Frequency)
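In the names above, the leading `--` stands for the two-character domain code, so `--TESTCD` becomes `LBTESTCD` in the LB domain. A short Python sketch of that substitution follows; `expand_variable` is a hypothetical helper, not part of any CDISC tool.

```python
def expand_variable(template, domain):
    """Replace a leading '--' with the two-character domain code."""
    domain = domain.upper()
    if len(domain) != 2:
        raise ValueError("domain code must be two characters")
    if template.startswith("--"):
        return domain + template[2:]
    # Names such as STUDYID, USUBJID, and VISIT are the same in every domain.
    return template
```

For example, `expand_variable("--SEQ", "AE")` gives `AESEQ`, while `expand_variable("USUBJID", "VS")` is returned unchanged.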

Mapping CDMS to CDISC

Common Mapping Techniques:
  • DIRECT – CDMS variable is copied directly into a CDISC domain variable, possibly changing only the label to the CDISC standard
  • RENAME – Only the CDMS variable name (RENAME=) and label are changed
  • STANDARDIZE – CDMS values are converted into separate standard value and unit variables (such as weight in lbs. to kg)
  • REFORMAT – CDMS values do not change but attributes are modified (length, format, data type, etc.). An example would be converting dates/times to ISO 8601 formats.
  • COMBINING – 2 or more CDMS values are combined. An example would be CDISC’s USUBJID (unique identifier for the subject: STUDYID-[Site ID]-[Subject ID]).
  • SPLITTING – CDMS variable is split into 2 or more SDTM variables
  • DERIVATION – SDTM variable is created through computation, algorithm, decoding, or specific logic rules
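Three of the techniques above (COMBINING, STANDARDIZE, REFORMAT) are sketched below in Python rather than SAS, purely as an illustration. The function names, the one-decimal rounding, and the assumed input date layout (`MM/DD/YYYY HH:MM`) are all assumptions, not rules taken from the SDTM guide.

```python
from datetime import datetime

LB_PER_KG = 2.20462262185  # pounds in one kilogram


def combine_usubjid(studyid, siteid, subjid):
    """COMBINING: build USUBJID as STUDYID-[Site ID]-[Subject ID]."""
    return f"{studyid}-{siteid}-{subjid}"


def standardize_weight(value, unit):
    """STANDARDIZE: return (value in kg, 'kg') for a weight given in lbs or kg."""
    unit = unit.strip().lower().rstrip(".")
    if unit in ("lb", "lbs"):
        return round(value / LB_PER_KG, 1), "kg"
    if unit == "kg":
        return round(value, 1), "kg"
    raise ValueError(f"unsupported unit: {unit}")


def reformat_datetime(text, fmt="%m/%d/%Y %H:%M"):
    """REFORMAT: parse a collected date/time string and emit ISO 8601."""
    return datetime.strptime(text, fmt).strftime("%Y-%m-%dT%H:%M")
```

The original value and unit would normally be kept alongside the standardized pair (for example in --ORRES and --ORRESU), so the conversion stays traceable.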
Common CDISC Mapping Notes:
  • Character variables defined as numeric
  • Numeric variables defined as character
  • Variables collected without an obvious corresponding domain in the CDISC SDTM mapping, so must go into SUPPQUAL
  • Several CDMS tables that map to one corresponding domain in CDISC SDTM
  • Dictionary codes not in SDTM parent module, so if needed must be collected in SUPPQUAL
  • Different structure of Lab Normal data compared to CDISC standards
  • Adding lab normal standardization, creating metadata needed
  • Vertical versus horizontal structure (e.g. Vitals)
  • Additional Metadata needed to describe the source in SUPPQUAL
  • Dates – combining date and times; partial dates; ISO8601 conversions
  • Data splitting issues e.g. Adverse Events and Concomitant Medications
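Two of the notes above, the vertical versus horizontal structure and partial dates, can be illustrated with a small Python sketch. It assumes a horizontal CDMS row held as a dictionary whose hypothetical column names happen to match the test codes (SYSBP, DIABP, PULSE); `to_vertical` and `partial_iso_date` are made-up helper names.

```python
def to_vertical(row, tests):
    """Turn one horizontal vitals row into one vertical record per test.

    `tests` maps a column name (used here as VSTESTCD) to its unit.
    """
    records = []
    for column, unit in tests.items():
        value = row.get(column)
        if value in (None, ""):
            continue  # nothing collected, so no record is created
        records.append({
            "USUBJID": row["USUBJID"],
            "VSSEQ": len(records) + 1,  # sequence number within the subject
            "VSTESTCD": column,
            "VSORRES": str(value),
            "VSORRESU": unit,
        })
    return records


def partial_iso_date(year, month=None, day=None):
    """Build an ISO 8601 date, truncating at the first missing component."""
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
        if day is not None:
            parts.append(f"{day:02d}")
    return "-".join(parts)
```

A date collected with only a year and month therefore becomes `2004-07` rather than being padded with an invented day.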


Base SAS contains a procedure called PROC CDISC that performs standard data validation (SDTM v3.1) and converts SAS datasets into CDISC ODM compliant XML format. It also supports importing.

Example Use:
  1. Check SAS dataset for compliance to SDTM v3.1 Format:
    • Verifies that all required variables are present in the data set
    • Reports as an error any variables in the data set that are not defined in the domain
    • Reports a warning for any expected domain variables that are not in the data set
    • Notes any permitted domain variables that are not in the data set
    • Verifies that all domain variables are of the expected data type and proper length
    • Detects any domain variables that are assigned a controlled terminology specification by the domain and do not have a format assigned to them
    • Verifies that all required variable fields do not contain missing values
    • Detects occurrences of expected variable fields that contain missing values
    • Detects the conformance of all ISO-8601 specification assigned values; including date, time, date time, duration, and interval types
    • Notes correctness of yes/no and yes/no/null responses

    • Example Code:

  2. Export SAS Dataset to XML:
  3. Import SAS Datasets from XML:
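A few of the compliance checks listed under item 1 can be approximated in plain Python. This is only a sketch and is not PROC CDISC itself: `validate` and its parameters are hypothetical, the ISO 8601 pattern is simplified (dates and date/times only, partial values allowed, no durations or intervals, no range checks), and the severity levels are borrowed loosely from the list above.

```python
import re

# Simplified ISO 8601 date / date-time pattern that accepts right-truncated values.
ISO_DATE = re.compile(r"\d{4}(-\d{2}(-\d{2}(T\d{2}(:\d{2}(:\d{2})?)?)?)?)?")


def validate(records, defined, date_vars=(), yn_vars=()):
    """Return (level, message) findings for a list of record dicts.

    defined   -- variable names the domain defines
    date_vars -- variables that must hold ISO 8601 values
    yn_vars   -- variables restricted to Y or N (blank allowed)
    """
    findings = []
    for i, rec in enumerate(records, start=1):
        for var, value in rec.items():
            if var not in defined:
                findings.append(("ERROR", f"record {i}: {var} is not defined in the domain"))
            elif value in (None, ""):
                continue  # missing values are handled by a separate check
            elif var in date_vars and not ISO_DATE.fullmatch(str(value)):
                findings.append(("ERROR", f"record {i}: {var}={value!r} is not ISO 8601"))
            elif var in yn_vars and value not in ("Y", "N"):
                findings.append(("NOTE", f"record {i}: {var}={value!r} is not Y or N"))
    return findings
```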

Rabidwolff Industries | Established: 10/15/2011 | Version: 4 8/4/2012 | Page Last Generated: 9/25/2021 4:40:22 PM