DFcmpSchema

DFcmpSchema — Apply the data dictionary rules against the study database

Synopsis

DFcmpSchema [-a] [-v] [-p #] [-d DRF_filename] {-s #}

Description

DFcmpSchema performs a consistency check of the database against the data dictionary for the study. It reports:

  • records that have an incorrect number of fields

  • records that have illegal status, validation levels, or time stamps

  • records that have an impossible CRF image reference

  • key fields that are illegal or inconsistent with the current study or plate

  • values that occupy more columns than the maximum defined for the field

  • values that have an incorrect format

  • fields that have impossible values

  • choice and check fields whose data value does not match the coding defined in the study schema

  • missing delimiter at the end of data records for user defined CRF plates

  • missing or empty required fields on clean primary or secondary data records

  • date values on clean primary and secondary records that contain digits, characters, or ?? but do not conform to the field's imputation or partial rounding format

When the data dictionary for an existing study is changed via DFsetup, DFdiscover does not retroactively modify the database to match the new dictionary. DFcmpSchema is useful in this circumstance to identify and locate existing data that now violates the data dictionary.

DFcmpSchema applies two levels of checking: basic and exhaustive. Basic checking is the default and includes all but the last two items in the list above. Exhaustive checking includes everything verified in basic checking plus those last two items (missing or empty required fields and inconsistent date values). Exhaustive checking is enabled by running DFcmpSchema with the -v option.

By default, DFcmpSchema applies basic checks to all primary records in the database, excluding any records in the new record queue. New records are checked only if plate zero is explicitly specified (-p 0). Because the data fields in new records have not yet been verified, DFcmpSchema checks only fields 1-7 of new records; that is, checking stops at the subject ID field.

For each inconsistency, the report includes the record, plate number, the line number in the exported data file, a synopsis of the inconsistency, the key fields of the record (so it can be retrieved with DFexplore), the data dictionary definition, and the current data value.
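
When a report is long, a per-plate tally can help decide where to start. The following is a minimal sketch, not part of DFcmpSchema itself. It assumes that every inconsistency is introduced by a line beginning with `E** Plate <number> (`, as in the sample output in the Examples section; other line shapes may exist and would be ignored here. The function name `count_by_plate` is hypothetical.

```sh
# Tally inconsistency lines per plate from a DFcmpSchema report on stdin.
# Assumption: each inconsistency starts with "E** Plate <number> (...)",
# matching the sample output shown in the Examples section.
count_by_plate() {
    awk '
        # $1 is "E**", $2 is "Plate", $3 is the plate number
        /^E\*\* Plate [0-9]+ / { n[$3]++ }
        END { for (p in n) print p, n[p] }
    ' | sort -n
}

# Possible usage, with the report saved to a file first:
#   DFcmpSchema -s 254 > report.txt
#   count_by_plate < report.txt
```

Each output line holds a plate number followed by the number of inconsistencies reported for that plate.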

Options

-a

Check all primary and secondary records. The default is to check only primary records. [15]

-v

Apply exhaustive checking. Exhaustive checking includes all basic checks plus the following additional checks, which are applied to clean records only:

  • missing or empty required fields

  • date values that do not conform to the field's imputation or partial rounding specifications

-p #

Check only the requested plate number. The plate number can be one of the user-specified plates, plate 0 (new record queue), plate 510 (reason for data change), or plate 511 (query).

-d DRF_filename

Create a DFdiscover retrieval file, DRF_filename, for all problems identified.

-s #

Study number (required).

Exit Status

DFcmpSchema exits with one of the following statuses:

0

The command was successful.

36

The required command-line arguments were not present or were incorrectly specified.

2

The command failed because the study number was not defined, the study schema could not be read, the requested plate number was not defined, or communication with the database server failed.
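
A calling script can branch on these statuses. The sketch below maps each documented status to a short message; the function name `describe_status` and the message wording are illustrative assumptions, and any status not listed above is reported as unexpected rather than interpreted.

```sh
# Translate a DFcmpSchema exit status into a short message.
# Only the three statuses documented above are recognized.
describe_status() {
    case "$1" in
        0)  echo "success" ;;
        36) echo "usage error: required arguments missing or incorrectly specified" ;;
        2)  echo "failed: study, schema, plate, or database server problem" ;;
        *)  echo "unexpected status: $1" ;;
    esac
}

# Possible usage:
#   DFcmpSchema -s 240 > report.txt
#   describe_status $?
```

Note that status 0 indicates that the command ran successfully; the text above does not say that it implies an empty report, so the report itself should still be inspected.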

Examples

Example 4.3. Report on all inconsistencies for study 240

% DFcmpSchema -s 240
2|1|0245R0032001|240|1||1||0|0||||||||2|02/11/11 13:26:52|02/11/11 13:26:52|
E** Plate 1 (Screening Form), line 1: Incorrect field count: 20 should be 21.
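
The field count in a message like the one above can be cross-checked by hand. This is a rough sketch that assumes, based on the sample records in this section, that fields are delimited by `|` and that every field, including the last, is followed by a delimiter, so the number of delimiters equals the number of fields. The function name `count_fields` is hypothetical, and DFcmpSchema's own counting rules may differ.

```sh
# Print the number of '|'-terminated fields on each input line.
# Assumption: every field ends with a delimiter, so a well-formed record
# splits into N fields plus one empty trailing piece (hence NF - 1).
# A record that lacks the final delimiter is therefore undercounted by one.
count_fields() {
    awk -F'|' '{ print NF - 1 }'
}

# Possible usage, with exported records in a file (records.dat is a
# hypothetical file name):
#   count_fields < records.dat
```

The result for a given record can then be compared with the count reported in the corresponding message.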

Example 4.4. Perform exhaustive checking for inconsistencies on plate 2 for study 254

% DFcmpSchema -v -p 2 -s 254

1|7|0436R0008002|254|2|1|10052|KKL|23/06/04||09/09/24|||180||080|1|1|23/07/04|1|04/09/08 11:08:15|04/11/11 09:33:48|
E** Plate 2 (Patient Entry Form), line 2: Required data field.
        Subject ID='10052', Visit='1'
        Field#10  (status=1, validation=7)
        Name='MEDCODE'
        Desc='Medication Code 1'
        Type=INT      Required  Width=4
        DATA LEN=0 ''

1|7|0436R0006002|254|2|1|10056|POL|00/04/03|1112|07/08/24||111.1|166||070|1|1|05/06/04|1|04/09/08 10:03:14|04/11/11 09:45:32|
E** Plate 2 (Patient Entry Form), line 3: Bad data.
        Subject ID='10056', Visit='1'
        Field#9  (status=1, validation=7)
        Name='EDATE'
        Desc='Entry Date'
        Type=DATE     Required  Width=8   Format='dd/mm/yy'
        Imputation='never'  Range='1940 - 2039'
        DATA LEN=8 '00/04/03'



[15] Missed records are never checked.