Chapter 1. Introduction

Table of Contents

1.1. About This Guide
1.2. DFdiscover Programming Limits

1.1. About This Guide

This guide is for programmers, and for those who aspire to be. It covers DFdiscover file formats and those tools that are executed at the shell level. As a result, some familiarity with the UNIX operating system and UNIX shell level programming is assumed.

If this is your first encounter with UNIX, we strongly recommend that you look for a good UNIX programming book in your local bookstore. We would not hesitate to recommend The UNIX Programming Environment by Kernighan and Pike, and The Awk Programming Language by Aho, Kernighan and Weinberger. The former is an excellent introduction to writing UNIX shell scripts, while the latter describes awk, a simple C-like language that is ideal for manipulating data files. All of the tools described in these two books come as standard components of the UNIX operating system.

In the remaining chapters of this guide we will describe:

  • DFdiscover Study Files. A description of the location and format of all study files (including the configuration and data files) used in all DFdiscover studies.

  • Shell Level Programs. Over a dozen shell level commands for doing such things as:

    • exporting data records from a study database

    • selecting individual data fields

    • reformatting exported data records

    • importing data records from other sources into a DFdiscover study database

    • sending and managing faxes

    • getting study configuration parameters

    These programs can be used in shell scripts to create your own programs. When combined with standard UNIX commands (such as awk, sed, and grep), they provide a powerful and flexible way of creating study report programs.

  • Utility Programs. Descriptions of several infrequently used, yet useful, utility programs that can simplify some DFdiscover management tasks.

  • Edit checks. A programming language for writing and executing edit checks that run in real time during data validation.

  • Batch Edit checks. An extension of the edit checks language that permits execution of edit checks in batch, non-interactive mode.

  • DFsas: DFdiscover to SAS®. An environment for generating SAS® job and data files from a DFdiscover study database and study schema.

  • DFsqlload: DFdiscover to Relational Database Tables. A program that creates relational database tables from a DFdiscover study database and study schema.

1.2. DFdiscover Programming Limits

The following cribsheet is for DFdiscover programmers and summarizes all relevant DFdiscover database limits and formats. This cribsheet is to be used in conjunction with the information that follows in this chapter.

| Description | Limit | Comments |
|---|---|---|
| DFdiscover Study Number | 1-999 | The suggested range for study numbers is 1-249, because study numbers 250-255 are reserved for DFdiscover test and validation studies (e.g. ATK = 254). With the appropriate software license, study numbers 256-999 are available for defining EDC studies. |
| Plate Number | 0-501, 510, 511 | Plates 501 and 511 are reserved by DFdiscover for Query Reports and cannot be redefined at the user level. Plate 0 references the new record queue. Plate 510 is reserved by DFdiscover for Reason records. |
| Visit/Sequence Number (barcoded) | 0-511 | |
| Visit/Sequence Number (first data field) | 0-65535 | Any data field representing the visit/sequence number must be defined in the database as field #6 using DFdiscover schema numbering. |
| Site Number | 0-21460 | This limit applies to the site number only. A subject identifier is concatenated to the site number to obtain the subject ID. |
| Subject ID Number | 0-281474976710655 | For subject IDs that are composed of site # + ID #, this limit applies to the concatenated value of the two. This field can contain at most 15 digits. |
| Any numeric value | -2147483647 to 2147483647 | Any numeric field, except the subject ID field, can contain at most 10 digits, including any leading sign and decimal point. This limit applies to the following DFdiscover field types, which have a base numeric value: numeric, visual analog scale (VAS), choice codes, check codes. |
| Query Use | 0 = none; 1 = external; 2 = internal | |
| Query Type | 0 = none; 1 = Q&A (clarification); 2 = refax (correction) | |
| Query Category Code | 1 = missing; 2 = illegal; 3 = inconsistent; 4 = illegible; 5 = fax noise; 6 = other; 21 = missing page; 22 = overdue visit; 23 = edit check missing page; 30-99 = user-defined problem type | |
| Query Status | 0 = pending review; 1 = new; 2 = in unsent report; 3 = resolved, NA; 4 = resolved, irrelevant; 5 = resolved, corrected; 6 = in sent report | |
| Query Detail Field | max 500 characters | |
| Query Note Field | max 500 characters | |
| Missed Data Log Explanation Field | max 500 characters | |
| Default Date Format | YY/MM/DD | |
| Validation Level (system) | 0-7 | Level 0 represents new, not yet entered, records. |
| Validation Level (user) | 1-7 | A user cannot assign a validation level of 0 to a data record. |
| Maximum Data Record Length | 16384 ASCII (4096 UNICODE) characters | This is the maximum length that the system can accept and includes 55 characters of overhead maintained by the system. Therefore, the length of data record available for user-defined fields is 55 characters less. |
| Maximum DFsas Record Length | 2048 characters | DFsas is unable to process input files for SAS® greater than this size. |
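
When preparing records outside DFdiscover, it can help to check key values against these limits before attempting an import. The following is a minimal sketch in Python, used here purely for illustration. The names LIMITS, in_limit, plate_ok, is_reserved_study and user_field_budget are hypothetical and are not part of DFdiscover; only the numeric ranges are taken from the cribsheet above.

```python
# Illustrative only: these helpers are hypothetical, not DFdiscover tools.
# The ranges below are copied from the programming limits cribsheet.

LIMITS = {
    "study": (1, 999),
    "visit_barcoded": (0, 511),
    "visit_field": (0, 65535),
    "site": (0, 21460),
    "subject_id": (0, 281474976710655),
    "numeric": (-2147483647, 2147483647),
    "level_system": (0, 7),
    "level_user": (1, 7),
}

# Study numbers 250-255 are reserved for test and validation studies.
RESERVED_STUDIES = range(250, 256)

# Maximum data record length (ASCII) and the system-maintained overhead.
MAX_RECORD_ASCII = 16384
RECORD_OVERHEAD = 55


def in_limit(kind, value):
    """Return True if value lies within the inclusive range for kind."""
    low, high = LIMITS[kind]
    return low <= value <= high


def plate_ok(plate):
    """Valid plate numbers are 0-501 plus the reserved plates 510 and 511."""
    return 0 <= plate <= 501 or plate in (510, 511)


def is_reserved_study(study):
    """Return True for study numbers reserved for test/validation studies."""
    return study in RESERVED_STUDIES


def user_field_budget():
    """ASCII characters left for user-defined fields in one data record."""
    return MAX_RECORD_ASCII - RECORD_OVERHEAD


if __name__ == "__main__":
    print(in_limit("study", 254), is_reserved_study(254))
    print(plate_ok(505))
    print(user_field_budget())
```

A check such as in_limit("level_user", 0) returns False, reflecting the rule that a user cannot assign validation level 0. The sketch checks ranges only; it does not model the digit-count limits or the UNICODE record length.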