DFdiscover Release 5.9.0
All rights reserved. No part of this publication may be re-transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of DF/Net Research, Inc. Permission is granted for internal re-distribution of this publication by the license holder and their employees for internal use only, provided that the copyright notices and this permission notice appear in all copies.
The information in this document is furnished for informational use only and is subject to change without notice. DF/Net Research, Inc. assumes no responsibility or liability for any errors or inaccuracies in this document or for any omissions from it.
All products or services mentioned in this document are covered by the trademarks, service marks, or product names as designated by the companies who market those products.
Google Play and the Google Play logo are trademarks of Google LLC. Android is a trademark of Google LLC.
Dec 04 2024
For software support, please contact the DFdiscover team:
via email, help@dfnetresearch.com.
Visit our website, https://www.dfnetresearch.com.
A number of conventions have been used throughout this document.
Any freestanding sections of code are generally shown like this:
# this is example code
code = code + overhead;
If a line starts with # or %, this character denotes the system prompt and is not typed by the user.
Text may also have several styles:
Emphasized words are shown as follows: emphasized words.
Filenames appear in the text like so: dummy.c.
Code, constants, and literals in the text appear like so: main.c.
Variable names appear in the text like so: nBytes.
Text on user interface labels or menus is shown as: Printer name, while buttons in user interfaces are shown as: Button.
Menus and menu items are shown as: File > Exit.
This guide is for programmers, and for those who aspire to be. It covers DFdiscover file formats and those tools that are executed at the shell level. As a result, some familiarity with the UNIX operating system and UNIX shell level programming is assumed.
If this is your first encounter with UNIX we strongly recommend that you look for a good UNIX programming book in your local bookstore. We would not hesitate to recommend The UNIX Programming Environment by Kernighan and Pike, and The Awk Programming Language by Aho, Kernighan and Weinberger. The former is an excellent introduction to writing UNIX shell scripts, while the latter describes awk, a simple C-like language which is ideal for manipulating data files. All of the tools described in these 2 books come as standard components of the UNIX operating system.
In the remaining chapters of this guide we will describe:
DFdiscover Study Files A description of the location and format of all study files (including the configuration and data files) used in all DFdiscover studies
Shell Level Programs Over a dozen shell level commands for doing such things as:
exporting data records from a study database
selecting individual data fields
reformatting exported data records
importing data records from other sources into a DFdiscover study database
sending and managing images and faxes
getting study configuration parameters
These programs can be used in shell scripts to create your own programs. When combined with standard UNIX commands (like awk, sed, grep, etc.) they provide an extremely powerful and flexible way of creating study report programs.
Utility Programs Descriptions of several infrequently used, yet useful, utility programs that can simplify some DFdiscover management tasks.
Edit Checks A programming language for writing and executing edit checks that occur in real-time during data validation.
Batch Edit Checks An extension of the edit checks language that permits execution of edit checks in batch, non-interactive mode.
Writing Your Own Reports Create and install custom reports for a DFdiscover study.
DFsas: DFdiscover to SAS® An environment for generating SAS® job and data files from a DFdiscover study database and study schema.
DFsqlload: DFdiscover to Relational Database Tables A program that creates relational database tables from a DFdiscover study database and study schema.
The following cribsheet is for DFdiscover programmers and summarizes all relevant DFdiscover database limits and formats. This cribsheet is to be used in conjunction with the information that follows in this chapter.
This chapter describes the files maintained under each DFdiscover study directory. It starts with a brief overview of the top-level directories and then describes the files kept under each directory in detail. A similar chapter, System Administrator Guide, DFdiscover System Files, describes the files located under a DFdiscover installation directory.
The following directories are part of a standard DFdiscover study definition.
bkgd This directory holds CRF background images created by DFsetup. Three files are created for each study plate. These include the plate background used by DFsetup (plt###), the data entry screen used by DFexplore (DFbkgd###), and a background used when printing CRFs containing database values from DFprintdb or DFexplore (DFbkgd###.png).
NOTE: This directory must not be used to store any other files. Extraneous files will be deleted when CRF images are published from a development study to its production study, or when reverting a development study to the current state of its production study.
data This directory holds the study data files, queries data file, and journal files. The following files are described in detail later in this chapter in The study data directory:
plt###.dat - per plate data files. The study data files are created from the data recorded on the study CRFs.
DFqc.dat - the query database. The quality control data file contains all field-level queries added to flag CRF problems and request data clarifications from investigators.
DFreason.dat - reason for change records. The reason for change data file contains all field-level reasons for data changes, clarifying, if required, why a field's data value was changed.
DFin.dat - the new records database. The new record queue contains all records that have been received and ICRed by the DFdiscover software but not yet validated.
plt###.dat - missed records. Missed records serve as placeholders in data files, indicating that requested data records will never be available.
plt###.ndx - per plate index files. Per plate index files speed database searching and hold record lock information.
YYMM.jnl - monthly database journal files. The journal files record all database transactions and provide an audit trail of additions and modifications.
DFaudit.db - sqlite audit trail. Sqlite version of the journal files.
dfsas This directory contains any stored SAS® jobs that were created from DFexplore.
dfschema This directory contains any stored DFschema files, which are used to track changes to the study setup. These are used by DFaudittrace to generate records for DF_ATmods.
drf This directory contains any .drf files (DFdiscover Retrieval Files) created by server-side tools, including reports and the program DFmkdrf.jnl. The common .drf files created by DFdiscover reports include:
DupKeys.drf Lists any duplicate primary keys found in the database by the last execution of DF_XXkeys.
VDillegal.drf Lists any illegal visit dates found in the database by the last execution of DF_XXkeys.
VDincon.drf Lists any inconsistent visit dates found in the database by the last execution of DF_XXkeys.
DFunexpected.drf Lists any unexpected data records found in the database by the last execution of DF_QCupdate.
ecbin Any study specific scripts or programs that are run by the edit check function dfexecute(), and any Plate Arrival Trigger scripts defined in the DFsetup Plates View, must be stored in the study level ecbin directory. Executables can also be stored in the DFdiscover level ecbin directory for use in all studies. The study level ecbin directory has priority if the same program or script is stored in both locations.
ecsrc This directory contains the edit check files: DFedits and any other edit check source files referenced in 'include' statements. Include files can also be stored in the DFdiscover level ecsrc directory. The study level ecsrc directory has priority if the same include file is stored in both locations.
lib All study configuration files, created through both the DFadmin and DFsetup tools, are located in this directory. The following files are described in detail later in this chapter in The study lib directory:
DFcenters - sites database. The study sites database includes information on all clinical sites participating in the study.
DFccycle_map - conditional cycle map. Defines cycles that may be required, unexpected or optional, depending on specified database conditions (optional).
DFcvisit_map - conditional visit map. Defines visits that may be required, unexpected or optional, depending on specified database conditions (optional).
DFcplate_map - conditional plate map. Defines plates that may be required, unexpected or optional, depending on specified database conditions (optional).
DFcterm_map - conditional termination map. Defines database conditions that signal termination of subject follow-up (optional).
DFCRFType_map - CRF type map. Defines categories for different CRF background types (e.g. translations) (optional).
DFcrfbkgd_map - CRF background map. Defines visits with CRF backgrounds used across multiple visits (optional).
DFedits.bin - published edit checks. The "published" equivalent of the edit checks, for use by DFexplore clients only (optional).
DFfile_map - file map. Specifies each unique CRF plate used in the study.
DFlogo.png - study logo. The study logo displayed in data entry screens and reports.
DFlut_map - lookup table map. Defines and associates lookup table names with directory names of lookup tables. Also includes the definition for the query lookup table, if it exists (optional).
DFmissing_map - missing value map. Missing value codes (and labels) used in the study (optional).
DFpage_map - page map. Used to specify descriptive labels to replace the visit/sequence and plate numbers that are used by default to identify problems on the Query Reports (optional).
DFqcsort - Query and CRF sort order. Specifies a sort order for queries as written on Query Reports. The default is to sort by field number within plate number within visit/sequence number within subject ID (optional).
DFqcproblem_map - Query category code map. This file contains all the system-defined and user-defined query category codes.
.DFreports.dat - reports history. Personal file of DFdiscover report commands (and options) corresponding to reports which the user runs in the reports tool (optional).
DFschema - database schema. The study database schema (or dictionary) describes all variables on all plates, as specified in the setup tool.
DFschema.stl - database schema styles. Describes all variable styles defined for the study in the setup tool.
DFsetup - study definition. This is a JSON format file maintained and used by DFsetup to record all study setup information, including all plate, style and variable definitions.
DFsetup.db - sqlite setup database. This is the sqlite database for setup and setup change history.
DFserver.cf - server configuration. Configuration file for the study database server.
DFsubjectalias_map - subject alias map. Used to specify descriptive labels to optionally replace the numeric subject ID used as an identifier throughout DFdiscover.
DFsubjectalias_map.log - subject alias change log. Logs all changes to the mapping of subject ID to subject alias over the course of a study.
DFtips - ICR tips. This file contains the coordinates, field type and legal values for all variables defined on all plates. It is used by the ICR software to create the initial data record from new CRF pages that arrive as images.
DFvisit_map - visit map. This file describes the scheduling of subject assessments during the trial and the CRF pages expected at each assessment.
DFwatermark - watermark. This file describes watermarks used for printed output, assigned by role.
QCcovers - Query cover pages. This file contains formatting information to be included in a Query Report cover sheet. To include a cover sheet for a Query Report, QCcovers must first be defined.
QCmessages - Query Report messages. This file contains messages to be included in Query Report cover sheets. More than one message may be included in a single Query Report; however, DF_QCreports expects the cover sheet information and all messages to fit on a single page.
QCtitles - Query Report titles. This file describes how the report title and each title of the 3 sections of a DFdiscover Query Report are to be customized. DF_QCreports checks for the existence of this file and will use it if it exists; otherwise standard titles are produced.
DFreportstyle - DFdiscover Report styling. This file contains styling "instructions" for reports, excluding Legacy reports. Styling includes font choices and colors used in graphs and charts.
lut This directory contains all study specific lookup tables. Lookup tables can also be stored in directory lut at the DFdiscover level. Study level files have priority if a lookup table with the same file name is stored in both locations. Lookup table file names must be associated with an edit check name in DFlut_map (found in the study lib directory) before they can be used in edit checks.
pages This directory holds all standard (SD) resolution (100 dpi) CRF image and supporting document files for all CRF pages received or uploaded during the study. Each CRF page received as an image or PDF is stored as a PNG file (Sun Rasterfile in pre-2014 DFdiscover releases), and requires about 25K bytes of disk space (i.e. 40 CRF pages per MB). Supporting documents can be audio, video, DICOM or PDF files and will be in their native format.
DFdiscover creates subdirectories, below the study pages directory, in which to store the CRF image files. These subdirectories are named by the year and week (in yyww format) in which the CRFs arrived. For example, a study which started on January 1, 1995 would have subdirectories named: 9501, 9502, 9503, etc. through to the end of the trial.
Within each of the yyww subdirectories, the individual CRF page files are named by the concatenation of the sequence number of the image (starting at 0001 each week) in which they arrived during that week, and their page number within the document. The file-naming format is ffffppp. For example, if the first document to arrive in the week of 9501 contained 3 pages, it would be represented by the following files beneath the study pages directory: 9501/0001001, 9501/0001002, and 9501/0001003. The ffff part of the file name is a sequential value constructed from 4 characters, each character taken from the alphabet:
0 1 2 3 4 5 6 7 8 9 B C D F G H J K L M N P Q R S T V W Y Z
This is essentially the digits 0 through 9 followed by the uppercase letters A through Z, with the exception of the letters A, E, I, O, U, and X. This part of the file name thus lies between 0001 and ZZZZ, which allows for a total of 30x30x30x30-1 = 809,999 images per week. Example file names within a week include:
0001005 - 5th page of 1st document
000B001 - 1st page of 10th document
000Z011 - 11th page of 29th document
0010002 - 2nd page of 30th document
01C0004 - 4th page of 1230th document
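The ffff portion is effectively a base-30 number over the 30-character alphabet above. As an unofficial sketch (the function name is ours, not a DFdiscover tool), a small shell function can decode it back to the document's ordinal within the week:

```shell
# Decode the 4-character ffff part of a CRF image file name into the
# document's ordinal number within the week (base 30, custom alphabet).
decode_ffff() {
    echo "$1" | awk '
        BEGIN { a = "0123456789BCDFGHJKLMNPQRSTVWYZ" }
        {
            n = 0
            for (i = 1; i <= length($0); i++)
                n = n * 30 + index(a, substr($0, i, 1)) - 1
            print n
        }'
}

decode_ffff 0001    # -> 1    (1st document of the week)
decode_ffff 000B    # -> 10   (10th document)
decode_ffff 01C0    # -> 1230 (1230th document)
```

Running the same arithmetic forward from ZZZZ reproduces the 809,999 per-week capacity quoted above.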
pages_hd This directory holds all higher (HD) resolution (300 dpi) CRF images and supporting document files received or uploaded during the study. DFdiscover creates subdirectories and stores the files in the same manner as the pages directory.
reports It is recommended that all study specific reports, i.e. those programs which have been created specifically for a particular clinical trial, are stored in the study reports directory. This helps to standardize maintenance across studies.
Documentation for study specific reports must also be stored in this directory, in a file named .info, which uses the same format used to document the standard DFdiscover reports. The specific requirements of this file and its format are described in Writing Documentation for Study Specific Reports.
reports/QC This directory is used to store the standard DFdiscover Query Reports, created by DF_QCreports. The files and directories maintained under this directory are briefly described below.
The naming convention for Query Reports is a zero padded 3 digit site ID (if the site ID is greater than 999, it will be a 4 digit number), followed by a dash, and then the date on which the report was created. For example: 055-150815 would identify a Query Report created for site 55 on Aug 15, 2015. Thus you can tell from a Query Report name, both whom it was created for and when.
Although helpful, this naming scheme has one disadvantage: you cannot create more than one Query Report per day for an individual clinical site. If you do, the earlier one will be overwritten. However, since Query Reports are sent to the clinical sites, something that should not be done more than about once a week, and certainly not more than once a day, this approach to naming Query Reports works in practice.
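A report file name of this form can be constructed with printf; the site number and date below are taken from the example above:

```shell
# Build a Query Report file name: zero-padded 3-digit site ID, a dash,
# and the creation date in yymmdd form.
site=55
created=150815                       # Aug 15, 2015; normally $(date +%y%m%d)
report=$(printf '%03d-%s' "$site" "$created")
echo "$report"                       # -> 055-150815
```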
Query Reports remain under QC until they are sent to the clinical sites. Thus any reports that you see at this level have not been transmitted to the study clinics.
The files and subdirectories maintained under reports/QC include: QC/sent, QC/internal, QC_LOG, QC_NEW, SENDFAX.log, SENDFAX.qup, 18273_N.refax, and 18273_N.qalist.
work The work directory contains temporary files, DFdiscover retrieval files and summary data files. It is important to be aware of the files that DFdiscover creates and maintains in this directory so that programmers do not inadvertently overwrite them.
Temporary Files The work directory is used for temporary files created by various reports and is the recommended location for temporary files created by study specific reports and for data files exported using DFexport.rpc. If DFexport.rpc is used to export study data files to this directory, or temporary files are created here in the process of generating study specific reports, be careful in your selection of temporary file names so that simultaneous execution of different programs does not result in file name conflicts. A useful strategy is to include the process ID plus some part of the program name in all temporary file names, and to check for, and create, a lock directory for any programs that should not be executed simultaneously by more than one user.
Because the work directory is used for temporary file creation by various DFdiscover programs, you might find temporary files that were not removed because a program failed to complete due to a power failure or some other problem. Any file that is several days old, and has what looks like a process ID number as part of its name, is probably a temporary file left over from a program that failed to complete successfully. Such files can be removed; this should be done periodically to clean up the study work directory.
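A minimal sketch of this temporary-file and lock-directory strategy (the program name and paths are assumptions for illustration, not DFdiscover conventions):

```shell
# Hypothetical study report: unique temp file names plus a lock directory
# so two users cannot run the report at the same time.
prog=ptstatus                          # assumed program name
lockdir="${TMPDIR:-/tmp}/$prog.lock"

# mkdir is atomic: it succeeds for exactly one caller at a time.
if ! mkdir "$lockdir" 2>/dev/null; then
    echo "$prog: already running" >&2
    exit 1
fi
trap 'rmdir "$lockdir"' EXIT           # release the lock on any exit

tmp="${TMPDIR:-/tmp}/$prog.$$"         # program name + process ID
echo "exported records would go here" > "$tmp"
# ... report logic reads $tmp ...
rm -f "$tmp"
```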
Summary Files The work directory contains a number of summary data files created by DF_XXkeys and DF_QCupdate and used by other programs including DF_QCreports and DF_PTvisits. The following files are described in detail later in this chapter in The study work directory:
DFX_ccycle - cycle conditions met
DFX_cvisit - visit conditions met
DFX_cplate - plate conditions met
DFX_cterm - termination conditions met
DFX_keys - key fields for all required plates
DFX_schedule - subject scheduling and visit status
DFX_time1 - date and time from receipt to last modification
DFX_visit.dates - value in the database of all visit dates
The data, pages, and pages_hd directories must be owned by user datafax, with read, write and execute permissions for owner, and read and execute permissions for group.
The bkgd, dde, frame, lib, reports, and work directories must have read, write and execute permissions set for all users of the study group (typically this will be UNIX group studies).
These permissions are set by DFadmin on those directories which it creates when the study is first registered. The utility program DFstudyPerms is available for diagnosing and correcting study permission problems. A detailed list of study files and directories checked by this program can be found in System Administrator Guide, Maintaining study file system permissions.
Each file documented in this section is described with a common set of file attributes.
A DFdiscover Retrieval File (DRF) is an ASCII file in which each record identifies the subject ID, visit or sequence number, and CRF plate number (in that order, | delimited), and optionally the image ID corresponding to a record in the study database. Each record ends with a newline (\n).
DFdiscover retrieval files are created by DFexplore, DFdiscover reports, and the DFdiscover programs DFmkdrf.jnl and DFexport.rpc.
DFdiscover retrieval files created by DFexplore and DFdiscover reports and programs reside in the study drf directory. It is recommended that DRFs created using DFexport.rpc also be saved to the study drf directory. All DRFs must end with the .drf extension. Exercise caution when selecting DRF names to avoid conflicts with the DRF file names generated by DFdiscover applications.
This is an example of a DRF listing all primary records having one or more queries attached to them. The DRF filename ext_qcs.150813.drf was created on August 13, 2015 and has the description external qc notes.150813.
99001|1|2
99001|1011|9
99002|1|2
99002|22|5
99002|1011|9
99004|1|2
DFdiscover Retrieval File field descriptions
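Because each DRF record is a short |-delimited line, DRFs are easy to post-process with standard UNIX tools. A sketch using the example records above (the scratch file path is illustrative):

```shell
# Write the example DRF records to a scratch file.
drf="${TMPDIR:-/tmp}/ext_qcs.demo.drf"
cat > "$drf" <<'EOF'
99001|1|2
99001|1011|9
99002|1|2
99002|22|5
99002|1011|9
99004|1|2
EOF

# List the subject IDs that have a plate 9 record in the DRF.
# Field 1 = subject ID, field 2 = visit/sequence, field 3 = plate.
awk -F'|' '$3 == 9 { print $1 }' "$drf"    # -> 99001 and 99002
```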
The study data directory holds all study data files, including the plate files that correspond to CRF plates, the query data file and the database transaction journals. These files are managed by the study database server and should not be accessed directly. Instead the records required should be exported from the study database to another file, as described in DFexport.rpc, and then read from the exported file.
Data records should never be added to these database files without doing so through the study database server. The program DFimport.rpc is available and is the recommended method of sending new or revised data records to the study database server.
Occasionally you may need to make extensive changes to an entire database file. The typical scenario is a change to one or more of the study CRFs to add or remove fields. In this case the plate definitions originally entered in the setup tool also need to be changed to match the new structure of the corresponding data files. This is a major operation. It requires that the study server be disabled while the database and setup definitions are being changed, and that the affected data files be re-indexed before the study database server is re-enabled.
The following sections describe data and index files ending with .dat and .idx suffixes. Very rarely, you may notice files having the same name but with .tad or .xdi suffixes. These files are temporary files created and managed by the database server. They are present while the server performs sorting and garbage collection on a particular data file. As such they should be treated in the same manner as the other data and index files - basically, do not touch them. Typically, this is not a problem because the files are present for only a second or two.
plt###.dat - per plate data files
After first validation, new records are moved from DFin.dat to the study database files, plt###.dat. There is a separate data file for each unique CRF plate number defined for the study. For plate N, the data file is named pltN.dat, where N is leading zero-padded to 3 digits in length. Each record in one of these data files corresponds to the data entered from one CRF page with that plate number. Thus, each data file holds the data recorded for all instances of that CRF page (i.e., across all subjects and all visits).
All database records are stored in free-field format, with | as the field delimiter and \n as the record terminator. All database records have 3 CRF information fields followed by 4 key fields. Within all database files for each study, the study number key should always be the same; and within each plt###.dat data file, the plate number key should always be the same.
WARNING: Do not edit data files Data files must not be edited by conventional means. Doing so will surely corrupt the database. Each data file has a corresponding binary index file that encodes the information of the data file. Editing any data file will change the contents so that they are inconsistent with the index file. Another reason not to read or edit the data files explicitly is that each record may appear in the database more than once. This arises because each time a record is modified, the new version is appended to the end of the database file. Only the index file knows which version of the record is the most recent. There is no indication in the data record. DFexport.rpc should always be used to read data files because it consults the index files to identify the correct version of each record to be exported. Old versions of modified records are removed by the study database server the next time the database is resorted.
Data records have the common fields and characteristics described in plt###.dat field descriptions.
Here is an example of a plate 4 data record after level 1 validation to database file plt004.dat:
1|1|9145/0045001|099|4|1|0123|egb|92/01/01|high blood pressure|Dr. Smith|92/01/10 10:23:23|92/01/10 10:23:23|
The text fields have been entered and the record status has been set to final.
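Records exported with DFexport.rpc can then be dissected with awk. A sketch that pulls the four key fields (study, plate, visit/sequence, subject ID, i.e. fields 4-7) out of the example record:

```shell
# The plate 4 example record from the text (one line).
rec='1|1|9145/0045001|099|4|1|0123|egb|92/01/01|high blood pressure|Dr. Smith|92/01/10 10:23:23|92/01/10 10:23:23|'

# Fields 1-3 are the CRF information fields (status, validation level,
# raster); fields 4-7 are the key fields.
echo "$rec" | awk -F'|' \
    '{ printf "study=%s plate=%s visit=%s id=%s\n", $4, $5, $6, $7 }'
```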
plt###.dat field descriptions
DFSTATUS - record status
DFVALID - validation level
DFRASTER - image reference in YYWW/FFFFPPP (or YYWWRFFFFPPP) format, relative to PAGE_DIR, where YY is the year, WW is the week, FFFF is the document sequence within the week and PPP is the page number within the document
DFSTUDY - study number key
DFPLATE - plate number key
DFSEQ - the visit or sequence number of this occurrence of a plate for a subject ID. Legal limit is 0-511 if defined in the barcode, and 0-65535 if defined as the first data field on the page.
IMPORTANT: Field 6 has this name only if the field is pre-defined in the barcode or visit map; otherwise, the name is user-defined.
DFSCREEN - subject ID key
DFCREATE - record creation date and time, in yy/mm/dd hh:mm:ss format
DFMODIFY - record last modification date and time, in yy/mm/dd hh:mm:ss format
plt###.dat - missed records
Here is an example of a missed data record:
0|7|000/0000000|099|4|1|0123|1| subject was on vacation|92/01/10 10:23:23|92/01/10 10:23:23|
Missed record field descriptions
DFqc.dat - the Query database
The query data file, DFqc.dat, is maintained by the study server in the same manner as the CRF data files, DFin.dat and plt*.dat. All query records have exactly the same format across all DFdiscover studies. Each query record has 22 fields, of which 5 (fields 4-8) are the keys needed to uniquely identify each record. Multiple queries are permitted per field, provided their category codes are unique.
As for study data files, you should always use DFexport.rpc to export the query database before using it. For this purpose, DFqc.dat has been assigned a DFdiscover reserved plate number of 511.
DFqc.dat field descriptions describes each field in a query record.
An example of a newly added query that is to be included in the next Query Report sent to the originating site:
1|1|0000/0000000|20|10|999|2100101|6|21|000000|0||1. Sample collection date |11/10/05 |1|2|||user1 06/05/27 11:35:57|user1 06/05/27 11:35:57||1|
An example of a query that was sent out in a Query Report and has now been resolved:
5|4|0000/0000000|20|397|2030|1415412|11|14|060424|31|1|A2. Whole Blood 1ml ID # ||1|2|||brian 06/03/23 12:57:20|brian 06/03/23 12:57:20|barry 06/05/11 14:10:04|1|
DFqc.dat field descriptions
DFPID
DFQCFLD
the data entry field to which this query is attached
WARNING: Query Field Number = Database Field Number - 3 For historical reasons this number is 3 less than the database field number. For example, a query on the ID field, which is always database field 7, will have a query field number of 4.
DFQCCTR
DFQCRPT
DFQCPAGE
DFQCREPLY
DFQCNAME
DFQCVAL
DFQCPROB
DFQCRFAX
DFQCQRY
DFQCNOTE
DFQCCRT
DFQCMDFY
DFQCRSLV
DFQCUSE
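The off-by-3 offset described in the warning above is a common trap when relating query records to data records; the conversion is simple arithmetic:

```shell
# Convert between query field numbers and database field numbers.
qcfld=4
dbfld=$((qcfld + 3))      # query field 4 -> database field 7 (the ID field)
echo "$dbfld"             # -> 7

dbfld=10
qcfld=$((dbfld - 3))      # database field 10 -> query field 7
echo "$qcfld"             # -> 7
```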
DFreason.dat - reason for change records
Here is an example of a reason data record:
1|3|000/0000000|254|4|1|0123|9|phone|information provided by physician over the phone|nathan 03/01/10 10:23:23|nathan 03/01/10 10:23:23
It indicates that field 9 was changed because of the reason "information provided by physician over the phone".
Reason for data change record field descriptions
DFRSNFLD
DFRSNCDE
REASON
DFRSNTXT
DFRSNCRT
name yy/mm/dd hh:mm:ss
DFRSNMDF
DFin.dat - the new records database
Each initial data record, created by the ICR software when it scans a CRF page, is passed to the study database server which appends it to this file. These are called "new records".
All new records have the common fields and attributes identified in DFin.dat field descriptions.
The first 5 fields are added by the DFinbound.rpc process that ICRed the page. The 4th and 5th fields were read from the barcode at the top of each CRF page. Depending upon the success of the ICR algorithm, subsequent fields in each new record may or may not be filled in. If an image is particularly difficult to read, it is possible that only the first five fields will appear in the new record.
The first three fields, in addition to being delimited, are also in fixed column positions. The record status begins in the first column and is one column wide. The validation level begins in the third column and is one column wide. The raster name begins in the fifth column and is 12 columns wide. Columns two and four contain the field delimiter.
Here is an example of a new data record from DFin.dat
0|0|9145/0045001|099|4|1|0123||92/01/01|||
This record has new record status, has not yet been validated, contains the data from image $(PAGE_DIR)/9145/0045001, and belongs to study 99, plate 4, visit 1, and subject ID 123.
DFin.dat field descriptions
YYMM.jnl - monthly database journal files
The study database server adds a record to the current study database journal file each time a data record is written to the study database, including when:
a new record arrives in the study database from the incoming daemon
an existing record is modified in DFexplore
the validation level of an existing record is modified in DFexplore
the status of an existing record is modified in DFexplore
an existing record is deleted in DFexplore
a query is added, modified, resolved, or deleted in DFexplore
a record is imported into the study database using DFimport.rpc
the database is structured by DFsetup
Separate journal files are kept for each study, and within a study, a new journal file is created for each month. The naming scheme for journal files is YYMM.jnl where YY is the last two digits of the year (e.g. 92) and MM is the two digit month of the year, January being 01 and December being 12.
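The YYMM.jnl naming scheme can be derived directly from a date; a minimal sketch, with an illustrative function name:

```python
# Sketch: derive the journal filename for a given date, per the
# YYMM.jnl naming scheme described above (two-digit year, two-digit month).
import datetime

def journal_name(d):
    return d.strftime("%y%m") + ".jnl"

journal_name(datetime.date(1992, 1, 15))  # "9201.jnl"
```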
The fields within each journal record are defined as described in YYMM.jnl field descriptions.
Following is an example of a complete journal record. The line breaks are for presentation purposes only.
980312|132 426|valid1|d|1|1|9807/0047008|254|5|24|99001|ABC|06/07/97| 171|097|172|096|2|055.1|121.5|2||1143|00|*|*|*|1|1|*|1|1| 98/03/12 13:24:26|98/03/12 13:24:26|
YYMM.jnl field descriptions
YYMMDD
HHMMSS
d
q
r
s
S
DFaudit.db - SQLite audit trail
All journal records are additionally stored in an SQLite database. The database contains the table DFaudit and several indexes. The columns in the DFaudit table match the output from DFaudittrace:
CREATE TABLE IF NOT EXISTS dfaudit (
    dftype TEXT NOT NULL,        -- 1 type: N=new, C=changed field, D=deleted
    dfdate INTEGER NOT NULL,     -- 2 date: yyyymmdd
    dftime TEXT NOT NULL,        -- 3 time: hhmiss
    dfuser TEXT NOT NULL,        -- 4 user
    dfsubject INTEGER,           -- 5 subject
    dfvisit INTEGER,             -- 6 visit
    dfplate INTEGER NOT NULL,    -- 7 plate
    dffieldid INTEGER,           -- 8 unique field id: data(=0), query(>0), reason(<0)
    dffieldchange INTEGER,       -- 9 changed field: data(0, unique id), query(field#), reason(field#)
    dfstatus INTEGER,            -- 10 data, query and reason status
    dflevel INTEGER,             -- 11 validation level
    dfmaxlevel INTEGER,          -- 12 maximum validation level reached
    dfcategory TEXT,             -- 13 missed record code, query category, reason code
    dfuse TEXT,                  -- 14 missed record text, query usage, reason text
    dfoldval TEXT,               -- 15 old value
    dfnewval TEXT,               -- 16 new value
    dffieldnum INTEGER,          -- 17 field number
    dffielddesc TEXT,            -- 18 field description
    dfoldlbl TEXT,               -- 19 old coded field label
    dfnewlbl TEXT,               -- 20 new coded field label
    dfuid INTEGER)               -- 21 internal use for database restructure

CREATE INDEX IF NOT EXISTS dfidx_date ON dfaudit (dfdate, dfsubject)
CREATE INDEX IF NOT EXISTS dfidx_user ON dfaudit (dfuser)
CREATE INDEX IF NOT EXISTS dfidx_fuid ON dfaudit (dfuid)
CREATE INDEX IF NOT EXISTS dfidx_fid ON dfaudit (dffieldid)
CREATE INDEX IF NOT EXISTS dfidx_fnum ON dfaudit (dffieldnum)
CREATE INDEX IF NOT EXISTS dfidx_keys ON dfaudit (dfsubject, dfvisit, dfplate)
CREATE INDEX IF NOT EXISTS dfidx_vst ON dfaudit (dfvisit)
CREATE INDEX IF NOT EXISTS dfidx_plt ON dfaudit (dfplate)
CREATE INDEX IF NOT EXISTS dfidx_sta ON dfaudit (dfstatus, dflevel)
CREATE INDEX IF NOT EXISTS dfidx_lvl ON dfaudit (dflevel)
DFserver.rpc opens DFaudit.db in read-write mode; whenever a record entry is added to a journal file, it is also appended to the DFaudit table. DFedcservice opens DFaudit.db in read-only mode when requesting history. DFauditdb converts existing journal files to DFaudit.db.
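Field history can be retrieved with an ordinary SQL query against this schema. A minimal sketch; the in-memory database and sample row (including the 'Weight' field description) stand in for a real DFaudit.db, which a read-only consumer would instead open with sqlite3.connect("file:DFaudit.db?mode=ro", uri=True).

```python
# Sketch: query the audit trail using the dfaudit schema shown above.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE IF NOT EXISTS dfaudit (
    dftype TEXT NOT NULL, dfdate INTEGER NOT NULL, dftime TEXT NOT NULL,
    dfuser TEXT NOT NULL, dfsubject INTEGER, dfvisit INTEGER,
    dfplate INTEGER NOT NULL, dffieldid INTEGER, dffieldchange INTEGER,
    dfstatus INTEGER, dflevel INTEGER, dfmaxlevel INTEGER,
    dfcategory TEXT, dfuse TEXT, dfoldval TEXT, dfnewval TEXT,
    dffieldnum INTEGER, dffielddesc TEXT, dfoldlbl TEXT, dfnewlbl TEXT,
    dfuid INTEGER)""")
con.execute("CREATE INDEX IF NOT EXISTS dfidx_keys "
            "ON dfaudit (dfsubject, dfvisit, dfplate)")
# Sample row loosely modelled on the journal record example above.
con.execute("INSERT INTO dfaudit VALUES ('C',19980312,'132426','valid1',"
            "99001,5,24,0,1143,1,1,1,NULL,NULL,'171','172',4,'Weight',"
            "NULL,NULL,1)")
# Field history for subject 99001, visit 5, plate 24, oldest first:
rows = con.execute("""SELECT dfdate, dftime, dfuser, dfoldval, dfnewval
                      FROM dfaudit
                      WHERE dfsubject=? AND dfvisit=? AND dfplate=?
                      ORDER BY dfdate, dftime""", (99001, 5, 24)).fetchall()
```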
plt###.ndx - per plate index files
Every study data file has a corresponding index file that the study server uses to track the current status and location of each record in the data file. The index entry for a particular data record includes the value of the key fields id, plate, and visit/sequence number, the record status, the validation level, the offset of the beginning of the record into the data file, and the length of the data record. When searching for a data record by keys, it is much more efficient for the database server to search the index file for matching keys and then use the offset and length to extract the data record from the data file.
Each time an existing data record is modified or a new record is added, a new entry is made at the end of the index file for the new, modified copy of that record, and the status of the old index entry (if there was one) is changed to indicate that it has been superseded by a new entry.
Index and data files are sorted (on id and visit/sequence number) by the study database server, each time the server exits. Before sorting, an index file has M sorted entries at the top of the file, and N unsorted entries at the bottom of the file. When searching, the unsorted index entries are searched first in a linear fashion and then, if necessary, a binary search is performed on the sorted entries.
The first 32 bytes of an index file are header information, consisting of four 4-byte numbers that identify attributes of the file as a whole, as described in plt###.ndx header bits, followed by 16 bytes of padding. Before the study database server exits, it checks this header information of each index file. If the number of unsorted entries and the number of pending deletes are both 0, then the file is already sorted and does not need further attention.
The 32 bytes of header information are followed by the actual index entries. Each index entry is 32 bytes in size and is described in plt###.ndx record bits.
Each index entry contains 8 bytes for the subject ID, 2 bytes for the visit number, 2 bytes for the plate number, 4 bytes for the offset into the data file, 2 bytes for the record length, a status byte and 13 bytes of padding.
The status byte encodes three pieces of information: the record status (equivalent to the numeric value of the first byte of the data record), the record validation level (equivalent to the numeric value of the third byte of the data record), and the status of the index entry. This is encoded as illustrated below.
The record status contains the same values as those allowed in the record status field of the data record, namely 0 through 7 (binary 111). Similarly, the validation level will take on the same values as those allowed in the validation level of the data record, again 0 through 7. The index status contains the value 2 if this is a new index entry, 1 if the index entry has been superseded by a newer one, and 0 otherwise.
The size of all index files should always be a multiple of 32 bytes.
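The 32-byte index entry layout above can be unpacked programmatically. A minimal sketch: the field sizes follow the description above, but the byte order and the exact bit positions within the status byte are assumptions (the original encoding illustration is not reproduced here), as is the 3/3/2-bit split.

```python
# Sketch: unpack one 32-byte plt###.ndx index entry.
# Layout per the text: 8-byte subject ID, 2-byte visit, 2-byte plate,
# 4-byte offset, 2-byte length, status byte, 13 bytes of padding.
import struct

def parse_index_entry(entry):
    assert len(entry) == 32
    subject, visit, plate, offset, length, status = struct.unpack(
        "<8sHHIHB13x", entry)            # little-endian is an assumption
    record_status = status & 0x07        # 0-7, as in the data record
    validation_level = (status >> 3) & 0x07
    index_status = (status >> 6) & 0x03  # 2=new, 1=superseded, 0 otherwise
    return (subject, visit, plate, offset, length,
            record_status, validation_level, index_status)
```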
plt###.ndx header bits
plt###.ndx record bits
This directory contains the edit check source files.
DFedits - edit checks
edit SetInit()
{
    if ( dfblank( init ) && !dfblank( init[,0,1] ) )
        init = init[,0,1];
}

edit AgeOk()
{
    number age;
    if ( !dfblank( p001v03 ) && !dfblank( p001v04 ) ) {
        age = ( p001v03 - p001v04 ) / 365.25;
...
This directory contains the study configuration files, those files that make this study unique from every other DFdiscover study.
DFcenters - sites database
Each DFdiscover study has a sites database that records where each participating site is located, who the contact person is, and what subject ID number ranges are covered by each site ID.
Each site typically corresponds to a different clinical site, but this is not required. If necessary, a single clinical site may be defined as 2 or more sites, each corresponding to a different participating clinical investigator at that site. The sites database consists of one record per site ID. The field definitions are described in Field definitions for DFcenters.
Each site ID must be a number in the range 0 to 21460 and uniquely identifies the site within the database. All site numbers printed on DFdiscover reports are leading zero-padded to 3 digits but they may be up to 5 digits in length if the site number is greater than 999.
The contact person is required as it is the person to whom outgoing documents for the site are addressed.
The Fax field is not required and may be left blank for sites that are not meant to receive Query Reports. This field may include an email address or fax number. The email address must be specified using the notation mailto:email_address. The prefix mailto: is fixed and required while email_address must be a valid email address (there is no validity checking performed on email addresses, so be careful to enter them correctly).
If more than one email and/or fax number is specified, each must be separated by a single space. Comma separators are not permitted.
The fax number, if specified, should exactly match the number, including long distance, country, and area codes, that would be dialed from a numeric keypad. Most punctuation characters are ignored so they can be used for improving readability. Valid punctuation characters include: +#\*-,()0-9. For example, 1-(416)521-9800 is equivalent to 14165219800.
WARNING: Whitespace characters (space and tab) may not be used as punctuation because they are interpreted as delimiters between multiple entries.
Each comma inserts a one-second pause in the dialing sequence. This can be helpful when leaving a local PBX or waiting to dial an extension.
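Normalizing a punctuated entry down to what is actually dialed can be sketched as below. This is an illustration, not DFdiscover's implementation; keeping '#' and '*' (as dialable keypad characters) alongside ',' (the pause character) is an assumption, while '+', '-' and parentheses are dropped as readability-only punctuation.

```python
# Sketch: reduce a punctuated fax entry to the dialed sequence,
# e.g. "1-(416)521-9800" -> "14165219800", preserving ',' pauses.
def normalize_fax(entry):
    keep = set("0123456789,#*")  # assumption: '#' and '*' are dialable
    return "".join(c for c in entry if c in keep)

normalize_fax("1-(416)521-9800")  # "14165219800"
```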
Subject ID ranges are specified in fields 11 through the end of the record. There is no limit to the number of range fields that can be given. Each range field contains a minimum subject ID and maximum subject ID separated by exactly one space character. For example,
|101 199|
indicates that subject IDs 101 through 199 inclusive are to be included for the site. Individual subject IDs that are disjoint from any range are indicated by setting both the minimum and maximum ids to the actual subject ID. For example,
|244 244|301 399|401 420|
includes subject IDs 244, 301 through 399 inclusive, and 401 through 420 inclusive.
In the event that subject IDs are incorrectly entered in data records, there should always be a 'catch-all' site listed that receives all subject IDs that do not fall into any of the other subject ID ranges. This site is indicated by the phrase ERROR MONITOR in the 11th field and no other subject ID range fields.
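The range lookup with its ERROR MONITOR fallback can be sketched as follows. The data structure and site numbers here are hypothetical; a real implementation would read them from the DFcenters records.

```python
# Sketch: find the site responsible for a subject ID, given each site's
# range fields (fields 11 onward), falling back to the 'catch-all' site
# whose 11th field is the literal phrase ERROR MONITOR.
def site_for_subject(sites, subject_id):
    # sites: list of (site_id, range_fields); each range field is either
    # "min max" (one space separated) or the literal "ERROR MONITOR"
    catch_all = None
    for site_id, ranges in sites:
        for r in ranges:
            if r == "ERROR MONITOR":
                catch_all = site_id
                continue
            lo, hi = map(int, r.split())
            if lo <= subject_id <= hi:
                return site_id
    return catch_all

sites = [(11, ["244 244", "301 399", "401 420"]),
         (999, ["ERROR MONITOR"])]       # hypothetical site numbers
site_for_subject(sites, 310)  # 11
site_for_subject(sites, 500)  # 999 (catch-all)
```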
NOTE: DFcenters may be modified at any time, via DFsetup only, for an active study. However, if quality control reports are being created for the study, the DFdiscover report DF_QCupdate must always be run after DFcenters modification and prior to Query Report creation. DF_QCupdate ensures that the most up-to-date DFcenters file will be used when generating subject status and scheduling information for the Query Reports.
A single sites database record for a site that is responsible for subject IDs 1101 through 1149, and 1151 through 1199 inclusive.
011|Lisa Ondrejcek|DFnet|140 Lakeside Avenue, Suite 310, Seattle, Washington, 98122|1-206-322-5932 country:USA;enroll:100|1-206-322-5931|Lisa Ondrejcek|1-206-322-5931||1101 1149|1151 1199
Field definitions for DFcenters
Attributes including Start Date, End Date, Enroll Target, Protocol Effective Date (x5), Protocol Version (x5)
This field contains zero or more semicolon (;) delimited pairs, where each pair is a keyword and a value separated by a colon (:), and the keyword is from the list country, beginDate, endDate, enroll, protocol1, protocol1Date, protocol2, protocol2Date, protocol3, protocol3Date, protocol4, protocol4Date, protocol5, protocol5Date, testSite.
DFccycle_map - conditional cycle map
This table describes the conditional cycle map file structure and provides an example. It does not describe all of the syntax and rules related to this feature. Usage instructions for all 4 conditional maps are fully described in Study Setup User Guide, Conditional Maps.
The file contains one or more specifications, each consisting of a condition definition followed by one or more actions to be applied if the condition is met. Entries in the file have the general appearance of:
IF|Visit List|Plate|Field|Value
AND|Visit List|Plate|Field|Value
[+-~]|List of conditional cycles
Each condition may be followed by one or more action statements. Each of these statements begins with: '+' to indicate that the cycles are required, '-' to indicate that the cycles are unexpected, or '~' to indicate that the cycles are optional, when the condition is met.
There is no limit to the number of condition/action entries that may be included but the order in which the conditions appear may be important, because in the event of a conflict, the action specified by the last entry, applicable to each cycle, is the action that will be applied. This point is illustrated in the following example.
IF|0|1|22|6
+|2,5,8
-|3,6,9
~|4,7,10
IF|0|1|22|5
AND|0|9|13|>0
AND|0|9|36|!1
+|11
IF|1|3|9|~^A
-|11
This example consists of 3 conditional specifications. They are applied in the order in which they are defined. The first specification indicates that, if field 22 on plate 1 at visit 0 equals 6, then cycles 2, 5 and 8 are required; cycles 3, 6 and 9 are not expected; and cycles 4, 7 and 10 are optional. The second specification indicates that, if field 22 on plate 1 at visit 0 equals 5, and field 13 on plate 9 at visit 0 is greater than zero, and field 36 on plate 9 at visit 0 is not equal to 1, then cycle 11 is required. The third specification indicates that, if field 9 on plate 3 at visit 1 begins with the capital letter "A", then cycle 11 is not expected.
If both conditions 2 and 3 are met, cycle 11 will be considered unexpected because, when a conflict occurs, the last condition wins.
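The last-entry-wins rule amounts to applying matched actions in file order and letting later writes overwrite earlier ones. A minimal sketch, with the three conditions of the example above assumed already evaluated against a subject's data:

```python
# Sketch: apply conditional cycle map actions in file order so that,
# on a conflict, the last applicable entry wins.
# Each spec is (condition_met, actions); '+' required, '-' unexpected,
# '~' optional, per the action statement syntax described above.
specs = [
    (True, [("+", [2, 5, 8]), ("-", [3, 6, 9]), ("~", [4, 7, 10])]),
    (True, [("+", [11])]),    # condition 2 met: cycle 11 required
    (True, [("-", [11])]),    # condition 3 met: cycle 11 unexpected
]
state = {}  # cycle number -> '+', '-' or '~'
for met, actions in specs:
    if met:
        for op, cycles in actions:
            for c in cycles:
                state[c] = op
# cycle 11 ends up '-' (unexpected): the last entry won
```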
DFcvisit_map - conditional visit map
This table describes the conditional visit map file structure and provides an example. It does not describe all of the syntax and rules related to this feature. Usage instructions for all 4 conditional maps are fully described in Study Setup User Guide, Conditional Maps.
IF|Visit List|Plate|Field|Value
AND|Visit List|Plate|Field|Value
[+-~]|List of conditional visits
Each condition may be followed by one or more action statements. Each of these statements begins with: '+' to indicate that the visits are required, '-' to indicate that the visits are unexpected, or '~' to indicate that the visits are optional, when the condition is met.
There is no limit to the number of condition/action entries that may be included but the order in which the conditions appear may be important, because in the event of a conflict, the action specified by the last entry is the action that will be applied. This point is illustrated in the following example.
IF|0|1|22|6
+|10-19
-|20-29
~|30
IF|0|1|22|5
AND|0|9|13|>0
AND|0|9|36|!1
+|40
IF|1|3|9|~HIV
-|40
This example consists of 3 conditional specifications. They are applied in the order in which they are defined. The first specification indicates that, if field 22 on plate 1 at visit 0 equals 6, then visits 10 to 19 are required, visits 20 to 29 are unexpected, and visit 30 is optional. The second specification indicates that, if field 22 on plate 1 at visit 0 equals 5, and field 13 on plate 9 at visit 0 is greater than zero, and field 36 on plate 9 at visit 0 is not equal to 1, then visit 40 is required. The third specification indicates that, if field 9 on plate 3 at visit 1 contains the literal string "HIV", then visit 40 is not expected.
If both conditions 2 and 3 are met, visit 40 will be considered unexpected because, when a conflict occurs, the last condition wins.
DFcplate_map - conditional plate map
This table describes the conditional plate map file structure and provides an example. It does not describe all of the syntax and rules related to this feature. Usage instructions for all 4 conditional maps are fully described in Study Setup User Guide, Conditional Maps.
IF|Visit List|Plate|Field|Value
AND|Visit List|Plate|Field|Value
[+-~]Visit List|List of conditional plates
Each condition may be followed by one or more action statements. Each of these statements begins with: '+' to indicate that the plates are required, '-' to indicate that the plates are unexpected, or '~' to indicate that the plates are optional, at the specified visits, when the condition is met.
There is no limit to the number of condition/action entries that may be included but the order in which the conditions appear may be important, because in the event of a conflict, the action specified by the last entry, applicable to each plate, is the action that will be applied. This point is illustrated in the following example.
IF|0|1|22|6
+10,20|50,51
-10,20|40,41
~10,20|15
IF|0|1|22|5
AND|0|9|13|>0
AND|0|9|36|!1
+91-95|16
IF|1|3|9|yes
-91|16
This example consists of 3 conditional specifications. They are applied in the order in which they are defined. The first specification indicates that, if field 22 on plate 1 at visit 0 equals 6, then at visits 10 and 20: plates 50 and 51 are required, plates 40 and 41 are not expected, and plate 15 is optional. The second specification indicates that, if field 22 on plate 1 at visit 0 equals 5, and field 13 on plate 9 at visit 0 is greater than zero, and field 36 on plate 9 at visit 0 is not equal to 1, then at visits 91-95 plate 16 is required. The third specification indicates that, if field 9 on plate 3 at visit 1 contains exactly the string "yes", and nothing more, then plate 16 is not expected at visit 91.
If both conditions 2 and 3 are met, plate 16 will be considered unexpected at visit 91, but required at visits 92-95, because, when a conflict occurs, the last condition wins.
DFcterm_map - conditional termination map
This table describes the conditional termination map file structure and provides an example. It does not describe all of the syntax and rules related to this feature. Usage instructions for all 4 conditional maps are fully specified in Study Setup User Guide, Conditional Maps.
The file contains one or more specifications, each consisting of a condition definition followed by one action to be applied if the condition is met. Entries in the file have the general appearance of:
IF|Visit List|Plate|Field|Value
AND|Visit List|Plate|Field|Value
A or E
Each condition is followed by either the letter 'A' (abort all follow-up), or 'E' (early termination of the current cycle). The termination date is defined as the visit date of the visit that triggered the condition, specifically the visit specified in the IF statement.
IF|0|1|22|6
A
IF|6|1|22|5
AND|6|9|13|>0
AND|6|9|36|!1
E
This example consists of 2 conditional specifications. The first specification indicates that, if field 22 on plate 1 at visit 0 equals 6, then all follow-up terminates as of the visit date for visit 0. Visits scheduled to occur before this date are still expected, but visits scheduled following this date are not.
The second specification indicates that, if field 22 on plate 1 at visit 6 equals 5, and field 13 on plate 9 at visit 6 is greater than zero, and field 36 on plate 9 at visit 6 is not equal to 1, then the current cycle terminates, i.e. the cycle in which visit 6 is defined, with the termination date being the visit date of visit 6. Any visits in this cycle (or in previous cycles) that were scheduled to occur before the termination date are still expected, but visits within this cycle scheduled following this date are not. On termination of a cycle, subject scheduling proceeds to the next cycle in the visit map, if there is one.
DFCRFType_map - CRF type map
Each record in the CRF type map has two fields, an acronym or short form (1st field), and a descriptive label (2nd field). The CRF type acronym and label appear in the Import CRFs dialog. The CRF type label appears in the DFexplore Preferences dialog.
This file is optional. If it does not exist, CRF Types are not used for this study.
PAPER|Print Version
CHINESE|Chinese
ENGLISH|English
SWEDISH|Swedish
FRENCH|French
SPANISH|Spanish
DFcrfbkgd_map - CRF background map
Each record in the CRF background map has three fields: the visit number with the background to be repeated (1st field), the list or range of visit numbers where the background will be repeated (2nd field), and an optional comment field (3rd field).
This file is optional. If it does not exist, the default CRF will be displayed for visits not tagged during the Import CRFs step. If no default CRF has been imported, the CRF background will be blank for untagged visits.
10|11,13-14,16-17,19-20|Quarterly visits
12|15,18|Annual visits
22|25,28|Annual lab work
DFedits.bin - published edit checks
This file contains an internal, binary equivalent of the edit checks stored in the plain text DFedits file. This binary format contains no external references to other included files and is a more compact representation that can be more efficiently transmitted to and interpreted by DFexplore clients.
The DFedits.bin file is created when Publish is selected in DFsetup's edit checks dialog.
This file is the only edit checks file used by DFexplore.
DFfile_map - file map
The file map lists all of the unique plate numbers used in a particular study database. The study server uses this file at start-up to determine which database files it may need to initialize and allocate file descriptors for. In addition to the listed plate numbers, the study server also allocates file descriptors for the new records file (DFin.dat) and the query data file (DFqc.dat).
The format of this file is one record for each plate defined in the study setup. Each record has 4 fields. The first field of the record is the plate number, leading zero-padded to three digits. The second field is a textual description of the plate from the user-supplied definition in DFsetup. The textual part is displayed in the plate selection dialogs that can be found in various study tools. The third field identifies how the visit/sequence number key field is coded for that plate. A code of 1 means that the visit/sequence number is in the barcode; a code of 2 means that it is the first data field on the data entry screen for that plate. Any other code is an error. The fourth field indicates whether the arrival of this plate signals early termination of study follow-up for the subject, as would for example the arrival of a death report. A code of 1 appears if the plate signals early termination, otherwise the code is 2.
The records in the file map do not have to be sorted in increasing plate number order as the file is internally sorted by the study database server at start-up time.
001|Dosage of Study Drug (DOST-2)|2|2
002|Concomitant Medication (COM)|1|2
003|Written Consent (CONS-w)|1|2
005|Diagnosis (DSM-III-R)|1|2
006|Psychiatric History (PSH)|1|2
200|Death Report|1|1
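The 4-field record layout described above can be read into a simple structure. A minimal sketch (the function name and tuple layout are illustrative, not part of DFdiscover):

```python
# Sketch: read DFfile_map records into (plate number, description,
# visit-in-barcode flag, early-termination flag) tuples, per the
# 4-field layout described above (code 1 = yes, code 2 = no).
def parse_file_map(text):
    plates = []
    for line in text.splitlines():
        plate, desc, vcode, term = line.split("|")
        plates.append((int(plate), desc, int(vcode) == 1, int(term) == 1))
    return plates

parse_file_map("200|Death Report|1|1")
# [(200, 'Death Report', True, True)]
```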
DFlogo.png - study logo
A study logo is a small PNG file with maximum dimensions of 64 pixels tall and 128 pixels wide. The study logo is displayed in the top-left corner of report output in DFexplore, in the data entry screen header of DFweb, in the studies list displayed during login, and in several DFadmin dialogs. If no study logo is available, the study name is written to the header of each report output.
The study logo must be created outside of DFdiscover - there is no interface for creating it. Once the file is created, it must be copied to the study folder and saved as lib/DFlogo.png.
DFlut_map - lookup table map
Lookup tables are used to select results from a list of pre-defined values and insert them into DFdiscover data fields. Lookup tables can also be defined for the query, query note, and reason fields to allow users to select pre-specified text for these fields. Lookup tables are described in Lookup tables.
A study setup may use multiple lookup tables and so the lookup table map is used to associate simple table names with more complex and lengthy full pathnames of the actual lookup tables.
Each record in the lookup table map has a lookup table name, followed by a |, and a filename containing the lookup table. DFdiscover will search for the lookup table filename first in the $(STUDY_DIR)/lut directory and if that fails in the /opt/dfdiscover/lut directory. The table name is a symbolic name that can be referenced in edit checks.
The special table names QC, QCNOTE and QCREPLY are used to associate a lookup table with the detail, note and reply fields, respectively, in the DFexplore Add/Edit/Reply Query dialogs. The special table name REASON is used to associate a lookup table with the reason code and text for data changes.
QC|DF_QClut
QCNOTE|DF_QCnotelut
QCREPLY|DF_QCreplylut
REASON|DF_Reasonlut
MEDS|meds.dict
COSTART|costart.dict
WHO|who.dict
DFmissing_map - missing value map
Each record in the missing value map has two fields, the missing value code (1st field), and a descriptive label for the code (2nd field). The missing value code appears verbatim in any data field which has been identified as missing with that code.
The missing value map lists all missing value codes used in the study. These codes can be entered into any data field by selecting the desired code from the list available under Field > Mark Field Missing in DFexplore.
Note that the field delimiter, |, which is used in all DFdiscover databases, cannot be used as a missing value code (for obvious reasons). Any other single (or multiple) character is acceptable.
This file is optional. If it does not exist, a single missing value of * (asterisk) - Not Available is available by default. If it does exist but is empty, then no missing values are permitted.
WARNING: Changes to the missing value map for an in-progress study: DFdiscover determines if each data value is a missing value code by comparing the value with each of the missing value codes in the map. If there is a match, DFdiscover handles that data value as a missing value. If there is no match, the data value is handled as a normal data value. This is important to remember because changes to the missing value map after a study has started and data has been entered can result in DFdiscover handling data values in a manner which is different from the handling when the value was originally entered. For example, defining .A as a missing value code at study start and then subsequently deleting that code from the missing value map will result in DFdiscover treating each occurrence of .A in the current database as a normal data value.
*|Not Available
.|Not Applicable
-|Temporarily Unavailable
.U|Unknown
DFpage_map - page map
This file is optional for a study. If defined, it must be located in STUDY_DIR/lib/DFpage_map. The page map allows one to specify non-default labels, to identify visits and CRF pages, on Query Reports and in DFexplore's subject binder view. If a page map file is not defined, queries and pages in the binder are identified by the visit number and plate number keys of the CRF.
The first record in a page map contains only a single field which specifies a text label to be used as a title over the queries. The default label is:
PLATE SEQNO
This label appears at the top of the Fax/Refax List and Question & Answer List sections of the Query Reports.
The remaining records describe the text labels to be used in place of the default plate and visit number identifiers.
The first field is the plate number, the second field is the visit/sequence number, and the third field is the substitution to be used for that plate and visit/sequence combination. The third field can contain %P and %S, which are replaced with the plate number and the visit/sequence number fields from the query, respectively. It may also contain parts of the plate number and/or sequence number fields by using the notation %{n.P}, %{P.n}, %{n.S}, or %{S.n}. %{n.P} returns the leading n digits of the plate number, while %{P.n} returns the trailing n digits of the plate number. Similarly, %{n.S} and %{S.n} produce the n leading and trailing digits of the sequence number. The third field may also contain the notation %#, which is replaced with the value stored in field n of the data record matching the specified plate and sequence number. Additionally, when using the %# notation, and for data fields that have a data type of choice or check, it is possible to request that the reported value be decoded by using the %n:d notation. This substitutes the label associated with the value, instead of the value itself.
To implement %# or %n:d, the data record must be available. This is always true for Query Reports. However, in DFexplore only the currently visible data record is available at any moment. The result is that in the subject binder, for any record other than the current record, use of %# or %n:d is substituted with ???. This limits the usefulness of this notation in DFexplore.
When a Query Report is created, those queries whose plate and visit numbers match the first and second fields, will be identified on the Query Report by the label which appears in the third field.
If either the first field or second field contains a *, all values for that field that have not yet matched a previous rule will use the format in the third field.
For more information, see Study Setup User Guide, Page Map.
PAGE (PLATE-SEQ)
018|001|MED VISIT1
*|001|VISIT1 (%P-%S)
025|*|TERM (%P-%S)
*|*|(%P-%S)
Note how the last rule catches all plate and visit/sequence pairs that were not previously matched.
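The label substitutions above can be sketched in a few lines. This is an illustration under stated assumptions, not DFdiscover's implementation; the %# field-value substitution is omitted since it requires access to the data record.

```python
# Sketch: expand a page map label, supporting the %P, %S, %{n.P},
# %{P.n}, %{n.S} and %{S.n} substitutions described above.
import re

def expand_label(fmt, plate, seq):
    p, s = str(plate), str(seq)
    def braced(m):
        left, right = m.group(1).split(".")
        if left.isdigit():               # %{n.P} / %{n.S}: leading n digits
            val = p if right == "P" else s
            return val[:int(left)]
        val = p if left == "P" else s    # %{P.n} / %{S.n}: trailing n digits
        return val[-int(right):]
    fmt = re.sub(r"%\{([^}]+)\}", braced, fmt)
    return fmt.replace("%P", p).replace("%S", s)

expand_label("VISIT1 (%P-%S)", "001", "001")   # "VISIT1 (001-001)"
expand_label("(%{2.P}-%{S.1})", "025", "013")  # "(02-3)"
```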
DFqcsort - Query and CRF sort order
A query sort order map can be specified to control the order in which queries appear on the reports created by DF_QCreports.
If DFqcsort is not defined, queries will appear on Query Reports sorted by ascending subject ID, then visit number, then plate, and finally data field number. A different sort order can be specified to reorder queries from particular visits and/or plates. For example, one might want all queries from adverse event reports to appear at the top of each subject's list. This can be done with DFqcsort.
The file contains one or more records in the following format:
plate#|visit#|sort_order#
where sort_order# is an integer-valued sort priority such that a lower value indicates an earlier sort order. There is no minimum or maximum value for sort_order#, so negative numbers can be used to give higher priority than sort_order#s of zero or greater. If two or more entries assign the same sort_order#, those entries are sorted in the default manner of ascending plate number within ascending visit number.
Within each record the plate number, or the visit number, can be replaced by * to signify all plates for a specified visit, or all visits for a specified plate. In such cases queries are sorted numerically on the unspecified field within the specified field. For example:
12|*|1
specifies that, for each subject, all queries on plate 12 are to be given a sort priority of 1 as compared to other queries. Since no visit number is specified, if queries appear on plate 12 at more than one visit for a given subject, they will appear on the Query Report sorted by visit/sequence number. * may also be specified for both plate and visit numbers (to catch all plate and visit combinations that were not otherwise specified), but this is not necessary as this is given the lowest sort priority by default.
Note that while sort order control is provided over plates and visits, no control is provided at the individual field level. Queries on a given plate are always sorted in field number order. Also queries and records are always sorted by ascending subject ID. Thus the control provided here does not allow for all possible sort orders. It merely provides a mechanism for controlling query ordering by plate and visit numbers.
12|*|1   (1)
*|1|2    (2)
25|2|3   (3)
13|2|4   (4)
5|2|5    (5)
*|2|6    (6)
*|*|7    (7)
(1) Queries on plate 12 of any visit appear first
(2) Queries for visit 1 appear next (in plate number order)
(3) Queries on plate 25 of visit 2 appear next
(4) Queries on plate 13 of visit 2 appear next
(5) Queries on plate 5 of visit 2 appear next
(6) Queries on all remaining plates of visit 2 appear next
(7) All the rest appear in the default order (i.e. plate# within visit#)
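The sort rules above can be sketched in Python. This is an illustration only, not part of DFdiscover; in particular, the precedence when several wildcard rules match the same query (exact match, then plate-only, then visit-only, then the catch-all *|* record) is an assumption made for this sketch.

```python
# Sketch (not part of DFdiscover): apply DFqcsort-style priorities when
# sorting queries. A query is (subject, visit, plate, field); rules map
# (plate, visit) -> sort_order#, with '*' standing for any value.

DEFAULT = 2**31  # combinations not listed in DFqcsort sort last

def load_rules(lines):
    """Parse 'plate#|visit#|sort_order#' records."""
    rules = {}
    for line in lines:
        plate, visit, order = line.strip().split("|")
        rules[(plate, visit)] = int(order)
    return rules

def priority(rules, plate, visit):
    # Assumed precedence for illustration: exact match, then plate-only,
    # then visit-only, then the catch-all *|* record.
    for key in ((str(plate), str(visit)), (str(plate), "*"),
                ("*", str(visit)), ("*", "*")):
        if key in rules:
            return rules[key]
    return DEFAULT

def sort_queries(rules, queries):
    # Ties fall back to the default order (visit, plate, field), always
    # within ascending subject ID.
    return sorted(queries, key=lambda q: (q[0],
                  priority(rules, q[2], q[1]), q[1], q[2], q[3]))
```

With the rules 12|*|1 and *|1|2, all plate 12 queries for a subject sort first, then the remaining visit 1 queries, then everything else in the default order.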
DFqcproblem_map - Query category code map
Each record in the Query category code map has four fields, the category code (1st field), the descriptive label (2nd field), the auto-resolve value (3rd field), and the sort value (4th field). The query category code appears verbatim in any data field which has been assigned a query with that code.
The pre-defined categories are assigned integer codes from 1-6 and 21-23 (inclusive). User-defined categories must be assigned integer codes ranging from 30-99 (inclusive). Category codes must be unique.
The problem labels must have a length ranging from 1-20 characters (inclusive), must be unique, and are case insensitive.
The auto-resolve field may be set to No (0) or Yes (1). With auto-resolve set to Yes, an outstanding query with this category code will be automatically resolved if a new reason is added to the corresponding data value.
The sort value may be set to any integer value between -2147483648 and 2147483647. Category types are sorted in ascending order by sort value, then by code.
The query category code map lists all query category codes in the study. A query with one of these types can be entered into any data field by selecting the desired type from the category code list available under Field > Add Query in DFexplore.
When a new study is started, the file is created. By default, the file is populated with the following:
1|Missing|1|0
2|Illegal|1|0
3|Inconsistent|1|0
4|Illegible|1|0
5|Fax Noise|1|0
6|Other|1|0
21|Missing Page|0|0
22|Overdue Visit|0|0
23|EC Missing Page|0|0
For example, a study might extend the defaults with user-defined categories:

1|Missing|1|0
2|Illegal|1|0
3|Inconsistent|1|0
4|Illegible|1|0
5|Fax Noise|1|0
6|Other|1|0
21|Missing Page|0|0
22|Overdue Visit|0|0
23|EC Missing Page|0|0
30|Clinical QC|1|1
50|Needs Review|0|1
37|Refuted|1|2
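A minimal validation sketch for these records, in Python (a hypothetical helper, not a DFdiscover tool; it checks the code ranges, label length and uniqueness, auto-resolve flag, and 32-bit sort value described above):

```python
# Sketch: validate DFqcproblem_map records (code|label|auto-resolve|sort).

def validate_map(lines):
    codes, labels = set(), set()
    for n, line in enumerate(lines, 1):
        code_s, label, resolve, sort_s = line.strip().split("|")
        code = int(code_s)
        # Pre-defined codes are 1-6 and 21-23; user-defined codes are 30-99.
        if not (1 <= code <= 6 or 21 <= code <= 23 or 30 <= code <= 99):
            raise ValueError(f"record {n}: code {code} out of range")
        if code in codes:
            raise ValueError(f"record {n}: duplicate code {code}")
        if not 1 <= len(label) <= 20:
            raise ValueError(f"record {n}: label length must be 1-20")
        if label.lower() in labels:  # labels are case insensitive
            raise ValueError(f"record {n}: duplicate label {label!r}")
        if resolve not in ("0", "1"):
            raise ValueError(f"record {n}: auto-resolve must be 0 or 1")
        if not -2147483648 <= int(sort_s) <= 2147483647:
            raise ValueError(f"record {n}: sort value out of 32-bit range")
        codes.add(code)
        labels.add(label.lower())
    return len(codes)
```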
.DFreports.dat - reports history
Users create and save report history lists (including options) in much the same way that they create and save task lists.
The reports history list for all study users is stored in this file in the study lib directory. Each history list is represented by a single record in this file. Each user may have more than one history list.
History lists are created in DFreport by executing the desired reports and then saving the history list. There is no text-based interface to this file or its contents.
DFschema - database schema
The schema, or data dictionary, is generated/updated automatically by DFsetup whenever a save is required for the study definition. Each field is composed of a key letter (with a leading % character) to indicate the field type, a space character, and the value of the field. This is essentially UNIX 'refer' format.
The defined key letters and corresponding field types are listed in Schema codes and their meaning.
The T code requires additional explanation. Its value is the variable type (one of [choice, check, int, date, string]) followed by the setup style name. If the variable type is date, the style name is followed by the variable's pivot year, the date imputation method, and whether the date is used for scheduling purposes or not. The pivot year represents the first possible year in those dates where the year is only 2 digits. The imputation method is recorded as an integer code.
The scheduling attribute is unused by most programs except DF_QCupdate, which searches the schema file for date fields that use the Visit Date attribute. These date fields are used to determine subject visit scheduling. DFsas also uses this information to determine the correct year value to include in the YEARCUTOFF option statement.
Each study schema begins, in order, with records defining values for the S, B, N, Y and then Z codes. If the study is linked with another study, then a and b fields are included, identifying the production and development study numbers. If tag definitions for custom properties are given, then one or more z records follow. These definitions document the relationship between the standard name of the custom property and a tag that is defined to replace the standard name in future use. For example,
%z DFSTUDY_USER1 TITLE
defines TITLE as the tag for the custom property with the standard name of DFSTUDY_USER1. Where values are defined for custom properties, y records follow. Such definitions connect a custom property to a value. The custom property is referenced by its tag, if one is defined, otherwise by its standard name. For example,
%y TITLE Acme Pharma Next Gen Skin Rejuvenator
defines Acme Pharma Next Gen Skin Rejuvenator as the value for the TITLE custom property.
Thereafter, each new plate definition begins with a record defining values for the P, t, n, E, and R codes. Plate definitions may also include y and z records for plate level custom properties.
This is followed by records for each variable in that plate. Each variable has a record that begins with an I field, followed by at least i, V, D, W, w, T, and A fields. The v, F, L, H, J, j, K, k, g, h, s, c, and C fields are included only if applicable. Variable definitions may also include y and z records for variable level custom properties.
NOTE: This file is present for users and programs that require backwards compatibility and are not able to read the JSON study definition. This file may be removed in a future release.
%S 254
%N 12
%B 1990
%Y 0 0
%Z 40
%P 1
%p Blood Pressure Screening Visits
%n 35
%t simple
%E 1
%R PrintCRF.shx
%I 1
%i 10
%V DFstatus1
%v DFSTATUS
%D DFdiscover Record Status
%T choice SimpleChoice
%A required
%W 1
%L 0~6
%C 0 lost
%C 1 final
%C 2 incomplete
%C 3 pending
%C 4 FINAL
%C 5 INCOMPLETE
%C 6 PENDING
%I 2
%i 11
%V DFvalid1
%v DFVALID
%D DFdiscover Validation Level
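Reading this refer-style format can be sketched in Python (illustrative only; the helper simply pairs each key letter with its value and groups variable definitions at each I field, without interpreting the individual codes):

```python
# Sketch: parse a refer-style schema into (code, value) pairs and group
# the fields belonging to each variable definition.

def parse_schema(text):
    """Return a list of (key letter, value) pairs for lines like '%V name'."""
    fields = []
    for line in text.splitlines():
        if line.startswith("%") and len(line) >= 2:
            code, _, value = line[1:].partition(" ")
            fields.append((code, value))
    return fields

def variables(fields):
    """Group fields into per-variable definitions; each starts at an I field."""
    groups, current = [], None
    for code, value in fields:
        if code == "I":
            current = []
            groups.append(current)
        if current is not None:
            current.append((code, value))
    return groups
```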
DFschema.stl - database schema styles
This file is very much like the study schema (DFschema - database schema), but instead of variable definitions it simply catalogs all of the variable styles used in DFsetup to define new variables. Like DFschema, the file is generated/updated automatically by DFsetup whenever a save is required for the study definition.
Note that the style schema lists only field types that are defined by the style itself. It omits those definitions that are specified at the variable level.
%S 254
%N 51
%I 1
%T int SimpleNumber
%A optional
%I 2
%T date SimpleDate 1940 0
%A optional
%W 8
%F dd/mm/yy
%I 3
%T string SimpleString
%A optional
%I 4
%T choice SimpleChoice
%A optional
%L $(choices)
%l 0
%c 0
Schema codes and their meaning
a compound value containing, in order:
data type, one of string, int, date, choice, check, time
style name (which may contain spaces)
for dates only, the cutoff year
for dates only, the imputation method
for dates only, one of NonSched or VisitDate
DFsetup - study definition
The setup file contains the study definition in JSON format that can be read and written efficiently by DFsetup. It keeps the internal state of the program together with all of the information about study, page, and variable definitions. Many other study configuration files, such as the ICR tips file and study database schema, are constructed from this internal representation.
WARNING: Internal use only: This file format is intended to be written by DFsetup only. It may be read and processed by other programs that need to determine setup information, but its internal structure is subject to change without notice.
DFsetup.db - sqlite setup database
All setup information and setup change history is stored in an SQLite database containing several tables.
DFserver.rpc updates DFsetup.db when the DFsetup JSON file and DFschema are updated.
DFedcservice and DFsetup use DFsetup.db to display the setup audit trail.
The DFsetupdb utility converts an existing setup and audit trail to DFsetup.db.
DFserver.cf - server configuration
STUDY_NUMBER=254
STUDY_NAME=DFdiscover Acceptance Test Study
STUDY_DIR=/opt/studies/val254
PAGE_DIR=$(STUDY_DIR)/pages
WORKING_DIR=$(STUDY_DIR)/work
PRINTER=hp4000
THRESHOLD=500000
STUDY_URL=http://www.ourstudy.com/index.html
LOCAL_CACHE=1
VERSION_STRICT=0
AUTO_LOGOUT=5|30
ADD_IDS=1||
VERIFY_IDS=0|
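Parsing these keyword=value records, including $(KEYWORD) substitution as in PAGE_DIR=$(STUDY_DIR)/pages, might be sketched as follows. This is a simplified illustration: blank-line and comment handling, and the bounded expansion loop, are assumptions, not documented DFserver.cf behavior.

```python
import re

# Sketch: read KEYWORD=value records and expand $(KEYWORD) references.

def parse_cf(lines):
    cf = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cf[key] = value
    # Expand $(NAME) references; unknown names are left untouched.
    pat = re.compile(r"\$\((\w+)\)")
    for _ in range(8):  # bounded to avoid looping on circular references
        expanded = {k: pat.sub(lambda m: cf.get(m.group(1), m.group(0)), v)
                    for k, v in cf.items()}
        if expanded == cf:
            break
        cf = expanded
    return cf
```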
DFserver.cf configuration keywords
STUDY_NUMBER
STUDY_NAME
STUDY_DIR
PAGE_DIR
WORKING_DIR
PRINTER
PRINT_CMD
THRESHOLD
AUTO_LOGOUT
VERSION_STRICT
LOCAL_CACHE
STUDY_URL
ADD_IDS
VERIFY_IDS
DFsubjectalias_map - subject alias map
This file is optional for a study. If defined, it must be located in $(STUDY_DIR)/lib/DFsubjectalias_map. The subject alias map allows the specification of "aliases" (descriptive labels), each up to 30 UNICODE characters in length, that may be used in place of the standard numeric subject identifier. If a subject alias map is defined, and has been enabled as a preference for the current user, the subject alias appears everywhere in DFdiscover that the numeric subject ID would normally appear. Otherwise, the numeric subject ID is displayed.
NOTE: There are a few places where the subject alias is not displayed, for example legacy reports.
The file contains zero or more rows, each row containing 2 fields and the fields are delimited by |. The value in the first field is the numeric subject ID and the value in the second field is the matching alias for that subject ID. Each subject ID must be unique and each subject alias must also be unique.
For more information, see Study Setup User Guide, Subject Alias Map.
150040|T1-B-40
150041|T1-B-41
150042|T1-B-42
150043|T1-B-43
150044|T1-B-44
150045|T1-B-45
200006|T2-A-6
200007|T2-A-7
200008|T2-A-8
200009|T2-A-9
200010|T2-A-10
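The uniqueness and length rules can be checked with a short sketch (a hypothetical helper, not a DFdiscover tool):

```python
# Sketch: enforce the DFsubjectalias_map rules -- two |-delimited fields
# per row, unique subject IDs, unique aliases, aliases of at most 30
# characters.

def check_alias_map(lines):
    mapping, aliases = {}, set()
    for n, line in enumerate(lines, 1):
        subject, alias = line.rstrip("\n").split("|")
        if len(alias) > 30:
            raise ValueError(f"row {n}: alias exceeds 30 characters")
        if subject in mapping:
            raise ValueError(f"row {n}: duplicate subject ID {subject}")
        if alias in aliases:
            raise ValueError(f"row {n}: duplicate alias {alias!r}")
        mapping[subject] = alias
        aliases.add(alias)
    return mapping
```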
DFsubjectalias_map.log - subject alias change log
This file is optional for a study. If defined, it must be located in $(STUDY_DIR)/lib/DFsubjectalias_map.log. The contents of the file are managed entirely by DFserver.rpc - it is not meant for external editing.
The file contains zero or more rows, each row containing 7 fields where the fields are delimited by |. Each row logs the addition, editing or deletion of a single, specific subject alias.
200703|174748|userB|n|500003|T5-B-3|
200703|180202|userB|n|250009|T2-A-259|
200703|180219|userB|c|250009|T25-A-9|T2-A-259
200722|112345|userA|d|200004||T2-A-4
200722|112345|userA|d|200005||T2-A-5
Field definitions for DFsubjectalias_map.log
DFtips - ICR tips
The purpose of this file is to document the location, size, and type of each data field on each study plate. This information is parsed and used by DFinbound.rpc during its ICR processing of each CRF image file.
The tips file begins with 4 comment lines which identify the study number, name, number of CRF plates that have been defined and the date on which the tips file was created. This is followed by the keyword DELIMITER which identifies the single character delimiter used to separate fields in data records. The current implementation of DFdiscover accepts only the | delimiter.
The above header information is followed by N plate definitions, where N is the number of unique plates defined for the study. Each plate definition begins with plate specific information followed by M variable definitions, where M is the number of variables defined on that plate. Each variable definition includes: the data field number and unique name, the data type, legal values (if any were specified), the size of the boxes, or VAS line, used to record the data field, and its location on the CRF image that was used to define the plate in DFsetup.
Each field in the tips file has a leading keyword separated from the value by a \t (TAB) character.
The plate specific keywords are listed in DFtips plate specific keywords. The keywords recognized for variable definitions are listed in DFtips variable specific keywords.
# Study Number: 253
# Study Name: Demo Study 253
# Total Pages: 14
# Created: Wed Dec 1 12:33:29 2004
DELIMITER |
PAGE 1
PLATE 1 # Blood Pressure Screening Visits
SEQ_CODE 1
PREOP echo "$image $data" >> /tmp/test
DO_ICR 2
# FAX: plt001
FIELD 1 # ID001
TYPE int
LEGAL $(ids)
PART 106 84 2 50 25
PART 167 84 3 74 25
.
FIELD 2 # PINIT001
TYPE string skip
PART 380 84 3 74 25
.
FIELD 3 # AGE
TYPE int
LEGAL 21-90
PART 79 182 2 50 25
.
FIELD 4 # SEX
TYPE choice
LEGAL 0,1,2, blank
PART 0 0 0 0 0 # blank
PART 277 189 1 1 15 # male
PART 277 212 1 2 15 # female
.
FIELD 5 # RACE
TYPE choice
LEGAL 0-65535
PART 0 0 0 0 0 # no choice
PART 464 188 1 1 15 # Caucasian
PART 464 213 1 2 15 # African American
PART 464 238 1 3 15 # Asian
PART 464 263 1 4 15 # Other
.
FIELD 6 # RACEOTH
TYPE string skip
PART 588 257 1 107 26
.
FIELD 7 # S1DATE
TYPE date
FORMAT dd/mm/yy
DATES 1940 0
LEGAL 01/01/01-today
PART 172 330 2 49 25
PART 232 330 2 50 25
PART 295 330 2 49 25
DFtips plate specific keywords
DFtips variable specific keywords
the pivot year and imputation method to be used for interpreting 2-digit years (applicable only if the current field is of type date). The imputation method is recorded as an integer code.
DFvisit_map - visit map
The visit map file describes the subject assessments to be completed during the study, the timing of these assessments, and the pages of the study CRFs which must be completed at each assessment.
Each assessment is identified by a visit type. There must always be a baseline visit which is typically the date on which the subject qualified for entry to the trial and was randomized. There must also be a termination visit which ends study follow-up. Between baseline and termination there are often several scheduled visits, subject diaries, laboratory tests, and perhaps a few unscheduled visits. At each of these visits there will be a set of required (and possibly optional) forms to be completed.
Each visit is defined in a single record of the visit map. The fields in each record are described in DFvisit_map field descriptions.
For additional information regarding visit maps, refer to Standard Reports Guide, DF_QCupdate and Study Setup User Guide, Visit Map.
A simple 4-visit visitmap:
0|B|Baseline|1|9 (mm/dd/yy)|0|0| 1 2 3 4 5 6 7 8||99|3 4 1 2 5 6 8 7
10|S|One Week Followup|9|9 (mm/dd/yy)|7|0| 9 10 14||||
20|S|Two Week Followup|9|9 (mm/dd/yy)|14|0| 9 10||||
30|T|Termination Visit|9|9 (mm/dd/yy)|21|0| 11 12||||
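Splitting such records can be sketched as follows. This interprets only the fields that are evident from the example (visit number, visit type, label, and the space-separated plate list in the 8th field); the full field semantics are given in the field-description table, and the "scheduled" reading of type S is an assumption.

```python
# Sketch: split visit map records into their |-delimited fields and read
# the space-separated plate list.

def parse_visit_map(lines):
    visits = []
    for line in lines:
        f = line.rstrip("\n").split("|")
        visits.append({
            "visit": int(f[0]),
            "type": f[1],   # B=baseline, T=termination; S assumed scheduled
            "label": f[2],
            "plates": [int(p) for p in f[7].split()],
        })
    return visits
```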
DFvisit_map field descriptions
for visit type W, a termination window is required and may be one of the following forms:
In each case, the date value must use the format that is defined as the VisitDate attribute's format (and is also recorded in field 5).
Visit codes and their meaning
DFwatermark - watermark
The DFwatermark file stores a description of the watermarks available for use in a study. The file contains zero or more records. The fields in each record are described in DFwatermark field descriptions.
In determining which watermark to use, the records are searched in sequential order from the beginning of the file. The first record to match, based upon the current user's role and the role(s) in the record, is selected.
draft| Draft watermark|DRAFT|20|255,0,0|50|1|1|unrestricted,Sites
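The first-match rule can be sketched as follows. Two assumptions are made for illustration (consult DFwatermark field descriptions for the actual layout): the role list is the last |-delimited field, comma separated, and a role list containing "unrestricted" matches any role.

```python
# Sketch: scan watermark records in file order and return the name of the
# first record whose role list matches the current user's role.

def select_watermark(records, user_role):
    for line in records:
        fields = line.rstrip("\n").split("|")
        roles = [r.strip() for r in fields[-1].split(",")]
        if user_role in roles or "unrestricted" in roles:
            return fields[0]  # assumed to be the watermark name
    return None  # no record matched
```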
DFwatermark field descriptions
QCcovers - Query cover pages
External Quality Control report cover sheets can be included by defining them in the file QCcovers. They may then be requested by running DF_QCreports with no -i option or running DF_QCreports with -i c. If no options are specified with DF_QCreports, a full 3-part Query Report and cover sheet will be generated. If -i is used, you must also specify other report parts to be generated (e.g. s, r, q, c). The -i c option may be specified with the site selection option, -c #. This allows the user to generate cover sheets only for those site IDs specified. The -i c option by itself will generate a cover sheet and nothing else for all sites.
A cover sheet begins with a <FOR center="#list"> tag to identify the site(s) that are to receive the cover sheet. It is legal for some sites to receive cover sheets while others do not, and for different sites to receive different cover sheets.
Cover sheets are defined using a <TEXT> block. Within <TEXT> blocks, variable substitution may be performed as described in QCtitles for customized report titles. Aside from variable substitution, all other text will appear exactly as formatted within the text block. Blank lines used to double space text will also be preserved. The default font for cover sheet text is the constant width font, fCW.
If more than one cover sheet is defined and a site ID is included in the <FOR center="#list"> for more than one cover sheet, all <TEXT> blocks from all cover sheets specified for the site ID will be added to that site's cover sheet. The <TEXT> blocks will appear in the order they appear in QCcovers, and the site will still receive only one cover sheet.
Sites not specified in either QCcovers or QCmessages do not get a cover sheet. If a site is not defined in QCcovers, but that site is named in QCmessages, a blank cover sheet is generated onto which the message(s) can be written.
DF_QCreports does not perform line wrapping on cover sheets. Each text block is printed exactly as it appears in QCcovers and QCmessages. Care must be taken when choosing fonts and using variable substitutions to ensure that text lines do not exceed the width of the page. DF_QCreports also expects that the cover sheet and all messages will fit on a single cover page for each site. It will not create automatic page breaks. DFdiscover version 3.7-004 and later no longer requires the ^P. For example, <TEXT font="12 fB"> is a valid font specification and may be defined using the Query Covers view in DFsetup.
The following is an example of an English and French version of a cover sheet for an international trial, where sites 20 to 29 inclusive use French, and all other sites use English.
<FOR centers="1~19,30~300">
<TEXT font="^P12 fB">
--------------------------------------------------------------
PLEASE DELIVER THIS REPORT TO THE FITNESS STUDY COORDINATOR
TO: <WHO>
Site <CENTER>: <WHERE>
<DATE>
</TEXT>
</FOR>
<FOR centers="20~29">
<TEXT font="^P12 fB">
-------------------------------------------------------------
DONNEZ CE RAPPORT AU COORDINATOR DE RECHERCHE POUR
FITNESS, S'IL VOUS PLAIT
TO: <WHO>
Site <CENTER>: <WHERE>
<DATE>
</TEXT>
</FOR>
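Deciding which sites match a centers="#list" value (comma-separated entries, with ~ marking an inclusive range, as in the examples above) can be sketched as:

```python
# Sketch: test whether a numeric site ID falls inside a '#list' value
# such as "1~19,30~300" or "1,4,22~24,127".

def in_center_list(center_list, site):
    for part in center_list.split(","):
        if "~" in part:
            lo, hi = part.split("~")
            if int(lo) <= site <= int(hi):
                return True
        elif part and int(part) == site:
            return True
    return False
```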
QCmessages - Query Report messages
Cover sheets may include messages for all or specified site IDs. Messages are stored in the file QCmessages and follow the same rules as described for QCcovers. While messages can be added to cover sheets by placing them in <TEXT> blocks in the QCcovers file, the use of a separate file for messages makes it easier to keep the fixed (probably unchanging) cover sheet header separate from the broadcast messages, which will change frequently throughout the trial.
The file QCmessages may include more than one message, and each site may receive none, some or all of them, as indicated by <FOR centers="#list"> which must appear at the beginning of each message. If a site is to receive more than one message, the messages will be printed on the cover sheet in the order in which they are defined in QCmessages.
It is unnecessary to define a cover sheet in order to use messages. If a message is specified for a site which has not been defined in QCcovers, a blank cover sheet will be created, onto which the message(s) will be printed. Cover sheets and messages must be requested by running DF_QCreports with the -i c option or by running DF_QCreports with no -i at all.
DF_QCreports does not perform line wrapping on cover sheets or messages. Each text block is printed exactly as it appears in QCmessages. Care must be taken when choosing fonts and using variable substitution to ensure that text lines do not exceed the width of the page. DF_QCreports also expects that the cover sheet and all messages will fit on a single cover page for each site. It will not create automatic page breaks. Messages will continue to go on each report until the QCmessages file is removed. If you want to keep a message for use later on, but do not want to send it to anyone in the next batch of Query Reports, leave the site number string blank, e.g. use <FOR center="">. DFdiscover version 3.7-004 and later no longer requires the ^P. For example, <TEXT font="12 fB"> is a valid font specification and may be defined using the Query Messages view in DFsetup.
In the following example all site IDs (1 to 300 inclusive) receive the announcement of an upcoming meeting. For site IDs 1, 4, 22 to 24, and 127, this is followed by an important message regarding their lack of response to data queries.
<FOR center="1~300">
<TEXT font="^P12 fB">
--------------------------------------------------------------
PLEASE NOTE:
</TEXT>
<TEXT font="^P10 fCW">
The next FITNESS Study investigator's meeting will be held on
October 9-10, 1999 in Orlando, Florida. Location and times
will follow at a later date.
</TEXT>
</FOR>
<FOR center="1,4,22~24,127">
<TEXT font="^P12 fB">
--------------------------------------------------------------
IMPORTANT!
</TEXT>
<TEXT font="^P10 fCW">
We have not received any corrections related to the Quality
Control reports we have sent to your site over the last 2
months. Please make every attempt to resolve the outstanding
data queries as soon as possible. By protocol and agreement
with the FDA, all adverse events must be reviewed by us within
3 days of each subject assessment. If you are having problems
of any kind, please let us know.
</TEXT>
</FOR>
QCtitles - Query Report titles
The external or internal report title and sub-titles for each of the 3 sections (Subject Status, Refax List, Q & A List) of a DFdiscover Quality Control report may be specified in this file. DF_QCreports checks for the existence of this file and will use it if it exists, otherwise standard titles will be produced.
Title specifications must be formatted exactly as shown in the examples to follow. The opening and closing tags must appear on new lines by themselves, with no leading or trailing text outside of the tag. Anything entered outside of a tagged block is ignored. The # symbol may be used to indicate a comment line but the # is not strictly needed, and has no special meaning inside a tagged block. For example, it is legal to include a line of # as part of a title inside a tagged block.
Tags are only recognized if they begin a new line. Nothing is to precede the tag symbol '<'. No space is allowed within the opening and closing tag except before the word "font". The "font" value must be enclosed in double quotes.
Limited font specifications may be included within the opening tag as listed in QCtitles font specifications.
The font specification is optional. By default, font ^P10 fCW is used for external and internal report titles, and font ^P10 fB is used for the 3 report section sub-titles. Note that in DFdiscover versions 3.7-004 and later the ^P specification is no longer required, but it is accepted if present. For example <EXTERNAL font="12 fB"> is a valid font specification.
The variables enumerated in Variables available for use in QCcovers, QCmessages, and QCtitles may be included in the external and internal report titles which appear at the top of each page, but not in the sub-titles for the 3 parts of the report (status, correction queries, and clarification queries).
DF_QCreports does not perform line wrapping on report titles. Each report title and section sub-title is printed exactly as it appears in QCtitles. Care must be taken when choosing fonts and using variable substitutions to ensure that text lines do not exceed the width of the page.
Here is an example of a QCtitles file that contains formatting information for an external and internal report title and the 3 section sub-titles.
# TITLE AT THE TOP OF EACH PAGE OF AN EXTERNAL QUERY REPORT
<EXTERNAL font="^P12 fB">
FITNESS STUDY REPORT <NUM> for: <MON> <DAY>, <YEAR> at <TIME>
TO: <WHO>, <WHERE>
</EXTERNAL>
# TITLE AT THE TOP OF EACH PAGE OF AN INTERNAL QUERY REPORT
<INTERNAL font="^P12 fB">
INTERNAL QC REPORT: <NAME> page <PAGE>
created: <YEAR>-<MON>-<DAY>
</INTERNAL>
# SUB-TITLE ABOVE THE SUBJECT STATUS LIST
# (Part 1 of a standard external Query Report)
<STATUSLIST font="^P10 fB">
SUBJECT VISIT SCHEDULE:
Note: * identifies subjects with data queries.
</STATUSLIST>
# SUB-TITLE ABOVE REFAX LIST
# (Part 2 of a standard external Query Report)
<REFAXLIST font="^P10 fB">
CRF CORRECTIONS: Initial and date each correction.
Resend all corrected pages without delay.
</REFAXLIST>
# SUB-TITLE ABOVE THE QUESTION AND ANSWER LIST
# (Part 3 of a standard external Query Report)
<QALIST font="^P10 fB">
QUERIES: Print your response below each question and send back
this page.
</QALIST>
QCtitles font specifications
Variables available for use in QCcovers, QCmessages, and QCtitles
<STUDYNAME>
<SITE>
<CENTER>
<WHO>
<WHERE>
<MAIL>
<FAX1>
<FAX2>
<PHONE1>
<PI>
<PHONE2>
<REPLYTO>
<NUM>
<NAME>
<PAGE>
<DAY>
<MON>
<YEAR>
<WKDAY>
<TIME>
<DATE>
DFreportstyle - DFdiscover Report styling
<script type="application/json" id="dfreportjson">{
  "palettes": {
    "Corporate": ["#26456e", "#3a87b7", "#b4d4da", "#1c5998", "#67add4", "#1c73b1", "#7bc8e2"],
    "Diverging": ["#000066", "#6699ff", "#ffff00", "#3333ff", "#99ccff", "#ffff99", "#cc9900", "#3366ff", "#ccccff", "#ffff66", "#663300"],
    "Monochrome": ["#1a1a1a", "#666666", "#b3b3b3", "#333333", "#808080", "#cccccc", "#4d4d4d", "#999999", "#e6e6e6"],
    "Color blind": ["#006ba4", "#ff80e", "#595959", "#5f9ed1", "#c85200", "#898989", "#a2c8ec", "#ffbc79", "#cfcfcf"]
  }
}</script>
<style id="dfreportcss">
/* Body font */
body { font: 100% arial, sans-serif; }
/* Title font */
.contextText.reportNameText { font: italic bold 1.5em Helvetica, serif; }
/* Heading font */
.c3-title { font: 1.2em Helvetica, serif; }
/* Axis label font */
.c3-axis-y-label, .c3-axis-x-label { font: normal 1.2em Helvetica, sans-serif; }
</style>
This directory contains all the lookup tables used in the study.
Lookup tables
A lookup table is a simple ASCII file. If it is anything else, an error message appears on the command line when DFbatch starts.
Each lookup table file must be associated with a symbolic table name in the lookup table map in order to be used by the dflookup function or in DFexplore. The lookup table map is described in DFlut_map - lookup table map.
Within a lookup table, each record has at least 1 field. The first field is a short acronym (the search field), and the subsequent fields are the lookup text. If the record has only one field, it is the search and the result field (this is useful for spell checking). If the record has 2 fields, the first field is the search key and the second is the return value.
An example query lookup table with two fields is illustrated below.
miss|Please provide a response.
dtformat|Date format is YYYY/MM/DD. Please review and correct this date.
dtillegal|Date provided is illegal. Legal dates are 2016/01/01-today.
dtbeforeic|Date provided is before date of informed consent.
ongo|End date is provided, but Ongoing is checked.
enddt|End date is prior to Start date.
Records may also have 3 or more fields (e.g. COSTART tables, MedDRA), where the first field is the search key and all other fields are returned as a single delimited string that can be parsed using the dfgetfield function.
Within the lookup table structure, the search field is typically short but has no maximum length. The lookup text fields have no maximum length; however, they are truncated to the maximum length of the data field into which they are inserted.
Multi-field lookup tables can also support a more flexible structure as illustrated below.
#N:LL Term|Pref Term|SOC Term|LL Code
#L:Low Level Term|Preferred Term|System Organ Class|Low Level Code
#F:1|1-4
Abdomen enlarged|Abdominal distension|Gastrointestinal disorders|10000045
Abdomen mimicking acute|Acute abdomen|Gastrointestinal disorders|10000047
Abdominal cramp|Abdominal pain|Gastrointestinal disorders|10000056
These lookup tables contain 3 header rows which define the structure of the table.
The first row begins with '#N:' followed by a short column name for each field, which appears above the scrolling list of entries in the lookup dialog displayed by an edit check.
The second row begins with '#L:' followed by a longer descriptive label for each field. This appears in the top section of the lookup dialog where the field values for the current row are displayed, and where the user can indicate which fields to search.
The third row begins with '#F:' followed by 2 fields: the 1st lists the fields to be searched by dflookup() for a match and the 2nd lists the fields to be returned when a match is found or selected by the user. If multiple return fields are specified, they are returned to the edit check as a '|' delimited string, which can be parsed using dfgetfield().
If a lookup table is defined for a study data field, the defined lookup text can be retrieved in 2 ways. If the user remembers the acronym (first field), the lookup value can be entered by typing the acronym in the current data field and pressing Enter. The acronym is then replaced with the value from the matching lookup table record. Matching on the acronym is case-insensitive. Alternatively, the user can simply press Enter without giving an acronym to pop up a scrolling list of all acronyms and their lookup text, and then click on the desired choice.
a|before
AA|auto accident (MVA)
A&P|anterior & posterior
abd|abdomen
abg|arterial blood gases
ac|before meals
acid phos|acid phosphatase
ACLF|adult congregate living facility
ACTH|adrenocorticotrophic hormone
acute MI|acute myocardial infarction
ad|right ear
Ad|up to
Ad lib|as desired
BaE|barium enema
Ba|barium
BB|blood bank
BCP|biochemical profile
BE|barium enema
BEE|basal energy expenditure
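The acronym matching described above, case-insensitive on the first field with the remaining fields returned as one '|' delimited string, can be sketched as:

```python
# Sketch: build a lookup table keyed on the lowercased search field.
# Single-field records return the search field itself; multi-field
# records return the remaining fields joined with '|'.

def build_table(lines):
    table = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        table[fields[0].lower()] = "|".join(fields[1:]) or fields[0]
    return table

def lookup(table, acronym):
    return table.get(acronym.lower())  # None when no record matches
```

A multi-field record such as the MedDRA example returns "Abdominal pain|Gastrointestinal disorders|10000056", which an edit check could then split with dfgetfield.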
This directory contains summary files created by DF_QCupdate, DFdiscover retrieval files, and temporary files created during the execution of reports and other programs.
This file records the conditions defined in the conditional cycle map that were met for each subject. It is updated each time DF_QCupdate is run.
DFX_ccycle conditional cycle map conditions that were met describes the data recorded for each of these conditions.
The following example shows 3 conditions (1, 4 and 6) that were met for subject 11020. These conditions were met at visits 10, 5 and 60 respectively, with data values 2, 98 and 0. To determine the actions triggered by these conditions we would need to consult the conditional cycle map to determine which cycles are affected by each condition.
11020|10|1|2
11020|5|4|98
11020|60|6|0
DFX_ccycle conditional cycle map conditions that were met
This file records the conditions defined in the conditional visit map that were met for each subject. It is updated each time DF_QCupdate is run.
DFX_cvisit conditional visit map conditions that were met describes the data recorded for each of these conditions.
The following example shows 3 conditions (1, 4 and 6) that were met for subject 11020. These conditions were met at visits 10, 5 and 60 respectively, with data values 2, 98 and 0. To determine the actions triggered by these conditions we would need to consult the conditional visit map to determine which visits are affected by each condition.
DFX_cvisit conditional visit map conditions that were met
This file records the conditions defined in the conditional plate map that were met for each subject. It is updated each time DF_QCupdate is run.
DFX_cplate conditional plate map conditions that were met describes the data recorded for each of these conditions.
The following example shows 3 conditions (1, 4, and 6) that were met for subject 11020. These conditions were met at visits 10, 5, and 60 respectively, with data values 2, 98, and 0. To determine the actions triggered by these conditions, we would need to consult the conditional plate map to determine which plates are affected by each condition.
DFX_cplate conditional plate map conditions that were met
This file records the conditions defined in the conditional termination map that were met for each subject. It is updated each time DF_QCupdate is run.
DFX_cterm conditional termination map conditions that were met describes the data recorded for each of these conditions.
The following example shows 3 conditions (1, 4, and 6) that were met for subject 11020. These conditions were met at visits 10, 5, and 60 respectively, with data values 2, 98, and 0. To determine the actions triggered by these conditions, we would need to consult the conditional termination map to determine which cycles are affected by each condition.
DFX_cterm conditional termination map conditions that were met
This file records the key fields from all required plates found in the database. It is updated each time DF_XXkeys is run.
DFX_keys key fields from all required plates describes the data recorded in each of these records.
The following example shows 4 records for subject 1044, for plates 3, 2, 1, and 2 at visits 51, 61, 92, and 211 respectively. All plates have status 1=final and are at validation level 7.
1044|3|51|1|7
1044|2|61|1|7
1044|1|92|1|7
1044|2|211|1|7
DFX_keys key fields from all required plates
This file records the current scheduling and status of all cycles and visits defined in the study visit map, for each subject. It is updated each time DF_QCupdate is run.
Field definitions for cycle records in DFX_schedule describes each field in the cycle records, and Field definitions for visit records in DFX_schedule describes each field in the visit records.
The following example shows the current status for one subject (id=1031) in a study having 3 cycles. The screening cycle consists of visits 91 and 92. The in-study cycle is required and has 8 visits, beginning with pre-baseline visit 51 and ending with early termination visit 81. The end cycle contains a single abort visit (visit number 80).
1031|1|0|C|S|1|7||||||03/09/21|37884||
1031|1|0|91|X|1|7|||||||03/09/21|37884|||
1031|1|0|92|X|1|1|||||||03/09/28|37891|||
1031|1|1|C|R|1|7||||||~03/10/07|37900||
1031|1|1|51|P|1|7|||||||03/10/05|37898|03/10/05|37898|
1031|1|1|0|B|4|0|-1|10|51||||||||
1031|1|1|61|r|1|1|||||||04/01/06|37991|||
1031|1|1|3|S|2|1|||||||04/01/06|37991|||
1031|1|1|6|S|1|0|||||||04/04/07|38083|||
1031|1|1|9|S|1|0|||||||04/07/07|38174|||
1031|1|1|12|T|1|0|||||||04/10/06|38265|||
1031|1|1|81|E|3|0|||||||||||
1031|1|2|C|E|0|0||||||||||
1031|1|2|80|A|3|0|||||||||||
Field definitions for cycle records in DFX_schedule
Field definitions for visit records in DFX_schedule
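Because cycle records carry the literal C in the fourth field while visit records carry a visit number there, the two record types can be separated with awk. The following sketch uses records drawn from the example above; it is not a DFdiscover utility, and the field layout is assumed from that example.

```shell
# Three DFX_schedule records from the example above: two cycle records
# (fourth field is the literal "C") and one visit record (fourth field
# is a visit number).
schedule='1031|1|0|C|S|1|7||||||03/09/21|37884||
1031|1|0|91|X|1|7|||||||03/09/21|37884|||
1031|1|1|C|R|1|7||||||~03/10/07|37900||'

# Split into cycle records and visit records on the fourth field.
cycles=$(printf '%s\n' "$schedule" | awk -F'|' '$4 == "C"')
visits=$(printf '%s\n' "$schedule" | awk -F'|' '$4 != "C"')
```

The two variables can then be processed with the cycle-record and visit-record field definitions respectively.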
This file records the date and time each CRF page was processed from image arrival to last record modification. It is updated each time DF_XXtime is run. A related file, named DFX_dfin, contains the same information as DFX_time1, for records that are still in the DFdiscover new queue. These records have not been validated and entered into the study database and thus may contain ICR errors in the key fields.
DFX_time1 timing data for each CRF page describes the data recorded for each CRF page.
The following example shows 4 records for subject ID 1501. The first 2 records are for plate 1 at visit 0, and the second 2 records are for plate 2 at visit 1. In both cases the first record is the primary and the second is a secondary record. In all 4 records the image arrival date is missing. This will be the case if the CRF was not submitted by email or fax, but was instead entered using EDC, raw data entry, or ASCII record import, or if the fax_log is missing or has been trimmed.
1|1501|0|1|1|6|1995160154|03/30/95||95/04/24|95/04/24|34787|0|34812|34812||09:43:37|09:43:37|0|583.61|583.61
1|1501|0|1|4|4|1995130086|03/30/95||95/03/30|95/04/24|34787|0|34787|34812||18:41:04|09:43:43|0|1121.07|583.71
1|1501|1|2|1|6|1995200106|04/12/95||95/05/19|95/05/23|34800|0|34837|34841||16:46:37|11:12:18|0|1006.62|672.3
1|1501|1|2|5|5|1995160154|04/12/95||95/04/24|95/05/19|34800|0|34812|34837||09:43:59|16:46:39|0|583.98|1006.65
DFX_time1 timing data for each CRF page
4|FINAL
5|INCOMPLETE
6|PENDING
This file records the value of all dates in the database that use the VisitDate attribute. Only one date using the VisitDate attribute may appear on each CRF plate, but different plates for the same visit may contain VisitDate fields, and thus it is possible to have more than one record in DFX_visit.dates for each visit for any given subject. A related file, named DFX_visit.date, records the preferred visit date for each visit, as defined in the study visit map. These 2 files are updated each time DF_XXkeys is run.
DFX_visit.dates the value of all dates using the VisitDate attribute describes the data recorded for each visit date.
The following example shows 5 records for subject ID 1127, for visits 0, 3, 6, and 51. Note that there are 2 entries for visit 51, one from plate 10 and one from plate 13.
1127|0|03/08/19|37851|3
1127|3|03/11/20|37944|5
1127|6|04/02/18|38034|5
1127|51|03/08/10|37842|10
1127|51|03/08/10|37842|13
DFX_visit.dates the value of all dates using the VisitDate attribute
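Since more than one VisitDate record may exist for a visit, duplicates can be located with a short awk tally. This sketch is not a DFdiscover utility; it assumes the subject|visit|date|julian-day|plate layout shown in the example above.

```shell
# Sample DFX_visit.dates records (assumed format:
# subject|visit|date|julian|plate).
dates='1127|0|03/08/19|37851|3
1127|3|03/11/20|37944|5
1127|6|04/02/18|38034|5
1127|51|03/08/10|37842|10
1127|51|03/08/10|37842|13'

# Count records per subject|visit pair and report pairs seen more than once.
dups=$(printf '%s\n' "$dates" |
  awk -F'|' '{ n[$1 "|" $2]++ } END { for (k in n) if (n[k] > 1) print k }')
```

For the sample data this reports visit 51 of subject 1127, which has VisitDate records from both plate 10 and plate 13.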
DFaccess.rpc — Change access to a study database or DFinbound.rpc, or query their current access status.
DFattach — Attach one or more external documents to keys in a DFdiscover study
DFaudittrace — Used by the DF_ATmods report to read study journal files. DF_ATmods produces an audit trail report showing database modifications for the specified study.
DFbatch — Process one or more batch edit check files
DFcompiler — Compile study-level edit check programs and output any warnings and/or errors encountered in the syntax.
DFdisable.rpc — Disable a study database server or incoming daemon to make them unavailable to clients and incoming images
DFenable.rpc — Enable a study database server or incoming daemon after it is disabled
DFencryptpdf — Protect a PDF file by encrypting it with the specified password
DFexport — Client-side, command-line interface for exporting data by plate, field or module; exporting scheduling and query reports; exporting change history; or exporting components of study definition
DFexport.rpc — Export data records from one or multiple plates from a study data file
DFfaxq — Display the members of the outgoing transmission queue
DFfaxrm — Remove transmissions from the outgoing queue
DFget — Get specified data fields from each record in an input file and write them to an output file
DFgetparam.rpc — Retrieve and evaluate the value of the requested configuration parameter
DFhostid — Display the unique DFdiscover host identifier of the system
DFimageio — Request a study CRF image from the database
DFimport.rpc — Import database records to a study database from an ASCII text file
DFjournal — Run by DFsqlload to extract, from the study journal files, the data changes made since the last time DFsqlload was run.
DFlistplates.rpc — List all plate numbers used in the study
DFlogger — Re-route error messages from non-DFdiscover applications to syslog, which in turn writes the messages to the system log files, as configured in /etc/syslog.conf.
DFpass — Locally manage user credentials for client-side command-line programs.
DFpdf — Generate bookmarked PDF documents of CRF images
DFpdfpkg — Generate multiple bookmarked PDF files for specified subject IDs or sites
DFprint_filter — Format input file(s) for printing to a PostScript® capable printer, and print to a specified printer
DFprintdb — Print case report forms merged with data records from the study database
DFpsprint — Convert one or more input CRF images into PostScript®
DFqcps — Convert a Query Report, previously generated by DF_QCreports, into a PDF file with barcoding, prior to sending the report to a study site
DFreport — Client-side, command-line interface for executing reports
DFsas — Prepare data set(s) and job file for processing by SAS®.
DFsendfax — Transmit a plain text, PDF, or TIFF file to one or more recipients via email or fax
DFsqlload — Create table definitions and import all data into a relational database
DFstatus — Display database status information in plain text format
DFtextps — Convert one or more input files into PDF
DFuserdb — Perform maintenance operations on the user database
DFversion — Display version information for all DFdiscover executables (programs), reports, and utilities
DFwhich — Display version information for one or more DFdiscover programs, reports and/or utilities
The DFdiscover distribution includes several programs that can be executed on their own from a command line, or combined in shell scripts to build other programs (e.g. to export data sets for other systems, or to create study-specific reports). All of the programs described in this section can be found in the DFdiscover executables directory, /opt/dfdiscover/bin, with the exception of DFmkdrf.jnl and DFmkdrf.ec, which are in /opt/dfdiscover/ecbin.
Permission to run these programs is controlled by UNIX file permissions that are set at DFdiscover installation to be executable for user 'datafax' and users in group 'studies', but not for 'other' users.
The remainder of this chapter is a reference for each shell program.
DFdiscover includes many programs. Some operate only on a database server computer (referred to as server-side) while others operate on a user's desktop computer (referred to as client-side). The application of user login and permissions varies slightly in these two environments.
Interactive client-side programs like DFexplore, DFsetup, DFsend and DFadmin require the user to interactively provide a valid username and password each time they connect to a database server.
Conversely, the majority of the shell programs require a UNIX-like shell environment to function, and hence they may only be executed directly on a DFdiscover database server machine. Access to specific database information is granted based upon the DFdiscover permissions assigned to the user, where the user is identified by their UNIX login name.
Starting with release 2014.0.0, DFdiscover includes a third type of program. Programs of this type require a command line environment to execute and can be run from either the user's desktop computer (client-side) or from the database server computer (server-side). These programs are DFbatch, DFsas, DFpdfpkg and DFexport. To authenticate, they require the same credentials that an interactive program does, specifically the username and password required to connect to a database server. Since there is no visual login dialog, these programs require a different interaction for authenticating, and, except for DFsas, there are two solutions for the user:
command-line options, the user can specify -S servername, -U username and -C password when the program is run, or
environment variables, the user can set the variables DFSERVER servername, DFUSER username and DFPASSWD password before the program is run.
For DFsas, only the environment variable method is available.
Specification of command-line options takes priority if both command-line options and environment variables are supplied. It is not possible to mix solutions using a subset of command-line options and a complementary set of environment variables. If any command-line option is specified, the expectation is that the command-line solution is preferred; otherwise, the environment variable solution is required.
In many circumstances, it is poor practice to display or store a plain text password. Including a password in a shell script or in a cron job is not secure behavior, and is not recommended. Hence we strongly recommend that users do not use either the -C password command-line option or the DFPASSWD password environment variable. Instead, we encourage users to make use of the DFpass program to locally manage their DFdiscover passwords. After creating a matching entry with DFpass, a user does not need to subsequently specify a password with either -C password or DFPASSWD password.
The recommended practice then is to create/manage passwords locally using DFpass and subsequently use either the two command-line options, -S servername and -U username, or the two environment variables DFSERVER servername and DFUSER username.
Use of locally managed passwords is limited to the DFattach, DFbatch, DFpdfpkg, DFexport, DFreport and DFuserPerms programs.
When determining the authentication credentials to use for one of DFbatch, DFpdfpkg or DFexport, the order of evaluation is as follows and only the first matching combination, if any, is used:
If all three command line options -S servername, -U username and -C password are specified, use them.
If all three environment variables DFSERVER servername, DFUSER username and DFPASSWD password are defined, use them.
If both of the command line options -S servername and -U username are specified, use them together with the matching password entry previously stored via DFpass.
If both of the environment variables DFSERVER servername and DFUSER username are defined, use them together with the matching password entry previously stored via DFpass.
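The evaluation order above can be sketched as a small shell function. This is an illustration only; choose_credentials and its printed labels are hypothetical, and the DFpass password lookup is represented by a label rather than a real retrieval.

```shell
# Hypothetical sketch of the credential precedence rules. Arguments 1-3
# stand for the -S/-U/-C command-line options; DFSERVER/DFUSER/DFPASSWD
# are read from the environment.
choose_credentials() {
  opt_s=${1-}; opt_u=${2-}; opt_c=${3-}
  srv=${DFSERVER-}; usr=${DFUSER-}; pw=${DFPASSWD-}
  if [ -n "$opt_s" ] && [ -n "$opt_u" ] && [ -n "$opt_c" ]; then
    echo "options:$opt_s:$opt_u:$opt_c"          # all three options
  elif [ -n "$srv" ] && [ -n "$usr" ] && [ -n "$pw" ]; then
    echo "env:$srv:$usr:$pw"                     # all three variables
  elif [ -n "$opt_s" ] && [ -n "$opt_u" ]; then
    echo "options+dfpass:$opt_s:$opt_u"          # options + stored password
  elif [ -n "$srv" ] && [ -n "$usr" ]; then
    echo "env+dfpass:$srv:$usr"                  # variables + stored password
  else
    echo "none"                                  # no usable combination
  fi
}

choose_credentials prod.example.com jsmith secret
```

Note that the first matching combination wins: complete command-line options beat complete environment variables, and only then are the two-value forms combined with a DFpass-stored password.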
The description of each program in this reference is divided into sections.
Name The official name of the program.
The program name is case sensitive and must be typed exactly as specified. This section also provides the brief purpose of the program.
Synopsis Lists all of the valid options for the program invocation.
Each option is either optional or required. Most options are, not surprisingly, optional; a few are required; and yet others may require one option from a specific subset. Each option type is indicated with a particular notation. Except for some required options, a program option begins with - and is followed by an option letter. In some cases this may be enough to select the option; in other cases, the option letter will be further followed by an option string.
NOTE: An option letter may be given different meanings in different programs. There is no requirement that the same option letter implies the same meaning when used across different programs.
Notation for program options
"
Description Describes the program purpose in greater detail.
This section describes exactly how the program works in terms of environment, input requirements, output format, dependencies, etc. It may also include detailed information about the program behavior for commonly used combinations of options.
Options Detailed listing of available options, their use, and their meaning.
Examples Illustrates the program options and output by way of example(s).
See Also Lists other reference materials, typically other programs, that are relevant to the program. This is an optional section.
Limitations Describes any limitations in the use or capabilities of the program. This is an optional section.
DFaccess.rpc { [-s study] | [-inbound] } [ [-ro] | [-rw] | [-enable] | [-q] ] [ [-restrict reason string] | [-disable reason string] ]
DFaccess.rpc provides the same functionality from the command line that is available in the DFadmin Status View. Like all shell level programs, DFaccess.rpc can be executed by datafax or any other user with a UNIX login account who is in group studies.
Read-Only Mode While a study is in read-only mode, DFexplore and DFsetup can be used to view but not change study data and setup information. Studies are typically put into read-only mode because data collection is complete, all queries have been resolved, and the data has been verified against source documents. Statistical analysis may be ongoing or an FDA audit may be in progress, so the study cannot be disabled, but no further changes are expected or allowed.
By design, read-write access to a study database is granted only to tools that write new data records or update existing data records. All other tools have read-only access. Thus, when DFaccess.rpc is used to switch a study server to read-only mode, only those tools that normally have read-write access will be affected. This includes the following:
Software Components Affected by Read-Only Mode
When a study is put into read-only mode it will remain in that mode until reset using DFadmin or DFaccess.rpc with the -rw option.
While a study is in read-only mode, any incoming pages barcoded for that study will be moved to DFrouter's identify directory.
Restricted Access Access to a study can be restricted to DFdiscover and study administrators only using the -restrict option. While a study is restricted, other users are not allowed to login, even in view-only mode.
When a study is restricted using DFaccess.rpc or DFadmin, users who are logged in will be able to continue working until they log off, but only administrators will be able to make new login connections.
While a study is restricted, any incoming pages barcoded for the study will be processed normally and sent to the study new image queue, but only DFdiscover and study administrators will be able to view or process them.
A study may be made both restricted and read-only, but these options cannot be used together on the same command line; this requires 2 separate executions of DFaccess.rpc, and the order is not important.
When a study is put into restricted mode it will remain in that mode until reset using DFadmin or DFaccess.rpc with the -enable option or until the study server is restarted.
Disabled Access A study database may be disabled, making it unavailable to all users, using the -disable option.
While a study is disabled, any incoming pages barcoded for the study will be held until the study is enabled, and will not show up in the study new image queue until processing is triggered by the arrival of the next document received by the DFdiscover server. It is not necessary for the next document to contain pages barcoded for the study in question or for any study.
When a study is disabled it will remain in that mode until reset using DFadmin or DFaccess.rpc with the -enable option.
To complete a change in access mode, DFaccess.rpc must be the only client accessing the server. Thus DFaccess.rpc fails if there are any tools with open connections to the study server.
DFaccess.rpc exits with one of the following statuses:
Enable read-only access to study #123, do lengthy operation, and then reinstate read-write access
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -ro
...some lengthy operation...
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -rw
Restrict study #123 in read-only access to DFdiscover and study administrators only, while they analyze and run reports on a static database, and then reinstate normal access to all users
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -restrict analysis
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -ro
...administrators perform their analyses...
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -rw
# /opt/dfdiscover/bin/DFaccess.rpc -s 123 -enable
Query the current status of DFinbound.rpc
% /opt/dfdiscover/bin/DFaccess.rpc -inbound -q
DFattach [-S server] [-U username] [-C password] [ [-doc docname] | [-subject # -visit # -plate # -doc docname] | [-idrf input_drf] | [-dir directory] ] [-odrf output_drf] [-log logfile] {study#}
DFattach is a command-line utility that mimics, and extends, the Plate > Attach Subject Document facility in DFexplore. Its primary function is to permit one or more documents to be attached to one or more record keys from a command-line interface.
In DFexplore, it is possible to attach a document to the current data record; using DFattach it is possible to attach documents to many records with simple commands.
Permitted document types are the same as those allowed by DFexplore.
For each document successfully attached to a key:
an entry is added to the system work/fax_log,
a DRF entry is appended to output_drf if -odrf output_drf is given as an option. The DRF entry contains 5 fields: the key of the record in the first 3 fields, the image ID assigned to the document, and the original document name.
There are 4 different methods for attaching documents.
In its simplest form, DFattach attaches a single document to a single key. Specify the document with -doc docname. The key is determined from the filename of the document. The document filename must have the format subject_visit_plate[_other][.extension] where subject, visit and plate are each valid, numeric identifiers for the key.
Attach a single document to the key given as arguments. In this method, each of -subject # -visit # -plate # -doc docname must be explicitly specified. This has the advantage that the filename of the document to be attached does not have to change to follow a format.
Read the input DRF. For each record in the DRF, the key is in the first three fields and the document name is in the fourth field. Attach the named document to the key. Repeat for each record.
Process each file and subdirectory in -dir directory. For each file, if the filename has the format subject_visit_plate[_other][.extension], treat the file as a document to attach to the key. Repeat for all matching files in the directory. Then recursively repeat for each sub-directory.
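The filename convention used by the first and fourth methods can be illustrated with a small shell function that recovers the key from a document name. key_from_filename is hypothetical and not part of DFattach; it assumes only the subject_visit_plate[_other][.extension] format described above.

```shell
# Hypothetical helper: extract "subject|visit|plate" from a filename of the
# form subject_visit_plate[_other][.extension].
key_from_filename() {
  base=${1##*/}      # strip any leading directory path
  base=${base%%.*}   # strip the extension, if any
  # Split on underscores; the first three components form the key.
  echo "$base" | awk -F'_' 'NF >= 3 { printf "%s|%s|%s\n", $1, $2, $3 }'
}

key_from_filename /tmp/newdocs/350001_1_10.pdf   # prints 350001|1|10
```

A filename with a trailing _other component, such as 1234_1_12_extra.pdf, still yields the key 1234|1|12, because only the first three underscore-separated components are used.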
Permissions are enforced for the user identified by the supplied credentials. Minimally, the user must have a study role permission that permits Server - Attach document and DFexplore - Data - Attach subject document. Additional permissions may be required depending upon the action.
For each document and key, [id, visit, plate], to be attached, the steps are:
Confirm that the combination of visit and plate is defined in the visit map. If it is not, output an error message and skip this document.
Lock the key. If the key cannot be locked, output an error message and skip this document.
If the key does not exist in the database, a record with pending status is created. The document becomes the primary image. Create Data permission is required.
If the key matches an existing missed record, the missed record is deleted and a record with pending status is created. The document becomes the primary image. Delete Missed record and Create Data permission are required.
If the key exists, and there is already a primary image, the document is attached as a secondary image; otherwise, the document is attached as the primary image. Modify Data permission is required.
DFattach exits with one of the following statuses:
In each of the following examples, the options -S server, -U username and -C password are not shown but are required.
Attach 1234_1_12.pdf to subject 1234, visit 1, plate 12 in the database for study 100
% DFattach -doc 1234_1_12.pdf 100
Attach radiograph.pdf to subject 30001, visit 10, plate 200 in the database for study 22
% DFattach -subject 30001 -visit 10 -plate 200 -doc radiograph.pdf 22
Attach all of the documents in directory /tmp/newdocs to the matching keys in study 200 and record the successes in /tmp/results.drf
% ls /tmp/newdocs
250001_0_1.pdf 250002_0_1.pdf 350001_0_1.pdf 350001_0_2.pdf 350001_1_10.pdf
% DFattach -dir /tmp/newdocs -odrf /tmp/results.drf 200
DFaudittrace {-s #} [-d date1[-date2]] [-q] [-r] [-N] [-A] [-I subjectID] [-P #] [-V #] [-f fieldlist] [-v vfence] [-x output.xlsx]
DFaudittrace reads a record from the study journal files, calculates any changes from the previous record for those keys and then outputs the differences to DF_ATmods.
The output from DFaudittrace consists of one or more ASCII records, each having 20 fields. Fields are separated by a pipe delimiter (|). Field contents depend on:
whether DFaudittrace is describing a new record (type N), a changed field (type C), or a deleted record (type D). This information is found in the first field of each record.
whether DFaudittrace is describing a data record, query record or a reason record. This information is found in the 8th field of each record.
The tables describe each of the 20 fields for each record type (data, query, reason) for each type of change (new record, changed field, deleted record). Fields 1 to 7 are consistent across all records output by DFaudittrace and have been described in the first table. Fields 8 to 20 depend on the record type and the type of change, and are described in tables 2 to 4.
DFaudittrace Output: Fields 1 to 7
New Records
Changed Field
Deleted Record
It is also important to note the following about DFaudittrace:
If a record or a query is deleted and then re-created, the re-created record or query is considered to be a new record (N), not a change (C) to the previous record.
In queries, the user is sometimes, but not always, set when a query is created by the user. The user is never set when the query is modified or deleted, or when it is created by the report DF_QCupdate. Unknown user names appear in the output as unknown for dates before June 1, 1995 and as ?? thereafter.
In queries, the validation level is set when the record is created and reset when the record is modified. The validation level of a query does not change when only the record's validation level changes.
If the -v vfence option is used when running DFaudittrace, the effect is different for data records and queries. Data changes are output for records journaled after the case reached or surpassed the vfence validation level. Query changes are output for queries journaled after the query was modified at or beyond the vfence validation level.
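As a sketch of how the first field can be used to separate change types downstream of DFaudittrace, the following shell fragment filters illustrative records (not real DFaudittrace output) down to the changed-field (C) entries.

```shell
# Illustrative pipe-delimited records; field 1 carries the change type
# (N = new record, C = changed field, D = deleted record).
trace='N|rest-of-fields
C|rest-of-fields
C|rest-of-fields
D|rest-of-fields'

# Keep only the changed-field records.
changes=$(printf '%s\n' "$trace" | awk -F'|' '$1 == "C"')
```

The same pattern ($1 == "N" or $1 == "D") isolates new-record and deleted-record entries; the meaning of the remaining fields then follows the tables for the corresponding record type.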
DFaudittrace exits with one of the following statuses:
Execute DFaudittrace to output journal information for study 254 between the dates of January 1, 2001 and January 15, 2001.
% DFaudittrace -s 254 -d 20010101-20010115
Execute DFaudittrace to output journal information for study 254 between the dates of January 1, 1998 and today, for field numbers 10-13.
% DFaudittrace -s 254 -d 19980101-today -f 10-13
Execute DFaudittrace to output all journal information for study 254, for records that have existed at or above a validation level of 2 at some point during the study.
% DFaudittrace -s 254 -v 2
Execute DFaudittrace to generate an Excel file containing all changes for data related to subject 44002.
% DFaudittrace -s 254 -I 44002 -x 44002history.xlsx
DFbatch [-S server] [-U username] [-C password] [-b batchnames] [-p stylesheet] [-exec] [ [-o outhtml_file] | [-O outhtml_dir] ] [-x out_xlsx] [-X out_dir_name] [-e error_file] {-i control_file} {#}
DFbatch is a framework for edit check execution in command-line or unattended mode. The framework specifies which data records are to be operated upon by edit checks and what is to happen when edit checks are invoked. It uses the same edit checks that are executed interactively through DFexplore and introduces no changes to the language.
Complete reference documentation for DFbatch can be found in Batch Edit Checks.
DFbatch exits with one of the following statuses:
DFcompiler [-o compiled_outputfilename] {-s #} {edit checks filename}
DFcompiler is used for syntax checking of edit check files. It can be run in the study DFsetup tool or from the command-line.
In DFsetup, selecting Edit checks from Study and clicking the Check Syntax button in the edit checks window runs DFcompiler. Results are output to the bottom Outputs section of the edit checks window. A user requires permission to use DFsetup in order to run DFcompiler in this way.
DFcompiler may also be run from the command-line using the options described below. The study number and edit checks file name (with a correct pathname) are both required arguments.
DFexplore has the ability to compile and run edit checks. Edit checks can be compiled (to DFedits.bin) by selecting Publish in DFsetup or from the command line using DFcompiler as follows:
% DFcompiler -s study# -o DFedits.bin DFedits
Edit checks are loaded automatically when DFexplore starts but can be reloaded following a new compile from the File > Reload - Edit checks option.
The edit checks source file is $STUDY_DIR/ecsrc/DFedits.
DFcompiler exits with one of the following statuses:
Compile and check the syntax for edit checks defined in DFedits for study 254
% DFcompiler -s 254 /opt/studies/254/ecsrc/DFedits
DFdisable.rpc { [-s #] | [-i #] } [-f] [-w #] {key}
The study database server must be disabled when a user requires "off-line" access to the database and wishes to ensure that no client connections to the server are permitted.
Equivalent functionality can be obtained using the Studies dialog in DFadmin.
A disable request will fail if there are other users with client tools connected to the study server at the time of the request.
When -i is used to disable an incoming daemon, the daemon is allowed to complete any image processing that it might be engaged in, but as soon as this is completed and the daemon exits, it will not be allowed to process another incoming document.
Following execution of DFdisable.rpc, a study server or incoming daemon can be re-enabled using DFenable.rpc. The DFenable.rpc command must be executed by the user who executed DFdisable.rpc, and the same key must be provided.
When a study server is disabled, the default requirement is that the server shut down as quickly as possible. In the interest of efficiency, it removes only those index entries that are for obsolete records and sorts only those index files that are unsorted. This is sufficient for correct operation of the server. However, the server does not remove those indices or records that are marked for deletion, nor does it perform garbage collection on data records that are obsolete. The result is that it is subsequently possible to export a record that is still scheduled for deletion. This may be undesirable for archival purposes. Use of -f will force the server to remove all records marked for deletion and perform garbage collection of obsolete data records. This option is primarily useful when shutting down a database in preparation for archiving.
DFdisable.rpc exits with one of the following statuses:
Disable study #123 so that database maintenance can be performed
% key=`date`
% DFdisable.rpc -s 123 -r "maintenance" $key
DFenable.rpc { [-s #] | [-i] } {key}
When a study database server has been disabled, use DFenable.rpc to enable the server again and make it available for client connections. Equivalent functionality can be obtained using the Studies dialog in DFadmin.
When an incoming daemon is re-enabled it once again becomes available for processing incoming images.
Study servers and incoming daemons can only be enabled by the user who disabled them, and then only if the correct key is provided.
DFenable.rpc exits with one of the following statuses:
Enable study #123
In this case, the assumption is that the study was previously disabled and the disable key was saved in a shell variable.
% DFenable.rpc -s 123 $key
DFencryptpdf { [-c password] | [-C] } [-i string] [-o string]
PDF supports password protection beginning with version 5.0 of Acrobat Reader and version 1.4 of the PDF standard. A PDF document has at least two levels of password: the owner password (which permits the owner to perform actions that another user cannot) and the user password. Since PDF is typically a plain text format, a password protected PDF document is encrypted and stored in binary format. The owner/user password is used as a salt for the encryption.
For DFdiscover purposes, the password is needed to ensure that the document can only be opened and read by known users (those with the password). Since DFdiscover supports delivery of study data/reports by PDF and email, it makes sense that the PDF document be protected from viewing by an unintended recipient. The password is not intended to limit what a user with the password can do with the document, so the owner and user passwords are the same and no restrictions are applied to any of the actions available on the document.
DFencryptpdf returns zero upon success; otherwise it returns one. DFencryptpdf may fail if the input PDF file is not valid PDF, has multiple cross-reference tables (as linearized or updated PDF files do), or is already encrypted. If neither -c nor -C is specified, it also returns an error.
-c
If -c is specified, the command line password is used. If -C is specified, DFencryptpdf reads the password from password file ~/.dfpdfpasswd. If neither -c nor -C is specified, the program exits with an error.
~/.dfpdfpasswd
The password file ~/.dfpdfpasswd may contain multiple lines, but only the first non-empty line is used. Leading and trailing spaces are removed. The maximum password length is 32 characters.
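The trimming and truncation rules described above can be sketched with standard tools. This is an illustration only, not the DFencryptpdf implementation, and the function name is invented for the example:

```shell
# Sketch: read the first non-empty line of a password file, strip
# leading/trailing whitespace, and truncate to 32 characters --
# mirroring the ~/.dfpdfpasswd rules described above.
read_pdf_password() {
    # $1 = path to the password file
    awk 'NF { gsub(/^[ \t]+|[ \t]+$/, ""); print substr($0, 1, 32); exit }' "$1"
}
```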
DFencryptpdf exits with one of the following statuses:
Encrypt a set of CRFs stored in crfs.pdf and protect with the password "DoNotUse".
crfs.pdf
% DFencryptpdf -c DoNotUse -i crfs.pdf -o protected_crfs.pdf
DFexport.rpc [-A] [-X] [ [-w] | [-a] ] [ [-c] | [-j] ] [-d] [-e] [-J] [-m] [-p] [-q] [-h] [-k] [-z] [-s status_list] [-v #, #-#] [-n #, #-#] [-I #, #-#] [-V #, #-#] [-C yy/mm/dd-yy/mm/dd] [-M yy/mm/dd-yy/mm/dd] [-L lostcode] [ [-U fieldname_list] | [-G fieldname_list] ] [-f fieldnum_list] [-H header_list] {study} {plate(s)} {outfile}
DFexport.rpc
status_list
#, #-#
yy/mm/dd-yy/mm/dd
lostcode
fieldname_list
fieldnum_list
header_list
DFexport.rpc can be used to export whole data records or selected fields from specified plates of a specified study database. Data records are exported in ASCII text format. DFexport.rpc does not provide support for joining fields from different plates or studies. To use DFexport.rpc, users must have 'Export Data' permission, which is granted by an administrator on a study-by-study basis (see System Administrator Guide, Roles). Users with export permission will only receive records for which they have get permission, which may be restricted by level, site, subject, assessment and plate. DFexport.rpc provides no warning that some records cannot be exported, as this would itself provide information about the existence of restricted data records.
DFexport.rpc may appear in a shell script used to create reports stored in the study reports directory, to be run in DFexplore from the Reports View, or in a shell script run using dfexecute in an edit check. In these cases the permissions of the DFexplore user running the report or edit check will be applied. Thus the behavior and output may differ depending on the permissions of the user who runs the script.
If the same script is run from the UNIX shell, the permissions associated with the UNIX login name apply. Thus, if a user has different UNIX and DFexplore login accounts with different permissions, the user may get different results depending on whether they run the script in the UNIX shell or DFexplore environments.
test only
-w
-a
-j
-1
:d
:D
Outfile extension. With this option, the output filename specified will have a text file extension (.txt) added to the end of the user defined outfile name. Outfile extension is ignored if the output file is standard output indicated by '-'. The outfile extension is also ignored if only a single plate is specified. In the case of a single plate, the outfile name will be exactly as specified.
If this option is used along with the csv (-z) option, the csv extension (.csv) is added to the end of the specified output instead of the text extension (.txt).
-z
Match data record validation level. This option is only relevant when exporting metadata records: reasons (plate 510) and queries (plate 511). It is ignored when exporting data records.
Meta-data records may have different validation levels from the underlying data records to which they are attached. When metadata records are exported using the -m option, the validation level of each exported metadata record is changed to match the validation level of the data record to which it is attached.
Trailing pipe. For backwards compatibility, exported data records can be terminated by a trailing |, if one is not already present. This allows exported records to maintain the original DFdiscover record format so that they can, for example, be easily re-imported. A trailing pipe is only needed to make data records compatible with this format.
IMPORTANT: DFdiscover does not expect a trailing pipe to be present in query records (plate 511) or reason for change records (plate 510). Hence, the -p option should not be used when exporting query records or data change records if the intention is to re-import them into the data base using DFimport.rpc.
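The normalization that -p performs on data records can be sketched with sed. This is only an illustration of the behavior, not DFdiscover code, and the function name is invented:

```shell
# Sketch: append a trailing '|' to each record that does not already
# end with one (the effect -p has on exported data records).
add_trailing_pipe() {
    sed -e '/|$/!s/$/|/'
}
```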
-p
Header. Include, as the first record in the output file, a delimited record of variable names for the columns in the data records. The variable names are, by default, the alias variable names defined in the study data dictionary, combined with any names given by -H. If fields are extracted using -G, the header contains those variable names, again combined with -H.
-H
-G
This option cannot be used with an export of the new record queue, plate 0, as the variable names are not fixed. It can however, be used with the export of query records (plate 511) and reason for data change records (plate 510). The headers for query and reason for data change records are defined in the study schema.
The trailing character is different in data records and queries. Data records are terminated by a final | while queries are not. This also applies to the header records. Thus, when exporting data records the header record is terminated by a trailing | while this is omitted when exporting queries.
If multiple plates are requested and the -h option is used, a warning message will appear if the output is written to standard output. The warning message will notify users that the per plate headers will be interleaved with multiple plate output data.
-h
-f "7,5,6,1,2"
The following three options are required and must appear in order at the end of the option list:
Plate numbers of the data files to export. A single plate can be specified or a list of plates can be specified to run in one invocation. If the same plate is listed more than once, the plate data is only exported once. The keyword 'all' can be used to export all existing plates within the specified study including the special reserved plates.
Special reserved plate numbers include: 0 for new records (received but not yet reviewed), 501 for returned Query reports, 510 for reason for data change records, and 511 for queries. If plate 511 is exported, the validation level of all exported records is updated to reflect the current validation level of the record that the notes are attached to.
Output file that the exported records are written to. The user executing DFexport.rpc must have permission to create or modify this file. In the case of multiple plates being exported, the outfile name is used as the base filename from which a unique filename is constructed by appending each three digit, zero-padded plate number. If the output file is given as -, exported records are written to standard output rather than a file.
NOTE: Data written to standard output will also include "inline" header metadata if either of the -h or -x options are specified.
-x
The following options allow specific records to be selected from the output. With no options specified, all records are output. With options specified, only records matching the selection criteria are output. If multiple options are specified, only those records that match all criteria are output.
Record status. The recognized status keywords consist of the legacy terminology: clean, dirty, error, CLEAN, DIRTY, ERROR, missed, primary, secondary, all and the current terminology: final, incomplete, pending, missed, all. Any combination of legacy and current terminology can be given as a quoted string. The default is records of all statuses.
clean
dirty
error
CLEAN
DIRTY
ERROR
primary
secondary
all
If plate 511 is exported, the available status keywords are the same but their meaning (as related to query status) is translated by the table:
Record and query status equivalence
For backwards compatibility, old keywords CLEAN, DIRTY, and ERROR remain supported for secondary record statuses.
-I
-n
Creation date. If this option is specified, only those records which have a creation date matching the selection criteria are output.
Note that the selection criteria apply only to the creation date of data records and not queries.
Modification date. If this option is specified, only those records which have a modification date matching the selection criteria are output.
Note that the selection criteria apply only to the modification date of data records and not queries.
For both -C and -M, the term today can be used to represent today's date.
-M
Missing value code for missed records.
Missed records are exported if the keyword 'missed', 'lost', or 'all' is specified with the status option, -s. Missed records can be exported in two different formats. The first format consists of the missed data fields (i.e. the missed category code and the text field containing the reason the record was classified as missed). The second format consists of all of the user-defined data fields that appear on the CRF for that plate. The first format is the default and is used whenever missed records are requested and the -L option is not used. If missed records are requested and the -L option is specified, missed records are exported with the lostcode value inserted into each user data field after field 7 (subject ID); the trailing 3 system fields are exported with status 0 and the missed record creation and modification timestamps. The lostcode is also used if the user specified fields using the -f and/or -U|G options along with the -J option and one or more of those fields do not exist.
-s
-L
-U|G
-J
The only exception to this is when missed records are requested along with one of the field selection options (-f, -G, or -U). In this case, the second format is always used. If a substitution code has not been specified with the -L option, the generic DFdiscover default missing value code (*) is used, and a warning message is written to standard error, each time the substitution is made.
The lostcode specified with the -L option may consist of one or more alphanumeric characters. It is not permissible to use control characters. Also, the DFdiscover delimiter (|) cannot be used, unless data is being exported in CSV format with the -z option.
Note that specifying the -L lostcode option will not, by itself, cause missed records to be exported. This decision is controlled entirely by the status option. The -L option only indicates that the second format is desired and is to be used with the specified substitution code.
-L lostcode
Argument parsing in DFexport.rpc ignores extra, repeated delimiters. Any combination of space and comma is collapsed to one delimiter. For example:
DFRASTER,,,DFVALID = DFRASTER,DFVALID
DFRASTER DFVALID = DFRASTER,DFVALID
DFRASTER,, ,,DFVALID = DFRASTER,DFVALID
,,DFRASTER,, ,,DFVALID = DFRASTER,DFVALID
are equivalent.
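The delimiter collapsing described above can be sketched with sed. This is an illustration only (DFexport.rpc does this internally), and the function name is invented:

```shell
# Sketch: collapse any run of spaces and/or commas to a single comma,
# and drop leading/trailing delimiters -- the argument-parsing behavior
# described above.
collapse_delims() {
    printf '%s\n' "$1" | sed -e 's/[ ,][ ,]*/,/g' -e 's/^,//' -e 's/,$//'
}
```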
The following options allow fields (variables) to be selected from the output. With no options specified, all fields are output, unless -k has already been specified. With options specified, only the requested fields are output. If multiple options are specified, all of the requested fields are output, in the order requested.
-k
Field number. Specifying this option selects for output from each record only those fields requested by their number. Output fields can be re-ordered by specifying the field numbers in the desired order. Fields can be repeated within the option. This option cannot be repeated on the command line.
Command-line referencing of field numbers using NF (last field) or relative to NF is also possible. Some examples of NF usage are as follows:
NF
2-NF include fields 2 to the last field inclusive
NF-1 include the second last field
5-NF-1 include fields 5 to the second last field, inclusive
NF-5-NF-1 include fields fifth from the last to the second last field, inclusive
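The NF notation above behaves much like awk's NF on a pipe-delimited record. The following sketch is only an analogy, not DFexport.rpc itself, and the function names are invented:

```shell
# awk analogue of the NF notation: print fields 2 through NF, and the
# second-last field (NF-1), from a pipe-delimited record on stdin.
print_2_to_NF() {
    awk -F'|' '{ for (i = 2; i <= NF; i++) printf "%s%s", $i, (i < NF ? "|" : "\n") }'
}
print_NF_minus_1() {
    awk -F'|' '{ print $(NF - 1) }'
}
```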
Wherever a field number or field name is expected in field selection criteria, a constant value may also be referenced by inserting it in single quotes, as in 'AB'. The purpose of this is to insert the constant value in the specified field location into the output records. The 'value' notation allows constant values to be exported in much the same way that a variable value can be. Since it is possible for a constant value and variable name to be the same, a constant value must always use the 'value' notation. For example, the specification:
'AB'
-G "date,'AD',DFRASTER"
requests the variable with the name date, followed by the constant value AD and the raster image ID number. The output might appear as:
AD
99/06/17|AD|9924/00001001
If a blank field is being exported, it must be represented by 2 single quotes with nothing inside.
DFexport.rpc will ignore the special meaning of any characters that would otherwise be delimiters. For example, 'a,b' is the constant value a,b, not two separate constants.
'a,b'
a,b
If -h is present, the column name for each constant value will be the value itself, as in the following specification and output:
% DFexport.rpc -h -G "date,'AD',DFRASTER" 254 1 - | head -2
date|AD|DFRASTER
99/06/17|AD|9924/0001001
If -H is also present, the column name for each constant value will be replaced with the next -H value, if one is available. For example:
% DFexport.rpc -h -G "date,'AD',DFRASTER" -H "when" 254 1 - | head -2
date|when|DFRASTER
99/06/17|AD|9924/0001001
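Inserting a constant column between exported fields, as the 'AD' example does, is the same idea as the following awk sketch (illustrative only; the function name is invented):

```shell
# Sketch: emit field 1, a literal constant column "AD", then field 2 of
# each pipe-delimited input record -- the effect of -G "date,'AD',DFRASTER".
insert_constant() {
    awk -F'|' -v OFS='|' '{ print $1, "AD", $2 }'
}
```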
By default, date fields are output in the format defined for the field in the study data dictionary. If -c or -j is specified, the default output format is changed to 4-digit year format with imputation (-c) or julian format (-j). It is possible to override the default format for selected fields by following the field number or field name specification with a :c, :j or :o modifier. For example, the specification: -G "VisitDate, VisitDate:c, VisitDate:j" requests the variable with name VisitDate to be output in its default format, in calendar date format, and finally in julian format. The :c modifier applied to a field that already uses a 4 digit year format applies date imputation only. When using -c or -j to change the default output format for dates, note that there is no field level modifier that restores the original date format defined in the study data dictionary.
:c
:j
:o
-G "VisitDate, VisitDate:c, VisitDate:j"
VisitDate
The :o modifier can be used in an analogous fashion to :c and :j. The effect of the :o modifier will be to cause the original date value to be output without any imputation.
A modifier cannot be applied to a non-date variable nor can it be applied to a range of field numbers or field names; it must be applied to a single field number or field name at a time.
It is possible to output the decoded label for a variable's value by appending the :d modifier to any variable that is defined with coding.
Output the coded and decoded values for the sex variable
-G "sex, sex:d"
Any coded value that cannot be decoded is output as is.
The modifier cannot be applied to a non-coded variable nor can it be applied to a range of field numbers or field names; it must be applied to a single field number or field name at a time.
It is possible to split a string field into multiple, shorter string fields by appending a :<num>x<chars><c|w> modifier to the field number or field name specification. The modifier specifies the number of fields to split the input string into (num), the maximum width in characters of each output field (chars), and whether splitting should be done on character (c) or word (w) boundaries. If word boundaries are used, a word is delimited by any combination of space or tab characters. Word splitting will always split at the word boundary closest to but less than the maximum width. If there is no word boundary in the substring, the string will be split at the character boundary.
:numxcharscw
num
chars
Split a string field into 5 fields
This example splits the comments field into 5 200-character segments at character boundaries:
comments
-G "comments:5x200c"
If a field to be split contains a missing value code, the first output field will contain the complete missing value code and the remaining fields will be blank.
If a field is split to create additional fields and a header record is requested with -h, field names for the new fields are identical to the original field name unless -H is specified. If -H is given, the field names for the new fields are assigned in order from the option list. If the option list contains fewer names than there are new fields, the remaining fields are assigned names identical to their original field names.
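The word-boundary splitting rule described above (split at the boundary closest to but below the maximum width, falling back to a character split) can be sketched in awk. This is an illustration under those stated rules, not DFexport.rpc code, and the function name is invented:

```shell
# Sketch of :NxMw word splitting: break a string into pieces of at most
# $2 characters, cutting at the rightmost space within the limit;
# fall back to a plain character split when no space is available.
split_words() {
    # $1 = string, $2 = max width per piece; pieces are printed one per line
    awk -v s="$1" -v w="$2" 'BEGIN {
        while (length(s) > w) {
            piece = substr(s, 1, w + 1)
            cut = 0
            for (i = w + 1; i > 1; i--)          # rightmost space within limit
                if (substr(piece, i, 1) == " ") { cut = i; break }
            if (cut) { print substr(s, 1, cut - 1); s = substr(s, cut + 1) }
            else     { print substr(s, 1, w);       s = substr(s, w + 1) }
        }
        print s
    }'
}
```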
New fields can be created consisting of substrings of database fields. The following example creates a data field from the 4th and 5th characters of field 22.
-f 22:x4.2
Substrings can be extracted from any field type: strings, dates, or numbers. Substrings are extracted from the string value of the field. Numeric fields will be zero-padded to their store width before the substring extraction is done. In the following example, the first three digits of the subject ID field are extracted.
-f 7:x1.3
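The zero-pad-then-extract behavior described for numeric fields can be sketched with printf and cut. This is illustrative only (the function name is invented), assuming a stored field width supplied by the caller:

```shell
# Sketch of -f 7:x1.3 on a numeric field: zero-pad the value to its
# stored width, then take 3 characters starting at position 1.
first3_of_padded() {
    # $1 = numeric value, $2 = stored field width
    printf "%0${2}d" "$1" | cut -c1-3
}
```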
CSV is a popular format used for sharing data records between different software programs. By selecting the -z option, DFexport.rpc will generate output records that are compliant with this format. The requirements of the format are:
The record delimiter is a newline character (this is also true of records exported in the default, non-CSV format).
Each field within a record is separated by a comma.
Leading and trailing spaces adjacent to comma separators are ignored.
Fields containing commas as part of their value are enclosed in double quotes.
Fields containing double quotes are enclosed within double quotes, and the embedded double quotes themselves are each represented by a pair of consecutive double quotes.
Fields containing embedded line breaks are enclosed within double quotes.
Fields with leading or trailing spaces are enclosed in double quotes.
The first record in the CSV output file may consist of a header record containing field names. Each field in the header will also be comma-delimited and follow the requirements above.
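The quoting rules above can be sketched for a single field value. This is an illustration of the listed rules (embedded line breaks omitted for brevity), not the -z implementation, and the function name is invented:

```shell
# Sketch of CSV field quoting: quote the field if it contains a comma
# or double quote, or has a leading/trailing space; embedded double
# quotes are doubled.
csv_quote() {
    case "$1" in
        *'"'*|*,*|' '*|*' ')
            printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')" ;;
        *)
            printf '%s' "$1" ;;
    esac
}
```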
DFexport.rpc exits with one of the following statuses:
Export all records from plate 1 of study 255 to standard output
% DFexport.rpc -s all -h 255 1 -
DFSTATUS|DFVALID|DFRASTER|DFSTUDY|DFPLATE|DFSEQ|PID|INIT|VDATE|DFSCREEN|DFCREATE|DFMODIFY|
1|1|9807/1234567|255|1|0|99001|SCL|98/01/25|1|98/02/10 12:34:12|98/02/12 12:34:12|
2|4|9811/0005001|255|1|1|99002|RRN|98/02/12|2|98/02/10 15:03:34|98/03/01 11:23:14|
5|2|9831/0004012|255|1|1|99002|RRN|98/12/12|2|98/07/02 13:45:20|98/07/05 09:21:44|
1|3|9809/0044002|255|1|0|99003|*|98/02/03|1|98/02/10 14:23:01|98/02/10 14:23:01|
Export, with header, all primary records at validation levels 1-2 from plate 1 of study 255 to standard output
% DFexport.rpc -h -s primary -v "1-2" 255 1 -
DFSTATUS|DFVALID|DFRASTER|DFSTUDY|DFPLATE|DFSEQ|PID|INIT|VDATE|DFSCREEN|DFCREATE|DFMODIFY|
1|1|9807/1234567|255|1|0|99001|SCL|98/01/25|1|98/02/10 12:34:12|98/02/10 12:34:12|
1|3|9809/0044002|255|1|0|99003|*|98/02/03|1|98/02/10 14:23:01|98/02/10 14:23:01|
Export the first 3 and the 7th fields from plate 1 to standard output
% DFexport.rpc -f "1-3,7" 255 1 -
1|1|9807/1234567|99001
2|4|9811/0005001|99002
5|2|9831/0004012|99002
1|3|9809/0044002|99003
Export the VDATE date field in 4-digit year format and export the subject initials in 3 single-character fields. Include the header record and define names for the newly created fields
VDATE
% DFexport.rpc -c -G "VDATE,INIT:3x1c" -H "middle,last" 255 1 -
VDATE|INIT|middle|last
1998/01/25|S|C|L
1998/02/12|R|R|N
1998/12/12|R|R|N
1998/02/03|*||
Export the primary records for subject IDs 99001 and 99002 selecting the fields from DFSTUDY to VDATE inclusive
% DFexport.rpc -s primary -I "99001,99002" -G "DFSTUDY-VDATE" 255 1 -
255|1|0|99001|SCL|98/01/25
255|1|1|99002|RRN|98/02/12
Show all forms of manipulation for a partial date value in the variable DateCompleted1 for study 251
DateCompleted1
% DFexport.rpc -j -G "DateCompleted1,DateCompleted1:c,DateCompleted1:o" 251 1 -
2450845|1998/02/01|98/02/00
2451297|1999/04/29|99/04/29
Create a DFdiscover Retrieval File (ID99001_plate10.drf) for study 254, all primary records for plate 10, for subject ID 99001
% DFexport.rpc -f "7,6,5" -I 99001 254 10 ID9901_plate10.drf
The following is the contents of the file ID9901_plate10.drf.
99001|1|10
99001|2|10
99001|3|10
99001|6|10
99001|9|10
99001|12|10
Export all primary records for plates 1-3, 7 and 8 to standard output
% DFexport.rpc -f "7,6,5,3" 254 "1-3,7,8" -
99001|0|1|0915/000T001
99003|0|1|0915R000S001
99004|0|1|0915/000V001
99001|1|2|0915/000T002
99004|1|2|0915/000V002
99001|1|3|0915/000T003
99004|1|3|0915/000V003
99005|1|3|0000/0000000
99001|30|7|0915/000T009
99004|30|7|0915/000V009
99001|51|8|0915/000T010
When exporting multiple plates in one invocation, the outputs are always sorted in ascending order of plate number; for example, if the plate argument is changed to "7,8,3-1", the same output is produced.
Export all primary records for all plates in a study and create separate text files for each set of plate data.
% DFexport.rpc -e -f "7,6,5,3" 254 all Study254_
The following are the output files created by the above command
Study254_000.txt Study254_001.txt Study254_002.txt Study254_003.txt Study254_004.txt Study254_005.txt Study254_006.txt Study254_007.txt Study254_008.txt Study254_009.txt Study254_010.txt Study254_011.txt Study254_020.txt Study254_501.txt Study254_510.txt Study254_511.txt
As seen above, all plates are exported in one invocation of the command. Special reserved plates are also included when the keyword 'all' is specified for the plates argument. A three digit, zero padded, plate number is also appended to the end of the outfile name, and is optionally followed by a file extension.
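The filename construction described above (base name plus a three-digit, zero-padded plate number plus extension) is straightforward to reproduce with printf. This is an illustrative sketch, not DFexport.rpc code, and the function name is invented:

```shell
# Sketch: build a per-plate output filename from a base name, a plate
# number (zero-padded to three digits), and an extension.
plate_outfile() {
    # $1 = base outfile name, $2 = plate number, $3 = extension (txt or csv)
    printf '%s%03d.%s' "$1" "$2" "$3"
}
```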
Export, in CSV format, a subset of fields for plate 3 of study 254 and write the results to standard output
% DFexport.rpc -s all -z -f 1-7,59,63-66 254 3 -
2,1,9807/0047003,254,3,1,99001,,0,0,"knee surgery, hip replacement",2
1,1,0347R0012001,254,3,1,99101,"""other"" surgery",1,1," carotid endarterectomy ",2
Note that field values containing commas (knee surgery, hip replacement) are enclosed within double quotes. Fields containing double quotes (such as "other" surgery) are enclosed within double quotes and the embedded double quotes themselves are represented by a pair of consecutive double quotes.
knee surgery, hip replacement
"other" surgery
Export only missed records for plate 1, inserting the missing value code "NA" into the fields of each missed record exported. Export the results to standard output.
% DFexport.rpc -s missed -L "NA" 254 1 -
0|7|0000/0000000|254|1|0|20100|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|0|2018/01/15 12:35:23|2018/01/15 12:35:23
0|7|0000/0000000|254|1|0|20101|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|0|2018/01/15 12:35:23|2018/01/15 12:35:23
Note that missing value code insertion is only performed for user-defined fields after field 7 (subject ID).
DFexport.rpc and DFexport are the recommended ways to export data records from the study database for subsequent examination or analysis. Never read the data files directly as they may contain old copies of records scheduled for removal.
Since DFexport.rpc may only be run server-side by an authenticated user, it does not enforce user permissions as strictly as DFexport does. For example, DFexport.rpc ignores the Hidden/Masked field property when exporting field values.
DFexport [-S server] [-U username] [-C password] [ [-Mname moduleName,moduleName,...] | [-Mnum moduleID,moduleID~moduleID] | [-Pnum [ [plt#,plt#~plt#] | [all] | [qc] | [reason] | [new] | [missed] ] ] ] [-status] [ [#, #-#] | [missed,final,incomplete,pending,primary,secondary] ] [-View] [ [all] | [name,name,...] ] [-level #, #-#] [-visit #, #-#] [ [-site #, #-#] | [-subject #, #-#] | [-alias alias,alias,...] ] [-plate #, #-#] [-create list] [-modify list] [-resolve list] [ [-Pattern string] | [-pattern string] ] [-expr string] [ [-ALL criteria] | [-ANY criteria] ] [ [-Fnum #, #-#] | [-Fname fieldname_list] | [-Falias fieldalias_list] ] [ [-joinALL plt#[moduleFields]...] | [-joinANY plt#[moduleFields]...] ] [-usealias] [-X] [ [-c] | [-j] ] [-d] [-sub] [-p] [-x] [-sheet] [ [number] | [name] | [both] ] [-toc string] [-z] [-h] [-D] [-H header_list] [-L lostcode] [ [-w] | [-a] ] [-O output_folder] [-o [ [outfile] | [-] ] ] [-e errlog] {study#}
DFexport
moduleName,moduleName,...
moduleID,moduleID~moduleID
missed,final,incomplete,pending,primary,secondary
name,name,...
alias,alias,...
list
criteria
fieldalias_list
moduleFields
outfile
errlog
DFexport also has a descriptive mode for the output of schema information.
DFexport [-S server] [-U username] [-C password] [-listfields] { [ [-listmodules] | [-listplates] ] [ [plt#,plt#~plt#] | [all] ] }
DFexport has a history mode for the output of change history related to a subject, visit, or plate.
DFexport [-S server] [-U username] [-C password] {-history} {-subject #} [-visit #] [-plate #] {study#}
DFexport has a schedule and query report mode with output to Excel.
DFexport [-S server] [-U username] [-C password] {-schedule} [-internal] [-rowcolor] [-statuscolor] [-site #, #-#] [-subject #, #-#] [-alias alias,alias,...] {-o outfile.xlsx} [-e errlog] {study#}
DFexport has a CDISC ODM export mode with output to xml or json.
DFexport [-S server] [-U username] [-C password] {-odm} [-odmdesc] [-studydesc] [-protocol] [-userfield] [-fieldalias] [-event plates:visits|all] [-site] [ [site#,site#~site#] | [all] ] [-subject] [ [subject#,subject#~subject#] | [all] ] [-alias] {-o outfile.[xml|json]} [-e errlog] {study#}
plates:visits|all
site#,site#~site#
subject#,subject#~subject#
DFexport exports data from an individual study database. Minimally, the user must specify the study database and also at least one of the plates or modules containing the data of interest. Data records are exported in ASCII plain text format to a specified file or to standard output.
DFexport and DFexport.rpc share many features yet they are also unique in several important ways.
DFexport is available as both a server-side and client-side application; DFexport.rpc is server-side only.
DFexport is able to export descriptive information from a study schema with the -listfields, -listmodules and -listplates options.
-listfields
-listmodules
-listplates
DFexport is able to export, with the -schedule option, schedule and query reports in Excel format. The -internal option includes internal queries in the query report. The -rowcolor option uses an alternate row color when the subject ID changes. The -statuscolor option uses the status color to fill subject ID cells.
-schedule
-internal
-rowcolor
-statuscolor
DFexport is able to export, with the -odm option, CDISC ODM XML or JSON, with or without clinical data depending on whether the -site, -subject and -alias options are included or omitted. The -odmdesc option includes an ODM description of up to 200 characters. The -studydesc option includes a study description of up to 200 characters. The -protocol option includes a study protocol name of up to 200 characters. The -userfield option includes user fields only, excluding system fields. The -fieldalias option uses the field alias instead of the field name. The -event option includes plates:visits (for example, 1,3-5:all).
-odm
-site, -subject
-alias
-protocol
-userfield
-fieldalias
-event
plates:visits
DFexport is able to export, with the -history option, the history of changes for a subject, visit within subject, or plate within visit of a subject.
-history
DFexport includes support for joining data fields from the same module definition that are instanced over one or more plates.
DFexport is able to export all records marked as missed in one command execution using the -Pnum missed option. In DFexport.rpc this would require looping through each of the plates individually, selecting missed records at each step of the loop.
-Pnum missed
User permissions for selecting by validation level are enforced in DFexport.
Selecting queries and reasons by creation and modification date is supported in DFexport; in DFexport.rpc only data records may be selected in this way.
Many options are specified using the same notation as DFexport.rpc, while others are specified using unique notation.
It is possible for DFexport and DFexport.rpc to produce different results for the same selection criteria. DFexport enforces user permissions for the user specified with the -U username option or the DFUSER environment variable; DFexport.rpc uses the UNIX login name of the user running the command.
-U username
DFUSER
DFexport can export data in a plate-centric manner, like DFexport.rpc, using -Pnum, in a module-centric manner that is unique to DFexport using -Mname or -Mnum, or by a pre-defined view or views using -View all | view,view,.... Many options are available to create data sets for a wide variety of needs. To get the most out of DFexport it can be useful to approach its use with the following in mind:
-Pnum
-Mname
-Mnum
-View all | view,view,...
Preparation. Access to exported data is available only to authenticated users with appropriate database permissions; see Authentication and Database Permissions. Plate, module and field name information is available from DFexport itself; see Schema Listings.
Where is the data? Specifying the source of the data, whether it is plate-based or module-based, is required; see Data Source.
Do the data records need to be filtered? In many cases not all of the rows in a data set are of interest and they need to be filtered further; see Record Selection by Keys and Filters.
Data Fields. By default, all of the data fields in the specified data source are exported. It is possible to specify a subset of the fields to export; see Extracting/Combining/Concatenating Fields for Output. It is also possible to add field, module, plate, visit, image, site, and study metadata (including custom properties) to the export data; see Including Metadata in Output.
Output Formatting. For export, the appearance of data fields can be modified without modifying the data source; see Output Formatting.
Output Destination. Finally, the export data is written to a destination; see Output Options.
Authentication To authenticate, DFexport requires the username and password for connection to a specific database server. These may be supplied as:
DFSERVER
servername
DFPASSWD
Specification of command-line options takes priority if both command-line options and environment variables are supplied. Refer to User Credentials for more information.
Database Permissions Subsequent to authentication with a specific database server, use of DFexport is allowed (or disallowed) by the Server - Export Data permission, or the intersection of both DFexplore - List view and DFexplore - Print/Save Data permissions. These permissions are granted to a role (and then user) by an administrator on a study-by-study basis (see System Administrator Guide, Roles). An authenticated user with one of these role permissions will then be able to export data records from the set of site and subject IDs granted by their user permissions, which may further be limited by visit, plate and level restrictions associated with the user's role. DFexport provides no warning that some records cannot be exported due to permission restrictions as this would itself provide information about the existence of restricted data records. For roles without 'Show Hidden Fields' permission and fields with the Hidden or Masked property enabled, DFexport will output an empty (NULL) value when exporting such data fields, and it will also skip over reasons and queries attached to such fields.
This feature is unique to DFexport and is not available in DFexport.rpc. DFexport is able to output database schema information. The available information includes plate numbers and module names, and may optionally also include field number, name, alias and data type for data fields in the requested plates or modules. This descriptive information can be useful reference material when a user wishes to further refine an export.
To obtain a listing of all defined plate numbers, use -listplates all. This output format matches the output from DFlistplates.rpc. To obtain a listing of all defined module names, use -listmodules all. In both cases:
the output is a space-delimited list of values (either plate numbers or module names) matching the requested criteria,
a subset can be specified by replacing all with individual, lists, or ranges of plate numbers in the format #, #-#.
List all module names used in plates 100 through 200
% DFexport -listmodules 100-200 10
Chemistry Eligibility Enrollment Header LabData PhysicalExam SocialImpact Urinalysis
The output is sorted by module name and each module is listed exactly once, even when instanced more than once. Modules that are not instanced on one of the selected plates are not included in the output.
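As an illustration of the subset notation (assuming the same study, 10, as the example above), the following would list only the module names used on plates 1 and 100 through 200:
% DFexport -listmodules 1,100-200 10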
Information about data fields in the specified plates or modules can be requested by additionally including the -listfields option. This option may only be used in conjunction with the -listmodules and -listplates options.
List schema information for data fields of all plates
% DFexport -listfields -listplates all 253
Plate 001: Blood Pressure Screening Visits
Field Name                      Alias                     Type
----- ------------------------- ------------------------- -------
    1 DFSTATUS                  DFstatus1                 Choice
    2 DFVALID                   DFvalid1                  Choice
    3 DFRASTER                  DFraster1                 String
    4 DFSTUDY                   DFstudy1                  Number
    5 DFPLATE                   DFplate1                  Number
    6 DFSEQ                     DFseq1                    Number
    7 ID                        ID001                     Number
    8 PINIT                     PINIT001                  String
    9 AGE                       AGE                       Number
   10 SEX                       SEX                       Choice
   11 RACE                      RACE                      Choice
   12 RACEOTH                   RACEOTH                   String
   13 S1DATE                    S1DATE                    Date
...
This is the beginning of the output - actual output would be much longer as the user has requested this information for all defined plates. Field information is included both for user-defined fields and the fixed DFdiscover fields.
List schema information for data fields of all modules
% DFexport -listfields -listmodules all 253
Module: BloodPressure (ID=5022): Blood Pressure Readings
Field Name                 Alias                Type    Used in Plates
----- -------------------- -------------------- ------- ------------------
    1 DFSTATUS             DFSTATUS             Choice
    2 DFVALID              DFVALID              Choice
    3 DFRASTER             DFRASTER             String
    4 DFSTUDY              DFSTUDY              Number
    5 DFPLATE              DFPLATE              Number
    6 DFSEQ                DFSEQ                Number
    7 DFPID                DFPID                Number
    8 DFMNAME              DFMNAME              String
    9 DFMID                DFMID                Number
   10 DFMREF               DFMREF               Number
   11 SBP1                 SystolicReading1     Number  1,5
   12 SBP2                 SystolicReading2     Number  1,5
   13 SBP3                 SystolicReading3     Number  1
   14 DBP1                 DiastolicReading1    Number  1,5
   15 DBP2                 DiastolicReading2    Number  1,5
   16 DBP3                 DiastolicReading3    Number  1
   17 SDAT                 ReadingDate          Date    1
   18 BPARM                BPwhichArm           Choice  5
...
This is the beginning of the output - actual output would be much longer as the user has requested this information for all defined modules.
The module listing provides the module name, BloodPressure (useful in conjunction with -Mname), the module number, 5022 (useful in conjunction with -Mnum) and the field numbers and names defined in the module, DFSTATUS, ... (useful in conjunction with -Fname or -Fnum).
This feature is unique to DFexport and is not available in DFexport.rpc. DFexport is able to output the history of all changes made to the data for a specific subject, visit or plate. The available information includes when the change was made, who made the change, what was changed, any reason associated with the change and any queries related to the data. The output is to a file, in Excel format, using the -o outfile.xlsx or to a folder with separate files for each plate, module or view using -O folder_name.
To obtain a history listing use -history. The history of changes is specific to a single subject, and includes all data, query and reason changes for all data elements in records identified by the subject ID. The output can be further filtered by visit, -visit #, and plate, -plate # .
DFexport is designed to export history of changes for exactly one subject. To export history for multiple subjects, DFaudittrace is available.
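A sketch of a single-subject history export (the server name, subject ID 1001 and study number 254 are hypothetical), assuming the subject is selected with the -subject option described later, filtered to visit 1 and plate 2, and written to an Excel file:
% DFexport -S server -U username -history -subject 1001 -visit 1 -plate 2 -o history.xlsx 254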
Specifying the data source for the records to export is required. Data can be exported by plate number, using the -Pnum option, or by module, using either the -Mname or -Mnum option. Within any of these data sources, the default behavior is to export all of the defined fields. It is possible to further select the exported fields using one of the -Fname, -Falias or -Fnum options. Field name, alias and number information can be obtained from a previous invocation that includes the -listfields option.
Plate numbers, or all to export all plates. Any plate number or range #~# of plate numbers may be specified. The value is the actual plate number or one of the keywords from this list:
all: export all plates
qc: export all queries
reason: export all reason records
new: export all records awaiting first validation
missed: export all records marked as missed across the entire database
If a plate selector is not given, the data source must be specified by module name or module number.
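For example, a hypothetical invocation (study number 254, server and user names as placeholders) that exports all query records to a file:
% DFexport -S server -U username -Pnum qc -o queries.txt 254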
Module names or module numbers. Export data from the specified modules. Any number of module names (separated by commas) or module numbers (separated by commas or in a range #~#) or all may be specified. Schema listing options -listfields -listmodules may be used to determine the module name or number of interest.
If a module selector is not given, the data source must be specified by plate number.
There are many options available for selecting subsets of data records for export. Several of these options are similar or identical to their counterparts in DFexport.rpc. DFexport typically uses a word to specify an option while DFexport.rpc uses an option letter. In all cases, record selection and filtering is from the set of records allowed by the user's database permissions.
The available selection mechanisms and filters are:
-site #, #-# Site number. This option selects records by site number, site numbers, site number range or any combination thereof. The site number of any data record is derived by dereferencing the subject ID in the study's sites database.
This option may not be combined with -subject #, #-# or -alias string - doing so will cause DFexport to exit with an error message.
-subject #, #-# Subject ID. This option selects records by subject ID, subject IDs, subject ID range or any combination thereof.
This option may not be combined with -site #, #-# or -alias string - doing so will cause DFexport to exit with an error message.
-alias alias,alias,... Subject aliases. Accepts multiple aliases delimited by commas (export data only).
This option may not be combined with -subject #, #-# or -site #, #-# - doing so will cause DFexport to exit with an error message.
-status keyword_list Record status. Select records by status keyword or status numeric value. The recognized status keywords are final, incomplete, pending, missed and all. The default is all statuses, either by using the keyword all or by omitting this selector. When -Pnum includes data and metadata, a range #~# of statuses can be used to apply the status filter to all outputs. A single status keyword or multiple, comma-delimited status keywords can be specified, but they apply only to data records, not metadata.
For backwards compatibility, the legacy status keywords ([clean, dirty, error, CLEAN, DIRTY, ERROR, missed, primary, secondary, all]) are currently supported but will be removed in a future release. Do not rely upon the availability of the legacy status keywords.
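For example, a sketch (study number 254 and server name hypothetical) that exports only final and incomplete records from all plates:
% DFexport -S server -U username -Pnum all -status final,incomplete -o - 254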
Numeric value, record status keyword, query status and reason status equivalence
-plate #, #-# Plate number. Select from module instances, queries, new, missed or reason records by plate number.
It is important to note that this is not the same as the required plate number selector in DFexport.rpc. In that environment, the plate number selects the data file source for the export. In DFexport, this option filters the data source which may be modules, complete data records, queries, reason, new or missed records.
Export all missed records in plates 1 through 10
% DFexport -S server -U username -Pnum missed -plate 1-10 -o - 254
Export all ESIG modules in plates 100 through 110 and 200
% DFexport -S server -U username -Mname ESIG -plate 100-110,200 -o esigmodules.txt 23
The output, from study 23, is written to the file esigmodules.txt in the current directory.
Filter records to export only those that match the specified pattern string. For inclusion in the export, a fragment of the record must exactly match the entire pattern string. The -Pattern option requests a case sensitive match, while -pattern requests a case insensitive match.
Matching occurs before any output formatting (such as variable decoding or string splitting) is applied.
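For example, a hypothetical case-insensitive filter (study number 254, plate 1 as placeholders) that exports only those records containing the string aspirin somewhere in the record:
% DFexport -S server -U username -Pnum 1 -pattern aspirin -o - 254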
Specify one or more, simplified filter criteria. Each filter references fields from one plate and multiple filters may be combined to include fields from other plates. The result of these selection criteria is a set of subject IDs matching all of the criteria (-ALL) or at least one of the criteria (-ANY). That set of subject IDs is then used to further filter the output.
Each criterion must be specified using the notation:
plate#:$(fieldname) op value
where:
plate# is a plate number from the set of plate numbers defined for the study, followed by a : separator,
$(fieldname) is the name of a user data field defined for the plate number,
op is an operator from the set of operators: >, >=, ==, !=, <, <=, and
value is a single value consistent with the data type of the field (string and date values should be enclosed in "" to prevent interpretation by the command environment).
To specify multiple criteria use \| as a delimiter. The -ANY and -ALL options specify the resulting match behavior: with the -ALL option, every criterion must evaluate to true for a match to occur; otherwise, at least one criterion must evaluate to true.
Export eSignature records for eligible, african-american males
The SEX (1=male) and RACE (2=african-american) fields are defined on plate 1, while ELIG (1=yes) is defined on plate 2.
-Mname eSignature -ALL '1:$(SEX) == 1 | 1:$(RACE) == 2 | 2:$(ELIG) == 1'
The fields used in the criteria are, by default, not part of the output, nor are they required to be. In fact, unless the criteria are all from the same source, this is not possible with this option.
Frequently, only a subset of the fields in the data source is of interest. These fields can be selected using one of the -Fnum, -Fname or -Falias options. Special handling is required for data fields that are defined in a module which is then instanced across more than one plate. When exporting multiple plates or modules by specifying a range together with a field selection option, -Fnum works for all plates/modules: the NF, NF-1, NF-2 notations are substituted with the actual field numbers in each plate/module. If using -Fname or -Falias, the field name or alias must be the same on all plates/modules. Using -Fnum is safest.
-Fnum #, #-# Select by field number. From each exported record, select for output only those fields requested by their number. Output fields can be re-ordered by specifying the field numbers in the desired order. Field numbers can be repeated. It is possible to request field numbers using NF notation which selects by field number relative to the last field of the record. Some examples of NF usage are:
2-NF field numbers 2 to the last field inclusive
NF-1 the second last field
5-NF-1 fields 5 to the second last field, inclusive
NF-5-NF-1 fields fifth from the last to the second last field, inclusive
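For example, a sketch (study, plate and field numbers hypothetical) that skips the leading status and key fields and exports fields 7 through the last field of plate 1:
% DFexport -S server -U username -Pnum 1 -Fnum 7-NF -o - 254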
This option may not be combined with -Fname or -Falias - doing so will cause DFexport to exit with an error message.
-Falias fieldname_list, -Fname fieldname_list Select by field name or field alias. From each exported record, select for output only those fields requested by their name or alias. Output fields can be re-ordered by specifying the field names or aliases in the desired order. Field names/aliases can be repeated. This option may not be combined with -Fnum - doing so will cause DFexport to exit with an error message. Similarly, only one of -Falias or -Fname may be specified - any other specification or combination will cause DFexport to exit with an error message.
-joinANY, -joinALL The default behavior of DFexport is to export data from one source, a plate or a module. However special handling is needed in the cases where a module instance is distributed across multiple plates. With the -joinALL and -joinANY options it is possible to combine and export data from a data source that is defined across plates in one invocation and into one output file. The join keys (the "join key triple") are always the subject ID, the sequence number and the module reference instance number and must match across data sources. These keys are always exported as the first three fields in each output record when joining occurs. Multiple data sources are specified using | as a delimiter. The specification for a single data source uses the notation:
plate#[fieldlist]
plate# is a plate number from the set of plate numbers defined for the study,
fieldlist is a list of one or more comma-separated field numbers or names, enclosed with [ and ].
The -joinANY and -joinALL options specify the resulting output behavior: with the -joinALL option, the database must contain a data record for every data source; otherwise, no data is output for that join key triple. Conversely, specifying -joinANY will export data so long as the join key triple appears in at least one data source.
Re-construct the demographics module for output
The DEMOGRAPHICS module is instanced with fields on plates 1 and 2. Export the data, with a specific field ordering and some output formatting.
-Mname DEMOGRAPHICS -joinANY '1[PINIT, SEX] | 2[RACE, RACE:d, RACEOTH] | 1[AGE, DOB:c]'
Concatenation with + By default, each selected field is output with optional formatting (see Output Formatting) and is separated from adjacent selected fields by the delimiter. Alternatively, selected fields can be concatenated, output with no delimiter at all. To specify concatenation, use the special concatenation symbol, +, between fields to concatenate. For example, to concatenate the site identifier and the subject identifier, use
DFSITE_ID+SUBJID
or to concatenate fields 9 and 12, use
9+12
It is also possible to concatenate a range of fields. In this case, concatenation has precedence over field range so,
8+9-12
or
8-11+12
8+9+10+11+12
are all equivalent. Note that there is no notation to concatenate a range of fields by specifying only the minimum and maximum field numbers; at least one of the numbers must be individually specified so that the concatenation operator can be used.
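For example, a hypothetical field list (SUBJID, PINIT and AGE are illustrative field names) that outputs the site ID metadata concatenated with the subject ID field, followed by PINIT and AGE as separate, delimited columns:
-Fname "DFSITE_ID+SUBJID,PINIT,AGE"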
Study metadata is available for export as columns in each output record. Metadata is specified for inclusion in the output by inserting metadata keywords in the list of fields for export. In all cases, the exported value has the string data type.
The available metadata keywords are grouped into categories: field, module, plate, page, visit, image, site and study.
field These metadata keywords reference field definition properties of a data field. For these keywords only, the keyword is appended to a field name to request a specific property of that specific data field. For example, MHENDAT.DFVAR_PROMPT requests the DFVAR_PROMPT property (the field prompt from the study definition) of the MHENDAT data field.
The field properties that are available are:
DFVAR_DESC: field description
DFVAR_TYPE: field type
DFVAR_PROMPT: field prompt
DFVAR_UNITS: field units
DFVAR_COMMENT: field comment
DFVAR_MODNUM: module instance number containing field (Module Instance in setup definition)
DFVAR_MODNAME: module name containing field (Module Name in setup definition)
DFVAR_MODDESC: module description containing field (Module Description in setup definition)
DFVAR_USER#: field custom property value, where # is 1-20. Custom property tags may be used in place of the default name
module These metadata keywords reference properties of the module instance.
DFMODULE_NAME: module name (Module Name in setup definition)
DFMODULE_DESC: module description (Module Description in setup definition)
DFMODULE_USER#: module custom property value, where # is 1-20. Custom property tags may be used in place of the default name
plate Metadata keywords for a plate include DFPLATE_DESC (plate description) and DFPLATE_USER# (plate custom property value, where # is 1-20. Custom property tags may be used in place of the default name).
page These metadata keywords reference page properties of the plate associated with the current data record. There is currently one page property available: DFPAGE_LABEL. The value is the descriptive label of the page. This label is taken from DFpage_map if there is a matching entry for the combination of visit and plate (field number substitution is also performed if requested with the # or #:d notation); otherwise, it is the plate description taken from the study database definition.
visit These metadata keywords reference visit properties of the visit associated with the current data record. The current visit is determined by the visit number of the data record, even if the visit number is not included in the export. The visit properties that are available are: DFVISIT_DATE, DFVISIT_TYPE, DFVISIT_ACRONYM, DFVISIT_LABEL, DFVISIT_DUE, DFVISIT_OVERDUE, DFVISIT_REQUIREDPLATES, DFVISIT_OPTIONALPLATES, DFVISIT_ORDERPLATES and DFVISIT_MISSEDPLATE. The value of each property is detailed in Study Setup User Guide, Visit Map.
image Each data record has an image attribute, which will have a value if there is an image associated with the data record. For those records, the image properties that are available are:
DFIMAGE_ARRIVAL: arrival date/time of the image
DFIMAGE_FIRSTARRIVAL: arrival date/time of the image; if there were multiple images over time, the chronologically first image
DFIMAGE_LASTARRIVAL: arrival date/time of the image; if there were multiple images over time, the chronologically last (most recent) image
DFIMAGE_FORMAT: image format
DFIMAGE_SENDER: sender ID for the document that included this image
DFIMAGE_PAGES: number of pages in the document that included this image
site The site metadata associated with each data record can be determined by connecting the subject ID of the data record with the site record that includes the subject ID in its list of subjects. The site properties that are available are: DFSITE_ID, DFSITE_NAME, DFSITE_CONTACT, DFSITE_ADDRESS, DFSITE_FAX, DFSITE_PHONE, DFSITE_INVESTIGATOR, DFSITE_SUBJECTS, DFSITE_TEST, DFSITE_COUNTRY, DFSITE_BEGINDATE, DFSITE_ENDDATE, DFSITE_ENROLL, DFSITE_PROTOCOL1, DFSITE_PROTOCOLDATE1, DFSITE_PROTOCOL2, DFSITE_PROTOCOLDATE2, DFSITE_PROTOCOL3, DFSITE_PROTOCOLDATE3, DFSITE_PROTOCOL4, DFSITE_PROTOCOLDATE4, DFSITE_PROTOCOL5, DFSITE_PROTOCOLDATE5 and DFSITE_REPLYTO. The value of each property is detailed in Programmer Guide, DFcenters - sites database.
study The study metadata is static for the entire study; it is defined at the study database level and does not change for different data record keys. The study properties that are available are: DFSTUDY_NUMBER (study number), DFSTUDY_NAME (study name) and DFSTUDY_YEAR (4-digit year of study start, defined by the study administrator as a global setting). In addition, DFSTUDY_USER# provides the study custom property value, where # is 1-20. Custom property tags may be used in place of the default name.
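For example, a hypothetical field list mixing data fields with metadata keywords, adding the site ID, site name, study name and the field's prompt property (MHENDAT is the illustrative field name used above) to each exported record:
-Fname "DFSITE_ID,DFSITE_NAME,DFSTUDY_NAME,MHENDAT,MHENDAT.DFVAR_PROMPT"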
Date Modifiers By default, date fields are output "as is" in their defined format. The options -c and -j may be used to cause imputation and to alter/normalize that defined format for all date fields that are exported. The -c (calendar format) option requests that any 2-digit year be replaced with a 4-digit year and imputation be applied. The -j (julian format) option requests that imputation be applied and dates be exported as their equivalent julian value - this is primarily useful for mathematical calculations and comparisons with other date values. On an individual field basis, it is also possible to override the defined format with field level date modifiers. By following the field number, name or alias specification with a :c or :j modifier, it is possible to export the same calendar or julian format result for a single field. The :o modifier requests the "original" date value, and can be useful where -c or -j has been previously specified. DateField and DateField:o produce the same result. Similarly, DateField:c also produces the same result if the field's format already specifies 4-digit years and no imputation is needed.
Calendar and Julian Date Modifiers
For the field named VisitDate, export the value in default, original, calendar and julian formats.
-Fname "VisitDate, VisitDate:o, VisitDate:c, VisitDate:j"
If VisitDate is defined with "imputation to the beginning" and an instance of that field in a data record has the data value 16/01/00, this specification will produce 16/01/00|16/01/00|2016/01/01|2456963.
A field level date modifier cannot be applied to a non-date field nor can it be applied to a range of field numbers, names or aliases; it may only be applied to a single field number, name or alias at a time.
Label Decoding. For any field that is defined with coding, it is possible to output the decoded label or sublabel for a field's value by appending the :d modifier for labels or the :D modifier for sublabels to the field number, name or alias.
Output coded and decoded values
-Falias "sex, sex:d"
2|female
Any coded value that cannot be decoded is output as is. Any coded value without a sublabel will decode using the standard label. The modifier cannot be applied to a field that has no coding nor can it be applied to a range of field numbers, names or aliases; it must be applied to a single field number, name or alias at a time.
Header Record Specifying -h forces the inclusion of a descriptive record, the header, as the first row of output. If -h is specified with -D, the header contains each field's description. Otherwise, if -h is specified with -Fname, the header contains each field alias. If -h is combined with -Fnum, or no field selection option is specified:
the header contains each field alias if the data source is a plate, as specified with -Pnum
the header contains each field name if the data source is a module, as specified with -Mname or -Mnum.
During export it is possible to create one or more new data fields by including options to string split (see String Splitting), extract substrings (see Extracting Sub-Strings) or include fixed, literal values (see Literal, constant values). If -h is specified then these new fields also need names to appear in the header record. This is done with the -H header_list option. This option specifies the variable names to appear in the header record for any new fields. If header_list contains fewer names than there are new fields, the remaining fields are assigned names identical to their original field names.
String Splitting It is possible to split a string field into multiple, shorter string fields by appending a :numxchars{c|w} modifier to a field number, name or alias selection. The modifier includes a specification of the number of fields to split the input string into (num), the literal x as a separator, the maximum width in characters of each output field (chars), and a trailing c or w indicating whether splitting should be done on character (c) boundaries or word (w) boundaries. If word boundaries are used, a word is delimited by any combination of space or tab characters. Word splitting will always split at the word boundary closest to, but less than, the maximum width. If there is no word boundary in the substring, the string will be split at the character boundary.
This example splits the comments field into 5 200-character fields at character boundaries:
-Fname "comments:5x200c"
If a field to be split contains a missing value code, the first output field will contain the same missing value code and the remaining fields will be blank. If a field is split to create additional fields and a header record is requested with -h, field names for the new fields are identical to the original field name unless -H is also specified. If -H header_list is given, the field names for the new fields are assigned in order from the header_list.
Extracting Sub-Strings New fields can be created from substrings while exporting fields. The modifier notation :xS.L indicates that a substring is being extracted (x), starting at character position S (the first position is 1) and having length L characters. The modifier can be applied to any field number, name or alias selection. Substrings can be extracted from any field type: string, date, time or numeric. Substrings are extracted from the string value equivalent of the field. Numeric field values are leading zero-padded to their store width before substring extraction is performed.
NOTE: Choice and check field values are NOT zero-padded and hence substring extraction for those data types may yield unexpected or unreliable results.
Digit extraction from subject ID
In some settings, the subject ID is a concatenation of site ID and a numeric subject identifier. In this example, the leading three digits are the site ID and the trailing four digits are the subject identifier. Given PID values of 50001, 60401 and 1232003, the extraction would yield these results.
-Fname "PID:x1.3,PID:x4.4"
005|0001
006|0401
123|2003
Literal, constant values A constant value may be inserted into the data export by including it in one of the -Fnum, -Fname or -Falias specifications. The value must always be enclosed with single-quotes, as in 'AB', to prevent any confusion with a matching field name or alias. To insert a blank field into the export, it must be represented by 2 adjacent single quotes, specifically ''.
Insert AD into the data export
-Falias "EnrollDate,'AD',SubjInit"
2014/12/12|AD|STN
2013/06/15|AD|PRP
2015/03/12|AD|KKW
The constant value AD is inserted between the EnrollDate and SubjInit fields in every exported record.
When a value is enclosed in single quotes, DFexport ignores the special meaning of any characters that would otherwise be delimiters. For example, 'a,b' is the constant value a,b, not two separate constants. If -h is present, the name used in the header record is the value itself. If -H header_list is also present, the name used in the header record for each constant value will be the next value from header_list, if one is available; otherwise, the value itself will again be used.
Specify a field name for the header record
-h -H "When" -Falias "EnrollDate,'AD',SubjInit"
EnrollDate|When|SubjInit
2014/12/12|AD|STN
2013/06/15|AD|PRP
2015/03/12|AD|KKW
Output Subject Alias If -usealias is present, output the subject alias, if defined, in place of the subject ID. If there is no subject alias, output the subject ID.
Exclude Test Site Data If -X is present, any data for sites that have been marked as Test Only sites in the sites dialog in DFsetup is excluded from the export.
Trailing Field Delimiter For backwards compatibility, exported data records can be terminated by a trailing | if one is not already present. This is specified with the -p option. (This option has no effect when applied to exported query or reason records.) With the trailing delimiter, exported data records maintain the original DFdiscover record format so that they can be, for example, easily re-imported into DFdiscover in a subsequent step.
Missed Record Handling Missed records are included in the export if either -status all or -status missed is specified. To export missed records, there are two output options:
Internal to DFdiscover, missed records are handled in a different structure than other data records (see plt###.dat - missed records and Missed record field descriptions for further information). Exporting them in this structure, when intermingled with plate- or module-based data records, does not result in normalized records - most records have the data structure, while some have the missed record structure. The result is that these records can be difficult to work with post-export.
By specifying the -L lostcode option, DFexport will export missed records by matching the structure of the corresponding plate and inserting lostcode into all of the user-defined data fields. This results in exported records that are normalized and all share the same structure. Post-export, missed records can be identified by the missed status in the status field, or the lostcode in all of the user-defined data fields.
When missed records are exported along with one of the field selection options -Fnum, -Fname, or -Falias, the second output option is always used. If the option -L lostcode has not been specified, the standard DFdiscover missing value code (``) is automatically inserted in each user-defined field of the exported record.
The lostcode specified with -L must be comprised of one or more alphanumeric characters. It is not permissible to use control characters. Also, the DFdiscover delimiter (|) may not be used, unless data is also being exported in CSV format with the -z option. Note that specifying the -L lostcode option will not, by itself, cause missed records to be exported. That is controlled entirely by the -status option. The -L option only indicates that the second, "normalized" output format is desired and is to be used with the specified substitution code.
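For example, a sketch (study number 254, plate 1 and the substitution code 9999 are hypothetical) that exports all plate 1 records, normalizing any missed records by inserting 9999 into each user-defined field:
% DFexport -S server -U username -Pnum 1 -status all -L 9999 -o - 254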
Output File By default, output from DFexport is written to standard output (the command-line output). To write the output to a specific file, specify the filename with -o outfile. The user must have filesystem permission to create or modify the file. If outfile is specified as a relative pathname, the file is created relative to the DFexport command invocation directory. (If DFexport is invoked from a cron facility, absolute pathnames are highly recommended.) It is also possible to be explicit that standard output is the output destination by specifying -o - although this is not necessary.
Output Mode Options -w and -a control the output mode. The output is written (-w) or appended (-a) to the output file. This option is ignored if exported records are written to standard output rather than a file. If the output file does not exist, it is first created. If the output file already exists, it is either overwritten or appended to, depending upon the output mode.
Error Output Re-direct any error messages to a specific file with the -e errorfile option. By default, error messages are written to standard error and hence would get intermixed with data if the data is being exported to standard output.
Excel Format Excel output has a number of options related to its use. By including the -x option, Excel output is produced rather than the default "|" or pipe-delimited format. The following options are useful when Excel output is selected.
-sheet number | name | both controls how sheets are named - by plate or module number, by description, or by both number and description (default).
-toc string - optional first sheet name (default: Table of Contents)
Comma-Separated Values (CSV) Format CSV is a popular format used for sharing data records between different software programs. By including the -z option, DFexport will generate output that is compliant with this format. The unique requirements of the format are:
Exported records, in CSV or traditional (non-CSV) format, are always delimited by a newline character.
If the -h option is also specified, the CSV requirements are applied to the header record and all of the values in the header.
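The general shape of CSV quoting can be sketched with awk: fields containing a comma or a double quote are wrapped in double quotes, with embedded quotes doubled. This is standard RFC 4180 behavior and is shown here only as a generic illustration, not as a reimplementation of the exact rules DFexport applies with -z.

```shell
# Convert pipe-delimited records to CSV, applying RFC 4180-style quoting.
# This is a generic sketch, not a reimplementation of DFexport -z.
out=$(printf '%s\n' \
  '102|2|dose 10,5 mg|t|' \
  '103|2|he said "ok"|t|' |
  awk -F'|' '{
    line = ""
    for (i = 1; i <= NF; i++) {
      f = $i
      if (f ~ /[",]/) { gsub(/"/, "\"\"", f); f = "\"" f "\"" }  # quote and double
      line = line (i > 1 ? "," : "") f
    }
    print line
  }')
echo "$out"
```

Note that the trailing pipe on each input record becomes a trailing empty CSV field, preserving the field count.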
DFexport exits with status 0 if the command was successful. Exit status 1 indicates that an error occurred - errors are written to standard error, or to the error file if -e was specified.
IMPORTANT: Successful command execution can generate no output. This can happen if the selection criteria do not match any available data or match available data which the user does not have permission to view. If output to file is requested, such a command will create a file with no contents and size 0.
In the following examples, only the options unique to the example are specified. For clarity, required options such as authentication credentials, output destination and study number are omitted.
Select and filter using data from two different plates
Export complete data records from plate 6. Filter those records to include only instances where records with matching keys on plate 1 have a date variable, S1DATE, with a value of 01/01/06.
-Pnum 6 -ANY "1:$(S1DATE) == 01/01/06"
Module-based export
The structure of data exported from a module varies dependent upon whether all of the fields are instanced on a single plate, or on multiple plates.
Consider a study containing a module, BASE, that collects baseline fields HEIGHT and WEIGHT on one plate, and SYSTOLIC and DIASTOLIC on a second plate, so that the data is spread across two records per subject as in:
-Mname BASE
73|180||
||120|70
65|145||
||110|65
To generate a single, joined record per subject requires a little more work, specifically:
-Mname BASE -joinALL "1[HEIGHT,WEIGHT] | 2[SYSTOLIC,DIASTOLIC]"
73|180|120|70
65|145|110|65
Export eSignature records with a specific name
The eSignature name is stored in the eSignature module. Export records signed by 'Jack'. Several specifications are possible as illustrated here.
-Mname eSignature -expr '$(SigName) == "Jack"'
This is the most specific filter - the signature name field must contain exactly "Jack" and nothing else.
-Mname eSignature -expr 'tolower($(SigName)) ~ "jack"'
This is a less specific filter - the signature name field can contain a case insensitive variation of "Jack" somewhere in the value. Note that this also selects values like "Jackson", "Alex Jack" and "Wojack".
-Mname eSignature -pattern "jack"
This is the least specific filter and selects each eSignature module record that contains a case insensitive match for "jack" anywhere in the record.
DFfaxq [#, #-#] [username...]
DFfaxq reports on the status of requested transmissions, either by their fax IDs, or by all transmissions owned by the user(s) specified. When invoked with no arguments, DFfaxq reports on all transmissions in the queue. For each outgoing transmission, DFfaxq reports the unique fax ID number (by which it is referred to when using DFfaxrm), the user name of the transmission owner, the scheduled time of sending, the file to be sent, the current status of the entry in the queue, the number of attempts at sending, and the phone number or email address to be sent to.
In the case of a transmission that is being sent to multiple recipients, the fax ID, owner, scheduled time, and file to be sent, are all displayed on the first line together with the status, attempt, and email address/phone number of the first recipient. For the second and subsequent recipients, only the status, attempt and email address/phone number are displayed. The possible status values are:
New: in queue and waiting for scheduled time to arrive
Sending: in process of being sent to recipient
Retry: in queue as previous tries have failed and waiting for retry delay to expire
Sent: successfully sent to recipient
Failed: failed and maximum number of retries was exceeded
If the recipient is an email address, the only possible status values are Sending or Sent. It is not possible to reliably track the delivery status of an email message and so the other statuses cannot be reported.
DFfaxq displays the queue information sorted by increasing fax ID#.
DFfaxq exits with one of the following statuses:
Scheduled but not yet sent transmission
% DFfaxq
FaxId Owner Scheduled Time   File       Status Try To
307   usr1  Nov 30 16:18:23  /etc/fstab New    0   1234567
In-progress transmission to multiple recipients
Here is a queue entry for an in-progress transmission to multiple recipients:
% DFfaxq
FaxId Owner Scheduled Time   File       Status Try To
202   root  Nov 30 16:18:23  /etc/magic New    0   5432198
                                        Sent   1   abc@hotmail.com
                                        Retry  1   9876543
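Because the last three tokens of every entry line are always Status, Try and To, DFfaxq output is easy to post-process. The following is a sketch, assuming the column layout shown in the examples, that flags entries still needing attention (Retry or Failed); the sample text stands in for real DFfaxq output.

```shell
# Flag queue entries with Retry or Failed status. The last three whitespace-
# separated tokens of each entry line are Status, Try and To.
out=$(printf '%s\n' \
  'FaxId Owner Scheduled Time   File       Status Try To' \
  '202   root  Nov 30 16:18:23  /etc/magic New    0   5432198' \
  '                                        Sent   1   abc@hotmail.com' \
  '                                        Retry  1   9876543' |
  awk 'NR > 1 && ($(NF-2) == "Retry" || $(NF-2) == "Failed") { print $(NF-2), $NF }')
echo "$out"
```

In practice the sample text would be replaced by the DFfaxq invocation itself, piped into awk.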
DFfaxrm { [-] | [FaxId#...] | [-a] }
DFfaxrm removes transmissions from the outgoing queue, that is, those files that have been scheduled for sending but have not yet been sent.
A specific transmission can be removed by supplying its faxid#, which can be obtained using DFfaxq.
Removing a transmission by its faxid
% DFfaxq
FaxId Owner Scheduled Time   File       Status Try To
307   usr1  Nov 30 16:18:23  /etc/fstab New    0   1234567
308   usr1  Nov 30 16:18:23  /etc/fstab New    0   555-1212
% DFfaxrm 307
DFfaxrm reports the names of any transmissions it removes, and is silent if there are no applicable transmissions to remove.
When DFfaxrm removes a queue entry, transmissions that are scheduled to be sent (status New) or retried (status Retry) in the future will not be sent or retried. However, any entries that have a status Sending are currently being transmitted and cannot be aborted. These files may be transmitted successfully, but any success command arguments specified to DFsendfax will not be executed. If a Sending status transmission fails, it will not be retried if it has been removed.
Exactly one of the options is required. It is an error to mix these options, or to supply no option at all.
DFfaxrm exits with one of the following statuses:
% DFfaxq
FaxId Owner Scheduled Time   File       Status Try To
314   usr1  Dec 02 12:04:07  /etc/magic New    0   1234567
                                        New    0   9876543
% DFfaxrm 314
STATUS  FaxId Owner File       Status Try To
DELETED 314   usr1  /etc/magic New    0   1234567
                               New    0   9876543
DFget [-w] [-i delim] [-o delim] [-f #] {#, #-#}
DFget reads records from the standard input and writes records to the standard output. Each input record is assumed to contain one or more fields delimited by the input delimiter. For each input record DFget applies the field specification string to create a new output record that is written to the standard output.
The field specification is constructed from one or more field specifiers. Each field specifier can be a single field number, a range of field numbers or a string.
One or more constant string fields can be inserted in output records. String fields can contain numbers and characters, and can also contain blank spaces. If the constant string field is a number (or begins with a number) it must be enclosed in a pair of single-double quotes (e.g. '"55"') or double-single quotes (e.g. "'55'"). This prevents DFget from interpreting those numeric strings as field specifiers.
Only those fields included in the field specification will be written to the new output record. Fields in the output record are delimited by the output delimiter and are written in the order given in the field specification.
The field number NF can be used to reference the last field in a record, and NF-n can be used to reference the nth from last field in a record.
DFget is a convenient way to extract fields from exported database records. Remember to first export the desired data files using DFexport.rpc.
Combining the tools DFexport.rpc, DFget, and DFimport.rpc is a powerful way to manipulate data.
DFget exits with one of the following statuses:
Retrieving specified fields from input records
Consider the following two line input file:
102|2|2|0101|gh|2|mellaril|t|100|1|90/03/01|0|
102|2|9|0101|bc|2|noctec|c|500|1|90/07/12|0|
To retrieve fields 1, 4 through 7, a constant string and the last 2 fields from each record, with a colon as the output delimiter, DFget could be used as follows:
% DFget -i '|' -o ':' 1 4~7 "'12 gm'" NF-1 NF < infile
102:0101:gh:2:mellaril:12 gm:90/03/01:0
102:0101:bc:2:noctec:12 gm:90/07/12:0
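The same extraction can be reproduced with standard awk; this is only a sketch of the equivalent logic (DFget remains the supported tool). One subtlety: awk counts the empty field created by the trailing delimiter, so the sketch drops it before computing NF-relative fields.

```shell
# awk equivalent of: DFget -i '|' -o ':' 1 4~7 "'12 gm'" NF-1 NF
out=$(printf '%s\n' \
  '102|2|2|0101|gh|2|mellaril|t|100|1|90/03/01|0|' \
  '102|2|9|0101|bc|2|noctec|c|500|1|90/07/12|0|' |
  awk -F'|' -v OFS=':' '{
    nf = NF
    if ($nf == "") nf--       # ignore the empty field after the trailing "|"
    print $1, $4, $5, $6, $7, "12 gm", $(nf-1), $nf
  }')
echo "$out"
```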
DFgetparam.rpc [-n] [-s #] {param}
DFgetparam.rpc is the recommended way of determining the value of the master daemon's or a study server's configuration parameter. The current value of the parameter is printed on the standard output without a trailing newline character, unless -n is given. This is a format suitable for command line substitution by the shell.
When DFgetparam.rpc is running as user datafax, if the environment variable DFUSER is set, DFgetparam.rpc runs as that user. Thus, for example, requesting USER_PERMISSION will return the permissions associated with DFUSER, not the permissions for user datafax. This is important in DFexplore for shell scripts run by dfexecute in edit checks, and in study reports run under the Reports View.
The following configuration parameters can be retrieved from the master configuration:
length,complexity,expiry,reuse,lockout,failures,email,reset
The following parameters can be retrieved from a study configuration:
CENTERS
The restrictions, if any, on which records the current user can access. The output is:
unrestricted
DFgetparam.rpc exits with one of the following statuses:
Determine the working directory for study 123
In this case, the value is assigned to a new shell variable, work.
% work=`DFgetparam.rpc -s 123 WORKING_DIR`
Echo the value of the default printer for the master daemon
% echo `DFgetparam.rpc PRINTER`
Determine the access permissions for study 253 for the current user
% DFgetparam.rpc -s 253 USER_PERMISSION
1-10|||||||0,1||||||1-44,46-90||
In this example, 3 access permission rules apply to the user:
1-10|||||: all records for sites 1 through 10, inclusive
||0,1|||: all records for visits 0 or 1
||2|1-44,46-90||: all records for plates 1 through 44 and 46 through 90, inclusive, at visit 2
The user will be granted access to data records that meet at least one of these permission rules.
DFhostid
DFhostid prints the unique DFdiscover host identifier of the system on the standard output. The output is a 20-character/digit identifier displayed in 5 blocks of 4. Note that the identifier will never contain any of 0 (zero), 1 (one), I (capital i), or O (capital o).
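Because the identifier alphabet excludes 0, 1, I and O, a script can sanity-check a transcribed identifier with a simple pattern match. A sketch, using the identifier from the example in this section:

```shell
# Verify that a transcribed host identifier contains none of the
# forbidden characters 0 (zero), 1 (one), I (capital i), O (capital o).
id='SG8D-QM2L-V8TK-PFB5-DDEG'
case "$id" in
  *[01IO]*) out="contains a forbidden character" ;;
  *)        out="ok" ;;
esac
echo "$out"
```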
The host identifier of the DFdiscover master machine is required for licensing purposes.
None.
DFhostid always exits with status 0, indicating that the command was successful.
Determine the DFdiscover host identifier
% DFhostid
SG8D-QM2L-V8TK-PFB5-DDEG
DFimageio [-hd] [-f string] {-s #} {imageID}
DFimageio fetches a specific, single CRF image from the requested study database and writes it in its native format to a file, or standard output.
The study number and the unique CRF image identifier are required arguments. The CRF image identifier must be one of:
a database image identifier, written in the YYWW/SSSSPPP notation,
a DFsetup background identifier, written using the notation $plt###.png where ### is the plate number, or
NOTE: The $ character may need to be escaped with a backslash to avoid interpretation as a shell variable.
a DFexplore background identifier, written using the notation $DFbkgd###.png where ### is the plate number
NOTE: Combining -hd with $plt###.png or $DFbkgd###.png will silently ignore the -hd argument.
Study CRF images are stored in the study file system and hence can also be accessed using standard shell commands. However, shell commands will fail to access the CRF image if:
the CRF image resides on a file system which is not visible to the current computer
filesystem permissions prevent the file from being read
the CRF image was previously deleted when the database record which it references was deleted
DFimageio is the correct, accepted method for accessing CRF image contents from the study database. Future releases of DFdiscover will further tighten the permissions on the CRF image files so that DFimageio will be the only method for retrieving them reliably.
DFimageio exits with one of the following statuses:
Get and write a requested CRF image to a file
% DFimageio -f image1 -s 254 1525/0008001
The CRF image with the unique identifier 1525/0008001 is written to the file named image1 in the current directory.
Get and re-direct a requested CRF image to a file
% DFimageio -s 254 1525/0008001 > image1
The CRF image with the unique identifier 1525/0008001 is written to the file named image1 in the current directory. The result is identical to that in the previous example but uses shell re-direction to achieve it.
Get and write a DFsetup background image to a file
% DFimageio -f image1 -s 254 \$plt001.png
The DFsetup background image with the unique identifier plt001 is written to the file named image1 in the current directory.
DFimport.rpc [-v] [-N] [-R] [ [-n] | [-q] | [-c] ] { [-a] | [-m] | [-r] } [-s] {#} {string}
DFimport.rpc imports records from the input file into a DFdiscover study database. To use DFimport.rpc users must have permission to import data, defined by the study administrator in role permissions. To import new records (using the -a or -m options) users must have create permission and to replace existing records (using the -r or -m options) users must have modify permission. Remember that create and modify permission may be specified by level, site, subject, assessment and plate. DFimport.rpc will reject any records it cannot import with an error message indicating that the user does not have the necessary permissions.
When reading records from the input file, each record must be delimited by a newline ('\n') character. Additionally, each record can be at most 4095 characters in length. If 4095 characters are read without encountering a newline, the record is truncated at 4095 characters. It will subsequently fail record checking and be rejected.
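A pre-flight check for over-long records avoids silent truncation and rejection. The sketch below generates a deliberately over-long line to stand in for a bad input record; the key-field layout in the sample record is illustrative.

```shell
# Flag any input record longer than the 4095-character limit before import.
long=$(printf 'x%.0s' $(seq 1 5000))     # a 5000-character record
out=$(printf '%s\n' '0|1|0000/0000000|254|6|0|1001|' "$long" |
  awk 'length($0) > 4095 {
    printf "line %d: %d chars, would be truncated\n", NR, length($0)
  }')
echo "$out"
```

Records flagged this way should be corrected at the source before running DFimport.rpc.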
DFimport.rpc may be used in a shell script stored in the study reports directory and run in the DFexplore Reports View, or in a shell script stored in the study ecbin directory and run by dfexecute in an edit check. In these cases the permissions of the DFexplore user running the report or edit check will be applied. Thus the behavior and output may differ depending on the permissions of the user who runs the script.
If the same script is run from the UNIX shell the permissions associated with the UNIX login name apply. Thus, if a user has both UNIX and DFdiscover login accounts, and these accounts have different permissions, the user may get different results when they run the script in the UNIX and DFexplore environments.
DFexplore uses the environment variable DFUSER to implement user permissions. It is possible to set and export this variable in the shell script to fix the permissions to those of any specified user, thus rendering the results constant for all users despite their specific DFexplore permissions.
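The steps above amount to setting and exporting DFUSER at the top of the script. A minimal sketch, where "auditor1" is a hypothetical DFdiscover login name:

```shell
# Pin the script's permissions to one account, regardless of who runs it.
# "auditor1" is a hypothetical DFdiscover login name.
DFUSER=auditor1
export DFUSER
# ... any DFdiscover commands below now run with auditor1's permissions ...
echo "permissions fixed to: $DFUSER"
```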
Input records can be new data records that need to be reviewed in Image View, new or previously entered data records for direct entry to the study database, or metadata records (data queries or reasons). It is not possible to mix these different record types in a single import operation. The type of records being imported is specified using one of the following options:
-n : new records to be added to the new image queue
-q : queries for specified data fields
-c : reasons for specified data fields
If none of these options is specified, the records must be new or replacement data records to be imported into the study database according to the plate number specified in the fifth field of each data record.
In addition to the record type options, there are 3 import mode options (add, replace and merge) to specify whether records are new and being added to the database (-a), replacement records (-r), or a combination requiring a merge operation during which new records are added to the database and existing records are replaced (-m).
To locate existing records in the database, DFimport.rpc searches for a record with matching keys (id, visit, plate) and image ID. This always uniquely identifies a record, if one exists, regardless of whether the record has primary or secondary status. Dependent upon the record's presence in the database, the requested import mode will result in either an 'Add' or 'Replace' operation as shown in Deriving operations from import mode.
Deriving operations from import mode
If the operation is not in error, then the primary or secondary status of the import record is considered subject to the following rules:
Add operation
Replace operation
Imported records are verified against the study data dictionary if -v is given. Without this option, imported records are only verified for the correct number of fields.
IMPORTANT: Records must be correctly formatted for the DFdiscover plate to which they will be imported. The standard field delimiter, |, must appear between data fields. The first 7 fields must include valid values for: status, level, image ID, study, plate, visit and subject ID respectively, and the last 3 fields must have valid values for: status, creation and modification. The only exception is for data records being imported to the new image queue (using the -n option), for which the subject ID and visit keys may be blank.
The third field, image ID, must be unique for all data records in the study database. This field contains the image identifier if the record has an associated image file, or a raw record identifier if there is no image. When importing new records that have no image (e.g. lab data from an external source) this field should contain a placeholder image ID, 0000/0000000. Specifying -R instructs DFimport.rpc to generate a new raw record identifier and insert it in the record, replacing the placeholder.
When importing metadata records (queries and reasons) the image ID field must also contain 0000/0000000, but this value is not converted to a unique identifier by DFimport.rpc (0000/0000000 is valid in metadata records).
A trailing delimiter is required at the end of all data records but must not be present on metadata records, i.e. queries (plate 511) and reasons (plate 510). Thus, care should be taken not to use the -p option (which ensures there is a trailing pipe on exported records) with DFexport.rpc when exporting queries and reasons, if the intention is to re-import them into the database.
If the validation level of an imported record is higher than the user's maximum write level, it is adjusted down to the maximum, a warning message is generated, and the record import proceeds.
If an input record has a # or a newline in the first column position, DFimport.rpc treats the entire record as a comment and skips over it.
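Since comment and blank lines are skipped, a pre-import record count should ignore them too. A sketch, with illustrative sample records:

```shell
# Count only importable records: skip lines with "#" in column 1 and blank
# lines, mirroring what DFimport.rpc itself skips.
out=$(printf '%s\n' \
  '# header comment' \
  '0|1|0000/0000000|254|6|0|1001|120|70|' \
  '' \
  '0|1|0000/0000000|254|6|1|1001|118|68|' |
  awk '!/^#/ && NF { n++ } END { print n + 0 }')
echo "importable records: $out"
```

Comparing this count against the total imported reported by DFimport.rpc is a quick consistency check.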
If DFimport.rpc completes without error, it echoes the number of records imported and the number of warning messages and exits with status 0. If DFimport.rpc encounters errors, which are unrelated to the records being imported, it exits with a message to stderr and a positive exit status. If DFimport.rpc encounters errors with the records it is importing, it writes each offending record to stdout, a problem description to stderr, and increments a counter of failed records. Upon exit, DFimport.rpc sets its exit status to the failed record counter. If all records are successfully imported, the exit status is 0.
NOTE:
- DFimport.rpc does not complete successfully if the study server has been disabled or put into read-only access mode.
- If the study server is in exclusive access mode, only the user who put it into this mode can import data.
- If the study is in restricted access mode, only study and DFdiscover administrators are able to import data.
Imported record type: if none of these options is specified the input file must contain data records to be imported into the study database according to the plate number field in each data record. Specifying:
-n, New: add records to the new record queue, DFin.dat. When importing to DFin.dat, the import mode must be -a or -r.
-q, queries: add records to the Query database, DFqc.dat. When importing to DFqc.dat, the import mode must be -a or -r.
-c, Reason for change notes: add records to the reasons database, DFreason.dat. When importing to DFreason.dat, the import mode must be -a or -r.
Import mode (required) specifies one of three possible actions:
-a, Add: fail if a record with matching keys and image ID already exists in database.
-m, Merge: if a primary record with matching keys exists in the database, replace it if the image ID is the same, make it secondary if the image ID is different, otherwise proceed as if adding.
-r, Replace: fail if a record with matching keys and image ID does not exist in the database, otherwise replace it.
Skip plate arrival triggers. This option can only be specified in conjunction with the -n (new record) import option. By default, plate arrival triggers (defined in Plate View in DFsetup) are executed when data records are imported to the new record queue. The combination of -n -s options allows new records to be imported without executing plate arrival triggers.
Plate arrival triggers are only executed if the record import is successful. If a trigger fails, a warning message is written to the DFdiscover error log.
DFimport.rpc exits with one of the following statuses:
DFimport.rpc attempts to enforce database integrity as best it can. As a result, the following constraints are enforced:
DFimport.rpc does not allow the import of type 21 (missing page), 22 (overdue visit) and 23 (edit check missing page) queries for which there is a primary record (missing, final, incomplete) in the database. Attempts to do this will result in a message that the record already exists.
DFimport.rpc will delete any corresponding type 21 (missing page), 22 (overdue visit) and 23 (edit check missing page) queries when a primary record (missing, final, incomplete) is imported into the database.
DFimport.rpc does not allow the import of new query records with category codes that are not defined in the Query Category Map in DFsetup. However, it does allow modification or deletion of records with undefined category codes.
DFimport.rpc can be used to import data records that contain fields from an e-signature module but each of the e-signature fields will be blanked during the import. It is not possible to programmatically add/replace data records that have pre-filled e-signature fields. Such records must be reviewed in DFexplore after the import so that they can be e-signed.
DFimport.rpc does not allow the import of queries for which the primary record is locked. Attempts to do this will result in a message that the record is currently in use.
DFimport.rpc does not allow the import of data values for choice and check fields that do not exactly match one of the codes defined for those fields in the study schema.
DFimport.rpc will not import secondary records with a raw or zeros image ID (e.g. 0922R00023001 or 0000/0000000). The only supported function for secondary records is to reference additional images attached to the primary data record. Thus secondary records must have an image ID that points to an image file.
However, there are situations where a knowledgeable user may want to import records out of order, for example, so that referential integrity is initially violated, and then subsequently restored. The following operations are permitted as they allow a knowledgeable user to fully utilize DFimport.rpc.
DFimport.rpc allows the import of queries for which there is no primary record in the database. This is to allow an out-of-sequence import of queries followed by their data records in a subsequent import. The correct sequence of steps is to first import the primary records followed by the queries.
DFimport.rpc allows the primary record to be imported with a delete status, effectively deleting the primary record and all secondary records, queries and reasons associated with it. If the primary record is deleted, its secondary records and metadata will also be deleted to avoid having orphaned secondary, query and reason records in the database. During deletion, the record currently in the database is compared with the record being imported. All fields (other than status) must match for the operation to succeed. The number of fields can be different than the current definition for a given plate. This allows for the deletion of possibly corrupt records with field counts that differ from the current plate definition by exporting the record, changing its status to delete, and importing the record in question.
DFimport.rpc allows query and reason records to be imported with a delete status, effectively deleting the queries and reasons. All fields (other than status and validation level) must match for the operation to succeed. The deletion of queries and reasons does not delete the primary record to which they are attached.
DFlistplates.rpc [-n] [-p #, #-#] {-s #}
DFlistplates.rpc creates a list of the plate numbers defined for the study and writes that list to the standard output. The list is sorted in increasing numeric order. Each plate number is zero-padded to three digits and is delimited from the next plate number by a single space character. The complete list of plate numbers is output on one line without a trailing newline, unless the -n argument is given.
Both the plate numbers for user-defined plates and 501 (the plate number reserved for returned Query Reports) are output by DFlistplates.rpc. However, the special plate numbers, 0, which is reserved for new records in DFin.dat, 510, which is reserved for reason for data change records, and 511, which is reserved for records in the query database DFqc.dat, are not included in the list.
With -p #, #-#, the list of known plates for the study is intersected with the argument list to create an output list. If this option is not specified, the program lists all of the defined plate numbers for the study.
DFlistplates.rpc exits with one of the following statuses:
List the plates defined for study 123 to the standard output
% DFlistplates.rpc -s 123
001 002 003 004 005 006 007 008 009 010 020 501%
Notice how the prompt has been affected by the lack of a trailing newline.
List all plate numbers, one per line, for study 254 - Bourne shell method
% for p in `DFlistplates.rpc -s 254`
? do
? echo $p
? done
001
002
003
004
005
006
007
008
009
010
020
501
DFlogger [-t string] [-f string] {-s #}
DFlogger reads the standard input, or the file specified with -f string, treating each input line as an error message to be forwarded to syslog.
DFlogger should be used, in a pipeline fashion, at the end of commands that might generate error messages that you want to record in the system error log. DFlogger sends each input line to syslog and counts how many lines were written. The number of lines is used as the exit status of DFlogger. This has the benefit that, if DFlogger is used at the end of a pipeline, the exit status of the pipelined command will always be the number of error messages generated.
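The exit-status convention can be illustrated with a stand-in function; fake_logger below is not DFlogger, it simply mimics the documented behavior of counting forwarded lines and using the count as the exit status.

```shell
# Stand-in illustrating the DFlogger convention: forward each input line,
# then exit with the number of lines seen.
fake_logger() {
  n=0
  while IFS= read -r line; do
    echo "LOG: $line"          # a real DFlogger would send this to syslog
    n=$((n + 1))
  done
  return "$n"
}

printf 'first error\nsecond error\n' | fake_logger
status=$?
echo "error count: $status"
```

When such a command ends a pipeline, the pipeline's exit status is the number of error messages, exactly as described above.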
DFlogger exits with one of the following statuses:
Capture any error output from DFtiff2ras to the system error log
% DFtiff2ras /tmp/TIFFfile /tmp/rasterfile_ 2>&1 | \
DFlogger -t DFtiff2ras
Both the standard output and standard error are directed to DFlogger (2>&1 is Bourne shell syntax; the C-shell equivalent is |&).
DFpass {[-add] | [-replace] | [-remove]} {servername:username}
Several command-line programs, namely DFattach, DFbatch, DFexport, DFpdfpkg, DFreport and DFuserPerms, connect to study databases and require valid database credentials for the database connection to succeed. Each of these programs already permits the use of command-line options -S, -U and -C for specifying the needed servername, username and password credentials. Similarly, the DFSERVER, DFUSER and DFPASSWD environment variables may be assigned values and used.
DFpass is a convenience program that allows a user to locally store the password part of these credentials in a secure manner. Use of DFpass is not required but it is recommended for command-line users, and is strongly encouraged for users that write shell scripts and/or schedule program execution with facilities like UNIX cron.
DFpass manages credentials for the current user by keeping a local, user-specific database file of servername, username and password triples. Each record in the database is an encrypted representation of the credentials for one unique combination of servername, username and password. With DFpass one can add new records, update existing records with a new password and remove records.
Use of DFpass requires command-line specification of the action to be taken (one of: add, replace or remove) and the servername and username to which the action applies:
if the -add action is specified, the combination of servername and username must not match a previously added entry that is already being managed locally by DFpass. The user is prompted to enter their password. DFpass obscures the password as it is typed in and requires the user to confirm by entering the password again. If the passwords match exactly, the password is accepted and saved locally for the user.
if the -replace action is specified, the combination of servername and username must match a previously added entry that is already being managed locally by DFpass. The user is prompted to enter their new password. DFpass obscures the password as it is typed in and requires the user to confirm by entering the password again. If the passwords match exactly, the password is accepted and saved locally for the user.
if the -remove action is specified, the combination of servername and username must match a previously added entry that is already being managed locally by DFpass.
In all cases, DFpass prints a message confirming that the requested action was completed, or an error message if the action could not be completed.
IMPORTANT: DFpass does not confirm that the supplied servername, username and password combination is valid. This is the responsibility of the user.
DFpass is not a replacement for the existing DFdiscover tools for managing user credentials, nor does it offer the same functionality. User credentials must still be created and managed within DFdiscover using standard methods. DFpass simply allows you to write and read those credentials locally in a way that does not expose passwords as clear text.
Adding entries with DFpass does not add credentials for the user to DFdiscover. It is vitally important that the entries made with DFpass match existing DFdiscover credentials, otherwise those entries are of no value. Users must also be aware that updating a password in DFdiscover does not update the local information managed by DFpass; this must be done separately with the -replace action.
DFpass exits with one of the following statuses:
Add credentials
% DFpass -add testserver:testuser Password: xxxxxxx testserver:testuser added
Remove credentials
% DFpass -remove testserver:testuser testserver:testuser removed
DFpdf [ [-c password] | [-C] ] [-d] [-i string] [-o string] [-f string] [-v string] [-g string] [-s string] [-t string] [-j string] [-k string] [-n] [-b string [-a string] [-m #] ]{#}
DFpdf generates an optionally password-protected PDF document containing one or more pages, where each page is a study CRF, bookmarked within the complete PDF. The resulting PDF document follows the Adobe PDF specification, version 1.4.
If a password is provided, the entire file is encrypted and stored in binary format. Otherwise, the file is stored in plain text format.
DFpdf requires a study number argument and an input file. The input file, which must be in either DFdiscover data export or DRF format, lists the images that are to be included in the PDF output document. Sort order in the output matches the order that records appear in the input file. If the -s sortmap option is specified, sort order within subject will follow the rules specified by the contents of sortmap.
Bookmarks are created at the document root, subject, visit, and plate levels. The label for the document root bookmark is ID, Visit, Plate but can be overridden through the use of -t. A new bookmark is created as each new ID is encountered in the input and the label is of the form ID #####. A visit bookmark is created for each new visit number within the current ID. The label is created from field 3 of the study visit map (see DFvisit_map field descriptions) if the visit is defined in the visit map; otherwise, it has the form Visit #####. Finally, a plate bookmark is created for each CRF in the input file. The label is created from the matching page map entry, if it is defined; from field 2 of the study DFfile_map if the plate is defined (which it should always be); or, it has the form Plate #####. For secondary records, an asterisk (*) is appended to the end of the page bookmark label. Hence it is easy to scan the list of expanded bookmarks and identify those records that are secondaries.
Using the -j and -k options, custom page header and footer labels, respectively, can be created for each CRF page in the output document. If no labels are specified, the default header label is ID %DFID : %DFPMLBL and there is no footer label. The desired label is specified in a string and may contain any combination of fixed text and zero or more of the following keywords, which are replaced at output creation time with the matching database value:
NOTE: If the string contains characters that are meaningful to the shell, such as space, quote, percent, etc., then the entire string must be enclosed in double quotes. The string can always be enclosed in double quotes, to ensure that no characters are interpreted by the shell.
%DFSTATUS - record status (the status label is returned, rather than the status number)
%DFVALID - record validation level
%DFRASTER - image ID
%DFSTUDY - DFdiscover study number
%DFPLATE - plate number
%DFSEQ - visit or sequence number
%DFID - subject ID
%DFCREATE - record creation time stamp
%DFMODIFY - record modification time stamp
%DFPMLBL - page map label for the current keys as defined in DFpage_map
%DFVLBL - visit label for the current keys as defined in DFvisit_map
%DFPLBL - plate label for the current keys as defined in DFschema
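As an illustration of how these keywords combine with fixed text, the sketch below performs the substitution on a label template. The keyword names come from the list above; expand_label and the replacement logic are assumptions for illustration, not DFpdf's actual implementation.

```python
# Illustrative %DF keyword substitution for -j / -k label templates.
# expand_label is a hypothetical helper, not part of DFpdf.

def expand_label(template: str, record: dict) -> str:
    """Replace %DF keywords in a label template with record values."""
    # Longest keyword names first guards against one keyword being
    # a prefix of another.
    for key in sorted(record, key=len, reverse=True):
        template = template.replace("%" + key, str(record[key]))
    return template

record = {"DFID": "1002", "DFRASTER": "0056/12345678"}
print(expand_label("Subject %DFID, Image %DFRASTER", record))
# Subject 1002, Image 0056/12345678
```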
A warning message is written to stderr for each record in the input file that does not reference an image file. No output is written to the PDF file for such records.
For records exported from the new image queue, bookmarking will only be as good as the accuracy of the ICR on the record keys.
It is possible to override the default labeling for study visits and plates by providing alternative visit map, file map, and page map files with the -v, -f, and -g options respectively.
NOTE: The output PDF document contains copies of the CRF images and hence each file can be quite large. The images are compressed so that the space requirements are not as onerous as for the source CRF images, but still, an input file listing 500 CRF images will create a PDF document that is approximately 5MB in size.
DFpdf exits with a status 0 if a PDF output file was successfully created; otherwise, a non-zero exit status is returned.
If -c is specified, the command line password is used. If -C is specified, DFpdf reads the password from password file ~/.dfpdfpasswd.
The password file ~/.dfpdfpasswd may contain multiple lines, but only the first non-empty line is used. Leading and trailing spaces are removed. The maximum password length is 32 characters.
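The password file rule described above can be sketched as follows; read_pdf_password is a hypothetical helper illustrating the stated behavior (first non-empty line, surrounding spaces stripped, 32-character maximum), not DFpdf's implementation.

```python
# Sketch of the stated ~/.dfpdfpasswd rule: use the first non-empty
# line, strip surrounding spaces, reject passwords over 32 characters.
# read_pdf_password is a hypothetical helper, not DFpdf's implementation.

def read_pdf_password(lines):
    for line in lines:
        password = line.strip()
        if password:                       # first non-empty line wins
            if len(password) > 32:
                raise ValueError("password exceeds 32 characters")
            return password
    raise ValueError("no password found in file")

print(read_pdf_password(["", "   s3cret  ", "ignored"]))
# s3cret
```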
-d
FILE_MAP
VISIT_MAP
Allows the specification of fields to be blinded (blanked out) in the output PDF file. The string is specified in the following format:
plate:field,field,field;plate:field,field
where plate is a valid plate number and field is a comma-delimited list of field numbers that are to be blinded on that plate (ranges of field numbers are specified by enumerating each field in a comma-delimited list). Field numbers must use DFdiscover schema numbering.
Field numbers that are not valid for the plate are skipped. Where multiple plates are required, plate specifications are separated by semi-colons.
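As an illustration of this syntax, the following sketch parses a blinding specification into a plate-to-fields mapping. The parse_blindlist function is hypothetical; DFpdf's own parser is not shown here.

```python
# Hypothetical parser for the blinding specification syntax
# "plate:field,field;plate:field,field" described above.

def parse_blindlist(spec: str) -> dict:
    """Return {plate_number: [field_numbers]} from a blind_spec string."""
    blinded = {}
    for plate_spec in spec.split(";"):
        plate, _, fields = plate_spec.partition(":")
        blinded[int(plate)] = [int(f) for f in fields.split(",")]
    return blinded

print(parse_blindlist("1:3,4,5;7:2,9"))
# {1: [3, 4, 5], 7: [2, 9]}
```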
To implement the blindlist, DFpdf must read the study setup file. The study setup file may be provided with -a string. If the setup file is not provided, the study server will be contacted to locate and supply the setup file.
-m # can be used to pad the blinded area by the specified number of pixels. The default margin is 0 pixels, in which case the blinded areas are drawn with exactly the dimensions specified by the study setup. Legal values for margin must be > 0 and reasonable values are less than 10. For a margin greater than 0, the margin is added to each side of the setup dimensions and these new dimensions are used to blind the requested field.
DFpdf exits with one of the following statuses:
Use DFpdf without the use of the DFexplore Save As PDF option
This example involves creating a DRF using Save Data Retrieval File from the File menu in DFexplore, and then running DFpdf from the command line to generate a PDF file of the saved record set.
To start the process, DFexplore was used to retrieve a set of records from the study database. The retrieved records were then saved to the DRF dde.drf. Example contents of dde.drf are shown below.
#user1 16/02/22|Records ready for DDE
#Id|Visit|Plate
1002|0|1
1006|0|1
2035|1|2
2035|1|3
2035|1|4
2035|21|5
2035|23|5
2035|24|5
Next, DFpdf was run from the command line to create a PDF document from the DRF having a custom header label for each record in the PDF document. In this example, DFpdf was invoked by the following command:
% DFpdf -d -t "Subjects" -i /opt/studies/val254/drf/dde.drf \
    -j "Subject %DFID, Image %DFRASTER" -o /opt/studies/val254/drf/dde.pdf 254
To generate the same file with password protection, the following command was invoked using the password "4aY3gh98B":
% DFpdf -c 4aY3gh98B -d -t "Subjects" -i /opt/studies/val254/drf/dde.drf \
    -j "Subject %DFID, Image %DFRASTER" -o /opt/studies/val254/drf/dde.pdf 254
It is also possible to pipe the output from DFexport.rpc directly into DFpdf as illustrated in the following example:
% DFexport.rpc 254 1 - | DFpdf -t "Screening Data" \
    -j "Subject %DFID, Image %DFRASTER" -o /opt/studies/val254/work/Screening.pdf 254
DFpdfpkg [-S server] [-U username] [-C password] [-A] [-n] [-color] [-hd] [-b blind_spec] [-f filemap] [-v visitmap] [-g pagemap] [-s sortmap] [-t title] [-j header] [-k footer] [ [-P pdfpasswd] ] [-history [-bookmark label] [-excel | -legacy] ] { [-data data,missed] [-image primary|all] } {-output out_dir_name } [-prefix file_prefix] { [-subject pid_list] [-alias alias_list] [-site site_list] } [-visit visnum_list] [-plate pltnum_list] [-expand plate#~# | all] [-e errlog] {study#}
DFpdfpkg generates a set of optionally password-protected PDF documents (one per subject) containing one or more pages representing data records (EDC or scanned paper CRFs submitted via email, DFsend, or fax), and images and audit trail information. The resulting PDF documents follow the Adobe PDF specification, version 1.4, and as a result can be read by Acrobat Reader versions 5.X and greater.
If a password is provided, the files are encrypted and stored in binary format. Otherwise, the files are stored in plain text format.
DFpdfpkg requires a study number argument and an output folder name. Records are output in ascending numeric order by visit and plate. If the -s option is specified, sort order will follow the rules specified by the DFsortmap file.
Bookmarks are created at the document root, ID, visit, and plate levels. The label for the document root bookmark is ID, Visit, Plate but can be overridden through the use of -t. A visit bookmark is created for each new visit number within the current ID. The label is created from field 3 of the study visit map if the visit is defined in the visit map; otherwise, it has the form Visit #####. A plate bookmark is created for each CRF in the visit. The label is created from the matching page map entry, if it is defined; from field 2 of the study DFfile_map if the plate is defined (which it should always be); or, it has the form Plate #####. For secondary records, an asterisk (*) is appended to the end of the page bookmark label. Hence it is easy to scan the list of expanded bookmarks and identify those records that are secondaries.
Using the -j and -k options, custom page header and footer labels, respectively, can be created for each CRF page in the output document. If no labels are specified, the default header label is ID %DFID : %DFPMLBL and there is no footer label. The desired label is specified in a string and may contain any combination of fixed text and zero or more of the following keywords, which are replaced at output creation time with the matching database value.
Use the -expand option to ensure that text fields with a store length larger than the display length and dropdown fields with choice labels that are longer than the field size are expanded to fit all text. Specify all to apply to all plates or specify plate numbers.
Where plate is a valid plate number and field is a comma-delimited list of field numbers that are to be blinded on that plate (ranges of field numbers are specified by enumerating each field in a comma-delimited list). Field numbers must use DFdiscover schema numbering.
DFpdfpkg exits with one of the following statuses:
Use DFpdfpkg without the use of the DFexplore Create Subject Packages... option
The following example creates subject packages for subject ID 2035 and 2049 from Site 2.
% DFpdfpkg -S example.server.com -U example_user -C passwd \
    -t "Site 2 Subject Packages" -color -data data -image primary \
    -output /opt/studies/demo253/work -prefix "Site2_" \
    -subject 2035,2049 253
DFprint_filter [-p string] [-c #] [-o string] [-f string]
DFprint_filter is the main print interface. DFdiscover programs that send output to a printer may use DFprint_filter to format the output into PostScript® format, if necessary, and then spool the output to a printer.
DFprint_filter is capable of re-formatting PNG and Sun rasterfiles (the image formats used in DFdiscover), ASCII text files, and PostScript® files.
/opt/dfdiscover/lib/DFprinters
lp
-o -p8
DFprint_filter exits with one of the following statuses:
Print the UNIX password file to the printer "hplj"
% /opt/dfdiscover/lib/DFprint_filter -p hplj -f /etc/passwd
DFprintdb [-d] [-A] [-usealias] [-p string] [-i string] [-o string] [-t string] [-e platelist|all] {-s #}
DFprintdb prints case report forms merged with data records from a study database. One page is printed for each data record in the input file, and is bookmarked by the record keys: subject ID, visit and then plate. Printed output is either sent directly to the printer or the named output file.
DFprintdb uses the high resolution backgrounds previously generated by DFsetup during a Study - Import CRFs operation. A background, named DFbkgd###.png is created for each CRF plate, where ### is the plate number. If tagged backgrounds also exist, e.g. DFbkgd###_all_SPA.png for a Spanish background used at all visits for plate ###, they can be requested using the -t option.
If the high resolution PNG background does not exist for a plate, DFprintdb will print the field widgets and the data they contain on a blank background.
-o
.pdf
.PDF
-t SPA
Plates for which text fields may expand beyond their display size. Without this option, printed text is truncated if it exceeds the field display size. For example,
-e 10-15,22
allows text field expansion, if necessary, on plates 10 to 15 and 22; while
-e all
allows text expansion on all plates.
NOTE: expanded text may cover other data fields on the plate and thus should be used only on plates with empty space into which text fields can expand.
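The plate list syntax accepted by -e ("10-15,22", "all") can be sketched as follows; expand_plates is a hypothetical helper inferred from the examples above, not DFprintdb's own code.

```python
# Hypothetical expansion of a -e plate list such as "10-15,22" or "all",
# inferred from the examples above (not DFprintdb's own code).

def expand_plates(spec, all_plates=range(1, 501)):
    if spec == "all":
        return list(all_plates)
    plates = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")     # "10-15" -> 10..15, "22" -> 22
        plates.extend(range(int(lo), int(hi or lo) + 1))
    return plates

print(expand_plates("10-15,22"))
# [10, 11, 12, 13, 14, 15, 22]
```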
DFprintdb exits with one of the following statuses:
Print all primary adverse event report records (plate 1) for study 222
% DFexport.rpc -s primary 222 50 - | DFprintdb -s 222 -o test.pdf
DFpsprint {infile...}
DFpsprint is a utility program that converts input CRF image(s) to PostScript® suitable for printing. When multiple CRF images are given as options, the resulting PostScript® file contains one CRF image per page.
The resulting PostScript® document is always sent to standard output where it can be re-directed to a file.
NOTE: To convert CRF images files for immediate printing, DFprint_filter provides a simpler interface.
DFpsprint exits with one of the following statuses:
Simple use of DFpsprint for printing the file /tmp/rasterfile
% DFpsprint /tmp/rasterfile | lp -dlp
This command line subsequently re-directs the output to the printer.
Concatenating 3 image files where the second is read from standard input
% cat /tmp/ras2 | DFpsprint /tmp/ras1 - /tmp/ras3 > /tmp/ps.out
DFqcps [-A4] [-new] [-l string] [-f string][-h string] [-r string] [-n #] [-p #] {-s #}
DFqcps is used by DF_QCfax and DF_QCprint to format the Query Reports created by DF_QCreports, before they are sent or printed. It takes its input from the standard input, and translates it into a PDF document with the standard DFdiscover barcoding. Barcoding permits returned Query Reports to be routed to the correct study database, in the event that they contain queries in DFdiscover Q&A (question and answer) format, or in case an investigator decides to make a comment on the Query Report and send it back.
-l
Page m of n
DFqcps exits with one of the following statuses:
DFreport [-S server] [-U username] [-C password] [-JS js] [-CSS css] [-usealias] [-logo] [-e errfile] [-o outfile] [-landscape] [-h] [-D] {-CMD cmd} {study#}
DFreport runs DFdiscover reports outside of DFexplore, making them available to run from a command-line or scheduled via the UNIX cron facility.
Authentication - To authenticate, DFreport requires the username and password for connection to a specific database server. These may be supplied as:
Database Permissions - Subsequent to authentication with a specific database server, use of DFreport is allowed (or disallowed) by the DFexplore - Reports view or Server - Run reports permissions.
An authenticated user with one of these role permissions will then be able to run reports. The reported data is filtered by user permissions, which may further be limited by visit, plate and level restrictions associated with the user's role.
DFreport provides no warning that some records cannot be exported due to permission restrictions as this would itself provide information about the existence of restricted data records.
command
Output File - By default, output from DFreport is written to standard output (the command-line output). To write the output to a specific file, specify the filename with -o outfile. The user must have filesystem permission to create or modify the file. If outfile is specified as a relative pathname, the file is created relative to the DFreport command invocation directory.
NOTE: If DFreport is invoked from a cron facility, absolute pathnames are essential.
Output from legacy reports is written as is (plain text) while output from standard DFdiscover reports may be in HTML, JSON, Excel (.xlsx), pipe delimited data file (.dat), CSV, or PDF format, determined by the file extension of the output file.
The following standard DFdiscover reports with charts or animations are not supported in Excel, CSV, data file or PDF format: Enrollment by Site (enrollment), Visits by Site (sitevisits), CRFs by Site (sitecrfs), Queries by Site (siteqcs), Queries by Field (qcsbyfield), Database Definition (setup), Status and Level Summary (status), Plate Status (platestat), Inbound Traffic (inboundtraffic), Query Status Trend (querystatustrend), Query Time Trend (querytimetrend), Data Entry Time Trend (dataentrytimetrend). The following standard DFdiscover reports are not supported in CSV or data file format: Subject Schedule of Visits (subjectschedule), Subject Unexpected Visits (subjectunexpected), Query Update (queryupdate), Query Report (queryreport), Site Definition (sites).
Error Output - Re-direct any error messages to a specific file with the -e errfile option. By default, error messages are written to standard error and hence would get intermixed with data if the data is being exported to standard output.
DFreport exits with status 0 if the command was successful. Exit status 1 indicates that an error occurred - errors are written to standard error or the errfile file if -e was specified.
DFexplore includes a convenience feature, available from Help > Show Command, to display the command required to run a report. For example, run the Enrollment by Site report within DFexplore and select Help > Show Command to obtain this command (line breaks are for presentation only):
DFreport -S explore.dfdiscover.com -U demo_user1 \
    -CMD "/DFedc?cmd=getreport&name=enrollment&title=Enrollment%20by%20Site&json&compress" \
    -JS "https://cdn.dfdiscover.com/v5.2/js/DF_Enrollment.min.js" \
    -CSS "https://cdn.dfdiscover.com/v5.2/css/DF_ReportStyles.css" \
    252 -logo -o DF_Enrollment_20191028140508.html -C xxx
Substitute the username for demo_user1, user password for xxx and provide an output filename in place of DF_Enrollment_20191028140508.html.
DFsas {jobfile} [-f] [-dbx] [ [ [-c] | [-C] ] study# ] [-p platelist] [-l|-L [check] | [choice] ] [-d csjo] [-s #] [-t #] [-I pidlist] [-ncidlist] [-v visitlist] [-w] [-SAS #] [-a] [-A] [-newexp] [-X] [-r rundir] [-z]
DFsas is an intermediary between DFdiscover and SAS®. It can be used to extract summary data files from a study database and to create the corresponding SAS® job file for subsequent processing by SAS®.
Complete reference documentation for DFsas can be found in DFsas: DFdiscover to SAS®.
jobfile.dbx
check,choice
csjo
Output one or more date fields from each date variable using these specified formats:
c=calendar (imputation and 4 digit years)
s=convert date to a string field as is (no imputation)
j=convert date to a julian number (with imputation)
o=use original value and date informat (no imputation)
sidlist
When DFsas is run from DFexplore, DFedcservice sets DFSERVER, DFUSER and DFPASSWD to the current DFexplore login servername, username and password.
When DFsas is run from the command line and the -newexp option is used, there are no command-line options for setting servername, username or password. These must be set using the environment variables DFSERVER (servername), DFUSER (username) and DFPASSWD (password) before the program is run. If DFPASSWD is not set, the password set using DFpass is used.
DFsas exits with one of the following statuses:
Create a default DFsas job file for study 11
% DFsas mytestjobfile -c 11
Create a DFsas job file for study 11, include only a subset of plates and request labels for check fields
% DFsas mytestjobfile -c 11 -p 1-5 -l check
DFsendfax [-2] [-A4] [-F from] [-w #] [-r #] [-d #] [ [-p password] | [-P] ] [-c string] [-sANY string] [-sEACH string] [-sALL string] [-sFIRST string] [-fANY string] [-fEACH string] [-fALL string] [-fFIRST string] {file} {recipients...}
DFsendfax transmits any plain text, PDF, or TIFF file to one or more recipients.
DFsendfax interacts with a DFdiscover outbound daemon. You must have at least one outbound daemon configured before DFsendfax will work. DFsendfax makes a temporary copy of the file to be sent. By default, this is in /usr/tmp, although it may be your home directory if there is no space available in /usr/tmp. The temporary copy of the file becomes owned by user datafax.

DFsendfax places its options into a structure that it then sends to the outbound daemon. Which outbound daemon is used is determined by the DFdiscover master using a round-robin scheduling algorithm. The outbound daemon then makes a queue entry for the transmission request in its work directory. It also copies (or remote copies, if necessary) the temporary file to a data file in its work directory. The outbound daemon then manages the queue entry until the scheduled time for sending arrives. At this point, the data file is passed to the system email service, DFprotusfax or HylaFAX, which attempts delivery of the document.

If the transmission is successful, that status is forwarded to the outbound daemon which then executes zero or more of the success commands. If the transmission has failed, the outbound daemon either waits the number of minutes specified by the retry delay if one or more retries are still possible, or executes one or more of the failure commands. Finally, if the outbound daemon never receives a reply regarding the disposition of a transmission, the outbound daemon de-queues the transmission and executes the appropriate failure commands.
When specifying success or failure commands to execute at the completion of a transmission session, you can reference the values of various arguments that you supplied in your original DFsendfax command. The values that you can reference are:
If your transmission is successfully queued, DFsendfax echoes back the name of the file to be sent, the phone number of the recipients, the scheduled transmission time, and the faxid of the queue entry. This faxid is unique within the system and can be used to monitor the status of the transmission with DFfaxq, or to de-queue the transmission with DFfaxrm. You should be aware that depending on how your outbound daemons are configured, it may take several seconds for a transmission queued with DFsendfax to be detected so that it appears in the output of DFfaxq.
NOTE: The standard DFdiscover report program, DF_QCfax, uses DFsendfax to send Query Reports to participating sites in a study. If a Query Report is successfully transmitted, DFqcsent.rpc is executed (via the -sANY option) to mark the quality control notes as having been successfully sent to the investigator site. This is needed so that DFdiscover will display the correct status of queries in DFexplore, and is also needed if you intend to use the overdue allowance option when creating Query Reports. Thus you should always use DF_QCfax to send quality control reports and not DFsendfax.
DFsendfax can schedule a file for transmission at a specified time, rather than the default which is immediately. This is specified by the -w option. The possible arguments to this option can have the following syntax:
now | today | tomorrow | <time spec> [<offset>]
<time spec>
noon | midnight | hh[:mm] | <month> <day> [,<year>]
<offset> = + # {seconds | minutes | hours | days | weeks}
If the time specified is in the past then this is the same as specifying now.
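The offset grammar above ("+ # {seconds | minutes | hours | days | weeks}") maps directly onto the standard library's timedelta keywords. The sketch below illustrates the offset arithmetic only (parsing of noon, hh:mm, month day and so on is omitted); it is not DFsendfax's parser.

```python
# Offset arithmetic for "+ # {seconds|minutes|hours|days|weeks}".
# The unit words map directly onto datetime.timedelta keyword arguments.
# apply_offset is an illustration, not DFsendfax's implementation.

from datetime import datetime, timedelta

def apply_offset(base, offset):
    plus, amount, unit = offset.split()     # e.g. "+ 30 minutes"
    if plus != "+":
        raise ValueError("offset must begin with '+'")
    return base + timedelta(**{unit: int(amount)})

base = datetime(1992, 12, 3, 8, 45, 0)
print(apply_offset(base, "+ 30 minutes"))
# 1992-12-03 09:15:00
```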
A white-space delimited list of email addresses or fax numbers for recipients of the file (required). Each email address must be in the format mailto:email_address where mailto: is fixed and required, and email_address is a valid email address. For each fax number, include all long distance codes, etc. as required. Be careful not to include white-space characters inside a fax number, as it will be interpreted as multiple fax numbers.
NOTE: DFdiscover does not perform validity checking on email addresses. Further, if the outbound service is configured to send documents via email only, any recipients that are not email addresses are quietly ignored.
DFsendfax provides commands that can be executed upon successful or unsuccessful completion of the transmission. Completion occurs after the specified number of retries have been attempted, if necessary. The command is executed in a Bourne shell as a background child process of the outbound daemon, on the machine on which the DFsendfax command was originally issued. There are four levels of commands that can be specified. They can be specified individually or in any combination. It is the user's responsibility to plan for any concurrency issues that may arise when more than one command executes simultaneously.
DFsendfax exits with one of the following statuses:
Send the file /etc/printcap to one recipient at 5551212
% DFsendfax /etc/printcap 5551212
Fax /etc/printcap scheduled for sending to 5551212 at Thu Dec 3 08:45:26 1992
->Faxid is 317
Schedule a file for transmission at 11:00pm tonight and send it to two recipients, one via email (with a specific from email address)
% DFsendfax -w 23:00 -F repltyo@company.com /etc/fstab 9876543 mailto:luckyuser@hotmail.com
Fax /etc/fstab scheduled for sending to 9876543 mailto:luckyuser@hotmail.com at Thu Dec 3 23:00:00 1992
->Faxid is 318
Schedule a file for transmission 30 minutes from now and be informed via email that the fax was either successfully transmitted or failed
The line break is for presentation purposes only.
% DFsendfax -w 'now + 30 minutes' \
    -sEACH 'echo $DataFile faxed to $Number | mail $Requestee' \
    -fEACH 'echo $DataFile FAILED to $Number | mail $Requestee' \
    /etc/magic 1-800-555-1212 9876543
DFstatus [-l] [-m] [-h] [-c] [-n #, #-#] [-i #, #-#] [-v #, #-#] [-p #, #-#] [-s #]
This command-line invocation of the DFexplore Status View generates a tabular representation of the database status in plain text format and writes the results to standard output.
DFstatus exits with one of the following statuses:
Display record status and current user connections for study 253
% DFstatus -s 253
#Database Status of Study 253
#Date Mon Mar 14 09:45:34 2011
#Sites
#Subjects
#Visits
#Plates
Level   Pending  Missed  Incomplete  Final  Total
0             7       0           0      0      7
1             0       2          32     43     77
2             0       0           4      2      6
3             0       0           3      1      4
4             0       0           1      7      8
5             0       0           0      1      1
6             0       0           0      0      0
7             0       0           0      0      0
Total         7       2          40     54    103
#Records awaiting validation: 18
#Records being validated: 0
#Connected Users
User             Program          Host
---------------- ---------------- ------------------------------
datafax          DFexplore        vncdemo.datafax.com
datafax          Status Tool      dhcp214.datafax.com
Display current seat usage on the server
% DFstatus -m
Seats in Use:
Admin      2 of  10
General  122 of 150
Total    124 of 160
DFsqlload [-flavor oracle|postgresql|mysql|mssql] [-d drfname] [-q] [-X] [ [-ignore_mssql_priv] | [-ignore_mysql_priv] ] [-type typed|untyped|both] [-coding code|label|both] [-missing code|label] [-merge] [-date typed|untyped|both] [-noimpute] [-missed] [-table all|DFcoding,DFnullvalue,DFsubjectalias] {param} {study}
DFsqlload provides a convenient method for migrating data from the proprietary DFdiscover storage to storage in a relational database.
Complete reference documentation for DFsqlload can be found in DFsqlload: DFdiscover to Relational Database Tables.
-flavor oracle
typed
DFTABLE_###
DFQC
DFREASON
untyped
DFPLATE_###
both
code
NULL
DFjournal
VARCHAR
all|DFcoding,DFnullvalue,DFsubjectalias
DFcoding
DFCODING
DFnullvalue
DFNULLVALUE
-d drfname
DFsubjectalias
DFSUBJECTALIAS
DFpid
DFalias
The following two options are required and must appear in order at the end of the option list:
server:database:schema[.tablespace][:username:password]
Database Parameters
tnsnames.ora
optional
If both username and password are specified, login as the specified username with the specified password.
If only the username is specified, login as the specified username and look up the specified user's password from the file ~/.pgpass.
If neither username nor password is specified, use the OS user's name and look up the OS user's password from the file ~/.pgpass.
If only a password is specified, this is an error.
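The ~/.pgpass lookups above can be sketched as follows. The file format (hostname:port:database:username:password, with "*" as a wildcard) is PostgreSQL's own; the matcher below is a simplified illustration that ignores escaping of ":" and "\" characters.

```python
# Simplified sketch of a ~/.pgpass lookup. The file format
# (hostname:port:database:username:password, "*" as wildcard) is
# PostgreSQL's; escaping of ":" and "\" is ignored here for brevity.

def pgpass_lookup(lines, host, port, database, username):
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                        # skip blanks and comments
        fields = line.split(":")
        if len(fields) != 5:
            continue
        h, p, db, user, password = fields
        # A field matches if it is "*" or equals the requested value.
        if all(pat in ("*", val) for pat, val in
               zip((h, p, db, user), (host, port, database, username))):
            return password
    return None

print(pgpass_lookup(["dbhost:5432:*:dfuser:s3cret"],
                    "dbhost", "5432", "study253", "dfuser"))
# s3cret
```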
If only the username is specified, the specified username is discarded; the OS user's name is used and the OS user's password is looked up from the file ~/.my.cnf. The same applies if neither username nor password is specified.
If only the username is specified, the specified username is discarded; the OS user's name is used and the password, identified by database name, is looked up from the file ~/.orapass. If ~/.orapass does not exist or does not contain the password, the Oracle external credential is used. The same applies if neither username nor password is specified.
NOTE: Oracle external credentials: The user must have an OS account. In the Oracle initialization file initDB_NAME.ora, the following parameter must be set: remote_os_authent = true
If only the username is specified, login using the OS user's name and look up the OS user's password from the file ~/.mssqlpass.
If neither username nor password is specified, login using the OS user's name and look up the OS user's password from the file ~/.mssqlpass.
If only password is specified, login using the OS user’s name and the specified password.
DFsqlload exits with one of the following statuses:
DFsqlload is a command-line solution that creates all of the table definitions and imports all of the data into a relational database. One SQL table is created for each DFdiscover plate. In addition, three tables are added to record logging information, QC data, and reason-for-change data; four tables and a view are created to store study schema information. Optional tables are created to note any problem fields or to store value and label pairs for coded fields.
When run repeatedly, existing SQL tables are compared with the current DFschema file. Unchanged tables are truncated, i.e., the data is removed but the table definitions remain. If any changes were made, the existing DFdiscover-defined table is dropped and re-created. DFsqlload calls DFexport.rpc to export all primary records to a temporary file, and then loads the database tables plate by plate.
The -merge option is the most efficient way to update the target relational database rather than dropping and re-creating existing tables. DFsqlload will obtain the date and time of the last run from the target database, then run DFjournal to extract any data changes that were made since that time. The extracted journal entries are then written to the target database to bring it up to date.
The following operations are performed on the target database.
At least five meta tables, and optionally three more, are created in the target database.
DFSQLLOAD: Each DFsqlload run creates an entry in this log table. This table is also the lock table that prevents more than one DFsqlload process from operating on the same schema. A NULL value for column DFFINISH indicates that a DFsqlload process is running. If DFsqlload terminates abnormally, this record must be removed manually before starting a new DFsqlload process. The following SQL statements are used internally to create this table. The statements use syntax specific to PostgreSQL. Similar tables are created for MySQL, MS SQLServer and Oracle implementations of DFsqlload.
CREATE TABLE <schema>.dfsqlload (
    dfuser    varchar(30) NOT NULL,  -- the database user
    dfstart   timestamp(0) NOT NULL, -- the start date and time
    dffinish  timestamp(0),          -- the finish date and time (NULL if process running)
    dfoption  varchar(500),          -- the options used with the DFsqlload command
    dfnull    int4,                  -- the total number of problem fields converted to null
    dferror   int4,                  -- the total number of records discarded or rejected
    dfstatus  int2,                  -- the status: 0=process completed, 1=running
    CONSTRAINT dfsqlload_pk PRIMARY KEY (dfstart)
) WITH OIDS;
DFSTUDY, DFPLATE, DFMODULE, DFFIELD and DFSCHEMA_VIEW: Each DFsqlload run creates four tables and a view to store study schema information. Study, plate and field level user-defined tag-value pairs are written to the appropriate table. Since there are no module group identifiers in DFschema, module level user-defined tag-value pairs are ignored. Each DFsqlload run will clear and reload these tables.
The following SQL statements are used to create these tables and view. The statements below use syntax specific to MS SQLServer. NOTE: MS-SQL loader: The default TDS_VERSION='7.0', which supported MS SQL Server 7.0 and 2000, has been updated to TDS_VERSION='7.4'. The user can override the default version and port with the environment variables TDSVER=ver# and TDSPORT=port#.
CREATE TABLE <schema>.DFSTUDY (
    DFNUM INTEGER NOT NULL,
    DFNAME VARCHAR(256),
    DFYEAR INTEGER,                  -- %B: study begin year
    DFSETUP_VER VARCHAR(10),         -- %u: current setup version
    DFSCHEMA_DT VARCHAR(10),         -- DFschema file timestamp: yyyy-mm-dd (ISO)
    DFSCHEMA_TM VARCHAR(8),          -- DFschema file timestamp: hh:mi:ss
    DFHOST VARCHAR(256),             -- short name from /opt/dfdiscover/lib/DFedcservice.cf
    DFSTUDY_TAGS VARCHAR(500),       -- %z: tag=alias, tag=alias, ... (DFSTUDY_)
    DFPLT_TAGS VARCHAR(500),         -- %z: tag=alias, tag=alias, ... (DFPLATE_)
    DFMOD_TAGS VARCHAR(500),         -- %z: tag=alias, tag=alias, ... (DFMODULE_)
    DFVAR_TAGS VARCHAR(500),         -- %z: tag=alias, tag=alias, ... (DFVAR_)
    DFSTUDY_TAG_VALUE VARCHAR(500),  -- %y: tag=value, tag=value, ...
    CONSTRAINT DFSTUDY_PK PRIMARY KEY (DFNUM))

CREATE TABLE <schema>.DFPLATE (
    DFPLT_NUM INTEGER NOT NULL,
    DFPLT_DESC VARCHAR(100),
    DFNUM_FILEDS INTEGER,            -- %n: number of fields
    DFSEQ_CODE INTEGER,              -- %E: seq in barcode vs first field
    DFARRIVAL_TRIG VARCHAR(200),     -- %R: plate arrival trigger
    DFTERM_TRIG VARCHAR(6),          -- %t: plate termination trigger
    DFPLT_TAG_VALUE VARCHAR(500),    -- %y: tag=value, tag=value, ...
    CONSTRAINT DFPLATE_PK PRIMARY KEY (DFPLT_NUM))

CREATE TABLE <schema>.DFMODULE (
    DFMOD_NAME VARCHAR(30) NOT NULL,
    DFMOD_DESC VARCHAR(100),
    CONSTRAINT DFMODULE_PK PRIMARY KEY (DFMOD_NAME))

CREATE TABLE <schema>.DFFIELD (
    DFNUM INTEGER NOT NULL,
    DFPLT_NUM INTEGER NOT NULL,
    DFVAR_NUM INTEGER NOT NULL,
    DFVAR_UID INTEGER,               -- %i: field unique id
    DFVAR_SPEC VARCHAR(9),           -- %A: required, essential, optional
    DFVAR_NAME VARCHAR(80),
    DFVAR_ALIAS VARCHAR(80),
    DFVAR_DESC VARCHAR(40),
    DFVAR_TYPE VARCHAR(6),
    DFVAR_FORMAT VARCHAR(100),
    DFVAR_LEGAL VARCHAR(100),
    DFVAR_CODE_LABEL VARCHAR(2000),  -- %c %C: code=label, code=label, ...
    DFVAR_TAG_VALUE VARCHAR(500),    -- %y: tag=value, tag=value, ...
    DFVAR_STORE INTEGER,
    DFVAR_DISPLAY INTEGER,
    DFVAR_HELP VARCHAR(500),
    DFEC_FIELD_ENT VARCHAR(500),
    DFEC_FIELD_EXT VARCHAR(500),
    DFEC_PLATE_ENT VARCHAR(500),
    DFEC_PLATE_EXT VARCHAR(500),
    DFSTYLE VARCHAR(80),
    DFMOD_NAME VARCHAR(30),
    DFMOD_INSTANCE INTEGER,
    CONSTRAINT DFFIELD_PK PRIMARY KEY (DFNUM, DFPLT_NUM, DFVAR_NUM))

CREATE VIEW DFSCHEMA_VIEW AS
SELECT DFSTUDY.DFHOST DFHOST,
       DFFIELD.DFNUM DFSTUDY_NUM,
       DFSTUDY.DFNAME DFSTUDY_NAME,
       DFSTUDY.DFSCHEMA_DT DFSCHEMA_DT,
       DFSTUDY.DFSCHEMA_TM DFSCHEMA_TM,
       DFFIELD.DFPLT_NUM DFPLT_NUM,
       DFPLATE.DFPLT_DESC DFPLT_DESC,
       DFFIELD.DFVAR_NUM DFVAR_NUM,
       DFFIELD.DFVAR_SPEC DFVAR_SPEC,
       DFFIELD.DFVAR_ALIAS DFVAR_ALIAS,
       DFFIELD.DFVAR_NAME DFVAR_NAME,
       DFFIELD.DFVAR_DESC DFVAR_DESC,
       DFFIELD.DFVAR_TYPE DFVAR_TYPE,
       DFFIELD.DFVAR_FORMAT DFVAR_FORMAT,
       DFFIELD.DFVAR_LEGAL DFVAR_LEGAL,
       DFFIELD.DFVAR_CODE_LABEL DFVAR_CODE_LABEL,
       DFFIELD.DFVAR_TAG_VALUE DFVAR_TAG_VALUE,
       DFFIELD.DFEC_FIELD_ENT DFEC_FIELD_ENT,
       DFFIELD.DFEC_FIELD_EXT DFEC_FIELD_EXT,
       DFFIELD.DFEC_PLATE_ENT DFEC_PLATE_ENT,
       DFFIELD.DFEC_PLATE_EXT DFEC_PLATE_EXT,
       DFFIELD.DFSTYLE DFSTYLE,
       DFFIELD.DFVAR_STORE DFVAR_STORE,
       DFFIELD.DFVAR_DISPLAY DFVAR_DISPLAY,
       DFFIELD.DFVAR_HELP DFVAR_HELP,
       DFFIELD.DFMOD_NAME DFMOD_NAME,
       DFMODULE.DFMOD_DESC DFMOD_DESC,
       DFFIELD.DFMOD_INSTANCE DFMOD_INSTANCE
FROM DFSTUDY, DFPLATE, DFMODULE, DFFIELD
WHERE DFFIELD.DFNUM = DFSTUDY.DFNUM
  AND DFFIELD.DFPLT_NUM = DFPLATE.DFPLT_NUM
  AND DFFIELD.DFMOD_NAME = DFMODULE.DFMOD_NAME
DFNULLVALUE: This is an optional table where all problem fields that are converted to NULL are recorded. The following SQL statements are used internally to create this table.
CREATE TABLE <schema>.dfnullvalue (
    dfpid     int4 NOT NULL,         -- study subject id
    dfplate   int2 NOT NULL,         -- plate
    dfseq     int4 NOT NULL,         -- sequence/visit number
    dftable   varchar(30) NOT NULL,  -- name of sql table where substitution was made
    dffield   varchar(30) NOT NULL,  -- name of sql field where substitution was made
    dfvalue   varchar(###),          -- the original value exported from DFdiscover
    dfproblem varchar(###),          -- the reason the value was converted to NULL
    dfraster  varchar(12),           -- the image id of the source CRF page
    CONSTRAINT dfnullvalue_pk PRIMARY KEY (dfpid, dfplate, dfseq, dftable, dffield)
) WITH OIDS;
The ### above represents the maximum length encountered in the data source.
DFCODING: This is an optional table where all value and label pairs for coded fields are stored. The following SQL statements are used internally to create this table.
CREATE TABLE <schema>.dfcoding (
    dfplate int2 NOT NULL,           -- plate
    dffield varchar(30) NOT NULL,    -- sql table column name
    dfcode  varchar(###) NOT NULL,   -- code value for this column
    dflabel varchar(###),            -- code label for this column
    CONSTRAINT dfcoding_pk PRIMARY KEY (dfplate, dffield, dfcode)
) WITH OIDS;
Verify any existing SQL tables. The following tasks are performed in the verification of existing SQL tables.
All tables with the prefix DF are treated as DFsqlload-defined tables. For any given run of DFsqlload, only the tables relevant to the current job are used.
Existing SQL table definitions are compared to the DFschema file. DFsqlload will truncate any unchanged tables and re-create changed tables. Changes that may cause DFsqlload to drop a table are as follows:
All privileges (if any) for tables created by DFsqlload are backed up and restored.
If other tables reference any tables created by DFsqlload, the foreign keys will either be dropped (PostgreSQL, MS SQLServer) or disabled (MySQL, Oracle) and saved to the log file in the form of a database-specific SQL statement. The user can choose to execute these statements to restore the foreign keys manually, as DFsqlload does not do so automatically.
Other database objects - triggers, views, procedures, etc., which depend upon DFsqlload-defined tables are ignored.
All DFsqlload-created files are in the study/working_dir/DFsqlload_logs directory by default. If this directory does not exist, DFsqlload will create it at the default location. The file names use a time stamp in the format yymmdd_hhmiss as a prefix, where mm is the two-digit month and mi is the two-digit minute. The time stamp, which is also the value of DFSQLLOAD.DFSTART, is the unique identifier of this DFsqlload process.
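The yymmdd_hhmiss prefix can be reproduced with standard date formatting; a minimal sketch (the resulting file name is illustrative, not a real DFsqlload log):

```shell
# Build a DFsqlload-style log file name prefixed with yymmdd_hhmiss
prefix=$(date +%y%m%d_%H%M%S)
logfile="${prefix}.log"
echo "$logfile"
```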
Log file: yymmdd_hhmiss.log - This file records the detailed loading progress including schema setup, SQL table definitions, record count, and rejected records with error messages in the following formats:
DFdiscover Retrieval File - If path is included, this file will be in the specified location. If the file already exists, the existing file will be removed. If the file cannot be created or written, DFsqlload will report an error and continue the loading process. The file is in standard DRF format. The first four fields are Id, Visit, Plate, and Image. The combination of id, visit, and plate is unique. The 5th field records the list of problems. The record level errors appear first, followed by the field level problems. The field level problems are derived from the DFNULLVALUE table, which is created by default if a .drf file is requested. Problem types are summarized below.
Record level errors
Field level problems
The format used in the .drf file is Table1Name: FieldName (problem), FieldName (problem), ... , Table2Name: FieldName (problem), ...
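As a sketch of consuming this format, the field-level problems in a .drf problem field can be counted by the parenthesized descriptions; the table and field names below are invented for illustration:

```shell
# Hypothetical .drf problem field (table and field names are made up)
problems='DFPLATE_001: AGE (illegal value), SEX (bad format), DFPLATE_002: VISDT (partial date)'
# Each field-level problem carries a parenthesized description; count them
count=$(printf '%s\n' "$problems" | grep -o '(' | wc -l | tr -d ' ')
echo "$count"
```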
QC or REASON specific error
Data file:
This is the data source for the plate currently being loaded, created by DFexport.rpc; it is removed automatically after the data is loaded into the SQL database.
Error files:
These files record the records rejected by the SQL database along with database generated error messages.
Rejected keys: yymmdd_hhmiss.tmp - This file records the keys (plate, seq, id) of primary records discarded by DFsqlload or rejected by the SQL database. If a REASON or QC record matches one of the key combinations in this file, that REASON or QC record will not be loaded into the SQL database and a message will be written to the log file:
Record (id#,seq#,plate#,image,field#): discarded by DFsqlload.
DFRECORD (error - primary record was rejected).
This file is removed automatically.
Database login credentials files. The following database-specific files may be used to store login credentials.
Postgres: ~/.pgpass - This is a Postgres standard file located in the user's home directory and the file permission must be 600; otherwise, Postgres will ignore it. This file specifies the database login credentials in the format:
host:port:database:username:password
The username can be any valid database user, for example:
parkcity:5432:test_db_name:user_a_name:user_a_password
parkcity:5432:test_db_name:user_b_name:user_b_password
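Since Postgres silently ignores a ~/.pgpass with any permission other than 600, it is worth setting and verifying the mode explicitly. A minimal sketch using a temporary file as a stand-in for ~/.pgpass (GNU stat assumed):

```shell
# Create a stand-in for ~/.pgpass and restrict it to mode 600
pgpass=$(mktemp)
cat > "$pgpass" <<'EOF'
parkcity:5432:test_db_name:user_a_name:user_a_password
EOF
chmod 600 "$pgpass"
mode=$(stat -c '%a' "$pgpass")   # GNU stat; prints the octal mode
echo "$mode"
rm -f "$pgpass"
```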
MySQL: ~/.my.cnf - This is the MySQL standard file located in the user's home directory, and the file permission should be 600. This file specifies the program groups and the options for each group. A typical group is [client]. In this group the database password for the OS user can be specified:
[client]
password=my_pass
...
Global options can be specified in the global option file /etc/my.cnf or DATADIR/my.cnf.
Oracle: tnsnames.ora, ~/.orapass - tnsnames.ora is the Oracle standard net services file, located in the ORACLE_HOME/network/admin directory by default. The Oracle client looks up net service names from this file. This file can also be in ANY_DIR/network/admin, provided the environment variable ORACLE_HOME is set to ANY_DIR.
~/.orapass is NOT an Oracle standard file. It is provided as a convenience for DFdiscover users to store their database password. This file should be in the user's home directory and the file permission should be 600. The format is:
db_name:password
for example,
test_db_name:test_db_password
production_db_name:production_db_password
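The described lookup by database name can be sketched with awk; the function name and the temporary file standing in for ~/.orapass are illustrative:

```shell
# Look up the password for a database in a ~/.orapass-style file (db_name:password)
lookup_orapass() {
  # $1 = credentials file, $2 = database name
  awk -F: -v db="$2" '$1 == db { print $2; exit }' "$1"
}

orapass=$(mktemp)            # stand-in for ~/.orapass
printf '%s\n' 'test_db_name:test_db_password' \
              'production_db_name:production_db_password' > "$orapass"
pw=$(lookup_orapass "$orapass" test_db_name)
echo "$pw"
rm -f "$orapass"
```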
MS SQLServer: ~/.mssqlpass - This is not an MS SQLServer standard file. It is provided as a convenience to DFdiscover users for storing their database password. This file needs to be kept in the user's home directory and the file permission needs to be 600. The format of this file is:
server_name:password
There are two types of entries that can be used. Both the server name and the password to use can be specified as follows:
my_server:my_server_password
or a password can be specified for any server this user connects to as in the following example:
*:any_server_password
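The two entry types can be combined in one lookup; the sketch below assumes an exact server entry takes precedence over the "*" wildcard, and uses a temporary file as a stand-in for ~/.mssqlpass:

```shell
# Look up the password for a server in a ~/.mssqlpass-style file, honoring the
# "*" fallback entry that matches any server
lookup_mssqlpass() {
  # $1 = credentials file, $2 = server name
  awk -F: -v srv="$2" '
    $1 == srv { print $2; found = 1; exit }
    $1 == "*" { star = $2 }
    END { if (!found && star != "") print star }' "$1"
}

mssqlpass=$(mktemp)          # stand-in for ~/.mssqlpass
printf '%s\n' 'my_server:my_server_password' '*:any_server_password' > "$mssqlpass"
exact=$(lookup_mssqlpass "$mssqlpass" my_server)
fallback=$(lookup_mssqlpass "$mssqlpass" other_server)
echo "$exact $fallback"
rm -f "$mssqlpass"
```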
The following details apply to tables and fields created by DFsqlload.
Table Names - Table names follow these rules. DFSQLLOAD is a log table, used to store the details of a given DFsqlload run for a particular schema. The table for plate 510 is named DFREASON. The table for plate 511 is named DFQC. All other plates are named DFPLATE_nnn (untyped) or DFTABLE_nnn (typed), where nnn is the plate number. Optional tables are DFNULLVALUE, for the storage of any problem field data conversions, and DFCODING, for the storage of value/value label pairs. All table names are in uppercase letters. Note that table names are case sensitive for MySQL on UNIX platforms only.
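The naming rules above can be sketched as a small function; the zero-padding of nnn to three digits is an assumption, not stated in the manual:

```shell
# Map a plate number to its SQL table name per the naming rules.
# Assumption: nnn is the zero-padded three-digit plate number.
table_name() {
  # $1 = plate number, $2 = "typed" or "untyped"
  case $1 in
    510) echo DFREASON ;;
    511) echo DFQC ;;
    *)   if [ "$2" = typed ]; then printf 'DFTABLE_%03d\n' "$1"
         else printf 'DFPLATE_%03d\n' "$1"; fi ;;
  esac
}

table_name 510 untyped   # DFREASON
table_name 7 typed       # DFTABLE_007
```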
Table Types - Table types used for each of the supported database products are as follows: For PostgreSQL and Oracle, all tables are type relational table. For MySQL, all tables are type InnoDB table.
Field Names for study tables - Field naming follows these rules. For fields 1 to 7 and the last three fields of a plate, generic variable names are used. For all fields in between, unique variable names are used. Any field name matching an SQL keyword gets an _ (underscore) appended; this is target-product dependent, so refer to the documentation for the product you are using for a complete list of the relevant keywords. Any non-alphanumeric characters are replaced with _ (underscore). If a field name starts with a digit, DF_ is prepended. Field names are truncated to 30 characters. A sequence number is appended to each non-unique field name.
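Part of this sanitization can be sketched with standard tools; the function below covers only the underscore substitution, digit-prefix, and truncation rules (keyword collision handling and uniqueness suffixes are omitted):

```shell
# Sketch of the field-name rules: replace non-alphanumerics with "_",
# prepend DF_ when the name starts with a digit, truncate to 30 characters.
sanitize_name() {
  n=$(printf '%s' "$1" | sed 's/[^A-Za-z0-9]/_/g')
  case $n in
    [0-9]*) n="DF_$n" ;;
  esac
  printf '%s\n' "$n" | cut -c1-30
}

sanitize_name '2nd visit-date'   # DF_2nd_visit_date
```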
DFdiscover type to SQL type mapping - DFsqlload will map DFdiscover types to SQL types according to Data typing when creating SQL tables for typed columns. Untyped columns are mapped to VARCHAR (PostgreSQL, MySQL) and VARCHAR2 (Oracle).
NOTE: MySQL converts VARCHAR(n) to CHAR(n) (n < 4) or TEXT (n > 255).
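The conversion in this note can be sketched as a function; the assumption that VARCHAR(n) is kept unchanged for 4 <= n <= 255 is mine, inferred from the note:

```shell
# Sketch of the MySQL conversion: VARCHAR(n) becomes CHAR(n) for n < 4 and
# TEXT for n > 255 (assumption: VARCHAR(n) is kept for 4 <= n <= 255)
mysql_varchar() {
  if [ "$1" -lt 4 ]; then echo "CHAR($1)"
  elif [ "$1" -gt 255 ]; then echo "TEXT"
  else echo "VARCHAR($1)"; fi
}

mysql_varchar 2     # CHAR(2)
mysql_varchar 500   # TEXT
```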
SQL column naming convention: When creating SQL tables, DFsqlload follows the rules defined in Fields for typed SQL tables and Fields for untyped SQL tables for mapping DFdiscover field names to SQL column names.
Fields for typed SQL tables
(a) The first field, if -coding both or date both is specified.
(b) The second field, if -coding both or date both is specified.
Fields for untyped SQL tables
Locking: DFsqlload uses the table DFSQLLOAD to prevent two processes from working on the same database schema. A NULL entry in DFSQLLOAD.DFFINISH indicates that another process is running or terminated abnormally. If DFSQLLOAD does not exist, DFsqlload will create it. Upon completion, DFsqlload updates DFSQLLOAD.DFFINISH to the finish time and the DFSQLLOAD.DFSTATUS to zero (normal exit process).
DFdiscover Retrieval File: The first line in the DRF contains a two-field comment. The first field contains the username and creation timestamp. The second field identifies the creator of the DRF as DFsqlload and lists the parameters used to access the SQL database. The next line is a comment describing the format of the DRF records to follow. One DRF data record is created for each DFdiscover record with data import problems. Each DRF data record is identified by Id, Visit, Plate and Image. The 5th field records the SQL table name, followed by the field name and problem description for each problem encountered. Multiple field/problem descriptions are separated by commas.
Date fields: DFsqlload converts two-digit years to four digits based on the cut-off year specified for each date field in DFschema, or the year the study began (%B) if not specified. DFsqlload imputes partial dates according to the imputation method specified for each date field.
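As an illustration only, a pivot-style expansion against a two-digit cut-off might look like the sketch below; the exact cut-off semantics DFsqlload uses are not specified here, so this mapping is an assumption:

```shell
# Illustrative two-digit year expansion against a cut-off year.
# Assumption (not from the manual): years at or below the cut-off's two-digit
# part map to 2000+, the rest to 1900+.
expand_year() {
  # $1 = two-digit year, $2 = two-digit cut-off
  if [ "$1" -le "$2" ]; then echo $((2000 + $1)); else echo $((1900 + $1)); fi
}

expand_year 7 30    # 2007
expand_year 85 30   # 1985
```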
Number of fields: DFsqlload creates a minimum of 10 fields (1~7,NF-2~NF) for each SQL table. The maximum number of fields is limited by the database: PostgreSQL 1600, MySQL 1000, Oracle 1000, MS SQLServer 1024. DFsqlload will report errors for DFdiscover records that do not meet these field requirements and will continue the loading process.
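The per-database limits quoted above can be captured in a small helper, useful when pre-checking a plate's width before a load; the function name and the flavor keywords are illustrative:

```shell
# Per-database column limits quoted in the text above
max_columns() {
  case $1 in
    postgresql)   echo 1600 ;;
    mysql|oracle) echo 1000 ;;
    mssql)        echo 1024 ;;
    *) echo "unknown database: $1" >&2; return 1 ;;
  esac
}

max_columns postgresql   # 1600
```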
Postgres - A schema is a namespace that contains DFdiscover study tables. Schema names are DFdiscover study names. If the schema name specified in the DFsqlload command line does not exist in the Postgres database, DFsqlload will create the schema as specified. DFsqlload will never drop existing schemas.
Oracle - A schema is a database user who is also the owner of the DFdiscover tables that belong to one study. The schema specified in the command line must already exist in Oracle, unless a tablespace name (which must exist in the database) is also specified. If the schema does not exist, DFsqlload will create the schema and assign the specified tablespace as the schema's default tablespace. The DFdiscover study tables are created in the specified tablespace if one is given, otherwise in the schema's default tablespace. DFsqlload will check the schema's quota privilege on the storage tablespace and grant unlimited quota privilege on that tablespace. DFsqlload will never drop existing schemas.
MySQL - There is no schema in MySQL. A MySQL database maps to a DFdiscover study database. In the command line, the schema name must be the same as the database name. Note that the database name is case sensitive whether MySQL is running on a UNIX or Windows platform. If the specified database does not exist in the MySQL database, DFsqlload will create the database as specified. DFsqlload will never drop existing MySQL databases.
MS SQLServer - There is no schema in MS SQLServer in version 2000 and older. A SQLServer database maps to a DFdiscover study database. In the command line, the schema name must be the same as the database name.
Import study 254 into MySQL
Import the validation study val254 into MySQL on host talisman.
% DFsqlload -flavor mysql -q talisman:val254:val254:root:mysql 254
DFtextps [-PDF] [-A4] [-h] [-n #] [-d] [[-1] | [-2] | [-4] ] [-p #] {file...}
DFtextps is a utility program that converts plain text input files to PDF. If no input files are named, the input is taken from standard input.
The resulting PDF is sent to standard output where it can be re-directed to a file.
DFtextps exits with one of the following statuses:
Print output from DF_SSvisitmap report
Run the DFdiscover report DF_SSvisitmap for study 1, saving the output as a PDF document.
% /opt/dfdiscover/reports/DF_SSvisitmap 1 | /opt/dfdiscover/bin/DFtextps > visitmap.pdf
DFuserdb [-help] [-fsck] [-reset] [-reset2fa] [-dump2fa] [-stats] [-unlock] [-export filename] [-import filename]
DFuserdb is used to perform maintenance operations such as import/export on the user and role database.
Invoked without arguments, DFuserdb works in interactive mode. Allowable commands match the options listed here, but without the leading dash.
FSCK done
-import
DFuserdb.log
DFuserdb.idx
DFuserdb exits with one of the following statuses:
DFversion
DFversion invokes the DFwhich command for every DFdiscover executable, DFdiscover report and DFdiscover utility included in a standard installation.
Output is grouped by executables, reports and utilities. Within each group, output is sorted alphabetically by name.
DFversion is the most efficient and accurate way to determine the current state of an installation.
DFversion exits with one of the following statuses:
DFwhich {name...}
DFwhich examines and displays the RCS (revision control system) string of the specified filename(s), in a manner similar to the UNIX what command.
The output from the DFwhich command can be useful in determining exactly what version of the DFdiscover software is currently being used.
DFwhich exits with one of the following statuses:
Show the version information for DFedcservice
% DFwhich DFedcservice
DFedcservice: Version: 2016.0.0, Date: Apr 29 2016 DF/Net
DFwhich uses standard UNIX commands to locate version information. The content of the output will be essentially the same but may differ in appearance due to differing implementations of these commands across Linux distributions.
DFaddHylaClient — Create the symbolic links necessary for accessing HylaFAX on a DFdiscover server
DFadmindb — Add or reload user and role history to DFadmin.db sqlite database.
DFauditdb — Add or reload journal files to DFaudit.db sqlite database.
DFcertReq — Request an SSL certificate signing for DFedcservice.
DFclearIncoming — Clean out the image receiving directory, processing all newly arrived images
DFcmpSchema — Apply the data dictionary rules against the study database
DFcmpSeq — Determine the appropriate values for each .seqYYWW file
DFisRunning — Determine if the DFdiscover master program is currently running on the licensed DFdiscover machine
DFmigrate — Upgrade study setup and configuration files from an old DFdiscover version to the current version.
DFras2png — Convert Sun raster files in the study pages directory into PNG files
DFsetupdb — Add or reload all study schema files to DFsetup.db sqlite database.
DFshowIdx — Show the per plate index file(s) for a specific study
DFstudyDiag — Report (diagnose) the current status of a study database server
DFstudyPerms — Report, and correct, the permissions on all required DFdiscover sub-directories and files for a study
DFtiff2ras — Convert a TIFF file into individual PNG files
DFuserPerms — Import and update users and passwords, and optionally import roles, role permissions, and user roles
DFdiscover includes several utility programs that can be executed from a command line. Most of these programs are useful in situations where repairs or corrections within your DFdiscover environment are needed. They are located in the DFdiscover utilities directory, /opt/dfdiscover/utils.
DFaddHylaClient
DFdiscover uses HylaFAX to send and receive faxes. However, as installed by the DFdiscover INSTALL program, HylaFAX is not fully configured to run on a fax server workstation. DFaddHylaClient makes the links necessary to allow HylaFAX to operate.
Each workstation that will be used for receiving incoming faxes or sending faxes, must have this utility command run on it. The command must be run by someone with super-user privileges and needs to be run only once per workstation (unless the location of the DFdiscover software is subsequently changed).
As distributed, HylaFAX expects to find all of the fax resources it needs in the /opt/hylafax directory. However, those resources are actually in /opt/dfdiscover/$MACHINE/hylafax. DFaddHylaClient makes symbolic links to resolve this inconsistency. Additionally, DFaddHylaClient creates a HylaFAX work directory in /var/spool/fax. If the directory already exists, the necessary HylaFAX files are copied into the directory.
NOTE: This program is generally executed as part of the HylaFAX configuration in a new DFdiscover install, and subsequently never needs to be executed again.
DFaddHylaClient exits with one of the following statuses:
DFadmindb — Load user and role change history to DFadmin.db sqlite database.
DFadmindb [-reload]
A new user and role change history sqlite database was introduced in version 5.8.0 to support the History functionality in DFadmin. The study, user, and role history in DFadmin requires all existing user and role history tracked in DFuserdb.log to be loaded into the new admin history database. DFadmindb is provided for this purpose. Until user and role history is loaded, it is not possible to view history in DFadmin.
The DFadmindb utility must be run by either user datafax or root with the -reload option. The reload option empties existing entries and reloads DFuserdb.log to the DFpermlog table in DFadmin.db. This process may take some time if DFuserdb.log is large. This can be performed while DFdiscover is running, but ensure no user and role changes are made in DFadmin while it is running.
DFadmindb must be part of the server upgrade process like DFmigrate. DFadmindb is installed in the utils directory.
DFauditdb [-reload] {-s # [-stdin]|-s #~#|all}
A new audit trail sqlite database was introduced in release 5.5.0 that greatly improves the performance of retrieving and displaying page or field history in DFexplore or DF_SBhistory. Existing journal files for any ongoing studies created in a release previous to 5.5.0 must be loaded into the new audit database for each study. DFauditdb is provided for this purpose. Until study journal files are loaded, it is not possible to view page or field history, or run DF_SBhistory.
Studies that have been created in release 5.5.0, or newer, do not need to be reloaded.
The DFauditdb utility must be run by either user datafax or root when reloading journal files using the -reload option. The reload option empties existing entries and reloads journal files to the DFaudit table in DFaudit.db. If DFdiscover is running, the study to be loaded must not be writing to the study database. If a single study is specified, the -stdin option can be used in cases where there is a large (> 1 GB) volume of audit records, bypassing creation of a potentially large temporary file. When the -stdin option is used, audit trace information is piped to DFauditdb, similar to the following example.
/opt/dfdiscover/bin/DFaudittrace -s 254 -q -r -N -B | /opt/dfdiscover/utils/DFauditdb -reload -s 254 -stdin
DFauditdb can be used to load all studies at once using the -s all option, provided the study servers can be started. In this case the -stdin option is unavailable. Studies with large audit trails need to be loaded separately. Running DFauditdb with just the -s studynum option counts the entries in the DFaudit table in DFaudit.db.
DFauditdb can be part of the server upgrade process like DFmigrate. DFauditdb is installed in the utils directory.
DFcertReq
DFcertReq is used to generate an SSL key and a signing request for use with DFedcservice. It must be run by root or datafax. DFserveradmin also provides this functionality in a point-and-click visual interface. Most administrators will find DFserveradmin to be the preferred interface for this task.
NOTE: There is no requirement to use DFcertReq and DF/Net Research, Inc. as the SSL certificate signing authority. There are many standard, commercial certificate signing authorities (known as CAs) that are internationally recognized. For a small annual fee paid to the CA, they will sign your certificate. Their signed certificate can be used in your DFdiscover installation. Some clients prefer this approach of using an independent CA.
DFexplore, DFsetup, DFadmin and DFsend communicate with DFedcservice using TLS (Transport Layer Security; specifically, TLS v1.2 or v1.3), in the same way that Internet banking sites do. TLS provides an encrypted path through the internet that prevents eavesdropping and modification of your data by third parties. The client applications check that they are communicating with the correct DFdiscover server by means of a certificate that encodes the DFdiscover server's name and ownership. This certificate is generated by choosing a very large random number (specifically a 4096-bit RSA key), adding organizational ownership information to it, and then requesting that DF/Net Research, Inc. certify this key as authentic. Subsequently, when a DFexplore, DFsetup, DFadmin or DFsend client connects to DFedcservice, it asks for this certificate and can then determine whether it is communicating in an encrypted manner with the correct server.
The process of generating a certificate starts with the execution of the DFcertReq script. This script generates a large random number and then prompts for organizational information, including the country, state, organization name and server name.
IMPORTANT: It is extremely important that the organizational information is up-to-date and accurate. Client applications may view this information to confirm the server's identity and certificate status.
The organizational information is combined with the large random number to create a unique certificate signing request text file. An email is created for certreq@dfnetresearch.com containing a request to certify that this is an authentic DFdiscover server. DF/Net Research, Inc. processes this request and emails back a small file containing the certificate, which is then installed in the DFdiscover system. At the end of these steps, communication between the DFexplore, DFsetup, DFadmin and DFsend client applications and DFedcservice is encrypted and secure.
DFcertReq will fail to email the signing request if the computer from which it is run is unable to send email via the internet. In this case, it is possible to manually generate the request by:
Transfer the files /tmp/cert.csr.text and /tmp/cert.csr, generated by DFcertReq, to an email-enabled computer. Remember to perform the transfer in binary mode if using an application like ftp.
Attach the two transferred files to a new email message and send it to certreq@dfnetresearch.com.
At the time of executing DFcertReq, if there is no /opt/dfdiscover/lib/DFlogin.html file present, DFcertReq will also create the file, adding the organizational information collected from the user's responses. Subsequently, this information appears in the banner of each DFdiscover login dialog. To override this behaviour, before executing DFcertReq, create your own login banner message in /opt/dfdiscover/lib/DFlogin.html.
DFcertReq exits with one of the following statuses:
Creating a DFedcservice SSL certificate and signing request
# DFcertReq
*********************************
* When asked for 'DFdiscover Server Name (fully qualified domain name)'
* type the full name of the machine (e.g. dfdiscover.mycompany.com)
* as it is called from the Internet.
*********************************
----------------- Generating Server Key ----------------
Generating RSA private key, 4096 bit long modulus (2 primes)
...............................................++++++
..................++++++
e is 65537 (0x10001)
----------------- Decrypting Server Key ----------------
writing RSA key
----------------- Generating Server Signing Request ----------------
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [US]:
State or Province Name (full name) []:California
Locality Name (eg, city) []:San Diego
Organization Name (eg, company) []:Pharmadrug Biotech Inc.
Organizational Unit Name (eg, section) [Research]:
DFdiscover Server Name (fully qualified domain name) [dfdiscover.your-company.com]:dfdiscover.pharmadrug.com
Email Address [support@your-company.com]:support@pharmadrug.com
Emailing certificate signing request to DF/Net Research, Inc...
Once the certificate has been signed it will be emailed back to you.
DFclearIncoming
DFclearIncoming uses DFinnotify.rpc to submit all unprocessed images in the DFdiscover received image (incoming) directory to the defined incoming DFdiscover daemons.
Under normal circumstances, this command is not needed as DFdiscover automatically processes each incoming image as receipt completes.
This command is also automatically invoked by the DFdiscover bootstrap procedure when the software is restarted to process any images that were received while DFdiscover was not operational.
If one or more incoming daemons are processing when DFclearIncoming is started, DFclearIncoming will wait until all of the daemons have exited. Then it will empty out the /opt/dfdiscover/work/.dfincoming_work file and submit for processing all of the images that appear in the DFdiscover incoming directory.
DFclearIncoming exits with one of the following statuses:
Clean out the incoming image directory
# DFclearIncoming
1. Checking state of incoming daemons...
2. Checking /opt/dfdiscover/work/.dfincoming_work...
   2 stale incoming entries obsoleted
3. Checking /opt/dfdiscover/incoming...
   1 faxes awaiting processing.
   Processing fax00010.tif
DFcmpSchema [-a] [-v] [-p #] [-dDRF_filename] {-s #}
DFcmpSchema performs a consistency check of the database against the data dictionary for the study. It reports:
records that have an incorrect number of fields
records that have illegal status, validation levels, or time stamps
records that have an impossible CRF image reference
key fields that are illegal or inconsistent with the current study or plate
values that occupy more columns than the maximum defined for the field
values that have an incorrect format
fields that have impossible values
choice and check fields whose data value does not match the coding defined in the study schema
missing delimiter at the end of data records for user defined CRF plates
missing or empty required fields on clean primary or secondary data records
date values that contain digits, characters, or ?? on clean primary and secondary records, and do not conform to the field imputation or partial rounding variable format
When the data dictionary for an existing study is changed via DFsetup, DFdiscover does not retroactively modify the database to match the new dictionary. DFcmpSchema is useful in this circumstance to identify and locate existing data that now violates the data dictionary.
DFcmpSchema applies two types of checking: basic and exhaustive. Basic checking is the default and includes all but the last two bulleted items described above. Exhaustive checking includes everything verified by basic checking, plus the last two items: missing/empty required fields and inconsistent date values. Exhaustive checking is performed by running DFcmpSchema with the -v option.
By default, DFcmpSchema applies basic checks to all primary records in the database, excluding any records in the new record queue. New records are only checked if plate zero is explicitly specified. Because the data fields in new records have not yet been verified, DFcmpSchema only checks fields 1-7, i.e. checking stops at the subject ID field.
For each inconsistency, the report includes the record, plate number, the line number in the exported data file, a synopsis of the inconsistency, the key fields of the record (so it can be retrieved with DFexplore), the data dictionary definition, and the current data value.
Check all primary and secondary records. The default is to check only primary records.
NOTE: Missed records are never checked.
Apply exhaustive checking. Exhaustive checking includes all basic checking, plus additional checks, applied to clean records only, for missing/empty required fields and inconsistent date values.
DFcmpSchema exits with one of the following statuses:
Report on all inconsistencies for study 240
% DFcmpSchema -s 240
2|1|0245R0032001|240|1||1||0|0||||||||2|02/11/11 13:26:52|02/11/11 13:26:52|
E** Plate 1 (Screening Form), line 1: Incorrect field count: 20 should be 21.
Perform exhaustive checking for inconsistencies on plate 2 for study 254
% DFcmpSchema -v -p 2 -s 254
1|7|0436R0008002|254|2|1|10052|KKL|23/06/04||09/09/24|||180||080|1|1|23/07/04|1|04/09/08 11:08:15|04/11/11 09:33:48|
E** Plate 2 (Patient Entry Form), line 2: Required data field.
    Subject ID='10052', Visit='1'
    Field#10 (status=1, validation=7) Name='MEDCODE' Desc='Medication Code 1'
    Type=INT Required Width=4
    DATA LEN=0 ''
1|7|0436R0006002|254|2|1|10056|POL|00/04/03|1112|07/08/24||111.1|166||070|1|1|05/06/04|1|04/09/08 10:03:14|04/11/11 09:45:32|
E** Plate 2 (Patient Entry Form), line 3: Bad data.
    Subject ID='10056', Visit='1'
    Field#9 (status=1, validation=7) Name='EDATE' Desc='Entry Date'
    Type=DATE Required Width=8 Format='dd/mm/yy' Imputation='never' Range='1940 - 2039'
    DATA LEN=8 '00/04/03'
DFcmpSeq
This command must be run by user datafax or root.
DFcmpSeq verifies the contents of the sequence files maintained by DFdiscover in the /opt/dfdiscover/work directory. It verifies for each file that the number stored in the file is:
greater than the highest sequence number assigned to a file in the TIFF archive directory (by any daemon)
greater than the highest sequence number referenced in the DFdiscover maintained fax_log file
In unusual circumstances it may be that a sequence file cannot be updated by DFdiscover when it should be. This can subsequently lead to problems processing incoming images because new sequence numbers assigned by DFmaster.rpcd overlap with previously used sequence numbers. Incoming image processing will fail and messages will be logged while this condition persists. DFcmpSeq must be used in this case to determine what the correct values are for the sequence files.
Typically this command is required only in circumstances where a sequence file has been accidentally deleted and its contents need to be re-created.
DFcmpSeq exits with one of the following statuses:
Report on the current status of the sequence files
In this example, DFcmpSeq finds two problems. The solutions are to insert 286 in /opt/dfdiscover/work/.seq9245 to fix the first problem and 3 in /opt/dfdiscover/work/.seq9401 to fix the second problem.
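The corrective action described above can be scripted. The sketch below assumes each sequence file is a plain-text file holding a single number (this manual states the values to insert, not the file format, so treat that as an assumption); it demonstrates the writes in a scratch directory rather than the live /opt/dfdiscover/work:

```shell
# Demonstration in a scratch directory; on a real server the files are
# /opt/dfdiscover/work/.seq9245 and .seq9401, written as user datafax or
# root. (Pausing DFdiscover while editing sequence files is prudent,
# though this manual does not mandate it.)
work=$(mktemp -d)
echo 286 > "$work/.seq9245"   # fixes FAXLOG > SEQFILE for 1992/45
echo 3   > "$work/.seq9401"   # fixes ARCHIVE > SEQFILE for 1994/01
cat "$work/.seq9245" "$work/.seq9401"
```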
# DFcmpSeq
checking daemon 1, configuration file /df/lib/DFinbound.1...
Checking archive directory /archive/g3...
checking YYWW 9618...
checking YYWW 9621...
checking YYWW 9622...
checking YYWW 9604...
checking YYWW 9605...
checking YYWW 9606...
checking YYWW 9607...
checking daemon 2, configuration file /df/lib/DFinbound.1...
Checking .seq files...
Checking fax_log...
YYYY/WW   Archive   Faxlog   SeqFile   Error
1992/45         0      285       279   FAXLOG > SEQFILE
1994/01         2        0         0   ARCHIVE > SEQFILE
DFisRunning [-q]
DFisRunning is a utility program that determines whether or not DFmaster.rpcd is currently running on the licensed DFdiscover machine. It is most useful in shell scripts and cron jobs that need to determine if DFdiscover is operational before proceeding.
DFisRunning exits with one of the following statuses:
Report on the current status of the DFdiscover master, first in quiet mode and then in non-quiet mode
% DFisRunning -q
% echo $status
1
% DFisRunning
DFmaster is running on 'oberon'.
% echo $status
1
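Since DFisRunning is intended for scripts and cron jobs, a crontab fragment might gate a nightly job on the master being up. Everything here is illustrative: the schedule and the script path are hypothetical, and note that the transcript above shows an exit status of 1 while the master is running, so the guard tests for status 1 explicitly; confirm against the exit-status table for your release.

```shell
# Hypothetical crontab entry: run a nightly script at 02:00 only when
# the DFdiscover master is up. The transcript above shows DFisRunning
# exiting with status 1 while the master IS running, hence the test
# against 1 rather than the conventional 0.
0 2 * * * DFisRunning -q; [ $? -eq 1 ] && /opt/dfdiscover/local/nightly.sh
```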
DFmigrate [-f] [-a] [-s #, #-#] [#study_directory]
Several changes were introduced in release 2014.0.0 that are incompatible with earlier releases of DFdiscover. As a result any ongoing studies created in a release previous to 2014.0.0 must be migrated in order to make them compatible. DFmigrate is provided for this purpose. Until a study is migrated, it is not possible to use it in the current release of the software.
Studies that have been created in release 2014.0.0, or newer, do not need to be migrated. Similarly, studies that were previously migrated to be compatible with 2014.0.0, do not require (re-)migration.
IMPORTANT: It is strongly recommended that you create a backup of your original study directories prior to migration. If you have not already backed up all studies prior to the new installation, do so now. The migration process does not touch study data or images; only study setup files, including plate backgrounds, are migrated.
The DFmigrate utility must be run by either user datafax or root. DFdiscover does not need to be running to run DFmigrate. If DFdiscover is running at the time of migration, the study to be migrated must not be in use. DFmigrate can be used to migrate all studies at once or one study at a time. The migration process follows the same steps regardless of the number of studies being migrated. The steps of the migration process are documented in the DFmigrate output as follows:
Sanity Checks - Before migrating any studies DFmigrate checks to ensure that:
no study directory is nested within another study directory,
each study directory and configuration file belongs to exactly one study
If a failure is detected in either of these requirements, DFmigrate terminates with an error message and no studies are migrated.
NOTE: Study directory paths are evaluated without regard to case, e.g. /opt/studies/study1 and /opt/studies/Study1 are considered identical, and not allowed as each study path must be unique.
Updating DFserver.cf - This step updates the existing DFserver.cf file found in STUDY_DIR/lib to include new Minimum Version parameters. A copy of the original DFserver.cf file is created as DFserver.cf_oldversion and stored in STUDY_DIR/lib prior to the conversion. The study's minimum version restriction for DFexplore and DFsetup are set to the same version as DFmigrate (i.e. the current DFdiscover version).
Converting DFsetup - This step converts the existing DFsetup, DFschema, DFschema.stl, DFtips and DFcterm_map files found in STUDY_DIR/lib to the format expected by the current version. Copies of the original files (called DFsetup_oldversion, DFschema_oldversion, DFschema.stl_oldversion, DFtips_oldversion, DFcterm_map_oldversion) are created and stored in STUDY_DIR/lib prior to the conversion. The QC_Report_ID style is converted from type string to type number. If user-defined string styles also exist with the "Treat As Pre-printed Numerals" attribute, these too will be converted to type number during migration. If plate triggered procedures (Pre- and Post-processes) have been defined for any of the plates in DFsetup, a new plate arrival trigger file is created for each Pre- and Post-process. The plate arrival trigger files are named DFPrePost###.sh, where ### represents the plate number on which the process is defined. These files are saved to the study directory STUDY_DIR/ecbin. If a plate contains both a Pre- and Post-process, both processes will be merged into a single plate arrival trigger file.
This step is also responsible for creating the following directories in the STUDY_DIR parent directory: ecbin, dfsas, drf and dfschema, which did not exist in earlier releases.
Where they are not already defined, custom level labels are set to the default label values Level 0, Level 1,... , Level 7.
Moving Lookup Tables to lut directory - All lookup tables must reside in either a study-level lookup table directory, STUDY_DIR/lut, or in the DFdiscover-level lookup table directory, /opt/dfdiscover/lut. DFdiscover looks for referenced lookup tables first at the study level and then, if they cannot be found there, at the DFdiscover level. In this step STUDY_DIR/lut is created, and any lookup tables defined in the study lookup table map file, STUDY_DIR/lib/DFlut_map, are moved to STUDY_DIR/lut. DFlut_map itself continues to reside in STUDY_DIR/lib.
NOTE: Lookup tables which do not reside in STUDY_DIR/lib, and were instead defined in STUDY_DIR/lib/DFlut_map with an absolute path location, will not be moved by DFmigrate and must be moved manually to either STUDY_DIR/lut or /opt/dfdiscover/lut. Also STUDY_DIR/lib/DFlut_map must be updated to remove any absolute pathnames to lookup tables.
Moving Edit checks code to the ecsrc directory - Edit checks are stored in the STUDY_DIR/ecsrc directory. If necessary this directory is created and the pre-migration edit check file DFedits is moved to this directory. Published edit check file DFedits.bin, which is used by DFexplore, is not moved and continues to reside in STUDY_DIR/lib.
NOTE: Other edit check source files (i.e. any include files specified in DFedits) are not moved by DFmigrate and must be moved manually to the ecsrc directory in either STUDY_DIR or /opt/dfdiscover. Like lookup tables, DFdiscover now looks for edit check include files first in the study directory STUDY_DIR/ecsrc and if not found, then in the system location /opt/dfdiscover/ecsrc. All include files must be referenced by their file name alone, and an absolute path name is not allowed.
Converting postscript and old TIFF files to PNG - A DFbkgdXXX.png file is created for each plate background image found in the study STUDY_DIR/bkgd directory. These PNG files are necessary if you use the DFdiscover DFprintdb program to print CRF backgrounds merged with data records from the study database. These high-resolution PNG files are also used by DFexplore to print CRF backgrounds. The PNG files are created by converting existing high-resolution tiff files. If tiff files do not exist then the PNG files are generated by amalgamating the DFbkgd-prologue.ps file and the individual DFbkgdXXX.ps files for each plate and using Ghostscript (DFgs) to convert the merged PostScript files into high resolution PNG files.
NOTE: If the PostScript conversion performed in this step fails, the PNG files can be created by re-importing and saving the CRF backgrounds in the DFsetup tool.
If during execution of DFmigrate any of the above steps fail, DFmigrate can be re-run for the same study after the outstanding issues have been resolved.
DFmigrate appends a title (which includes: study number, study directory, date and time) followed by any error or warning messages for each migrated study to a log file named dfmigrate_errors which is created in the directory from which DFmigrate is run if it does not already exist. This file should be checked when DFmigrate ends. It may contain messages describing manual steps that need to be taken to complete the migration of one or more studies.
When migrating pre-4.1 studies: if the -a (migrate all studies) option is used, an entry is added to DFstudyspaces.db for each unique study parent directory found in DFstudies.db; if the -s option is used to migrate a single study, an entry is added for that study alone, unless its study space is already defined.
Once migration has been successfully completed for a study the DFdiscover utility program DFstudyPerms should be run to verify and, if necessary, correct any permission problems that may exist for the study directories and files.
DFmigrate exits with one of the following statuses:
Migrate study 100 to the current version
DFmigrate gets the correct path for study 100 from its entry in /opt/dfdiscover/lib/DFstudies.db.
% DFmigrate -s 100
******************************
DFmigrate: Study 100 - /opt/studies/test100 - Thu Apr 15 12:58:40 2010
******************************
Step 1: Updating DFserver.cf
------------------------------------------------------------
File Converted: /opt/studies/test100/lib/DFserver.cf
Step 2: Converting DFsetup
------------------------------------------------------------
OLD SETUP FILE <DFsetup v3.7.0>
File Created: /opt/studies/test100/ecbin/DFPrePost001.sh.
File Created: /opt/studies/test100/ecbin/DFPrePost002.sh.
File Created: /opt/studies/test100/ecbin/DFPrePost003.sh.
NOTE: Style type for the following styles has been changed to Number
as it was previously set to treat data as pre-printed numerals:
STYLE
--------------
QC_Report_ID
File Converted: /opt/studies/test100/lib/DFsetup
File Converted: /opt/studies/test100/lib/DFschema
File Converted: /opt/studies/test100/lib/DFschema.stl
File Converted: /opt/studies/test100/lib/DFtips
File Converted: /opt/studies/test100/lib/DFcterm_map
Step 3: Moving Lookup Tables to lut directory
------------------------------------------------------------
ERROR: Absolute file path found!
lookup table file /opt/studies/test100/lib/DF_QClut not moved.
Step 4: Moving Edit Checks code to ecsrc directory
------------------------------------------------------------
File Moved: from lib/DFedits to ecsrc/DFedits.
WARNING: DFedits contains #include files which must be moved to ecsrc manually.
Step 5: Converting postscript to png and old tiffs to png
------------------------------------------------------------
All background files converted from DFbkgd###.ps to DFbkgd###.png
Study migration was successful.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DONE: Migration has been performed on the following study: 100
Please review dfmigrate_errors for any migration errors or warnings
and/or manual steps that may be required.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%
Migrate a non-active study to current version
Study 207 is not active. It does not appear in the DFadmin study list and is not defined in /opt/dfdiscover/lib/DFstudies.db. The study directory is located at /tmp/studies/study207. It might be an old study copied from a backup medium, or a copy of an active study (e.g. cp -pr /opt/studies/study107 /tmp/studies/study207)
% DFmigrate 207 /tmp/studies/study207
Migrate all studies found in /opt/dfdiscover/lib/DFstudies.db to current version
% DFmigrate -a
DFras2png {-s study_dir}
DFras2png is used to convert Sun raster files in the study pages directory into PNG files. File names remain the same.
All recent versions of DFdiscover store images in PNG format rather than the Sun raster image format used in older versions. Converting the images to PNG format brings several advantages: smaller file sizes, color capability, and better interoperability with other software, as very few packages can read the Sun raster file format.
Only files in the pages directory are converted. Any Sun raster image files in any of the other study directories are not touched. Any PNG files that exist in the pages directory are reported but are not re-converted. Any other files that are not Sun raster files are ignored. Study directories that are incorrectly specified or don't exist are reported. All reporting is written to stderr.
DFras2png exits with one status:
Convert all Sun raster files in the pages directory found in the study directory /opt/studies/demo253 into PNG files
% DFras2png -s /opt/studies/demo253
DFsetupdb — Add or reload study schema files to DFsetup.db sqlite database.
DFsetupdb [-reload] {-s # [-stdin]|-s #~#|all}
A new setup sqlite database DFsetup.db was introduced in version 5.9.0 to support setup History functionality in DFsetup. The study setup history in DFsetup requires all existing user and role history tracked in the study schema files (stored in the study directory under dfschema) to be loaded into the new setup history database. DFsetupdb is provided for this purpose. Until setup history is loaded, it is not possible to view history in DFsetup.
The DFsetupdb utility must be run, with the -reload option, by either user datafax or root. The -reload option empties any existing entries and reloads the study schema files into the tables in DFsetup.db. This process may take some time if the study schema files are large. It can be performed while DFdiscover is running, but ensure that no study setup changes are made in DFsetup while the reload is in progress.
Like DFmigrate, DFsetupdb is part of the server upgrade process. DFsetupdb is installed in the utils directory.
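For example, following the synopsis above, the history for a single study might be reloaded with an invocation like the following (study number 100 is hypothetical, and the command's output is not documented here):

```
% DFsetupdb -reload -s 100
```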
DFshowIdx [-F] [-q] [-l #] [-p #] {-s #}
DFshowIdx displays the per plate index files as described in plt###.ndx - per plate index files.
DFshowIdx exits with one of the following statuses:
DFstudyDiag {-s #}
There are a variety of consistency checks that must be in place for a study server to operate correctly. If one or more of these consistency checks fails, it may be difficult for a user to determine the failure condition so that it can be remedied. DFstudyDiag is the utility to use in situations like this.
DFstudyDiag checks a variety of places while diagnosing a study server. It consults the DFdiscover master, reviews the DFdiscover studies database, communicates with the operating system's portmapper, and interacts with one or more DFdiscover slaves. DFstudyDiag displays its progress with each of these interactions as they proceed.
DFstudyDiag exits with one of the following statuses:
What is the status of study 7's database server?
% DFstudyDiag -s 7
Diagnosing study server 7 starting Fri Apr 25 14:57:16 1997...
>> Trying to contact study server directly...
<< Study server is currently operational and responding.
DFstudyDiag detects a problem with study 8
% DFstudyDiag -s 8
Diagnosing study server 8 starting Fri Apr 25 15:00:52 1997...
>> Trying to contact study server directly...
<< Failed.
>> Trying to load studies database from master...
<< OK.
>> Contacting portmapper on candidate hosts...
<< OK.
>> Contacting slaves on candidate hosts...
<< OK.
>> Checking portmapper entries on candidate hosts...
<< OK.
>> Looking for existing serverstatus file...
<< The file '/opt/dfdiscover/work/.serverstatus8' exists although no study
<< appears to be running. The file should be removed.
Please show this output to your DFdiscover administrator.
DFstudyPerms [-v] [-f] [-g string] {#}
Incorrect permissions or file ownerships are the most common causes of problems in UNIX applications, and DFdiscover, being a UNIX application, is no exception.
DFstudyPerms is a useful utility for detecting permission and ownership problems on all required DFdiscover sub-directories and files for a study. It is not able to detect errors in files that are not required by DFdiscover.
DFstudyPerms exits with a status of 0 if no errors are detected and a non-zero status if errors are detected.
DFstudyPerms does not detect problems with ownership or permissions in the DFdiscover software files themselves. To reset ownerships and permissions of the DFdiscover software, use the SETPERMS script found in /opt/dfdiscover.
DFstudyPerms exits with one of the following statuses:
Check, in verbose mode, the permissions for study 251
% DFstudyPerms -v 251
checking study 251, group studies
...checking '/opt/studies/exemplar251'
...checking '/opt/studies/exemplar251/lib'
...checking '/opt/studies/exemplar251/bkgd'
...checking '/opt/studies/exemplar251/data'
...checking '/opt/studies/exemplar251/data/1602.jnl'
...checking '/opt/studies/exemplar251/data/1603.jnl'
...checking '/opt/studies/exemplar251/data/1604.jnl'
...checking '/opt/studies/exemplar251/pages'
...checking '/opt/studies/exemplar251/pages_hd'
...checking '/opt/studies/exemplar251/reports'
...checking '/opt/studies/exemplar251/work'
...checking '/opt/studies/exemplar251/lib/DFcenters'
...checking '/opt/studies/exemplar251/lib/DFfile_map'
...checking '/opt/studies/exemplar251/lib/DFschema'
...checking '/opt/studies/exemplar251/lib/DFsetup'
...checking '/opt/studies/exemplar251/lib/DFtips'
...checking '/opt/studies/exemplar251/lib/DFvisit_map'
...checking '/opt/studies/exemplar251/lib/DFlut_map'
...checking '/opt/studies/exemplar251/lib/DFmissing_map'
...checking '/opt/studies/exemplar251/lib/DFpage_map'
...checking '/opt/studies/exemplar251/lib/DFqcproblem_map'
% echo $status
0
DFtiff2ras {infile} {outfile}
DFtiff2ras is used to convert TIFF files into individual PNG files whose names consist of the output stem followed by a 3-digit page number. The raster files are converted to 100 DPI images but are not image processed in any other way.
DFtiff2ras exits with one of the following statuses:
Convert the file fax12345.tif into PNG files rast001, etc
% DFtiff2ras fax12345.tif rast
DFuserPerms [-S server] [-U username] [-C password] [-a] [-f] [-e errorlog] {-i inputfile}
Normally, creating and modifying users, passwords, roles, role permissions, and user roles is done with DFadmin. However, there may be cases where it would be useful to perform these operations from the command line. DFuserPerms is a tool that grants the ability to import these records from the command line. The records must be stored in an input file which conforms to the format used in the DFuserdb.log file (see System Administrator Guide, DFuserdb.log), although with the following exceptions:
Record Time Stamp and Record Modifier must not be included for any record type.
Instead of a Password Hash field for PASS records, the actual password to be imported must be specified.
IMPORTANT:
- USRL and RLPM records require the corresponding ROLE record to be present in the same import file. However, PASS and USRL records do not require the corresponding USER to be present in the same file.
- For ROLE records, the role ID specified in the import file does not affect which ID it is actually assigned. If the role's name and study number match an existing role, the existing role is edited to match the record being imported. If the role's combination of name and study number is new and unique, it is imported as a new role with the next available ID. As a result, role names cannot be edited with DFuserPerms.
- Though it does not matter to the ROLE record which ID it is given, if a USRL or RLPM record is being imported, the Role ID field is used to match it to a role imported in the same file. So if a user wishes to import USRL or RLPM records for a certain role, they must give the ROLE, RLPM, and USRL records the same role ID in their import file. See the second example.
- The Expiry field for PASS records is required but irrelevant. DFuserPerms sets the password expiry according to the settings in DFadmin.
- When there is an error in the record formatting or data, DFuserPerms does not import that specific record, but does import all other valid records in the file. As such, DFuserPerms is considered to have run successfully even if there are invalid records. Thus errors in the record formatting or data are not written to stderr, but rather stdout. Only errors which prevent DFuserPerms from running, such as authentication errors, are written to stderr.
- If there is an error when reading a PASS record, the password characters are replaced with X's when the error is written to stdout.
DFuserPerms exits with one of the following statuses:
Importing USER and PASS records
In this example, a new user named Jason is imported and a password is assigned to them. The input file contains:
USER|Jason|2|Jason Johnson|A Company Incorporated|||||||||0|2|1|
PASS|Jason|apassword1234|1464208516
Importing the file:
% DFuserPerms -S servername -U username -C password -i inputfile
************************************
DFuserPerms: Server [servername] User [username]
Fri Jun 10 09:40:37 2016
************************************
Input file: inputfile
Reading file
===============================
Imported Records = 2
Total Rows = 2
===============================
DONE: Importing user-perms complete.
Errors (if any) are logged above.
************************************
Importing ROLE, RLPM, and USRL records
In this example a new role called Company Permissions is imported, role permissions for this role are specified, and user Jason is assigned to have this user role. The input file contains:
ROLE|20|254|2|Company Permissions||*|*|*||*|0|0
RLPM|20|2|2|*|*|1,2,3,4,5|1,2,3,4,5|1,2,3,4|*|*|*|0|0
USRL|Jason|20|1|2|*|*
% DFuserPerms -S servername -U username -C password -a -i inputfile
************************************
DFuserPerms: Server [servername] User [username]
Fri Jun 10 09:51:27 2016
************************************
Input file: inputfile
Reading file
===============================
Imported Records = 3
Total Rows = 3
===============================
DONE: Importing user-perms complete.
Errors (if any) are logged above.
************************************
Note that the ROLE, RLPM, and USRL records all used the same Role ID value (20). The DFuserdb.log file shows that the IDs changed when they were imported, but because the records were imported with the same ID value, DFuserPerms assigns them all to the same role ID (105):
20160610095127|ROLE|datafax|105|254|2|Company Permissions||*|*|*||*|0|0
20160610095127|RLPM|datafax|105|2|2|*|*|1,2,3,4,5|1,2,3,4,5|1,2,3,4|*|*|*|0|0
20160610095127|USRL|datafax|Jason|105|1|2|*|*
The DFdiscover edit check language provides a general-purpose programming environment in which to create procedures that can be used to perform logic checks on data fields, generate quality control notes, and calculate and insert values into data fields. Edit checks may reference one or more data fields from any available record in the database, and may include arithmetic, logical and comparison operators. Looping and conditional execution constructs are also provided. The language structure is based on the C programming language and is a loose subset of C.
Edit checks are defined, named and stored in a study edit check file (named DFedits and located in the study ecsrc directory). This file is accessed and edited by selecting View > Edit checks in DFsetup. Once defined, an edit check can be applied to the desired data field(s) using the style or variable definition dialogs, where the edit check can be set to execute on field entry, field exit, plate entry, or plate exit.
When choosing among the options of field entry, field exit, plate entry and plate exit, note the following:
most edit checks should execute on exit from the last field used in the edit check to ensure that the user has finished all relevant data entry and corrections.
if you do not trust users to tab through all data fields, you may want to execute edit checks on both field and plate exit to ensure that a data record cannot be saved without executing the edit checks.
do not attach edit checks to hidden fields if you want them to be executed by users who do not have access to hidden fields.
in Image view most edit checks should not be executed until after the duplicate resolution step has completed (which occurs on entry to the first non-key field), because the initial ICR values may be replaced by existing database values during duplicate resolution. In particular note that any edit checks triggered on exit from the key fields (visit and ID) will run to completion before duplicate resolution has a chance to occur, and thus might produce incorrect results.
any plate entry edit checks executed in Image view will be re-executed after the existing data record is brought forward by duplicate resolution.
Edit checks are most often executed interactively when entering and reviewing data records in DFexplore, DFcollect and DFweb but can also be executed in batch mode (see Batch Edit Checks).
The process used to define edit checks in DFsetup is detailed in Study Setup User Guide, Edit Checks.
In addition to user-defined field and plate entry and exit edit checks, there are two reserved edit check names:
DFopen_study - executed when the study is selected in the login dialog, and
DFopen_patient_binder - executed when a subject binder is opened in DFweb, DFcollect or in DFexplore Data View, including when a user switches to Data View from any of the other views, and when a task record is selected for a different subject.
These two reserved edit checks are optional; if they are defined by an edit check programmer they will execute in DFexplore, DFweb and DFcollect. Additionally, DFopen_study executes for the first batch in a batchlist when run in batch mode. They are typically used to set user preferences (using function dfpref), change visit and plate requirements from the default visit map specifications (using function dfneed), and to display messages (using function dfwarning or dferror). Since no data record has the focus when these edit checks run, the edit checks cannot refer to data fields unless the fields are fully qualified. For example, @[1001,0,1,15] references field 15 and VDATE[1001,0,1] references variable VDATE, both of plate 1 at visit 0 for subject 1001. In DFopen_patient_binder, @PID can be used to refer to the subject ID; thus VDATE[@PID,0,1] contains the value of the VDATE field for the subject whose binder is being opened. In DFopen_study, @PID equals 0 since no subject has yet been selected.
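As an illustration, a DFopen_patient_binder edit check might warn the user on opening a subject binder if a key value is absent. The field and plate numbers here are hypothetical:

```
# Hypothetical example: warn on opening a subject's binder if the
# visit date (VDATE, assumed to be on plate 1 at visit 0) is missing.
edit DFopen_patient_binder()
{
    if (dfmissing(VDATE[@PID,0,1]))
        dfwarning("Subject ", @PID, " has no visit date on plate 1, visit 0.");
}
```

Because no data record has the focus, every field reference is fully qualified with @PID and explicit visit and plate keys.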
The edit check language provides the following features:
Support for all DFdiscover data types
Read/Write access to variables on the current plate
Read access to variables on other plates
Local variables
Arithmetic, logical, comparison operators
Query functions
Lookup table support
Legal range check functions
Message output functions
String matching and manipulation functions
Information functions
User-defined functions
In DFexplore the data fields available to edit checks are restricted by the user's site, subject, visit, plate and read level access permissions, as specified in the roles the user plays within each study. This helps to ensure that users will not be shown database values in edit checks to which they would normally not have access, but it also means that data records unavailable to the DFexplore user cannot be used in edit checks. This restriction does not apply to hidden fields. Thus data values that are needed in an edit check, but should not normally be seen by users with a certain role, can be made available by defining them as hidden fields on plates to which the user has read access permissions.
As each edit check is defined, it must be given a name. It is a good idea to use names that describe what the edit check is to accomplish. Names must start with a letter and may contain letters, numbers and underscores, _. There is no length limit on edit check names. Edit check names are case sensitive, so myeditcheck() and MyEditCheck() refer to different edit checks.
Comments can be placed anywhere in an edit check by preceding them with an octothorpe, #.
Edit checks begin with the line containing the keyword edit, the name of the edit check, a (, an optional parameter declaration list and a ). The optional parameter declaration allows arguments to be passed into the edit check. It is important to note that if parameters are declared, they must be supplied when the edit check is invoked. Following this is the body of the edit check, enclosed in a pair of braces, { and }. The body of the edit check contains zero or more statements that perform the desired operations.
Execution of the edit check begins with the first statement in the edit check and proceeds to the next in sequence until the logic flow changes because of an if or while statement. Execution is terminated via an exit or return statement or when control reaches the end of the edit check.
Braces, { and }, are used throughout the edit check language to group two or more statements together to operate as one statement. They become especially important when used with the if and while statements that require a single action statement.
Each statement in the edit check language is terminated by a semicolon, ;. A simple edit check to convert pounds to kilograms could be written as follows:
# This is a comment
edit lbs2kgs()
{
    kgs = lbs / 2.205;
}
Edit checks are placed one after the other in the DFedits file, so the addition of a similar function to convert kilograms to pounds would result in the following DFedits file:
edit lbs2kgs()
{
    kgs = lbs / 2.205;
}
edit kgs2lbs()
{
    lbs = kgs * 2.205;
}
An example of an edit check with parameters would look like this:
edit isbetween(number low, number high)
{
    if ((lbs < low) || (lbs > high)) {
        dferror("Weight is not between ", low, " and ", high, " pounds.");
    }
}
To verify that the weight is in the range of 100-250 pounds we would attach the edit check isbetween(100, 250) on field exit on the variable lbs.
Statements in the body of an edit check are executed in order. The logic of an edit check typically terminates after the last statement is executed; however, an edit check can be forced to terminate earlier by use of the return or exit statements. These two statements are very similar in purpose; they terminate the logic of an edit check at the point where the statement is executed - no following statements are executed.
The difference between the return and exit statements is in their behavior with respect to a sequence of edit checks that are to be executed at the same location on the current variable. The exit statement terminates the current edit check and does not execute any of the following edit checks on the variable, whereas return terminates the current edit check but then resumes processing with the next edit check defined on the variable, if any.
Difference between exit and return
In this example, a variable has the following definition for the field enter edit checks:
ageCheck, demog, conmed
The three edit checks are to be executed in the listed order. If an exit statement is executed in the body of ageCheck then edit checks demog and conmed are not executed. However, if a return statement is executed in the body of ageCheck, edit checks demog and conmed would still be executed.
The same would be true for an exit or return statement inside the body of edit check demog, conmed would be skipped if exit executed, while it would not be skipped if return was executed.
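A sketch of the difference, using the ageCheck example above (the field name age and the limits are illustrative):

```
edit ageCheck()
{
    if (dfmissing(age))
        exit;      # stop here; demog and conmed are NOT executed
    if (age < 18)
        return;    # stop ageCheck, but demog and conmed still execute
    dfmessage("Age check passed");
}
```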
Now that you have some idea of what edit checks are about and have seen an example, we will turn to a description of the elements of the edit check language.
Variables are storage spaces that can hold values. Each variable has a data type associated with it, where the data type is from the list:
The data types have the following properties:
number - an optional leading plus/minus sign, a sequence of one or more digits (whole part of the number), an optional decimal point, and, optionally, one or more digits (fractional part). The range of numbers (10 digits at maximum, except for subject ID which allows 15 digits at maximum) that can be represented is -2147483647 to 2147483647, inclusive.
date - 2 or 4 digits for the year, 2 digits or 3 characters for the month, 2 digits for the day, and optional delimiters. The format of the date is taken from the field definition for database variables, or from the date format declaration (see “Date Variables”) for constant dates.
string - a sequence of zero or more alphanumeric characters, plus a small set of two character sequences (see “String Constants”), enclosed in double quotes. The maximum length of a string is taken from the field definition for database variables, or is 16383 bytes for constant strings (4095 UNICODE characters).
time - either 2 digits for hours, a colon delimiter, and 2 digits for minutes (hh:mm), or the same followed by another colon delimiter and 2 digits for seconds (hh:mm:ss). The minimum and maximum values are based on a 24 hour clock, with a minimum value of 00:00:00 and a maximum value of 23:59:59.
Field types map to data types in the following way:
Number
Date
String
Time
Choice
Check
VAS
Different types can have different operations performed on them. For example, it makes sense to multiply two numbers together, but not two dates. The edit check language uses type information to decide what operations are permitted and what the results are. Subtracting two dates returns the number of days between them. Adding dates doesn't make sense, so this operation is not permitted and will result in an error message. Adding a number (of days) to a date is permitted and is used to calculate a date in the future.
Converting between variable types
Unlike some programming languages, the variable types of parameters to edit checks or functions are not always interchangeable. In these situations, a variable must be converted to the proper type before use, using another variable as an intermediary. For example, if you have a number that you need to convert to a string, the following code can be used:
number num1 = 2;  # our initial number variable
string tmpStr;    # a temporary string to hold the converted number
tmpStr = num1;    # this statement converts the number variable to a string
Note that, unlike some other languages, the edit check language does not perform inline or implicit typecasting. Should you need to convert another number, this variable can be re-used as a normal string variable.
Variables allow read/write access to data fields on the current plate, read-only access to data fields on another plate in the database, or read/write access to local temporary storage areas. A database field is referenced by its name as defined in the setup tool.
For example, if you have a database field called DOB in your study setup, you would reference the first instance of that field on the current page using:
DOB
References can be refined by specifying which module the variable is in. To reference the DOB field in the current module that the edit check is running in, use:
.DOB
To reference the DOB variable in the first instance of the DEMOGRAPHICS module, use the construct:
DEMOGRAPHICS.DOB
To reference the DOB variable in the module DEMOGRAPHICS with instance ID of 5, use the construct:
DEMOGRAPHICS[5].DOB
The above constructs operate on the current page. It is also possible to reference variables on other pages, in a read-only context, by adding the id, visit and plate keys to the variable name using the form variable[id, visit, plate].
To reference the first instance of the DOB variable for subject ID 12345 at visit 1 on plate 2, you would use:
DOB[12345, 1, 2]
The id, visit, and plate fields can each be arithmetic expressions, or may be left blank in which case they default to the current id, visit, and plate. Thus:
DOB[,,]
means, the variable DOB for the current id, on the current visit and plate which is the same as writing DOB by itself. Writing:
DOB[,1,2]
means the first instance of variable DOB for the current subject ID, on visit 1, plate 2.
Similarly, module names can be added to more precisely specify which instance of the DOB variable you wish to reference. To reference the DOB field in the first instance of the DEMOGRAPHICS module for subject ID 12345 at visit 1 and plate 2, use:
DEMOGRAPHICS.DOB[12345, 1, 2]
To reference the DOB field in instance 5 of the DEMOGRAPHICS module for subject ID 12345 at visit 1 and plate 2 use:
DEMOGRAPHICS[5].DOB[12345, 1, 2]
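The reference forms above can be summarized in one illustrative edit check (field, module and key values are taken from the examples in this section):

```
edit refdemo()
{
    date d;
    d = DOB;                             # first instance on the current page
    d = .DOB;                            # instance in the current module
    d = DEMOGRAPHICS[5].DOB;             # module instance 5, current page
    d = DOB[,1,2];                       # current subject, visit 1, plate 2 (read-only)
    d = DEMOGRAPHICS[5].DOB[12345,1,2];  # fully qualified, read-only
}
```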
Any string values written to database variables are sanitized (i.e., any '|' characters or control characters are converted to blank) before the data is written out to the database.
The first 6, and last 3, fields in all data records in a DFdiscover database are used for database management by DFdiscover, and are referred to by the same variable names in all DFdiscover studies. They are: DFSTATUS (data record status: final, incomplete, etc.), DFVALID (data record validation level: 1-7), DFRASTER (the name of the file holding the image of the CRF page corresponding to the data record), DFSTUDY (the DFdiscover study number), DFPLATE (the data record plate number), DFSEQ (the data record sequence or visit number), DFSCREEN (data record status without consideration for primary/secondary), DFCREATE (the record creation date and time) and DFMODIFY (the record's last modification date and time). Note that DFSEQ is assigned only if the sequence/visit number is in the barcode; otherwise the name is user-defined. A detailed description of these fields can be found in plt###.dat field descriptions.
In addition to named references to database variables, the edit check language allows access to data via positional notation. We may want to reference the variable we're presently on, without knowing its name. Similarly, we may want to reference the previous variable or the next variable in a record. This is accomplished with the @[expression], @[id,visit,plate,field], @T, @(T-expression), and @(T+expression) constructs.
Except for @[id,visit,plate,field] these positional variables may only reference data on the current record.
The @[expression] form refers to the field number represented by expression. For example, if expression evaluates to 7, then it would refer to the ID field (always the 7th field on a plate). To refer to the current field, the expression @[.] can be used. @[.+1] would indicate the next field on the plate.
A second, older construct also exists which does the same job. This is the @T construct (current field), @(T+1) which refers to the next field, and @(T-1) which refers to the previous field.
IMPORTANT: Note that @(T-1) means the previous field, while (@T-1) means one less than the current field. Watch those brackets!
The @[id,visit,plate,field] form can be used to refer to any field on any record in the study database, by its numeric keys. Keys that are always the same as those of the plate on which the edit check is triggered can be omitted. For example, @[,,55,10] refers to field 10 on plate 55 of the same visit for the same subject, and @[,,,10] is equivalent to @[10], both referring to field 10 on the current plate. While this form can be used to read and test the value of any field in the study database, it cannot be used to change the value of fields on other records in the database. The edit check language does not allow changes to be made to records other than the one on which the edit check is triggered.
The use of positional variables is both convenient and necessary when writing reusable generic edit checks. However they depend on field positions remaining constant. Problems can arise if a plate is modified by adding, deleting or renumbering fields. Thus edit checks must be reviewed as part of any such modifications.
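For example, a generic, reusable check attached on field exit might compare the current field to the preceding field by position (this sketch assumes both fields are dates):

```
# Generic check: the current date field must not precede the previous date field.
edit not_before_previous()
{
    if (!dfmissing(@T) && !dfmissing(@(T-1)) && @T < @(T-1))
        dferror("This date is earlier than the preceding date field.");
}
```

Because the check refers only to positions, it can be attached to any date field that immediately follows another date field.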
Local variables are temporary storage locations that can be used to hold intermediate results of calculations. Local variables are only valid inside the edit check or function in which they are defined. They do not retain their values across invocations of that edit check or function.
Local variables are declared at the beginning of the edit check or function where they will be used. The declaration includes the data type of the variable and its name. It must precede any other statements inside that function or edit check. Local variables are automatically initialized to a blank value.
Local variable declaration
edit testdecl()
{
    number average;
    average = (bp1 + bp2) / 2;
}
In this simple edit check, the average of two variables is calculated and stored in the local variable average. Since this variable is local to this edit check, its value is no longer available as soon as the edit check completes.
Global variables are storage locations just like local variables but differ in that they maintain their values across edit checks. Global variables are initialized when the edit checks file is first loaded and maintain their values until an edit check changes them. That new value is then used in any subsequent references until it is changed again.
IMPORTANT: Edit checks can be executed in an arbitrary order as the user traverses fields. It is therefore important to consider where the values in global variables really came from when executing in interactive tools such as DFexplore.
Global variables are declared in the same way that local variables are, except that they are declared outside of edit checks.
Global variable declaration
number pages_traversed = 0;
edit another_page()
{
    pages_traversed = pages_traversed + 1;
    dfmessage("Page ", pages_traversed, " PID=", PID,
              " Plate=", DFPLATE, " Visit=", DFSEQ);
}
If this edit check is attached as a plate exit check, it will count how many pages were traversed by a user in the course of a session and record the subject, plate and visit numbers in the edit checks log file.
There are often cases where it is useful to group variables that are related to each other. The group declaration can be used to build one-dimensional arrays of variables. The group declaration syntax is as follows:
Group declaration syntax
group groupname variable1, variable2, ... ;
Once a group has been defined it is referenced by groupname[index].
Compute the total dose from the individual dosages
edit compute_dose()
{
    number total = 0, i = 1, blank = 0;
    group dgrp dosage1, dosage2, dosage3, dosage4, dosage5;
    while (i <= 5) {
        if (dfblank(dgrp[i]))
            blank = blank + 1;
        else
            total = total + dgrp[i];
        i = i + 1;
    }
}
The individual dosages are in the database variables dosage1 through dosage5. The group variable dgrp allows the individual dosages to be referenced through an index.
Group variables can be used anywhere other variables can be used but must be declared after any local variables in that edit check or function have been declared and before the first statement of the edit check's body.
NOTE: Missing group variables If a variable is missing from a group, the missing variable is treated as a missing value. If a variable defined in a group is on a plate that is missing from the database, the variable will also evaluate to a missing value.
NOTE: Changing group variable values Once a group has been declared, a variable can be referenced either by its originally assigned name or by its group index (e.g. dgrp[2]). It is a good idea to remain consistent wherever possible, to avoid unnecessary confusion if the edit check code needs to be modified at a later date.
NOTE: Groups with multiple variable types It is possible to put variables of different types into a single group, for example a group containing both string and number variables. However, caution is advised: such groups introduce the possibility of accessing a variable of the wrong type, particularly when accessing group variables in a loop. It is therefore a good idea to keep groups limited to a single variable type whenever the loop requires a specific variable type.
Each date variable in a DFdiscover study setup has a date format associated with it. This date format is used to determine which part of the date string contains the year, month and day portions of a date.
Internally, the edit check language uses Julian dates and converts them to printable representations as required. Dates that are stored back into the database are stored using the date format associated with that variable. Dates displayed in messages are output using the default date format, YY/MM/DD, which can be overridden via the date format instruction at the beginning of a DFedits file.
Set the default date output format to MM/DD/YY
date format "MM/DD/YY"
This statement must be added to the DFedits file.
NOTE: Date Format All date constants used throughout the DFedits file are assumed to be in this format. The date format statement may only appear once in each edit check file, but it may appear anywhere within the file.
Incrementing a date value
date format "yy/mm/dd"
edit main()
{
    date d;
    d = "90/04/07";
    d = d + 14;
}
This example assigns the Julian date for April 7th, 1990 to the local variable d, and then adds two weeks (14 days) to this date. The two assignment lines cannot be combined into one line because the conversion of the string to a Julian date must be completed before the subsequent addition of 14 to this value. Combining them into one line would indicate that the string "90/04/07" and the number 14 were to be added, which is clearly meaningless.
The DFdiscover edit check language allows the use of missing values in data fields.
Data may be unavailable for one of four reasons:
the data field has been left blank,
the data field contains a missing value code
the CRF page has not yet arrived and thus the data record does not exist in the database, or
the CRF page exists as a missed record.
The edit check language provides five functions for dealing with missing values:
dfblank,
dfmissing,
dfmissval,
dfmisscode, and
dfmissingrecord.
In addition, two functions, dflostcode and dflosttext, return information about missed data:
Return values for dfblank, dfmissing, dfmissval, dfmisscode, and dfmissingrecord
missing value label
missing value code
The dfblank function is used to determine if a data record exists and the field is blank.
dfblank usage
TRUE if variable is blank. A blank string/number/date is one containing no characters or digits, not even a space. Also TRUE for check and choice fields having a code for "no choice".
FALSE in all other cases.
if (dfblank(DOB)) dfmessage("DOB is missing");
The record containing var must exist for dfblank to return TRUE.
Prior to 3.8, it was necessary to use, as a work-around, the construct
if ( @T == 99 )...
for check fields that coded "not checked" using a code other than 0 (e.g. 99 in this example). This work-around is no longer correct or supported. For check and choice fields, any test for the code used for no choice, or unchecked, will always fail (i.e. evaluate to FALSE). The correct way to test for this condition is by using:
if ( dfblank( @T ) )...
The dfmissing function is used to determine if a variable is missing, either because the data record containing it does not exist, the data record containing it is missed, the value is blank, or the field contains a missing value code.
dfmissing usage
if (dfmissing(DOB)) dfmessage("DOB is missing");

# return TRUE if keys match an entry identified as missed
if (dfmissing(DFSTUDY[,0,1]))
    dfmessage("Plate 1, visit 0 is a missed record");
The dfmissval function returns the missing value label equivalent to the variable's value.
dfmissval usage
The missing value label equivalent to the variable's value.
A blank string in all other cases.
label = dfmissval(DOB);
The dfmisscode function returns the missing value code equivalent to the variable's value.
dfmisscode usage
The missing value code equivalent to the variable's value.
A blank string in all other cases.
code = dfmisscode(DOB);
The dfmissingrecord function returns information about a missing record.
dfmissingrecord usage
id is the subject ID of the missing record
visit is the visit number of the missing record
plate is the plate number of the missing record
Information about a missing record for the specified keys. Return values are as follows:
0 if the data record is in the database, that is, not missing.
1 if a missed record matching the specified keys is in the database.
2 if there is no record in the database matching the specified keys.
3 if the record in the database matching the specified keys has a pending status.
if( dfmissingrecord(99001,0,1) == 1 ) dfmessage("Plate 1 at visit 0 is missed for this subject." );
dflostcode returns information about a missed record entered in DFexplore, specifically the actual reason code assigned to the missed record.
dflostcode usage
id is the subject ID belonging to the missed record
visit is the visit number belonging to the missed record
plate is the plate number belonging to the missed record
code = dflostcode(99001,0,1);
dflosttext returns the actual text reason assigned to a missed record in DFexplore.
dflosttext usage
dfmessage( "The missed reason is ", dflosttext(99001,0,1) );
The edit check language supports two functions that can be used to identify data records that do not exist in the database - dfmissing and dfmissingrecord.
Using dfmissing requires testing for the existence of any one of the required fields on the data record of interest. For example, we could determine whether plate 1 at visit 0 is missing for the current subject by testing to see if the DFdiscover study number is blank for this record using dfmissing.
Testing for a missing record with dfmissing
if( dfmissing( DFSTUDY[,0,1] ) ) dferror("Plate 1 at Visit 0 is missing");
DFSTUDY is the name for the DFdiscover study number field that is present on all plates in the database. This field can never be left blank or contain a missing value code because it is maintained by the system and users cannot edit it. Thus the above test will only return TRUE when the data record is missing or if the specified keys match an entry marked as missed.
dfmissingrecord removes the need for the 'trick' of testing for a missing record by evaluating the value of a known, fixed variable. dfmissingrecord takes as arguments the subject ID, visit and plate keys of the data record of interest and returns the following:
0 if the data record is in the database
1 if a missed record matching the keys is in the database
2 if there is no record in the database matching the keys
Testing for a missing record with dfmissingrecord
if( dfmissingrecord(,0,1) == 2 ) dfmessage( "Plate 1 visit 0 is not in the database" );
Calculate the subject's age provided both birth date (bdate) and the study entry date (edate) are available.
if( !dfmissing( bdate ) && !dfmissing( edate ) ) age = ( edate - bdate ) / 365.25;
Complain if the subject was hospitalized (hospital has the value 2) but the hospitalization date has been left blank
if( hospital==2 && dfblank(hospdate) ) dferror("Hospitalization date is required.");
Check for a specific missing value label
if( dfmissval(hosp)=="not hospitalized" && !dfmissing(hospdate) )
    dferror("Reason for hospitalization has missing value code = \"not hospitalized\", but a hospitalization date has been entered");
This example shows an edit check that might be placed on a field used to record hospitalization date. It checks the reason for hospitalization field (hosp) for the existence of the missing value reason "not hospitalized" and if it finds this reason pops up an error message.
Check for a specific missing value code
If "N" was the missing value code for the above missing value reason, the same edit check could be written using dfmisscode.
if( dfmisscode(hosp)=="N" && !dfmissing(hospdate) )
    dferror("Reason for hospitalization has missing value " +
            "code = \"not hospitalized\", but a hospitalization " +
            "date has been entered");
This section summarizes the operators available and their results. In these tables a blank box indicates that the operation is not permitted and an error message will be produced.
The order of operations (highest to lowest precedence) is as follows:
You can override the default order of operations by using the grouping operators, ( and ). Expressions within parentheses are always evaluated first, and nested parenthetical expressions are always evaluated from the lowest nesting point to the highest. For example,
2 + 3 * 5
evaluates to 17, because normal precedence evaluates the * before the +, whereas
( 2 + 3 ) * 5
evaluates to 25, and
( ( 2 + 3 ) * 5 - 2 ) * 2
evaluates to 46. In this latter case, ( 2 + 3 ) is evaluated first (5), followed by ( 2 + 3 ) * 5 (25), and then ( ( 2 + 3 ) * 5 - 2 ) (23), and finally ( ( 2 + 3 ) * 5 - 2 ) * 2.
The addition operator, +, adds two operands and returns the result as summarized by this table. A missing variable may be a blank field, a missing value code or a missing data record. Returned missing values evaluate TRUE with dfblank and dfmissing.
Result table for Addition operator
NOTE:
- Missing/String: unlike other data types, string concatenation makes a distinction between blank fields, which are concatenated as an empty string, and fields that contain missing value codes or whose data record is missing, which abort the concatenation and return a null string.
- Date/Number addition returns the date number days in the future for a positive number, or days in the past for a negative number.
- Time/Number addition returns the time number seconds in the future for a positive number, or seconds in the past for a negative number.
- String/String addition returns the concatenated string, unless one of the strings is missing, in which case the result for Missing/String applies.
The subtraction operator, -, subtracts two operands and returns the result as summarized by the table. A missing variable may be a blank field, a missing value code or a missing data record. Returned missing values evaluate TRUE with dfblank and dfmissing.
Result table for Subtraction operator
NOTE:
- Subtracting a number from a date returns the date number days in the past for a positive number of days, or in the future for a negative number of days.
- Subtracting a number from a time returns the time number seconds in the past for a positive number of seconds, or in the future for a negative number of seconds.
- Date/Date subtraction returns the number of days between the dates.
- Time/Time subtraction returns the number of seconds between the times.
The multiplication, *, division, /, and modulus, %, operators return the results as summarized by the table. A missing variable may be a blank field, a missing value code or a missing data record. Returned missing values evaluate TRUE with dfblank and dfmissing.
Result table for Multiplication/Division/Modulus operators
NOTE:
- For division in which both operands are integers (no decimal point or decimal places), the returned value will be truncated to an integer, e.g., 100/60 will return 1. To obtain a floating point result at least one of the operands must be a floating-point number, e.g., 100/60.0 returns 1.666667.
- Local and database fields may be cast to a floating-point number before the operation is performed by adding 0.0. For example, if A and B are 2 numeric fields which contain the integers 100 and 60 respectively, any of the expressions (A+0.0)/B, A/(B+0.0), (A+0.0)/(B+0.0) will return 1.666667.
- Division or modulus by zero results in an error message.
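The casting rule can be sketched as an edit check (the field name minutes is illustrative):

```
edit duration_hours()
{
    number hours;
    hours = minutes / 60;          # integer division: 100 minutes gives 1
    hours = (minutes + 0.0) / 60;  # float cast: 100 minutes gives 1.666667
}
```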
The exponentiation operator, ^, raises a number (the base) to the power of an exponent, and is written as:
base ^ exponent
where both the base and the exponent are numbers (integer, fixed point, or floating point). For example,
4 ^ 3 = 4 * 4 * 4 = 64
4 ^ 0.5 = sqrt( 4 ) = 2
The result of raising any number (including 0) to the exponent 0 (or 0.0) is 1. The result of raising 0 (or 0.0) to any positive exponent is 0. The result of raising 0 (or 0.0) to any negative exponent is illegal. The base can be negative only if the exponent is an integer value; that is, it is illegal to raise a negative number to a fractional exponent. In such a case, and in any case where the calculation is illegal, the returned result is a blank/empty string.
Results of exponentiation are summarized in the table. A missing variable may be a blank field, a missing value code or a missing data record. Returned blank/empty values evaluate TRUE with dfblank and dfmissing.
Result table for Exponentiation
NOTE: - The rows or the columns may be the base or the exponent - the table results are symmetric.
Calculation of Body Surface Area
The calculation of an individual's body surface area in square meters, given their weight in kilograms and height in centimeters, is:
((kg)^0.425 x (cm)^0.725 x 71.84) / 10,000
An average adult has a body surface area of 1.73 square meters.
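Written as an edit check using the ^ operator (the field names wt for weight in kilograms and ht for height in centimeters are hypothetical):

```
edit bsa_calc()
{
    number bsa;
    if (!dfmissing(wt) && !dfmissing(ht))
        bsa = (wt ^ 0.425) * (ht ^ 0.725) * 71.84 / 10000.0;
}
```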
NOTE: With a sufficiently large base and exponent, the calculated values may be extremely large. Performing large number computations on systems that are capable of accurately handling only numbers of 32 or 64 bits, means that the math libraries internal to the computer's operating system must use algorithms and compromises to handle such large numbers. As a result, with sufficiently large base and exponents, the calculated values may not be reliable. It is unlikely however, that these types of numbers will be used in edit checks.The same also holds true for calculations performed using very small fractional numbers.
The assignment operator, =, is used to store a value in a database or local variable. The assignment operator will cast (i.e. change) values from one type to another, and follows these rules:
The assignment of a date to a string type causes the string to contain the date value formatted according to the edit check date format, which is YY/MM/DD unless otherwise defined by the date format statement in the DFedits file.
The assignment of a string to a local date variable causes the string to be converted to a date using the default date format. If the resulting date is stored in the database it is stored according to the date format associated with the database field.
The assignment of a string to a local time variable causes the string to be converted to a time. If the resulting time is stored in the database it is stored according to the time format associated with the database field.
The assignment of a time to a string type causes the string to contain the time value formatted hh:mm:ss.
When assigning numeric values to data fields the whole number part of the value is honored, but decimal places may be truncated to the number of decimal places provided by the format. For example, assigning the value 95.13 to a numeric field with a format of nn.n would store the value as 95.1, and if the format was nn the stored value would be 95.
If the assigned value is too large to fit in the specified data field the result in DFexplore is a blank field, and a dialog appears describing the problem. The dialog identifies the field and the value that could not be assigned. In DFbatch the field is left unchanged and an error message is written to the log. For example, this would occur on attempting to assign the value 1000 to a numeric field with a format of nn, nnn, nn.nn, etc., or to an unformatted field with a store length less than 4.
Missing values are stored in database fields according to their missing value codes. Legal missing value codes are defined in DFmissing_map. If DFmissing_map has not been defined the default DFdiscover missing value code is an *. It is legal to assign a missing value code to any field, regardless of type, e.g. bdate = "*".
"*"
A database field of any type can be blanked by assigning it a blank string, e.g. bdate="".
Attempting to assign a value to DFdiscover protected fields (e.g. fields 1 to the end of the barcode and the last 3 record status fields), results in the edit check being aborted with an error message.
Attempting to store a | character in a database field causes the character to be converted to a space.
Attempting to store an illegal value in a database check or choice field (i.e. a value which does not correspond to any of the defined boxes on the CRF) results in assignment of the no choice code to the database field.
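The casting and storage rules above can be sketched as follows. BDATE, INITIALS and WTKG are assumed field names, with WTKG assumed to have the numeric format nn.n.

```
# Sketch of the assignment casting rules; field names are assumptions.
string s;
s = BDATE;      # date -> string, formatted per the edit check date format
WTKG = 72.38;   # decimals truncated to the nn.n format: stored as 72.3
BDATE = "*";    # a missing value code may be assigned to a field of any type
INITIALS = "";  # assigning a blank string blanks a field of any type
```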
It is possible to store values into a data field on the current data record by assigning the value to the variable name of the desired field. For example, if the current record has three variables named dispensed, returned and compliance to track how compliant a subject is at taking their medication, the following example would calculate compliance and store it in the compliance variable of the current record:
compliance = 100 * ( dispensed - returned ) / dispensed;
The edit check language provides the standard set of comparison operators:
less than, <
less than or equal, <=
equal, ==
greater than or equal, >=
greater than, >, and
not equal, !=
Be careful to differentiate between the assignment, =, and the equality, ==, operators.
Probable erroneous use of the = operator
number i=5,j=6;
if (i = j) {
    dferror("i equals j");
}
else {
    dferror("i is not equal to j");
}
Erroneous use of = results in i being assigned the value of j and then being checked for a non-zero value, which is probably not what was intended. The correct syntax for checking the equality of i and j is to use the equality, ==, operator.
String comparisons are done on a character-by-character basis using the ASCII sort order. This means that all uppercase letters come before all lowercase letters.
Date comparisons are done by Julian date, regardless of the date format associated with the date.
Comparing two missing values returns TRUE for <=, ==, and >= and FALSE for all others. Comparing a missing value with any other value returns FALSE except for the != operator.
When performing a simple comparison, (e.g. if( @[15]==@[16] ) ), the edit check language does not distinguish between different missing value codes or between blank fields and fields containing missing value codes. Thus a simple comparison of 2 fields will return TRUE when one field is blank and the other contains a missing value code, or when the 2 fields contain different missing value codes. If such distinctions are important the dfblank and dfmisscode functions should be used.
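A sketch of making those distinctions for the two fields compared above. It assumes, as suggested by the text, that dfblank returns TRUE for a blank field and that dfmisscode reports a field's missing value code; consult the reference entries for the exact signatures.

```
# Sketch: distinguish blank fields from missing value codes
# when a simple comparison of @[15] and @[16] returns TRUE.
if( @[15] == @[16] && dfmissing(@[15]) ) {
    if( dfblank(@[15]) != dfblank(@[16]) )
        dfmessage( "One field is blank; the other has a missing value code" );
    else if( dfmisscode(@[15]) != dfmisscode(@[16]) )
        dfmessage( "The fields have different missing value codes" );
}
```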
The edit check language represents FALSE as zero and TRUE as any non-zero number. Do not count on TRUE being any specific non-zero value (such as one).
The DFdiscover edit check language also provides the logical operators && (AND), || (OR), and ! (NOT). Each of these operators will cast their operands to TRUE/FALSE values as required based on the following rules:
Casting rules for logical operators
The && (AND) operator returns TRUE only if both operands are TRUE. Otherwise it returns FALSE.
The || (OR) operator returns TRUE if at least one of the two operands is TRUE. If both operands are FALSE, it returns FALSE.
The ! (NOT) operator takes one operand and returns TRUE if the operand is FALSE and FALSE if the operand is TRUE.
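For example, with assumed field names AGE and SEX:

```
# Operands are cast to TRUE/FALSE per the rules above.
# AGE and SEX are assumed field names on the current plate.
if( AGE >= 18 && AGE <= 65 )
    dfmessage( "Age is within 18-65" );
if( SEX != 1 && SEX != 2 )
    dfmessage( "Sex is not coded male or female" );
if( !dfmissing(AGE) || !dfmissing(SEX) )
    dfmessage( "At least one of age and sex is present" );
```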
NOTE: Prior to DFdiscover 2022 version 5.5, the ! (NOT) operator for string values could return an incorrect result. In these earlier versions, use !dflength(string) to test the string value condition.
The if statement provides a conditional execution mechanism. The two basic if constructs are:
if (condition) statement_1
if (condition) statement_1 else statement_2
statement in the above examples can be a single statement or a group of statements enclosed in a pair of { and }. condition is an expression that evaluates to TRUE or FALSE. If the condition evaluates to TRUE then statement_1 is executed, otherwise statement_2 is executed.
if statements can be nested and the most deeply nested else clause always goes with the most deeply nested if clause. These two examples are identical:
if (age < 50)
    if (sex == 1)
        dferror("<50 year old male");
    else if (sex == 2)
        dferror("<50 year old female");
    else
        dferror("<50 year old, no sex info");
and
if (age < 50)
    if (sex == 1)
        dferror("<50 year old male");
else if (sex == 2)
    dferror("<50 year old female");
else
    dferror("<50 year old, no sex info");
The only differences are cosmetic.
The addition of braces makes the code even easier to read and understand without changing the meaning:
if (age < 50){
    if (sex == 1){
        dferror("<50 year old male");
    }
    else {
        if (sex == 2){
            dferror("<50 year old female");
        }
        else {
            dferror("<50 year old, no sex info");
        }
    }
}
Indentation, while useful for visually indicating logic flow to the programmer, does not influence program execution. When writing edit checks, use indentation to show the logic flow. It will make maintaining the checks much easier. The addition of braces around blocks of code also helps, and makes your intentions clear to the language compiler and any subsequent reader(s).
The edit checks language includes a robust set of built-in functions for use in any edit check.
Generally speaking, the built-in functions behave identically across DFexplore, DFcollect and DFweb. However there are some differences dictated by the purpose of, and features available in, each application. The following table details where those differences are.
Edit check Function Implementation
In DFweb, all subject data is cached using a single network request when a subject binder is loaded. This cache is used by edit checks wherever possible to avoid delays caused by multiple network requests to load individual data records one by one. The subject cache is updated each time the user saves any data changes and every 2 minutes in the background, to make sure the data used by edit checks remains up to date.
In DFcollect, all subject data is cached in a single network request when a subject binder is loaded. This cache is used only if the device goes offline during the session, and is cleared on logout, unless the user manually downloads the subject data for offline use. When online, DFcollect makes network requests to load individual data records one by one. When offline, DFcollect uses the most recent data cached or downloaded while online, if available.
Function dfaccess can be used to change the user's access to data fields on the current data entry screen in Data View. This is its only purpose. Changes made by dfaccess do not persist after leaving the current data screen and cannot be used to prevent users from seeing or printing data values in List View. For that purpose, define Hidden Fields in DFsetup and specify whether users are allowed to see them when defining study roles in DFadmin.
When dfaccess sets a field to 'VIEWONLY', 'MASKED' or 'IMMUTABLE', users cannot change the field manually, but an edit check can still change it; thus some automated change may still occur, or users might be prompted using dfask or dfcapture to consent to, or provide input for, a change that is controlled by an edit check.
When a field is set to 'VIEWONLY', users are not prevented from adding or modifying metadata on that field. This remains subject to the metadata permissions defined in the user's study role.
When a field is set to 'MASKED', users cannot view or change the metadata. But the user can tell whether the field has metadata attached by observing the color of the masked field.
dfaccess usage
variable is a field on the current page.
mode is one of:
DFACCESS_IMMUTABLE - allow user to see but not change the data field or metadata.
DFACCESS_VIEWONLY - allow user to see but not change the data field. Allows user to make changes to the metadata.
DFACCESS_MASKED - hide the field value beneath a mask and do not allow the user to change it.
DFACCESS_HIDDEN - hide the data entry widget so that it does not appear on screen.
DFACCESS_NORMAL - return to normal field access.
dfaccess(@T,DFACCESS_VIEWONLY); # Set current field to viewonly
# Use: make all data fields view only for users with specified roles
# Run: on plate entry
# e.g. VIEWONLY("monitors,statisticians") - applies to only 2 roles
# e.g. VIEWONLY("ALL") - prevent all users from changing data fields.
# note - assumes role names are single words
edit VIEWONLY(string roles)
{
    number fn=6;
    number max1=dfvarinfo(DFSCREEN,DFVAR_FLDNUM);
    number max=max1-1;
    string role=dfrole();
    if( roles=="ALL" || dfmatch(role,roles,"W") ) {
        while( fn <= max ) {
            dfaccess(@[fn],DFACCESS_VIEWONLY);
            fn=fn+1;
        }
    }
}
dfaccess can be used to prevent users from changing fields manually but it does not prevent fields from being changed by edit checks.
When a field is masked or hidden any queries or reasons are also hidden from view.
dfaccess cannot change the access mode of the two special fields DFCREATE and DFMODIFY; for these fields dfaccessinfo will always return Normal.
dfaccess is implemented only in Data and Image views in DFexplore and in DFweb and DFcollect. It has no effect in DFexplore List View or in DFbatch.
dfaccessinfo takes one argument, the name of a field on the current page, and returns the current access mode for that field. The returned mode also depends on the 'Show Hidden Fields' role permission.
In DFbatch only, dfaccessinfo will always return 'Normal'.
dfaccessinfo usage
Returns one of the following modes:
normal - User is able to see and change the field values/metadata.
hidden - The data entry widget is hidden. No change to the field value or metadata is allowed.
masked - The field value is hidden beneath a mask. No change to the field value or metadata is allowed.
viewonly - User is able to see but not change the data field. Changes to the metadata are allowed.
immutable - User is able to see the field values. No change to data field or metadata is allowed.
dfmessage(dfaccessinfo(@T)); # Print the access mode of the current field
This is a brief summary of dfaccessinfo:
In DFexplore:
If the property 'Hidden' is set to 'No' in DFsetup and the permission 'View Hidden Fields' is unchecked in DFadmin, dfaccessinfo will return 'normal'.
If the property 'Hidden' is set to 'Masked' in DFsetup and the permission 'View Hidden Fields' is unchecked in DFadmin, dfaccessinfo will return 'masked'.
If the access mode of a field is set to 'Hidden' using dfaccess and the permission 'View Hidden Fields' is unchecked in DFadmin, dfaccessinfo will return 'hidden'.
If the permission 'View Hidden Fields' is checked in DFadmin, the access mode can only be changed using dfaccess, and dfaccessinfo will return the current access mode as set by dfaccess. If the Hidden property is changed using DFsetup instead of dfaccess, dfaccessinfo will always return 'normal'.
In DFbatch, dfaccessinfo will always return 'normal'.
As implemented, dfaccessinfo is a synonym for dfvarinfo(var,DFVAR_ACCESS).
This function returns the numeric ID of the argument string subject alias.
dfalias2id usage
string alias; number id; id = dfalias2id( alias ); dfmessage( "This is subject ID ", id );
The dfask function poses a question with two possible answers and waits for the user to choose one of them. dfask returns 1 or 2, depending on which response the user selects.
dfask usage
query is the question to be asked (string)
default-option is the default response (1 or 2)
option1 is the label to appear on button 1 (string, typically Yes)
option2 is the label to appear on button 2 (string, typically No)
1 if button 1 is selected
2 if button 2 is selected
if (dfask("Add Query?", 1, "Yes", "No") == 1)
If dfask is executed in batch, there is no opportunity for the question dialog to appear and so default-option is always returned.
Optional width and height must be between the default and the screen size. If the values are valid, they may be adjusted to ensure the window size is not smaller than the default and not larger than the screen size. If either are missing or invalid, the default is used. These optional values are ignored by DFweb and DFcollect.
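A sketch of gating an edit check driven change behind dfask. WEIGHT is an assumed field name on the current plate.

```
# Sketch only: WEIGHT is an assumed field name on the current plate.
# In batch mode the dialog cannot appear, so the default response
# (2 = "No") is returned and the field is left unchanged.
if( dfask( "Clear the weight field?", 2, "Yes", "No" ) == 1 )
    WEIGHT = "";   # a field of any type can be blanked this way
```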
The dfbatch function returns TRUE/FALSE depending on whether the edit check is executing in batch mode or not.
dfbatch usage
if (dfbatch()) dfmessage("See this when in batch"); else dfmessage("See this when not in batch");
The dfcapture function is used to display a dialog for user input.
dfcapture usage
Your instructions will appear at the top of the dfcapture dialog. They can be plain text, with '\n' used for line breaks, or HTML.
The field list is a '|' delimited string comprised of field name and field length pairs.
The default values are a '|' delimited string of initial field values or an empty string if there are no initial values.
The return value depends on whether the user presses OK or Cancel in the dfcapture dialog.
If the user presses Cancel dfcapture returns an empty string.
If the user presses OK dfcapture returns a '|' delimited string containing the values entered in the fields presented by the dialog.
Here's a very simple example.
string m="Please enter your name and address in the fields below."; string f="Name|50|Street|50|City|30|State|2|Zip|10"; string v=dfcapture(m,f,"");
Here's a more complicated example.
string mycapture(string a)
{
    string msg =
        "<HTML><HEAD><STYLE>\n" +
        ".g {background-color: #999; color: #fff; padding: 2px; text-align: right;}\n" +
        ".h {background-color: #fff; padding: 2px; text-align: right;}\n" +
        "</STYLE></HEAD>\n" +
        "<BODY><DIV>\n" +
        "<P>CHECK FOR DUPLICATES</P>\n" +
        "<P>Enter the following values:</P>\n" +
        "<TABLE BORDER=0 MARGIN=0>" +
        "<TR><TD CLASS=g>First Name:</TD><TD CLASS=h>Participant's first name</TD></TR>" +
        "<TR><TD CLASS=g>Last Name:</TD><TD CLASS=h>omit titles: Sr., Jr, II, etc.</TD></TR>" +
        "<TR><TD CLASS=g>Birth Month:</TD><TD CLASS=h>1-12</TD></TR>" +
        "<TR><TD CLASS=g>Birth Year:</TD><TD CLASS=h>yyyy, e.g. 1950</TD></TR>" +
        "</TABLE>\n" +
        "<P>DFdiscover will check for a match in the database.\n" +
        "If one is found you will be asked for confirmation.\n\n" +
        "If name or birth date is missing select cancel to enter\n" +
        "this subject without checking for duplicates.</P>\n" +
        "</DIV></BODY></HTML>\n";
    string x="First Name|30|Last Name|30|Birth Month|2|Birth Year|4";
    string s;
    s=dfcapture(msg,x,a);
    return(s);
}
dfcapture has the following restrictions:
The maximum number of fields that can be specified is 10.
No field can contain more than 2000 characters.
The maximum length of the return string (including all field values and delimiters) is 16384 (4096 if the string contains UNICODE characters).
If a default value is specified for a field, but exceeds the field length, it will be used but truncated.
The field name is not allowed to be empty.
If the dfcapture dialog is moved by the user to a new screen location it will continue to open in that location each time dfcapture is called.
This function returns the study site number for a specified subject ID.
dfcenter usage
number cid;
cid = dfcenter(SUBJECT); # called using subject ID variable name 'SUBJECT'
cid = dfcenter(@T);      # works if edit check is on the subject ID field
cid = dfcenter(@[7]);    # always works since the subject ID is always field 7
If the input subject ID is not associated with any of the defined sites, returns the ERROR MONITOR site ID.
For historical reasons and backwards compatibility, the function continues to be named dfcenter(), not dfsite().
This function can be used to close the current study and return the user to the DFexplore login dialog and list of studies.
dfclosestudy usage
{ if(dfpasswdx(msg,3) == 0) dfclosestudy("Incorrect password. The study will be closed!"); }
When dfclosestudy closes a study, all remaining edit checks and pending events will be aborted/canceled.
dfclosestudy will be ignored and do nothing when executed within DFopen_patient_binder and DFopen_study edit checks, and when run in batch mode.
This function is used to convert a date to a string using a specified date format.
dfdate2str usage
date is a date variable
format is the date format to convert to
# Convert Visit date variable VDATE to a string in yy/mm/dd format string d; d = dfdate2str(VDATE, "yy/mm/dd");
This function returns the day component, as a number, from a date.
dfday usage
number day; day = dfday( @T ); if ( day == 1 ) dfmessage( "This is the first day of the month. Reset pill counter." );
The dfdirection function is used when an edit check program needs to know the user's field traversal direction while working in DFexplore.
dfdirection usage
number whichway; whichway = dfdirection();
In plate enter and plate exit edit checks dfdirection always returns a number greater than 0. This ensures that these edit checks still fire when dfdirection is being used to abort edit checks during backward traversal and/or mouse jumps.
In a field enter edit check dfdirection returns a value less than, equal to, or greater than 0, depending on the method that was used to enter the field. In a field exit edit check it returns a value less than, equal to, or greater than 0, depending on the method that was used to exit the field. Since the meaning of dfdirection differs between field enter and field exit edit checks, the return values may also differ.
In DFbatch the traversal method is always forward and so dfdirection returns a number greater than 0.
The dfentrypoint function is used to determine from which entry point a given edit check has been called.
dfentrypoint usage
OPEN_STUDY
OPEN_BINDER
PLATE_ENTER
PLATE_EXIT
FIELD_ENTER
FIELD_EXIT
string s; s = dfentrypoint();
In DFbatch:
dfentrypoint will be ignored if it is called in DFopen_patient_binder.
dfentrypoint returns OPEN_STUDY if it is called in DFopen_study.
The dfexecute function is used to execute shell scripts stored in $STUDY_DIR/ecbin from within an edit check, and returns the script's output. It takes two arguments: the name of the script and the parameter list to pass to it. This function is available in DFexplore and DFbatch.
dfexecute usage
Program is a string containing the name of the executable (program or script) to run. Executables must be registered for each study by storing them in $STUDYDIR/ecbin. For security reasons, only registered executables can be run from dfexecute. If the program or script cannot be found in this directory, or is not executable, DFexplore displays an error dialog with the message Error: invalid progam name.
Parameter List is one or more comma delimited strings containing any input parameters required by the script. Certain characters which have special meaning to the UNIX shell are not allowed in the submitted command line, including dollar, back quote, semicolon, pipe, and file redirection. If any of these characters are encountered, the script is not run and a zero-length string is returned. Additionally, DFexplore displays the following error message: Error: Invalid parameters. If the ampersand (&) character is passed anywhere in the parameter list, the parameter list is truncated where the ampersand was encountered.
# both of the following examples pass 3 parameters to myscript.sh
string s;
s = dfexecute( "myscript.sh", "10156 2 1" );
s = dfexecute( "myscript.sh", "10156", "2", "1" );
The working directory for a script defaults to the study work directory.
The study must be in read-write mode for dfexecute to function. If the study is in read-only mode, dfexecute will fail and the system will log error 503 (study is in read-only mode).
Script names cannot reference files or directories outside of $STUDYDIR/ecbin. For example, if the first parameter is set to the string "../myscript.sh", the script will not run because the first parameter references a script that is not registered.
Scripts and programs executed by dfexecute run with the user's DFdiscover permissions. Thus for example, a script that uses DFexport.rpc may produce different results for users with different database get permissions.
When run in a batch environment, dfexecute evaluates the APPLY which setting. If set to none, the function call is treated as a no-op and the message Did not perform dfexecute(Program=xxx, Arguments=xxx) is written to the batch log. This can be useful during testing to ensure that no external scripts are executed.
The dfgetfield function returns one field of a delimited input string. Date variables in the input string are converted to printable strings using the default date specification. Empty strings are converted to "", and missing values are converted to *, except for dates, which are converted to ??/??/??.
dfgetfield usage
expn is the input expression which will be converted to the source string
fnum is the desired field number (1st field is numbered 1)
delim is the character that is used to delimit fields in expn
string S = "AB|CD|EF"; string F; F = dfgetfield(S, 2, "|");
assigns CD to F.
An fnum value of less than 1 is converted to 1. An empty delimiter is converted to |. If fnum is greater than the number of fields in expn, an empty string is returned.
Date fields are first converted to julian dates and then converted to strings using the default date format.
These two functions are equivalent in purpose. The existence of two functions that do the same thing is for historical reasons and backwards compatibility.
The dflevel function returns the validation level that the user is currently validating records to.
dfgetlevel/dflevel usage
if (dflevel() == 5) dfmessage("Validating at level 5");
This function is used to retrieve a comma-delimited list of visit/sequence numbers that exist for a given subject ID and plate combination.
dfgetseq usage
id, a subject ID (may be omitted if referring to the current subject).
plate, a plate number
s= dfgetseq(99001, 1); # returns a list of all visit #numbers for subject 99001 on plate 1
The returned list of visit/sequence numbers can be split using the dfgetfield() function.
All visits present in the database are returned regardless of record status. Record status can be tested using the 1st data field in each data record, named DFSTATUS, with codes: 0=lost, 1=final, 2=incomplete, and 3=pending/error. Status 3 records at level 0 are true pending records, i.e. delayed new data entry, while status 3 records at higher levels are more commonly referred to as 'error' records. Level, the 2nd field in each data record, is named DFVALID and has values 0-7.
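A sketch of walking the returned list with dfgetfield, using subject 99001 and plate 1 as in the example above:

```
# Split the comma-delimited visit list returned by dfgetseq.
string seqs; string v; number i;
seqs = dfgetseq( 99001, 1 );
i = 1;
v = dfgetfield( seqs, i, "," );
while( dflength(v) > 0 ) {
    dfmessage( "Subject 99001 has visit ", v, " on plate 1" );
    i = i + 1;
    v = dfgetfield( seqs, i, "," );
}
```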
This function allows edit checks to change the help message displayed at the bottom of the DFexplore screen.
dfhelp usage
dfhelp( "Please enter the subject age" );
The dfhelp function is only executed in field enter edit checks in DFexplore Data and Image views; it is ignored in DFbatch.
The message displayed by dfhelp appears in the standard one line message window at the bottom of the screen on field entry, replacing any help message specified at the style or field level in DFsetup. The message is displayed while the user remains on the field and is cleared when the user leaves the field.
If the message is too long to be displayed in the message window it will be truncated. Users can view the entire message by hovering over it, or by clicking the dfhelp icon which appears at the beginning of the message.
Any HTML tags are removed in the message window, but clicking the help icon launches a dialog, with Print and OK buttons, which displays the message with HTML formatting.
Date expressions are displayed using the default date format. Use dfvarinfo with the STRING_VALUE option to display date fields as they appear on screen.
Blank values are printed as empty strings, other missing values are printed as *, except for dates which are displayed as ??/??/??.
This function returns the string alias of the argument numeric subject ID.
dfid2alias usage
string a; a = dfid2alias( @[7] ); dfmessage( "The subject alias is ", a );
The dfillegal function is used to set the validity of the argument field to illegal. In DFexplore this has the side effect of changing the field color of the field to the illegal value color.
dfillegal usage
var, which must be a database variable on the current plate. If var is a local variable, the function does nothing.
optional_mode, an optional parameter which when 0 sets var status to illegal, and when non-zero undoes any illegal status previously set by dfillegal and re-evaluates var.
dfillegal(DOB);
dfillegal(DOB,1);
var can be a direct reference via the variable name, a positional reference via @T or the @[var] construct, or a member of a group variable.
This function can be used to set fields to the illegal color (even if the field contains a legal value); and unless the field is returned to legal or optional status, this will have the effect of forcing the user to sign the record off with status Incomplete, rather than Final. However, there are limitations.
If executed in batch mode, dfillegal has no effect and always returns 0.
dfillegal has no effect on fields that contain a missing value code, or that have a query - whether resolved or unresolved.
The effect is only transitory. It can be undone by the regular field exit processing that occurs as users traverse data fields using the keyboard or the mouse. Thus a field made illegal by dfillegal in a field entry edit check, will be reset to legal (if it is) on field exit, unless it is again set to illegal using dfillegal in a field exit edit check.
The effect only lasts as long as the record is visible on the screen. For example, if a field (with a legal value) is set to illegal by dfillegal in a plate exit edit check, on record save the field will turn red and the user will be prevented from assigning the record status Final. But if the user returns to the record, the field will no longer have the illegal color, because the effect of dfillegal was not permanent. Further, if the field had been made illegal using dfillegal in a field exit edit check, instead of a plate exit edit check, it would be possible for the user to immediately sign the record off with status Final, because field exit edit checks only fire when the user traverses through the field.
dfillegal has no effect on the value returned by dflegal; dflegal always applies the legal range specifications specified in the study schema, regardless of what dfillegal may have done to the field.
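Because the effect is transitory, dfillegal is typically re-applied in a field exit edit check, as in this sketch. VDATE and RDATE are assumed date fields, and subtracting two dates is assumed to yield a difference in days.

```
# Field exit sketch: keep an out-of-window visit date flagged until corrected.
# VDATE and RDATE are assumed field names on the current plate.
if( VDATE - RDATE > 14 )
    dfillegal( VDATE );     # flag the field; record cannot be signed off Final
else
    dfillegal( VDATE, 1 );  # undo any earlier dfillegal and re-evaluate VDATE
```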
This function returns context information for the primary or secondary image of the argument keys.
dfimageinfo usage
ID, subject identifier
visit, visit number
plate, plate number
image, image ID
attr is one of DFIMAGE_ARRIVAL, DFIMAGE_FIRSTARRIVAL, DFIMAGE_LASTARRIVAL, DFIMAGE_FORMAT, DFIMAGE_SENDER, DFIMAGE_PAGES.
The return value is dependent upon the attr. The return value of dfimageinfo is a string in all cases. The return value is determined by locating, for the specified keys, the matching system fax_log record. From the fax_log record (described in System Administrator Guide, DFdiscover System Files - fax_log), the value of the requested attribute is returned for DFIMAGE_ARRIVAL, DFIMAGE_FORMAT, DFIMAGE_SENDER and DFIMAGE_PAGES.
For the attributes DFIMAGE_FIRSTARRIVAL and DFIMAGE_LASTARRIVAL, the value of image is ignored. All of the images for the matching keys are chronologically sorted and either the earliest or the most recent image is selected. This is useful for answering questions like "what is the earliest image information related to these keys?" and "what is the most recent image information related to these keys?".
string sender = dfimageinfo( , , , "", DFIMAGE_SENDER ); dfmessage( "The sender is ", sender );
Any of the ID, visit and plate arguments can be omitted. Omitted arguments are taken from the keys of the current record. The image key, image, is required but may be "", in which case the image ID of the primary record is used.
The argument attr is required. If the attr parameter is not valid, a compile-time error message is issued.
The dflegal function does legal range checking on a variable and returns TRUE or FALSE depending on the outcome. Legal range checks are based on the range specifications defined in the setup tool for each variable.
dflegal usage
FALSE if any of the following conditions apply:
the field is required and the value lies outside the legal range or the field is blank
the field is optional and the value lies outside the legal range.
the field is qualified with keys which reference a data record which does not exist in the study database.
TRUE if the data record exists and any of the following conditions apply:
the field is optional and it is blank or contains a value within the legal range
the field is required and it contains a value within the legal range
the field contains a missing value code regardless of whether or not the field is required or optional.
if ( !dflegal(DOB) ) dfmessage( "The value of DOB is not legal." );
The dflength function returns the length of the string conversion of an expression.
String conversion of dates uses the default date specification which is YY/MM/DD unless a different format is specified in the DFedits file. A date field which is blank, contains a missing value code, is only partially complete, e.g. 90/12, or contains an invalid date, e.g. 99/15/22, is converted to ??/??/?? regardless of what the date format is. Thus dflength will always return 8 on blank, missing, incomplete and invalid dates.
For string conversion on all other field types (i.e. other than date) all missing codes are converted to *, and thus dflength will return 1 for any missing value code.
For numeric fields leading zeros are ignored during string conversion. Thus values 0025, 025 and 25 will all have a length of 2.
For all field types string conversion of a blank field results in a blank string with length zero. This is true for choice and check fields, which have a numeric value when the field is blank, as well as for the other data types which store a true null string in the database when the field is blank.
dflength usage
if ( dflength( "ABC" ) == 3 ) dfmessage( "The length of ABC is 3." );
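The conversion rules above suggest sketches like the following, where WTKG is an assumed numeric field:

```
# Sketch: test a numeric field using the string conversion rules above.
if( dflength(WTKG) == 0 )
    dfmessage( "WTKG is blank" );                 # blank -> empty string, length 0
else if( dfmissing(WTKG) && dflength(WTKG) == 1 )
    dfmessage( "WTKG has a missing value code" ); # missing codes convert to *, length 1
```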
This function can be used to log a user out of their current DFexplore session and close the DFexplore login dialog.
dflogout usage
{ if(dfpasswdx(msg,3) == 0) dflogout("Incorrect password. You will be logged out!"); }
When dflogout logs the user out of DFexplore, all remaining edit checks and pending events will be aborted/canceled. The current DFexplore session will close and the standard DFexplore auto logout dialog will be displayed informing the user that they have automatically been logged out of their current session. If the user chooses to log back in and re-open the study, they will be asked if they want to resume their previous session.
dflogout will be ignored and do nothing when executed within DFopen_patient_binder and DFopen_study edit checks, and when run in batch mode.
The dfmail function can be used to send an email from within an edit check. Email functionality may also be implemented as a shell script using the dfexecute edit check function.
dfmail usage
to-address - one or more email addresses to which the message will be sent. If there are multiple email addresses, separate them with spaces.
reply-to-address - the email address of the person who will receive any replies to the emailed message. The sender ID in the email always appears as datafax@servername. Use the reply-to-address to ensure that email replies are delivered to a person.
subject - a short subject line for the email message
message - a string containing the body of the email message; the string may be plain text, or HTML formatted content
Each of these input parameters is required and none are permitted to be the empty string ("").
TRUE (1), successful; dfmail did not detect errors
FALSE (0), failed; dfmail detected errors
edit SubjectRandomized()
{
    string id = dfvarinfo(ID[,1,2],DFVAR_STRING_VALUE,);
    if( ELIG[,1,2]==2 )
        dfmail( "jill@dfsite1.com john@dfsite1.com",
                "jack@centralsite.com",
                "Subject " + id + " Randomization",
                "This subject has now been randomized." );
}
As a pre-requisite, dfmail relies on proper configuration of the server's email infrastructure. Consult with your system administrator to confirm that your DFdiscover server is configured to send email. Email systems vary according to the operating system.
Email systems are complex and beyond the scope of this document. It is possible for dfmail to report success (the function executed as expected) yet the email is not received. Confirm the server's email configuration, recipient email addresses, and possibly the recipient's junk folder.
When run in a batch environment, dfmail evaluates the APPLY setting. If set to none, the function call is treated as a no-op and the message Did not perform dfmail(To= xxx, Reply=xxx, Subject=xxx, Message=xxx) is written to the batch log. This can be useful during testing to ensure that no emails are sent.
dfmail sends email from the DFdiscover server. As a result, the email's sender ID will appear as datafax@servername.
If the body of the message begins with <HTML or <html, the message is sent with text/html MIME content type; otherwise it is sent with text/plain content type.
Among other things, content type determines how white-space and newline characters are interpreted and presented. Many resources regarding content type are available on the internet.
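For example, an HTML-formatted message can be sent by beginning the body with <html> - a sketch, in which the addresses are placeholders, not real accounts:

```
# Sketch: send an HTML-formatted email (addresses are hypothetical)
dfmail( "sitecoordinator@example.com",
        "studyadmin@example.com",
        "Enrollment milestone reached",
        "<html><body><b>100 subjects</b> have now been enrolled.</body></html>" );
```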
The dfmatch function performs more sophisticated string matching than the basic comparison operators. dfmatch has the following capabilities:
match while ignoring spaces
match while ignoring punctuation
keyword/keystring matching
fuzzy string matching
dfmatch usage
key is the search string
str is the string to be searched
mode is the string representation of the search mode(s)
if ( dfmatch( "ASA", DRUG, "C" ) ) dfmessage( "The value of DRUG is ASA or asa." );
Valid modes are:
C Ignore case
P Convert each punctuation character to a space
S Ignore white-space characters in both the key and string to be searched. The white-space characters are space, newline, carriage return, form feed, and horizontal tab.
B Match must occur at beginning of string
F Fuzzy match allows for one of the following errors:
one missing character
one extra character
two transposed characters
one mismatching character
W Word match. Each word in key is searched for individually and may appear anywhere in the string, except when mode B is also used. In that case, the first word must start at the beginning of the string and subsequent words in the key must follow in the searched string.
Modes may be concatenated. For example CS ignores case and spaces, PS ignores punctuation and spaces, and CPS ignores case, punctuation and spaces. The order in which modes are combined is irrelevant.
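As an illustrative sketch (SYMPTOM is a hypothetical text field, not one defined in this document), modes can be combined to perform a case-insensitive, fuzzy word match:

```
# "CFW": ignore case, allow one typo, and match the word anywhere in SYMPTOM
if ( dfmatch( "headache", SYMPTOM, "CFW" ) )
    dfmessage( "A headache (possibly misspelled) was reported." );
```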
The dfmessage/dfdisplay/dferror/dfwarning statements are used to output messages. In DFexplore, dfwarning and dferror pop up their messages in dialogs that the user must acknowledge before continuing. The dfdisplay function displays its message in a dialog without an icon but includes a Print button. The dfmessage statement writes its message to the edit checks log. In batch, each generates a message that appears in the batch log file.
dfmessage/dfdisplay/dferror/dfwarning usage
dfmessage(expn1, expn2, ...)
dfdisplay(expn1, expn2, ...)
dferror(expn1, expn2, ...)
dfwarning(expn1, expn2, ...)
dfmessage( "This is a string, ", "2+2=", 2+2 );
Date expressions are displayed using the default date format. Blank values are printed as empty strings, other missing values are printed as *, except for dates which are displayed as ??/??/??.
In DFexplore the output of dfmessage commands is written to a temporary log file that cannot be saved.
For DFweb and DFcollect, any messages containing &, < or > characters must use the entity references &amp;, &lt; or &gt;.
The dfmetastatus function is used to return counts of queries or reasons present in the study database.
dfmetastatus usage
ID is a list of subject IDs or * (all)
visit is a list of visits or * (all)
plate is a list of plates or * (all)
attr is one of QC or REASON
arg1 is a list of numeric QC or REASON status codes from the following tables, or * (all):
Table 5.42. Query Status codes
Table 5.43. Reason Status codes
arg2 is a list of numeric query category codes from the table, or * (all):
Table 5.44. Query Category Codes
arg3 is a numeric query usage code from the table, or * (all):
Table 5.45. Query Usage codes
number n1 = dfmetastatus("4001-4050","*","*","QC","0,1,2,6","*","1");
# returns the total number of external, unresolved and pending queries
number n2 = dfmetastatus("*","1","1-2","REASON","2,3","*","*");
# returns the total number of rejected and pending reasons on plates 1-2,
# at visit 1, for all subjects
Function dfmode can be used to determine the user's current working mode.
dfmode usage
In DFexplore dfmode returns the user's current working mode: 'view', 'edit', 'modify', 'validate', 'DDE', or 'locked' if the current record is locked by another user.
In DFbatch dfmode always returns 'batch'.
if (dfmode()=="DDE") return;
dfmode always returns 'validate' when working in DFexplore Image View.
In DFbatch, locked records are skipped; thus dfmode will never return 'locked' because the edit check does not fire.
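A common pattern, sketched below, is to guard interactive statements so they are skipped in batch runs:

```
# Skip the interactive warning when this edit check runs in DFbatch
if ( dfmode() == "batch" ) return;
dfwarning( "Please verify the visit date before saving." );
```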
The dfmoduleinfo function returns information about the module referenced by the argument, including up to 20 custom module properties.
dfmoduleinfo usage
var, an existing database variable name, or the number of an existing database variable when referenced by a record key.
attr is one of DFMODULE_NAME, DFMODULE_DESC, DFMODULE_USER1, DFMODULE_USER2, DFMODULE_USER3, DFMODULE_USER4, DFMODULE_USER5, DFMODULE_USER6, DFMODULE_USER7, DFMODULE_USER8, DFMODULE_USER9, DFMODULE_USER10, DFMODULE_USER11, DFMODULE_USER12, DFMODULE_USER13, DFMODULE_USER14, DFMODULE_USER15, DFMODULE_USER16, DFMODULE_USER17, DFMODULE_USER18, DFMODULE_USER19, DFMODULE_USER20.
The return value is dependent upon the attr. The possible values for attr, and their meaning, are:
DFMODULE_NAME returns the module name
DFMODULE_DESC returns the module description
DFMODULE_USER# returns the module information of the corresponding custom module property
edit test_dfmoduleinfo()
{
    # List the module name, module description, and the first, second and third
    # user-defined module property of the selected field on the current plate.
    dfmessage("dfmoduleinfo(@T,DFMODULE_NAME)=" + dfmoduleinfo(@T,DFMODULE_NAME));
    dfmessage("dfmoduleinfo(@T,DFMODULE_DESC)=" + dfmoduleinfo(@T,DFMODULE_DESC));
    dfmessage("dfmoduleinfo(@T,DFMODULE_USER1)=" + dfmoduleinfo(@T,DFMODULE_USER1));
    dfmessage("dfmoduleinfo(@T,MODULETAG2)=" + dfmoduleinfo(@T,MODULETAG2));
    dfmessage("dfmoduleinfo(@T,DFMODULE_USER3)=" + dfmoduleinfo(@T,DFMODULE_USER3));
    # List the module name of the FirstName field on the current plate.
    dfmessage("dfmoduleinfo(FirstName,DFMODULE_NAME)=" + dfmoduleinfo(FirstName,DFMODULE_NAME));
    # List the module name of field 9 on plate 3, visit 1 of subject 1.
    dfmessage("dfmoduleinfo(subject 1, visit 1, plate 3, field 9, DFMODULE_NAME)=" + dfmoduleinfo(@[1,1,3,9],DFMODULE_NAME));
}
If the plate number referenced is not a valid plate number (less than 1, greater than 511, or a plate number not defined in the study setup), the edit check runtime will issue an error message to that effect.
If the attr parameter is not valid, a compile-time error message is issued.
The return value of dfmoduleinfo is a string in all cases.
Tags for custom properties can be used interchangeably with the default names.
This function returns the month component, as a number, from a date.
dfmonth usage
number month;
number day;
month = dfmonth( @T );
day = dfday( @T );
if ( month == 4 && day == 1 )
    dfmessage( "It is April Fool's Day. Be skeptical." );
The dfmoveto statement can be used to change the normal (keyboard) order of field traversal on a record.
dfmoveto usage
dfmoveto(DOB);
dfmoveto(@(T+3)); # 3 vars after current var
The first variable that can be moved to on the current record is the sequence number, if it is not barcoded; otherwise, it is the subject ID. Without edit checks, the focus moves to this first variable when the current record is displayed. Using dfmoveto in a plate enter edit check, the programmer can change this so that the focus starts on a different variable.
Attempting to move to a variable in advance of the first variable will quietly move to the first variable. The last variable that can be moved to is the DFSCREEN variable, which DFdiscover maintains at the bottom of every CRF page. It is not possible to move to any variable after DFSCREEN. Attempts to do so will generate an error message and the focus will move to the next sequential variable after the current variable.
It is not possible to move to a variable on another record. If multiple dfmoveto calls are executed in one edit check, the active variable becomes the last variable moved to.
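For example, a plate enter edit check could start data entry on a later field - a sketch, in which VISITDATE is a hypothetical field name and the edit check is assumed to be attached as a plate enter check in the study setup:

```
# Hypothetical plate enter edit check: begin entry at VISITDATE
# instead of the first keyed field
edit StartAtVisitDate()
{
    dfmoveto( VISITDATE );
}
```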
dfmoveto is ignored:
in field enter and exit edit checks if the mouse is used to set variable focus. Variable focus remains on the selected variable if the variable is selected with the mouse.
in plate exit edit checks. At plate exit, each variable on the current record is always visited in sequential order - dfmoveto cannot be used to alter this order.
It is possible to create a loop by repeated calls to dfmoveto. For example, variables A and B each have field enter edit checks that execute dfmoveto to the other variable. Loops are detected, and aborted, after 20 consecutive executions of dfmoveto via edit checks without an intervening field exit action.
This function can be used to alter the subject binder icons and determine which visits and plates are shown when working in a subject binder. It is typically used in special edit check DFopen_patient_binder, which if present runs each time a new subject binder is opened.
dfneed usage
Action can have one of the following values:
DFNEED_TRIM - do not show empty visit or empty plate
DFNEED_HIDE - do not show visit or plate even if it is not empty
DFNEED_OPTIONAL - display visit or plate with the circle icon
DFNEED_REQUIRED - display visit or plate with the square icon
DFNEED_UNEXPECTED - display visit or plate with the diamond icon
DFNEED_RESET - re-evaluate the status of the listed visits and plates, updating the status icons in the subject binder
Visits is a comma delimited list of visit numbers and ranges
Plates is a comma delimited list of plate numbers and ranges
edit DFopen_patient_binder()
{
    # Adverse Event report form comes in 2 versions (plates 4 and 5).
    # For each recorded AE (seq 101-199) hide the empty version.
    number n = mymaxseq(4);   # get last plate 4 AE report number
    number x = mymaxseq(5);   # get last plate 5 AE report number
    if( x > n ) n = x;        # set n to last recorded AE report number
    while( n >= 101 )
    {
        if( ! dfmissing(DFPLATE[,n,4]) ) dfneed(DFNEED_TRIM,n,5);
        if( ! dfmissing(DFPLATE[,n,5]) ) dfneed(DFNEED_TRIM,n,4);
        n = n - 1;
    }
    # Hide follow-up visits (2-99) until eligibility criteria have been met
    if( !myeligible() ) dfneed(DFNEED_TRIM,"2-99","");
    # Hide endpoint adjudication form (visit 0 plate 9) from all roles
    # except the endpoint adjudicators
    if( dfrole()!="adjudicator" ) dfneed(DFNEED_HIDE,0,9);
}
Trim is ignored if the data record exists or the visit contains records.
If all plates in a visit are hidden the visit will also be hidden.
dfneed only operates on subject binders; hidden records will be visible in List view and in task lists if the user's role can access the record. dfneed can be called within plate enter or exit edit checks as well as within DFopen_patient_binder.
Like other functions, dfneed accepts an empty string parameter for visit and plate number lists. However, in some cases, required plates may override changes made to a visit using dfneed. It is therefore recommended to use an acceptable wildcard value, either "*" or "all", in the plate number parameter instead of an empty string if the intention is to change an entire visit that contains required plates.
This function returns the label for the page referenced by the argument keys.
dfpageinfo usage
attr is DFPAGE_LABEL
string label = dfpageinfo( , , , DFPAGE_LABEL ); dfmessage( "The page label is ", label );
Any of the ID, visit and plate arguments can be omitted. Omitted arguments are taken from the keys of the current record.
This function can be used to ask users to re-enter their DFdiscover password.
dfpassword usage
message - a string containing a message to be displayed on the password entry dialog.
limit - the maximum number of attempts allowed for the user to enter the correct password. See Notes for more detail.
True (1) if the user enters their login password.
False (0) if the user fails to enter the correct password within the specified limit or gives up by selecting Cancel.
string msg = "Please enter your password to obtain a randomization code.";
if( dfbatch() ) return;  # do not continue if this edit check is called in a batch job
if( dfpassword(msg,3) )
    dfexecute("Randomization.shx");
else
{
    dferror("Password was incorrect.");
    dfexecute("RandomizationFailure.shx");
}
If the user enters an incorrect value in the password dialog the message [Error: at least one of Username and Password is incorrect.] is displayed and the user may be able to try again, or select Cancel, to exit the dialog.
The limit parameter alone is used by DFcollect in offline mode. In all other applications (DFcollect online, DFbatch, DFexplore, DFweb), if the limit parameter is greater than 1 and less than the system-specified limit of failed login attempts, it is used to limit password attempts, returning 0 if the limit is reached. If the limit parameter is less than 1 or greater than the system-specified limit, the limit parameter is ignored and the system-specified limit is used to limit password attempts. The system-specified limit of failed login attempts is defined under the Master tab in DFadmin.
Generally dfpassword should not be called in batch but, if it is, it will always return TRUE.
This function is similar to dfpassword and can be used to ask users to re-enter their DFdiscover password.
dfpasswdx usage
1 if the user enters their login password correctly.
0 if the user fails to enter the correct password within the specified limit
-1 if the user selects Cancel.
string msg = "Please enter your password to obtain a randomization code."; if(dfpasswdx(msg,3) == -1) dfwarning("You have canceled password entry.");
dfpasswdx is similar to dfpassword except that it accommodates an additional return value of -1 to allow programmers to distinguish between the entry of an incorrect password and a cancel operation. dfpasswdx displays the same password dialog as dfpassword.
Generally dfpasswdx should not be run in batch: it will always return 1 (password was entered correctly).
The dfplateinfo function returns information about the plate referenced by the argument, including up to 20 custom plate properties.
dfplateinfo usage
plate, the number of an existing plate.
attr is one of DFPLATE_DESC, DFPLATE_FIELDS, DFPLATE_SEQCODING, DFPLATE_USER1, DFPLATE_USER2, DFPLATE_USER3, DFPLATE_USER4, DFPLATE_USER5, DFPLATE_USER6, DFPLATE_USER7, DFPLATE_USER8, DFPLATE_USER9, DFPLATE_USER10, DFPLATE_USER11, DFPLATE_USER12, DFPLATE_USER13, DFPLATE_USER14, DFPLATE_USER15, DFPLATE_USER16, DFPLATE_USER17, DFPLATE_USER18, DFPLATE_USER19, DFPLATE_USER20.
DFPLATE_DESC returns the plate label
DFPLATE_FIELDS returns the number of fields on the plate
DFPLATE_SEQCODING returns the coding method of the sequence on the plate, with 1 meaning predefined and 2 meaning that the sequence number is in the first data field
DFPLATE_USER# returns user-defined plate information of corresponding custom plate property
edit listvars()
{
    # List the field names of all fields for a given plate
    number count;
    number nplates;
    string scount;
    string splate;
    splate = DFPLATE;
    dfmessage("Plate " + splate + ": " + dfplateinfo(DFPLATE,DFPLATE_DESC));
    scount = dfplateinfo(1,DFPLATE_FIELDS);
    nplates = scount;
    count = 1;
    while( count <= nplates )
    {
        dfmessage("\t" + dfvarinfo( @[count], DFVAR_NAME) );
        count = count + 1;
    }
}
If the plate parameter is not a valid plate number (less than 1, greater than 511, or a plate number not defined in the study setup), the edit check runtime will issue an error message to that effect.
The return value of dfplateinfo is a string in all cases.
Function dfpref is used to set DFexplore user preferences, most commonly in edit checks DFopen_study and DFopen_patient_binder, which if present run when the study is selected and when a subject binder is opened, respectively.
dfpref usage
dfpref takes 3 parameters:
prefname - one of the DFexplore user preferences
prefvalue - one of the settable values for the preference
duration - how long the preference setting lasts, which may be:
DFPREF_CURRENT = current page only
DFPREF_SESSION = current user session
DFPREF_LOCK = always, can not be modified by the user
An edit check might call dfpref to set a preference on opening a subject binder or on opening a page. If the duration is DFPREF_CURRENT or DFPREF_SESSION the user may still change the preference via the DFexplore interactive preference dialog - that is, the preference is advisory only. If the preference duration is set to DFPREF_LOCK, the user is prevented from interactively changing the specified preference value. However, additional calls to dfpref in subsequent edit checks can make further changes without restriction.
The preference names and values listed below are entered in the parameter list as quoted strings. They are given in the order in which they appear within the DFexplore preferences dialog. Refer to DFexplore User Guide, User Settings for a description of each preference. Case is not significant when specifying preference name and value.
PreferenceName                 PreferenceValue
-----------------              ---------------
UseSubjectAlias                Yes, No
DefaultView                    Dashboard, Schedule, Image, Data, Queries Reasons,
                               Reports, Status, List, Batch Edits
AutoLogout                     minutes
Language                       Any languages defined in Translations
Data Window:
ExpandVisits                   Yes, No
OpenFirstPage                  Yes, No
AdvanceField                   Yes, No
OpenTaskPage                   Yes, No
WarnTraverse                   Yes, No
RetainPosition                 Yes, No
ShowDatePicker                 Yes, No
AutoAlignText                  Yes, No
ShowMetaPanel                  Yes, No
ShowDocumentPanel              Yes, No
eCRFColor                      #D4E6F1 (hex RRGGBB code)
Image Window:
AutoOpenImage                  Yes, No
ScreenSplit                    Toggle, StickyToggle, DataLeft, DataRight,
                               DataTop, DataBottom
Record List:
ShowVisit                      NumberLabel, Number, Label
ShowPlate                      NumberLabel, Number, Label
ShowSite                       NumberLabel, Number, Label
RecordListNavigation           Yes, No
Query Defaults:
QueryUsage                     External, Internal
QueryType                      Clarification, Correction
List View:
ShowFieldName                  Generic, Unique, NumberGeneric, NumberUnique
ShowCodedField                 Code, Label
ShowDateField                  Default, Calendar, Julian, CalendarImpute, JulianImpute
ShowFieldColor                 Yes, No
ExpandText                     Yes, No
Image View:
ReviewOnLoaded                 Yes, No
ReviewOnSaved                  Yes, No
Reports View:
NewReportTab                   Yes, No
Background Options:
BackgroundColor                Black, White, Color
BackgroundType                 Default, (any values defined in DFCRFType_map)
Schedule View:
ShowSubjectSchedule            Yes, No
ShowVisitScheduleInfo          Yes, No
ShowMissingAndOverdue          Yes, No
ShowUnexpectedVisitsAndPlates  Yes, No
ShowCycles                     Yes, No
ShowCycleVisits                Yes, No
ShowCorrectionQueries          Yes, No
ShowClarificationQueries       Yes, No
Mode and Level:
WorkingMode                    View, Edit, Modify, Validate, DDE
SaveLevel                      1, 2, 3, 4, 5, 6, 7
PendingPlateExitEC             On, Off
Schedule View preferences, as well as WorkingMode, SaveLevel and PendingPlateExitEC do not appear in the DFexplore preferences dialog, but can be set in edit checks by dfpref as follows:
ShowSubjectSchedule, ShowVisitScheduleInfo, ShowMissingAndOverdue, ShowUnexpectedVisitsAndPlates, ShowCycles, ShowCycleVisits, ShowCorrectionQueries, ShowClarificationQueries provide edit check control over which schedule sub-windows are visible in Schedule View. Each schedule sub-window is identified by a specific keyword from the list. The setting for each keyword is "yes" or "no" (case insensitive): "yes" makes the sub-window visible, "no" hides it.
WorkingMode
Mode can be set in DFopen_study and DFopen_patient_binder but not in any other edit checks.
dfpref sets mode for all records in data view only; Validate mode is always in effect in DFexplore Image View.
The mode set by dfpref can be overridden by users with 'data with select' permission using 'Select-Change Mode & Level', except when WorkingMode is set and locked using DFPREF_LOCK.
When mode is set and locked, users who have 'data with select' permission will not be able to change the mode of any ad hoc tasks they create in Data or List view. However, since these users have permission to define new tasks, they will be able to create new tasks with any mode and level and assign the task to any user or role, including their own.
The mode set by dfpref is ignored while working on a task; the mode specified in the task definition has priority.
Setting WorkingMode to View prevents users from making changes to both data and metadata.
Setting WorkingMode to Edit, Modify, Validate or DDE will not enable a user who does not have permission to write data records to make data changes. Function dfpref cannot be used to grant permissions that are not present in the user's study role.
Setting WorkingMode to View will disable all subsequent edit checks if 'Run Edit checks in View Mode' is set to 'No' in DFsetup.
Users with permission to use 'Batch Validate' to change the level of all records in their current task set can still use this feature; dfpref settings only apply to individual record saves.
DFweb and DFcollect do not support the 'DDE' working mode.
SaveLevel
SaveLevel has the same restrictions and behavior described above for WorkingMode with the following exceptions.
SaveLevel can be used in plate exit edit checks in both Data and Image Views, but only with DFPREF_CURRENT, to set the level at which the current data record will be saved.
When performing a task the SaveLevel set in a plate exit edit check has priority over the level specified in the task definition.
The dfpref request is ignored if the user does not have permission to write records at the specified level.
PendingPlateExitEC
This preference is used to turn all plate exit edit checks on or off when saving records as Pending in DFexplore. It also turns on or off any field exit edit checks attached to the current field (i.e. the field having the focus) at the time that the current record is saved as Pending. If PendingPlateExitEC is not set, the default behaviour is "On" (i.e. always run plate exit edit checks when saving records as Pending).
PendingPlateExitEC does not appear in the DFexplore Preferences dialog and can only be set by an edit check. For this reason, DFPREF_LOCK and DFPREF_SESSION behave in the same way.
Unlike WorkingMode, PendingPlateExitEC can be set in DFopen_study, DFopen_patient_binder, or in any other edit check.
DFPREF_CURRENT, if set, will overwrite DFPREF_LOCK/DFPREF_SESSION.
PendingPlateExitEC can be called anywhere except in plate exit edit checks.
# Lock down user preferences for the clinical sites
edit DFopen_patient_binder()
{
    if ( dfrole()=="Clinical Site" )
    {
        dfpref("ExpandVisits","Yes",DFPREF_LOCK);
        dfpref("AutoOpenImage","Yes",DFPREF_LOCK);
        dfpref("AdvanceField","No",DFPREF_LOCK);
        dfpref("BackgroundType","SITE",DFPREF_LOCK);
        dfpref("PendingPlateExitEC","Off",DFPREF_LOCK);
    }
}
# During site monitoring using the 'Source Verification' task,
# records are saved to the task level when incomplete, but when final
# are saved to level 5 where the sites can no longer change them.
edit SVfinal()
{
    if ( dftask()!="Source Verification" ) return;
    if ( DFSTATUS==1 ) dfpref("SaveLevel","5",DFPREF_CURRENT);
}
This function is used to return the current value of DFexplore user preferences.
dfprefinfo usage
dfprefinfo takes 1 parameter, the name of a DFexplore user preference.
The preference name is entered as a quoted string. Case is not significant. Valid preference names and the values returned by dfprefinfo are shown below. Refer to DFexplore User Guide, User Settings for a description of each preference.
WorkingMode, SaveLevel and PendingPlateExitEC do not appear in the DFexplore preferences dialog, but can be tested in edit checks using dfprefinfo.
# On study selection show the Auto Logout setting to users with the Clinical Site role
edit DFopen_study()
{
    if ( dfrole()=="Clinical Site" )
    {
        string msg = "NOTE!\n\n"
                   + "Auto Logout is set to " + dfprefinfo("AutoLogout") + " min.\n"
                   + "To change this and other user preferences select:\n"
                   + "'Preferences...' from the 'File' menu in Data View.";
        dfwarning(msg);
    }
}
This function returns the protocol version in effect for the argument subject ID on the argument date.
dfprotocol usage
date, date in the format "yyyy/mm/dd", "today" or "" (same as "today")
The return value of dfprotocol is a string in all cases: the protocol version in effect on the argument date.
The protocol version is identified by:
determining the site ID for the argument subject identifier,
for the site ID, comparing the date with each of the 5 DFSITE_PROTOCOLDATE values and returning the matching DFSITE_PROTOCOL value. If date is before DFSITE_PROTOCOLDATE1 return ""; if date is after DFSITE_PROTOCOLDATE5 return DFSITE_PROTOCOL5; otherwise return DFSITE_PROTOCOLn where date is on or after DFSITE_PROTOCOLDATEn and before DFSITE_PROTOCOLDATEn+1.
string protocol_vers = dfprotocol( , "2019/01/01" );
dfmessage( "The protocol version in effect on 2019/01/01 is ", protocol_vers );
Determine the protocol in effect for the current subject for their current visit.
edit ProtocolAtVisit()
{
    string vdate = dfvisitinfo( , , DFVISIT_DATE);
    # dates are in different formats - need to switch
    date dt = dfstr2date(vdate, "yy/MM/dd", 1990, 0);
    string version = dfprotocol( , dfdate2str(dt, "yyyy/MM/dd"));
    dfmessage("On visit date ", vdate, ", effective protocol version is ", version);
}
The dfrole function is used to determine the study role by which a user has read access to specified data keys.
dfrole usage
The key fields (id, visit, and plate) for the CRF to be tested.
One or more of id, visit and plate values may be omitted and will default to the id, visit or plate of the current record. When omitting keys remember to include all commas, e.g. dfrole(,,44) returns the role by which the user has access to plate 44 of the current visit for the current subject.
if ( dfrole() == "Full Permissions" ) dfmessage ("You have permission to save data for these keys." );
The dfrole function can be used to make edit check behavior dependent upon a user's role in cases where an edit check is only relevant, or needs to behave differently, for users with different roles.
In DFopen_study dfrole() returns a pipe delimited list of all roles the user has been assigned in the current study.
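Because this list is pipe delimited, dfmatch with the punctuation and word modes can be used to test role membership - a sketch, assuming a role named 'Monitor' exists in the study:

```
edit DFopen_study()
{
    # "PW": treat the | delimiters as spaces and match "Monitor" as a whole word
    if ( dfmatch( "Monitor", dfrole(), "PW" ) )
        dfmessage( "This user holds the Monitor role." );
}
```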
The dfsiteinfo function returns attribute information about the site overseeing the argument subject ID.
dfsiteinfo usage
attr is one of DFSITE_ID, DFSITE_NAME, DFSITE_CONTACT, DFSITE_ADDRESS, DFSITE_FAX, DFSITE_PHONE, DFSITE_INVESTIGATOR, DFSITE_SUBJECTS, DFSITE_TEST, DFSITE_COUNTRY, DFSITE_BEGINDATE, DFSITE_ENDDATE, DFSITE_ENROLL, DFSITE_PROTOCOL1, DFSITE_PROTOCOLDATE1, DFSITE_PROTOCOL2, DFSITE_PROTOCOLDATE2, DFSITE_PROTOCOL3, DFSITE_PROTOCOLDATE3, DFSITE_PROTOCOL4, DFSITE_PROTOCOLDATE4, DFSITE_PROTOCOL5, DFSITE_PROTOCOLDATE5, DFSITE_REPLYTO.
The return value is dependent upon the attr. The return value of dfsiteinfo is a string in all cases. The possible values for attr, and their meaning, are:
DFSITE_ID the unique, numeric identifier of the site
DFSITE_NAME the descriptive name of the site
DFSITE_CONTACT the name of the primary contact person at the site
DFSITE_ADDRESS the street address of the site
DFSITE_FAX the email address(es) or fax number(s) of the site
DFSITE_PHONE the telephone number of the site
DFSITE_INVESTIGATOR the name of the site's primary investigator
DFSITE_SUBJECTS the possible and actual subject IDs enrolled at the site
DFSITE_TEST yes, if the site has test subjects/data only; no, otherwise
DFSITE_COUNTRY 3-letter country code, from the ISO country codes list
DFSITE_BEGINDATE the begin date for the site, in yyyy/mm/dd format
DFSITE_ENDDATE the actual, or anticipated, end date for the site, in yyyy/mm/dd format
DFSITE_ENROLL the expected number of subjects to be enrolled by the site
DFSITE_PROTOCOL1 the first protocol version in use at the site
DFSITE_PROTOCOLDATE1 the begin date of the first protocol version in use at the site
DFSITE_PROTOCOL2, DFSITE_PROTOCOL3, DFSITE_PROTOCOL4, DFSITE_PROTOCOL5 additional protocol versions, up to 5, in use at the site
DFSITE_PROTOCOLDATE2, DFSITE_PROTOCOLDATE3, DFSITE_PROTOCOLDATE4, DFSITE_PROTOCOLDATE5 the begin date of additional protocol versions in use at the site
DFSITE_REPLYTO emails sent to the site include this reply-to email address
string enroll_target = dfsiteinfo( , DFSITE_ENROLL );
dfmessage("The site enrollment target is ", enroll_target );
If the ID is omitted, the subject ID of the current record is used.
dfcenter( ID ) is equivalent to dfsiteinfo( ID, DFSITE_ID ).
The sqrt function calculates the square root of the argument number, and returns it as an integer (if the result can be expressed as an integer with no loss of precision), or as a fixed point number.
sqrt usage
sqrt(65)
returns 8.062258
sqrt(16.0)
returns 4
Function dfstay is used in edit checks triggered on record Save, to abort the save operation and keep the user on the current page with the focus on a specified data field.
dfstay usage
dfstay(DOB);# stay on field DOB
dfstay(@T); # stay on the current field
Function dfstay is implemented only in DFexplore and has no effect in DFbatch.
Function dfstay is ignored until record Save is selected. Thus, dfstay is only honored in plate exit edit checks and in any field exit edit checks on the data field (if any) that has the focus when Save is selected, because any field exit edit checks on this field will run before the plate exit edit checks begin.
When dfstay is honored: all subsequent edit checks are canceled, the record save operation is canceled, and the focus moves to the field specified in the dfstay argument. These features make dfstay useful in edit checks designed to offer the user the opportunity to fix a problem before committing the record and moving on to the next record. And, in cases where several edit checks are triggered on plate exit, they allow the user to stop and fix each problem as it is identified.
A typical example would be an edit check that used dfask to describe the problem and offer the user the option to 'fix it now' or 'fix it later'. If the user selects 'fix it later' a query could be added to the field, but if the user selects 'fix it now' dfstay would be used to put the focus back on the field and stop. The user could then resolve the problem by correcting the value, adding a reason or replying to a query, before again selecting Save to commit the record to the database.
Since dfstay cancels the record save request it can also be used to block users from saving changes to the study database under specified conditions. In such cases it would be wise to display a message explaining this to the user, e.g. 'This record requires a visit date and cannot be saved without one'.
dfstay can only put the focus on a data field on the current page. It can not be used to move to a field on some other page.
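The 'fix it now / fix it later' pattern described above can be sketched as follows. The field name VDATE, the query category code 1, and the dialog wording are illustrative assumptions, not part of any particular study:

```
edit CheckVDate()
{
    # plate exit edit check, run when record Save is selected
    if ( dfblank(VDATE) ) {
        number answer;
        answer = dfask( "Visit date is missing. Fix it now?", 1, "Fix it now", "Fix it later" );
        if ( answer == 1 )
            dfstay(VDATE);   # cancel the save and put the focus back on VDATE
        else
            dfaddqc(VDATE, 1, "Visit date is missing", 1, 2, "", 1);  # add a query silently
    }
}
```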
This function is used to convert a specified string to a date using a specified date format, start year, and imputation method. It can be used to create a local date variable or assign a value to a date field (see examples).
dfstr2date usage
date is a string containing the date, e.g. "09/01/15" for Jan 15, 2009 in yy/mm/dd format.
format is the date format, e.g. "yy/mm/dd"
start_year is the first year of an imputed century for 2 digit years, e.g. 1950
imputation method is one of:
0 never
1 beginning of the month or year
2 middle of the month or year
3 end of the month or year
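A hedged sketch of how the imputation methods might behave for a date string with a missing day. The "yyyy/mm" format and the exact imputed days are assumptions based on the beginning/middle/end rules listed above:

```
# "2009/06": the day is missing
date d;
d = dfstr2date( "2009/06", "yyyy/mm", 1950, 0 );   # never impute: missing value
d = dfstr2date( "2009/06", "yyyy/mm", 1950, 1 );   # beginning of the month: 2009/06/01
d = dfstr2date( "2009/06", "yyyy/mm", 1950, 2 );   # middle of the month: 2009/06/15
d = dfstr2date( "2009/06", "yyyy/mm", 1950, 3 );   # end of the month: 2009/06/30
```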
dfstr2date converts the specified date string to the edit check language's internal representation of a date using the specified date format rather than the standard global edit check date format.
As with all dates, the edit check language converts an invalid date string to its internal representation of a missing value.
# Get the record creation date and store it in a user defined date field
string c = dfgetfield(DFCREATE,1," ");
@T = dfstr2date(c,"yy/mm/dd",2000,0);
# Store the number of days between record creation and last modification
# in a user defined numeric field
date cd,md;
string cs = dfgetfield(DFCREATE,1," ");
string ms = dfgetfield(DFMODIFY,1," ");
cd = dfstr2date(cs,"yy/mm/dd",2000,0);
md = dfstr2date(ms,"yy/mm/dd",2000,0);
@T = md - cd;
This function returns attribute information about the study, including the study number, name, start year and up to 20 custom study properties.
dfstudyinfo usage
attr is one of DFSTUDY_NAME, DFSTUDY_NUMBER, DFSTUDY_YEAR, DFSTUDY_USER1, DFSTUDY_USER2, DFSTUDY_USER3, DFSTUDY_USER4, DFSTUDY_USER5, DFSTUDY_USER6, DFSTUDY_USER7, DFSTUDY_USER8, DFSTUDY_USER9, DFSTUDY_USER10, DFSTUDY_USER11, DFSTUDY_USER12, DFSTUDY_USER13, DFSTUDY_USER14, DFSTUDY_USER15, DFSTUDY_USER16, DFSTUDY_USER17, DFSTUDY_USER18, DFSTUDY_USER19, DFSTUDY_USER20.
The return value is dependent upon the attr. Specifically, the descriptive name of the study, the study number, and the start year of the study (as defined by the value of Study launched in the year in the study global settings) are returned for the DFSTUDY_NAME, DFSTUDY_NUMBER, and DFSTUDY_YEAR attributes respectively. For each DFSTUDY_USER# attribute, the corresponding user-defined custom study property description is returned.
The return value of dfstudyinfo is a string in all cases.
string studyname = dfstudyinfo( DFSTUDY_NAME ); dfmessage( "This is study ", studyname );
The dfsubstr function returns a substring of an input expression converted to a string.
dfsubstr usage
expn is the input expression which will be converted to the source string.
start is the starting character position of the desired substring (1st character position is 1).
len is the length in characters of the desired substring.
Desired substring.
If len is greater than the number of characters remaining to the end of the string, len is adjusted to include only the characters remaining in the string.
string s="Hello world!"; A = dfsubstr( s, 3, 5 );
assigns "llo w" to A.
A start value of less than 1 is converted to 1. A len of less than 1 is converted to 0. If start is greater than the length of the string, or len is 0, an empty string is returned.
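Following the clamping rules stated above, and reusing the string from the previous example:

```
string s = "Hello world!";     # 12 characters
A = dfsubstr( s, 7, 100 );     # len clamped to the remaining characters: "world!"
A = dfsubstr( s, -5, 3 );      # start less than 1 becomes 1: "Hel"
A = dfsubstr( s, 20, 3 );      # start beyond the end of the string: ""
A = dfsubstr( s, 1, 0 );       # len of 0: ""
```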
Date fields are first converted to Julian dates and are then converted to strings using the edit check date format. This may lead to unexpected results.
Example: Date format conversion
date format "mmm/dd/yy"
...
edit example()
{
    string A;
    # at this point EDATE, a database variable, contains 98/11/14
    A = dfsubstr( EDATE, 1, 3 );
}
In this example, the date format of the EDATE variable, yy/mm/dd, and that of the edit checks global default, mmm/dd/yy, do not match. The conversion of EDATE to a string creates the string NOV/14/98. The subsequent dfsubstr execution extracts the first 3 characters, assigning "NOV" to A.
The dftask function is used to determine the current task name in DFexplore and the current batch name in DFbatch.
dftask usage
dftask returns a task name or an empty string as follows:
When called with no parameters:
returns the current task name, even if the current record is not in the task set, or
returns a blank string if there is no active task
When called with a set of record keys (id,visit,plate):
returns the current task name, if the specified record is in the task set, otherwise
it returns a blank string.
Since functions default to current values if any of the keys are not specified, dftask(,,) returns the task name if the current record is in a task set, otherwise a blank string.
When 'Import Subject Documents' is used in DFexplore Data View to 'Import data entry worksheets/CRFs', a task set is created for all imported pages, and dftask() returns 'Import Subject Documents' as the task name.
In DFbatch, dftask() can only be called without parameters and returns the 'name' attribute of the current 'BATCH' element. If a DFbatch input file has multiple BATCH elements, the name of the current one is returned.
number answer;
string task = dftask();
if ( task == "SiteMonitoring" )
    answer = dfask("Does the recorded data match medical records?",1,"Yes","No");
if ( answer == 1 ) ...
The dftime function returns the current time on the machine in HH:MM:SS format.
dftime usage
string now; now = dftime();
The dftoday function returns a date value representing today's date.
dftoday usage
date today; today = dftoday(); if ( VDATE > today ) dferror( "Visit occurred in the future!" );
The dftool function returns TRUE/FALSE depending on whether the edit check is executing in the named tool or not.
dftool usage
toolname is a literal string equal to the name of the tool to test for.
Valid tool names are DFexplore, DFbatch, DFcollect and DFweb. Additionally, for backwards compatibility, iDataFax is accepted as a synonym for DFexplore.
The argument must be a literal string - it cannot be a string from a variable.
NOTE: If edit checks are running in DFexplore within Batch View, the correct tool name is DFbatch.
if (dftool("DFexplore")) dfmessage("running in DFexplore");
if (dftool("iDataFax")) dfmessage("running in DFexplore");
This function is executed, if it is found in plate exit edit checks, when a record is saved in Image View or Data View using DFexplore or processed in DFbatch. It is a no-op when encountered in plate enter edit checks or field enter/exit edit checks.
dftrigger can be used to add a new record to the current subject binder and/or to execute all or specified plate entry edit checks on a specified data record in the current subject binder. This provides a mechanism for creating conditional plates and updating data fields on other records that depend on values entered and saved on the current record.
In DFbatch, dftrigger can be used to add new records to the database or to visit them to update their data fields programmatically.
dftrigger verifies user permissions and attempts to perform record locking. If the subject ID in the argument keys is different than the current subject ID, a record level lock for the single record identified by the keys is requested. If the subject ID in the argument keys is the same as the current subject ID,
in Data View, the lock is already held (subject level locking) and no further record locking is required, or
in Image View and Batch Edits View a record level lock is requested.
If permissions are insufficient or the locking request fails, dftrigger does nothing and returns 0; otherwise it returns 1.
If the target data record already exists in the database and no plate entry edit checks are specified, dftrigger can then be used to change the status and/or level of the target record, and/or to change to a different plate when the target record is saved.
New records added to the current subject binder by dftrigger in Data View will immediately appear in the subject record list, and if the user is working on a task set the new record will be added to the task set list.
dftrigger usage
id, visit, and plate identify the keys of the target record.
One or more of id, visit and plate values may be omitted and will default to the id, visit or plate of the current record. When omitting keys remember to include all commas, e.g. dftrigger(,,44,3,0,"GetInitials",0) will run the GetInitials plate entry edit check for plate 44 for the current subject and visit, but will not open the record. If plate 44 doesn't exist it will be created and added to the study database with status pending and level 0.
Status must be one of: 1=final, 2=incomplete, 3=pending, or blank. When blank the status of an existing record will not be changed and the status of any newly created records will be set to 3=pending. The following restrictions apply:
if an existing record has status incomplete because of illegal values or unresolved queries, its status cannot be changed to final.
if any of the specified plate entry edit checks adds an illegal value or unresolved query to a new or existing data record, its status is set to incomplete.
Level is the workflow or validation level, which must be 0-7 or blank. When blank, the level of an existing record is not changed and the level of any newly created records is set to 0. However, level cannot be reset from 1+ back to zero; any such request is ignored. When using dftrigger to set status or level on a target plate, neither status nor level can be left blank; leaving either one blank has the same result as leaving both blank.
The edit check parameter may be an empty string if no edit checks are to be run, "ALL" to run all plate entry edit checks, or a list of edit check names separated by commas or spaces, e.g. "GetInitials,CalcTargetDate". Only plate entry edit checks will run; any other edit check names are silently ignored. Most dialogs will not appear, and the default action is taken for dialogs that prompt the user for input. Specifically, the dialogs that will not appear include dfask, dfmessage/dfdisplay/dfwarning/dferror, dfaddqc, dfeditqc, dfreplyqc, and dfaddreason. The dialogs that will still appear include dfcapture, dfclosestudy, dflogout, dfpassword, and dfpasswdx.
The final action parameter determines whether the triggered record should be created, opened, and added to the current task set. Actions 0-4 create the specified data record with status pending and level 0 if it does not already exist in the study database. The action parameter is ignored in DFbatch and in DFexplore Image View. In Data View it functions as follows:
0 = Run the edit checks but do not open the data record.
1 = Open the record only if it was changed, i.e. a new pending level 0 record was created, or an existing record was modified by a plate entry edit check. Do not add the record to the current task set.
2 = Open the record even if it was not changed. Do not add it to the current task set.
3 = Same as action 1, but record is added to the current task set (if any).
4 = Same as action 2, but record is added to the current task set (if any).
5 = Open the record. If it does not exist in the study database open a new blank record; do not create a pending level 0 record, and do not add it to the current task set.
Unlike other edit check statements, dftrigger requests are not executed immediately; instead they are added to a dftrigger queue and executed in order as a final step during plate exit edit check processing.
In DFexplore, actions 1-5 add the data record to the current list of open records and take the user to that record instead of going to the next record in the list. Since dftrigger is ignored in all but plate exit edit checks, the action request cannot be used to interrupt data entry on the current page.
If all 3 keys are omitted action 1-4 can be used to return to the current data record after it is saved to the study database. Only those plate entry edit checks specified in the edit check parameter will be executed. In the following example the user will remain on the current page after selecting a Save button, and plate entry edit check INIT2 will be re-executed.
edit EXIT2()
{
    # run on exit from plate 2
    if( NewRandomization() ) {
        dfwarning("Please print this page and take it to the hospital pharmacy.") ;
        dftrigger(,,,,,"INIT2",2) ;
    }
}
If plate exit edit checks contain multiple dftrigger functions with non-zero actions, all of the relevant target data records will be added to the current record list and the focus will move to the first such record identified during plate exit edit check processing.
# If subject was hospitalized and hospitalization record (plate 56) does
# not exist at this visit, create the hospitalization record, run the
# GetInitials plate entry edit check, and then open the record
if (HOSPITALIZED == TRUE && dfmissingrecord(,,56)==2 ) {
    dfwarning("Please complete hospitalization details on the next page.") ;
    dftrigger(,,56,3,0,"GetInitials",1) ;
}
The dfuserinfo function returns information about the user running the edit check, including their login name, full name, and email as defined in DFadmin.
dfuserinfo usage
dfuserinfo takes 1 parameter, the name of a property.
The property name is entered as a quoted string. Case is not significant. Valid property names are UserName, FullName, and Email, which return the user's login name, full name, and email address respectively, as defined in DFadmin. Refer to System Administrator Guide, DFadmin - Users for details.
edit AboutMe()
{
    dfwarning("My username is: ", dfuserinfo("UserName"));
    dfwarning("My full name is: ", dfuserinfo("FullName"));
    dfwarning("My email is: ", dfuserinfo("Email"));
}
dfuserinfo returns an empty string if the property name is spelled incorrectly.
If the full name property is blank in DFadmin, dfuserinfo("FullName") returns the username. If the email property is blank in DFadmin, dfuserinfo("Email") returns an empty string.
dfuserinfo("UserName") returns the same value as dfwhoami.
The dfvarinfo function returns information about the variable referenced by the argument, including up to 20 custom variable properties.
dfvarinfo usage
var, which must be a database variable.
attr is one of DFVAR_NAME, DFVAR_GENERIC, DFVAR_UNIQUE, DFVAR_TYPE, DFVAR_DESC, DFVAR_HELP, DFVAR_FORMAT, DFVAR_LABEL, DFVAR_SUBLABEL, DFVAR_ENTER_VALUE, DFVAR_LEGAL, DFVAR_REQUIRED, DFVAR_ESSENTIAL, DFVAR_FLDNUM, DFVAR_STRING_VALUE, DFVAR_PROMPT, DFVAR_UNITS, DFVAR_COMMENT, DFVAR_INSTRUCTION, DFVAR_ACCESS, DFVAR_CONSTANTVALUE, DFVAR_SKIP, DFVAR_SKIPCONDITION, DFVAR_MODNUM, DFVAR_MODNAME, DFVAR_MODDESC, DFVAR_USER1, DFVAR_USER2, DFVAR_USER3, DFVAR_USER4, DFVAR_USER5, DFVAR_USER6, DFVAR_USER7, DFVAR_USER8, DFVAR_USER9, DFVAR_USER10, DFVAR_USER11, DFVAR_USER12, DFVAR_USER13, DFVAR_USER14, DFVAR_USER15, DFVAR_USER16, DFVAR_USER17, DFVAR_USER18, DFVAR_USER19, DFVAR_USER20.
optional_arg is required only when specifying the DFVAR_LABEL or DFVAR_SUBLABEL attribute. In this case, optional_arg is the numeric code for which the choice label or sub label is being requested.
DFVAR_NAME and DFVAR_GENERIC variable's name (previously known as the generic name)
DFVAR_UNIQUE variable's alias (previously known as the unique name)
DFVAR_DESC the variable's description
DFVAR_HELP the variable's help message
DFVAR_TYPE string equal to the variable's type from the following list of variable types: [number], [string], [date], [time], [vas], [choice], [check]
DFVAR_FORMAT the variable's format string
DFVAR_LABEL the label associated with the numeric code specified as the third argument. If the code and/or label is not found, an empty string is returned. DFVAR_LABEL is only applicable where coding exists for field types choice, check and numeric. The third argument is required for this attribute.
DFVAR_SUBLABEL the sub label associated with the numeric code specified as the third argument. If the code and/or sub label is not found, an empty string is returned. DFVAR_SUBLABEL is only applicable where coding exists for field types choice, check and numeric. The third argument is required for this attribute.
DFVAR_ENTER_VALUE the variable's value when the plate was entered. If the variable is not on the current record, an empty string is returned. If after saving a data record the focus remains on the saved record, the enter value of all fields is updated to the saved value.
DFVAR_LEGAL the variable's legal range string or an empty string if no legal range is defined. If the definition of the legal range is the meta-word $(ids) (meaning the list of subject IDs defined in the sites database), the meta-word is returned, rather than its expansion into the actual list of subject IDs.
DFVAR_REQUIRED string, from the following choices, identifying if a value in the field is required: [Y (value is required or essential), N (value is not required or essential, e.g. value is optional)]
DFVAR_ESSENTIAL string, from the following choices, identifying if a value in the field is essential: [Y (value is essential), N (value is not essential, e.g. is required or optional)]
DFVAR_FLDNUM the variable's number (tab order position) as a string
DFVAR_STRING_VALUE the string representation of the variable's current value, including literals representing any missing value codes. If the variable is not on the current record, and the requested record does not exist, an empty string is returned
DFVAR_PROMPT the variable's prompt property
DFVAR_UNITS the variable's units property
DFVAR_COMMENT the variable's comment property
DFVAR_INSTRUCTION the variable's instruction property
DFVAR_ACCESS the variable's access mode
DFVAR_CONSTANTVALUE the variable's constant value, or empty string if it is not constant
DFVAR_SKIP the number of fields to skip, or 0 if no skip is defined
DFVAR_SKIPCONDITION the variable's skip condition, or empty string if no skip is defined
DFVAR_MODNUM the instance number of the module containing the variable
DFVAR_MODNAME the name of the module containing the variable
DFVAR_MODDESC the description of the module containing the variable
DFVAR_USER# the user-defined variable information for corresponding variable custom property
if ( dfvarinfo(@(T+2), DFVAR_TYPE) == "number" )
    dfmessage( dfvarinfo( @(T+2), DFVAR_NAME ), " is a number." );
# get MYDATE value as a string from plate 1 for the current id
str=dfvarinfo(MYDATE[,0,1],DFVAR_STRING_VALUE);
dfmessage("MYDATE on plate 1 is " + str);
# get the string value of a partial date field
dfmessage( "You have entered the date as ", dfvarinfo(@T, DFVAR_STRING_VALUE));
# get the label for the current choice code
dfmessage ( "You marked ", dfvarinfo( @T, DFVAR_LABEL, @T ) );
# get the label for a specific choice code
dfmessage( "Code 1 has the label ", dfvarinfo( @T, DFVAR_LABEL, 1 ) );
# get the current field's prompt property
dfmessage( "field ", @T, " has the prompt property: ", dfvarinfo(@T, DFVAR_PROMPT));
# get the current field's comment property
dfmessage( "field ", @T, " has the comment property: ", dfvarinfo(@T, DFVAR_COMMENT));
# is this field in an AE module
if ( dfvarinfo( @T, DFVAR_MODNAME ) == "AE" ) ...
dfvarname is implemented as dfvarinfo(var, DFVAR_NAME)
dfaccessinfo is implemented as dfvarinfo(var,DFVAR_ACCESS)
The dfvarname function returns the name of the variable referenced by the argument.
dfvarname usage
if ( dfvarname(@(T+2)) == "PTID")
Although var can be any variable, it is only really useful with positional variables and group references.
See also dfvarinfo function for additional information about variables.
Function dfview can be used to determine the user's current working view.
dfview usage
In DFexplore, dfview returns the user's current DFexplore view: 'data' or 'image', which are the only 2 views in which edit checks can be run.
In DFbatch dfview always returns 'batch'.
if (dfview()=="image") return;
The dfvisitinfo function returns information about the visit, or visit map metadata, referenced by the argument.
dfvisitinfo usage
attr is one of DFVISIT_DATE, DFVISIT_TYPE, DFVISIT_ACRONYM, DFVISIT_LABEL, DFVISIT_DUE, DFVISIT_OVERDUE, DFVISIT_REQUIREDPLATES, DFVISIT_OPTIONALPLATES, DFVISIT_ORDERPLATES, DFVISIT_MISSEDPLATE.
string vdate = dfvisitinfo( , , DFVISIT_DATE ); dfmessage( "The visit date is ", vdate );
If the ID is omitted, the subject ID of the current record is used. If the visit is omitted, the visit number of the current record is used.
The dfwhoami function returns the login name of the user running the edit check.
dfwhoami usage
string username; username = dfwhoami(); if ( username == "bob" ) dfmessage( "hey bob, how are ya?" );
This function returns the year component, as a number, from a date.
dfyear usage
number year;
year = dfyear( @T );
if ( year < 2000 )
    dfmessage( "The event occurred in the 2nd millennium." );
The int function is used to return the integer portion of a floating point number.
int usage
int(365.25)
returns 365
The edit check language provides several functions for dealing with queries:
By way of example, the following generic edit check could be used to add a missing value query to a field which has been left blank.
edit addmissing() { if( dfblank(@T) ) dfaddqc(@T,1,"",1,2,""); }
The dfaddqc function allows an edit check to add a query to a field on the current record. It is not possible to add queries to other records. It is possible to add more than one query to a field, provided that the category code is different.
In DFexplore and DFbatch, query creation permissions matter; behavior varies according to which of the 3 possible query creation permissions the user has for fields on the current data record, as follows. The Query Add dialog (DFexplore) is displayed only if the user has full query creation permission. If the user has permission to create queries within edit checks only, the Query Add dialog is not displayed and the query is added automatically. If the user has no query creation permission, dfaddqc is a no-op; it does nothing.
Additionally, when using DFbatch to execute edit checks, the actions of dfaddqc are further impacted by the attributes of the APPLY element.
The behavior of dfaddqc is the same in DFexplore and DFbatch with regard to which data records can have queries added. In both cases a user will only be able to add queries to records for which they have access (read) permission; any restrictions on sites, subjects, visits, and plates will be applied.
dfaddqc usage
var is a database variable on the current plate and is the variable to which the query is to be attached.
code is a category code from the table:
Table 5.79. Category codes
query is the text that should appear as the query field, or "" if there is no text. This text is “sanitized” by replacing any non-printable character with a space/blank and each ‘|’ with a ‘?’.
usage is a numeric usage code from the table:
Query Usage codes
refax is a numeric code from the table:
Query Refax codes
note is the text that should appear in the resolution note, or "" for no text. This text is “sanitized” by replacing any non-printable character with a space/blank and each ‘|’ with a ‘?’.
optional_mode is a numeric value which can be used to suppress the default behavior of showing the Query Add dialog. If it is omitted or has the value 0 the dialog is displayed and the query may be accepted, modified, or canceled by the user. If the value is 1 the dialog is not displayed and the query is silently added, if possible (i.e. if the field does not already have a query with the same category code, and in DFexplore if the user has permission to create queries).
if ( dfaddqc( VDATE, 1, "Date missing", 1, 2, "" ) ) dfmessage( "Added a query to VDATE." );
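Using the optional_mode parameter described above, the same query can be added silently, with no Query Add dialog:

```
# mode 1 suppresses the Query Add dialog; the query is added silently if possible
if ( dfaddqc( VDATE, 1, "Date missing", 1, 2, "", 1 ) )
    dfmessage( "Added a query to VDATE." );
```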
The dfaddmpqc function allows an edit check to add a user-defined missing page query.
dfaddmpqc usage
id, visit, and plate provide the key information for the data record that is missing.
query is the text that should appear as the query, or "" if there is no query. Any invalid characters (i.e., '|' or control characters) are converted to blanks.
note is the text that should appear in the resolution note, or "" for no text. Any invalid characters (i.e., '|' or control characters) are converted to blanks.
if ( dfaddmpqc( 12345, 0, 1, "Excl. crit. required", 1, 2, "" ) )
    dfmessage( "Added a missing page query." );
Attempts to add a missing page query will fail if the data record exists, regardless of its status (final, incomplete, pending or missed), or if a missing page query already exists.
In DFexplore, DFweb, and DFcollect, the user must have permission to create queries for the specified data record, otherwise dfaddmpqc is a no-op; it does nothing. In DFbatch no restrictions are applied; missing page queries can be created for any record regardless of user permissions.
Any of the id, visit and plate parameters may be omitted, in which case they default to the current id, visit, and plate respectively. Although it is legal to use the default for all three key fields, thereby indicating the current record, this will never produce a missing page query, as one cannot add a missing page query to the current page, which clearly is not missing.
Setting the refax option has no effect on dfaddmpqc. User-defined and DFdiscover-created missing page queries, together with DFdiscover-created overdue visit queries, are always correction queries.
Missing page queries are removed by the study server when the data record arrives regardless of the status of the data record (final, incomplete, pending or missed).
These queries can also be removed using edit check function dfdelmpqc. We recommend using complementary add and delete statements, and running these edit checks in batch. Designing edit checks like this:
if ( condition is TRUE ) dfaddmpqc(...); else dfdelmpqc(...);
allows the edit check to correct the Query database if the subject data changes in a way that makes the query no longer required.
The dfanyqc function returns the query status of a variable.
dfanyqc usage
var may be any variable in the study database. It may be a direct, positional or group reference.
optional_category_code may be any category code in the study's query category map. It is an optional parameter.
If optional_category_code is omitted or has value 0, the status of the first query is returned.
If optional_category_code is greater than 0, the status of the query matching optional_category_code is returned.
A numeric status from the table:
Query status codes
if ( dfanyqc( VDATE ) == 1 ) dfmessage( "VDATE has a new query." );
If optional_category_code is greater than 0, and if optional_category_code is not found or there is an error, the return value is 0.
If optional_category_code is illegal (negative), the status of the first query is returned.
If var is not a database variable, the return value is 0. dfanyqc will not report deleted as a query status; for a deleted query it instead reports that no query exists.
In a study with single query permission, the optional_category_code is ignored, and the status of the first (only) query is returned.
When executed within DFopen_patient_binder and DFopen_study edit checks, the return value is 0.
The dfanyqc2 function returns the query statuses of a variable.
dfanyqc2 usage
Returns a string of all query status codes for the referenced variable, each status code delimited by pipe (|).
Each numeric status is from the table:
dfmessage( dfanyqc2(@T) );
If var is not a database variable, the return value is an empty string. dfanyqc2 will not report deleted as a query status, but instead any such query will be skipped in the output.
If only one query exists for a field, dfanyqc and dfanyqc2 return the same value.
When executed within DFopen_patient_binder and DFopen_study edit checks, the return value is an empty string.
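A sketch of walking the pipe-delimited result string using dfsubstr and dflength. It assumes each status code is a single digit, which may not hold for every status code table:

```
# report each query status on the current field
string statuses;
string s;
number i;
statuses = dfanyqc2( @T );
i = 1;
while ( i <= dflength( statuses ) ) {
    s = dfsubstr( statuses, i, 1 );
    if ( s != "|" )
        dfmessage( "Query status: ", s );
    i = i + 1;
}
```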
The dfanympqc function determines if a missing page (code 21), EC missing page (code 23) or overdue visit (code 22) query exists for the specified arguments.
dfanympqc usage
if ( dfanympqc( 12345, 1, 12 ) ) dfmessage( "A missing page query already exists in the database." );
If a missing page, EC missing page or overdue visit query does not exist in the database for the specified keys, dfanympqc returns FALSE.
Any of the id, visit and plate arguments may be omitted, in which case they default to the current id, visit, and plate respectively. Although it is legal to use the default for all three key fields, thereby indicating the current record, dfanympqc will always evaluate to FALSE in that case, as the dfanympqc function will only execute if the current page exists in the database.
dfanympqc will detect missing page, EC missing page and overdue visit queries when executed within a DFopen_patient_binder edit check or in batch mode. dfanympqc will be ignored when executed within a DFopen_study edit check.
The dfdelmpqc function deletes a missing page (code 21), EC missing page (code 23) and overdue visit (code 22) query. The EC missing page query must have been previously created by an invocation of dfaddmpqc, while the missing page or overdue visit query must have been previously created by DF_QCupdate.
dfdelmpqc usage
if ( dfdelmpqc( 12345, 0, 1 ) ) dfmessage( "Deleted a missing page query." );
If the referenced CRF exists, or there is no missing page, EC missing page, or overdue visit query for the keys, dfdelmpqc returns FALSE.
Some, but not all, of the id, plate and visit parameters may be omitted, in which case they will default to the current id, plate, and visit, respectively. Although it is legal to use the default for all three key fields, thereby indicating the current record, this will always fail as one cannot delete a missing page, EC missing page or overdue visit query from the current page, which clearly cannot have one.
dfdelmpqc will delete missing page, EC missing page and overdue visit queries when executed within a DFopen_patient_binder edit check or in batch mode. dfdelmpqc will be ignored when executed within a DFopen_study edit check.
The dfeditqc function can be used to modify several properties of a query, namely: status, category, type (correction/clarification), use (external/internal), query detail, note, and reply.
dfeditqc usage
var is any database variable (of the current record).
attr is the name of the attribute of interest, from the list DFSTATUS, DFQCVAL, DFQCPROB, DFQCRFAX, DFQCQRY, DFQCNOTE, DFQCREPLY, DFQCUSE. These are the schema names for the data fields defined in DFqc.dat - the Query database. NOTE: Changes to the value of the DFQCVAL attribute are limited to interactive edit checks only. Attempts to change DFQCVAL in DFbatch are silently ignored.
value is the new value to assign to the attribute.
optional_mode is a numeric value which can be used to suppress the default behavior of showing the Query Edit dialog. If it is omitted or has the value 0, the dialog is displayed and the edit may be accepted, modified, or canceled by the user. If the value is 1 the dialog is not displayed and the edit is applied without confirmation.
The return value is 0 if dfeditqc determines that the requested change is not allowed because: the record status is secondary, edit mode is view only or DDE, the specified data field does not have a query on it, or one or more attribute literals are not from the list above; otherwise, the return value is 1.
The return value does not indicate whether the user actually accepted the edits by choosing OK interactively, or whether query actions were applied in DFbatch.
# Resolve a missing value query if data present
edit ResolvQuery()
{
    string curstatus;
    # use: on exit from a required field
    # resolve a missing value query if the field is no longer blank
    if ( dfqcinfo(@T,DFQCPROB) == "1" ) {
        # if there is a query but it is already resolved, skip
        curstatus = dfqcinfo(@T,DFSTATUS);
        if ( curstatus < 3 || curstatus > 5 ) {
            if ( ! dfblank(@T) ) {
                dfeditqc(@T,DFSTATUS,5,DFQCRFAX,1,DFQCNOTE,"Resolved by ResolvQuery edit check");
            }
        }
    }
}
If DFQCPROB attribute is not specified, edit the first query.
If DFQCPROB attribute is specified, edit the query with the category code.
If an edit check using dfeditqc is run interactively via DFexplore, DFexplore checks that the current record is defined, the mode is not double data entry, the record status is primary, and that the referenced field has a query already on it. If any of these conditions fail, dfeditqc simply returns 0. Otherwise, the query dialog is displayed, unless optional_mode has been set to 1, in which case the edit is applied directly and the query dialog is not displayed.
In the special case of changing the query status to delete (to delete the query) a confirmation dialog is immediately displayed to indicate that the query will be deleted and cannot be undone. Once the query edit dialog or query delete confirmation dialog appears on-screen, the return status of the function is set to 1; choosing OK or Cancel in the dialog is then reflected in the return status as 1 or 0, respectively.
If an edit check using dfeditqc is run in DFbatch, two new elements, EQ and NEQ, record the actions of the function. The first element, EQ, has the same attributes and child elements as the existing Q element. Its attribute values are 0 unless they have been changed by the edit check function call, and the query and note child elements appear only if their respective values have been set by the function. Note that an element value does not indicate whether it was changed from its previous value, only that the value was specified as an argument in the function call. The second element, NEQ, records the number of times that the function was called; again, note that it does not record the number of times that edited values were changed - just the number of times that the function was called.
The behavior in response to changes to DFQCPROB depends upon the study global setting for Allow Multiple Queries per Field. If multiple queries are not enabled, changing the value of the query category code (DFQCPROB) causes the existing query to be deleted and a new query to be added with the specified category code. If multiple queries are enabled, the query with the matching query category code is edited. In both cases, if there is no matching query, no edit is applied.
The dfreplyqc function can be used to add/modify the reply text of an existing outstanding query.
dfreplyqc usage
category is the unique query category.
reply_text is the user-supplied reply text. If not supplied, the user will need to enter the reply text.
optional_mode is a numeric value which can be used to suppress the default behavior of showing the Query Reply dialog. If it is omitted or has the value 0, the dialog is displayed and the reply may be accepted, modified, or canceled by the user. If the value is 1 the dialog is not displayed and the reply, if provided, is applied without confirmation.
# Reply to a category 30 (clinical QC, for example) query
if ( dfreplyqc( @T, 30, "Confirmed", 0 ) )
    dfmessage( "A reply has been added to the query." );
The dfresqc/dfunresqc functions return TRUE/FALSE depending on whether the variable contains a resolved/unresolved query.
dfresqc usage
dfresqc(var, optional_category_code)
dfunresqc(var, optional_category_code)
optional_category_code may be any category code defined in the study's query category map. It is an optional parameter.
If optional_category_code is omitted or has value 0, TRUE/FALSE is returned for the first query.
If optional_category_code is greater than 0, TRUE/FALSE is returned for the query matching optional_category_code.
For dfresqc, TRUE if the variable has a resolved query, FALSE otherwise.
For dfunresqc, TRUE if the variable has an unresolved (including status pending) query, FALSE otherwise.
if ( dfresqc( VDATE ) ) dfmessage( "VDATE has a resolved query." );
The dfqcinfo function returns information about the query on any database variable. The information returned can be selected from any of the fields defined for a query.
dfqcinfo usage
attr is the name of the field of interest and must be one of DFSTATUS, DFVALID, DFRASTER, DFSTUDY, DFPLATE, DFSEQ, DFPID, DFQCFLD, DFQCCTR, DFQCRPT, DFQCPAGE, DFQCNAME, DFQCVAL, DFQCPROB, DFQCRFAX, DFQCQRY, DFQCNOTE, DFQCREPLY, DFQCCRT, DFQCMDFY, DFQCRSLV, DFQCUSE. These are the schema names for the data fields defined in DFqc.dat - the query database.
If optional_category_code is omitted or has value 0, the attribute value of the first query is returned.
If optional_category_code is greater than 0, the attribute value of the query matching optional_category_code is returned.
The return value is the string equivalent of the value in the requested query field of the specified database variable.
If the value is from a coded list, the return value is the code not the label of the code.
If the referenced variable does not exist, or does not have a query, the return value is "".
if ( dfqcinfo(@(T+2), DFQCPROB) == "1" )
    dfmessage( dfvarinfo( @(T+2), DFVAR_NAME ), " has a query for a missing value." );
The dfqcinfo2 function returns information about the (multiple) queries on any database variable. The information returned can be selected from any of the fields defined for a query.
dfqcinfo2 usage
attr is the name of the field of interest and must be one of DFSTATUS, DFVALID, DFRASTER, DFSTUDY, DFPLATE, DFSEQ, DFPID, DFQCFLD, DFQCCTR, DFQCRPT, DFQCPAGE, DFQCNAME, DFQCVAL, DFQCPROB, DFQCRFAX, DFQCQRY, DFQCNOTE, DFQCREPLY, DFQCCRT, DFQCMDFY, DFQCRSLV, DFQCUSE. These are the schema names for the data fields defined in DFqc.dat - the Query database.
A string is returned of all query attribute values delimited by pipe (|).
If the attribute values are from a coded list, the delimited values returned are the codes, not the labels of the codes.
If the referenced variable does not exist, or does not have a query, the return value is "".
dfmessage( dfqcinfo2(@(T+2), DFQCPROB) );
The edit check language provides several functions for dealing with reasons:
The dfaddreason function allows an edit check programmer to add a custom reason to a data field. When a data field is changed by an edit check, the system will immediately attach the static reason text Set by edit check ecname, where ecname is the edit check name, to the changed field, and will replace any existing reason on that field. The dfaddreason function can then be used by the edit check programmer to overwrite the static reason with a custom reason.
dfaddreason usage
var must be a database variable.
text is the text string containing the reason.
optional_mode is a numeric value which determines whether or not the reason dialog box is displayed. If it is omitted or has the value 0 the dialog is displayed and the reason may be accepted, modified, or canceled by the user. If the value is 1 the dialog is not displayed and the reason is silently added.
the actual reason added
"" if the action was canceled or it is not possible to add the reason.
dfaddreason(@T, "clarification from site");
If a DFexplore user does not have the corresponding permission within edit checks - Create Reason permission if there is no reason currently on the field, or Modify Reason by Edit Check permission if there is already a reason on the field - dfaddreason does nothing, regardless of the optional_mode setting.
If a DFexplore user has permission to both create and approve reasons any reasons added using dfaddreason will be automatically approved, but if the user does not have permission to approve reasons any reasons added using dfaddreason will have status pending regardless of the optional_mode setting.
If the user has permission to approve reasons only within edit checks (the shaded or dash setting), then reasons added using optional_mode 1 are automatically approved, while reasons added via the reason dialog always have status pending, whether added manually or through dfaddreason.
The reason text is "sanitized" by replacing each non-printable character with a space/blank and each '|' with a space.
When the reason dialog is displayed in DFexplore both the current reason (if there is one) and the new reason are shown in the dialog. The new reason may be accepted, modified, or canceled by the user. The user cannot apply a new blank reason; a valid text string must be entered before a new reason can be saved; this can be as little as a single valid character.
The return value of dfaddreason is the supplied reason (with any invalid characters converted to blank), or the empty string if the dialog is canceled.
If dfaddreason is executed in batch, the new reason replaces the current reason if one exists. The reason text string is returned by the function, with any invalid characters converted to blank. Reasons will be saved to the database only if the <APPLY which="data"> element is set in the batch file.
The system will automatically generate a Set by edit check ecname reason as soon as a data field is changed, subject to the dfautoreason setting. A subsequent dfaddreason will overwrite this automatically generated reason.
If the edit check changes the data value after a dfaddreason call, then the system will automatically generate another Set by edit check ecname reason that will overwrite the previous reason from the dfaddreason call.
No record is kept of reasons that were not saved.
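A minimal sketch of the ordering described above: the data change comes first, so the automatically generated reason is then overwritten by the custom one (the field name and reason text are illustrative):
FLD1 = 99;                                  # auto-reason "Set by edit check ..." is attached here
dfaddreason(FLD1, "Corrected per site confirmation", 1);  # silently replaces the auto-reason
# any further change to FLD1 after this point would overwrite the custom reason again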
The dfanyreason function is used to determine whether a data field has a reason attached to it. If dfanyreason is used in batch mode, it will only report on reasons that are already present on the field. In batch mode, it will not detect auto-generated reasons added by the execution of edit checks.
dfanyreason usage
Reason status codes
if ( dfanyreason( VDATE ) == 1 ) dfmessage( "VDATE has an approved reason." );
The dfautoreason function can be used to suppress the automatic addition of reasons to data fields that are changed by edit checks.
dfautoreason usage
dfautoreason(0);   # do not add reasons for subsequent data changes
FLD1 = 44;         # no reason will be added for this change
dfautoreason(1);   # turn autoreasons back on for subsequent data changes
FLD2 = 22;         # an autoreason will be added for this change
Whenever a data field is changed by an edit check a reason is automatically added to the field, such as set by edit check EditCheckName. This is the default behavior in both DFbatch and DFexplore. It is not necessary to include dfautoreason(1) to produce this behavior.
Used with caution, dfautoreason(0) can be deployed to suppress the automatic generation of reasons. There is however one important exception: if a data field already has a reason, it cannot be changed without providing a new reason to explain the new value. Thus an autoreason will always be added if needed to replace an existing reason.
Automatic reasons can be added or suppressed for each change made by an edit check. The scope of any call to dfautoreason(0) is limited to the current edit check. It is not possible to disable automatic reason creation across many edit checks with one call to dfautoreason(0).
IMPORTANT: Disabling automatic reasons using this language feature must always be given careful consideration before implementation. Please consider all appropriate regulatory guidance first.
The dfreasoninfo function is used to determine attribute information of the reason, if any, attached to a data field.
dfreasoninfo usage
attr is the name of the field of interest and must be one of DFSTATUS, DFVALID, DFRASTER, DFSTUDY, DFPLATE, DFSEQ, DFPID, DFRSNFLD, DFRSNCDE, DFRSNTXT, DFRSNCRT, DFRSNMDF. These are the schema names for the data fields defined in DFreason.dat - reason for data change records.
The return value is the string equivalent of the value in the requested reason field for the specified database variable.
If the referenced variable does not exist, or does not have a reason, the return value is "".
string status, msg;
status = dfreasoninfo(@(T+2), DFSTATUS);
if ( status == "" )
    msg = "does not have a reason";
else if ( status == "1" )
    msg = "has an approved reason";
else if ( status == "2" )
    msg = "has a rejected reason";
else if ( status == "3" )
    msg = "has a pending reason";
dfmessage( dfvarinfo( @(T+2), DFVAR_NAME ), " ", msg, "." );
Lookup tables provide a mechanism for coding information based on a key field. This mechanism can be used for several applications, including:
Lookup tables are defined in the DFlut_map file in the study lib directory. The DFlut_map file contains entries which link the table name used in edit checks with the file name containing the lookup records. Lookup table files must be stored in the study lut directory, or the DFdiscover lut directory. The study level file has priority if both exist.
Lookup tables themselves consist of entries, each containing a key followed by zero or more fields of return result. There is one entry per line, and fields within each entry are delimited by |. A lookup table must be a plain ASCII text file. If it is not, an error message appears in a blocking warning dialog when DFexplore starts; if DFbatch is used, the error message appears on the command line when DFbatch starts.
Entries can have one of three possible formats.
single field: The field is both the search key and the returned value.
2 fields: The first field is the search key and the second field is the returned value, as in this example:
AIDS|acquired immune deficiency syndrome
3+ fields: The first field is the search key and all other fields are returned as a single |-delimited string that can be parsed using the dfgetfield function.
A detailed description of both lookup table files and DFlut_map can be found in The study lut directory and DFlut_map - lookup table map.
The dflookup function is the interface between the edit check language and DFdiscover lookup tables. The dflookup function can do silent lookups or, in DFexplore, DFweb, and DFcollect, it can display the lookup table, optionally position the cursor and ask the user to make a selection.
dflookup usage
table is the symbolic name of the table as defined in DFlut_map
var is any variable
default is the string return value if no match is found
alg is the search algorithm to use from the following possible values:
-1 Exact match - do not display the lookup table. Return a result for an exact match only, otherwise return default
0 Choose match - always display the lookup table and highlight the first entry in the table. In batch mode, this algorithm always returns default. In interactive mode, the default value is not used and a null string is returned if the lookup table dialog is canceled.
>0 Partial match - always display the lookup table, highlighting the first entry in the table that matches on the specified number of characters. In batch mode, this algorithm always returns default. In interactive mode, the default value is not used and a null string is returned if the lookup table dialog is canceled.
string s; s = dflookup("drugs", @T, "unknown", -1);
If a lookup table contains multiple fields for a key, the return result is the string containing all of the fields after the key. The dfgetfield function can subsequently be used to extract individual fields from the return result.
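For example, assuming a hypothetical drugs table whose entries look like key|class|route, a silent exact-match lookup followed by field extraction might be sketched as below. The argument order of dfgetfield shown here is an assumption; confirm it in the dfgetfield reference section.
# Sketch: exact-match lookup, then extract the first returned field.
# Table entries assumed to look like: ASPIRIN|NSAID|oral
string result, class;
result = dflookup("drugs", @T, "", -1);
if ( result != "" )
    class = dfgetfield(result, 1, "|");   # hypothetical argument order - see dfgetfield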
In DFcollect, lookup tables must have a file size below 40 MB. If an edit check references a lookup table that is 40 MB or larger, the user will see a warning that the lookup table is not available.
The following limitations exist when this function is called from DFopen_patient_binder() and DFopen_study()
It is often desirable to execute certain parts of an edit check several times. An example would be a total dose calculation based on multiple fields on a form. The edit check language provides the while loop construct for this purpose.
The while statement executes its body (either a single statement, or a group of statements surrounded by braces) as long as the condition is true. If the condition is FALSE when the while statement is first started, the body of the loop is not executed and execution continues at the first statement following the end of the while statement.
A simple edit check that prints all the numbers from 1 to 10
edit printnums()
{
    number counter;
    counter = 1;
    while (counter <= 10) {
        dfmessage("count is now ", counter);
        counter = counter + 1;
    }
}
In this example, the variable counter is initialized to 1 (remember that local variables are set to blank when they are declared). The body of the while loop is executed as long as the value counter is less than or equal to 10, causing the 'count is now' messages to be produced.
If you are using a variable as the condition in the while loop it is very important to make sure that the body of the while loop changes the value in such a way that the condition will eventually be false and the while loop is exited. Failure to do this results in an endless loop. The edit check language interpreter places an upper limit on the number of instructions that can be executed to gracefully recover from this programming error.
Compare visit date fields of adjacent visits
edit vdatecheck()
{
    number v;
    v = DFSEQ;
    while (v > 0) {
        if ( vdate[,v,] <= vdate[,v-1,] )
            dferror("Visit ", v-1, " follows visit ", v);
        v = v - 1;
    }
}
In this edit check a local variable, v, is first set to the value of the current visit number (this by default, is the DFSEQ data variable) and then, using a while loop, is successively set to the visit number of the preceding visit until the value reaches zero. A message is printed if the date for a given visit is earlier than the date of the visit that precedes it. This is written under the assumption that the plate on which the vdate variable appears is to be collected at each visit. If visit dates appeared on different plates for different visits, the edit check would have to be modified.
NOTE: This example assumes that all visit/sequence number fields (field 6) in the database have been assigned the DFdiscover name of DFSEQ. Field 6 has this name only if the field is defined to be in the barcode; otherwise the name is user-defined. It is legal however, for field 6 to be assigned a user-defined name of DFSEQ.
Occasionally there is the need to break out of a loop prematurely. In the case of nested loops, the break applies to the innermost while loop.
Example use of break
edit find_headache()
{
    number seq;
    seq = DFSEQ;
    while (seq >= 0) {
        if (AEevent[,seq,] == "headache")
            break;
        seq = seq - 1;
    }
    if (seq >= 0) {
        dfmessage("headache on seq ", seq);
    } else {
        dfmessage("no headache found!");
    }
}
In this example we have an adverse event form and would like to see if the subject had a headache event on a prior form. The edit check looks at all prior adverse event forms until it finds one with AEevent == "headache". When the headache event is found, the edit check prints the sequence number for that form and breaks out of the loop.
NOTE: The local variable seq retains its value when the loop is terminated.
The continue statement also modifies looping behavior, but unlike the break statement, it causes execution to resume at the top of the loop. In the case of nested while loops, the continue statement causes control to be transferred to the top of the innermost active loop skipping the remaining steps in the loop.
If a variable is being used in the while condition, it needs to be adjusted before the continue statement, otherwise an endless loop may occur.
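A sketch illustrating this point: the loop variable is decremented before continue so the loop still terminates (vdose is a hypothetical field name collected at each visit):
# Sketch: sum only the non-blank doses across visits 1..DFSEQ
edit sumdose()
{
    number v, total;
    v = DFSEQ;
    total = 0;
    while (v > 0) {
        if ( dfblank(vdose[,v,]) ) {
            v = v - 1;    # adjust the loop variable BEFORE continue
            continue;
        }
        total = total + vdose[,v,];
        v = v - 1;
    }
    dfmessage("total dose = ", total);
}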
In addition to the built-in functions, the edit check language allows users to define their own functions. User-defined functions allow an edit check programmer to create sections of code that can be reused by different edit checks and, in generic cases, by multiple studies.
Functions are similar to edit checks but some differences exist:
Unlike edit checks, user functions can return values,
Edit checks can be assigned to data fields in the setup tool, whereas user functions are executed when called by an edit check.
User functions have the same general form as edit checks, but instead of beginning with the word edit they begin with the keyword of the type of value which they return. This is followed by the name of the function and the opening bracket, (, just like in an edit check. The function's arguments, if any, are declared next and are followed by the closing bracket, ). This ends the function and argument declaration and is followed by the opening brace, {, and the body of the user function.
User-defined function to add two numbers and report when the sum is greater than 10
number add(number n1, number n2)
{
    number result;
    result = n1 + n2;
    if (result > 10) {
        dferror(result, " is greater than 10");
    }
    return(result);
}
The result of calculations in the user function is returned via the return statement.
Functions can be used for many purposes such as to implement sophisticated calculations that are used over and over again.
User function that calculates a dose level based on the subject's weight and sex
number calcdose(number wt, number sex)
{
    if (dfmissing(wt) || dfmissing(sex))
        return(0);
    if ((wt < 40) || (wt > 400)) {
        dferror("Subject weight error");
        return(0);
    }
    if (sex == 1)
        return(200 + .2*wt);
    if (sex == 2)
        return(100 + .1*wt);
    dferror("Unable to calculate dose. Sex not 1 or 2.");
    return(0);
}
The last two lines may seem unnecessary, given that the function has already returned if sex is missing and there are only two valid codes for sex. But it never hurts to be cautious.
Several edit checks can now share the calcdose function and if any dosage calculations need to be changed, only the calcdose function needs to be altered.
Calling a user defined function from an edit check
edit drugcheck()
{
    number dose;
    dose = calcdose(lbs, sex);
    if ((dose != 0) && (DRUGDOSE > dose)) {
        dferror("Drug overdose given!");
    }
}
The #include directive can be used to include other files into a DFedits file. This can be useful in cases where a number of user functions or common edit checks are to be shared across multiple studies.
To include another file in a DFedits file, use the following syntax:
#include "other filename"
where other filename is the file to be included, represented as a string. All include files must be located in the study ecsrc directory or in the DFdiscover ecsrc directory. The study level file has priority if the same file exists in both places.
Including DFedits.gen in the DFedits file
# include my generic edit checks
#include "DFedits.gen"
# the rest are my local edit checks
edit checkAge() {
...
Included text is treated exactly the same way as text that appears directly in the DFedits file. A side effect of this is that care must be taken when naming edit checks. It is not valid to define an edit check named checkAge in a file to be included and define an edit check with the same name in the local edit check file. To reduce this possibility, consider naming generic edit checks with a common prefix such as GEN_.
NOTE: It is not possible to start a line with the comment phrase #include as this is interpreted as an include directive. In such a case, use # include to start a comment with the word include (note the intervening space).
Earlier we saw this simple edit check to convert pounds to kilograms.
Convert pounds to kilograms
Although syntactically correct the edit check does not take into account the possibility that the lbs field might be missing or contain an illegal value.
Steps for a more rigorous solution
A more rigorous solution might include the following steps.
Ignore the edit check if lbs is blank or contains a missing value code.
Only calculate kgs if lbs contains a legal value.
If lbs does not contain a legal value add a query to lbs.
Convert pounds to kilograms with error checking
edit lbs2kgs()
{
    # exit if lbs is blank or has a missing value code
    if ( dfmissing(lbs) )
        return;
    else if ( lbs >= 40 && lbs <= 400 )
        kgs = lbs / 2.205;
    else
        dfaddqc(lbs,2,"Please verify weight.",1,2,"");
}
The most difficult task in writing edit checks is to decide exactly what you want the edit check to do. The most appropriate action might be viewed differently by different users and might depend on various conditions involving legal and missing values for one or more other variables. Using the previous compliance check as an example, is it the intent to:
insert a calculated value of compliance into the database only when it is not already recorded?,
if ( dfblank( compliance ) ) compliance = 100 * ( dispensed - returned ) / dispensed;
insert a calculated value of compliance into the database only when the record is at or below a certain validation level?,
if ( DFvalid <= 3 ) compliance = 100 * ( dispensed - returned ) / dispensed;
or warn the user when the calculated value does not match the recorded value?
number calculated;
calculated = 100 * ( dispensed - returned ) / dispensed;
if ( !dfblank(compliance) && calculated != compliance )
    dferror( "Reported compliance of ", compliance, " does not match calculated compliance of ", calculated );
NOTE: Remember that an edit check can be executed at different times i.e. on entry to the plate, on entry to a field, on exit from the plate or on exit from a field, or at multiple times, e.g. on exit from the field and again on exit from the plate. Also remember that a field entry or exit edit check will be executed every time that the field with the edit check is traversed, and skipped entirely if the user scrolls past the field, never entering it.
Most edit checks are written as field exit edit checks and placed on the last field of those involved in the edit check so that all information has been validated before the edit check is executed. But only you can determine the best time to execute a particular edit check.
Be careful with illegal values, missing values, blank fields, existing queries, and calculated fields that have already been filled. Errors can easily arise from failure to account for unexpected data.
It is up to the programmer to design edit checks that take appropriate account of the variety of conditions which may arise. Although more time consuming to write, carefully crafted edit checks are more likely to do what you really want, and to be easily accepted by those performing data entry and validation for the study.
Finally, keep the following in mind with respect to assigning calculated values to database variables. The FDA's Guidance for Industry: Computerized Systems Used in Clinical Trials explicitly states
"Features that automatically enter data into a field when that field is bypassed should not be used."
Hence, consider carefully edit checks that assign values to data fields. If the data value is a reported value on the CRF, a preferred alternative is to compute the expected value of a field and then compare it against the recorded value, signaling an error when the two do not match.
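A sketch of this preferred pattern, reusing the lbs/kgs fields from the earlier conversion example; the tolerance of 0.5 is an illustrative assumption, not a recommendation:
# Sketch: compare the recorded kgs against the value computed from lbs
# instead of assigning it.
edit kgscheck()
{
    number expected;
    if ( dfmissing(lbs) || dfblank(kgs) )
        return;
    expected = lbs / 2.205;
    if ( kgs < expected - 0.5 || kgs > expected + 0.5 )
        dferror("Recorded kgs ", kgs, " does not match computed value ", expected);
}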
Like any language, the quality of edit check code benefits from repeated practice and use - the experience of the edit check's creator. Your first challenge is learning the language - what does the syntax allow me to do? Getting your first edit check to compile is a rewarding feeling.
As you learn the syntax, you then make decisions about the semantics of the edit checks code - is this the desired behavior or result of the edit check? Do I display a message, add a query, prompt the user to fix something? Syntactically, one can write several different edit checks to solve a problem - they each compile. But which one has the desired effect?
As you become even more proficient, it's worthwhile spending time and effort to optimize your edit checks. Can they execute quicker? Is the edit checks code clear to another reader? The suggestions in the following sections are intended to give you resources to increase the efficiency of your edit checks.
Each edit check takes time to execute. Most are extremely quick, hardly noticeable. Some take more time, and that time can become noticeable. Imagine the impact of such an edit check if it is executed once on each CRF, or five times on each CRF. What is the impact if the user is geographically located several time zones away, or on a slow internet connection?
It is worthwhile improving the execution time of every edit check. If you can save 0.1 sec per edit check, over the duration of a study this can amount to hours or even days.
One of the most valuable, yet most expensive, features of edit checks in terms of execution time is the ability to request data for other CRFs, other visits, and even other subjects. Each such request must be answered by the database and hence adds execution time.
The edit checks language executes every clause in conditional statements. It does not implement shortcut evaluation, which you may have experienced in other programming languages.
In DFexplore, edit check logic is executed in the client and the edit check definition file, DFedits.bin, is stored on the server. This means that the file must be transmitted to DFexplore from the server each time that DFexplore starts. Edit checks that are included, but never referenced, should be removed to help reduce the size of the file that is transmitted.
In DFexplore, an edit check cache is maintained for each record request. If a record request matches the subject ID of the currently open subject binder (which it typically would), the result of the first request is added to the cache and the result for each subsequent request is pulled from the cache. Results for requests for other subject IDs continue to come directly from the database.
In this example, there are 5 requests for data records from the database. 3 are fulfilled by database responses and 2 (the repeated requests for var1[,21,5] and var2[,22,5]) are fulfilled from the local cache. The request for var1[99001,11,1] names a different subject ID and is therefore always fulfilled directly from the database.
if (var1[,21,5] >= 100)
    dfmessage( msg1 );
if (var2[,22,5] < 150 && var1[,21,5] >= 150)
    dfmessage( msg2 );
if (var1[99001,11,1] == 1)
    dfmessage( msg3 );
if (var2[,22,5] < 150)
    dfmessage( msg4 );
Some conditional tests can be simplified with one or more else clauses. Some conditional tests may not be necessary. Consider this example:
if ( Date[,0,1] < "01/01/16" )
    value = 1;
if ( Date[,0,1] >= "01/01/16" && Date[,0,1] < "01/01/17" )
    value = 2;
if ( Date[,0,1] >= "01/01/17" )
    value = 3;
The Date[,0,1] >= "01/01/16" test is not necessary if an else clause is used. Similarly, Date[,0,1] >= "01/01/17" is not necessary.
The logic can be simplified:
date v1date = Date[,0,1];
if ( v1date < "01/01/16" )
    value = 1;
else if ( v1date < "01/01/17" )
    value = 2;
else
    value = 3;
Function calls in edit checks have a cost. If the same function is called several times in the same context, it may be more efficient to store the function result in a local variable and use the local variable. For example:
if ( dflevel() == 1 || dflevel() == 2 || dflevel() == 3 ||
     dflevel() == 4 || dflevel() == 5 )
    # do something
can be more efficiently written as:
number mylevel = dflevel();
if ( mylevel == 1 || mylevel == 2 || mylevel == 3 ||
     mylevel == 4 || mylevel == 5 )
    # do something
or even more efficiently as:
number mylevel = dflevel();
if ( mylevel >= 1 && mylevel <= 5 )
    # do something
There is no shortcut evaluation in DFdiscover edit checks. Hence it can be more efficient to simplify multiple conditions. In this example, all 6 conditions are evaluated even if the first one, or any one, fails.
if ( condition1 && condition2 && condition3 &&
     condition4 && condition5 && condition6 )
    something_happens;
return;
Although it appears longer, this re-writing will execute quicker because any failed condition will prevent the other conditions from being tested.
if ( !condition1 ) return;
if ( !condition2 ) return;
if ( !condition3 ) return;
if ( !condition4 ) return;
if ( !condition5 ) return;
if ( !condition6 ) return;
something_happens;
return;
Messages that are displayed for the user, or written to a new query, benefit from having as much detail in them as possible. This often involves inserting context into the message.
string msg1 = "The visit date " + dfvarinfo( @T, DFVAR_STRING_VALUE ) +
    " is in the future.";
string msg2 = "The subject reported " + @(T-1) +
    ". Was follow-up prescribed?";
if ( dfblank( @T ) )
    return;
...
In this example, the work to construct the two messages is immediately wasted if the simple exclusion test is true.
A more efficient solution delays construction of the messages until they are actually needed:
string msg1, msg2;
if ( dfblank( @T ) )
    return;
msg1 = "The visit date " + dfvarinfo( @T, DFVAR_STRING_VALUE ) +
    " is in the future.";
msg2 = "The subject reported " + @(T-1) +
    ". Was follow-up prescribed?";
...
While some edit checks are unique, i.e. used only once in the entire study database, other edit checks may need to be repeated at several locations. In such cases there is considerable advantage in being able to create a generic edit check that can be written once and then applied to all of the data fields on which it needs to be executed. To do this we need a way of referring to data fields other than by explicit variable names, as the relevant variable names will likely change depending on the field to which the generic edit check is attached.
As an example consider a list of medical history items (diabetes, hypertension, etc.) with a Yes/No question followed by a Duration question for each history item. Duration is only required when the Yes/No question is answered Yes. The following is a generic edit check that could be applied to each duration field to check such medical history questions.
edit medhx()
{
    if( @T==0 || dfblank(@T) ) {
        # duration is zero or blank
        if( @(T-1)==1 )
            dferror("duration is missing");
    } else {
        # a duration has been specified or marked missing
        if( @(T-1)==2 )
            dferror("should history = Yes?");
    }
}
Note the use of @T and @(T-1) to refer to data fields in the above example. @T refers to the data field to which the edit check is attached. Other fields can be referenced relative to this field by adding or subtracting the desired number of fields. Thus @(T-1) refers to the field immediately before @T, @(T+1) refers to the field immediately after @T, and @(T+5) refers to the fifth field following @T. Don't forget the brackets: @T+1 adds 1 to the value of the field to which the edit check is attached, whereas @(T+1) refers to the value of the field following the field to which the edit check is attached. Writing an edit check this way allows you to attach the same edit check to different fields in the CRF which share the same meaning and variable block structure.
In the previous example, the edit check is attached to the duration field and the Yes/No question is thus @(T-1). In this example we have assumed that 1=Yes and 2=No. The edit check generates a message if duration is missing when a history item is marked Yes, and also complains if a duration has been specified or marked with a missing value code when the history variable has been answered No.
Why did we attach the edit check in this example to the duration field and not to the Yes/No question? In general it is best to write edit checks with the intention of attaching them to the last field in the block of fields to be checked. This allows data entry to be completed on all fields within the block before the edit check is executed. Data entry staff will quickly get frustrated with edit checks if they pop up messages just as they are about to make a correction which would have fixed the problem. Also, if the edit check is executed too soon, it may fail to detect an error condition that only becomes apparent after the block of fields has been completed.
Generic edit checks may also include references to data fields on other plates, but must do so explicitly, by using the variable name of each such field.
In the following examples, any variable names that are not locally declared are assumed to be the names of variables specified in the study schema.
If "Other" Race check box has been checked, does the "Other, specify" field also contain a value?
This example deals with 2 scenarios. If "Race" = "Other", then the "Other, specify" field must contain a value. If "Race" does not equal "Other", then the "Other, specify" field should not contain a value. With this type of edit check, it is also a good idea to first check for existing queries on either the "Race" or "Other, specify" field. If queries exist, the edit check should probably abort as the problem has already been dealt with.
edit checkrace()
{
    # this is a field exit edit check attached to RACEOTH
    # abort if either RACE or RACEOTH has a query
    if( dfanyqc(RACE) || dfanyqc(RACEOTH) )
        return;
    if( RACE==4 && dfblank(RACEOTH) )
        dfaddqc(RACEOTH,1,"",1,2,"");
    if( RACE!=4 && !dfblank(RACEOTH) )
        dfwarning("Other race was specified when unexpected.");
}
Does the reported date of surgery occur before the study enrollment date?
edit days2surg()
{
    number diff;
    diff = surgDate - entryDate;
    if ( diff < 0 )
        dfwarning( "Surgery occurred ", -diff, " days before enrollment?" );
}
Does the subject meet all eligibility criteria?
This example also shows one way to use local variables that are symbolic names for choice values. [ This can be useful because the current version of the edit checks language does not interpret references to choice or check variable labels. ]
edit isElig()
{
    choice no, yes;
    yes = 1;
    no = 2;
    if ( crit1 == yes && crit2 == yes && crit3 == yes &&
         crit4 == yes && crit5 == yes && crit6 == yes ) {
        if ( elig != yes )
            dfwarning( "Subject meets eligibility criteria",
                "but is not reported as eligible." );
    } else {
        if ( elig == yes )
            dfwarning( "Subject has not met all eligibility",
                "criteria but is reported as eligible." );
    }
}
Compute age from birth date and compare with Age field
The local variable age is declared at the beginning of the edit check. Within the body of the edit check, age must be coerced to an integer because it is possible that the calculation used to assign a value to age will yield one or more decimal places. If the age value contains decimals, age and the database variable AGE will never match when compared because AGE is defined in the database as an integer.
edit checkage()
{
    number age;
    string s, m;
    if( dfmissing(AGE[,0,1]) || dfunresqc(AGE[,0,1]) ||
        !dflegal(BDATE) || dfunresqc(BDATE) ||
        !dflegal(EDATE) || dfunresqc(EDATE) )
        return;
    age = int( (EDATE - BDATE)/365.25 );
    if( age != AGE[,0,1] ){
        m = "Age (" + (s=age) + "yrs) computed from birth" +
            " and entry dates\ndiffers from age (" + (s=AGE[,0,1]) +
            "yrs) on Form 1.";
        if( dfask(m + "\nAdd Query.",1,"Yes","No")==1 )
            dfaddqc(BDATE,3,
                m + "Explain below and resend corrected page.",
                1,1,"");
    }
}
If an adverse event has been checked on a follow-up form, has the adverse event report been received?
This is an example of a cross-plate consistency check. This particular test can be dependent on the order that CRFs are validated from the new queue and hence should be programmed carefully. If the adverse event report is the subsequent page to the follow-up form during new record entry, it will not be present in the database when the follow-up form's edit checks are executed. A possible solution, as shown here, is to only execute this edit check when the follow-up form has been validated to at least level 1.
edit haveAEreport()
{
    number AEplate = 10;   # Using a variable as a symbolic name
    if ( DFVALID < 1 )
        return;
    if ( hadAE == 1 )      # the hadAE choice box was checked "yes"
        # The assumption in this example is that the AE
        # form has the same visit number as the visit that
        # it is reported on.
        if ( dfmissingrecord( ,,AEplate ) )
            dfwarning( "An adverse event was reported but" +
                " the AE report has not been received." );
}
Is the subject's weight within the normal range for their gender?
In some cases a single legal range will not apply to all subjects. Rather than settle for a single range in the study schema that is wide enough to accommodate all subjects, we can write an edit check to apply different legal ranges to different subjects.
edit weightLegal()
{
    if ( sex == 1 ) {        # male
        if ( weight < 120 || weight > 275 ) {
            dfwarning( "Weight is outside normal range for males." );
            dfillegal( weight );
        }
    } else if ( sex == 2 ) { # female
        if ( weight < 90 || weight > 210 ) {
            dfwarning( "Weight is outside normal range for females." );
            dfillegal( weight );
        }
    }
}
Compute body mass index
This example computes a value for a database variable, bmi, that is stored in the database but is not on the original CRF. Remember that all you need to do to enable this is leave placeholders for these computed variables in your database schema when defining the CRF in DFsetup. This example also uses the dfmoveto function to move to the bmi variable if it can be computed, or to skip to a subsequent field if it cannot.
edit storeBmi()
{
    if ( !dfmissing( weight ) && !dfmissing( height ) ) {
        # store the computed value and move to it
        bmi = weight / ( height * height );
        dfmoveto( bmi );
    } else
        # skip over bmi and move to dbp
        dfmoveto( dbp );
}
Send an e-mail notification of elevated blood pressure
The first part of the example uses two pieces: the edit check and the shell script that it calls. Note that this could all be done within the edit check, as is shown in the second part of the example, but it could be harder to read.
The following example will send mail to a user named 'brian' each time a subject with elevated blood pressure is encountered when validating new records.
edit bpsendmail()
{
    # send only on initial entry (validation = 0)
    if ((DFVALID == 0) && (dbp > 95 || sbp > 140))
        dfexecute("bpalert", "brian ", ID, " ", DFSEQ, " ",
            dbp, " ", sbp);
}
Note the use of extra spaces in the dfexecute call. These spaces are needed to ensure that the parameters passed to the script are not combined into one argument.
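To illustrate the point (a hypothetical sketch; the subject ID value shown is invented), compare the same call with and without the separating spaces:

```
# Without a separating space, adjacent parameters are combined:
dfexecute("bpalert", "brian", ID);    # script sees one argument, e.g. "brian100001"
# With an explicit trailing space, they arrive as separate arguments:
dfexecute("bpalert", "brian ", ID);   # script sees "brian" and "100001"
```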
The bpalert script might be written as follows:
#!/bin/sh
# bpalert - alert user in $1 that subject $2, visit $3
# has elevated blood pressure dbp = $4, sbp = $5
mail -s "HYPERTENSION ALERT" $1 <<END
SUBJECT = $2, VISIT = $3
DBP/SBP = $4 / $5
END
Use batch edit checks to verify consistency of subject initials
This example uses the dfbatch function to first determine if this edit check is being executed in batch or interactive mode. If dfbatch evaluates to TRUE, the edit check is being executed in batch and consistency checking will occur. If this edit check is triggered interactively, it will not execute. As this edit check executes only in batch mode, it is important to include a useful error message.
edit batchcheckinitials()
{
    # check consistency of subject initials in batch mode
    string m = "Different initials: visit 0 plate 1 = ";
    if( !dfbatch() )
        exit;
    if( @T != PINIT[,0,1] ) {
        m = m + PINIT[,0,1] + " visit " + @[6] +
            " plate " + @[5] + " = " + @T;
        dferror(m);
    }
}
Check visit date order
Errors in visit dates can result in incorrect overdue visit and next visit calculations. This edit check verifies that visit dates are in the expected order. This edit check makes use of the group statement to define an array of visit date variables for visits 1 through 6. It will abort if it comes to a visit date variable that is blank or contains a missing value code. It is important that the error message is descriptive enough to help the user identify which visit date is likely in error.
edit checkvisitdates()
{
    # check order of visit dates across visits 1 to 6
    # this is a field exit edit check attached to the variables
    # used as VisitDates
    number v=1, flag=0;
    string s, m = "Visit Dates are not in order:";
    group vd vdt[,1,5], vdt[,2,5], vdt[,3,5],
             vdt[,4,5], vdt[,5,5], vdt[,6,5];
    if( dfmissing(@T) )
        exit;
    while( v<=6 ){
        m = m + "\nVisit " + (s=v) + " " + (s=vd[v]);
        if( v<@[6] && !dfmissing(vd[v]) && vd[v]>=@T ){ flag=1; m = m + "?"; }
        if( v>@[6] && !dfmissing(vd[v]) && vd[v]<=@T ){ flag=1; m = m + "?"; }
        v = v + 1;
    }
    if( flag==1 )
        dferror("Visit Date Problem(s)\n",m);
}
The following message would be displayed in a pop-up dialog box during data entry if the edit check detected a visit date discrepancy.
Define a look-up table of study investigators
There are 2 steps involved in defining this edit check. The first step is to create a file containing a list of investigator names (e.g., $STUDY_DIR/lib/investigators.lut). For example:
Dr. Joseph Smith
Dr. Sally Brown
Dr. Jane Doe
Dr. Michael Green
The file investigators.lut must also be registered in $STUDY_DIR/lib/DFlut_map. For example, the following entry would be appended:
INVESTIGATORS|investigators.lut
The second step is the definition of the edit check.
edit investigatorpicklist()
{
    # this is a field exit edit check attached to the
    # Investigator style
    string s, m;
    # look for an exact match, if found abort
    s = dflookup("INVESTIGATORS",@T,"",-1);
    if( !dfblank(s) )
        return;
    # display pick list for user selection
    s = dflookup("INVESTIGATORS",@T,"",4);
    if( !dfblank(s) ){ @T=s; return; }
    # if user does not pick a value from pick list and leaves
    # the field blank, add a missing value query
    if( dfblank(@T) )
        dfaddqc(@T,1,"",1,2,"");
    else {
        # if user does not pick a value from pick list and enters
        # some other value, add an illegal value query
        m = @T + " is not a registered investigator.";
        dfaddqc(@T,2,m,1,2,"");
    }
}
Define a generic function that will fill fields in the specified range of field numbers with the specified value
A generic function such as this can be referenced by name within any other edit check in the study. If you define generic functions in your DFedits file, be sure to define them at the beginning of the file. In this example, the generic function is called GEN_Fill, f and f2 define [by their numeric position] the range of fields that are to be filled, and value represents the string value that is to fill the specified fields.
# GEN_Fill(f,f2,value)
# enter the specified value into fields in the specified range
number GEN_Fill(number f, number f2, string value)
{
    while(f<=f2){
        @[f]=value;
        f = f + 1;
    }
}
Note that the function body does not actually return a [number] per its declaration. In such a case, the edit check environment returns a 0 value from the function, but this behavior should not be relied on. It is better to be explicit.
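A sketch of the explicit alternative (a hedged example, assuming the language permits returning a value as the number declaration suggests; returning the count of filled fields is one reasonable, invented, choice):

```
# GEN_Fill(f,f2,value) - variant with an explicit return value
number GEN_Fill(number f, number f2, string value)
{
    number filled = 0;
    while( f <= f2 ){
        @[f] = value;
        filled = filled + 1;
        f = f + 1;
    }
    return filled;   # explicit, rather than relying on an implicit 0
}
```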
Here are some tips to help debug edit checks:
Use dferror/dfmessage/dfwarning statements at strategic places in your edit check.
Remember that local variables are initialized to blank and any operation on blank data produces a blank result.
Variables are passed by value and not by reference - it is not possible to add a query to a function argument from within a function.
Be careful to use a double equals, ==, in if statements if you are trying to compare values. A single equal, =, is valid and performs an assignment and test for non-zero. The compiler will produce a warning message if you try to assign values in an 'if' statement.
If the Check Syntax function in the setup tool reports problems, the actual error may be on the line just before the reported line number.
Do not use the same name for both edit checks and database variables. If you do, the compiler will issue an error message.
Do not use the same name for a local variable and a database variable. If you do, the local variable declaration hides the database variable until the end of the edit check or function.
The single quote, ', and the double quotes, ", are not interchangeable. Use double quotes whenever you are dealing with strings. Single quotes return the ASCII value of the single character they contain.
Use raw data entry to quickly test correct behavior.
Think about what happens if fields contain missing values.
Think about what happens when fields used in your calculations have queries attached to them.
Think about what happens if your edit check is executed again.
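Two of these tips, the strategic trace message and the double-equals comparison, can be illustrated with a small sketch (hypothetical; all names are invented):

```
edit debugdemo()
{
    number x;
    x = 2;
    # if ( x = 1 ) ...  would ASSIGN 1 to x and test for non-zero,
    # so the branch would always be taken (the compiler warns here)
    if ( x == 1 )        # correct: comparison, taken only when x equals 1
        dfmessage( "x is 1" );
    # strategic trace message while debugging:
    dfmessage( "debugdemo: x = ", x );
}
```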
The shell level program DFcompiler can be used to compile the edit checks for DFexplore. Edit checks can be compiled from the command line as follows:
% DFcompiler -s # -o DFedits.bin DFedits
It is easiest to run DFcompiler from the study ecsrc directory and DFedits.bin will be saved in the same ecsrc directory.
Edit checks can also be compiled by selecting Publish in the Edit Checks dialog of DFsetup. Edit checks are loaded automatically when DFexplore, DFweb or DFcollect starts and can also be reloaded following a new compile, in DFexplore only, from the File > Reload > Edit checks menu.
The recommended procedure, when it is necessary to implement and test new edit checks or modifications of existing edit checks, is to use DFsetup's Development and Production studies feature. Among other benefits, this allows study users to continue using the existing edit checks without interference until the new version has been fully tested. It is also recommended that previous production versions of DFedits be preserved in case you ever need to rollback. A simple way to do this, for a study administrator, is simply to rename DFedits to DFedits_yyyymmdd, where yyyymmdd is the date on which the DFedits file was replaced by a new version.
Allowable characters: A-Z, a-z, 0-9, _
No length limit
Case sensitive
First character must be a letter
Modifiers: \n (new line), \r (carriage return), \t (tab), \f (form feed), \\ (backslash)
Maximum length: 1024 characters
The edit check language interpreter limits the number of instructions that can be executed per edit check to 1,000,000. This limit is imposed to provide endless-loop detection and prevention.
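For example (a hypothetical sketch), a while loop whose control variable is never advanced would run until this limit stops it:

```
number v = 1;
while( v <= 6 ){
    # ... work on visit v ...
    # Omitting the increment below would loop endlessly;
    # the interpreter aborts the edit check at 1,000,000 instructions.
    v = v + 1;
}
```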
Reserved words cannot be used as edit check or variable names.
Batch is a framework for edit check execution in unattended (batch) mode. The framework specifies which data records are to be operated upon by edit checks and what is to happen when edit checks are invoked. It uses the same edit checks that are used interactively through DFexplore and introduces no changes to the language.
From this point forward we'll refer to batch edit checks as DFbatch, the name of the program that oversees the batch edit checks process.
With DFbatch, it is possible to:
With DFbatch all of this is possible in an unattended mode.
This chapter contains overview descriptions as well as step-by-step instructions for most of the tasks you will perform with DFbatch.
IMPORTANT: Read the entire chapter before proceeding. DFbatch is a very powerful, and potentially dangerous if misused, application. We recommend that you read each section before using DFbatch for the first time. If you do not have time to read everything before starting, read DFbatch Basics and Using DFbatch at a minimum.
If you have previously used DFbatch, Using DFbatch, BATCHLIST Element Reference, and BATCHLOG Element Reference are good reference sections.
Limitations of DFbatch are outlined in Limitations.
A great deal of the value added by DFbatch is in the log information that it records. The syntax and semantics of the log information is described in Reference Pages. The XML nature of the log information lends itself to presentation in an HTML-enabled, web browser environment.
Sample control files and their purpose are listed in Example Control Files. Where file examples are given throughout the chapter, they may be fragments of a complete file, or a complete file themselves. If the example file begins with the XML declaration,
<?xml version="1.0"?>
it is the complete file, otherwise it is only a fragment from a file.
Common Pitfalls and System Messages describes common pitfalls and enumerates all of the possible error messages.
DFbatch can be thought of as an additional layer that sits on top of the edit checks language, in much the same way that DFexplore is another layer, albeit interactive, that sits on top of the same edit checks.
DFbatch interacts with the study database and the edit checks language in the same manner as DFexplore
DFbatch reads the edit checks file as input, in the same way that DFexplore does. Where DFexplore provides interactive control over the records that are to be processed, DFbatch provides a language, stored in a control file, wherein the records to be processed can be specified non-interactively by the user. That specification can subsequently be applied to a study database on a regular (or irregular!) basis, in an efficient and unattended manner. This causes DFbatch to retrieve from the study database, one-by-one, the records selected by the control file, and apply the edit checks, field-by-field, to every field where they are referenced. Data field changes and queries are sent back to the study database, and all messages are logged. This is equivalent to retrieving sets of task records in DFexplore and traversing over each field of each record with the keyboard or mouse to invoke the edit checks.
While DFbatch is useful in most situations where edit checks are processed, it is not necessarily appropriate in all situations. Before proceeding to implement DFbatch, consider the following situations where it is inappropriate:
Application of edit checks to level 0 records Batch edit checks cannot be applied to level 0 (new) records. If your workflow model is such that most edit checking must be performed during validation of new records, batch edit checks may not be appropriate.
Creation of queries that require user confirmation In batch, calls to dfaddqc() must always return with either a success status (equivalent to the user clicking OK) or a fail status (equivalent to clicking Cancel). It is not possible within a batch for some function calls to return success and others to return failure. If there is some uncertainty in the response (possibly because additional external information that is not available to the edit check must be reviewed), DFbatch is not appropriate.
Display of lookup tables DFbatch returns the default response for dflookup() in all circumstances unless an exact match can be found. If matching must be performed by a visual search of the lookup table, DFbatch is inappropriate.
Changes to record key fields The record key fields cannot be modified in batch; this includes the study number, plate number, visit number, and subject ID.
On the other hand, DFbatch is appropriate in many more situations where interactive edit checks are not workable or painfully inefficient.
Cross-plate edit checks Records are generally processed in the order that they are received within a document. In an interactive setting, it is very possible that the other plate involved in a cross-plate check is not yet in the database. Typically, the logic of the edit check will gracefully fail because the other plate is missing. However, in batch, the likelihood of this occurring is substantially reduced, and when it does occur it likely represents a plate that is truly missing.
Re-application of edit checks When the logic of an edit check changes mid-study, re-applying edit checks interactively via DFexplore is too time consuming for all but the smallest databases. With DFbatch, re-applying edit checks is trivial.
Application of new edit checks Using DFbatch, it is easy to add new edit checks over the course of a study and apply them retrospectively to existing data.
Improved audit information DFbatch logs a great deal of information about the environment in which edit checks execute. This can prove to be valuable debugging or audit information that is simply not available during interactive edit checks.
DFbatch is a command-line utility as well as a view in DFexplore. This section describes the command-line version of DFbatch which, apart from how it's invoked, is identical to the DFexplore DFbatch view. The command-line version of DFbatch is invoked from a command prompt or a scheduled cron or Windows Task Scheduler task.
DFbatch reads a control file of instructions as input, uses those instructions to select records from the study database, and executes all referenced or requested edit checks for those records, sending additions and changes back to the study database. At the same time, all actions taken by DFbatch are logged to an output file.
The output is a complete record of what happened during the execution of DFbatch. It is created so that incremental changes can be easily identified and is in a format that is amenable to post-processing. It is by default post-processed to an HTML document, which is only one possible view of the log results. The post-processing can be skipped, delayed, or subsequently re-applied to create a different view of the same results. DFbatch goes to great lengths to separate the creation of log information from its presentation.
It should be noted that DFbatch does not bypass the normal database activities that already define DFdiscover. Database records are requested from the study server in the same manner as existing applications, permissions and record locking are enforced, and changes are similarly sent back to the study server, where updates are performed and journal records are written. Record locks are observed so that DFbatch cannot modify a record that is currently locked by another DFdiscover application.
Certain aspects of edit checks are intrinsically interactive and do not lend themselves naturally to a batch environment. The dfask() edit check function, for example, is an interactive function where the user selects one of the two presented responses. Lookup table functions, via dflookup(), are also inherently interactive. Fortunately, each of these functions also requires the programmer to provide a default response as an argument. DFbatch uses this default response as the function return value in batch mode. This allows an edit check that would otherwise require user interaction to complete in a non-interactive environment.
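For example, the dfask call from the earlier age-checking example supplies a default response (here assumed to be the second argument, 1, corresponding to the "Yes" button). A hedged sketch of how that call behaves in batch, following the default-response rule just described:

```
# Interactively the user chooses Yes or No; in batch, dfask()
# immediately returns the supplied default response, here 1 ("Yes"),
# so the query is always added when this point is reached in batch mode.
if( dfask( m + "\nAdd Query.", 1, "Yes", "No" ) == 1 )
    dfaddqc( BDATE, 3, m, 1, 1, "" );
```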
Perhaps the easiest way to learn DFbatch is to look at an example.
DFbatch needs a batch control file as input. This must be created by the user in advance of invoking DFbatch. The purpose of the batch control file is to specify criteria that select records from the study, optionally name edit checks to be executed (by default all edit checks referenced by variables of selected records are executed), and state actions to be applied as the edit checks are executed.
Example batch control file
<?xml version='1.0'?>
<BATCHLIST version='1.0'>
  <BATCH name="simple">
    <TITLE>This is a very simple batch</TITLE>
    <DESC>This batch executes edit check aeCoding on all
    records that are currently at level 2.</DESC>
    <ACTION>
      <APPLY which="none"/>
      <LOG which="data msg qc" file="simple_out.xml"/>
    </ACTION>
    <CRITERIA>
      <LEVEL include="2"/>
      <EDIT>aeCoding</EDIT>
    </CRITERIA>
  </BATCH>
</BATCHLIST>
The first thing that you will notice is a new language. The batch control file is specified in eXtensible Markup Language, XML. In XML parlance, the input file is a document. [ It turns out that the log output from DFbatch is also written in XML. ] There are many public domain tools (written in Java, Perl, C, etc.) that are able to parse and transform XML. Some web browsers are even able to display XML directly.
XML is a markup language and the markup is in elements and attributes. The extent of an element is marked with tags, namely a start tag and an end tag. The start tag and end tag for an element have the same name but different notation. The start tag is denoted with <TAG> and the matching end tag is denoted with </TAG>. The terminology is important: for the element named TAG, the start tag is <TAG> and the end tag is </TAG>.
The content of an element is the data between the start and end tags. The relationships in the content are achieved by nesting elements.
In the example,
BATCHLIST
BATCH
DESC
ACTION
LOG
CRITERIA
LEVEL
EDIT
are elements. Each document must have exactly one element, called the root, within which all other elements are nested. In the example, BATCHLIST is the root element. It is an error for any text to appear after the end tag of the root element.
Nested elements must be properly balanced or the document will be invalid, meaning that it cannot be parsed.
Properly balanced, nested elements
<A><B></B></A>
Improperly balanced, nested elements
<A><B></A></B>
In this example the relationship between A and B is ambiguous, so the document is not valid.
Elements that have no content are called empty elements. Empty elements are denoted with a single tag that represents both the start tag and the end tag. The notation for an empty tag is <TAG/>. In the example,
APPLY, LOG, and LEVEL
are empty elements. Empty elements are typically used to convey information via their attributes.
Data about elements can be kept in nested elements or, if the data is itself not structural, in attributes. In the example,
version is an attribute of BATCHLIST
name is an attribute of BATCH
which and file are attributes of LOG
include is an attribute of LEVEL
The example control file contains a single BATCH named simple. The batch has a brief TITLE and a more verbose description in DESC. These two elements are present for documentation purposes only and do not influence the processing.
The records to be processed are selected by the CRITERIA element. In this example, only those records that have a current validation level equal to 2 and reference the aeCoding edit check are selected. For each selected record (those records that have a validation level of 2), the data fields are traversed in tab order and the edit check aeCoding is executed at every data field that references it by name through at least one of the variable's plate enter, field enter, field exit, or plate exit attributes. Other edit checks are not executed. Any actions that cause messages to be generated, queries to be added, or data field values to be changed are logged to the file simple_out.xml. The changes are not however applied to the database as is indicated by the which attribute of the APPLY element.
Of equal importance to the behavior that is stated is the behavior that is not stated. In particular:
There is nothing in the input control file that specifies which study it applies to. DFbatch control files are meant to be re-used across studies; hence the input file itself does not state which study it belongs to. The association with a particular study is made when DFbatch is executed.
Many possible retrieval criteria are not stated. As is true in the task set building dialog of DFexplore, criteria that are not stated match everything for that criterion. For example, since no selection has been made by subject ID or visit number, all subject IDs and visit numbers are included.
The example control file must be saved to a named file that is accessible to the DFbatch program. There are two possible locations:
Stored on the server. Server-side, study-specific control files must be kept in the STUDY_DIR/batch directory.
Stored locally. Locally stored control files can be kept anywhere on the system that you have access to.
For file naming conventions, we recommend that the control file use the name of the batch as the basename with a _in.xml extension (meaning input XML file). For example, a control file named simple for study 254 which is stored on the server where study 254 has a STUDY_DIR of /opt/studies/val254 would be saved to /opt/studies/val254/batch/simple_in.xml.
To process the control file, DFbatch needs to log in to your DFdiscover server in the same way that DFexplore does when you log in. If you need to use a proxy server to access the internet, DFbatch automatically uses the same proxy settings as DFexplore; to change these settings, use the Proxy Setting window in DFexplore. Next, you need to tell DFbatch the name of your DFdiscover server, your username, and your password. You can do this using the -S, -U and -C command line options, by setting the DFSERVER, DFUSER and DFPASSWD environment variables, or by using DFpass. DFbatch is invoked with the command:
% cd /opt/studies/val254/batch
% DFbatch -S server.somedomain.com -U datafax -C passwd 254 -i simple_in.xml
The login, study number and control file arguments can appear in any order. For more invocation options, see Invoking DFbatch.
The result of the command is the execution of the batch and the creation of the log file, /opt/studies/val254/batch/simple_out.xml on the server (as requested by the file attribute of the LOG element). The log file records the details of the batch execution but is not post-processed in any way.
A more common way of running DFbatch is to process the batch, applying any actions and creating the log file as before, and then immediately post-processing the log to create a viewable HTML page. This is achieved with -p xsl as in:
% cd /opt/studies/val254/batch
% DFbatch -S server.somedomain.com -U datafax -C passwd -p xsl 254 -i simple_in.xml -o somedir/simple.html
The option requests post-processing via a default XSL transformation (supplied with DFbatch) and the result is written to simple.html, typically in a directory that can be viewed from a web browser. [ The HTML output is explicitly re-directed to an output file at execution time (rather than making that file an attribute of the input control file) to allow for the various web-publishing infrastructures that are present at DFdiscover installations. ]
Alternatively, an existing log file can be transformed at any time into an HTML file using DFstyle, a companion program to DFbatch. The invocation of DFstyle requires a log file created by a previous invocation of DFbatch (not an input control file).
% cd /opt/studies/val254/batch
% DFstyle -p xsl simple_out.xml > somedir/simple.html
The exit status of DFbatch reflects the success of the command. If the exit status is 0, DFbatch executed successfully. Any other exit status indicates that an error occurred. The text of the error will appear in one of the last elements of the log file.
Testing DFbatch exit status in C-shell
% /opt/dfdiscover/bin/DFbatch -S server.somedomain.com -U datafax -C passwd 254 -i simple_in.xml
% echo $status
0
In this section, we have introduced DFbatch and worked through our first example demonstrating creation of an execution log file and an HTML file.
The next section covers the control file layout, syntax, and semantics in greater detail.
This section covers the DFbatch control file language in detail and also provides further guidance on how DFbatch can be used.
The control file is an XML document that must address two needs:
it must state selection criteria for records to retrieve and/or edit checks to execute, and
it must state actions to take as edit checks are applied.
The document root element of the control file is always BATCHLIST. A BATCHLIST contains one or more BATCH elements. A BATCHLIST that contains zero BATCH elements is an empty input file. Although this is syntactically valid, it has no semantic purpose, and results in no processing. Most users will define one BATCH per control file as this is an easy organizational way to think about batches.
The basic processing element of the control file is a BATCH. BATCHes appear sequentially within the input and cannot be self-nested (that is, a BATCH may not contain another BATCH).
The general format of an input file
<?xml version="1.0"?>                               (1)
<BATCHLIST version="1.0">                           (2)
  <BATCH name="batch1">                             (3)
    <TITLE>title of the batch</TITLE>               (4)
    <DESC>description of the batch</DESC>
    <ACTION>                                        (5)
      action directives
    </ACTION>
    <CRITERIA>                                      (6)
      record selection criteria
      edit checks to be included
    </CRITERIA>
  </BATCH>
  <BATCH name="batch2">
    <!-- another batch definition appears here -->  (7)
  </BATCH>
</BATCHLIST>
The input file elements serve one of four purposes:
Descriptive These elements, TITLE and DESC, are present for identification and documentation purposes only.
Processing Directive These elements, CONTROL and MOVETO, allow the user to modify the behavior and limits of DFbatch as it executes.
Action These elements, ACTION, APPLY, LOG and ODRF, are used to specify what actions are to be taken as edit checks are executed.
Record and Edit check Selection These elements, CRITERIA, IDRF, ID, PLATE, VISIT, CREATE, MODIFY, LEVEL and STATUS, are used to select from the database the records that are to be processed, and optionally, EDIT, to select the specific edit checks that are to be applied to the fields of those records.
Certain characters have special meaning to an XML parser and hence cannot appear directly in an XML document. This includes the input control file. To use one of these five special characters in an XML document, the corresponding entity must instead be used.
Special characters and their corresponding entities

&   &amp;
<   &lt;
>   &gt;
"   &quot;
'   &apos;
Use of entity for special character
<?xml version='1.0'?>
<BATCHLIST version="1.0">
  <BATCH name="special">
    <TITLE>Run checkElig &amp; checkDemo</TITLE>
    <DESC>Among other things these edit checks test that
      the subject's age is &gt;= 20 and &lt;= 50.
    </DESC>
    <!-- note that &, <, >, ', and " can appear inside comments
         because comments are not parsed -->
    ...
  </BATCH>
</BATCHLIST>
Record selection criteria are similar to those found in the By Data Fields retrieval option of DFexplore. Retrieval criteria for records are by: [site ID, subject ID (not subject alias), visit, plate, level, status, creation date, modification date]. These map respectively to the elements: [SITE, ID, VISIT, PLATE, LEVEL, STATUS, CREATE, MODIFY]. These criteria are specified as nested elements of the CRITERIA element. If any element is omitted (or empty), the records to be selected are not constrained by this element. For example, if PLATE is omitted (or empty), there is no restriction on plate and hence all plates are retrieved, subject to the other criteria. [ Retrieval by center and pattern are currently excluded. ] Where multiple elements are given as criteria, the selected records must satisfy the inclusion criteria of each element, that is, the selected records are the intersection set of the sets of selected records satisfying each element individually.
Each element is expected to appear at most once. If an element (accidentally) does appear more than once, the last element overrides any previous occurrences.
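As a sketch of these two rules (element values chosen for illustration), the following hypothetical CRITERIA selects the intersection of level 2 records and plate 3 records; the second PLATE element overrides the first:

```xml
<CRITERIA>
  <LEVEL include="2"/>
  <PLATE include="1"/>  <!-- overridden by the later PLATE element -->
  <PLATE include="3"/>  <!-- effective: records must be level 2 AND plate 3 -->
</CRITERIA>
```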
Each criterion allows a value, range, and/or lists of both. These are expressed as the value of the include attribute as in:
include="val,min-max,val2"
Selection criteria for all final records of visits 1 through 5, 10, and 20 through 29 inclusive
<CRITERIA> <STATUS include="final"/> <VISIT include="1-5,10,20-29"/> </CRITERIA>
The CRITERIA element accepts one optional attribute, sort. The value of the attribute determines what sort order, if any, is applied before processing to records that meet the selection criteria. The value is constructed from one or more sort keys (precedence for multiple keys is left to right) from the list: [id, visit, plate], where multiple keys are separated by ; and each key is preceded by either + to indicate that the key is sorted in ascending order, or - to indicate that the key is sorted in descending order.
Use of the sort attribute to sort selected records on ascending id, then descending visit, and then descending plate
<CRITERIA sort="+id;-visit;-plate"> .... </CRITERIA>
It is also possible to select records by an existing DFdiscover Retrieval File (DRF). This is specified with the IDRF empty element and the file attribute as in:
Select records specified by IDRF
<IDRF file="test.drf" />
The element instructs DFbatch that the records to be processed in the batch can be found in a file named test.drf located in the study /drf directory.
Select records specified by IDRF in a sub-folder using relative pathname
<IDRF file="sub_folder/test.drf" />
Select records specified by IDRF in a sub-folder using absolute pathname
<IDRF file="/STUDY_DIR/drf/test.drf" />
In the above two examples, the element instructs DFbatch that the records to be processed in the batch can be found in a file named test.drf located in the study /drf directory.
The use of IDRF is mutually exclusive with the other selection criteria and it is an error for the two to appear together in one batch's CRITERIA. Additionally, records selected by the IDRF element are processed in the order that they appear in the file, ignoring any value assigned to the sort attribute of the parent CRITERIA.
By default, all of the edit checks that are referenced by data fields on the records selected by the criteria are executed.
For each selected record, DFbatch performs the following steps
The data fields of the record are traversed in the normal order (typically plate top to plate bottom), executing any plate enter edit checks that are defined at each field. A plate enter edit check may change the traversal order as the result of a dfmoveto() call.
Starting at the first data field, the data fields are traversed in order. At each field any field enter edit checks, and then any field exit edit checks are executed. A field exit edit check may change the traversal order as the result of a dfmoveto() call.
The data fields of the record are again traversed in the normal traversal order, executing any plate exit edit checks that are defined at each field. Note that a dfmoveto() call always returns 0 in a plate exit edit check.
It is possible to include only certain edit checks to be executed by naming them in EDIT elements within the CRITERIA element. Records and fields are still traversed in the same manner but only those edit checks that match one of the included names are executed. This can be useful, for example, when a new edit check is introduced and requires testing.
Execute only the aeCoding edit check on plate 5 records
<CRITERIA> <PLATE include="5" /> <EDIT>aeCoding</EDIT> </CRITERIA>
The EDIT element is the only element that is allowed to repeat within CRITERIA. A single EDIT element may also list multiple comma or space separated edit checks by name in the element body.
Equivalent specifications for multiple edit checks
<CRITERIA> <EDIT>aeCoding</EDIT> <EDIT>checkInit</EDIT> <EDIT>sigLookup</EDIT> </CRITERIA>
is equivalent to
<CRITERIA> <EDIT>aeCoding,checkInit,sigLookup</EDIT> </CRITERIA>
When EDIT elements appear in the criteria, DFbatch optimizes the retrieval by automatically excluding from selection those plates that contain no variables which reference any of the named edit checks. For example, if the edit check named checkElig is referenced only by variables on plate 2, only records from plate 2 are selected even if PLATE does not appear in the criteria. However, if the PLATE element is present and includes a plate other than 2, the resulting set of selected records will be empty as there can be no records that reference both the checkElig edit check and are from a plate other than 2.
Edit checks are capable of changing (or assigning) data field values, adding queries to fields, adding and deleting missing page queries for records other than the current record, and generating information, warning, and error messages. This leads to three important categories of actions:

data: changes to data field values
qc: queries added via dfaddqc() or dfaddmpqc(), or deleted via dfdelmpqc()
msg: messages generated via dfmessage(), dfwarning(), and dferror()
In an interactive environment, the user has the ability to visually review and confirm or cancel each of these actions. In DFbatch this is not possible. Instead the batch control file must carefully state which actions will be allowed to proceed unattended and which ones will be canceled. This is specified in the ACTION element and its APPLY, LOG, and ODRF child elements.
The descendants of ACTION and their attributes
<ACTION>
  <APPLY when="changes|all" level="1|2|3|4|5|6|7"
         which="none msg data qc"/>
  <LOG   when="changes|all" which="none msg data qc"
         file="logfile_out.xml" mode="create|write"
         share="yes|no" history="yes|no"/>
  <ODRF  when="changes|all" which="none msg data qc"
         file="outfile.drf" mode="create|write"
         share="yes|no"/>
</ACTION>
The APPLY element controls which categories of actions are applied back to the database.
IMPORTANT: Use with care. This is the only element that controls whether or not DFbatch can request irreversible database changes, and hence it must be used with care. If you are uncertain about the behavior of a batch control file or the edit checks that it may execute, use <APPLY which="none"/>.
The attributes of APPLY have the following semantics:
when This attribute must have a value of all or changes. If the value is all, every selected record is sent back to the database as an update, even if the record contains no differences from its previous state. If the value is changes, only those records that contain changes are sent back to the database for update.
level This attribute must be a single integer in the range of legal validation levels, 1 through 7 inclusive. It indicates the validation level to assign to records when they are sent back to the database for update. This causes DFbatch to behave in a manner equivalent to Validate mode in DFexplore. If this attribute is not specified, the validation level of any record sent back for update is not changed. This is equivalent to Edit mode in DFexplore.
which This attribute has a value that is one or more space delimited categorical keywords from the list: [none, data, msg, qc]. data indicates that all records with changed data fields should be updated back to the database. qc indicates that all queries added via dfaddqc() or dfaddmpqc(), modified by dfeditqc(), or deleted via dfdelmpqc(), should update the database. msg is intended for logging purposes only and is present here for symmetry. none indicates that no changes to data or queries should be updated back to the database; this is the recommended value for this attribute in the context of APPLY.
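Drawing the APPLY attributes together, a hedged sketch (attribute values chosen purely for illustration) of an APPLY element that updates only changed records, assigns validation level 3 on update, and applies both data field and query changes back to the database:

```xml
<!-- illustrative values only: updates changed records, assigns level 3,
     and applies data field and query changes back to the database -->
<APPLY when="changes" level="3" which="data qc"/>
```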
The LOG element controls which categories of actions are logged to file.
The attributes of LOG have the following semantics:
when This attribute must have a value of all or changes. If the value is all, every selected record, and every selected edit check, is logged to file, even if the edit check or record created no changes or messages. If the value is changes, only those records and edit checks that created a change or message are logged to file.
which This attribute has a value that is one or more space delimited categorical keywords from the list: [none, data, msg, qc]. data indicates that all records and edit checks which would change data fields should be logged to file. qc indicates that all queries that would be added via dfaddqc() or dfaddmpqc(), modified via dfeditqc() or deleted via dfdelmpqc(), should be logged. msg requests that all messages generated via dferror(), dfwarning(), and dfmessage() be logged. none indicates that no actions should be logged. The recommended value for this attribute in the context of LOG is data msg qc, as in which="data msg qc".
file This attribute specifies the pathname of the file that will store the log information. The location of the batch control file (server-side or local) determines where the log file will be stored. The attribute is not required if logging has been disabled with which="none".
If the log file location is local and the pathname is given as a relative pathname, it is assumed to be relative to the directory that contains the input file. If a pathname is given, and the log file will be stored on the server, the pathname portion is discarded and only the filename portion will be used. If the attribute is missing but logging has been specified, DFbatch will construct a filename for the log as follows:
The directory name of the input file is the base.
Append the value of the batch's name attribute.
Append the fixed suffix, _out.xml.
It is also possible to include a sub-folder as part of the filename specification, using either a relative or an absolute path. The filename can be given as sub_folder/xx_out.xml, which is a relative path, or STUDY_DIR/batch/sub_folder/xx_out.xml, which is an absolute path. In both cases, the output file will be created in STUDY_DIR/batch/sub_folder/xx_out.xml. It is not possible to include ../ as a sub-folder: including ../ in any pathname will lead to an error.
mode This attribute specifies whether the log file should be created, create, or (over)written, write.
share This attribute specifies whether the log file is readable and writable by other members of the same group, yes, or by the creator only, no. This attribute is applicable only to locally stored log files.
history This attribute specifies whether DFbatch should distinguish between log entries that are new to this execution of the batch and entries that were present in previous executions, yes, or not, no.
If history is requested, it is required that the file be created with mode="write" and that the name of the log file and the batch criteria not change between executions. That is, DFbatch must be able to read an existing log file to create a previous execution history and then overwrite that log file with the history and new entries.
The purpose of this history mechanism is not to build a complete historical view containing all log entries ever generated. Instead it creates the log file containing only the entries from the current execution, flagging each of the entries to indicate whether this is the first time that it has appeared or whether it first appeared in a previous execution. Only those entries which are generated by the current execution are included when history is enabled.
The entries in the batch output are the same whether history is enabled or not. If history is yes, the entries are divided by date to show in which run each entry first appeared. If history is no, all entries are shown together.
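Drawing the LOG attributes together, a hedged sketch (attribute values chosen purely for illustration) of a LOG element that logs only changes, records all three action categories, and maintains execution history:

```xml
<!-- illustrative values only; history="yes" requires mode="write" -->
<LOG when="changes" which="data msg qc"
     file="simple_out.xml" mode="write"
     share="yes" history="yes"/>
```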
The ODRF element controls which records are written to a DFdiscover Retrieval File (DRF).
The attributes of ODRF have the following semantics:
when This attribute may have a value of all or changes. If the value is all, every selected record is written to the DRF, even if the record had no changes or messages. If the value is changes, only those records that had a change or message are written to the DRF.
which This attribute has a value that is one or more space-delimited categorical keywords from the list:
data: all records and edit checks which would change data fields should create a DRF record
qc: all queries that would be added via dfaddqc() or dfaddmpqc(), modified via dfeditqc(), or deleted via dfdelmpqc(), should create a DRF record
msg: all records for which messages are generated via dferror(), dfwarning(), and dfmessage() are written to the DRF
none: no DRF should be created (this is equivalent to not specifying the ODRF element)
The presence of a result DRF allows for the subsequent review of records that were (if changes were applied) or would have been (if changes were logged) processed by the batch.
file This attribute specifies the file that will receive the output DRF records. The attribute is not required if DRF output has been disabled with which="none".
Output DRF files are written to the study /drf directory, and must have the .drf suffix, e.g. test.drf. If the attribute is missing but output DRF has been specified, DFbatch will construct a filename for the DRF.
It is also possible to include a sub-folder as part of the filename specification, using either a relative or an absolute path. The filename can be given as sub_folder/xx_out.xml, which is a relative path, or /STUDY_DIR/batch/sub_folder/xx_out.xml, which is an absolute path. In both cases, the output file will be created in /STUDY_DIR/batch/sub_folder/xx_out.xml. The filename cannot contain ../ or /.. to alter the file path; doing so will lead to an error.
mode This attribute specifies whether the DRF should be created, create, or (over)written, write.
share This attribute specifies whether the DRF is readable and writable by other members of the same group, yes, or by the creator only, no. This attribute is applicable only to locally stored DRF files.
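Drawing the ODRF attributes together, a hedged sketch (file name and attribute values chosen purely for illustration) of an ODRF element that writes to the study /drf directory only those records with data or query changes:

```xml
<!-- illustrative values only; the .drf suffix is required -->
<ODRF when="changes" which="data qc"
      file="simple.drf" mode="write" share="no"/>
```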
For the current implementation of DFbatch, the input file must contain only the elements and attributes that have been described and are valid for the document type definition. The presence of unknown elements or attributes will cause reading of the batch to fail, resulting in the batch not being processed. This will likely change in a future release.
For additional descriptions of each element and the attributes (and their purpose) that are valid for each element, consult BATCHLIST Element Reference.
The simplest way to invoke DFbatch is from the command-line with the command
% DFbatch -S server -U username -C password -i control_file study#
where the -S, -U and -C options, and the -i control_file and study# arguments, are all required. The -S, -U and -C command line options supply user credentials for authentication, although a much better approach is to use DFpass as described in “User Credentials” and DFpass. The control_file can be located locally, if the infile contains a path, or it can exist on your server in the STUDY_DIR/batch directory.
This DFbatch command processes, in document order and in the context of the selected study, all of the BATCH elements that appear in control_file.
Executing DFbatch on study 254 with the control file /opt/studies/val254/batch/test_in.xml
% DFbatch -S idemo44.datafax.com -U datafax -C passwd 254 -i test_in.xml
DFbatch accepts several other command-line options:
-O out_dir_name
xlsx_file
-X out_dir_name
DFbatch is ideally a batch process, run during off-hours. In the UNIX environment this is easily accomplished with the cron program. To use cron, simply include the needed DFbatch command-line equivalents in the crontab file. For additional information on using cron, refer to the UNIX man pages.
Using cron to run DFbatch at scheduled times
0 23 * * * /opt/dfdiscover/bin/DFbatch -S idemo44.datafax.com -U datafax -C passwd 254 -i test_in.xml
Remember that the environment and path variables defined by your login are not available to cron, and hence the full path, /opt/dfdiscover/bin/DFbatch, must be given explicitly.
NOTE: DFbatch exit status. When executing DFbatch several times with a sequence of control files (as might be done in a shell script), it is recommended to check the exit status of each execution before continuing with the next. Refer to DFbatch exit status.
Recall that the root document element, BATCHLIST, can contain one or more nested BATCH elements. The option -b batchname(s) can be used to select for processing specific batches by their name. For example, the command:
% DFbatch -S idemo44.datafax.com -U datafax -C passwd -b simple 254 -i test_in.xml
processes only the batch named simple from the control file, independent of how many other batches are defined. If the control file defines only the batch named simple, then execution of DFbatch with and without -b simple is equivalent. However, if the control file defines three batches named simple, hard, and difficult, it would be possible to process only a subset of the defined batches with a command like
% DFbatch -S idemo44.datafax.com -U datafax -C passwd -b "simple difficult" 254 -i another_in.xml
which would process the two batches simple and difficult while skipping the batch named hard.
When multiple batches appear in the control file, and multiple batches are processed, default ordering specifies that the batches be processed in the order that they appear in the file. Using -b it is possible to alter the processing order at invocation time without altering the control file.
Consider the following skeleton of a control file.
<?xml version="1.0"?>
<BATCHLIST>
  <BATCH name="b1"> ... </BATCH>
  <BATCH name="b2"> ... </BATCH>
  <BATCH name="b3"> ... </BATCH>
</BATCHLIST>
Invoked in the default fashion,
% DFbatch -S idemo44.datafax.com -U datafax -C passwd 254 -i simple_in.xml
DFbatch will process batches b1, then b2, and finally b3. Order of processing can be altered on an ad-hoc basis through use of -b, as in the following example that processes batch b3 first, and then b1, skipping b2:
Process batches named b3 and b1
% DFbatch -S idemo44.datafax.com -U datafax -C passwd -b "b3 b1" 254 -i simple_in.xml
NOTE: If non-default ordering is commonly achieved with this method, it is recommended that the control file be edited to permanently re-order the batches.
The log information recorded by a batch execution is also in XML format. This makes it amenable to further processing, filtering, or transformation (also called styling). Of course, post-processing is only meaningful if the control file uses a LOG element and the value of the which attribute is not none; otherwise, there is no log information available to post-process.
[ For this release, the only post-processing that is supported by DFbatch is transformation of the log output via XSL. ] This is accomplished with one or more XSL style sheets that are able to transform the log information into HTML without affecting the log file itself. HTML is the most common output format but is certainly not the only one. An XSL transformation could just as easily create another XML file, plain text output, or even PDF. The simplest method for specifying post-processing is at DFbatch execution time with the command:
% DFbatch -p xsl study control_file > htmlfile
This command processes, in the same manner as already described, the batch in control_file in the context of study study. When the processing completes the log information is immediately post-processed with the default XSL transformation to create an HTML view of the log information that is stored to htmlfile. Note that the log file is not changed which allows post-processing to re-occur at a future time with the same XSL transformation, or a different XSL transformation to achieve different views of the same log. This post-processing of XML via XSL (or other) transformation is the key ingredient that allows DFbatch to separate content creation from presentation.
DFbatch comes with a default file for XSL transformation that is stored in /opt/dfdiscover/lib/xsl/batchlog.xsl and is referenced by the file /opt/dfdiscover/lib/stylesheets.xml. It is possible to create other XSL transformations and incorporate them into /opt/dfdiscover/lib/stylesheets.xml. In such a case, it is possible to post-process the log information with an alternate, non-default, XSL transformation using the command:
% DFbatch -p XSL=name study control_file > htmlfile
which applies the XSL transformation named name to the log file, instead of the default transformation. Note that the case of the option is important: -p xsl requests the default transformation, -p XSL=name requests a specific transformation.
So far, we have only seen how to post-process a log file at the time of DFbatch execution. This unfortunately does not de-couple log creation from presentation as DFbatch must be run again to re-do the post-processing. De-coupling log creation from presentation requires an additional program, DFstyle.
The DFstyle program can be invoked at any time to apply a transformation to an XML file (not just a batch log file). DFstyle is invoked with either:
% DFstyle -p xsl XML file > htmlfile
which applies the default transformation, or
% DFstyle -p XSL=name XML file > htmlfile
which applies the transformation named by name. Notice that XML file can be any XML file, although in the context of DFbatch, it is an existing log file (not control file) from a previous execution. Also notice that the study number is not supplied to DFstyle - it does not need the context of the study number to transform the log file.
Using DFstyle in conjunction with DFbatch, it is now possible to independently execute a batch to create a log, and then transform that log information into another format for viewing, thus achieving the de-coupling of content creation from presentation.
Complementary uses of DFbatch and DFstyle
Remember that DFstyle can transform a log file independent of when the log file was created. As a result the following two scenarios produce the identical HTML file.
The first scenario performs the post-processing at DFbatch execution time,
% DFbatch -S idemo44.datafax.com -U datafax -C passwd -p xsl 254 \
    -i /opt/studies/val254/batch/example_in.xml \
    -o /opt/studies/val254/batch/htmlfile
while the second uses DFstyle to perform the post-processing at some, potentially much, later point in time:
% DFbatch -S idemo44.datafax.com -U username -C passwd 254 -i /opt/studies/val254/batch/example_in.xml
% DFstyle -p xsl example_out.xml > htmlfile
If multiple XSL transformations are defined, it is also possible to create multiple presentations of the same log file.
Creating multiple presentations
% DFbatch -S idemo44.datafax.com -U datafax -C passwd -p xsl 254 \
    -i /opt/studies/val254/batch/example_in.xml \
    -o /opt/studies/val254/batch/htmlfile1                       (1)
% DFstyle -p XSL=secondtransform example_out.xml > htmlfile2     (2)
% DFstyle -p XSL=thirdtransform example_out.xml > htmlfile3      (3)
The challenge in using DFstyle to transform a log file is to know the name of the log file that was created by a batch control file. Consistent naming between batches and input/log files simplifies this challenge.
Keep the following issues in mind as you learn about and begin to use batch edit checks.
Decide on file naming conventions and stick to them. There are potentially many input control files and output log and drf files that can be created, and naming conventions will help you to keep them straight. Minimally, try using suffixes to distinguish between file types, for example _in.xml for input control files, _out.xml for log files, and .drf for retrieval files.
If possible, define only one BATCH per input control file. Use the name of the batch as the base name of the input file.
Use the APPLY element sparingly. Instead, use LOG and ODRF as often as possible. Remember that the APPLY element will cause actual, irreversible changes to be made to the database. There is no rollback feature in DFbatch.
Further remember that the owner of created queries, and the last modifier of changed data, becomes the person that executed DFbatch. This implies that different batches should be run by different people, likely at different validation levels, just as if the edit checks were being tripped by individuals during interactive validation. Certain batches will lend themselves to being run by first level data entry, at validation level 1, while others, such as adverse event and medication coding, will lend themselves to being run by subsequent reviewers at higher validation levels. Alternatively, you may decide to make one person responsible for running all batch edit checks. DFbatch does not require or restrict either implementation method.
Perform extensive testing of edit check logic in DFexplore. Test all edit checks extensively in DFexplore before using them in DFbatch. In the 10-20 seconds required to validate a record interactively, DFbatch will process hundreds of records. It is much easier to deal with a single edit check error than potentially hundreds of the same error.
Re-usability of batches versus specificity. Do certain types of batch control files lend themselves to being generic while others are study specific? Where multiple batches need to be defined, is it better to define them in one BATCHLIST with multiple BATCHes, or in multiple BATCHLISTs with one (or a few) BATCHes each?
One reason to place multiple BATCHes in one BATCHLIST is order of execution: each BATCH is executed in the order in which it is encountered in the file, reading the file from top to bottom.
Executing DFbatch from cron. What timing issues need to be considered before running DFbatch from the UNIX cron?
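For example, a crontab entry (the batch file name and schedule here are illustrative only) might run a batch nightly at 02:30, a time chosen to avoid overlapping scheduled backups and interactive validation sessions:

```
# min hour dom mon dow  command
30    2    *   *   *    DFbatch -S idemo44.datafax.com -U datafax -C passwd 254 -i /opt/studies/val254/batch/nightly.xml
```

Remember that any error output not captured in a log file is mailed to the crontab owner by cron.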
Try to limit how much a batch does. Avoid input control files that execute all edit checks on all records. This can lead to very large log files that are difficult to post-process and subsequently apply history to. [ Experience to date has shown that log files larger than 5MB in size cannot really be post-processed efficiently. ]
Primary records only. Remember that only primary records are processed in batch.
Consider dedicating a validation level to DFbatch. If DFbatch is being used to make changes to database records and add/delete queries, consider using one of the validation levels only for the purpose of marking records that have been through batch processing. For example, assume that all data records go through two levels of review by data clerks (first review moves record from level 0 to 1, and second review moves record from level 1 to 2) resulting in database records (hopefully final) at level 2. If the next step in the data review process required application of all edit checks to all level 2 records, level 3 could be used as an indication of records that had been through batch processing. The input control file would look something like:
<?xml version="1.0"?>
<BATCHLIST version="1.0">
  <BATCH name="level2">
    <TITLE>Process all level 2, primary records</TITLE>
    <DESC>This batch processes all level 2, primary data records,
      moving them to level 3 after they are processed.</DESC>
    <ACTION>
      <APPLY when="all" which="data msg qc" level="3"/>   (1)
      <LOG when="changes" which="data msg qc"             (2)
           history="no"/>
    </ACTION>
    <CRITERIA sort="+id;+visit;+plate">
      <STATUS include="primary"/>
      <LEVEL include="2"/>                                (3)
    </CRITERIA>
  </BATCH>
</BATCHLIST>
when="changes"
Log only the changes (this is not required, but does reduce the amount of log information recorded when edit checks do nothing and records are not changed).
Also, turn off history. Because of the way that this control file is designed, the history setting makes no difference here: since level 2 records are always promoted to level 3, the selected set of records is always different, so the same log entries can never repeat.
This section describes the behavior of interactive edit check functions in the non-interactive environment of DFbatch, and lists items that cannot and should not be done with DFbatch.
Certain features of the edit checks language are intrinsically interactive, for example, the dfask() function. When these features are encountered in DFbatch, they must be handled intelligently and consistently. The following describes the behavior of edit check functions when executed in DFbatch. Functions that are not listed behave identically in both environments.
dfask( query, dflt, accept, cancel ) always returns dflt.
dflookup( table, var, dflt, method ) always returns dflt when method has any value other than -1. If method=-1, dflookup behaves the same in both environments, returning the result element from table if an exact match can be made, dflt otherwise.
WARNING: Improper coding of dflookup without consideration to this behavior in DFbatch can lead to unexpected results. In particular, previously coded fields can be replaced with the default.
Improper usage
The following use of `dflookup` would *always* replace the value in the current variable with the default parameter, `""` in this case.
string s;
s = dflookup( "TABLE", @(T-1), "", -1 );
if ( s != "" )
    @T = s;
else
    @T = dflookup( "TABLE", @(T-1), "", 0 );
The intention of the edit check appears to be a good one: store an exact match if it can be found, otherwise display the table and store the selected result. In DFexplore, this works fine. However, in DFbatch, the second invocation of dflookup() will never display the table and will always return "", which after the assignment, effectively erases any previous value in @T.
An example of a general implementation strategy for dflookup that considers the behavior in DFbatch is given below.
Implementation strategy for dflookup
# Look-up the value in @(T-1) and store the result in @T
if ( dfbatch() ) {
    # Use the result only if an exact match is available and then
    # only if the field is not already coded
    result = dflookup( "TABLE", @(T-1), "", -1 );
    if ( result == "" )
        return;
    if ( dfblank( @T ) )
        @T = result;
    else if ( result != @T )
        dferror( "Previously coded value of ", @T, " and new result ",
                 result, " do not match." );
} else {
    # insert existing dflookup code here
}
dfillegal() always returns 0. Setting the field color to the illegal value color is only meaningful in DFexplore.
dfbatch() always returns 1 in DFbatch and 0 in DFexplore.
The behavior of dfaddqc(), dfeditqc(), dfaddmpqc(), and dfdelmpqc() is controlled by the which attribute of the APPLY element. If APPLY contains the attribute which="qc", the behavior of these functions is equivalent to the user interactively always choosing OK, that is, the queries are always added/deleted. If the value for which does not contain qc, the behavior of these functions is equivalent to the interactive user who always chooses Cancel, that is, the query operations are always canceled.
The behavior of dfmessage(), dfwarning(), and dferror() is controlled by the which attribute of the APPLY or LOG elements.
All other errors that would cause a dialog to appear in interactive edit checks cause a log message to be written to the batch log file, if there is one, or standard error, otherwise. For example, if the referenced lookup table is not a plain ASCII file, an error message will appear on the command line when the batch is initialized.
The following actions are not possible with DFbatch.
It is not possible to change the status of any primary record to secondary in batch.
It is not possible to change the value of any key field of a record. This means that the plate, visit, and ID numbers may not be changed, regardless of whether or not those fields are barcoded. This limits the usefulness, in batch, of edit checks that calculate and assign the visit number of a CRF.
It is not possible to apply batch edit checks to new (validation level 0) records, or records with status missed.
It is not possible to assign a validation level that is greater than the maximum permitted to the user by their DFdiscover permissions.
It is not possible to query or interact with the user when something unexpected happens. Most unexpected events in DFbatch result in the writing of a message to the log file followed by immediate termination of the batch execution.
The FDA's Guidance for Industry: Computerized Systems Used in Clinical Trials explicitly states
Features that automatically enter data into a field when that field is bypassed should not be used.
Although not stated, the spirit of the guidance relates to reported values. Calculated or derived values that are not part of the source record should be exempt from this guidance.
To be safe, however, we do not encourage creating or processing edit checks that blindly assign values to data fields. A preferred alternative is to compute what the expected value of a field is and then compare it against the recorded value, signaling an error when the two do not match.
Comparing calculated and reported values
Rather than simply replacing reported values, edit checks should be written to compare calculated and reported values, issuing an error message when the two do not match.
# s contains the calculated value for what should be in @T
s = "some calculated value";
if ( s != @T )
    dferror( "The expected value, ", s, ", and the reported value, ",
             @T, ", do not match." );
Should it be necessary to add or change data values using edit checks, DFdiscover will, by default, automatically create a reason indicating that the change was made by an edit check rather than manually. These reasons have this standard format:
Set by edit check ECname
Going with the old saying that "a picture is worth a thousand words", this section is a collection of example control files.
Execute all edit checks on all records, logging their actions without actually applying them
<BATCHLIST>
  <BATCH name="batch1">
    <TITLE>All edit checks on all records</TITLE>
    <ACTION>
      <!-- this APPLY statement isn't actually needed as the default
           is to apply none, but being explicit about it never hurts -->
      <APPLY which="none" />
      <LOG which="data msg qc" file="batch1_out.xml" mode="write"/>
    </ACTION>
    <CRITERIA>
    </CRITERIA>
  </BATCH>
</BATCHLIST>
Execute all edit checks on all incomplete, level 1 records for plates 1 through 5, applying no changes, and logging only the messages generated
<BATCHLIST>
  <BATCH name="batch2">
    <ACTION>
      <APPLY which="none" />
      <LOG which="msg" file="batch2_out.xml" mode="write"/>
    </ACTION>
    <CRITERIA>
      <STATUS include="incomplete" />
      <LEVEL include="1" />
      <PLATE include="1-5" />
    </CRITERIA>
  </BATCH>
</BATCHLIST>
Perform AE coding on all level 4 records, assigning the records to level 5 when coding is complete
<BATCHLIST>
  <BATCH name="batch3">
    <ACTION>
      <APPLY which="data" level="5" />
      <LOG which="data" file="batch3_out.xml" mode="write"/>
    </ACTION>
    <CRITERIA>
      <LEVEL include="4" />
      <PLATE include="12" />
      <EDIT>aeCoding</EDIT>
    </CRITERIA>
  </BATCH>
</BATCHLIST>
Notice the statement:
<APPLY which="data" level="5" />
requests DFbatch to make changes to the processed records and change their validation level to 5. Before including this statement in the batch control file, you should thoroughly test that the aeCoding edit check is performing as expected.
Execute the echoWho edit check on all level 1 and 2 records, processing records in the specified order
<BATCHLIST>
  <BATCH name="batch4">
    <ACTION>
      <APPLY which="none" />
      <LOG which="data msg qc" file="batch4_out.xml" mode="write"/>
    </ACTION>
    <CRITERIA sort="-id;+visit;+plate;-img">
      <LEVEL include="1-2" />
      <EDIT>echoWho</EDIT>
    </CRITERIA>
  </BATCH>
</BATCHLIST>
Execute the missingAEreport() edit check on all plate 15 records, adding and logging only queries
<BATCHLIST>
  <BATCH name="batch5">
    <ACTION>
      <APPLY which="qc" />
      <LOG which="qc" file="batch5_out.xml" mode="write"/>
    </ACTION>
    <CRITERIA>
      <PLATE include="15" />
      <EDIT>missingAEreport</EDIT>
    </CRITERIA>
  </BATCH>
</BATCHLIST>
This would be a beneficial edit check in a scenario where plate 15 contains a question like: "Did the subject experience an adverse event? If yes, please submit an adverse event report." The assumption is that the missingAEreport() edit check tests the condition and adds a missing page query if the condition is true. Interactively, we cannot know whether the missing adverse event report is simply the next page in the document, so this check makes more sense in batch.
Experience has demonstrated that the following areas of DFbatch are known to be problematic and may result in confusion at the end-user level:
The control file lists both plate and edit checks selection criteria. The plate list that results from the intersection of these two criteria is the empty set. The result is that DFbatch executes without error but no records are processed.
Edit checks that assign the sequence number based upon fields after the sequence number and then dfmoveto() back to the sequence number. This can lead to a loop condition because the attempt to assign the sequence number will always fail.
At runtime, the interpreter may detect errors that generate messages. Error messages from DFbatch appear either on the command-line (or in the shell from which DFbatch was started via cron) or in the BATCHLOG file as an M element with a type of s (short for system).
Even though these system messages appear as M elements, it is not possible to filter them by excluding msg from the which attribute of either the APPLY or LOG elements. They will always appear in the output log file. While it is possible to subsequently filter out these messages with an alternate style sheet, this practice is not recommended.
Messages that appear on the command-line have the following format:
ERROR[batchname,type]: message (1) (2) (3)
The severity of the detected error, from the list:
aa abort all - processing of the current batch and any subsequent batch is halted
ab abort batch - processing of the current batch is halted but processing resumes with the next batch, if any
w warning - processing of the current batch continues and is followed by processing of any subsequent batches
This section describes every element in the BATCHLIST document type definition. The description is offered in two complementary views. The first view is a tree diagram illustrating the relationships between the elements. The second view is a reference where the meaning and use of every element is described.
A Document Type Definition (DTD) defines the set of elements and their attributes that are valid within a document, how many occurrences of each are allowed, and in what order they are allowed. The batch edit checks input control file has its own DTD that is listed here.
The root element of the input control file is BATCHLIST.
The tree structure of the DFbatch input control file, starting at the root, BATCHLIST.
The description of each element in this reference is divided into sections.
Provides a quick synopsis of the element. The content of the synopsis varies according to the nature of the element, but may include any or all of the following sections:
Content model This is a concise description of the elements that it can contain. This description is in DTD "content model" syntax which describes the name, number, and order of elements that may be used inside an element. The syntax contains: element names, keywords, repetitions, sequences, alternatives, and groups.
Element names An element name in a content model indicates that an element of that type may (or must) occur at that position.
Keywords A content model that consists of the single keyword EMPTY identifies an element as the empty element. Empty elements are not allowed to have any content. The #PCDATA keyword indicates that text may occur at that position. The text may consist of any characters that are legal in the document character set.
Repetitions Repetition of an element is accomplished by following the element name with one of the following characters: * for zero or more times, + for one or more times, or ? for exactly zero or one time. If no character follows the element name, then it must appear exactly once at that position.
Sequences If element names in a content model are separated by commas, then they must appear in sequence.
Alternatives If element names in a content model are separated by vertical bars, then they are alternatives, requiring the selection of one or another element.
Groups Parentheses may be used around part of a content model to form a group. A group formed this way can have repetition characters and may occur as part of a sequence.
Note that at this time, there is no element that allows a mixed content model, that is, an element that accepts both text and sub-elements.
Attributes Provides a synopsis of the attributes on the element.
Describes the semantics of the element in detail. Typically contains the following sub-sections:
Processing Expectations Summarizes specific processing expectations of the element. Many processing expectations are influenced by attribute values. Be sure to consult the description of element attributes as well.
Parents Lists the elements that are valid parent elements.
Children Lists the elements that are valid child elements.
Provides examples of proper usage for the element.
ACTION — processing directives for the current batch
Content model
ACTION ::= (APPLY?, LOG?, ODRF?)
Attributes
Parents
These elements contain ACTION: BATCH
Children
The following elements occur in ACTION: [ APPLY , LOG , ODRF ].
Processing action that applies all records with new queries to the database and logs all activity
<ACTION>
  <APPLY which="qc" when="changes" />
  <LOG which="data msg qc" when="all" />
</ACTION>
APPLY — which actions are applied to database, when, and at what validation level
APPLY ::= EMPTY
The value for this attribute is one or more (space-delimited) words from the enumerated list. Inclusion of a word in the attribute value indicates that the batch is interested in the action of this operation.
Edit checks operate in three distinct areas:
none is beneficial only when used on its own. Specifying another word in addition to none is the same as not specifying none at all. For example,
which="none msg"
is equal to
which="msg"
The level attribute specifies the validation level that should be assigned to processed records when they are written back to the database. The level may never be greater than the maximum validation level permitted to the user.
If the level is not specified, the validation level of the record is not changed. This is equivalent to working in Edit mode in DFexplore.
NOTE: A processed record (a data record) will be written back to the database when:
- the data record has been changed by one or more edit checks and which="data" has been specified, or
- the data record has not been changed, which="data" and when="all" have been specified, and the record's current validation level does not match the level specified in level="#", or
- a query has been added or changed, which="qc" has been specified (so the query is written), and the record's current validation level does not match the level specified in level="#".
These elements contain APPLY: [ ACTION].
The following elements occur in APPLY: None.
Apply no changes to the database - this is the recommended usage for this element
Apply all records to the database that have had data fields modified, qc notes added or deleted, or messages generated and assign those records a validation level of 2
<APPLY when="all" which="data msg qc" level="2" />
BATCH — a named set of processing actions and record retrieval set
BATCH ::= (TITLE?, DESC?, ACTION?, CRITERIA?)
These elements contain BATCH: BATCHLIST
The following elements occur in BATCH: [ TITLE , DESC , ACTION , CRITERIA ].
A simple BATCH that logs all messages generated by the msgIllegalBP edit check when applied to level 1, plate 2 records
<BATCH name="checkBP">
  <TITLE>Check systolic and diastolic blood pressure readings</TITLE>
  <ACTION>
    <LOG which="msg" />
  </ACTION>
  <CRITERIA>
    <EDIT>msgIllegalBP</EDIT>
    <LEVEL include="1" />
    <PLATE include="2" />
  </CRITERIA>
</BATCH>
BATCHLIST — root element of input control file
BATCHLIST ::= (CONTROL?, BATCH+)
The version of the BATCHLIST language. If subsequent versions of the language are developed, the version number will change. The number to the left of the . is the major version number and the number to the right is the minor version number.
Note that this version number is in no way related to the version number of the XML language that appears in the processing instruction at the head of each input file.
A BATCHLIST is simply a container element for one or more BATCH elements. The BATCHLIST is the root element of the input control file.
Processing Expectations
When multiple BATCH elements appear in a BATCHLIST, they are, by default, processed in the order that they are defined in the BATCHLIST. The processing order can, however, be altered at runtime through the -b option.
The BATCHLIST root element must be present even if only one BATCH is defined. It must appear exactly once in the file.
It is an error for any characters to appear after the </BATCHLIST> end tag.
These elements contain BATCHLIST: None.
The following elements occur in BATCHLIST: [CONTROL, BATCH].
A BATCHLIST containing two BATCHes, named first and example
<?xml version="1.0"?>
<BATCHLIST version="1.0">
  <BATCH name="first">
    <CRITERIA>
      <PLATE include="1-10" />
      <EDIT>codeAE</EDIT>
    </CRITERIA>
  </BATCH>
  <BATCH name="example">
    <ACTION>
      <APPLY which="data msg qc" level="2" when="changes" />
    </ACTION>
    <CRITERIA>
      <LEVEL include="1" />
    </CRITERIA>
  </BATCH>
</BATCHLIST>
CONTROL — grouping element for batch edit checks processing options
CONTROL ::= (MOVETO?, REASON?)
Certain elements of the input control file language provide end-user control over the actions and behavior of the batch edit checks program itself. These elements are defined within a CONTROL or REASON element.
The CONTROL and REASON elements apply to all BATCH elements within the input control file and hence are defined once, before any BATCH definition.
These elements contain CONTROL: [BATCHLIST].
The following elements occur in CONTROL: [MOVETO, REASON].
Specify that the looping limit is 30 consecutive dfmoveto() invocations and a non-default reason for data changes
<CONTROL>
  <MOVETO number="30" />
  <REASON>converting all responses to 2 decimals of precision</REASON>
</CONTROL>
CREATE — select records for inclusion by the record creation date
CREATE ::= EMPTY
Records are selected by creation date with this element. If the creation date of a record is in the range of included creation dates (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the creation date. This is equivalent to not specifying a CREATE element.
The date notation for expressing constant date values is the standard DFdiscover notation, yy/mm/dd, which means two-digit year of century, two-digit month of year, and two-digit day of month. The word today may appear in the selection criteria, and has the effect of selecting records created on today's date.
These elements contain CREATE: CRITERIA
The following elements occur in CREATE: None.
Select records that were created on or after January 1, 2000
<CREATE include="00/01/01-today" />
CRITERIA — record selection criteria for the batch
CRITERIA ::= ((IDRF| (ID | PLATE | VISIT | CREATE | MODIFY | LEVEL | STATUS)*), EDIT*)
The sort method to be applied to the selected records before processing the edit checks.
The value for the attribute contains one or more (semi-colon delimited) words from the list: [id, visit, plate, img] and each word allows an optional leading + or - symbol. Each word indicates what key the sort is on, the first word has the highest priority, the last word has the lowest, and the optional symbol indicates ascending sort order (+) or descending sort order (-).
If a sort key is omitted, the sort order on that key in the selection set of records is arbitrary.
Example: Use of the sort attribute
Sort by ascending subject ID, and then descending visit identifier within subject.
sort="+id;-visit"
The CRITERIA specifies the record selection criteria for the current batch. Records can be selected by:
a previously created DRF file using the IDRF child element,
key fields using the ID, VISIT, PLATE, CREATE, MODIFY, LEVEL, and STATUS child elements,
edit checks that they reference using the EDIT child element,
a combination of DRF and edit check names, or
a combination of key fields and edit check names.
These elements contain CRITERIA: BATCH
The following elements occur in CRITERIA: [ IDRF, ID, PLATE, VISIT, CREATE, MODIFY, LEVEL, STATUS, EDIT ].
Select records from plate 5, modified on or after January 1, 2000, at validation level 3, for subjects 1000 to 4999 inclusive, and for visits 1 through 10, inclusive, and 14
<CRITERIA>
  <PLATE include="5" />
  <MODIFY include="00/01/01-today" />
  <LEVEL include="3" />
  <ID include="1000-4999" />
  <VISIT include="1-10,14" />
</CRITERIA>
DESC — verbose description for the batch purpose
DESC ::= #PCDATA
This optional description is for documentation purposes only. The contents are not read nor understood by DFbatch.
These elements contain DESC: [BATCH].
The following elements occur in DESC: None.
Description of a batch
<DESC>This batch executes edit checks needsAE and findAEreport on all records that are currently at level 1. </DESC>
EDIT — select records for inclusion by the edit checks that reference them
EDIT ::= #PCDATA
This element selects the edit checks to execute by their name. EDIT is the only element that is permitted to repeat within CRITERIA.
The body text of the element is a single edit check name, or a comma-delimited or space-delimited list of edit check names.
These elements contain EDIT: CRITERIA
The following elements occur in EDIT: None.
Select records that reference the edit checks medHx or codeAE
<EDIT>medHx</EDIT> <EDIT>codeAE</EDIT>
An alternative and equivalent selection of records that reference the edit checks medHx or codeAE is:
<EDIT>medHx,codeAE</EDIT>
ID — select records for inclusion by the subject ID
ID ::= EMPTY
Records are selected by subject ID with this element. If the subject ID of a record is in the range of included subject IDs (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the subject ID. This is equivalent to not specifying an ID element.
These elements contain ID: CRITERIA
The following elements occur in ID: None.
Select records for subject IDs 1000 through 4999 inclusive, 10000 through 14999 inclusive, and 99999
<ID include="1000-4999, 10000-14999, 99999" />
IDRF — DFdiscover Retrieval File of keys for records to be selected
IDRF ::= EMPTY
A relative or absolute pathname. For relative pathnames, the base directory is the $STUDY_DIR/drf directory of the study. The pathname must be specified using UNIX file and directory naming semantics. If the pathname is not given, the processing system will create a pathname from:
the base directory of $STUDY_DIR/drf of the study followed by
the name of the batch, and terminated with
the fixed suffix, drf
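For example, under these default rules a batch named coding (a hypothetical batch name) in a study whose drf directory is /opt/studies/val254/drf would, in the absence of a file attribute, read its DRF from:

```
/opt/studies/val254/drf/coding.drf
```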
The pathname specification can also include a sub-folder: an absolute path of the form /STUDY_DIR/drf/sub-folder/xx_out.drf, or a relative path of the form sub-folder/xx_out.drf. In both cases the output file is found in /STUDY_DIR/drf/sub-folder/xx_out.drf. The filename cannot contain ../ or /.. to alter the file path; doing so will lead to an error.
Records can be selected by a previously created DRF. Each record in the DRF is retrieved and processed, with the exception of secondary records, which are omitted. Secondary records are only used to link images to data records and cannot be processed in batch.
These elements contain IDRF: CRITERIA
The following elements occur in IDRF: None.
Select records from the DRF named /opt/studies/val254/drf/needcoding.drf
<IDRF file="/opt/studies/val254/drf/needcoding.drf" />
If the study drf directory is /opt/studies/val254/drf, the following statement has equivalent meaning.
<IDRF file="needcoding.drf" />
LEVEL — select records for inclusion by the validation level
LEVEL ::= EMPTY
Records are selected by validation level with this element. If the current validation level of a record (not of the user performing the batch) is in the range of included validation levels (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion from the minimum validation level, 1, up to and including either:
the level specified by the level attribute of the APPLY element, if it is specified explicitly, or
the maximum validation level permitted to the user executing the batch
This is equivalent to not specifying a LEVEL element.
Permissible validation levels are in the range 1 through 7 inclusive.
These elements contain LEVEL: CRITERIA
The following elements occur in LEVEL: None.
Select records that are at validation level 2 or 3
<LEVEL include="2,3" />
LOG — which actions are logged to an external file and when
LOG ::= EMPTY
A relative or absolute pathname. For relative pathnames, the base directory is the same as the base directory of the input file. The pathname must be specified using UNIX file and directory naming semantics. If the pathname is not given, the processing system will create a pathname from:
the base directory of the input file followed by
the fixed suffix, _out.xml
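The DFstyle examples earlier in this chapter follow this pattern: running the input control file /opt/studies/val254/batch/example_in.xml without a file attribute produced a default log named:

```
/opt/studies/val254/batch/example_out.xml
```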
The pathname specification can also include a sub-folder: an absolute path of the form /STUDY_DIR/batch/sub-folder/xx_out.xml, or a relative path of the form sub-folder/xx_out.xml. In both cases the output file is found in /STUDY_DIR/batch/sub-folder/xx_out.xml. The filename cannot contain ../ or /.. to alter the file path; doing so will lead to an error.
Should the file be created (create) or (over)written (write)? Generally, files are opened with write so that they can be subsequently overwritten when the batch is re-run. However, if a batch is intended to be run only once, or you would like to ensure that the log file does not accidentally overwrite a log file from another batch with the same name, use create.
NOTE: The combination of attributes mode="create" and history="yes" has no semantic meaning as it will never be possible to preserve the history of a log file that can only be created once.
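For example, a batch that should never overwrite an existing log, perhaps because it is intended to be run exactly once, could request (the filename here is illustrative):

```
<LOG which="data msg qc" mode="create" file="once_out.xml" />
```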
These elements contain LOG: [ ACTION].
The following elements occur in LOG: None.
Log all records that have had data values changed, messages generated, or queries added or deleted
<LOG when="all" which="data msg qc" />
Log records which have had data values changed or messages generated to an output file in a subfolder
<LOG when="all" which="data msg" file="test/example_out.xml" />
<LOG when="all" which="data msg" file="/STUDY_DIR/batch/test/example_out.xml" />
The output file should be found in /STUDY_DIR/batch/test/example_out.xml
MODIFY — select records for inclusion by the record modification date
MODIFY ::= EMPTY
Records are selected by last modification date with this element. If the last modification date of a record is in the range of included modification dates (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the modification date. This is equivalent to not specifying a MODIFY element.
The date notation for expressing constant date values is the standard DFdiscover notation, yy/mm/dd, meaning two-digit year of century, two-digit month of year, and two-digit day of month. The word today may appear in the selection criteria, and has the effect of selecting records last modified on today's date.
These elements contain MODIFY: CRITERIA
The following elements occur in MODIFY: None.
Select all records that were modified today
<MODIFY include="today" />
MOVETO — define limit on number of consecutive dfmoveto() calls
MOVETO ::= EMPTY
During processing of an edit check, it is possible for one edit check or a combination of edit checks to create a loop. Specifically, a loop can occur when:
two edit checks, on two different fields, each execute the dfmoveto() statement, resulting in a move to the other field
or execution of the dfmoveto() statement in a single edit check can move the focus back to a previous field allowing subsequent traversal to move the focus back to the current field and triggering the same edit check
A loop is detected after a number of consecutive dfmoveto() invocations. By default this number is 20. When a loop is detected, edit check processing halts. In an interactive environment, a warning dialog is displayed and the user has the option of aborting the current edit check or allowing it to continue, presumably resulting in another loop shortly thereafter. In batch edit checks, the current edit check is always aborted.
It is possible, although highly unlikely, for a particularly complex combination of edit checks to invoke dfmoveto() more than the default number of times, 20, without a loop existing. In this case, the limit on the loop can be raised by setting the number attribute of the MOVETO element to a higher value.
This element applies to all of the BATCH elements in the current BATCHLIST.
These elements contain MOVETO: [ CONTROL].
The following elements occur in MOVETO: None.
Increase the looping limit to 25 consecutive dfmoveto() invocations
<MOVETO number="25" />
ODRF — which records are logged to an external DFdiscover Retrieval File (DRF) and when
ODRF ::= EMPTY
A relative or absolute pathname. For relative pathnames, the base directory is the $STUDY_DIR/drf directory of the study. The pathname must be specified using UNIX file and directory naming semantics. If the pathname is not given, the processing system will create a pathname from:
the base name of the input file, followed by
the fixed suffix, .drf
Should the file be created (create) or (over)written (write)? Generally, files are created with write so that they can be subsequently overwritten when the batch is re-run. However, if a batch is only intended to be run once, or you would like to ensure that the output file does not accidentally overwrite a file from another batch with the same name, use create.
These elements contain ODRF: [ ACTION].
The following elements occur in ODRF: None.
Write all records that have had messages generated to the DRF named /opt/studies/254/batch/outputs/sample1.drf and enable file sharing
<ODRF when="all" which="msg" file="/opt/studies/254/batch/outputs/sample1.drf" share="yes" mode="write" />
Write records which have had data values changed or messages generated to the DRF in a sub-folder
<ODRF when="all" which="data msg" file="test/sample1.drf" />
<ODRF when="all" which="data msg" file="/STUDY_DIR/drf/test/sample1.drf" />
The output file should be found in /STUDY_DIR/drf/test/sample1.drf
PLATE — select records for inclusion by the plate identifier
PLATE ::= EMPTY
Records are selected by plate identifier with this element. If the plate identifier of a record is in the range of included plate identifiers (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the plate identifier. This is equivalent to not specifying a PLATE element.
These elements contain PLATE: CRITERIA
The following elements occur in PLATE: None.
Select all records for plates 21 through 29 inclusive
<PLATE include="21-29" />
REASON — define the text that will be inserted into reason for data change records for each data value that is changed and applied to the database
REASON ::= #PCDATA
During processing of an edit check, a data value may be changed. Further, that data value may have been defined in the study setup so that it has a minimum reason level after which all data value changes require reasons. If the changed data value is applied to the database, and the record's validation level is at or above the minimum reason level, then a reason is required. The text of this element is used as the reason.
If, in the batch definition, this element is not present or is invalid (i.e. contains '|' or control characters, or is an empty string) and a reason is required for a data value change that is to be applied to the database, then the system creates a reason text that is the concatenation of the fixed string DFbatch and the current batch's name.
These elements contain REASON: [ CONTROL].
The following elements occur in REASON: None.
Indicate that the reason for all of these changes is the installation of a new lookup table
<REASON>applying coding from new lookup table</REASON>
STATUS — select records for inclusion by the record status
STATUS ::= EMPTY
Records are selected by status with this element. If the status of a record is in the range of included statuses (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the status. This is equivalent to not specifying a STATUS element.
Permissible status values must be chosen from the list: [final, incomplete, pending, missed, all].
These elements contain STATUS: CRITERIA
The following elements occur in STATUS: None.
Select all final and incomplete records
<STATUS include="final,incomplete" />
TITLE — brief title for the batch
TITLE ::= #PCDATA
The body text of this optional element is a descriptive title for the batch.
These elements contain TITLE: [ BATCH].
The following elements occur in TITLE: None.
Batch title
<TITLE>Locate Missing Adverse Event Reports</TITLE>
VISIT — select records for inclusion by the visit identifier
VISIT ::= EMPTY
Records are selected by visit identifier with this element. If the visit identifier of a record is in the range of included visit identifiers (specified by the value of the include attribute), the record becomes part of the set of records to be processed.
If no inclusion criteria are specified, records are selected for inclusion independent of the visit identifier. This is equivalent to not specifying a VISIT element.
These elements contain VISIT: CRITERIA
The following elements occur in VISIT: None.
Select all records with a visit identifier of 10
<VISIT include="10" />
This section describes every element in the BATCHLOG document type definition. The description is offered in two complementary views. The first view is a tree diagram of the document type definition. The second view is a reference where the meaning and use of every element is described.
A Document Type Definition (DTD) defines the set of elements and their attributes that are valid within a document, how many occurrences of each are allowed, and in what order they are allowed. The batch edit checks output log file has its own DTD that is outlined in the BATCHLOG DTD figure below.
The BATCHLOG DTD
The root element of the output log file is BATCHLOG.
This reference describes every element in the BATCHLOG document type definition. The organization of the reference is identical to that of the BATCHLIST document type and can be reviewed at Organization of Reference Pages.
A — meta information, including record status, validation level, and CRF image name, for the context record
A ::= EMPTY
An A element contains meta information for the context record. The meta information includes the record status, validation level, and the CRF image name.
The CRF image name is guaranteed by DFdiscover to be unique across an entire installation. Therefore, it can be used as a key in implementing relationships across documents.
These elements contain A: [R]
The following elements occur in A: None.
Meta information for the context record indicating a record status of final, validation level of 5, and CRF image name of 0022/0001001
<A s='1' l='5' im='0022/0001001'/>
BATCHLOG — container element of batch output log file
BATCHLOG ::= (HL, STUDY, CWD, SRC, USER, OUTDRF?, OUTLOG, (R | M)*, SUMMARY)
The name of the BATCH that generated this output.
The value of this attribute must match the value of the BATCH name in the source file identified by the value of the SRC element.
The version of the BATCHLOG language. If subsequent versions of the language are developed, the version number will change. The number to the left of the . is the major version number and the number to the right is the minor version number.
The BATCHLOG is the root element of the output log file. A BATCHLOG is simply a container element for the output from one execution of a batch. If multiple batch outputs are to be recorded they must be written one batch output per log file.
It is an error for any characters to appear after the </BATCHLOG> end tag.
These elements contain BATCHLOG: None.
The following elements occur in BATCHLOG: [HL, STUDY, CWD, SRC, USER, OUTDRF, OUTLOG, R, M, SUMMARY].
A BATCHLOG
<?xml version="1.0"?> <BATCHLOG version="1.0" name="test1"> output for this batchlog </BATCHLOG>
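As an illustration of consuming this output, the root attributes of a BATCHLOG can be read with Python's standard ElementTree parser. This is a sketch, not part of DFdiscover; the sample document content is a minimal stand-in for a real log file.

```python
# Sketch: read the name and version attributes of a BATCHLOG root element.
import xml.etree.ElementTree as ET

sample = """<?xml version="1.0"?>
<BATCHLOG version="1.0" name="test1">
</BATCHLOG>"""

root = ET.fromstring(sample)
# The root element of an output log file is always BATCHLOG.
assert root.tag == "BATCHLOG"
print(root.get("name"), root.get("version"))  # test1 1.0
```

The name attribute can then be checked against the BATCH name in the source file named by the SRC element.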
CWD — the current working directory in which DFbatch executed to create this log file
CWD ::= EMPTY
A CWD identifies the directory in which DFbatch was executed when this log file was created.
Any filenames in the output log with values that are relative pathnames can be taken as relative to the value of this element.
These elements contain CWD: [BATCHLOG]
The following elements occur in CWD: None.
The current working directory /opt/studies/val254/lib and the source file test_in.xml
<CWD>/opt/studies/val254/lib</CWD> <SRC>test_in.xml</SRC>
The combination of these elements indicates that the source file for the batch was in /opt/studies/val254/lib/test_in.xml.
D — information about a change in a data value for a single variable
D ::= (V, O, N)
A D element records a change in a data value that occurred when the edit check named in the parent element was executed. It identifies which variable (V) was changed, what the original value of the variable was (O), and what the new value is (N).
The V element identifies the variable for which the data value was changed. In this context, it is not expected to have any child elements.
The O element will always be present, even if the variable had no original value (was blank), and similarly the N element will always be present, even if the variable has no new value (becomes blank).
These elements contain D: [E]
The following elements occur in D: [V, O, N]
A data changed element
<D fr='1' c='1'><V n='othsp6'/><O></O><N>-</N></D>
This element indicates that the value of the variable othsp6 was successfully changed from blank to -.
DD — day of month information for a date stamp
DD ::= #PCDATA
A DD contains day of month information. The value is numeric and in the range 01 to 31 inclusive. The value is always leading zero-padded to two digits, if necessary.
These elements contain DD: [DT]
The following elements occur in DD: None.
The 25th day of the month
<DD>25</DD>
DT — timestamp information for a single execution of DFbatch
DT ::= (YY, MM, DD)
A DT contains date information, specifically the year (YY), month (MM), and day (DD). This element appears as a child of an HN element and records the current date that the batch was run, or the historical date of a previous execution of the batch.
The content model does not allow for the representation of a partial date, and so each of YY, MM, and DD must be present as children.
These elements contain DT: [HN]
The following elements occur in DT: [YY, MM, DD].
A date entry for October 19, 2000
<DT><YY>2000</YY><MM>10</MM><DD>19</DD></DT>
E — container element for an edit check including its name and what actions occurred when the edit check was executed
E ::= (D | M | Q | MQ | MX)*
An edit check can be triggered at plate enter, plate exit, field enter, or field exit.
The execution of an edit check can cause no changes, or it can change one or more data values, or generate messages, queries, missing page queries, or deletions of missing page queries. This element identifies the edit check by name and when it was executed, and is the parent element for the changes, if any. The parent of this element is the variable that was current when the edit check was executed.
If the detail level (identified by the when attribute of the LOG or APPLY element in the input batch file) has a value of changes, then this element will never appear in the output as an empty element. It may appear as an empty element only when the attribute value is all, in which case the element indicates that the edit check was executed but no changes occurred.
These elements contain E: [V]
The following elements occur in E: [D, M, Q, MQ, MX].
An edit check element and its children
<E w='pn' n='no_diabetes'><M fr='1' t='w'>Section 5.c questions should be blank subject is not diabetic. </M> </E>
This fragment identifies that the edit check named no_diabetes was triggered at plate enter (pn) of the context variable and its execution caused the shown warning message.
EQ — a query edited by the execution of dfeditqc
EQ ::= (V?, QR?, QRPLY?, NT?)
The query category is one of: missing, illegal, inconsistent, illegible, fax noise, other, or a user-defined category.
Any queries edited by the dfeditqc function during the execution of an edit check are recorded in an EQ element.
The V element identifies the variable by name to which the query was edited, even if it is the current variable.
The query and note child elements appear only if their respective values have been set by the function. Note that an element value does not identify whether it has been changed from its previous value, simply that the value was specified as an argument in the function call.
An EQ element does not imply that changes to a query were added to the database, simply that they would have been added if the APPLY element had allowed queries to be added or changed.
The history element that records which batch execution generated this element can be determined by matching the values of the fr attribute and the fd attribute.
These elements contain EQ: [E]
The following elements occur in EQ: [V, QR, QRPLY, NT]
A query on the age variable was edited
<EQ fr='8' pr='3' u='1' f='1' c='1'><V n='age'/> <QR>The reported age is inconsistent with the exam date and the subject's birth date</QR></EQ>
A user defined query added to the visitDate variable
<Q fr='9' pr='30' u='1' f='1' c='1' prlbl="Requires Follow-Up"><V n='visitDate'/> <QR>The reported visit date does not match up with the trial schedule and needs to be verified by the trial coordinator.</QR></Q>
HL — container element for history elements
HL ::= (HN+)
An HL is simply a container element for one or more HN elements. A BATCHLOG always records at least the date and time of the current batch execution, and, if history is enabled, additionally records the date and time of every previous execution.
The HN child elements can be ordered in reverse chronological order by sorting on the fd attribute in descending sequence. While the oldest HN element generally has a value of 1 for this attribute, this is not required. Similarly, the ascending sequence of fd attributes may contain gaps, although it usually does not.
These elements contain HL: [BATCHLOG]
The following elements occur in HL: [HN].
Two history elements
<HL> <HN fd='1'> <DT><YY>2000</YY><MM>10</MM><DD>19</DD></DT> <TM><HR>09</HR><MI>01</MI><SC>40</SC></TM> </HN> <HN fd='2'> <DT><YY>2000</YY><MM>10</MM><DD>20</DD></DT> <TM><HR>00</HR><MI>10</MI><SC>03</SC></TM> </HN> </HL>
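The reverse-chronological ordering described above can be sketched with Python's standard ElementTree; the HL content below mirrors the example, with the timestamps trimmed for brevity.

```python
# Sketch: order HN history entries newest-first by sorting on the
# fd attribute in descending sequence.
import xml.etree.ElementTree as ET

hl = ET.fromstring(
    "<HL>"
    "<HN fd='1'><DT><YY>2000</YY><MM>10</MM><DD>19</DD></DT></HN>"
    "<HN fd='2'><DT><YY>2000</YY><MM>10</MM><DD>20</DD></DT></HN>"
    "</HL>"
)
newest_first = sorted(hl.findall("HN"), key=lambda hn: int(hn.get("fd")),
                      reverse=True)
print([hn.get("fd") for hn in newest_first])  # ['2', '1']
```

Sorting on fd rather than relying on document order is the safe choice, since the DTD does not require the HN children to appear chronologically.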
HN — timestamp information for a single execution of DFbatch
HN ::= (DT, TM)
An HN contains date and time information regarding a single execution of DFbatch.
If the element has a cur attribute with a value of 1, the element records the date and time of the current execution. Otherwise, it records the date and time of a previous execution.
Where multiple HN elements are present, they can be ordered chronologically by sorting their fd attributes in ascending order. The HN elements also generally appear within the parent HL element in chronological order, but this is not required.
These elements contain HN: [HL]
The following elements occur in HN: [DT, TM].
A history entry
<HN fd='1'> <DT><YY>2000</YY><MM>10</MM><DD>19</DD></DT> <TM><HR>09</HR><MI>01</MI><SC>40</SC></TM> </HN>
This is the entry for the oldest execution of the batch which took place on October 19, 2000 at 09:01:40. Because the element does not have a cur attribute, it is not the history element for the current (and only) execution.
HR — hour information for a time stamp
HR ::= #PCDATA
An HR contains hour of day information in 24-hour notation. The value is numeric in the range 00 to 23, and is always zero-padded to two digits.
These elements contain HR: [TM]
The following elements occur in HR: None.
The 14th hour, commonly called 2pm
<HR>14</HR>
K — container element for key identification of the parent element
K ::= EMPTY
This element contains the key identifiers of the parent element, which can be a record, a new missing page query, or the deletion of an existing missing page query.
These elements contain K: [R, MQ, MX]
The following elements occur in K: None.
A key element that identifies subject ID 1, visit 10, and plate 9
<K i='1' v='10' p='9'/>
M — the message generated by the execution of dfmessage, dferror, or dfwarning
M ::= #PCDATA
Any messages generated by the functions dfmessage, dferror, or dfwarning during the execution of an edit check are recorded in an M element.
Messages can also be generated by the DFbatch system during the execution. As a result, an M element can occur at almost any point in the BATCHLOG output.
The history element that records which batch execution generated this message can be determined by matching the values of the fr attribute and the fd attribute.
These elements contain M: [BATCHLOG, V, R]
The following elements occur in M: None.
A message generated by dfwarning
<M fr='1' t='w'>Section 5.c questions should be blank subject is not diabetic. </M>
MI — minute information for a time stamp
MI ::= #PCDATA
An MI contains minute of hour information. The value is numeric in the range 00 to 59, and is always zero-padded to two digits.
These elements contain MI: [TM]
The following elements occur in MI: None.
The 45th minute of the hour
<MI>45</MI>
MM — month information for a date stamp
MM ::= #PCDATA
An MM contains month of year information. The value is numeric in the range 01 (January) to 12 (December), and is always zero-padded to two digits.
These elements contain MM: [DT]
The following elements occur in MM: None.
The month June
<MM>06</MM>
MQ — a missing page query generated by the execution of dfaddmpqc
MQ ::= (K, QR?, NT?)
Any missing page queries generated by the dfaddmpqc function during the execution of an edit check are recorded in an MQ element.
Since missing page queries are never added to the current record, a K child element is needed to identify the keys that will get the missing page query. If query or note arguments were specified to dfaddmpqc, QR or NT child elements, respectively, will be present.
An MQ element does not imply that a missing page query was added to the database, simply that one would have been added if the APPLY element had allowed queries to be added.
These elements contain MQ: [E]
The following elements occur in MQ: [K, QR, NT]
A missing page query added for subject 1001, visit 1, plate 5, with no additional query or note
<MQ fr='5' c='1'><K i='1001' v='1' p='5'/></MQ>
MX — deletion of a missing page query generated by the execution of dfdelmpqc
MX ::= (K)
Any missing page queries deleted by the dfdelmpqc function during the execution of an edit check are recorded in an MX element.
Since missing page queries are never deleted from the current record, a K child element is needed to identify the keys from which the missing page query is deleted.
An MX element does not imply that a missing page query was deleted from the database, simply that one would have been deleted if the APPLY element had allowed query actions.
These elements contain MX: [E]
The following elements occur in MX: [K]
A missing page query deleted for subject 99001, visit 2, and plate 4
<MX fr='4' c='1'><K i='99001' v='2' p='4'/></MX>
N — the new value after a data change
N ::= #PCDATA
An N element records the new data value of a variable after it was changed by assignment.
If a field has a value assigned by an edit check, and the new value is identical to the original value, this is considered to be no change, and as a result is not recorded in the output log.
This element will be present, even if the new value is blank, in which case the element will be an empty element.
These elements contain N: [V]
The following elements occur in N: None.
A new value of 1
<N>1</N>
ND — the number of data changes recorded
ND ::= EMPTY
The ND element contains summary information including the number of data changes that were successful, the number that failed, and the number that were successful but required truncation of the data value.
By counting the number of D elements present in the output log and grouping them by the value of their c attribute, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain ND: [SUMMARY]
The following elements occur in ND: None.
Summary information of 1 successful data change, no failed changes, and no successful but truncated changes, applied to the database
<ND ok='1' notok='0' trunc='0' apply='1'/>
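The stylesheet tabulation described above can be sketched in Python. The log fragment is a stand-in, and it is assumed here that c='1' marks a successful change; a real log nests D elements inside R, V, and E elements, which iter() traverses transparently.

```python
# Sketch: tabulate data-change outcomes by grouping D elements on their
# c attribute, as a stylesheet would when checking the ND summary.
import xml.etree.ElementTree as ET
from collections import Counter

log = ET.fromstring(
    "<BATCHLOG>"
    "<D c='1'><V n='othsp6'/><O></O><N>-</N></D>"
    "</BATCHLOG>"
)
by_outcome = Counter(d.get("c") for d in log.iter("D"))
print(by_outcome)  # Counter({'1': 1})
```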
NEQ — the number of times queries were edited
NEQ ::= EMPTY
The NEQ element contains summary information including the number of times dfeditqc was called.
By counting the number of EQ elements present in the output log and grouping them by the value of their c attribute, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain NEQ: [SUMMARY]
The following elements occur in NEQ: None.
dfeditqc called 4 times successfully, 1 time with a failure, APPLY set for queries
<NEQ apply='1' ok='4' notok='1'/>
NM — the number of messages recorded
NM ::= EMPTY
The NM element contains summary information including the number of messages that were successful; it is not possible for message creation to fail.
NOTE: If NM indicates that messages were successful but the records to be processed by DFbatch are locked and unavailable, the output log will also indicate this with an entry such as <NR ok='0'/> (number of selected records is 0). Be aware that edit check execution will not have been applied to the selected records in this case.
By counting the number of M elements present in the output log, a stylesheet can tabulate the same numbers as reported by the attribute of this element.
These elements contain NM: [SUMMARY]
The following elements occur in NM: None.
Summary information of 210 messages generated, APPLY set for messages
<NM apply='1' ok='210'/>
NMQ — the number of missing page queries added
NMQ ::= EMPTY
The NMQ element contains summary information including the number of missing page queries added successfully, and the number that failed.
By counting the number of MQ elements present in the output log and grouping them by the value of their c attribute, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain NMQ: [SUMMARY]
The following elements occur in NMQ: None.
Summary information of 1 failed missing page query addition, APPLY element set.
<NMQ ok='0' notok='1' apply='1'/>
NMX — the number of missing page queries deleted
NMX ::= EMPTY
The NMX element contains summary information including the number of missing page queries deleted successfully, and the number that failed.
By counting the number of MX elements present in the output log and grouping them by the value of their c attribute, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain NMX: [SUMMARY]
The following elements occur in NMX: None.
Summary information of 4 successfully deleted missing page queries and 2 that failed, APPLY set.
<NMX ok='4' notok='2' apply='1'/>
NODRF — the number of DRF records written
NODRF ::= EMPTY
The NODRF element contains summary information including the number of DRF records written successfully, and the number that failed.
The DRF records are written to the file identified by the value of the file attribute of the ODRF element in the input file.
These elements contain NODRF: [SUMMARY]
The following elements occur in NODRF: None.
210 DRF records successfully written
<NODRF ok='210' notok='0'/>
NQ — the number of queries added
NQ ::= EMPTY
The NQ element contains summary information including the number of queries added successfully, and the number that failed.
By counting the number of Q elements present in the output log and grouping them by the value of their c attribute, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain NQ: [SUMMARY]
The following elements occur in NQ: None.
4 queries successfully added and 1 failure, APPLY set for queries
<NQ apply='1' ok='4' notok='1'/>
NR — the number of records selected by the criteria
NR ::= EMPTY
The NR element contains summary information including the number of records selected by the criteria.
These elements contain NR: [SUMMARY]
The following elements occur in NR: None.
2365 records selected
<NR ok='2365'/>
NSEC — the total number of seconds elapsed for the execution of the batch
NSEC ::= #PCDATA
The NSEC element contains a numeric value that is the number of seconds elapsed in the execution of the batch. The value is reported without any fractional seconds, and so it is possible for the number of seconds elapsed to be 0.
The number of seconds elapsed does not include the time spent, if any, styling the resulting output file for presentation.
The end time of the batch execution can be calculated by adding the number of seconds to the timestamp of the current HN element.
These elements contain NSEC: [SUMMARY]
The following elements occur in NSEC: None.
A batch that executed in 12 seconds
<NSEC>12</NSEC>
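The end-time calculation mentioned above can be sketched in Python; the timestamp and NSEC value are taken from the examples in this reference.

```python
# Sketch: derive the batch end time by adding the NSEC value to the
# timestamp recorded in the current HN element.
from datetime import datetime, timedelta

# Timestamp assembled from the HN element's DT (YY/MM/DD) and TM (HR/MI/SC)
start = datetime(2000, 10, 19, 9, 1, 40)
nsec = 12  # value of the NSEC element
end = start + timedelta(seconds=nsec)
print(end.strftime("%Y/%m/%d %H:%M:%S"))  # 2000/10/19 09:01:52
```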
NSM — the number of system messages recorded
NSM ::= EMPTY
The NSM element records the number of system messages recorded during the execution of the batch. System messages are generated by the edit checks interpreter or database server and do not include any messages generated by the dfmessage, dferror, or dfwarning functions.
By counting the number of M elements with a t attribute of s, a stylesheet can tabulate the same numbers as reported by the attributes of this element.
These elements contain NSM: [SUMMARY]
The following elements occur in NSM: None.
Good news, no system messages
<NSM ok='0'/>
NT — an additional note added to a query or missing page query
NT ::= #PCDATA
The NT element contains the additional note, if any, supplied by the dfaddqc or dfaddmpqc edit check functions.
If the text of the note contains an embedded newline, the newline should cause a line break in the presentation of the note.
These elements contain NT: [Q, MQ]
The following elements occur in NT: None.
An additional note added to a query
<NT>This page should have been received over 4 weeks ago</NT>
O — the original value before a data change
O ::= #PCDATA
An O element records the original data value of a variable before it was changed by assignment of a new value.
This element will be present, even if the original value was blank, in which case the element will be an empty element.
These elements contain O: [V]
The following elements occur in O: None.
An original value of 56.6
<O>56.6</O>
OUTDRF — the filename of the output DRF
OUTDRF ::= #PCDATA
The value of the OUTDRF element is the filename to which output DRF records were written. If output DRF records were not selected, this element will not be present.
If the filename does not begin with /, then it is assumed to be a relative filename from which the absolute pathname can be derived by prepending the value of the CWD element.
These elements contain OUTDRF: [BATCHLOG]
The following elements occur in OUTDRF: None.
The DRF /opt/studies/val254/drf/example.drf
<OUTDRF>/opt/studies/val254/drf/example.drf</OUTDRF>
OUTLOG — the filename of the output log, namely this file
OUTLOG ::= #PCDATA
The value of the OUTLOG element is the filename to which output log information was written. The value is self-referential, naming the file that contains the element.
These elements contain OUTLOG: [BATCHLOG]
The following elements occur in OUTLOG: None.
The log file /opt/studies/val254/lib/example_out.xml
<OUTLOG>/opt/studies/val254/lib/example_out.xml</OUTLOG>
Q — a query generated by the execution of dfaddqc
Q ::= (V?, QR?, NT?)
Any queries generated by the dfaddqc function during the execution of an edit check are recorded in a Q element.
The V element identifies the variable by name to which the query was added, even if it is the current variable.
A Q element does not imply that a query was added to the database, simply that one would have been added if the APPLY element had allowed queries to be added.
These elements contain Q: [E]
The following elements occur in Q: [V, QR, NT]
A query added to the age variable
<Q fr='8' pr='3' u='1' f'=1' c='1'><V n='age'/> <QR>The reported age is inconsistent with the exam date and the subject's birth date</QR></Q>
QR — an additional question asked in a query or a missing page query
QR ::= #PCDATA
The QR element contains the additional query, if any, supplied by the dfaddqc or dfaddmpqc edit check functions.
If the text of the query contains an embedded newline, the newline should cause a line break in the presentation of the query.
These elements contain QR: [Q, MQ]
The following elements occur in QR: None.
The query added to a query
<QR>Please identify the correct response by initialing and dating it</QR>
QRPLY — the reply field of a query
QRPLY ::= #PCDATA
The QRPLY element contains the reply to a query, if any, supplied by the dfeditqc edit check function.
If the reply to the query contains an embedded newline, the newline should cause a line break in the presentation of the query.
These elements contain QRPLY: [EQ]
The following elements occur in QRPLY: None.
The reply to a query
<QRPLY>This was a clerical error</QRPLY>
R — a container element for all of the processing associated with a single record
R ::= (K, A?, V*, M*)
Each record that is processed by DFbatch is identified in the log by an R element. The children of this element record the actions that occurred during the processing of the record. The record being processed is identified by the key fields recorded in the K element.
If the detail level of logging is all, R elements may appear in the output with no child elements other than the required K element, indicating that the identified record met the record selection criteria but then executed no edit checks, or no edit checks that recorded changes.
It is possible for a system message to be reported at the record level.
These elements contain R: [BATCHLOG]
The following elements occur in R: [K, A, V, M].
A record element and its children
<R> <K i='1' v='10' p='9'/> <A s='1' l='5' im='9949/0017009'/> <V n='id9'><E w='pn' n='no_diabetes'><M fr='1' t='w'>Section 5.c questions should be blank subject is not diabetic. </M> </E></V> </R>
S — a reason for data change generated by the execution of dfaddreason
S ::= (V?, T?)
Any reasons for data change generated by the dfaddreason function during the execution of an edit check are recorded in an S element.
The V element identifies the variable by name to which the reason for data change was added, even if it is the current variable. The T element contains the text of the reason for data change. This text matches the input reason text supplied by the programmer in the dfaddreason function call.
An S element does not imply that a reason for data change was added to the database, simply that one would have been added if the APPLY element had allowed data changes.
These elements contain S: [E]
The following elements occur in S: [V, T]
A reason for data change added to the RACE variable
<S fr='1' c='1'><V n='RACE'/> <T>previous response was obscured on the sent CRF</T></S>
SC — second information for a time stamp
SC ::= #PCDATA
An SC contains second-of-minute information. The value is numeric, in the range 00 to 59, and is always zero-padded to two digits.
SC
These elements contain SC: [TM]
The following elements occur in SC: None.
The 30th second of the minute
<SC>30</SC>
SRC — the name of the input file whose execution created this batch log
SRC ::= #PCDATA
An SRC identifies the filename that contains the input BATCH whose execution created this batch log.
These elements contain SRC: [BATCHLOG]
The following elements occur in SRC: None.
The batch input file example_in.xml, from the same directory as the current working directory
example_in.xml
<SRC>example_in.xml</SRC>
STUDY — the DFdiscover study number
STUDY ::= #PCDATA
A STUDY identifies the study number to which the batch and the results apply.
STUDY
These elements contain STUDY: [BATCHLOG]
The following elements occur in STUDY: None.
Study 254
<STUDY>254</STUDY>
SUMMARY — container element for summary information elements
SUMMARY ::= (NSEC, NSM, NR, ND, NQ, NEQ, NMQ, NMX, NM, NODRF)
A SUMMARY is simply a container element for other elements containing specific summary information about the most recent execution of the batch.
SUMMARY
These elements contain SUMMARY: [BATCHLOG]
The following elements occur in SUMMARY: [NSEC, NSM, NR, ND, NQ, NEQ, NMQ, NMX, NM, NODRF].
The summary element and its children
<SUMMARY> <NSEC>12</NSEC> <NSM ok='0'/> <NR ok='2365'/> <ND apply='0' ok='1' notok='0' trunc='0'/> <NQ apply='0' ok='0' notok='0'/> <NEQ apply='0' ok='0' notok='0'/> <NMQ apply='0' ok='0' notok='0'/> <NMX apply='0' ok='0' notok='0'/> <NM apply='0' ok='210'/> <NODRF ok='210'/> </SUMMARY>
T — the text reason accompanying a data change
T ::= #PCDATA
A T element records the reason for change text supplied as an argument during the execution of the dfaddreason function. If the execution of dfaddreason returns an empty string, then DFbatch records an empty string for the T element. When DFbatch has finished processing the record and required reasons are still missing, the system inserts a default reason constructed from the batch name.
These elements contain T: [S]
The following elements occur in T: None.
Reason for change text
<T>data entry error made during first entry</T>
TM — timestamp information for a single execution of DFbatch
TM ::= (HR, MI, SC)
A TM contains time information. It is contained in an HN element that identifies the execution of the batch to which this time applies.
TM
All timestamps are reported by hours, minutes, and seconds.
These elements contain TM: [HN]
The following elements occur in TM: [HR, MI, SC].
A time entry for 10 minutes, 3 seconds after midnight
<TM><HR>00</HR><MI>10</MI><SC>03</SC></TM>
USER — the login name of the user that executed the batch
USER ::= #PCDATA
A USER identifies the login name of the user who executed the batch that created the results of which this element is a part.
USER
These elements contain USER: [BATCHLOG]
The following elements occur in USER: None.
Identification of the batch executed by user user1
user1
<USER>user1</USER>
V — variable identification
V ::= (E | M)*
A V element identifies the variable containing the edit checks to be executed, or the variable to which a query is being added.
If the element appears as a child of a Q element, it will not have any further descendants.
These elements contain V: [BATCHLOG, D, R, Q, S]
The following elements occur in V: [E, M].
A variable element for id9 and its children
id9
<V n='id9'><E w='pn' n='no_diabetes'><M fr='1' t='w'>Section 5.c questions should be blank subject is not diabetic. </M> </E></V>
YY — year information for a date stamp
YY ::= #PCDATA
A YY contains year information. The value is numeric and is always reported in 4 digit format that includes the century.
These elements contain YY: [DT]
The following elements occur in YY: None.
The year 2000
<YY>2000</YY>
XML is really a meta language - a language for creating new languages. The syntactic rules for each language are the same, but the vocabulary and semantics of each language can be quite different. In the case of DFbatch, we used XML to create two new languages: the input control file language, rooted at BATCHLIST, and the output log file language, rooted at BATCHLOG.
XML was chosen as the input and output language format for a variety of reasons, a few of which are listed below:
It is not coincidental that XML is so easy to use - it was designed this way. There are very few rules to live by when using XML. This makes it easy to implement lightweight programs (typically parsers) that enforce those rules. The programs never have to worry about exceptions to the rules - if there is an exception, it is no longer XML - it's as simple as that.
The following rules define what is and what is not an XML document:
An XML document is constructed from elements and attributes. Elements are delimited by tags, a start-tag and a matching end-tag. The content for the element, which may be character data, other elements, or a combination, is between the tags.
The value of an attribute must be enclosed in matching quotes, either single quotes, '', or double quotes, "".
An XML document must have exactly one root element.
An XML document must be well-formed. Each starting tag must be balanced by a closing tag, and pairs of tags must be nested properly.
Improperly nested tags
<bold>This text is bold and <italic>this is bold-italic </bold>, but what is this?</italic>
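For contrast, the same content with the tags properly nested, so that every element closes before its parent does:

```xml
<bold>This text is bold and <italic>this is bold-italic</italic>, and this is just bold again.</bold>
```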
The following reserved characters may not appear directly in an XML document: &, <, >, ", and '.
Comments are denoted in the SGML notation.
Comments
<!-- this is a comment -->
XML documents may optionally also be valid. A valid XML document, in addition to being well-formed, also adheres to the structure of a Document Type Definition (DTD).
XML has many companion technologies, some that apply business rules, and others that define schemas. By far though, most of the additional technologies are focused on transformation (converting one XML document into another XML document, or filtering parts of an XML document) and formatting (converting an XML document to HTML or PDF, for example). The following illustration is a simple view of how XML fits in with these other technologies.
Companions to XML
If you would like to learn more about XML, we recommend the following starting points for additional reading:
The XML Cover Pages
Cafe con Leche XML News and Resources
XML.ORG - The XML Industry Portal, hosted by OASIS
DFdiscover comes with approximately 40 standard reports that are useful for most clinical trials. There will always be a need for custom reports for almost every trial. You can create reports using any of the programming tools that you are familiar with (e.g. SAS®, a database report tool, UNIX shell scripts, etc.) and then install them in DFdiscover so that your DFdiscover users can execute them via Reports View in DFexplore. Also, don't hesitate to take a look at the standard DFdiscover reports included in your DFdiscover installation reports directory. Most are UNIX shell scripts, which can serve as models for creating your own study specific reports.
After a report has been created the easiest way to make it available to study staff is through DFexplore's Reports View. It can then be executed by permitted users by simply clicking the report name.
If reports are to be executed from DFexplore they must be installed in the study reports directory and have execute permissions for at least owner and group.
When you install a new report in the study reports directory, be sure to update the .info file kept in that same directory with a complete description of how the report works, including what options it accepts and what output it generates. Take a look at the existing .info file in your DFdiscover reports directory for examples of how on-line documentation should be formatted, but do not add your own documentation to that file: it is replaced with each new DFdiscover release, as new standard reports are added.
Reports executed from DFexplore are always provided with the study number as an argument, plus the options that the user has provided in DFexplore's Options field. Within a Bourne shell script, for example, the study number is always $1, so you can do things like:
Options
study=$1
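As a minimal sketch (the simulated argument values are illustrative, not produced by DFexplore here), a report script might separate the study number from the user's options like this:

```shell
#!/bin/sh
# Sketch: how a report might pick apart its arguments. DFexplore
# always passes the study number first, then whatever the user
# typed in the Options field. The set line simulates that call.
set -- 254 -v "0~99"   # stand-in for the arguments DFexplore supplies
study=$1               # study number is always $1
shift                  # everything remaining came from the Options field
options="$*"
echo "study=$study options=$options"
```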
Use the shell level program DFgetparam.rpc to obtain the values of configuration parameters from the study server rather than trying to parse the configuration file yourself, or even worse, hard coding in the values of study directories, etc. The following Bourne shell example shows how to use DFgetparam.rpc to get the study database directory, where the study number has already been set to $study:
db=`/opt/dfdiscover/bin/DFgetparam.rpc -s $study DATABASE_DIR`
Temporary files created by your report should be written to the study work directory. Using DFgetparam.rpc you can evaluate the value of the work directory as follows:
work=`/opt/dfdiscover/bin/DFgetparam.rpc -s $study WORKING_DIR`
and then use the temporary file in one of these ways:
$work/tmp
$work/$$tmp
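A minimal sketch of this pattern, with cleanup on exit; here a plain temporary directory stands in for the WORKING_DIR value returned by DFgetparam.rpc, and the file name is illustrative, so the sketch runs on its own:

```shell
#!/bin/sh
# Sketch: a per-process temporary file in the study work directory,
# removed when the report exits. $work would normally come from
# DFgetparam.rpc -s $study WORKING_DIR.
work=${TMPDIR:-/tmp}
tmp=$work/rpt$$                 # $$ gives a unique, per-process name
trap 'rm -f "$tmp"' 0 1 2 15    # clean up on exit, hangup, interrupt, term
echo "site|count" > "$tmp"
lines=$(wc -l < "$tmp")
echo "wrote $lines line to $tmp"
```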
HTML markup can be used to format report output directed to the reports view in DFexplore. HTML reports must start with one of the following tags: <HTML>,<!DOC>,<!doc>,<TABLE>,<table>,<P>,<p>.
NOTE: If DFexplore is run using the web-based DFnavigator, links are disabled. Page breaks are output when printing or saving report output to PDF.
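A sketch of a report emitting HTML output (the site counts are made up); DFexplore treats the output as HTML because the very first tag is one of the recognized starters, <TABLE> in this case:

```shell
#!/bin/sh
# Sketch: HTML-formatted report output. The first tag must be one
# of the recognized HTML starters for DFexplore to render it.
report=$(cat <<'EOF'
<TABLE border='1'>
<TR><TH>Site</TH><TH>Subjects</TH></TR>
<TR><TD>1</TD><TD>42</TD></TR>
<TR><TD>2</TD><TD>37</TD></TR>
</TABLE>
EOF
)
first=$(printf '%s\n' "$report" | head -n 1)
echo "$report"
```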
Many study reports are re-executed at regular intervals throughout the trial. Such routine reports can be created by putting all of the computational steps (data retrieval, analysis and report creation) into a UNIX shell script. This script can then be documented and installed in the study reports directory. Once this is done, and tested, it can then be executed by anyone with access to the reports tool, the study, and privileges for the particular report. Of course some analyses must be performed by a skilled statistician each time they are done, but most of the summary reports created for trial and database management purposes are routine tables and graphs. In such cases setting up a shell script, which includes all of the steps needed to create the report, can relieve programmers and data analysts from those tasks which are simple, repetitive and time consuming and allow them to concentrate on those analyses which require more attention and skill.
DFdiscover Study Files describes the location and format of all study files including: the data files created from the study CRFs, the quality control data base, the journal files used to record every transaction written to the study database, and all study configuration files. Any of these files might be needed as input to a study report program.
DFdiscover includes several shell level programs which are described in Shell Level Programs. These programs can be used to export and reformat data, and read study configuration parameters. Any number of other programming tools might also be incorporated into shell scripts to create study specific report programs. These might include other DFdiscover programs (e.g. DFsas), UNIX commands (e.g. awk, grep, sed, sort), third party packages (e.g. SAS), report generators (e.g. perl) or even low-level programming languages.
With these tools, programmers can extract the data they need from the study database, and then analyze and summarize it using whatever programming tools they are familiar with or think most suitable to the task at hand.
The following shows a simple example of a UNIX shell script that reports subject entry by site for the past 9 months, as well as the total to date, for an example study.
The basic outline of this program is documented with comments (which begin with the # symbol). This hopefully is enough to give you an idea of the level of effort required to produce simple study specific reports. For those unfamiliar with UNIX shell programming, this may whet your appetite for an exploration of the many useful tools built into the UNIX operating system. You can accomplish a great deal without invoking big systems like SAS, and it's really not that hard, honest!
#!/bin/sh
# RANDOMIZATIONS BY SITE FOR PAST 9 MONTHS AND TOTAL TO DATE
# $1 is study number.
# yy = current year
# mm = current month
# hx = number of months to be printed
yy=`date +"%y"`
mm=`date +"%m"`
hx=9
# Export all primary randomization records (plate 1) received to date
# and use field 13 as the visit date. Site number is assumed to be pid/1000
/opt/dfdiscover/bin/DFexport.rpc -s primary $1 1 - | nawk -F\| '
BEGIN {
    YY="'$yy'"+0; m=MM="'$mm'"+0; HX="'$hx'"+0
    # determine which months are to be printed
    if(m >= HX) {
        for(i=HX;i>=1;i--) {k[i]=YY*100+m;--m}
    } else {
        jn = HX - MM
        jm = 13 - jn
        jy = YY - 1
        for(i=1 ;i<=jn;i++) {k[i]=jy*100+jm; ++jm}
        for(jm=1;i<=HX;i++) {k[i]=YY*100+jm; ++jm}
    }
}
{
    # Read each exported record and count cases by month and center
    mon=substr($13,4,2); yr=substr($13,7,2)
    center=int($7/1000); yymm= yr*100 + mon
    n[yymm,center] +=1
    mtot[yymm] +=1
    ctot[center] +=1
}
END {
    # Print Report
    printf"RECENT AND TOTAL RANDOMIZATIONS BY SITE\n"
    printf"SITE "
    for(i=1;i<=HX;i++) printf"%6d",k[i]
    printf" TOTAL\n\n"
    for(i=1;i<=200;i++) {        # center number range is 1~200
        if( ctot[i]>0 ) {
            printf"%5d: ",i
            for(j=1;j<=HX;j++)
                if (n[k[j],i] > 0) printf"%6d",n[k[j],i]; else printf "      "
            printf" :%6d\n", ctot[i]
            SUM += ctot[i]
        }
    }
    printf"\nTOTAL: "
    for(j=1;j<=HX;j++)
        if (mtot[k[j]] > 0) printf "%6d",mtot[k[j]]; else printf "      "
    printf" : %5d\n", SUM
}'
When this UNIX shell script is installed in the study reports directory, and given execute permissions for the permitted members of the study group, it can be executed.
The documentation for study specific reports must be kept in a separate file from the documentation for all standard DFdiscover reports. Specifically, it must be kept in the .info file in the study reports directory.
Report titles appear in the scrolling list of reports in the order in which they are defined in the file. Any reports not defined in this file will appear in alphabetical order at the end of the list of reports that have been defined.
The general layout of the file is repeating blocks, of the following structure, where each block contains the documentation for one report:
.BEGIN report_name       (1)
.TITLE report_title      (2)
.OPTION first_option     (3)
.OPTION next_option      (4)
documentation text       (5)
.END                     (6)
.BEGIN
report_name
.OPTION
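Putting the pieces together, a minimal .info block for a report like the randomization script shown earlier might look like the following; the report name, title, and option are illustrative, not taken from a shipped report:

```
.BEGIN randomizations
.TITLE Recent and Total Randomizations by Site
.OPTION -m months
This report lists subject randomizations by site for each of the
most recent months, plus the total to date. The -m option sets how
many months are shown (default 9).
.END
```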
DFsas is an intermediary between DFdiscover and SAS®. It can be used to extract summary data files from the study database and to create the corresponding SAS® job file for subsequent processing by SAS®.
NOTE: SAS® version 7 assumed. Unless otherwise stated, the SAS® functionality described herein is based upon version 7.
DFsas reads specifications from a DFsas job file. As an introduction to DFsas consider the following example DFsas job file.
Typical DFsas job file
# GLOBAL SPECIFICATIONS
DFNUM 248
SASDIR /studies/df248/sas
SASJOB test
SASVERSION 7
CHECK labels
CHOICE codes
VLABEL yes
BLANK all asis
RETAIN QUOTES no
MISSING all asis
MISSING missed .L
NOVISIT .
RECSTATUS final incomplete missed
RECLEVEL all
MERGE yes
# DATA RETRIEVALS
# variables from plate 001, seq 001
RECORD 1 9 race labels 10~31
# variables from plate 002, seq 000 to 005
RECORD 2 0~5 9 vdate 11 SBP 12 DBP
# SAS PROCS
SAS
proc print;
Executing DFsas with the example job file results in the creation of 3 files: test.sas (the SAS® job file), and test.d01 and test.d02 (two data files, one for each of the 2 record types specified in the DFsas job file). The SAS® job and data files created by DFsas are plain ASCII files, which can be moved to any platform capable of running SAS®.
test.sas
test.d01
test.d02
From the above example, one can see that a DFsas job file has three parts:
The global specifications identify the study database, SAS® job name, and those specifications that are to be applied to all data fields. The global statements in the Typical DFsas job file include the following:
The DFNUM statement identifies the DFdiscover study number (248 in this example) for which the SAS® job is to be created.
DFNUM
The SASDIR statement identifies the full path name of the directory where the DFsas job file can be found and where the output SAS® job and data files are to be written.
SASDIR
The SASJOB statement identifies the name of the DFsas job file. This file includes all of the specifications needed to create the SAS® job and data files. This name also serves as the root name of the SAS® files to be created. In the example, test.sas will become the name of the SAS® job file, and test.d01 and test.d02 will become the names of the SAS® input data files.
SASJOB
The SASVERSION statement specifies the version of SAS® to be used with the SAS® job file created by DFsas. This statement is created by specifying the -SAS # option when generating the DFsas job file. If this option is not specified, the default of SAS® version 7 is used.
SASVERSION
-SAS #
The CHECK statement is used to specify the desired output for check fields (i.e. those data fields defined by a single box which is either checked or not). In our example we are requesting that the labels specified in the study schema be written to the data files for all check fields (instead of the numeric codes, 0 and 1). Alternatively, sublabels can be specified to output sublabels, if present. All check fields must have labels defined in the study schema. Note that local specification of codes at the variable level overrides specification of labels or sublabels at the global level.
CHECK
sublabels
codes
labels
NOTE: If a label is not provided in DFsetup by the user, DFsetup writes out the code as the label.
If labels are to be written, DFsas encloses the label in double quotes and removes any double quotes present in the label's text. Single quotes present in the label text are preserved.
The CHOICE statement is used to specify the desired output for choice fields (i.e. those fields like race, for which there are 2 or more response options). As with CHECK, the options are codes, labels or sublabels. In the example, codes have been requested. If labels are requested DFsas will read them from the study schema. DFsas will enclose the label in double quotes and remove any double quotes present in the label's text. Single quotes present in the label text are preserved. If no labels are found, DFsas will default to the numeric codes. If no sublabels are found, the label value is used. Note that local specification of codes at the variable level overrides specification of labels or sublabels at the global level.
CHOICE
The VLABEL statement, followed by yes indicates that SAS® variable labels are to be written to the SAS® job file. These labels are read from the description of each field found in the study schema.
VLABEL
When writing variable labels, DFsas encloses the label in double quotes and removes any double quotes present in the variable description text. Single quotes present in the variable label are preserved.
The BLANK statement is used to identify how blank fields are to be written to the data files. In the example, all blank fields will be written as they are (i.e. they will remain blank).
BLANK
The RETAIN QUOTES no statement instructs SAS to discard any quotation marks when reading string fields from the input data files. If no is changed to yes, quotation marks are preserved.
RETAIN QUOTES no
The MISSING statement is used to identify how fields that contain a DFdiscover missing code, are to be written to the data files. In the example MISSING all asis specifies that all missing codes are to be output as they are, (i.e. without any re-coding). The MISSING missed .L statement specifies that the missing value code .L is to be output in fields exported from missed data records.
MISSING
MISSING all asis
MISSING missed .L
.L
The NOVISIT statement is used to specify a value to be used for all variables for visits that do not exist in the database for a subject, when constructing data records containing variables that are repeated across visits. The SAS® . (dot) missing value code has been used in the example.
NOVISIT
The RECSTATUS command is used to select data records by record status. The default is to export all final, incomplete and missed data records.
RECSTATUS
The RECLEVEL command is used to select data records by workflow level. The default is to export records at all levels. Levels can be specified using a list of values and ranges, as in: RECLEVEL 3-5,7.
RECLEVEL
The MERGE statement followed by 'yes' specifies that the merge command needed to combine all of the data files into a single SAS® data set, should be included at the end of the SAS® job file. In cases where a more detailed merge command is needed, use MERGE no, and include a new merge specification at the top of the SAS® procedures section.
MERGE
MERGE no
This section identifies the data fields to be extracted from the study database. The RECORD statement identifies the plate from which data fields are to be extracted. The plate number may be followed by a visit or sequence number or range, to identify the repetitions of the plate which are to be included. If there is only one possible sequence number for a particular plate, the sequence number need not be specified.
RECORD
In the example, for each primary data record in the database with plate number 1, a summary data record will be written to test.d01. Each summary data record automatically includes the subject ID as the first data field. In the example, the subject ID is followed by data fields 9 through 31. The field numbers correspond to the numbering used in the study schema. The SAS® job file will automatically include the variable name ID for the subject ID. The variable name race will be used in the SAS® job file for field 9. Since variable names are not specified for fields 10 through 31, variable names for these fields will be taken from the study schema. If a variable's field name has been specified it will be used; otherwise the appropriate number of leading characters of the variable's alias will be used.
race
The variable race (a choice field) is also followed by the keyword labels. The result will be to output value labels (e.g. Caucasian, Asian, etc.) instead of codes for this particular choice field, thus overriding the global CHOICE codes specification. DFsas will enclose the labels in double quotes and remove any double quotes present in the text of the label. Single quotes present in the label's text will be preserved.
CHOICE codes
The retrieval specified for plate 2 will be written to file test.d02. It is similar, except that a range of visit numbers has been specified. One summary data record will be created for each subject who has at least one record in the specified set (i.e. plate 2, visit numbers 0 through 5 inclusive). In each summary data record, separate data fields are created for each of the specified sequence numbers. In retrievals of this type the sequence number is appended to each variable name when the SAS® job file is created. Thus the variable names written to the SAS® job file in the example will be: [ID, vdate0, SBP0, DBP0, vdate1, SBP1, DBP1, etc]. When variable names are not specified they will be taken from the study schema, and if a sequence number range is specified, the sequence number will be appended to each variable name. It is therefore important when specifying the variable field names to allow for the sequence number that will be appended when this type of retrieval is specified.
vdate0
SBP0
DBP0
vdate1
SBP1
DBP1
It is also possible to create a separate data record for each visit (or sequence) number that appears in the database. To do this, specify the plate number alone, without any qualifying visit or sequence numbers.
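For example, reusing the plate 2 fields from the job file shown earlier, but with the visit range omitted, the following RECORD statement would produce one output data record per visit found in the database:

```
# one summary data record per visit present for plate 2
RECORD 2 9 vdate 11 SBP 12 DBP
```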
DFsas also includes the ability to create normalized records for adverse events, medications, medical history items, etc. This is not shown in the preceding example but is illustrated later in this chapter.
DFsas will automatically write the commands required to merge the summary data files on subject ID to the SAS® job file (unless MERGE no is specified in the global specifications section). In addition, all lines at the end of a DFsas job file, following the SAS® statement, are written to the end of the SAS® job file. In the example, after the data is merged into a SAS® data set it will be printed using the SAS® print procedure.
After
% DFsas test
is executed to create the SAS® job file, test.sas, and the data files, test.d01 and test.d02, the job can be submitted to SAS®. If SAS® is installed on the same machine being used to run DFsas, SAS® can be run directly with the command:
% sas test.sas
SAS® will put the output from the print command in file test.lst, and will write job control and messages to file test.log.
test.lst
test.log
A DFsas job file can be created manually by using a text editor (e.g. vi) and following the syntax described in detail later in this chapter. But a quicker method is to first use the DFsas job creation option to build an initial DFsas job file for all variables in the database or all variables on specified plates, and then use an editor to make any desired modifications. Modifications might include:
changing global statement options
removing specifications for variables that are not required
changing variable names or labels
specifying the desired sequence number range for specified plates
indicating how a normalized data set is to be created
SAS® imposes various limits; those limits vary with the SAS® version and have an impact upon the behavior of DFsas. Specifically, there are character limits on variable names, variable descriptions, and the maximum length of text/string values. DFsas enforces the following maximums:
for SAS® version 6, and older:
variable name length = 8 characters
variable description length = 40 characters
text/string length = 200 characters
for SAS® version 7, and newer:
variable name length = 32 characters
variable description length = 256 characters
text/string length = 32767 characters
An initial DFsas job file is created by using the -c (or -C ) option. Option -C includes all variables while -c includes all variables except for the following meta variables:
DFRASTER (the DFdiscover CRF page identifier)
DFSTUDY (the DFdiscover study number)
DFPLATE (the DFdiscover plate number)
DFCREATE (the record creation date/time field)
DFMODIFY (the last date/time the record was modified)
DFSCREEN (screen value of the record status)
These fields are infrequently used in a SAS® analysis and can thus generally be omitted by using -c instead of -C when creating a DFsas job file.
Build a DFsas job file named job1 for all variables on all plates for study number 7.
job1
% DFsas job1 -C 7 -p all
The DFsas job file created by this command contains the field number, name and description of all data fields described in the study schema for all user defined study plates in the range 1-500, plus DFdiscover plates 510 (reasons) and 511 (queries). Users can then scan the list of variables and remove any that are not required.
To create a DFsas job file for specified plates, replace the keyword all with a list of the desired plate numbers, as in:
% DFsas job1 -c 7 -p 1~3 6 10~12 33
To split long string fields into a number of smaller fields use -s like this:
% DFsas job1 -C 7 -p all -s 200
This example splits all string fields that are defined with a maximum length greater than 200 characters into multiple fields of 200 characters each. For example a 500 character comment field named COMMENT would be split on word boundaries into 3 fields named COMMENT1, COMMENT2 and COMMENT3, each with a maximum of 200 characters. If -s is specified, DFsas uses the SAS® version number to decide how long it can make the new variable names that it needs to create. DFsas creates a new variable for each substring by using the variable name as a base and adding a number (1, 2, 3, etc.). If SAS® version 6 or earlier is being used and the variable name is already 8 characters long, DFsas will reduce it as needed to keep the resulting variable names within the 8-character limit. For example, splitting a field named LONGNAME produces fields named LONGNAM1, LONGNAM2, etc. If SAS® version 7 is specified, a limit of 32 characters will be imposed when creating new variable names for each substring.
COMMENT
COMMENT1
COMMENT2
COMMENT3
LONGNAME
LONGNAM1
LONGNAM2
NOTE: Warning for shortened names. A warning message is printed whenever DFsas has to shorten an existing variable name.
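The word-boundary splitting that -s performs can be imitated at the shell level with fold -s; this sketch uses a 20-character width so it stays short (DFsas itself would apply the -s value, e.g. 200, and name the resulting SAS® variables as described above):

```shell
#!/bin/sh
# Sketch: split a long string on word boundaries into pieces of at
# most 20 characters, analogous to what DFsas -s does (width shrunk
# for illustration; the sample text is made up).
text="the quick brown fox jumps over the lazy dog"
pieces=$(printf '%s\n' "$text" | fold -s -w 20)
# count the pieces and verify none exceeds the width
n=$(printf '%s\n' "$pieces" | wc -l)
over=$(printf '%s\n' "$pieces" | awk 'length($0) > 20 { c++ } END { print c + 0 }')
echo "$n pieces, $over over width"
```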
String splitting may be needed for SAS® versions 6 and older, as they will not read string fields that are longer than 200 characters. As described later, a DFsas job file may include instructions to split specified string fields into one or more shorter fields. When creating a default DFsas job file, using -c or -C, one can instruct DFsas to automatically include the specifications needed to split all string fields that are longer than a specified maximum length by also specifying -s.
NOTE: Warning for long fields. A warning message is printed if the DFsas job file specifies string fields that are longer than the SAS® maximum and have not been split into smaller strings. This is merely a warning, as DFsas can be used to assemble data sets for purposes other than SAS®.
To truncate long string fields, use -t as in:
% DFsas job1 -C 7 -p all -t 200
which truncates all string fields to a maximum of 200 characters.
Use -t to truncate string fields to a specified maximum character length. Truncation occurs at exactly the specified number of characters, even if this truncates the string in the middle of a word. Strings that are shorter are not affected.
To export dates in calendar, string, original or julian formats use the -d option. As described in “Qualified Dates”, DFsas can export each date field in one or more of 4 different formats. When creating a default DFsas job file, using -c or -C , you can automatically generate the specifications needed to have all dates formatted in one or more of these formats by using -d followed by one or more of the single letters (date qualifiers) csoj to specify the desired date formats.
csoj
For example, the following command will generate the DFsas specifications needed to convert all dates on plates 1-3 to both c (calendar) and s (string) formats.
% DFsas job1 -C 7 -p 1~3 -d cs
This results in the creation of 2 output fields for each date field. These fields are named by appending 1, 2, 3, etc. to the original field name. DFsas uses the SAS® version number to determine the maximum length allowed for these variable names (see Impact of SAS® limits). If the base name is too long to generate a legal SAS® name by appending a number (1-4 in the case of qualified dates), then the base name is shortened and a warning message similar to the following is printed:
WARNING: date datename -> datenam1 to datenam4 (SAS 8 char names)
To select subjects use -I as in:
% DFsas job1 -C 7 -p all -I 1001~1999,3001~3999,5066
Use -I to specify a list of subject IDs to be included in the SAS® data sets. In the above example, 2 subject ID ranges and one single value have been specified. If this option is not used all subjects with primary records for the specified plates will be included. Note that the specified ID string must not contain spaces.
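The -I list syntax shown above can be sketched in Python. This is an illustration of the range notation only, not DFsas source code; the function name is hypothetical.

```python
def parse_ids(spec):
    """Interpret a -I subject list such as "1001~1999,3001~3999,5066".
    Ranges use ~, items are comma separated, and no spaces are allowed."""
    selected = set()
    for part in spec.split(","):
        if "~" in part:
            lo, hi = map(int, part.split("~"))
            selected.update(range(lo, hi + 1))   # ranges are inclusive
        else:
            selected.add(int(part))
    return selected
```

With the example list above, IDs 1001-1999, 3001-3999, and the single value 5066 are selected; subjects outside those ranges are excluded.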
To select data records having a specified list of visit numbers, use -v as in:
% DFsas job3 -C 254 -p all -v 0~99
In the above example, a range of visit numbers (0-99) has been specified. If this option is not used, DFsas will create DFsas job files for all visits.
To suppress warning messages, use -w as in:
% DFsas job2 -C 254 -p all -w
After you have setup a DFsas job file, you can execute DFsas again, this time with no options, to create the SAS® job file and assemble the required summary data files. Variable name and label lengths created in the data files will be determined by the SAS® version number specified in the global SASVERSION statement in the DFsas job file (see Impact of SAS® limits).
Create SAS® job file and data files for DFsas job file job1
% DFsas job1
DFsas creates the SAS® job file (job1.sas) and the required data files (job1.d01, job1.d02, job1.d03, etc.). These are plain ASCII files that can be copied to the computer used to run SAS® and submitted for execution.
job1.sas
job1.d01
job1.d02
job1.d03
DFsas displays the number of subjects and number of records written to each data file, as it proceeds to build the data files.
1. RECORD 1 Subjects = 172 Records = 172
2. RECORD 2 Subjects = 154 Records = 154
3. RECORD 3 Subjects = 152 Records = 152
If a DFsas job file requests data from a plate that contains no data, DFsas will not create an output data file or SAS® input statements for that plate, unless the force option is used. The force option is invoked by including -f on the command line after the DFsas job file name.
% DFsas job1 -f
By default, the data export script created when building a DFsas job file is removed after it is executed. Including -dbx as a command line option preserves the data export script.
-dbx
% DFsas job1 -f -dbx -c 7 -p all
This script may be useful for debugging purposes but is not required for proper operation of DFsas or SAS®.
The default behavior of using field names will not create a valid SAS job file when used on a study that has more than one instance of a module, whether on the same or on different plates. Including -a in the command line creates job files using the field alias and will work as expected provided all field aliases are unique for all fields.
% DFsas job1 -a
DFsas performs limited checks on the integrity of the DFsas job file before proceeding to create the SAS® job file and data files. In general it is up to the user to make sure that SAS® rules for variable names, labels, missing value codes, etc. are followed. DFsas checks the following:
DFNUM Has a legal DFdiscover study number been specified?
SASDIR Does the SAS® directory exist?
SASJOB Has a SAS® job name been specified?
Reserved character, | Has the field delimiter been used inappropriately? DFsas checks for illegal use of | in the specification of fixed fields, default labels for check fields and recoded values for blanks and missing value codes.
Plate and variable numbers Have legal values been specified? DFsas checks the specified plates to verify that they have been defined for the study, and also verifies that the specified field numbers have been defined for each plate.
Field numbers do not contain extraneous characters (e.g. [12-, 12., 12a, a12, etc]).
12-
12.
12a
a12
Field number ranges are legal (e.g. 12~55, and not: 55~12, 12~15~55, etc.)
12~55
55~12
12~15~55
If a string split is specified, is the field a string?
No more than one field qualifier is used on any field.
If a date qualifier is specified, is the field a date?
Variable name and label length DFsas checks the length of all variable names and labels in the DFsas job file and prints error messages and stops if there are any which exceed the limits of the specified SAS® version.
If the DFsas job contains a request to split a specified data field, DFsas uses the SASVERSION statement to determine the maximum length allowed for the variable names that it creates (see Impact of SAS® limits).
Dates are governed by the global statements IMPUTE, INFORMAT and OPTIONS and by the date field qualifiers, jocs.
IMPUTE
INFORMAT
OPTIONS
jocs
The IMPUTE statement controls how date values for SAS® are imputed from DFdiscover partial dates.
IMPUTE no Date fields are exported exactly as they appear in the database, regardless of what imputation method has been defined.
IMPUTE no
IMPUTE yes All 2-digit years are converted to 4 digit years using the pivot year defined for each date. In addition the imputation method defined in the study schema for each date field is used to convert missing days and/or months to legal values. If the imputation method is set to none only 2 digit to 4 digit year conversions are performed. When imputing dates, any nonsensical dates, i.e. dates with impossible values for day, month or year, are converted to the default DFdiscover missing code, *. If other missing value codes have been defined for the study, any dates that use them will retain their original missing value code.
IMPUTE yes
The INFORMAT statement controls the type of SAS® informat statements DFsas generates.
informat
INFORMAT dates Create SAS® informat statements for date fields that SAS® recognizes. DFdiscover allows some date formats that SAS® does not recognize as dates, (e.g. Jan 25, 2001). When this global statement is used, date fields with formats that SAS® does not allow are converted to string fields and a corresponding character informat statement is created.
INFORMAT dates
INFORMAT all Convert all fields to type character, including dates.
INFORMAT all
INFORMAT all dates Use date formats for all date fields and convert all other fields to type character.
Finally, 4-digit year imputation from 2-digit years can be achieved through the SAS® OPTIONS YEARCUTOFF statement. The DFsas global statement:
INFORMAT all dates
OPTIONS YEARCUTOFF
OPTIONS YEARCUTOFF=yyyy
instructs DFsas to add a SAS® YEARCUTOFF statement to the job file, such that yyyy defines the beginning of a 100 year period for imputing the century in 2 digit dates.
yyyy
Using OPTIONS YEARCUTOFF
With the following DFsas global statement,
OPTIONS YEARCUTOFF=1940
years 40 to 99 will be interpreted as 1940 to 1999, and years 00 to 39 will be interpreted as 2000 to 2039.
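The 100-year window rule can be sketched in Python. This is an illustration of the documented YEARCUTOFF behavior, not DFsas or SAS® source code; the function name is hypothetical.

```python
def expand_year(yy, yearcutoff=1940):
    """Expand a 2-digit year using a YEARCUTOFF-style pivot: yearcutoff
    defines the first year of a 100-year window for interpretation."""
    century = yearcutoff - yearcutoff % 100   # e.g. 1900 for 1940
    year = century + yy
    if year < yearcutoff:                     # falls before the window,
        year += 100                           # so it belongs to the next century
    return year
```

With YEARCUTOFF=1940, year 40 expands to 1940 and year 39 expands to 2039, matching the interpretation described above.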
NOTE: One YEARCUTOFF statement per SAS® job file: SAS® allows only one YEARCUTOFF statement per job file, which can be a problem when trying to allow for both a birth date and a recent follow-up visit date. In such situations it may be preferable to let DFdiscover do the imputation by using the global DFsas statement IMPUTE yes.
The global statements already described set the desired format for all dates. This can be overridden at the individual field level by using one of the date qualifiers, jocs, after the date's field number, as in:
12:c
which will output field 12 as a calendar date. Qualified dates will be written with an INFORMAT statement appropriate to the qualified date format, and will override the global statement INFORMAT all, if present.
The date qualifiers are:
:j - julian value This option converts the date to a number representing the number of days since a date in 4712 BC. No informat statement is written to the SAS® job file as this is treated by SAS® as a number.
:o - original value This option turns off imputation to yield the value exactly as it appears in the database, overriding any global specification IMPUTE yes.
With this date qualifier a date informat statement is written to the SAS® job file using the original date format specified in the study schema. This option should only be used for fields that cannot contain a partial date. Otherwise SAS® will complain.
:c - calendar value This option converts 2 digit years to 4 digit years and performs imputation as specified in the study schema. If imputation leads to a nonsensical date it is converted to *, the DFdiscover default missing value code. When :c is specified the appropriate date informat statement is written to the SAS® job file overriding any global INFORMAT statement.
:s - string value Like the :o option, :s also turns off imputation to yield the value exactly as it appears in the database, but the field is considered to be a string and thus a character (not date) informat statement is written to the SAS® job file. This option allows one to output partial dates exactly as they appear in the database with a character informat that SAS® will accept.
:s
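The :j value described above is the Julian Day Number. A sketch of the conversion in Python (an illustration, not DFsas source; the constant is the standard offset between Python's proleptic Gregorian ordinal and the integer, noon-based Julian Day Number):

```python
from datetime import date

# Offset between date.toordinal() (days since 0001-01-01) and the
# integer Julian Day Number (days since a date in 4712 BC).
JDN_OFFSET = 1721425

def julian_day(d):
    """Return the Julian Day Number for a datetime.date."""
    return d.toordinal() + JDN_OFFSET
```

For example, 2000-01-01 has the well-known Julian Day Number 2451545.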
In addition to date qualifiers, DFsas includes time qualifiers that cause DFsas to generate appropriate SAS® time types. Time variables in DFdiscover are handled automatically; however, this qualifier must be applied manually to numeric or string fields that have the format nn:nn or nn:nn:nn.
nn:nn
nn:nn:nn
:t5 - SAS time type TIME5 This option indicates that the field is to be identified as the SAS® time type TIME5, represented as HH:MM. The appropriate SAS® informat statement is written to the SAS® job file for any fields so qualified.
:t5
TIME5
HH:MM
:t8 - SAS time type TIME8 This option indicates that such fields are to be identified as SAS time type TIME8, represented by HH:MM:SS. The appropriate SAS® informat statement is written to the SAS® job file for any fields so qualified.
:t8
TIME8
HH:MM:SS
When creating a default DFsas job file, with -c or -C , the following rules are applied:
The global statement INFORMAT dates is written to the DFsas job file.
If all dates in the study schema use the same pivot year, DFsas creates the following global statements:
OPTIONS YEARCUTOFF=yyyy
IMPUTE no
If all dates do not use the same pivot year a YEARCUTOFF statement cannot be generated because SAS® only allows one such statement. Instead, DFsas converts all dates to 4 digit years using the global statement IMPUTE yes.
If more than one output format is requested for date fields using the -d csoj option, DFsas creates a unique variable name for each of the output date variables by appending 1, 2, 3, 4 to the variable name. DFsas uses the SAS® version number to determine the maximum length allowed for these variable names.
This section includes a description of how global statements can be manipulated for string fields.
SAS® cannot input a string field that is longer than 200 characters. DFdiscover allows much longer string fields. Thus when preparing string fields for SAS® it is necessary to either truncate strings that are longer than 200 characters or split them into multiple shorter fields. This can be done within a DFsas job file using the same string splitting syntax used by DFexport.rpc.
String splitting
Field 44, a 500-character field named comment, is split into 5 fields of up to 100 characters each. The w qualifier specifies that splitting is to occur on word boundaries. The field names are created by appending 1,2,3, etc. to the root name provided in the DFsas job file. The resulting 5 fields are named comment1, comment2, comment3, comment4 and comment5. DFsas uses the SASVERSION statement in the DFsas job file to determine the maximum lengths allowed for the variable names it creates.
comment
comment1
comment2
comment3
comment4
comment5
44:5x100w comment "Physician comment field"
It is also possible to split string fields on exact character boundaries, by using the c qualifier.
44:5x100c comment "Physician comment field"
String fields may be truncated by specifying a split that does not add up to the original field length, as in:
44:2x200w
This creates 2 fields of not more than 200 characters each, with the division occurring on word boundaries. If it is necessary to truncate the second field, truncation will occur on a word boundary.
Similarly,
44:1x200c
truncates field 44 to not more than 200 characters. In this case truncation will occur on a character boundary and thus may occur in the middle of a word.
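The w (word boundary) splitting rule can be sketched in Python. This is an illustration of the described behavior only, not DFsas source code; the function name is hypothetical and oversize single words are not handled.

```python
def split_words(text, pieces, width):
    """Split text into up to `pieces` fields of at most `width`
    characters each, dividing only on word boundaries. Text beyond
    the last piece is truncated away, also on a word boundary."""
    out = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            current = candidate          # word still fits in this piece
        else:
            out.append(current)          # start the next piece
            current = word
            if len(out) == pieces:       # extra words beyond the last
                return out               # piece are dropped (truncation)
    out.append(current)
    out += [""] * (pieces - len(out))    # pad unused pieces with blanks
    return out[:pieces]
```

A 2x9 word split of "one two three four" yields "one two" and "three", with "four" truncated on a word boundary.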
When creating a normalized data set, field names are specified after the NORMALIZE key word. A separate name must be specified for each of the string fields that result from any string field splitting that is specified in the data record creation section that appears after the RECORDS keyword, as in this example:
NORMALIZE
RECORDS
NORMALIZE if(length(drug) > 0)
drug "Drug name"
why1 "Drug indication - part 1"
why2 "Drug indication - part 2"
RECORDS 200
20,25:2x200w
30,35:2x200w
40,45:2x200w
New fields can be created consisting of substrings of database fields. The specification is: field number : start_character x number_of_characters. Substrings can be extracted from any field type: strings, dates, or numbers. However, substrings are always extracted from the string value of the field, so be careful with numeric fields; unless they are zero padded you may not get the results you want. In the following example, variable sitenum is created from the first 3 digits of the subject ID field. This gives the desired results provided all IDs are zero padded, e.g. 001001, not 1001, for subject 1 at site 1.
7:1x3 sitenum "site number - 1st 3 digits of subject ID"
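The start x length rule, and the zero-padding caveat, can be sketched in Python (an illustration, not DFsas source; the function name is hypothetical):

```python
def substring(value, start, length):
    """Extract a substring per the field:startxlength specification.
    The substring is taken from the field's *string* value, and
    start is 1-based, as in the DFsas specification."""
    s = str(value)
    return s[start - 1:start - 1 + length]

# Zero padding the numeric subject ID is what makes 7:1x3 yield the
# site number: subject 1 at site 1 must be stored as "001001".
subject = str(1001).zfill(6)
```

With padding, the first 3 characters of "001001" give site "001"; without it, the first 3 characters of "1001" give the misleading value "100".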
When SAS® reads string fields from an input data file it normally removes any quotation marks it finds. This can be prevented by tagging string field names with ~$ instead of $ in SAS® input statements. (Aside: per communication from SAS® support, this trick only works when the DSD option is used in the INFILE statement; however, this has been standard practice for reading the pipe-delimited output files created by DFsas from the beginning.)
~$
$
By default DFsas tags string fields with $ and thus quotes are removed. You can instruct DFsas to tag string fields with ~$ and thus retain input quotes, in 2 ways: by including the global statement
RETAIN QUOTES yes
which is applied to all string fields, or by using the :q field level qualifier on those fields for which quotes are to be retained, as illustrated in the following example.
:q
RECORD 1
8 PINIT Subject Initials
9 AGE Age
10 SEX Sex
10:q PCOMP Subject complaint
There is one limitation. It is not possible to use more than one field level qualifier at a time. Thus you cannot instruct DFsas to both split a string field and retain quotes using field level qualifiers. The only way this can be accomplished is by using the global statement RETAIN QUOTES yes in combination with field level string splitting specifications. However, note that there is no guarantee that matching quotes will appear in each of the split string fields.
This section includes a detailed description of all DFsas commands. Each DFsas job file has three parts. Global specifications come first, followed by instructions for each of the summary data files to be created, and ending with any SAS® command lines you want to include in the SAS job file.
DFsas job files are constructed from records, where each record is composed of a keyword followed by value specifications for the keyword. Keywords and options must be typed exactly as illustrated in the examples. All keywords are in uppercase. Blank lines and extra spaces may be inserted anywhere to aid readability. Comments can be added on lines by starting them with a single octothorpe, #, followed by at least one space and then your comment.
The global specifications, located at the top of each DFsas job file, apply to all of the subsequent data retrieval specifications in the job file. They include the following:
DFNUM This keyword is followed by the DFdiscover study number of the database from which you plan to export a SAS® data set. For example:
DFNUM 248
SASJOB The file name specified after this keyword is the name of the DFsas job file and is also the root name for the SAS® job and summary data files created by DFsas.
The following sets the DFsas job name to job1:
SASJOB job1
The SAS® job file name is constructed by appending .sas to the specified job name, constructing a filename similar to job1.sas. A separate data file is created for each set of retrieval specifications introduced by the keyword RECORD or NORMALIZE. These summary data files are named by appending .d## (where ## is in the range 01 to 99, inclusive) to the job name (e.g. job1.d01, job1.d02). Summary data files are numbered in the order in which they appear in the DFsas job file.
.sas
.d##
SASVERSION When using the -C or -c option to create a default DFsas job file, the target SAS® version can be specified on the command line using the -SAS # option. SAS® version 7 is the default if -SAS # is not specified. Each DFsas job file contains a SASVERSION statement regardless of whether or not -SAS # is explicitly specified on the command line.
SASVERSION 7
SASDIR The SAS® job file and summary data files created by DFsas are written to the directory specified by the value of this keyword. A full path name must be given.
SASDIR /studies/mystudy/sas/baseline
RUNDIR If you plan to move the SAS® job and data files created by DFsas to another platform, such that the files will be processed by SAS® from a different file location than they were created by DFsas, use this keyword and specify the location where you intend to install the files. If a RUNDIR statement is present, DFsas will use this directory when creating filename statements in the SAS® job file. Without the RUNDIR statement DFsas uses SASDIR when creating filename statements. A full path name must be given.
RUNDIR C:\studies\mystudy\sas\baseline
LIBNAME DFsas includes support for SAS® libraries, and also allows users to create their own data set names for both the individual data files corresponding to DFdiscover plates, and for the merged data set that currently has the default name final. SAS® libraries are used to save a SAS® data set created during a SAS® run, and to use a previously saved SAS® data set. When libraries are not used, the SAS® data sets created during a SAS® run are temporary, and are removed when the run is completed.
SAS® libraries may be used by specifying the LIBNAME statement in the global specifications section of a DFsas job file as per the following example.
LIBNAME
SAS® data sets mylib1 and mylib2 are created by specifying the previously saved SAS® data sets of baselib and aelib, respectively, in a LIBNAME statement in the global section of a DFsas job file.
mylib1
mylib2
baselib
aelib
LIBNAME mylib1 'C:\studies\study254\sas\baselib'
LIBNAME mylib2 'C:\studies\study254\sas\aelib'
NOTE: DFsas will not create the LIBNAME directory if it does not exist. The user must create it.
Data set names may be specified for each input data file by including a DATA statement immediately following the RECORD statement used to introduce data set specifications for a plate in a DFsas job file, or immediately below a normalized data set.
DATA
The data set name mylib1.vitals is specified using a DATA statement following the RECORD statement for plate 5.
mylib1.vitals
RECORD 5
DATA mylib1.vitals
The data set name mylib2.meds is specified using a DATA statement following the NORMALIZE statement for normalized data set specifications.
mylib2.meds
NORMALIZE if( length(medname)>0 )
DATA mylib2.meds
To change the name of the final merged data set, from the current default value of final, add the desired name of the merged data set to the global MERGE statement.
The name mylib1.baseline replaces the default name of final using the MERGE statement in the global specifications section of a DFsas job file.
mylib1.baseline
MERGE yes mylib1.baseline
OPTIONS YEARCUTOFF=1940
OPTIONS PAGESIZE=60 LINESIZE=72 NOCENTER
IDNAME When DFsas is used to create a DFsas job file it determines the field variable name assigned to the subject ID field from the first plate in the schema and writes this out as the global IDNAME statement. The variable name can be changed to any desired value, keeping in mind the SAS® restrictions on variable names. If the IDNAME statement does not appear in a DFsas job file, DFsas uses the name ID.
SUBJECTS This statement can be used to specify a list of subject IDs and ID ranges to be included in the SAS® data sets. The default setting is
SUBJECTS all
which includes all subjects who have a primary data record in at least one of the plates being extracted from the study database.
Select a subset of subjects
The following statement selects subject IDs in the range 1001 to 1999, 3001 to 3999, and values 5501 and 6002. The order in which the values are specified is not important.
SUBJECTS 1001~1999,3001~3999,5501,6002
VISITS This option can be specified on the command line during SAS® job creation by using the option -v visit#_list, or added to the global specifications section by editing the DFsas job file. If this option is not specified, DFsas will create DFsas job files containing the global statement VISITS all. For backwards compatibility, DFsas will assume VISITS all if the VISITS global statement is missing from a DFsas job file.
visit#_list
VISITS all
VISITS
This option is similar to the existing SUBJECTS option in that both are applied during data export. Since data export occurs before data record processing, this option takes precedence over the visit specification on the RECORD statement. Thus, for example, specifying RECORDS 12 20, which indicates the creation of a data file for plate 12, visit 20, would produce no output at all if the global VISITS statement was not set to all or did not include 20. Also, if the global VISITS statement has already been used to select visit 20, the RECORDS statement would not need to include the visit specification. For example, RECORDS 12 would be adequate to produce the desired data file.
SUBJECTS
RECORDS 12 20
20
RECORDS 12
The specified visit number list is applied to all of the specified plates. It is not necessary that all visits be relevant for all plates.
NOTE: The VISITS option does not allow the specification of different visit numbers for different plates.
SUBJECTALIAS Specifying
SUBJECTALIAS yes
instructs DFsas to export the subject alias as the identifier in place of the traditional subject ID. It is possible to include both subject ID and subject alias by specifying
SUBJECTALIAS yes
and also requesting field 7 in the list of fields to be exported.
CHECK Check fields are data fields that are entered using a single box which is either checked or left blank. An example might be a question that asks the investigator “Check all of the symptoms that apply to this subject from the following list”. Each of the symptoms will have a check box with only 2 possible values, for example, 1 if the box was checked and 0 if it was left blank.
This keyword is used to control the output for all data fields of this type. You may either print the codes (e.g. 0,1) or replace them with the labels specified in the study schema.
The 3 possibilities are specified as follows:
CHECK codes
which outputs the codes (e.g. 0, 1) for all check fields and creates a proc format statement for the value labels, and
CHECK labels
which outputs the schema labels (e.g. no for 0, yes for 1).
CHECK sublabels
which outputs the schema sublabels (e.g. negative for 0, positive for 1).
When using the labels or sublabels option, missing value codes are preserved. If you request labels for all check fields, but have not specified any labels for a particular check field in the study schema, DFsas will use the codes. If you request sublabels and none are specified, the labels are used.
The default for this global specification is CHECK codes, meaning that the numeric codes found in the database will be output to the SAS® data file if this statement does not appear in your DFsas job file.
You can override the global specification of labels, sublabels or codes at the individual field level as described in Data Retrieval Specifications.
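The substitution rules for check fields can be sketched in Python. This is an illustration of the fallback behavior described above, not DFsas source code; names are hypothetical.

```python
def check_value(code, labels, missing_codes=("*",)):
    """Return the export value for a check field under CHECK labels:
    missing value codes are preserved untouched, and the code itself
    is used when the schema defines no label for it."""
    if code in missing_codes:       # missing codes pass through
        return code
    return labels.get(code, code)   # label if defined, else the code
```

For a field with labels {"0": "no", "1": "yes"}, a stored "1" exports as "yes", a stored "*" stays "*", and a field with no labels exports its codes unchanged.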
CHOICE Choice fields consist of a question followed by 2 or more mutually exclusive response options. Examples include: race, yes/no questions, severity mild/moderate/severe, etc. As for check fields, choice fields can be output as either codes, labels or sublabels,
CHOICE codes
CHOICE labels
CHOICE sublabels
When codes are output to the data file, DFsas creates a proc format statement for value labels and writes it to the SAS job file.
Because the choice data type will apply to fields having many different response options it doesn't make sense to have default labels. Thus if you request labels and DFsas cannot find them in the study schema it will simply output the codes. If you request sublabels and none are specified in the study schema, the labels are used. As for check fields, DFsas will first check for missing value codes before substituting value labels or sublabels for the codes stored in the database.
The default for this global specification is CHOICE codes, meaning that the numeric codes found in the database will be output to the SAS® data file if this statement does not appear in your DFsas job file.
You can override the global specification of labels, sublabels or codes at the individual field level as described in Data Retrieval Specifications.
IMPUTE Dates which use 2 digit years can be converted to 4 digit years, and missing day and/or month values can be imputed, with this statement, such that
IMPUTE no
disables imputation and outputs dates as they appear in the database, and
IMPUTE yes
enables imputation, imputing the day and month in partial dates and converting 2 digit years to 4 digits.
Imputation of day and month, and conversion of 2 digit years to 4 digit years is performed using the imputation method and pivot year set in the study schema.
If a date variable has its imputation method set to never, imputation by DFsas will only result in conversion of 2 digit years to 4 digit years.
never
If a date is nonsensical, (i.e. does not yield a true date even after application of the imputation method, if any) imputation will output the DFdiscover default missing value code, *.
If a date field is blank or contains a missing value code, as defined in the study missing values map, imputation has no effect, i.e. the field will be output as is, with a blank or missing value code.
NOTE: The use of imputation in a DFsas job file (via any of the ways that it can be done) can create missing value codes, and thus the RECODE statement should be used to change the DFdiscover * to either the SAS® . or some other user-defined missing value code that SAS® will accept. If the user has defined a missing map which contains only legal SAS® missing value codes, the RECODE statement can still be used for dates. For example, RECODE date *
RECODE
RECODE date *
NUMBER Numeric fields may be defined with labels, just like choice and check fields. In such cases you have the option of outputting the numeric values, as in
NUMBER codes
or, the labels defined for the values, as in
NUMBER labels
STRING It is rare to put value labels on string fields but DFdiscover does allow it. For example single letter entries in a string field may be used as codes for longer descriptions that are defined in value labels. In such cases you have the option of outputting the string values,
STRING codes
or the labels, as in:
STRING labels
If string splitting is used in conjunction with a request for labels and the resulting split string values are coded with labels, the labels will be output.
STRING coding and string splitting
Suppose you have a string field where you enter codes for problems that have occurred.
Further, the user enters any combination of the letters A-Z into the string field in the database. To export the labels, one could then include
22:12x1c prob labels
in a record specification, such that the string field is field 22 and has a maximum length of 12 characters. The record specification would produce 12 fields named prob1 to prob12, each containing the label corresponding to the letter, or the letter itself if no label is defined.
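The behavior described for the 12x1c split with labels can be sketched in Python (an illustration, not DFsas source; names are hypothetical):

```python
def explode_codes(value, labels, width=12):
    """Split a coded string into `width` single-character fields,
    outputting the label defined for each letter, or the letter
    itself when no label is defined. Unused positions are blank."""
    chars = list(value.ljust(width))   # pad to the field's full width
    return [labels.get(c, c).strip() for c in chars]
```

For a 4-character field containing "AB" where only A has the label "pain", the exported fields are "pain", "B", and two blanks.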
RETAIN QUOTES This statement determines how SAS® input statements are written for string fields. If
RETAIN QUOTES no
is specified (which is also the default setting), string field names are followed by a $ sign, and SAS® will remove any quotation marks found in these fields when the input data files are read. If
RETAIN QUOTES yes
is specified, string field names are followed by ~$, which instructs SAS® to retain quotation marks.
VLABEL Specifying
VLABEL yes
instructs DFsas to include variable labels in the SAS® job file. Variable labels are copied from the field descriptions included in the study schema file, with the exception that any double-quote characters appearing in the field description are removed from the variable label.
VALFMT If VALFMT yes is specified, equivalent proc format value label statements will be written to the SAS® job file for each variable that has code labels defined in its style or variable definition in the study schema. With value formats defined in the SAS® job file, users can elect to use the data values, as in the SAS® statement proc print, or use the value labels, as in the SAS® statement proc print label.
VALFMT yes
proc format
proc print
proc print label
DFsas creates value formats for all variables that have code labels, whether they are defined in the style or at the variable level. If code labels are defined in the style, the style name is used as the value format name in the SAS® job file, unless the style name would be an illegal SAS® name, in which case DFsas makes up a legal SAS® name. The style name must meet the SAS® restriction that it be no longer than 8 characters, start with a letter or underscore, thereafter contain characters that are letters, digits, or an underscore, and not end with a digit. If the code labels are defined at the variable level (instead of in a style), DFsas makes up a new SAS® value format name.
When DFsas creates a value format name, it does not check to see if the variable codes and labels exactly match another value format already in use. Thus if coding and labels are not defined in the style, it is possible that DFsas may create and use more value formats than are really required. If the same code labels are used by more than one variable, it is recommended that the codes and labels be defined in a style which the variables then use.
When creating new value format names, DFsas uses F####v where #### starts at 0001 and is incremented for each new value format name that DFsas creates as it reads through the list of study styles stored in the file DFschema.stl. v is the visit number which is appended to the variable's field name. There is one exception. For DFSTATUS and DFSCREEN variables, which are DFdiscover protected fields appearing on every plate, DFsas uses value format names DFSTATv and DFSCRNv respectively. Thus these 2 names should not be used for user-defined styles with code and label definitions of their own.
F####v
####
DFSTATv
DFSCRNv
For normalized records DFsas creates SAS® value formats based on the code labels defined in the study schema for the first data record defined in the block of normalized records. Typically all records in the normalized block have the same variables with the same codes (which is why they are being normalized) but DFsas does not check to make sure that this is the case.
NOTE: Value format and variable names: SAS® requires that value format names be different from variable names. DFdiscover does not impose this restriction and currently DFsas does not check for this possible SAS® error.
BLANK

Data fields which are blank (i.e. empty) in the database can be output as they are or can be recoded to some other value (e.g. the SAS® . missing code). Since check and choice fields contain a numeric code when no box has been selected, check and choice fields are never blank. Consequently the BLANK global specification only applies to string, date and numeric fields.
The BLANK keyword is followed by either the keyword all or by the data type from the list string, date, or int, which is to be recoded. This is then followed by either the keyword asis or by the value to be inserted for blank fields. For example,
BLANK all asis
outputs all blank fields without any recoding.
BLANK all .
recodes all blank fields to a period.
BLANK all blank
recodes all blank fields to the word blank.
BLANK string .
recodes blank string fields to a period.
BLANK int 0
recodes blank int fields to 0.
It is legal to include more than one BLANK specification, to specify different recoding instructions for string, date and int data types. Any such field type specifications will take precedence over a BLANK all specification.
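For example, a job file might combine a catch-all rule with type-specific overrides; the recoding values shown here are illustrative, not required:

```
BLANK all .
BLANK string NA
BLANK int 0
```

With these three statements, blank string fields become NA, blank int fields become 0, and blank date fields, having no type-specific rule, fall back to the BLANK all rule and become a period.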
RECODE

A RECODE statement can be used to recode all data fields of a given type. Since SAS® provides extensive recode capabilities, only minimal recoding has been implemented in DFsas. DFsas allows only one RECODE statement per data type, and only the replacement of one data value by another.
The RECODE keyword is followed by the data type, from the list check, choice, string, date, or int, which is to be recoded. This is followed by the value to be recoded and then the new value to be written to the SAS® data input files. For example:
RECODE choice 0 .
recodes 0 to a period in all choice fields, and
RECODE check 0 9
recodes 0 to 9 in all check fields.
NOTE: If you request output labels instead of codes for check and/or choice fields, and a label has been specified for zero for some or all check and/or choice fields, then the label will appear in the data field, and the above recode specification will have no effect. The desired recode can however be obtained when outputting labels, by specifying the label that is to be recoded instead of the numeric value (see Recode Processing Order).
MISSING

Data fields that contain a missing value code in the database can be output as they are or can be recoded to some other value. For example, while several different missing value codes may be used in a DFdiscover study, you might want to change all missing value codes to the SAS® . missing code for a particular SAS® analysis.
The MISSING keyword is followed by either the keyword all or by the data type, from the list check, choice, string, date, or int, which is to be recoded. This is followed by either the keyword asis or by the value to be used in place of all missing codes. For example:
MISSING all asis

outputs all missing codes without any recoding,
MISSING all .
recodes all missing codes for all fields to a dot,
MISSING all NA
recodes all missing codes for all fields to "NA", and
MISSING string .
recodes all missing codes in string fields to a dot.
It is legal to include more than one MISSING specification, to specify different recoding instructions for choice, check, string, date and int data types. However, if the all specification appears it takes precedence over all other specifications.
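For example, a hypothetical pair of type-specific statements might read:

```
MISSING string .
MISSING int 9999
```

Adding a MISSING all statement to such a job file would override both of these, since all takes precedence over the type-specific specifications.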
When creating new SAS® job files, a second MISSING statement is included by default. This statement MISSING missed allows you to specify the missing value code to be used in data fields for missed records. The keyword missed can be followed by a user-defined missing value code that will be inserted into each data field after field 7 (Subject ID) in each missed data record. If the global statement MISSING missed exists with no missing value code specification, all data fields following field 7 are left blank for missed records.
It is important to note that the MISSING all global statement described above will not insert missing value codes into fields of missed records. A separate MISSING missed statement must be specified in order to do this. However, if the MISSING missed statement specifies the DFdiscover missing value code (*) and a MISSING all code statement is used to recode DFdiscover missing value codes, then the MISSING all code takes precedence and this statement will apply to missed records as well.
NOTE: The MISSING missed global statement will have no effect on the output unless missed records are requested by the RECSTATUS global statement (See RECSTATUS). By default, the RECSTATUS statement includes final, incomplete and missed records. If missed records are requested but a MISSING missed statement has not been included, or has been included but with no missing value code specification (i.e. literally as MISSING missed), all data fields following field 7 will be left blank for missed records.
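A minimal sketch combining the two statements (the choice of the dot as the missed-record code is illustrative):

```
RECSTATUS final incomplete missed
MISSING missed .
```

This requests missed records and writes a dot into every data field after field 7 (Subject ID) in each missed data record.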
The MISSING statement can also be used to specify that a SAS® MISSING statement is to be included in the SAS® job file created by DFsas. A SAS® MISSING value statement is generated if the DFsas job file contains the following:
MISSING SAS missing_value_list_from_study_missing_value_map
For example the global statement:
MISSING SAS A B C
will generate the SAS® MISSING statement:
MISSING A B C;
to indicate that the values A, B and C represent missing values.
NOTE: A MISSING SAS statement is generated automatically when DFsas is executed with -c or -C, regardless of whether or not DFmissing_map contains legal SAS® missing value codes. If only legal SAS® codes have been used, the MISSING SAS statement should be removed, as SAS® expects this statement only for non-standard codes. The SAS® missing value statement is written at the top of each data step rather than just once at the top of the entire SAS® job file.
Generate a SAS® job file called testjob for plate 5 of study #254, view the global MISSING statements, and recode all illegal SAS® missing value codes.
% DFsas testjob -c 254 -P 5
After executing the above command, we see that the SAS® job file testjob contains the following MISSING statements.
MISSING all asis
MISSING SAS * _A _U _N
We see that * is used as a missing value code in DFdiscover. This is not legal in SAS® and thus needs to be changed. An appropriate change might be to modify the MISSING statements in the job file to appear as follows:
MISSING recode * _D
MISSING SAS _D _A _U _N
F

In addition to extracting data from the DFdiscover database you might also want to include fixed fields, i.e. data fields that contain the same value for all cases. An example might be a study protocol number or a study name that you want to be able to include in data listings. Another possibility is that you might plan to merge data sets from different studies at some point and want to have the protocol number included in the data set for subjects from each study.
Fixed fields can be defined globally (in which case they are added to all data records from all plates) or they may be included in the variable list following the RECORD statement used to begin data retrieval from a specified plate (as described in Data Retrieval Specifications). Fixed fields can be defined for both normalized or non-normalized data sets.
A fixed field is defined using a statement in the following format:
F fixed_value variable_name variable_label
Common uses of fixed fields
F 18345 SNUM "Study Number"
F 44.171 PNUMBER "Protocol Number"
F "Study 44-171" PNAME "Protocol Name"
F 09/15/97 STARTDT "Study Start Date"
F "150mg" STDOSE "Study Treatment Dose"
The first example is interpreted as an int field and the last 4 examples are all string (character) fields.
Exactly the same format is used to specify fixed fields, whether defining them in the global statements section of the DFsas job, or locally in the variable list following a RECORD statement for a particular plate. The only difference is that globally defined fixed fields are inserted at the beginning of each data record immediately after the subject ID, while locally defined fixed fields are inserted in the position in which they are defined within the variable list. A DFsas job may include both globally and locally defined fixed fields.
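A hypothetical job fragment illustrating both placements (the plate, field numbers and values are invented for this example):

```
F 18345 SNUM "Study Number"

RECORD 101 0~5
1 status
F "baseline" PHASE "Study Phase"
9 vdate
```

Here SNUM appears immediately after the subject ID in every data record from every plate, while PHASE appears between status and vdate in records from plate 101 only.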
There are 3 restrictions on fixed fields:
they can not contain single or double quotes
they can not contain the | or - characters,
and they can not begin with a minus/dash (-).
NOVISIT

By specifying visit numbers after the RECORD statement, it is possible to construct data records in which the data for repeating visits are laid out on a single summary data record for each subject (e.g. ID dbp0 dbp1 dbp2 dbp3 etc.). The NOVISIT keyword is used to specify a value to be inserted for variables that do not exist in the database for a subject, because that visit has not been completed or received. Whatever follows the NOVISIT keyword will be used exactly as typed. Do not include quotes, or the quotes will also be inserted. The only exception is the word blank, which is interpreted as requesting that the fields be left blank. A few examples follow.
NOVISIT .
Inserts the SAS® missing value code, i.e. the dot.
NOVISIT NA
Inserts the 2 letters NA.
NOVISIT blank
Leaves the fields blank.
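In context, a hypothetical retrieval combining NOVISIT with a visit list might look like:

```
NOVISIT .

RECORD 101 0~3
10 dbp
```

This produces variables dbp0 through dbp3 for each subject; for any of visits 0 to 3 not present in the database, a dot is written in the corresponding field.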
RECSTATUS

If the RECSTATUS statement is not used, DFsas retrieves final, incomplete and missed data records from the database. It does not retrieve pending records because this status is normally used to indicate that the data is not ready for statistical analysis, and in some cases might even indicate that one or more of the keys (subject ID, visit or plate) are uncertain. Missed records are included by default so that the data set includes all cases, as is typically desired for an intention-to-treat analysis. The MISSING missed statement is also included by default in new SAS® job files. This outputs blank fields for missed records. Alternatively a missing value code can be specified (See MISSING).
The RECSTATUS statement can include any combination of the following record status keywords: final, incomplete, pending and missed.
NOTE: While it is also possible to export 'secondary' records this is not generally useful, as the only meaningful data field on secondary records is the image ID.
The RECSTATUS statement applies only to data plates and not to queries. If you want to export data by query status, you must specify plate 511 when building an initial DFsas job file, as in
% DFsas job1 -c 7 -p 511
In the data retrieval specifications for job1, specify
RECORD 511 unresolved
RECORD 511 resolved
to export unresolved or resolved queries respectively.
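As an end-to-end sketch (the study number 7 and job file name qcjob are illustrative):

```
% DFsas qcjob -c 7 -p 511
```

then edit qcjob so that its data retrieval section begins with RECORD 511 unresolved (or RECORD 511 resolved), before running DFsas again with the edited job file.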
SETNAME

DFsas names the input data files which it creates by appending d01, d02, d03, etc. to your SAS® job file name (specified using the SASJOB command). It is possible instead to have DFsas append the plate number as the extender following the SAS® job name. This is done using the SETNAME statement, as follows:
SETNAME byplate
which results in the naming of data sets by plate number (not d01, d02, etc.). The ASCII data files will be named jobname.# and the SAS® data sets will be named data# where # is the plate number.
This might be helpful if you want to have a common data set name across different SAS® jobs or different studies, in situations where the same variables are always pulled from each plate, or the same plate numbers are used in different studies.
Obviously, this option can not be used if you build more than one data set from a single plate. In the case of normalized data sets, which involve the creation of normalized records from more than 1 plate, DFsas will use the plate number from the first RECORDS statement to name the data set when SETNAME byplate is specified.
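For example, assuming a SAS® job named test that retrieves data from plates 1 and 2, the job file statement:

```
SETNAME byplate
```

results in ASCII data files named test.1 and test.2 (rather than test.d01 and test.d02) and SAS® data sets named data1 and data2.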
INFORMAT

This statement controls the creation of SAS® informat statements. When a DFsas job file is created the following INFORMAT statement is automatically included:

INFORMAT dates
which instructs DFsas to create SAS® informat statements for date fields that SAS® recognizes. DFdiscover allows some date formats that the current version of SAS® does not recognize as dates (e.g. Jan 25, 2016). Any date fields whose format SAS® does not recognize are converted to string fields.
DFsas identifies string fields which are up to 8 characters long using the SAS® $ identifier in the data variable list, and generates informat statements for any string fields that have more than 8 characters.
DFsas converts fields which contain formatting characters to type string. Also if labels are requested for check or choice fields they too will be converted to SAS® string fields by DFsas, and if any of these labels is more than 8 characters long DFsas will automatically generate a SAS® informat statement for that field.
A global INFORMAT statement can be used to get DFsas to convert all variables to type string and generate the appropriate SAS® INFORMAT statements, as follows:

INFORMAT all
This converts all fields to type string including dates. To use date formats for dates and convert all other fields to type string use the following:
INFORMAT dates all
FORMAT

This statement is used to create SAS® output format statements for dates. The following example sets the output format for all dates to the SAS® date format DDMMYY10:
FORMAT dates DDMMYY10
If this option is used without specifying a SAS® date format, DFsas creates SAS® format statements which correspond to the format defined for each date in the study schema.
FIELDCODE

By default, for choice and check fields, code labels are taken from the variable style and written as SAS® format statements. Generally speaking, this is the desired behavior for such fields. There may, however, be situations where the code labels defined at the individual field level are preferred. The default behavior can then be overridden with this statement as a global specification.
FIELDCODE yes
With this specification, the code labels specified at the individual field level are written as SAS® format statements.
MERGE

In order to read all of the data files created by DFsas into a single SAS® data set it is necessary to include the appropriate SAS® commands after the data statements in the SAS® job file. Most of the time the commands that will be needed are something like the following:
data final;
merge test.d01 test.d02;
by ID;
In this example, a data set named final will be created by merging files test.d01 and test.d02 on the subject ID. The above is what is written to your SAS® job file by default, or when you explicitly ask for it by including:
MERGE yes
The summary data files are sorted on ID when they are created by DFsas and thus proc sort is not needed in SAS®.
Sometimes, for example when building normalized data sets, you may want to specify a different SAS® merge command. In such cases specify:

MERGE no
and then put the desired SAS® merge commands at the top of the SAS® procedures section of your DFsas job file.
DFsas includes several commands (labels, MISSING, BLANK, RECODE and NOVISIT) that can be used to recode data fields before they are written to the SAS® input data files. These commands are applied in the following order:
labels converts numeric values to labels in choice and check fields
MISSING converts different missing value codes to a single code
BLANK converts blank fields to a specified value
RECODE converts a single specified value to another specified value
NOVISIT the missing visit code is applied to all fields of visits not in the database
WARNING: Side effects of ordering: Watch out for possible unintended order effects. For example, if for a check field you use the coding 0="box not checked", 1="box checked", and your global statements include both CHECK labels and RECODE check 0 ., the RECODE statement will have no effect because zeros in check fields will already have been converted to the label "box not checked".
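When labels are being output, the recode must instead target the label text, along the lines of the following sketch (it assumes the coding above; the quoting shown is illustrative and the label must match the one defined in the study):

```
CHECK labels
RECODE check "box not checked" .
```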
After the global statements come the data retrieval statements, which identify the data fields to be included in your SAS® job. DFsas will create 1 or more data input files depending on the number of different plates from which data are to be retrieved. These data files are merged within SAS®, using the SAS® merge command, to create the final SAS® data set. The summary data files created by DFsas are named using the DFsas job name plus .d01, .d02, .d03, etc. as suffixes (e.g. test.d01, test.d02, etc.)
The definition of each data set begins with the keyword RECORD or NORMALIZE. The RECORD keyword is used when you want to pull specified data fields from a single plate, to create a single record for each subject (or for each subject/visit combination). It is followed by the plate number and optionally by record status criteria and/or either a single visit/sequence number or a list of visit/sequence numbers and ranges. This identifies the records in the study database from which variables are to be exported.
If more than one visit or sequence number is specified, the individual numbers or ranges must be separated by a space, and the 2 numbers making up a range must be separated by the range delimiter (- or ~), without any intervening spaces.
Sample data retrieval
This example specifies that the variables listed below the RECORD keyword (i.e. status and vdate) are to be pulled from plate 101. Further it specifies that these variables are to be pulled from several visit numbers 0 to 5, 9, 12, 21 to 25 and 99. All of these variables will be assembled in a single record for each subject. The variables for visit 0 will be written first, then the variables for visit 1, and so on. Be careful, you can end up with very long records this way.
RECORD 101 0~5 9 12 21~25 99
1 status
9 vdate
If visit or sequence numbers are not specified, DFsas will create a summary data record for each data record that appears in the database for the specified plate. Thus instead of all variables being laid out on a single record for each subject, each subject will have as many summary data records as they have visits for that particular plate in the study database. In some situations this may be what you want to happen. Or you might know that the plate only occurs at one visit. It's up to you to decide how you want the data organized when you import it into SAS®.
If the global RECSTATUS and RECLEVEL statements are used, only data records meeting the specified status and level criteria will be retrieved from the study database. It is also possible to specify status and level criteria on the RECORD statement line itself, in which case they override the global specifications. In the following example final and incomplete records at levels 3,4,5 and 7 and at visits 0,1,2,3,4, and 9 will be exported from plate 101.
RECORD 101 final incomplete level:3-5,7 0~5 9
NOTE: Status and level specifications must follow the plate number and come before the visit criteria (if any). Status and level specifications can appear in any order. The level specification must not contain spaces.
When exporting queries (plate 511) the record status keywords resolved and unresolved may be used to retrieve only those queries which have the desired resolution status:
resolved
unresolved
The variables to be retrieved from the specified plate are listed following the RECORD statement. Variables are identified by field number, as specified in the study schema. Each line, following the RECORD statement, can define either a single variable or a single contiguous range of variables. It is not legal to include a list of variables and ranges on a single line.
The variables to be retrieved may also be specified using the notation NF or NF-#. NF refers to the last field on the current plate, while NF-# refers to the field which is # fields before the last field.
NOTE: Use of the NF and NF-# notations are not permitted in normalization specifications. When specifying field mapping to a normalized data structure, the field numbers must be specified explicitly. Also, these notations may not be combined with any field qualifiers (:c, :j, :o, :s, etc.).
The following example shows legal syntax. Note that variables may be defined in any order.
RECORD 22 12~17 10 34~45 NF-1 NF
DFdiscover maintains a separate data file for each CRF plate. A listing of the study data dictionary for all plates, with data field numbers and variable names, can be displayed and printed by running DF_SSvars.
Do not specify the subject ID (always field number 7) as one of the variables to be retrieved. Since subject ID is needed to merge the summary data files into a SAS® data set this field is retrieved automatically and written as the first field of each summary data record. The variable name is set to the name specified in the study schema but may be changed by editing the global IDNAME statement. If the IDNAME statement is missing from a DFsas job file the subject ID is assigned variable name ID.
Normally variable names will be extracted from the study schema and written to the SAS® job file. This is what happens when you specify a field number without a variable name in a DFsas job file. However, you can override the schema names, and specify new ones in the DFsas job file, as we have done in the preceding example.
If a variable name is not specified in the DFsas job file, a name is constructed from the study schema as follows. If a field name exists it will be used, and if not, the required variable alias will be used. If SAS® version 6 is used and either of these names is longer than 8 characters, it will be truncated to 8 characters when it is written to the SAS® job file. If a sequence number range is specified, the sequence number will be appended to the variable names in that retrieval set. When DFsas is run with the existing DFsas job file to create SAS® job and data files, DFsas checks the length of all variable names in the DFsas job file and prints a warning if there are any which exceed the limit of the SAS® version specified.
All of these variables will be created for each subject record. If a subject does not have one or more of the specified visits in the database, the fields for missing visits will be written with the SAS® missing value code, i.e. ., a dot.
If a visit or sequence number is not specified, variable names will be used as they appear in the study schema, or as specified in the DFsas job file, without any appended visit or sequence numbers. If a particular plate is only completed at one visit there is really no need to specify a sequence number, but if one is specified, it will be appended to the variable names.
The first 6, and last 3, fields in all data records in a DFdiscover database are used for database management by DFdiscover, and are referred to by the same variable names in all DFdiscover studies. They are: DFSTATUS (data record status: final, incomplete, pending), DFVALID (data record validation/workflow level: 0~7), DFRASTER (the name of the file holding the image of the CRF page corresponding to the data record), DFSTUDY (the DFdiscover study number), DFPLATE (the data record plate number), DFSEQ (the data record sequence or visit number), DFSCREEN (data record status without consideration for primary/secondary), DFCREATE (the record creation date and time) and DFMODIFY (the record's last modification date and time). DFSEQ is assigned only if the sequence number is defined as being in the barcode; otherwise the variable name is user-defined. If you want to retrieve these variables from more than one data file you will have to give them unique variable names in your DFsas job file.
Date variables are specified as character string fields in the SAS® job file.
You can override the global specification to output codes, value labels or value sublabels for any field by including the keyword codes, labels or sublabels after the variable name. For example:
10 sex1 codes
10 sex2 labels
will output the variable sex (field 10) twice, sex1 will be numeric (e.g. 1,2) and sex2 will be text (e.g. male, female). Note that if you wish to specify codes or labels at the variable level, a variable name must be specified as illustrated above, i.e. 10 codes, would assign the variable name codes to field 10, probably not what you intended.
When DFsas is executed with the -c or -C options to create a DFsas job file, the data retrieval specifications for each plate includes one line for each variable with the database field number, variable name and variable label found in the study schema file. Both the variable name and variable label can be modified in the DFsas job file. Variable labels can be increased to the SAS® version's character maximum in the DFsas job file. When DFsas is executed to create the SAS® job and data files, it is the variable names and labels found in the DFsas job file that are used. DFsas reverts to the variable names and labels found in the schema if they are not specified in the DFsas job file. If the DFsas job file contains a variable label that is longer than the character maximum allowed by the SAS® version, the label is truncated to the maximum limit in the SAS® job file. Note that a variable label cannot be either "codes" or "labels", as these are key words used to determine whether the output data file is to be written with the numeric codes or text labels for specified variables.
It is not uncommon to see case report forms designed so that several medications, medical history items, adverse events, etc. can be recorded on the same page. Associated with each entry there are typically several data fields. For example, medications might include: drug name, indication, dose, start date, stop date, etc. These questions might be repeated several times on the same page so that several medications can be recorded in the same place in each subject's case record book. The same question blocks may also appear on optional continuation pages, or be used at both baseline and follow-up, and thus may appear on several different plates and at several different visits in the study database.
A normalized data set for such repeating question blocks, puts all of the data fields related to a single item (e.g. a medication, medical problem, or adverse event) on a single data record. Each subject might thus contribute none, one or more such records to the normalized data set, depending on the number of items (medications, medical problems, or adverse events) that were recorded for each subject.
When building a normalized data set, you usually only want to output records for those items that have data recorded. That is, you want to skip over empty question blocks. Also, you might want to specify additional retrieval criteria (e.g. build a data set consisting of only serious adverse events).
It is also possible that you might want to have a data field appear in the normalized records that does not actually appear in the study database. For example, if medical history items are described on the case report forms (e.g. hypertension, diabetes, etc.) but only appear as a yes/no choice field followed by other details on the case report forms, you would probably want to include a field in the normalized data records which specified the type of medical problem (e.g. "hbp" for hypertension, "dbm" for diabetes, etc.) being described by each data record.
Finally, in the SAS® data set you may want to merge a normalized data set with other subject data. For example you might want to combine adverse event data with demographic data (e.g. age, sex), initial diagnosis, etc.
DFsas includes the ability to create normalized data sets with all of these features. The syntax is illustrated in Creating a Normalized Data Set.
The third and final section of a DFsas job file begins with the keyword SAS on a line by itself. All lines following this keyword are written directly to the SAS® job file. This can be used to include SAS® commands following the data definition statements in the SAS® job file. This section is optional. Without it the SAS® job will end with the SAS® commands needed to merge the summary data files on subject ID.
Sometimes you will want to be able to create more than one record for each subject, because certain data can naturally be divided into repeating record blocks. Examples include CRFs in which several medications, adverse events, or medical problems are listed on a single CRF page. In such cases you may want to be able to create a separate data record for each medication, adverse event or medical problem.
The example that we will use shows how to create a normalized data set for medical problems reported at baseline. Our objective will be to create a record for each problem that exists (i.e. for each problem checked Yes). Also we will show how to merge the problem records with the subject’s age and sex, and then print them out, one problem on each line. The case report forms used to record this data might appear as illustrated in the Sample Case Report Form below.
Sample Case Report Form
Normalization code for Sample Case Report Form
MERGE no

NORMALIZE if(item==2) SORT 1:n 2
medprob "Medical Problem"
item tempvar
yrs "Medical Problem - duration (yrs)"
mon "Medical Problem - duration (mon)"
treat labels "Medical Problem - on treatment"
ok "Medical Problem - controlled or stopped"
rx "Medical Problem - treatment specify"

RECORDS 2
"hbp" 10~15
"dbm" 16~21
"smoker" 55 56 "." "." 57 "."

RECORDS 3
"afib" 41~46
"angina" 47~52

RECORD 1
12 sex labels
13 age

SAS
data final;
merge medhx.d01 (in=medprob) medhx.d02;
by ID;
if medprob;
proc print;
Although most specifications will probably be simpler than the example shown above, this example illustrates most of the DFsas normalization features. A detailed description follows below. In this example we will assume that the yes/no questions are coded 1=no, 2=yes, and that plates 1, 2 and 3 only occur at baseline.
In our example the only global specification shown is MERGE no. This will suppress the standard merge command so that it can be replaced by the command shown at the bottom under the SAS keyword. With the specified merge command only subjects who have one or more medical problems will be included in the normalized data set. If the usual merge command were used subjects with no medical problems who appeared in the baseline (age/sex) file, would appear as a single record in the merged data set, with SAS® missing value codes (i.e. the dot) for the medical problem variables.
In our example, each medical problem is defined by 7 data fields, named: [medprob, item, yrs, mon, treat, ok, rx]. Each of these 7 data fields is defined on a separate line following the NORMALIZE command. The normalized records to be created consist of only 6 of these fields.
The variable named item is classified as a tempvar. Tempvars are variables that are needed for case selection but which are not included in the normalized data records. Variable item is needed to select the problems to be printed but is of no use by itself. It is a choice field with 2 boxes, coded 1 for no (the problem does not exist) and 2 for yes (the problem exists). Since this field will equal 2 for all records that are written to the normalized data set, it would be an uninformative constant and can therefore be eliminated. It is legal to specify more than one tempvar within a set.
The type (choice, check, etc.) of each variable is determined from the definition in DFschema of those variables that make up the first normalized record. For this reason it is important to choose the first record carefully and to avoid using a record in which some of the fields are missing and therefore specified with a fixed value. The record that begins with "smoker" in the preceding example is like this and thus would be a poor choice for the first record.
If you cannot avoid fixed fields in the first record (e.g. medprob in the example), the variable type is determined by inspection of the fixed value. If this value is entirely numeric (i.e. composed of 1 or more digits, optionally including a decimal) it is defined as a number to SAS®; otherwise it is defined as a text field. When specifying fixed numeric fields, remember to enclose them in double or single quotes, otherwise they will be interpreted as database field numbers.
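A minimal sketch of this inference rule in Python (infer_sas_type is a hypothetical helper shown for illustration only, not part of DFsas):

```python
import re

def infer_sas_type(fixed_value):
    # DFsas rule as described above: one or more digits, optionally
    # including a decimal point, is a number; anything else is text.
    if re.fullmatch(r"\d+(\.\d*)?|\.\d+", fixed_value):
        return "number"
    return "text"

assert infer_sas_type("42") == "number"
assert infer_sas_type("3.14") == "number"
assert infer_sas_type("mmHg") == "text"
assert infer_sas_type(".M") == "text"  # a SAS missing code is ambiguous and reads as text
```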
When DFsas creates a normalized data set, it determines the field type (i.e., string, number, date) of each variable by examining the definition of each variable on the first normalization record specified. If a fixed string is specified, DFsas examines it to determine whether it is a string or a number. In some cases it may determine the type incorrectly. For example, if the first record contains a field that is defined with a SAS missing value code, e.g. ".M", there is no way for DFsas to know whether the field is a number or a character string. To eliminate this ambiguity, users may specify the string qualifiers :c (string) or :n (number) for fixed strings in the DFsas job file. String qualifiers should only be specified on the first normalization record, because this is the only one examined to determine the field type of each variable in the normalized data set. This is all that is needed because each field can only be of one type. For example
RECORDS 3 "bp" 9~13 "mmHg":c "wght" 14~18 "kg" "hght" 19~23 "cm" "pulse" 24~28 "beats per min" "other" 29~33 ""
String qualifiers are defined in the DFsas job file and consist of:
:c - character string. This option specifies that a fixed string is to be interpreted as a character field.
:n - number. This option specifies that a fixed string is to be interpreted as a numeric field.
There are several important points to bear in mind when using string qualifiers:
The qualifier is read only from the first normalized record and is ignored if present on any other normalized record.
Qualified strings may have embedded spaces, as for pulse in the example above.
Single or double quotes may be used to specify qualified strings, but quotes of any type are not allowed within a string, and the :q qualifier is not allowed. Qualified strings must have matching quotes.
Empty strings are allowed, and may be specified with either single or double quotes, as for other in the above example.
The if statement after the NORMALIZE command is optional. It instructs DFsas to print a data record if variable item equals 2 (i.e. the problem exists). The if statement uses awk syntax. Other examples of legal selection criteria for the above example would include:
medical problems among older men
if(item==2 && sex==1 && age>65)
untreated problems which are not resolved
if(item==2 && treat==1 && ok==1)
problems among women of any age or men over 65
if(item==2 && (sex==2 || age>65))
Another common use of the case selection specification is to restrict record selection to a desired set of visits (e.g. baseline only, follow-up only, etc.). This can be accomplished by including the visit number in the variable list (perhaps as a tempvar) and then using it in the case selection specification. For example:
medical problems at visit 0 (baseline)
if(item==2 && vnum==0)
medical problems from visits 1 through 5 inclusive
if(item==2 && vnum>=1 && vnum<=5)
Another common example will arise when normalizing medication records or adverse events. In such cases each question block typically begins with a string variable (e.g. a drug name or adverse event name). To select blocks in which something was specified use the awk length function as follows:
if(length(drugname)>0) # include if a drug name is specified
DFsas will also accept SELECT if(condition statement) on one or more separate lines following the NORMALIZE statement. If more than one SELECT statement is used, an OR is implied among them, i.e. a record only has to match one SELECT statement to be created. For example, the statement:
SELECT if(condition statement)
NORMALIZE if ( x==1 || y==2 || (z==3 && length(comment)>0) )
can also be written as:
NORMALIZE
SELECT if(x==1)
SELECT if(y==2)
SELECT if(z==3 && length(comment)>0)
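The equivalence between the single combined if and a series of SELECT statements can be sketched in Python (the record values are invented for illustration; DFsas evaluates the awk conditions itself):

```python
def combined(rec):
    # NORMALIZE if ( x==1 || y==2 || (z==3 && length(comment)>0) )
    return rec["x"] == 1 or rec["y"] == 2 or (rec["z"] == 3 and len(rec["comment"]) > 0)

# one predicate per SELECT statement; matching ANY of them keeps the record
selects = [
    lambda r: r["x"] == 1,
    lambda r: r["y"] == 2,
    lambda r: r["z"] == 3 and len(r["comment"]) > 0,
]

def via_selects(rec):
    return any(sel(rec) for sel in selects)

records = [
    {"x": 1, "y": 0, "z": 0, "comment": ""},
    {"x": 0, "y": 0, "z": 3, "comment": "see note"},
    {"x": 0, "y": 0, "z": 3, "comment": ""},  # z==3 but no comment: excluded
]
for rec in records:
    assert combined(rec) == via_selects(rec)
```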
NORMALIZE and SELECT must both appear in uppercase, if must be in lowercase, and the variable names must use case exactly as defined in the NORMALIZE variable definitions. SELECT if must be treated as a single phrase: if must immediately follow SELECT, with nothing except space(s) between them. For example,
SELECT {if(...)}
is not legal.
Remember that the variables used for case selection must appear in the list of normalized variables. However, if they are not wanted in the normalized data records they can be removed by defining them as tempvar, as was done for variable item in the example.
Variable names may be followed by the keyword labels or codes to override global specifications. This is illustrated by variable treat, for which labels are requested. For example, this variable might have labels no and yes, which will be written to the normalized data set in place of the codes 1 and 2.
Value labels are taken from the study schema. Since all records are assumed to follow the same format, it does not matter which of the records specified under the RECORDS keyword(s) is used to locate variable labels. DFsas uses the very first record specified.
The variable name statements all end with a variable description or label. This is needed because these variables do not exist in a single place in the database, and thus a unique variable description cannot be read from the study schema.
The RECORDS statement identifies the plate from which the variables will be extracted. In our example, there are 2 RECORDS statements, one for records to be created from plate 2 and one for records to be created from plate 3.
As previously described for RECORD statements, it is possible to specify record status retrieval criteria following the plate number on a RECORDS statement. For example:
RECORDS 1 final #only export final data records from plate 1
Each line following a RECORDS statement identifies the variables to be extracted from the specified plate to form a single normalized data record. Variables are specified as a space delimited list of single field numbers, field number ranges and/or quoted fixed string fields. The field numbers corresponding to the variables you want to select can be determined from DFsetup, or by running DF_SSvars or DF_SSschema from DFexplore in Reports View.
There must be a one-to-one correspondence between the variable names specified under the NORMALIZE statement and the data fields identified under the RECORDS statement. Note that the first data field, which maps to variable medprob, is a fixed string field: a constant which identifies the question on the study case report forms from which the problem record was created. This technique is also used to insert missing codes for 3 variables (mon, treat and rx) for history of smoking, which in our example inquires about duration in years (variable yrs) and whether the subject has stopped (variable ok), but not about duration in months or about treatment. Remember to enclose all fixed string fields within single or double quotes.
DFsas automatically sorts each normalized data set on subject ID; however, sometimes this is not enough. You may wish to sort the normalized set on some other variable(s) within subject ID, or even sort on some other variable ahead of subject ID.
You can specify your own sort order by including a SORT statement after the NORMALIZE statement as in:
SORT 1:n 2
This example statement will sort the normalized data set on the 1st field (ID) in numerical order, and then within ID will sort on the 2nd field (medprob) in ascending ASCII or character order. Note that field numbers correspond to the order in which the variables are defined in the normalized set. Remember that although subject ID is not included in the list it is always present as the first field.
Legal field sort qualifiers include:
:n Sort the field in ascending numeric order, as in:
2:n
which sorts on field 2 in ascending numeric order.
:r Sort the field in descending alphanumeric order, as in:
2:r
which sorts on field 2 in descending alphanumeric order.
:nr Sort the field in descending numeric order, as in:
2:nr
which sorts on field 2 in descending numeric order.
If a sort field is specified with no qualifier, sorting is done in ascending alphanumeric order (a.k.a. ASCII collating sequence).
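The effect of SORT 1:n 2 can be sketched in Python (the rows are invented; DFsas performs this sort itself, and field 1 is always the subject ID):

```python
rows = [
    ["10", "hbp"],
    ["2",  "dbm"],
    ["10", "angina"],
]

# SORT 1:n 2 - field 1 ascending numeric, then field 2 ascending ASCII.
# A plain string sort would wrongly place "10" before "2".
rows.sort(key=lambda r: (float(r[0]), r[1]))
print(rows)
```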
The normalized data file created by running DFsas with our example is illustrated below. The fields would be named: [ID, medprob, yrs, mon, treat, ok, rx] in the SAS® job file.
22001|dbm|30|0|yes|2|insulin
22001|hbp|22|0|yes|1|beta blockers
22006|smoker|50|.|.|1|.
22009|angina|1|6|no|1|
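Each normalized record is a pipe-delimited line; a minimal reader for the sample above (field names taken from this example) might look like:

```python
line = "22001|dbm|30|0|yes|2|insulin"
names = ["ID", "medprob", "yrs", "mon", "treat", "ok", "rx"]
record = dict(zip(names, line.split("|")))
# a trailing empty field (e.g. the blank rx in the last sample row)
# simply becomes an empty string after the split
print(record["medprob"], record["treat"])
```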
DFsqlload provides a convenient method for migrating data from the proprietary DFdiscover storage to storage in a relational database. Although this link is not maintained in real time, the relational side can be synchronized with the proprietary side whenever required. Data integrity is maintained in that only changes made to the data on the validated DFdiscover side are kept. These changes are reflected in the relational copy only after synchronizing with DFsqlload. DFsqlload can be scheduled using cron or run interactively as required. In this way, snapshots of the database at a known time can be generated and used for whatever purpose the user deems important, much in the same way as DFsas provides snapshots of the database in SAS format.
DFsqlload is only useful in situations where one of the following four database products is being used, or is planned for use, in a particular computing environment.
MySQL - a common, multi-platform, open source relational database product.
PostgreSQL - another multi-platform, open source relational database product that has more sophisticated features than MySQL, which may make it better suited for server environments.
Oracle - commercial relational database product.
Microsoft SQLServer - commercial relational database product.
All four products are network-aware, and provide native as well as JDBC and ODBC-based connectivity from client applications running on any platform.
DFsqlload is one of three data export utilities available in DFdiscover. DFsqlload provides an easy-to-use export facility for relational databases. DFexport.rpc is a flat-file data export utility, and DFsas provides DFdiscover export capability for the SAS environment.
All methods can be used at any time and produce a true equivalent to the data stored in DFdiscover at the time the export is performed. This chapter will focus on how this is accomplished with relational databases.
There are many applications written for relational databases for many purposes. The two main applications benefiting from DFsqlload are reporting and analysis. Writing custom reports for DFdiscover is possible, but can take effort compared to the many report-writing and spreadsheet applications with relational database connectivity on the market today. For Windows users there are Microsoft Office, Crystal Reports, and Visual Basic, to name just a few of the applications available. UNIX users have OpenOffice, many Java-based report writers, and Perl at their disposal. Mac users can in most cases pick the best from both worlds. There are hundreds of applications out there, some of which you may already be using.
DFdiscover is a validated system. The ways to get data into DFdiscover have been rigorously tested and proven accurate. Providing only one-way movement of data out of DFdiscover maintains the integrity of the DFdiscover database by limiting the ability of outside processes to compromise the validity of the system.
Plates: DFdiscover keeps all data from a given form type together as one record type or plate. There may be, for example, blocks of information on a particular plate that repeat and would be better modeled with separate tables. DFsqlload makes no attempt to do this. When DFsqlload is run, each plate becomes its own relational table.
Fields: DFdiscover plates are made up of fields, each field storing data according to the layout of the plate on the paper CRF form. When a DFdiscover study is set up, you will recall that each plate is imported into the setup tool as a PS or PDF representation of the printed page that will ultimately be used to collect data for the study you are designing. The same fields defined during DFdiscover setup are used to define column names for the relational tables DFsqlload will create. There are some differences in the rules for naming columns in relational tables, but DFsqlload handles these differences with its own set of naming rules. DFsqlload provides two options for the creation of tables: one is to export all data as if it were character string data; the other is to create columns of the same data type as the source fields.
Coding: DFdiscover has the ability to use multiple missing value codes for different purposes. Data values with corresponding labels may be used to represent different discrete conditions. Dates may be incomplete and, as such, have rules for imputation. None of these concepts is an integral part of relational databases, so each requires its own handling as it arises. DFsqlload provides options for passing both coded values and labels to the relational side.
Schemas and Tablespaces: DFdiscover keeps the information for any given study together under one study number. There is no sharing of data between studies, so in cases, for example, where two studies share the same sites, the sites database is duplicated for the two studies. Each study database is independent of the others. Some relational database products allow for multiple groups of potentially similar data in the same database through use of the concept of the database schema. DFsqlload is aware of these possibilities and supports multiple schemas if the relational database product supports them. Tablespaces provide for another layer of hierarchy in data modeling and are supported as well.
It is very simple to create a relational database equivalent to a DFdiscover study once a working relational database environment is established on your network. On the relational database side, you will need to know the following:
What relational database product am I using: MySQL, PostgreSQL, MS SQLServer or Oracle?
What is the name of the server hosting the relational database?
In the case of Oracle, what tablespace has been assigned to me for this purpose by my DBA?
In the case of Oracle, PostgreSQL and MS SQLServer, what database will be used to store my DFdiscover schemas?
What are my login credentials for the relational database server I am using?
With the answers to these questions, simply run DFsqlload as shown and a relational database equivalent to your DFdiscover study will be ready to use. Using Oracle (the default) as an example, the command would be:
% DFsqlload bluto:mystudies:foo.bar:olive:popeye 254
which will create a snapshot of study 254 in the mystudies database, in the foo schema with the tablespace name bar on the server bluto with user credentials username olive and password popeye.
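The colon-delimited first argument packs the connection details together; a sketch of how it decomposes (parse_target is a hypothetical helper, shown only to document the server:database:schema.tablespace:user:password form):

```python
def parse_target(arg):
    # server:database:schema.tablespace:user:password
    server, database, schema_ts, user, password = arg.split(":")
    schema, _, tablespace = schema_ts.partition(".")
    return {"server": server, "database": database, "schema": schema,
            "tablespace": tablespace, "user": user, "password": password}

target = parse_target("bluto:mystudies:foo.bar:olive:popeye")
print(target["server"], target["schema"], target["tablespace"])
```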
By default, all columns are typed as closely as possible to the data types used by DFdiscover during setup. All dates are imputed. No coded values are translated. No missing codes are translated. No optional tables are created. If these defaults are acceptable, then this is all you need to know.
The default settings for DFsqlload should be adequate for most users. However, if the purpose of using DFsqlload is to provide a relational database view of a study to a set of specialized users with different requirements, then you might want to get as much information as possible from the DFdiscover side to the relational database side. For this scenario, you will need to use some of the options DFsqlload offers.
Answers to the following questions will affect how DFsqlload is used. See also the reference page for DFsqlload elsewhere in this guide.
Q: What is the target database type?
Q: Is it important to have the data type of each column in my relational tables match as closely as possible the data types for each field in my DFdiscover plates?
Q: My DFdiscover setup makes extensive use of codes and value labels for these codes. How can I make this information accessible from relational tables?
Q: My DFdiscover setup includes subject aliases. How can I access this information from relational tables?
Q: In my DFdiscover setup, I have defined a number of missing value codes that are important for correct interpretation of the data. How do I make those codes available in relational tables?
Q: How are dates handled by DFsqlload?
Q: How do I find out if something went wrong?
Q: Can I add tables to my SQL database outside DFdiscover?
Q: What if the DFdiscover study schema is changed?
Q: What happens if I modify the definition of one of the SQL tables used by DFdiscover?
Q: How does DFsqlload update my SQL tables?
A: DFsqlload supports four popular SQL server platforms - Oracle, PostgreSQL, MS SQLServer and MySQL. Your target database will need to be one of these four types. Once this is known, DFsqlload is directed to create a set of relational tables for a given database type using the -flavor option. If the target database type is Oracle, then this option is not required as DFsqlload assumes Oracle by default. Otherwise,
Target is a PostgreSQL database - use -flavor postgresql
Target is a MySQL database - use -flavor mysql
Target is a MS SQLServer database - use -flavor mssql
Target is an Oracle database - omit this option, or use -flavor oracle
A: The default behavior of DFsqlload is to preserve data typing. The older version of DFsqlload shipped with DFdiscover 3.7 and 3.7.001 did not preserve data typing; all user fields were converted to type VARCHAR. If you want to retain this behavior, you will need to use the -type option; otherwise, applications written against tables created by those older versions of DFsqlload will probably not work. If you are writing new applications but want your old applications to work as well, DFsqlload can create both typed and untyped tables in your target relational database.
To preserve data typing - omit this option, or use -type typed
To provide both typed and untyped tables - use -type both
To provide just untyped tables - use -type untyped
A: In a relational database, reporting the label that corresponds to a code normally requires a join against a second table. Rather than create a separate table for every coded variable, DFsqlload offers two options which may be used together or separately depending on the specific requirements. The -coding option controls the inclusion of label data. By default, you get just the codes. You can create a column for the code and a column for the label corresponding to that code using the option -coding both. If all you want is the labels, use the option -coding label. The other way to get value/label pairs into your relational tables is to create the optional DFCODING table. This is done using the -table dfcoding option. This optional table contains all codes and labels for all DFdiscover fields with type CHOICE. See the DFsqlload reference page for more details.
A: Use the -table dfsubjectalias option to request creation of the optional DFSUBJECTALIAS table containing two columns, DFpid and DFalias. Thereafter it is an easy SQL join statement to include the subject alias together with the subject id.
The DFSUBJECTALIAS table is also created by default if subject aliases are defined when loading of all tables is requested.
A: How missing codes are handled depends upon the -type option. If the target tables are untyped, then by default, codes are output in the relational tables as is. To get the labels corresponding to a missing value code, use the option -missing label. If the target tables are typed, then all missing codes are converted to NULL and logged as such, regardless of how the -missing option is used. If the -table dfnullvalue option is used, a record of each substitution is created in the optional DFNULLVALUE table.
A: By default, DFsqlload creates date columns using the correct data type for the target system. A string representation of a date can be output in a separate column using the option -date both. If just the string representation is required, use the -date untyped option. Partial dates (i.e., dates where the day or month are missing) are imputed by default, according to the rules specified in the DFdiscover setup. If imputed dates are not desired, you can turn it off using the -noimpute option, in which case any partial dates will be converted to NULL.
A: DFsqlload writes extensive logging information. When DFsqlload encounters a problem, the problem is written to stderr by default, unless overridden with the -q option. In either case, problems are logged in the DFsqlload log file for a given run. If DFsqlload encounters problems with DFdiscover data, it replaces the problem data with a NULL value, writes the substitution to the log file and optionally creates a record for the substitution in the DFNULLVALUE table. The following identifies typical problems and how they are handled by DFsqlload.
Any field that is blank or contains only white space (space, tab) will be converted to a NULL. These substitutions are not logged.
All missing value codes are converted to NULL if the target tables are typed (the default). This is applied consistently, even if the missing value code happens to be legal for some field types.
Any value wider than the storage width defined for the field in the DFdiscover schema is converted to NULL.
If a field has a format defined in the DFdiscover schema, values are checked for adherence to the format and are converted to NULL if they do not conform.
Invalid dates are converted to NULL.
If date imputation is not defined in the DFdiscover schema, partial dates are converted to NULL. Any imputed dates that are not legal dates are also converted to NULL.
If a check or choice box contains an undefined code, it is converted to NULL.
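A simplified sketch of these substitution rules (the missing-code set and storage width here are invented for illustration; the real DFsqlload reads both from the study schema):

```python
MISSING_CODES = {".", ".M", "*"}  # assumed study-specific missing value codes

def to_sql_value(value, store_width=8):
    """Return None (SQL NULL) where DFsqlload would substitute, per the
    rules above: blank/whitespace-only values, missing value codes, and
    values wider than the field's storage width."""
    if value.strip() == "":
        return None
    if value in MISSING_CODES:
        return None
    if len(value) > store_width:
        return None
    return value

assert to_sql_value("   ") is None     # blank -> NULL (not logged)
assert to_sql_value(".M") is None      # missing code -> NULL (typed tables)
assert to_sql_value("x" * 20) is None  # over-width -> NULL
assert to_sql_value("120") == "120"
```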
If you use the -d drfname option, a .drf file will be created containing a reference to each DFdiscover record having one or more non-blank substitutions to NULL.
Complete records will be rejected if the following conditions are encountered.
the record does not contain the correct number of fields
any of the DFdiscover fields are blank or invalid
a record with the same keys has already been imported
These cases will also appear in the .drf file if the -d drfname option is used.
A: Yes. DFsqlload will ignore them.
A: DFsqlload recreates the tables it needs each time it is run. Be careful when changing DFsqlload program options from run to run. SQL tables created from a previous run are not recreated if they are not required by the current run.
A: DFsqlload will drop the table (and all of your changes) and recreate it from the current DFschema and data in the DFdiscover study database.
A: If there have been no changes to the data definitions, the data is dropped from the SQL table and reloaded from DFdiscover. This occurs even if there have been no data changes for that DFdiscover plate since the last time DFsqlload was run. If the -merge option is used, DFsqlload will determine the date and time of the last run, then extract any data changes from the data journal files using DFjournal. The extracted changes are then used to update the SQL tables.
DFdiscover software uses several third-party software components as part of its server side and/or client tools.
The copyright information for each is provided below. If you would like to receive the source code of these third-party components, please send your request to help@dfnetresearch.com.
Copyright© 1994-2011, OFFIS e.V. All rights reserved.
This software and supporting documentation were developed by
OFFIS e.V., R&D Division Health, Escherweg 2, 26121 Oldenburg, Germany
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of OFFIS nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Copyright© 2009-2014 Petri Lehtinen <petri@digip.org>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Copyright© 1991 Bell Communications Research, Inc. (Bellcore)
Permission to use, copy, modify, and distribute this material for any purpose and without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies, and that the name of Bellcore not be used in advertising or publicity pertaining to this material without the specific, prior written permission of an authorized representative of Bellcore. BELLCORE MAKES NO REPRESENTATIONS ABOUT THE ACCURACY OR SUITABILITY OF THIS MATERIAL FOR ANY PURPOSE. IT IS PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES.
Copyright© 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved. License to copy and use this software is granted provided that it is identified as the "RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing this software or this function. License is also granted to make and use derivative works provided that such works are identified as "derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing the derived work. RSA Data Security, Inc. makes no representations concerning either the merchantability of this software or the suitability of this software for any particular purpose. It is provided "as is" without express or implied warranty of any kind. These notices must be retained in any copies of any part of this documentation and/or software.
Copyright© 1993,1994 by Carnegie Mellon University All Rights Reserved.
Permission to use, copy, modify, distribute, and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Carnegie Mellon University not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. Carnegie Mellon University makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.
CARNEGIE MELLON UNIVERSITY DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Copyright© 1988-1997 Sam Leffler Copyright© 1991-1997 Silicon Graphics, Inc.
Permission to use, copy, modify, distribute, and sell this software and its documentation for any purpose is hereby granted without fee, provided that (i) the above copyright notices and this permission notice appear in all copies of the software and related documentation, and (ii) the names of Sam Leffler and Silicon Graphics may not be used in any advertising or publicity relating to the software without the specific, prior written permission of Sam Leffler and Silicon Graphics.
THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL SAM LEFFLER OR SILICON GRAPHICS BE LIABLE FOR ANY SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Portions© 1996-2019, PostgreSQL Global Development Group Portions© 1994, The Regents of the University of California
Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Copyright© 1998-2019 The OpenSSL Project. All rights reserved.
All advertising materials mentioning features or use of this software must display the following acknowledgment: "This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit." (https://www.openssl.org/)
The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact openssl-core@openssl.org.
Products derived from this software may not be called "OpenSSL" nor may "OpenSSL" appear in their names without prior written permission of the OpenSSL Project.
Redistributions of any form whatsoever must retain the following acknowledgment: "This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (https://www.openssl.org)"
THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
This product includes cryptographic software written by Eric Young (eay@cryptsoft.com). This product includes software written by Tim Hudson (tjh@cryptsoft.com).
Copyright© 1995-1998 Eric Young (eay@cryptsoft.com) All rights reserved.
This package is an SSL implementation written by Eric Young (eay@cryptsoft.com). The implementation was written so as to conform with Netscapes SSL.
This library is free for commercial and non-commercial use as long as the following conditions are aheared to. The following conditions apply to all code found in this distribution, be it the RC4, RSA, lhash, DES, etc., code; not just the SSL code. The SSL documentation included with this distribution is covered by the same copyright terms except that the holder is Tim Hudson (tjh@cryptsoft.com).
Copyright remains Eric Young's, and as such any Copyright notices in the code are not to be removed. If this package is used in a product, Eric Young should be given attribution as the author of the parts of the library used. This can be in the form of a textual message at program startup or in documentation (online or textual) provided with the package.
Redistributions of source code must retain the copyright notice, this list of conditions and the following disclaimer.
All advertising materials mentioning features or use of this software must display the following acknowledgement: "This product includes cryptographic software written by Eric Young (eay@cryptsoft.com)" The word "cryptographic" can be left out if the routines from the library being used are not cryptographic related :-).
If you include any Windows specific code (or a derivative thereof) from the apps directory (application code) you must include an acknowledgement: "This product includes software written by Tim Hudson (tjh@cryptsoft.com)"
THIS SOFTWARE IS PROVIDED BY ERIC YOUNG "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The licence and distribution terms for any publically available version or derivative of this code cannot be changed. i.e. this code cannot simply be copied and put under another distribution licence [including the GNU Public Licence.]
GNU GENERAL PUBLIC LICENSE Version 2, June 1991
https://www.gnu.org/licenses/gpl-2.0.html
Copyright© 1989, 1991 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:
You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.
If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.
If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
The files in the base, psi, lib, toolbin, examples, doc and man directories (folders) and any subdirectories (sub-folders) thereof are part of GPL Ghostscript.
The files in the Resource directory and any subdirectories thereof are also part of GPL Ghostscript, with the explicit exception of the files in the CMap subdirectory (except "Identity-UTF16-H", which is part of GPL Ghostscript). The CMap files are copyright Adobe Systems Incorporated and covered by a separate, GPL compatible license.
The files under the jpegxr directory and any subdirectories thereof are distributed under a no cost, open source license granted by the ITU/ISO/IEC but it is not GPL compatible - see jpegxr/COPYRIGHT.txt for details.
GPL Ghostscript is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
GPL Ghostscript is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program so you can know your rights and responsibilities. It should be in a file named doc/COPYING. If not, write to the
Free Software Foundation, Inc., 59 Temple Place Suite 330, Boston, MA, 02111-1307, USA.
GPL Ghostscript contains an implementation of techniques covered by US Patents 5,055,942 and 5,917,614, and corresponding international patents. These patents are licensed for use with GPL Ghostscript under the following grant:
Whereas, Raph Levien (hereinafter "Inventor") has obtained patent protection for related technology (hereinafter "Patented Technology"), Inventor wishes to aid the the GNU free software project in achieving its goals, and Inventor also wishes to increase public awareness of Patented Technology, Inventor hereby grants a fully paid up, nonexclusive, royalty free license to practice the patents listed below ("the Patents") if and only if practiced in conjunction with software distributed under the terms of any version of the GNU General Public License as published by the
Free Software Foundation, 59 Temple Place, Suite 330, Boston, MA 02111.
Inventor reserves all other rights, including without limitation, licensing for software not distributed under the GNU General Public License.
5,055,942 Photographic image reproduction device using digital halftoning to screen images allowing adjustable coarseness
5,917,614 Method and apparatus for error diffusion screening of images with improved smoothness in highlight and shadow regions
GNU LESSER GENERAL PUBLIC LICENSE Version 2.1, February 1999 https://www.gnu.org/licenses/lgpl-2.1.html
Copyright© 1991, 1999
Free Software Foundation, Inc. 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
[This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.]
Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things.
To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method:
we copyright the library, and
we offer you this license, which gives you legal permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others.
Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs.
When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library.
We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances.
For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system.
Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run.
This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you".
A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does.
You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library.
You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:
The modified work must itself be a software library.
You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change.
You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License.
If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.)
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library.
In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices.
Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of the Library into a program that is not a library.
You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange.
If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code.
A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself.
As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License.
Also, you must do one of these things:
Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.)
Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that
uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and
will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with.
Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution.
If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place.
Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute.
You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things:
Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above.
Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work.
You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it.
Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License.
If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.
The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation.
If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.
NO WARRANTY
BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
© Wang Bin wbsecg1@gmail.com Shanghai University->S3 Graphics->Deepin, Shanghai, China 2013-01-21
QtAV is free software licensed under the terms of the LGPL v2.1. The player example is licensed under the GPL v3. If you use QtAV or its constituent libraries, you must adhere to the terms of the applicable license.
Rather than repeat the text of the LGPL v2.1 here, refer to the original text in GNU LESSER GENERAL PUBLIC LICENSE, Version 2.1.
Most files in FFmpeg are under the GNU Lesser General Public License version 2.1 or later (LGPL v2.1+). Read the file `COPYING.LGPLv2.1` for details. Some other files have MIT/X11/BSD-style licenses. In combination, the LGPL v2.1+ applies to FFmpeg as a whole.
The MIT License (MIT) © 2013 Masayuki Tanaka
Copyright © 2010-2017 Mike Bostock. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the author nor the names of contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
MIT License
Copyright © 2018 Dominik Thalhammer
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The MIT License
Copyright © 2017-, https://github.com/j2doll/QXlsx