DFexport — Client-side, command-line interface for exporting data by plate, field or module; exporting change history; or exporting components of study definition
DFexport
[-S server]
[-U username]
[-C password]
[
[-Mname moduleName]
| [-Mnum moduleID]
| [-Pnum
[
[#]
| [qc]
| [reason]
| [new]
| [missed]
]
]
]
[-status #, #-#]
[-level #, #-#]
[-visit #, #-#]
[
[-site #, #-#]
| [-subject #, #-#]
| [-alias string]
]
[-plate #, #-#]
[-create list]
[-modify list]
[-resolve list]
[
[-Pattern string]
| [-pattern string]
]
[-expr string]
[
[-ALL criteria]
| [-ANY criteria]
]
[
[-Fnum #, #-#]
| [-Fname fieldname_list]
| [-Falias fieldalias_list]
]
[
[-joinALL plt#[moduleFields]...]
| [-joinANY plt#[moduleFields]...]
]
[-usealias]
[
[-c]
| [-j]
]
[-d]
[-p]
[-z]
[-h]
[-D]
[-H header_list]
[-L lostcode]
[
[-w]
| [-a]
]
[-o [
[outfile]
| [-]
]
]
[-e errlog]
{study#}
DFexport also has a descriptive mode for the output of schema information.
DFexport
[-S server]
[-U username]
[-C password]
[-listfields]
{
[
[-listmodules]
| [-listplates]
]
[
[plt#,plt#~plt#]
| [all]
]
}
{study#}
DFexport has a history mode for the output of change history related to a subject, visit, or plate.
DFexport
[-S server]
[-U username]
[-C password]
{-history}
{-subject #}
[-visit #]
[-plate #]
{study#}
DFexport exports data from an individual study database. Minimally, the user must specify the study database and also at least one of the plates or modules containing the data of interest. Data records are exported in ASCII plain text format to a specified file, or the command-line output.
DFexport and DFexport.rpc share many features yet they are also unique in several important ways.
DFexport is available as both a server-side and client-side application; DFexport.rpc is server-side only.
DFexport is able to export descriptive information from
a study schema with the -listfields,
-listmodules and -listplates options.
DFexport is able to export, with the -history option,
the history of changes for a subject, visit within subject, or plate within visit of a subject.
DFexport includes support for joining data fields from the same module definition that are instanced over one or more plates.
DFexport is able to export all records marked as missed in one command
execution using the -Pnum missed option.
In DFexport.rpc this would require looping through each of the
plates individually, selecting missed records at each step of the loop.
User permissions for selecting by validation level are enforced in DFexport.
Selecting queries and reasons by creation and modification date is supported in DFexport; in DFexport.rpc only data records may be selected in this way.
Many options are specified using the same notation as DFexport.rpc, while others are specified using unique notation.
It is possible for DFexport and DFexport.rpc to produce different results
for the same selection criteria. DFexport enforces user permissions
for the user specified with the
-U option or the
usernameDFUSER environment variable. DFexport.rpc uses the
UNIX login name of the command-line.
DFexport can export data in a plate-centric manner, like DFexport.rpc,
using -Pnum, or in a
module-centric manner that is unique to DFexport using -Mname or
-Mnum.
Many options are available to create data sets for a wide variety of needs.
To get the most out of DFexport it can be useful to approach it's use with the
following in mind:
Preparation. Access to exported data is available only to authenticated users with appropriate database permissions, Authentication and Database Permissions. Plate, module and field name information is available from DFexport itself, Schema Listings.
Where is the data? Specifying the source of the data, whether it is plate-based or module-based, is required, Data Source.
Do the data records need to be filtered? In many cases not all of the rows in a data set are of interest and need to be filtered further, Record Selection by Keys and Filters.
Data Fields. By default, all of the data fields in the specified data source will be exported. It is possible to specify a subset of the fields to export, Extracting/Combining/Concatenating Fields for Output. It is also possible to add field, module, plate, visit, image, site, and study metadata (including user-defined properties) to the export data, Including Metadata in Output.
Output Formatting. For export, the appearance of data fields can be modified without modification of the data source, Output Formatting.
Output Destination. Finally, the export data is written to a destination, Output Options.
Authentication. To authenticate, DFexport requires the username and password for connection to a specific database server. These may be supplied as:
command-line options, the user can specify
-S ,
servername-U and
username-C when the program is
run, or
password
environment variables, the user can set the variables
DFSERVER ,
servernameDFUSER and
usernameDFPASSWD before the
program is run.
password
Specification of command-line options takes priority if both command-line options and environment variables are supplied. Refer to User Credentials for more information.
Database Permissions. Subsequent to authentication with a specific database server, use of DFexport is allowed (or disallowed) by the Server - Export Data permission, or the intersection of both DFexplore - List view and DFexplore - Print/Save Data permissions. These permissions are granted to a role (and then user) by an administrator on a study-by-study basis (see System Administrator Guide, Server). An authenticated user with one of these role permissions will then be able to export data records from the set of site and subject IDs granted by their user permissions, which may further be limited by visit, plate and level restrictions associated with the user's role. DFexport provides no warning that some records cannot be exported due to permission restrictions as this would itself provide information about the existence of restricted data records. For roles without 'Show Hidden Fields' permission and fields with the Hidden or Masked property enabled, DFexport will output an empty (NULL) value when exporting such data fields, and it will also skip over reasons and queries attached to such fields.
This feature is unique to DFexport and is not available in DFexport.rpc. DFexport is able to output database schema information. The available information includes plate numbers and module names, and may optionally also include field number, name, alias and data type for data fields in the requested plates or modules. This descriptive information can be useful reference material when a user wishes to further refine an export.
To obtain a listing of all defined plate numbers, use
-listplates all.
[1]
To obtain a listing of all defined module names, use
-listmodules all.
In both cases:
the output is a space-delimited list of values (either plate numbers or module names) matching the requested criteria,
a subset can be specified by replacing
all with individual, lists, or ranges of plate numbers in
the format #, #-#.
Example 3.15. List all module names used in plates 100 through 200
%DFexport -listmodules 100-200 10Chemistry Eligibility Enrollment Header LabData PhysicalExam SocialImpact Urinalysis
The output is sorted by module name and each module is listed exactly once, even when instanced more than once. Modules that are not instanced on one of the selected plates are not included in the output.
Information about data fields in the specified plates or modules can be
requested by additionally including the
-listfields option.
This option may only be used in conjunction with the
-listmodules and -listplates options.
Example 3.16. List schema information for data fields of all plates
%DFexport -listfields -listplates all 253Plate 001: Blood Pressure Screening Visits Field Name Alias Type ----- ------------------------- ------------------------- ------- 1 DFSTATUS DFstatus1 Choice 2 DFVALID DFvalid1 Choice 3 DFRASTER DFraster1 String 4 DFSTUDY DFstudy1 Number 5 DFPLATE DFplate1 Number 6 DFSEQ DFseq1 Number 7 ID ID001 Number 8 PINIT PINIT001 String 9 AGE AGE Number 10 SEX SEX Choice 11 RACE RACE Choice 12 RACEOTH RACEOTH String 13 S1DATE S1DATE Date...
This is the beginning of the output - actual output would be much longer as the user has requested this information for all defined plates. Field information is included both for user-defined fields and the fixed DFdiscover fields.
Example 3.17. List schema information for data fields of all modules
%DFexport -listfields -listmodules all 253Module: BloodPressure (ID=5022): Blood Pressure Readings Field Name Alias Type Used in Plates ----- -------------------- -------------------- ------- ------------------ 1 DFSTATUS DFSTATUS Choice 2 DFVALID DFVALID Choice 3 DFRASTER DFRASTER String 4 DFSTUDY DFSTUDY Number 5 DFPLATE DFPLATE Number 6 DFSEQ DFSEQ Number 7 DFPID DFPID Number 8 DFMNAME DFMNAME String 9 DFMID DFMID Number 10 DFMREF DFMREF Number 11 SBP1 SystolicReading1 Number 1,5 12 SBP2 SystolicReading2 Number 1,5 13 SBP3 SystolicReading3 Number 1 14 DBP1 DiastolicReading1 Number 1,5 15 DBP2 DiastolicReading2 Number 1,5 16 DBP3 DiastolicReading3 Number 1 17 SDAT ReadingDate Date 1 18 BPARM BPwhichArm Choice 5...
This is the beginning of the output - actual output would be much longer as the user has requested this information for all defined modules.
The module listing provides the module name, BloodPressure
(useful in conjunction with -Mname),
the module number, 5022
(useful in conjunction with -Mnum) and the field numbers and
names defined in the module, DFSTATUS, ...
(useful in conjunction with -Fname or -Fnum).
This feature is unique to DFexport and is not available in DFexport.rpc.
DFexport is able to output the history of all changes made to the data for a specific subject,
visit or plate.
The available information includes when the change was made, who made the change, what was changed,
any reason associated with the change and any queries related to the data.
The output is always to a file, in Excel format, and hence option
-o outfile.xlsx is required.
To obtain a history listing use -history.
The history of changes is specific to a single subject, and includes all data,
query and reason changes for all data elements in records idenitfied by the subject ID.
The output can be further filtered by visit, -visit ,
and plate, #-plate .
#
DFexport is designed to export history of changes for exactly one subject. To export history for multiple subjects, DFaudittrace is available.
Specifying the data source for the records to export is required.
Data can be exported by plate number, using the -Pnum
option, or by module, using either the -Mname or
-Mnum option.
Within any of these data sources, the default behavior is to export all of the
defined fields. It is possible to further select the exported fields using
one of the -Fname, -Falias or
-Fnum options.
Field name, alias and number information can be obtained from a previous
invocation that includes the -listfields option.
-Pnum | Plate number. This is equivalent to the required plate number in DFexport.rpc. Exactly one plate number value may be specified. The value is the actual plate number or from this list of keywords:
If a plate selector is not given, the data source must be specified by module name or module number. |
-Mname or -Mnum | Module name or module number.
Export data from the specified module.
Exactly one module name or module number may be specified.
Schema listing options If a module selector is not given, the data source must be specified by plate number. |
There are many options available for selecting subsets of data records for export. Several of these options are similar or identical to their counterparts in DFexport.rpc. [2] In all cases, record selection and filtering is from the set of records allowed by the user's database permissions.
The available selection mechanisms and filters are:
-level | validation level. By default, records at all validation levels permitted by the user permissions are exported. This option further selects a subset of those records, namely those having a validation level that falls within the specified range. | ||||||||||||||||||||||||||||||||
-site | site number. This option selects records by site number, site numbers, site number range or any combination thereof. The site number of any data record is derived by dereferencing the subject ID in the study's sites database.
This option may not be combined with
| ||||||||||||||||||||||||||||||||
-subject | subject ID. This option selects records by subject ID, subject IDs, subject ID range or any combination thereof.
This option may not be combined with
| ||||||||||||||||||||||||||||||||
-alias | subject alias. This option selects records for a single subject alias.
This option may not be combined with
| ||||||||||||||||||||||||||||||||
-visit | visit/sequence number. This option selects records by visit number, visit numbers, visit number range or any combination thereof. | ||||||||||||||||||||||||||||||||
-status status | record status. Select records by status keyword or status
numeric value.
The recognized status keywords are
If plate 511 is exported, the available status keywords are the same but their meaning (as related to query status) is translated by the table: Table 3.7. Numeric value, record status keyword, query status and reason status equivalence
| ||||||||||||||||||||||||||||||||
-plate | plate number. Select from module instances, queries, new, missed or reason records by plate number. It is important to note that this is not the same as the required plate number selector in DFexport.rpc. In that environment, the plate number selects the data file source for the export. In DFexport, this option filters the data source which may be modules, complete data records, queries, reason, new or missed records. Example 3.18. Export all missed records in plates 1 through 10
Example 3.19. Export all ESIG modules in plates 100 through 110 and 200
The output, from study 123,
is written to the file | ||||||||||||||||||||||||||||||||
-create | creation date.
Select data, query or reason records by their creation date.
The keyword | ||||||||||||||||||||||||||||||||
-modify | modification date.
Select data, query or reason records by their last modification date.
The keyword | ||||||||||||||||||||||||||||||||
-resolve | resolution date.
Select query records by their resolution date.
Queries that are not yet resolved will not have a resolution date and
hence will always be excluded if this selection filter is used.
The keyword | ||||||||||||||||||||||||||||||||
-Pattern|pattern string |
Filter records to export only those that
match the specified pattern string.
For inclusion in the export, a fragment of the record must exactly match
the entire pattern string.
The | ||||||||||||||||||||||||||||||||
-expr | Include only records that match the expression. The expression has the same format and meaning as the expression builder in DFexplore. For a complete description of the available expression syntax refer to DFexplore User Guide, List ViewSearching Data Records SearchingDataRecordsSearching Data Records. | ||||||||||||||||||||||||||||||||
-ALL, -ANY |
Specify one or more, simplified filter criteria.
Each filter references fields from one plate and multiple filters may
be combined to include fields from other plates.
The result of these selection criteria is a set of subject IDs
matching all of the criteria ( Each criterion must be specified using the notation: where:
To specify multiple criteria use Example 3.20. Export eSignature records for eligible, african-american males
The
The fields used in the criteria are, by default, not part of the output, nor are they required to be. In fact, unless the criteria are all from the same source, this is not possible with this option. | ||||||||||||||||||||||||||||||||
Frequently, only a subset of the fields in the data source are of interest.
These can be selected using one of the -Fnum,
-Fname or -Falias options.
Special handling of data fields that are defined in a module which is then
instanced across more than one plate is required.
-Fnum .
Select by field number. From each exported record, select for output
only those fields requested by their number. Output fields can be
re-ordered by specifying the field numbers in the desired order.
Field numbers can be repeated.
It is possible to request field numbers using #, #-#NF notation
which selects by field number relative to the last field of the record.
Some examples of NF usage are:
2-NF. field numbers 2 to the last field inclusive
NF-1. the second last field
5-NF-1. fields 5 to the second last field, inclusive
NF-5-NF-1. fields fifth from the last to the second last field, inclusive
This option may not be combined with
-Fname or -Falias - doing so will cause
DFexport to exit with an error message.
-Falias ,
fieldname_list-Fname .
Select by field name or field alias.
From each exported record, select for output
only those fields requested by their name or alias. Output fields can be
re-ordered by specifying the field names or aliases in the desired order.
Field names/aliases can be repeated.
This option may not be combined with
fieldname_list-Fnum - doing so will cause DFexport to exit
with an error message.
Similarly, only one of -Falias or
-Fname may be specified - any other specification or
combination will cause DFexport to exit with an error message.
-joinANY, -joinALL.
The default behavior of DFexport is to export data from one
source, a plate or a module. However special handling is needed in the cases where a module
instance is distributed across multiple plates.
With the -joinALL and -joinANY options it
is possible to combine and export data from a data source that is defined across plates in one
invocation and into one output file.
The join keys (the "join key triple") are always the subject ID, the sequence number
and the module reference instance number and must match across data sources. These keys are
always exported as the first three fields in each output record when joining occurs.
Multiple data sources are specified using | as a delimiter.
The specification for a single data source uses the notation:
plate#[fieldlist]where:
plate# is a plate number from the set of
plate numbers defined for the study,
fieldlist is a list of one or more comma-separated
field numbers or names, enclosed with [ and ].
The -joinANY and -joinALL options specify the
resulting output behavior: with the -joinALL option, the
database must contain a data record for every data source; otherwise, no data
is output for that join key triple.
Conversely, specifying -joinANY will export data so long as
the join key triple appears in at least one data source.
Example 3.21. Re-construct the demographics module for output
The DEMOGRAPHICS module is instanced with fields on plates 1
and 2. Export the data, with a specific field ordering and some output formatting.
-Mname DEMOGRAPHICS -joinANY '1[PINIT, SEX] | 2[RACE, RACE:d, RACEOTH] | 1[AGE, DOB:c]'
Concatenation with +.
By default, each selected field is output with optional formatting (see the section called “Output Formatting” and is separated
from adjacent selected fields by the delimiter. Alternatively, selected fields can be concatenated, output with no delimiter at all.
To specify concatenation, use the special concatenation symbol, +, between fields to concatenate. For example,
to concatenate the site identifier and the subject identifier, use
DFSITE_ID+SUBJIDor to concatenate fields 9 and 12, use
9+12It is also possible to concatenate a range of fields. In this case, concatenation has precedence over field range so,
8+9-12or
8-11+12or
8+9+10+11+12are all equivalent. Note that there is no notation to concatenate a range of fields by specifying only the minimum and maximum field numbers; at least one of the numbers must be inidividually specified so that the concatenation operator can be used.
Study metadata is available for export as columns in each output record. Metadata is specified for inclusion in the output by inserting metadata keywords in the list of fields for export. In all cases, the exported value has the string data type.
The available metadata keywords are grouped by 6 categories: field, page, visit, image, site and study.
field.
These metadata keywords reference field definition properties of a data field.
For these keywords only, the keyword is appended to a field name to request a specific property of that specific data field.
For example, MHENDAT.DFVAR_PROMPT requests the DFVAR_PROMPT property (the field prompt from the
study definition) of the MHENDAT data field. User-defined
The field properties that are available are:
DFVAR_DESC: field description
DFVAR_TYPE: field type
DFVAR_PROMPT: field prompt
DFVAR_UNITS: field units
DFVAR_COMMENT: field comment
DFVAR_MODNUM: module instance number containing field (Module Instance in setup definition)
DFVAR_MODNAME: module name containing field (Module Name in setup definition)
DFVAR_MODDESC: module description containing field (Module Description in setup definition)
DFVAR_USER#: field user-defined property value, where # is 1-20. User-defined property tags may be used in place of the default name
module. These metadata keywords reference properties of the module instance.
DFMODULE_NAME: module name (Module Name in setup definition)
DFMODULE_DESC: module description (Module Description in setup definition)
DFMODULE_USER#: module user-defined property value, where # is 1-20. User-defined property tags may be used in place of the default name
plate.
Metadata keywords for a plate include DFPLATE_DESC (plate description) and DFPLATE_USER# (plate user-defined property value, where # is 1-20. User-defined property tags may be used in place of the default name).
page.
These metadata keywords reference page properties of the plate associated
with the current data record.
There is currently one page property available: DFPAGE_LABEL.
The value is the descriptive label of the page.
This label is taken from DFpage_map if there is a matching
entry for the combination of visit and plate (field number substitution
is also performed if requested with the # or
#:d notation); otherwise, it is the
plate description taken from the study database definition.
visit.
These metadata keywords reference visit properties of the visit associated with the current visit.
The current visit is determined by the visit number of the data record,
even if the visit number is not included in the export.
The visit properties that are available are: DFVISIT_DATE, DFVISIT_TYPE, DFVISIT_ACRONYM,
DFVISIT_LABEL,
DFVISIT_DUE,
DFVISIT_OVERDUE,
DFVISIT_REQUIREDPLATES,
DFVISIT_OPTIONALPLATES,
DFVISIT_ORDERPLATES and
DFVISIT_MISSEDPLATE.
The value of each property is detailed in
Study Setup User Guide, Visit Map.
image. Each data record has an image attribute, which will have a value if there is an image associated with the data record. For those records, the image properties that are available are:
DFIMAGE_ARRIVAL: arrival date/time of the image
DFIMAGE_FIRSTARRIVAL: arrival date/time of the image; if there were multiple images over time,
the chronologically first image
DFIMAGE_LASTARRIVAL: arrival date/time of the image; if there were multiple images over time,
the chronologically last (most recent) image
DFIMAGE_FORMAT: image format
DFIMAGE_SENDER: sender ID for the document that included this image
DFIMAGE_PAGES: number of pages in the document that included this image
site.
The site metadata associated with each data record can be determined by connecting the subject ID of the data record with the site
record that includes the subject ID in the list of subjects.
The site properies that are available are:
DFSITE_ID,
DFSITE_NAME,
DFSITE_CONTACT,
DFSITE_ADDRESS,
DFSITE_FAX,
DFSITE_PHONE,
DFSITE_INVESTIGATOR,
DFSITE_SUBJECTS,
DFSITE_TEST,
DFSITE_COUNTRY,
DFSITE_BEGINDATE,
DFSITE_ENDDATE,
DFSITE_ENROLL,
DFSITE_PROTOCOL1,
DFSITE_PROTOCOLDATE1,
DFSITE_PROTOCOL2,
DFSITE_PROTOCOLDATE2,
DFSITE_PROTOCOL3,
DFSITE_PROTOCOLDATE3,
DFSITE_PROTOCOL4,
DFSITE_PROTOCOLDATE4,
DFSITE_PROTOCOL5,
DFSITE_PROTOCOLDATE5 and
DFSITE_REPLYTO.
The value of each property is detailed in
Programmer Guide, DFcenters - sites database.
study.
The study metadata is static for the entire study, it is defined at the
study database level and does not change for different data record keys.
The study properies that are available are:
DFSTUDY_NUMBER (study number),
DFSTUDY_NAME (study name) and
DFSTUDY_YEAR (4-digit year of study start, defined by study admin
as a global setting). In addition,
DFSTUDY_USER# provides the study user-defined property value, where # is 1-20. User-defined property tags may be used in place of the default name.
Date Modifiers.
By default, date fields are output "as is" in their defined format.
The options -c and -j may be used to
cause imputation and to
alter/normalize that defined format for all date fields that are exported.
The -c (calendar format) option requests that any 2-digit year
be replaced with a 4-digit year and imputation be applied.
The -j (julian format) option requests that imputation
be applied and dates be exported as their equivalent julian value - this
is primarily useful for mathematical calculations and comparisons with
other date values.
On an individual field basis, it is also possible to override the
defined format with field level date modifiers.
By following the field number, name or alias specification with a
:c or :j modifier, it is possible
to export the same calendar or julian format result for a single field.
The :o modifier requests the "original" date value,
and can be useful where -c or -j
has been previously specified.
DateField and DateField:o
produce the same result.
Similarly, DateField:c also produces the same result if
the field's format already specifies 4-digit years and no imputation is needed.
Example 3.22. Calendar and Julian Date Modifiers
For the field named VisitDate, export the value in
default, original, calendar and julian formats.
-Fname "VisitDate, VisitDate:o VisitDate:c, VisitDate:j"
If VisitDate is defined with "imputation to the beginning"
and an instance of that field in a data record has the data value
16/01/00, this specification will produce
16/01/00|16/01/00|2016/01/01|2456963.
A field level date modifier cannot be applied to a non-date
field nor can it be applied to a range of field numbers, names or aliases;
it may only be applied to a single field number, name or alias at a time.
Label Decoding.
For any field that is defined with coding,
it is possible to output the decoded label for a field's value by
appending the :d modifier to the field number, name or
alias.
Any coded value that cannot be decoded is output as is.
The modifier cannot be applied to a field that has no coding nor can
it be applied to a range of field numbers, names or aliases;
it must be applied to a single field number, name or alias at a time.
Header Record.
Specifying -h forces the inclusion of a descriptive
record, the header, as the first row of output.
If -h is specified with -D,
the header contains each field's description.
Otherwise, if -h is specified with -Fname,
the header contains each field alias.
If -h is combined with -Fnum, or no
field selection criteria are needed:
the header contains each
field alias if the data source is a plate, as specified with
-Pnum
the header contains each field name if the data source is
a module, as specified with
-Mname or -Mnum.
During export it is possible to create one or more new data fields by
including options to string split
(see String Splitting),
extract substrings
(see Extracting Sub-Strings) or
include fixed, literal values
(see Literal, constant values).
If -h is specified then these new fields also need names
to appear in the header record. This is done with the
-H option.
This option specifies the variable names to
appear in the header record for any new fields.
If header_listheader_list contains fewer names than there
are new fields, the
remaining fields are assigned names identical to their original field
names.
String Splitting.
It is possible to split a string field into multiple, shorter string fields by
appending a :numxcharscw
modifier to a field number, name or alias selection.
The modifier includes a specification of the number of fields to
split the input string into (num), the
literal x as a separator,
the maximum width in characters of each output field
(chars),
and whether or not splitting should be done on character
(c) boundaries or word (w) boundaries.
If word boundaries are used, a word is
delimited by any combination of space or tab characters. Word splitting will
always split at the word boundary closest to but less than the maximum width.
If there is no word boundary in the substring, the string will be split at the
character boundary.
Example 3.24. Split a string field into 5 fields
This example splits the comments field into 5
200-character fields at character boundaries:
-Fname "comments:5x200c"
If a field to be split contains a
missing value code, the first output field will contain the same missing
value code and the remaining fields will be blank.
If a field is split to create additional fields and a header record is
requested with -h, field names for the new
fields are identical to the original field name unless
-H is also specified.
If -H is given,
the field names for the new fields
are assigned in order from the header_listheader_list.
Extracting Sub-Strings.
New fields can be created from substrings while exporting fields.
The modifier notation :xS.L indicates that a substring is
being extracted (x), starting at character position
S (the first position is 1)
and having length L characters.
The modifier can be applied to any field number, name or alias selection.
New fields can be created from substrings while exporting fields.
The modifier notation :xS.L indicates that a substring is
being extracted (x), starting at character position
S (the first position is 1)
and having length L characters.
The modifier can be applied to any field number, name or alias selection.
Substrings can be extracted from any field type: string, date, time
or numeric.
Substrings are extracted from the string value equivalent of the field.
Numeric field values are leading
zero-padded to their store width before substring extraction is performed.
[5]
Example 3.25. Digit extraction from subject ID
In some settings, the subject ID is a concatenation of site ID and a
numeric subject identifier.
In this example, the leading three digits are the site ID and the trailing
four digits. Given PID values of 50001,
60401 and 1232003, the extraction
would yield these results.
-Fname "PID:x1.3,PID:x4.4"005|0001 006|0401 123|2003
Literal, constant values.
A constant value may be inserted into the data export by including it in
one of the -Fnum, -Fname or
-Falias specifications.
The value must always be enclosed with single-quotes,
as in 'AB', to prevent any confusion with a matching
field name or alias.
To insert a blank field into the export, it must be
represented by 2 adjacent single quotes, specifically
''.
Example 3.26. Insert AD into the data export
-Falias "EnrollDate,'AD',SubjInit"2014/12/12|AD|STN 2013/06/15|AD|PRP 2015/03/12|AD|KKW
The constant value AD is inserted between the
EnrollDate and SubjInit fields
in every exported record.
Enclosed by single-quotes,
DFexport ignores the special meaning of any characters that
would otherwise be delimiters.
For example, 'a,b' is the constant value
a,b, not two separate constants.
If -h is present, the name used in the header record
is the value itself.
If -H is also present,
the name used in the header record for each constant value
will be the next value from header_listheader_list,
if one is available; otherwise, the value itself will again be used.
Example 3.27. Specify a field name for the header record
-h -H "When" -Falias "EnrollDate,'AD',SubjInit"EnrollDate|When|SubjInit 2014/12/12|AD|STN 2013/06/15|AD|PRP 2015/03/12|AD|KKW
Trailing Field Delimiter.
For backwards compatibility, exported data records can be terminated
by a trailing | if one is not already present.
This is specified with the -p option.
[6]
With the trailing delimiter, exported data records maintain the
original DFdiscover record format so that they can be, for example, easily
re-imported into DFdiscover in a subsequent step.
Missed Record Handling.
Missed records are include in the export if either
-status all or -status missed are
specified.
To export missed records, there are two output options:
Internal to DFdiscover, missed records are handled in a different structure
than other data records (see
plt###.dat - missed records and
Table 2.7, “Missed record field descriptions” for further information).
Exporting them in this structure, when intermingled with plate- or
module-based data records,
does not result in normalized records - most records have the data structure,
while some have the missed record structure.
The result is that these records can be difficult to work with post-export.
By specifying the -L
option, DFexport will export missed records by matching the structure of the
corresponding plate and inserting lostcodelostcode
into all of the user-defined data fields.
This results in exported records that are normalized and all share the same
structure.
Post-export, missed records can be identified by the missed
status in the status field, or the lostcode in
all of the user-defined data fields.
When missed records are exported along with one of the field selection options
-Fnum, -Fname, or -Falias,
the second output option is always used.
If the option -L has
not been specified, the standard DFdiscover missing
value code (lostcode*) is automatically inserted in
each user-defined field of the exported record.
The lostcode specified with
-L must be comprised of one or more alphanumeric characters.
It is not permissible to use control characters.
Also, the DFdiscover delimiter (|) may not be used, unless
data is also being exported in CSV format with the -z option.
Note that specifying the -L
option will not, by itself, cause missed records to be exported.
That is controlled entirely by the lostcode-status option.
The -L option only indicates that
the second, "normalized" output format is desired and is to be used with
the specified substitution code.
Output File.
By default, output from DFexport is written to standard output (the command-line output).
To write the output to a specific file, specify the filename with
-o .
The user must have filesystem permission to create or modify the file.
If outfileoutfile is specified as a relative pathname, the file is
created relative to the DFexport command invocation directory.[7]
It is also possible to be explicit that standard output is the output destination
by specifying -o - although this is not necessary.
Output Mode.
Options -w and -a control the output mode.
The output is written (-w) or appended (-a) to the output file.
This option is ignored if exported records are written to standard output rather than a file.
If the output file does not exist, it is first created.
If the output file already exists, it is either overwritten or appended to, depending upon the
output mode.
Error Output.
Re-direct any error messages to a specific file with the
-e option.
By default, error messages are written to standard error and hence would get
intermixed with data if the data is being exported to standard output.
errorfile
Comma-Separated Values (CSV) Format.
CSV is a popular format used for sharing data
records between different software programs.
By including the -z option, DFexport will generate output
that is compliant with this format.
The unique requirements of the format are:
Each field within a record is separated by a comma.
Leading and trailing spaces adjacent to comma separators are ignored.
Fields containing commas as part of their value are enclosed in double quotes.
Fields containing double quotes are enclosed within double quotes, and the embedded double quotes themselves are each represented by a pair of consecutive double quotes.
Fields with leading or trailing spaces are enclosed in double quotes.
The record delimiter is a newline character (this is also true of records exported in the default, non-CSV format).
Exported records, in CSV or traditional (non-CSV), format are always delimited by a newline character.
If the -h option is also specified, the CSV requirements are
applied to the header record and all of the values in the header.
DFexport exits with status 0 if the command was successful.
Exit status 1 indicates that an error occurred - errors are written to standard error
or the errlog file if -e was specified.
In the following examples, only the options unique to the example are specified. For clarity, required options such as authentication credentials, output destination and study number are omitted.
Example 3.28. Select and filter using data from two different plates
Export complete data records from plate 6.
Filter those records to include only instances where records with matching
keys on plate 1 have a date variable, S1DATE with
a value of 01/01/06.
-Pnum 6 -ANY "1:$(S1DATE) == 01/01/06"
Example 3.29. Module-based export
The structure of data exported from a module varies dependent upon whether all of the fields are instanced on a single plate, or on multiple plates.
Consider a study containing a module, BASE,
that collects baseline data in fields
HEIGHT, WEIGHT,
the data across two records as in:
-Mname BASE73|180|| ||120|70 65|145|| ||110|65
To generate the previous single record output requires a little more work, specifically:
-Mname BASE -joinALL "1[HEIGHT,WEIGHT] | 2[SYSTOLIC,DIASTOLIC]"73|180|120|70 65|145|110|65
Example 3.30. Export eSignature records with a specific name
The eSignature name is stored in the eSignature module.
Export records signed by 'Jack'.
Several specifications are possible as illustrated here.
-Mname eSignature -expr '$(SigName) == "Jack"'
This is the most specific filter - the signature name field must contain exactly "Jack" and nothing else.
-Mname eSignature -expr 'tolower($(SigName)) ~ "jack"'
This is a less specific filter - the signature name field can contain a case insensitive variation of "Jack" somewhere in the value. Note that this also selects values like "Jackson", "Alex Jack" and "Wojack".
-Mname eSignature -pattern "jack"
This is the least specific filter and selects each eSignature module record that contains a case insensitive match for "jack" anywhere in the record.
[1] This output format matches the output from DFlistplates.rpc.
[2] DFexport typically uses a word to specify an option while DFexport.rpc uses an option letter.
[3]
For backwards compatibility, the legacy status keywords
(clean, dirty, error, CLEAN, DIRTY, ERROR, missed, primary, secondary, all) are currently supported but will be removed in
a future release.
Do not rely upon the availability of the legacy status keywords.
[4] Matching occurs before any output formatting (such as variable decoding or string splitting) is applied.
[5] Choice and check field values are NOT zero-padded and hence substring extraction for those data types may yield unexpected or unreliable results.
[6] This option has no effect when applied to exported query or reason records.
[7] If DFexport is invoked from a cron facility, absolute pathnames are highly recommended.