DFexport.rpc — Export data records from one or multiple plates from a study data file
DFexport.rpc
[-A]
[
[-w]
| [-a]
]
[
[-c]
| [-j]
]
[-d]
[-e]
[-J]
[-m]
[-p]
[-q]
[-h]
[-k]
[-z]
[-s status_list]
[-v #, #-#]
[-n #, #-#]
[-I #, #-#]
[-V #, #-#]
[-C yy/mm/dd-yy/mm/dd]
[-M yy/mm/dd-yy/mm/dd]
[-L lostcode]
[
[-U fieldname_list]
| [-G fieldname_list]
]
[-f fieldnum_list]
[-H header_list]
{study}
{plate(s)}
{outfile}
DFexport.rpc can be used to export whole data records or selected fields, from specified plates of a specified study database. Data records are exported in ASCII text format. DFexport.rpc does not provide support for joining fields from different plates or studies. To use DFexport.rpc users must have 'Export Data' permission, which is granted by an administrator on a study-by-study basis (see System Administrator Guide, Server). Users with export permission will only receive records for which they have get permission, which may be restricted by level, site, subject, assessment and plate. DFexport.rpc provides no warning that some records cannot be exported as this would itself provide information about the existence of restricted data records.
DFexport.rpc may appear in a shell script used to create reports stored in
the study reports directory, to be run in DFexplore from the Reports View,
or in a shell script run using dfexecute in an edit check. In these cases
the permissions of the DFexplore user running the report or edit check
will be applied. Thus the behavior and output may differ depending on the
permissions of the user who runs the script.
If the same script is run from the UNIX shell, the permissions associated with the UNIX login name apply. Thus, if a user has different UNIX and DFexplore login accounts with different permissions, the user may get different results depending on whether they run the script in the UNIX shell or DFexplore environments.
-A | output the subject alias, if defined, in place of the subject ID. If there is no subject alias, output the subject ID. | |||
-w, -a | output mode.
The output is written ( | |||
-c, -j | default date format.
Without this option, date values are output in the format
specified by their variable definition, and for partial date values,
without any date imputation.
With this option, date values are
output in calendar ( | |||
-d | decode coded variables.
With this option, all requested variables that are coded
are output with their decoded labels.
The default behavior, without this option, is to output the code itself.
This option can be invoked at the individual variable level by appending the
| |||
-e | outfile extension. With this option, the output filename specified will have a text file extension (.txt) added to the end of the user defined outfile name. Outfile extension is ignored if the output file is standard output indicated by '-'. The outfile extension is also ignored if only a single plate is specified. In the case of a single plate, the outfile name will be exactly as specified.
If this option is used along with the csv ( | |||
-J | Outer join with requested fields that may not exist for current plate. With this option, if the user specifies one or more fields that do not exist for a plate that is requested, data for that plate will still be exported but non-existent fields will have a data field containing the missed subsitution code and a header containing "NOT_DEFINED". If this option is omitted, plates that do not contain one or more of the fields requested will not be exported. | |||
-m | match data record validation level. This option is only relevant when exporting metadata records: reasons (plate 510) and queries (plate 511); and is ignored when exporting data records. Meta-data records may have different validation levels from the underlying data records to which they are attached. When metadata records are exported using the -m option, the validation level of each exported metadata record is changed to match the validation level of the data record to which it is attached. | |||
-p | trailing pipe. For backwards compatibility, exported data
records can be terminated by a trailing
| |||
-q | quiet mode. This option instructs the program to execute in quiet mode, silencing all warning messages. The default, without this option, is to write warning messages to standard error. Warning messages are generated upon a request to export a field that does not exist in the record. | |||
-h | header. Include as the first record in the output file, a
delimited record of variable names for the columns in the data records. The
variable names are, by default, the alias variable names defined in the study
data dictionary combined with This option cannot be used with an export of the new record queue, plate 0, as the variable names are not fixed. It can however, be used with the export of query records (plate 511) and reason for data change records (plate 510). The headers for query and reason for data change records are defined in the study schema.
The trailing character is different in data records and queries.
Data records are terminated by a final
If multiple plates are requested and the | |||
-k | keys only. Output only the key fields for each data record. This
includes, in order: id, plate, visit, status, and validation level, as
| |||
-z | Comma Separated Variables (CSV) format. This option exports all records so that they are compliant with this popular format. If this option is used in conjunction with the file extension option, the extension added to the end of the outfile name is the csv extension (.csv). |
The following three options are required and must appear in order at the end of the option list:
study | the DFdiscover study number, from which data records are to be exported. |
plate(s) | plate numbers of the data files to export. A single plate can be specified or a list of plates can be specified to run in one invocation. If the same plate is listed more than once, the plate data is only exported once. The keyword 'all' can be used to export all existing plates within the specified study including the special reserved plates. Special reserved plate numbers include: 0 for new records (received but not yet reviewed), 501 for returned Query reports, 510 for reason for data change records, and 511 for queries. If plate 511 is exported, the validation level of all exported records is updated to reflect the current validation level of the record that the notes are attached to. |
outfile | output file that the exported records are written to. The user
executing DFexport.rpc must have permission to create or modify
this file.
In the case of multiple plates being exported, the outfile name is used as
the base filename from which a unique filename is constructed by appending
each three digit, zero-padded plate number.
If the output file is given as |
The following options allow specific records to be selected from the output. With no options specified, all records are output. With options specified, only records matching the selection criteria are output. If multiple options are specified, only those records that match all criteria are output.
-s status | record status. The recognized status keywords consist of the
legacy terminology:
If plate 511 is exported, the available status keywords are the same but their meaning (as related to query status) is translated by the table: Table 3.8. Record and query status equivalence
| ||||||||||||||||||||||||
-v | validation level. By default, records at all validation levels are exported, independent of the maximum validation level defined for the user. Specifying this argument causes only those records that match a validation level or are within a validation level to be exported. | ||||||||||||||||||||||||
-I | subject ID.
If this option is specified, only those
records which have a subject ID matching the selection criteria are
output. It is not valid to select records using | ||||||||||||||||||||||||
-n | site ID. This option allows selection by site ID, site
range or any combination thereof.
However, it is not valid to specify | ||||||||||||||||||||||||
-V | visit/sequence number. Specifying this option selects records to output by the visit number. Only those records matching the visit selection criteria are output. | ||||||||||||||||||||||||
-C | creation date. If this option is specified, only those records which have a creation date matching the selection criteria are output. Note that the selection criteria apply only to the creation date of data records and not queries. | ||||||||||||||||||||||||
-M | modification date. If this option is specified, only those records which have a modification date matching the selection criteria are output. Note that the selection criteria apply only to the modification date of data records and not queries.
For both | ||||||||||||||||||||||||
-L lostcode | missing value code for missed records.
Missed records are exported if the keywords 'missed',
'lost', or 'all' are specified
with the status option,
The only exception to this is when missed records are requested along with
one of the field selection options
(
The
Note that specifying the | ||||||||||||||||||||||||
![]() | Argument parsing |
|---|---|
|
Argument parsing in DFexport.rpc ignores extra, repeated delimiters. Any combination of space and comma is collapsed to one delimiter. For example: DFRASTER,,,DFVALID = DFRASTER,DFVALID DFRASTER DFVALID = DFRASTER,DFVALID DFRASTER ,, ,,DFVALID = DFRASTER,DFVALID ,,DFRASTER ,, ,,DFVALID = DFRASTER,DFVALID are equivalent. |
The following options allow fields (variables) to be selected from the output.
With no options specified, all fields are output, unless -k
has already been specified. With options specified, only the
requested fields are output. If multiple options are specified, all of the
requested fields are output, in the order requested.
-f | field number. Specifying this option selects for output from each record only those fields requested by their number. Output fields can be re-ordered by specifying the field numbers in the desired order. Fields can be repeated within the option. This option cannot be repeated in the command-line.
Command-line referencing of field numbers using
|
-U fieldname_list, -G fieldname_list | field name. Specifying this option selects for output from each
record only those fields requested by their field name.
If |
-H header_list | field name headers. This option specifies the variable names to
appear in the header ( |
Wherever a field number or field name is expected in field selection
criteria, a constant value may also be referenced by inserting it
in single quotes, as in 'AB'.
The purpose of this is to insert the constant value in the specified
field location into the output records.
The 'value' notation
allows constant values to be exported in much the same way that a variable
value can be.
Since it is possible for a constant value and variable name to be
the same, a constant value must always use the 'value' notation.
For example, the specification:
-G "date,'AD',DFRASTER"
requests the variable with the name date,
followed by the constant value AD
and the raster image ID number. The output might appear as:
99/06/17|AD|9924/00001001
If a blank field is being exported, it must be represented by 2 single quotes with nothing inside.
DFexport.rpc will ignore the special meaning of any characters that
would otherwise be delimiters.
For example, 'a,b' is the constant value
a,b, not two separate constants.
If -h is present, the column name for each constant
value will be the value itself, as in the following specification and
output:
% DFexport.rpc -h -G "date,'AD',DFRASTER" 254 1 - | head -2
date|AD|DFRASTER
99/06/17|AD|9924/0001001
If -H is also present, the column name for each constant
value will be replaced with the next -H value,
if one is available. For example:
% DFexport.rpc -h -G "date,'AD',DFRASTER" -H "when" 254 1 - | head -2
date|when|DFRASTER
99/06/17|AD|9924/0001001
By default, date fields are output in the format defined for the field in the
study data dictionary.
If -c or -j is specified, the default
output format is changed to 4-digit year format with imputation
(-c) or julian format (-j).
It is possible to override the default format for selected fields
by following the field number or field name specification with a
:c, :j or
:o modifier.
For example, the specification:
-G "VisitDate, VisitDate:c, VisitDate:j"
requests the variable with name
VisitDate to be output in its default
format, in calendar date format, and finally in julian format.
The :c modifier applied to a field that already
uses a 4 digit year format applies
date imputation only.
When using -c or -j
to change the default output format for dates,
note that there is no field level modifier that
restores the original date format defined in the study data dictionary.
The :o
modifier can be used in an analogous fashion to
:c and :j. The effect of the
:o
modifier will be to cause the original date value to be output without any
imputation.
A modifier cannot be applied to a non-date variable nor can it be applied to a range of field numbers or field names; it must be applied to a single field number or field name at a time.
It is possible to output the decoded label for a variable's value by
appending the :d modifier to any variable that is defined
with coding.
Any coded value that cannot be decoded is output as is.
The modifier cannot be applied to a non-coded variable nor can it be applied to a range of field numbers or field names; it must be applied to a single field number or field name at a time.
It is possible to split a string field into multiple, shorter string fields by
appending a :numxcharscw
modifier to the field number or field name
specification. The modifier includes a specification of the number of fields to
split the input string into (num),
the maximum width in characters of each output field
(chars),
and whether or not splitting should be done on character
(c) boundaries or (w) boundaries.
If word boundaries are used, a word is
delimited by any combination of space or tab characters. Word splitting will
always split at the word boundary closest to but less than the maximum width.
If there is no word boundary in the substring, the string will be split at the
character boundary.
Example 3.32. Split a string field into 5 fields
This example splits the comments field into 5
200-character segments at character boundaries:
-G "comments:5x200c"
If a field to be split contains a missing value code, the first output field will contain the complete missing value code and the remaining fields will be blank.
If a field is split to create additional fields and a header record is
requested with -h, field names for the new
fields are identical to the original field name unless
-H is specified. If -H is
given, the field names for the new fields are assigned in order from the option
list. If the option list contains fewer names than there are new fields, the
remaining fields are assigned names identical to their original field
names.
New fields can be created consisting of substrings of database fields. The following example creates a data field from the 4th and 5th characters of field 22.
-f 22:x4.2
Substrings can be extracted from any field type: strings, dates, or numbers. Substrings are extracted from the string value of the field. Numeric fields will be zero-padded to their store width before the substring extraction is done. In the following example, the first three digits of the subject ID field are extracted.
-f 7:x1.3
CSV is a popular format used for sharing data
records between different software programs.
By selecting the -z option, DFexportrpc; will generate output
records that are compliant with this format.
The requirements of the format are:
The record delimiter is a newline character (this is also true of records exported in the default, non-CSV format).
Each field within a record is separated by a comma.
Leading and trailing spaces adjacent to comma separators are ignored.
Fields containing commas as part of their value are enclosed in double quotes.
Fields containing double quotes are enclosed within double quotes, and the embedded double quotes themselves are each represented by a pair of consecutive double quotes.
Fields containing embedded line breaks are enclosed within double quotes.
Fields with leading or trailing spaces are enclosed in double quotes.
The first record in the CSV output file may consist of a header record containing field names. Each field in the header will also be comma-delimited and follow the requirements above.
DFexport.rpc exits with one of the following statuses:
0 | The command was successful. |
24 | The user does not have the necessary permission to execute the command. |
31 | The requested plate does not exist in the database. |
36 | The required command-line arguments were not present or were incorrectly specified. |
> 0 | The command failed because the database server could not be contacted, or communication with the database server failed. |
Example 3.33. Export all records from plate 1 of study 255 to standard output
% DFexport.rpc -s all -h 255 1 -
DFSTATUS|DFVALID|DFRASTER|DFSTUDY|DFPLATE|DFSEQ|PID|INIT|VDATE|DFSCREEN|DFCREATE|DFMODIFY|
1|1|9807/1234567|255|1|0|99001|SCL|98/01/25|1|98/02/10 12:34:12|98/02/12 12:34:12|
2|4|9811/0005001|255|1|1|99002|RRN|98/02/12|2|98/02/10 15:03:34|98/03/01 11:23:14|
5|2|9831/0004012|255|1|1|99002|RRN|98/12/12|2|98/07/02 13:45:20|98/07/05 09:21:44|
1|3|9809/0044002|255|1|0|99003|*|98/02/03|1|98/02/10 14:23:01|98/02/10 14:23:01|
Example 3.34. Export, with header, all primary records at validation levels 1-2 from plate 1 of study 255 to standard output
% DFexport.rpc -h -s primary -v "1-2" 255 1 -
DFSTATUS|DFVALID|DFRASTER|DFSTUDY|DFPLATE|DFSEQ|PID|INIT|VDATE|DFSCREEN|DFCREATE|DFMODIFY|
1|1|9807/1234567|255|1|0|99001|SCL|98/01/25|1|98/02/10 12:34:12|98/02/10 12:34:12|
1|3|9809/0044002|255|1|0|99003|*|98/02/03|1|98/02/10 14:23:01|98/02/10 14:23:01|
Example 3.35. Export the first 3 and the 7th fields from plate 1 to standard output
% DFexport.rpc -f "1-3,7" 255 1 -
1|1|9807/1234567|99001
2|4|9811/0005001|99002
5|2|9831/0004012|99002
1|3|9809/0044002|99003
Example 3.36. Export the VDATE date field in
4-digit year format and export
the subject initials in 3 single-character fields. Include the header record
and define names for the newly created fields
% DFexport.rpc -c -G "VDATE,INIT:3x1c" -H "middle,last" 255 1 -
VDATE|INIT|middle|last
1998/01/25|S|C|L
1998/02/12|R|R|N
1998/12/12|R|R|N
1998/02/03|*||
Example 3.37. Export the primary records for subject IDs 99001 and 99002
selecting the fields from DFSTUDY to
VDATE inclusive
% DFexport.rpc -s primary -I "99001,99002" -G "DFSTUDY-VDATE" 255 1 -
255|1|0|99001|SCL|98/01/25
255|1|1|99002|RRN|98/02/12
Example 3.38. Show all forms of manipulation for a partial date value in the
variable DateCompleted1 for study 251
% DFexport.rpc -j -G "DateCompleted1,DateCompleted1:c,DateCompleted1:o" 251 1 -
2450845|1998/02/01|98/02/00
2451297|1999/04/29|99/04/29
Example 3.39. Create a DFdiscover Retrieval File (ID99001_plate10.drf) for study 254, all primary records for plate 10, for subject ID 99001
% DFexport.rpc -f "7,6,5" -I 99001 254 10 ID9901_plate10.drf
The following is the contents of the file ID9901_plate10.drf.
99001|1|10 99001|2|10 99001|3|10 99001|6|10 99001|9|10 99001|12|10
Example 3.40. Export all primary records for plates 1-3, 7 and 8 to standard output
% DFexport.rpc -f "7,6,5,3" 254 "1-3,7,8" -
99001|0|1|0915/000T001
99003|0|1|0915R000S001
99004|0|1|0915/000V001
99001|1|2|0915/000T002
99004|1|2|0915/000V002
99001|1|3|0915/000T003
99004|1|3|0915/000V003
99005|1|3|0000/0000000
99001|30|7|0915/000T009
99004|30|7|0915/000V009
99001|51|8|0915/000T010
When exporting multiple plates in one invocation, the outputs are always sorted in ascending order of the plate numbers, for example if the plate argument is changed to "7,8,3-1", the same output will be seen.
Example 3.41. Export all primary records for all plates in a study and create separate text files for each set of plate data.
% DFexport.rpc -e -f "7,6,5,3" 254 all Study254_
The following are the output files created by the above command
Study254_000.txt Study254_001.txt Study254_002.txt Study254_003.txt Study254_004.txt Study254_005.txt Study254_006.txt Study254_007.txt Study254_008.txt Study254_009.txt Study254_010.txt Study254_011.txt Study254_020.txt Study254_501.txt Study254_510.txt Study254_511.txt
As seen above, all plates are exported in one invocation of the command. Special reserved plates are also included when the keyword 'all' is specified for the plates argument. A three digit, zero padded, plate number is also appended to the end of the outfile name, and is optionally followed by a file extension.
Example 3.42. Export, in CSV format, a subset of fields for plate 3 of study 254 and write the results to standard output
%DFexport.rpc -s all -z -f 1-7,59,63-66 254 3 -2,1,9807/0047003,254,3,1,99001,,0,0,"knee surgery, hip replacement",2 1,1,0347R0012001,254,3,1,99101,"""other"" surgery",1,1," carotid endarterectomy ",2
Note that field values containing commas (knee surgery, hip
replacement) are enclosed within double quotes.
Fields containing double quotes (such as "other" surgery)
are enclosed within double quotes and the
embedded double quotes themselves are represented by a pair of consecutive
double quotes.
Example 3.43. Export only missed records for plate 1, inserting the missing value code "NA" into the fields of each missed record exported. Export the results to standard output.
%DFexport.rpc -s missed -L "NA" 254 1 -0|7|0000/0000000|254|1|0|20100|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|0|2018/01/15 12:35:23|2018/01/15 12:35:23 0|7|0000/0000000|254|1|0|20101|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|NA|0|2018/01/15 12:35:23|2018/01/15 12:35:23
Note that missing value code insertion is only performed for user-defined fields after field 7 (subject ID).
DFexport.rpc and DFexport are the recommended ways to export data records from the study database for subsequent examination or analysis. Never read the data files directly as they may contain old copies of records scheduled for removal.
Since DFexportrpc; may only be run server-side by an authenticated user, it does not enforce user permissions as strictly as DFexport does. For example, DFexportrpc; ignores the Hidden/Masked field property while exporting field values.
[8]
Data written to standard output will also include "inline" header
metadata if either of the -h or
-x options are specified.