mdf-reader
Version:
Reads MDF YAML and returns helpful accessors
717 lines (715 loc) • 26.6 kB
YAML
# ICDC model nodes and relns
# Title case names are "reserved" (meaningful to the parser)
# Lower case names are labels for the entities
# document number - really a property of properties (where did this question appear)
Nodes:
program:
Desc: Within the Integrated Canine Data Commons, studies are grouped into discrete programs, based upon the origins and/or scientific nature of each study/trial. These programs may or may not directly relate to any official, e.g. NCI, funding program. The Program node contains the properties required to appropriately characterize any given ICDC program.
Tags:
Category: administrative
Assignment: core
Class: primary
Template: 'Yes'
Props:
- program_name
- program_acronym
- program_short_description
- program_full_description
- program_external_url
- program_sort_order
study:
Desc: The Study node contains properties required to characterize each study/trial in terms of a title, how and why the study/trial was conducted, and the results that were generated.
Tags:
Category: study
Assignment: core
Class: primary
Template: 'Yes'
Props:
- clinical_study_id
- clinical_study_designation
- clinical_study_name
- clinical_study_description
- clinical_study_type
- date_of_iacuc_approval
- dates_of_conduct
- accession_id
- study_disposition
study_site:
Desc: The Study Site node contains properties which identify the various sites at which any given study/trial was conducted, either in terms of where clinical trial patients were assessed and treated, or in terms of the geographical sites at which biospecimens were acquired from patients/subjects/donors for subsequent analysis.
Tags:
Category: study
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- site_short_name
- veterinary_medical_center
- registering_institution
study_arm:
Desc: The Study Arm node contains properties required to describe the arms into which any given study/trial was divided. Division of a study/trial into multiple arms is optional and is at the discretion of the data owners, based upon the way in which the study/trial in question was structured, and how best that structure can be represented within the ICDC. Where applicable, the appropriate study arms are defined during the study on-boarding process and then created via a specific data loading file.
Tags:
Category: study
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- arm
- ctep_treatment_assignment_code
# arm has no example in the data, putting cohort_description in here
# to help define study_arm
- arm_description
- arm_id # potentially needed to differentiate between arms having the same name, but which actually belong to different studies. Proactively including sooner rather than later.
agent:
Desc: The Agent node documents the name of each therapeutic agent being administered during a clinical trial. In this way, in clinical trials which assess the efficacy of combination therapies, adverse events observed during the trial can be attributed specifically to one or more of the medications being used.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'No'
Clinical_Data_Export: 'Yes'
Props:
- medication
# d/n from STUDY_MED_ADMIN/SDAD/1
- document_number
cohort:
Desc: The Cohort node contains properties required to describe the cohorts into which any given study/trial was divided. Division of a study/trial into multiple cohorts is optional and is at the discretion of the data owners, based upon the way in which the study/trial in question was structured, and how best that structure can be represented within the ICDC. Where applicable, the appropriate cohorts are defined during the study on-boarding process and then created via a specific data loading file.
Tags:
Category: study
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- cohort_description
# the intended or protocol dose
- cohort_dose
- cohort_id # needed to differentiate between cohorts that share values for cohort description, but which actually belong to different studies
case:
Desc: The Case node contains properties required to unambiguously identify each patient/subject/donor, either based upon the data submitter's original ID, or upon a study-specific Case ID derived from it during data transformation, which prefixes each original ID with a short, study-specific code.
Tags:
Category: case
Assignment: core
Class: primary
Template: 'Yes'
Props:
- case_id
- patient_id
- patient_first_name
registration:
Desc: The Registration node functions to capture multiple IDs that may be associated with any single patient/subject/donor. Specifically, it captures multiple IDs in the form of Key:Value pairs, which represent each alternate ID and the specific source from which that alternate ID originates. These registrations can then be used to identify multi-study participants, i.e. canine individuals enrolled in two or more ICDC studies as study-specific cases, but which nonetheless represent the same underlying patient/subject/donor.
Tags:
Category: case
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- registration_origin
- registration_id
#- is_primary_id
biospecimen_source:
Desc: The Biospecimen Source node functions essentially as a look-up table used by the front-end of the application to convert the names of biobanks and tissue repositories represented in the form of acronyms into human-readable, full text names.
Tags:
Category: biospecimen
Assignment: core
Class: secondary
Template: 'No'
Props:
- biospecimen_repository_acronym
- biospecimen_repository_full_name
canine_individual:
Desc: The Canine Individual node contains only a single property, i.e. canine_individual_id, a loader-generated ID which identifies each underlying canine subject represented by two or more study-specific ICDC cases. This ID functions to map data sets ultimately derived from the same underlying patient/subject/donor, but generated from discrete cases in separate studies, to the underlying canine individual, such that all data sets derived from any given canine individual can be identified within the application’s user interface and combined.
Tags:
Category: case
Assignment: core
Class: secondary
Template: 'No'
Props:
- canine_individual_id
demographic:
Desc: The Demographic node is comprised of properties which describe the key characteristics of each patient/subject/donor, such as breed, sex and neutered status.
Tags:
Category: case
Assignment: core
Class: primary
Template: 'Yes'
Props:
- demographic_id
- breed
- additional_breed_detail
- patient_age_at_enrollment
- date_of_birth
- sex
- weight
- neutered_indicator
cycle:
Desc: In clinical trials where therapeutic agents are administered in multiple discrete treatment cycles, the properties within the Cycle node serve to capture the dates upon which each cycle started and ended, providing a detailed timeframe for the therapeutic intervention(s) in question. Adverse events can then be associated with the correct cycle based upon when they were observed.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- cycle_number
- date_of_cycle_start
- date_of_cycle_end
visit:
Desc: Clinical trials typically require the patient to make multiple visits to the study site for clinical evaluation and/or the administration of additional medication(s). Properties within the Visit node serve to capture the date upon which each visit occurs. Adverse events and various clinical assessments can then be associated with the correct visit based upon date.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- visit_date
- visit_number
- visit_id
principal_investigator:
Desc: The Principal Investigator node contains properties which identify the principal investigator(s) responsible for any given study/trial. A study/trial may have one or more principal investigators, and any given individual may be listed as a principal investigator on more than one study/trial.
Tags:
Category: study
Assignment: core
Class: primary
Template: 'Yes'
Props:
- pi_first_name
- pi_last_name
- pi_middle_initial
diagnosis:
Desc: The Diagnosis node contains numerous properties which fully characterize the type of cancer with which any given patient/subject/donor was diagnosed, inclusive of stage. This node also contains properties pertaining to comorbidities, and the availability of pathology reports, treatment data and follow-up data.
Tags:
Category: clinical
Assignment: core
Class: primary
Template: 'Yes'
Clinical_Data_Export: 'No'
Props:
- diagnosis_id
- disease_term
- primary_disease_site
- stage_of_disease
- date_of_diagnosis
- histology_cytopathology
- date_of_histology_confirmation
- histological_grade
- best_response
- pathology_report
- treatment_data
- follow_up_data
- concurrent_disease
- concurrent_disease_type
enrollment:
Desc: The Enrollment node is comprised of properties which document when and where a patient/subject/donor was enrolled onto a study/trial.
Tags:
Category: case
Assignment: core
Class: primary
Template: 'Yes'
Props:
- enrollment_id
- date_of_registration
- registering_institution
- initials
- date_of_informed_consent
- site_short_name
- veterinary_medical_center
#- cohort_description
- patient_subgroup
prior_therapy:
Desc: Properties within the Prior Therapy node detail therapies received by the patient/subject/donor prior to being enrolled in the study/trial in question. Clinical trials will typically capture more of this information than will cross-sectional and/or mechanistic studies.
Tags:
Category: clinical
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- date_of_first_dose
- date_of_last_dose
- agent_name
- dose_schedule
- total_dose
- agent_units_of_measure
- best_response_to_prior_therapy
- nonresponse_therapy_type
- prior_therapy_type
- prior_steroid_exposure
- number_of_prior_regimens_steroid
- total_number_of_doses_steroid
- date_of_last_dose_steroid
- prior_nsaid_exposure
- number_of_prior_regimens_nsaid
- total_number_of_doses_nsaid
- date_of_last_dose_nsaid
- tx_loc_geo_loc_ind_nsaid
- min_rsdl_dz_tx_ind_nsaids_treatment_pe
- therapy_type
- any_therapy
- number_of_prior_regimens_any_therapy
- total_number_of_doses_any_therapy
- date_of_last_dose_any_therapy
- treatment_performed_at_site
- treatment_performed_in_minimal_residual
prior_surgery:
Desc: Properties within the Prior Surgery node detail surgical procedures that the patient/subject/donor underwent prior to being enrolled in the study/trial in question. Clinical trials will typically capture more of this information than will cross-sectional and/or mechanistic studies.
Tags:
Category: clinical
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- date_of_surgery
- procedure
- anatomical_site_of_surgery
- surgical_finding
- residual_disease
- therapeutic_indicator
agent_administration:
Desc: Properties within the Agent Administration node detail the dosing of the therapeutic agent(s) being studied, alongside the specifics of how and when such agents were administered.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
# d/n from STUDY_MED_ADMIN/SDAD/1
- document_number
- medication
- route_of_administration
- medication_lot_number
- medication_vial_id
- medication_actual_units_of_measure
- medication_duration
- medication_units_of_measure
- medication_actual_dose
# what is phase?
- phase
- start_time
- stop_time
- dose_level
- dose_units_of_measure
- date_of_missed_dose
- medication_missed_dose
- missed_dose_amount
- missed_dose_units_of_measure
- medication_course_number
- comment
sample:
Desc: The Sample node contains numerous properties which provide an in-depth characterization of the types of samples which were collected from any given patient/subject/donor and subsequently analyzed. Many of these sample annotations are required.
Tags:
Category: biospecimen
Assignment: core
Class: primary
Template: 'Yes'
Props:
- sample_id
- sample_site
- physical_sample_type # formerly just sample_type
- general_sample_pathology
- tumor_sample_origin
- summarized_sample_type # value is generated during data transformation based upon the combination of the preceding three properties
- molecular_subtype
- specific_sample_pathology
- date_of_sample_collection
- sample_chronology
- necropsy_sample
- tumor_grade
- length_of_tumor
- width_of_tumor
- volume_of_tumor
- percentage_tumor
- sample_preservation
- comment
assay:
Desc: The Assay node does not yet have any properties associated with it and is not currently used.
Tags:
Category: analysis
Assignment: extended
Class: secondary
Template: 'No'
Props: null
file:
Desc: Files can be associated with ICDC study, case, diagnosis and sample records, but are not themselves stored within the application. Instead, the application stores records as to the existence and nature of such files. The File node is comprised of properties which characterize these files in terms of their size, format and content, such that they can be appropriately represented within the application’s UI, and in terms of their storage location, such that they can be retrieved for analysis.
Tags:
Category: data_file
Assignment: core
Class: primary
Template: 'Yes'
Props:
- file_name
- file_type
- file_description
- file_format
- file_size
- md5sum
- file_status
- uuid
- file_location
image:
Desc: The Image node does not yet have any properties associated with it and is not currently used.
Tags:
Category: data_file
Assignment: core
Class: secondary
Template: 'No'
Props: null
image_collection:
Desc: The Image Collection node is comprised of properties which describe collections of images that are associated with any given study/trial. These properties characterize such image collections in terms of the types of images they contain, where the collections are hosted, and how they can be accessed.
Tags:
Category: study
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- image_collection_name
- image_type_included
- image_collection_url
- repository_name
- collection_access
physical_exam:
Desc: Properties within the Physical Exam node detail observations around the status of multiple body systems as of a patient enrolled in a clinical trial, as of that patient being examined by a veterinarian during a scheduled visit to the appropriate study site.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- date_of_examination
- day_in_cycle
- body_system
- pe_finding
- pe_comment
- phase_pe
- assessment_timepoint
publication:
Desc: The Publication node is comprised of properties which describe publications that are directly associated with any given study/trial of interest, inclusive of the location(s) at which publications can be viewed in electronic form.
Tags:
Category: study
Assignment: core
Class: secondary
Template: 'Yes'
Props:
- publication_title
- authorship
- year_of_publication
- journal_citation
- digital_object_id
- pubmed_id
vital_signs:
Desc: Properties within the Vital Signs node detail observations around the key indicators of the bodily functions of a patient enrolled in a clinical trial, as of that patient being examined by a veterinarian during a scheduled visit to the appropriate study site.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- date_of_vital_signs
- body_temperature
- pulse
- respiration_rate
- respiration_pattern
- systolic_bp
- pulse_ox
- patient_weight
- body_surface_area
- modified_ecog
- ecg
- assessment_timepoint
- phase
lab_exam:
Desc: The Lab Exam node does not yet have any properties associated with it and is not currently used.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'No'
Clinical_Data_Export: 'Yes'
Props: null
adverse_event:
# how to link? To case and agent? Also to visit/followup?
Desc: Properties within the Adverse Event node detail unexpected medical, physical and behavioral problems occurring during therapy, in terms of what issues are observed, their severity, and what is considered to be their root cause.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- day_in_cycle
- date_of_onset
- existing_adverse_event
- date_of_resolution
- ongoing_adverse_event
- adverse_event_term
- adverse_event_description
- adverse_event_grade
- adverse_event_grade_description
- adverse_event_agent_name
- adverse_event_agent_dose
- attribution_to_research
- attribution_to_ind
- attribution_to_disease
- attribution_to_commercial
- attribution_to_other
- other_attribution_description
- dose_limiting_toxicity
- unexpected_adverse_event
disease_extent:
Desc: Properties within the Disease Extent node detail the extent to which the disease for which the patient is being treated has either responded to treatment or progressed, based upon observations of one or more specific lesions.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
- lesion_number
- lesion_site
- lesion_description
- previously_irradiated
- previously_treated
- measurable_lesion
- target_lesion
- date_of_evaluation
- measured_how
- longest_measurement
- evaluation_number
- evaluation_code
follow_up:
Desc: The Follow-up node is comprised of properties which document when a follow-up evaluation was performed, and what observations were made at each follow-up evaluation.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
# d/n from FOLLOW_UP/FLWU/1
- document_number
- date_of_last_contact
- patient_status
- explain_unknown_status
- contact_type
- treatment_since_last_contact
- physical_exam_performed
- physical_exam_changes
off_study:
# off_study, off_treatment -- how related? should be a dependency and normalize properties?
Desc: Properties within the Off Study node detail when a patient was removed from a clinical trial relative to other key dates, and the reason(s) for the patient being removed.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
# d/n from OFF_STUDY/OSSM/1
- document_number
- date_off_study
- reason_off_study
- date_of_disease_progression
- date_off_treatment
- best_resp_vet_tx_tp_secondary_response
- date_last_medication_administration
- best_resp_vet_tx_tp_best_response
- date_of_best_response
off_treatment:
Desc: Properties within the Off Treatment node detail when a clinical trial patient's treatment was curtailed relative to other key dates. Properties also detail the best response to treatment observed to that point, and the reason(s) for treatment being curtailed.
Tags:
Category: clinical_trial
Assignment: extended
Class: secondary
Template: 'Yes'
Clinical_Data_Export: 'Yes'
Props:
# d/n from OFF_TREATMENT/OTSM/1
- document_number
- date_off_treatment
- reason_off_treatment
- date_of_disease_progression
- best_resp_vet_tx_tp_secondary_response
- date_last_medication_administration
- best_resp_vet_tx_tp_best_response
- date_of_best_response
Relationships:
member_of:
Mul: many_to_one
Ends:
- Src: case
Dst: cohort
- Src: cohort
Dst: study_arm
- Src: study_arm
Dst: study
- Src: study
Dst: program
- Src: case
Dst: study
# to accommodate Cases not mapped to any Cohort, either due to error or by design
- Src: cohort
Dst: study
# to accommodate a Study having Cases that should be grouped into Cohorts, but where Study Arms do not apply
Props: null
of_case:
Mul: many_to_one
Ends:
- Src: enrollment
Dst: case
Mul: one_to_one
- Src: demographic
Dst: case
Mul: one_to_one
- Src: diagnosis
Dst: case
- Src: cycle
Dst: case
- Src: follow_up
Dst: case
- Src: sample
Dst: case
# to accommodate a Sample being directly associated with a Case, rather than being only indirectly associated with a Case through a Visit, etc.
- Src: file
Dst: case
# to accommodate a File being directly associated with a Case, rather than being only indirectly associated with a Case through a Sample or Diagnosis
- Src: visit
Dst: case
# to accommodate situations in which a Visit cannot be unambiguously associated with a treatment Cycle
- Src: adverse_event
Dst: case
# required because adverse event observations are tied to date of onset rather than being tied to the date of the visit at which they were reported
- Src: registration
Dst: case
Mul: many_to_many
Props: null
of_study_arm:
Mul: many_to_many
Ends:
- Src: agent
Dst: study_arm
- Src: case
Dst: study_arm
Mul: many_to_one
Props: null
of_study:
Mul: many_to_many
Ends:
- Src: study_site
Dst: study
- Src: principal_investigator
Dst: study
- Src: file
Dst: study
Mul: many_to_one
- Src: image_collection
Dst: study
Mul: many_to_one
- Src: publication
Dst: study
Mul: many_to_one
Props: null
of_agent:
Mul: many_to_one
Ends:
- Src: agent_administration
Dst: agent
- Src: adverse_event
Dst: agent
Props: null
had_adverse_event:
Mul: many_to_one
Ends:
- Src: case
Dst: adverse_event
Props: null
at_enrollment:
Mul: many_to_one
Ends:
- Src: prior_therapy
Dst: enrollment
- Src: prior_surgery
Dst: enrollment
- Src: physical_exam
Dst: enrollment
Props: null
of_cycle:
Mul: many_to_one
Ends:
- Src: visit
Dst: cycle
Props: null
on_visit:
Mul: many_to_one
Ends:
- Src: agent_administration
Dst: visit
- Src: sample
Dst: visit
- Src: physical_exam
Dst: visit
- Src: lab_exam
Dst: visit
# removed relationship between adverse event and visit because adverse events are tied to date of onset rather than being tied to the date of the visit at which they were reported
- Src: disease_extent
Dst: visit
- Src: vital_signs
Dst: visit
Props: null
of_sample:
Mul: many_to_one
Ends:
- Src: assay
Dst: sample
- Src: file
Dst: sample
# added primarily to support direct association of sequence files with the primary samples to which they relate, when we have no data around the processing intermediates
Props: null
of_assay:
Mul: many_to_one
Ends:
- Src: file
Dst: assay
- Src: image
Dst: assay
Props: null
from_diagnosis:
Mul: many_to_one
Ends:
- Src: file
Dst: diagnosis
Props: null
went_off_study:
Mul: one_to_one
Ends:
- Src: case
Dst: off_study
Props: null
went_off_treatment:
Mul: one_to_one
Ends:
- Src: case
Dst: off_treatment
Props: null
next:
Mul: one_to_one
Ends:
- Src: visit
Dst: visit
- Src: sample
Dst: sample
- Src: prior_therapy
Dst: prior_therapy
- Src: prior_surgery
Dst: prior_surgery
- Src: adverse_event
Dst: adverse_event
Props: null
represents:
Mul: many_to_one
Ends:
- Src: case
Dst: canine_individual
Props: null