021-medical_conditions

Medical conditions dataset

Description

A medical conditions dataset is a collection of information about various health-related issues experienced by people. This data can include symptoms, diagnoses, treatments, and outcomes for different diseases and conditions. The dataset is gathered using various methods, or “modalities,” such as electronic health records, surveys, wearable devices, and medical imaging. The purpose of collecting and analyzing this information is to better understand health trends, identify risk factors, and improve medical care for everyone.

Introduction

Medical diagnosis in research studies plays a crucial role in the understanding and advancement of medicine. In clinical research, diagnosis is used to identify individuals with specific medical conditions and to determine the prevalence of these conditions within a study population. It is also used to identify subgroups of individuals with similar characteristics or outcomes. These subgroups help researchers understand the underlying disease mechanisms, develop risk models, identify biomarkers for early detection and for monitoring disease progression, and develop new treatments.

Survey data is a valuable resource for understanding various aspects of medical conditions. It involves collecting self-reported information from individuals about their health, symptoms, diagnoses, treatments, and lifestyle factors. This data is tabular, meaning it is organized in rows and columns, which makes it easy to analyze and compare.

Medical conditions data collected through surveys covers diverse body systems, such as the cardiovascular, respiratory, digestive, and nervous systems. It provides insights into disease prevalence, risk factors, and the impact on individuals’ lives.

Surveys are an attractive data modality due to their accessibility, cost-effectiveness, and ability to capture patient perspectives. This information helps researchers identify potential risk factors, inform prevention strategies, and shape public health policies, ultimately contributing to better health outcomes for diverse populations.

Measurement protocol

Upon registering for the Human Phenotype Project study, participants provide details about their medical conditions in the Initial Medical Survey. Further data is then gathered during an interview at the baseline visit (In-system drop down) and through the Follow-up Medical Survey when participants return for subsequent visits. The data_source column indicates where each record was collected.

Questions and self-reported medical diagnoses were mapped to ICD-11 codes. A diagnosis counts as a baseline diagnosis if it was reported at the baseline visit or has an onset date prior to the baseline visit.
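The baseline rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual implementation: the function names are hypothetical, the baseline visit date is assumed to be supplied by the caller, "reported at the baseline visit" is read as "collected on or before the baseline date," and an onset with no month is compared by year only.

```python
from datetime import date
from typing import Optional


def onset_before_baseline(
    baseline: date, start_year: Optional[int], start_month: Optional[int]
) -> bool:
    """Return True if the reported onset falls before the baseline visit."""
    if start_year is None:
        return False  # no onset information, so nothing to compare
    if start_month is None:
        # Assumption: with no month, only an earlier calendar year counts.
        return start_year < baseline.year
    # Compare at month resolution; the data dictionary lists only
    # start_month and start_year, with no onset day.
    return (start_year, start_month) < (baseline.year, baseline.month)


def is_baseline_diagnosis(
    collection_date: date,
    baseline: date,
    start_year: Optional[int] = None,
    start_month: Optional[int] = None,
) -> bool:
    """Apply the rule: reported at baseline, or onset prior to baseline."""
    # Assumption: records collected on or before the baseline date
    # (for example, the Initial Medical Survey) count as reported at baseline.
    reported_at_baseline = collection_date <= baseline
    return reported_at_baseline or onset_before_baseline(
        baseline, start_year, start_month
    )


baseline = date(2021, 3, 15)
# Reported at a follow-up visit, but with an onset in July 2019.
print(is_baseline_diagnosis(date(2023, 6, 1), baseline, 2019, 7))  # True
```

Comparing at month resolution avoids inventing a day of the month that the survey never collected.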

Baseline

  • Participants record their medical conditions history in the Initial Medical Survey.
  • The interviewer asks the participant questions and selects the participant's conditions from a drop-down list (In-system drop down).

Every follow-up visit or call, and baseline

  • Follow-up Medical Survey:
    • The medical questionnaire asks about conditions.
    • Participants should fill in the date when a new condition was found.

Data availability

The information is stored in a single Parquet file: medical_conditions.parquet

Data dictionary

pl.dict
| tabular_field_name | field_string | description_string | folder_id | feature_set | field_type | strata | data_coding | array | pandas_dtype | bulk_file_extension | relative_location | units | bulk_dictionary | sampling_rate | transformation | list_of_tags | stability | sexed | debut | completed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| medical_condition | Medical condition name | Medical condition name | 21 | medical_conditions | Text | Primary | NaN | Multiple | string | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| icd11_code | ICD11 code | ICD-11 codes are the latest global system for ... | 21 | medical_conditions | Text | Primary | NaN | Multiple | string | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| data_source | Data source of medical condition | Data source the self reporting of medical cond... | 21 | medical_conditions | Text | Primary | NaN | Multiple | string | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| start_month | Start month | Start month | 21 | medical_conditions | Integer | Primary | NaN | Multiple | int | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| start_year | Start year | Start year | 21 | medical_conditions | Integer | Primary | NaN | Multiple | int | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| end_month | End month | Start month | 21 | medical_conditions | Integer | Primary | NaN | Multiple | int | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| end_year | Start year | Start year | 21 | medical_conditions | Integer | Primary | NaN | Multiple | int | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| collection_timestamp | Collection time | Collection time | 21 | medical_conditions | Datetime | Collection time | NaN | Single | datetime64[ns, Asia/Jerusalem] | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| collection_date | Collection date | Collection date | 21 | medical_conditions | Date | Collection time | NaN | Single | datetime64[ns] | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
| timezone | Timezone | Timezone | 21 | medical_conditions | Categorical (single) | Collection time | NaN | Single | category | NaN | medical_conditions/medical_conditions.parquet | NaN | NaN | NaN | NaN | NaN | Accruing | Both sexes | 11/21/2018 | NaN |
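
The start_month, start_year, end_month, and end_year fields can be combined to estimate how long a condition lasted. The sketch below is a hypothetical helper, not part of the dataset's tooling. It assumes missing values arrive as None, and it returns None when any part is missing, because the source does not say how ongoing conditions or unknown dates are encoded.

```python
from typing import Optional


def duration_in_months(
    start_year: Optional[int],
    start_month: Optional[int],
    end_year: Optional[int],
    end_month: Optional[int],
) -> Optional[int]:
    """Whole months from onset to end, or None if any part is missing."""
    parts = (start_year, start_month, end_year, end_month)
    if any(part is None for part in parts):
        return None  # assumption: incomplete dates are left unresolved
    months = (end_year - start_year) * 12 + (end_month - start_month)
    if months < 0:
        raise ValueError("end date precedes start date")
    return months


# A condition that started in November 2019 and ended in February 2021.
print(duration_in_months(2019, 11, 2021, 2))  # 15
```

When reading the Parquet file with pandas, missing integers may appear as NaN or pd.NA instead of None, so a real pipeline would need to normalize them first.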