Health Data

Research using data collected during patient visits is a fairly new field, and it is growing quickly. Access to data that is already on hand is highly valuable to our research teams.

Research Projects

  1. Advancing data science in drug development through an innovative computational framework for data sharing and statistical analysis


    Establishing this framework has been integral to the development of analytical tools.

  2. Near real-time determination of B.1.1.7 in proportion to total SARS-CoV-2 viral load in wastewater using an allele-specific primer extension PCR strategy


    Our study demonstrates that this strategy can provide public health units with an additional and much needed tool to rapidly triangulate VOC incidence/prevalence with high sensitivity and lineage specificity.

  3. The role of wastewater testing for SARS-CoV-2 surveillance


    These characteristics of SARS-CoV-2 infection, along with the observation that SARS-CoV-2 is excreted in stools during all phases of infection, have led to the uptake of wastewater testing to complement SARS-CoV-2 surveillance based on clinical tests and case identification.

  4. Can synthetic data be a proxy for real clinical trial data? A validation study


    The high concordance between the analytical results and conclusions from synthetic and real data suggests that synthetic data can be used as a reasonable proxy for real clinical trial datasets.

  5. Identification and inclusion of gender factors in retrospective cohort studies: the GOING-FWD framework


    The application of the GOING-FWD multistep approach can help guide investigators to analyse gender and its impact on outcomes in previously collected data.

  6. Evaluating the utility of synthetic COVID-19 case data


    A privacy risk assessment on the synthetic data showed that the attribute and membership disclosure risks were low.

  7. Regional differences in access to the outdoors and outdoor play of Canadian children and youth during the COVID-19 outbreak


    It is unsurprising that in the provinces that have had the highest number of COVID-19 cases, there have been the most stringent restrictions on access to the outdoors. It is also unsurprising that these same provinces have had the greatest decline in time spent outdoors and in outdoor play among children and youth.

  8. Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation


    We have presented a comprehensive identity disclosure risk model for fully synthetic data. The results for this synthesis method on 2 datasets demonstrate that synthesis can reduce meaningful identity disclosure risks considerably. The risk model can be applied in the future to evaluate the privacy of fully synthetic data.

  9. Canadian Association of Radiologists White Paper on De-Identification of Medical Imaging: Part 1, General Principles


    The application of AI algorithms in radiology requires access to large data sets containing PHI. The CAR AI Ethical and Legal standing committee published Part 2 of this guide to provide a practical approach to de-identification in the context of the current Canadian health care landscape. This article discussed the practical application of protecting patient data in the reality of our current Canadian clinical landscape. The strengths and weaknesses of de-identification approaches were outlined, along with the complexities of protecting patients’ medical imaging data, the possible de-identification software tools available, some common mistakes made by research and development teams, and perspectives on future directions.

  10. Abnormal placental pathological findings and adverse clinical outcomes of oocyte donation


    Placental pathology reflecting dysregulated immune processes and vasculopathy is associated with oocyte donation.

  11. Exploring data subsets with vtree


    While vtree can be used to explore data, it can also be used to generate study-flow diagrams.

  12. Quantitative analysis of SARS-CoV-2 RNA from wastewater solids in communities with low COVID-19 incidence and prevalence


    Can we find COVID-19 markers in the wastewater that will help predict outbreaks?

  13. Evaluating the psychometric properties of the parent-rated Strengths and Difficulties Questionnaire in a nationally representative sample of Canadian children and adolescents aged 6 to 17 years


    The original five-factor, parent-rated SDQ demonstrates evidence of factorial validity and reliability as a population measure of mental health difficulties among Canadian children and adolescents.

  14. Risk of Lower Birth Weight and Shorter Gestation in Oocyte Donation Pregnancies Compared With Other Assisted Reproductive Technology Methods: Systematic Review


    A high degree of interstudy heterogeneity exists, and the association between OD and infant outcomes remains unclear.

  15. Practical Synthetic Data Generation


    Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution.

  16. Building an Anonymization Pipeline


    How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner.

  17. Racial differences in contribution of prepregnancy obesity and excessive gestational weight gain to large-for-gestational-age neonates


    Excessive gestational weight gain contributed more to LGA neonates than prepregnancy obesity in Whites and Asians, while there was no difference between excessive gestational weight gain and prepregnancy obesity in their contributions to the LGA neonates in Blacks. The differences are mostly driven by the differential prevalence of the two risk factors across racial groups.

  18. Barriers and Facilitators of Pediatric Shared Decision-Making: A Systematic Review


    Our findings can be used to identify potential pediatric SDM barriers and facilitators, guide context-specific barrier and facilitator assessments, and inform interventions for implementing SDM in pediatric practice.

  19. Increasing Incidence and Prevalence of Pathologic Hemoglobinopathies Among Children in Ontario, Canada from 1991-2013


    Through an innovative approach using provincial health administrative, immigration and demographic data, this study identified a rising incidence and prevalence of hemoglobinopathies among Ontario children <18 years of age between April 1, 1991 and March 31, 2013, potentially due to increased immigration rates.

  20. The reporting of studies conducted using observational routinely collected health data statement for pharmacoepidemiology (RECORD-PE)


    We anticipate that increasing use of the RECORD-PE guidelines by researchers and endorsement and adherence by journal editors will improve the standards of reporting of pharmacoepidemiological research undertaken using routinely collected data. This improved transparency will benefit the research community, patient care, and ultimately improve public health.

  21. Physical Literacy Knowledge Questionnaire: feasibility, validity, and reliability for Canadian children aged 8 to 12 years


    Future studies of alternative item wording and responses are recommended to enhance test-retest reliability.

  22. Canada’s Physical Literacy Consensus Statement: process and outcome


    Going forward, the impact of this initiative on the sector, and the more distal goal of increasing habitual physical activity levels, should be assessed.

  23. A Pragmatic Method for Identification of Long-Stay Patients in the PICU


    We present a pragmatic method for the retrospective identification of LSPs in the PICU that incorporates unit- and/or patient-specific characteristics. The next steps would be to validate this method using other patient and/or unit characteristics in different PICUs and over time.

  24. Defining and identifying concepts of medication literacy: an international perspective


    Future studies should focus on how this definition can be operationalized to support the role of pharmacists and other healthcare providers.

  25. Conceptual Critique of Canada’s Physical Literacy Assessment Instruments Also Misses the Mark


  26. A unified framework for evaluating the risk of re-identification of text de-identification tools


    Our framework attempts to correct for poorly distributed evaluation corpora, accounts for the data release context, and avoids the often optimistic assumptions that are made using the more traditional evaluation approach. It therefore provides a more realistic estimate of the true probability of re-identification.

  27. Anonymising and sharing individual patient data


    The expected benefits from sharing individual patient data for health research purposes include: it ensures accountability in results and that reported study results are valid, it allows researchers to build on the work of others more efficiently and to perform individual patient data meta-analyses to summarise evidence, and it decreases the burden on research subjects through the reuse of existing data.

  28. New concepts in the assessment of exercise capacity among children with congenital heart disease: Looking beyond heart function and mortality


    Physically active lifestyles are important for the physical and mental health of children with congenital heart defects.

  29. Complex care for kids Ontario: protocol for a mixed-methods randomised controlled trial of a population-level care coordination initiative for children with medical complexity

    Our primary objective is to evaluate the CCKO intervention using a randomised waitlist control design. The waitlist approach involves rolling out an intervention over time, whereby all participants are randomised into two groups (A and B) to receive the intervention at different time points determined at random.

Researchers


  1. Tommy Michel Alain

    Scientist, CHEO Research Institute

  2. Melanie Bechard

    Investigator, CHEO Research Institute

  3. Natalie Bresee

    Investigator, CHEO Research Institute

  4. Dina El Demellawy

    Investigator, CHEO Research Institute

  5. Khaled El Emam

    Senior Scientist, CHEO Research Institute; Professor, Faculty of Medicine, University of Ottawa

  6. Deshayne Fell

    Scientist, CHEO Research Institute

  7. Gary Goldfield

    Senior Scientist, CHEO Research Institute

  8. Tyson Graber

    Associate Scientist

  9. Robert Klaassen

    Investigator, CHEO Research Institute

  10. Margaret Lawson

    Senior Scientist, CHEO Research Institute

  11. Patricia Longmuir

    Senior Scientist, CHEO Research Institute

  12. Alex MacKenzie

    Senior Scientist, CHEO Research Institute

  13. Nathalie Major

    Investigator, CHEO Research Institute

  14. Kusum Menon

    Senior Scientist, CHEO Research Institute

  15. Sarah Sawyer

    Investigator, CHEO Research Institute

  16. Ewurabena Simpson

    Investigator, CHEO Research Institute

  17. Mark S. Tremblay

    Senior Scientist, CHEO Research Institute

  18. Régis Vaillancourt

    Senior Scientist, CHEO Research Institute

  19. Richard Webster

    Investigator, CHEO Research Institute

  20. Nancy Young

    Senior Scientist, CHEO Research Institute
