The Integration and Analysis of Data using Artificial Intelligence to Improve Patient Outcomes with Thoracic Diseases (DART)

Status

Completed

Title

Development, validation, and evaluation of prediction models to identify high-risk individuals who would benefit from lung cancer screening using low-dose CT (work package 6)

What were the objectives of the study?

This study aimed to develop, validate, and evaluate individualised risk prediction models that can be used to identify high-risk patients who will benefit most from a lung health check and/or lung cancer screening using low-dose computerised tomography (LDCT). General practitioners (GPs) can also use such models as clinical decision support tools to assess an individual patient's risk, proactively manage high-risk patients, and make timely referrals to secondary care for prompt investigation when patients present with symptoms indicative of lung cancer.

The DART website is now live at www.dartlunghealth.co.uk

On Twitter @dartlunghealth

How was the research done?

This study aimed to develop, validate, and evaluate risk prediction models for lung cancer screening in the British population and to select the optimal model where possible. Individuals at the highest risk of developing lung cancer are most likely to benefit from screening. Selecting such individuals from the population and providing them with targeted interventions is a cost-effective approach to improving lung cancer survival while limiting over-diagnosis. The research objectives for this specific study were to:

1. Undertake a literature review to identify existing lung cancer prediction models and critically appraise these models using the PROBAST checklist (a tool to assess the risk of bias and applicability of prediction models);

2. Characterise the current epidemiology and natural history of lung cancer (from the patient’s first presentation in primary care, through investigation by the GP, referral to secondary care, and lung cancer diagnosis, to treatment and survival) using linked data from the QResearch database, and explore how the natural history of lung cancer varies by age, sex, ethnicity, deprivation, smoking status, and geographical region, and over time;

3. Identify and quantify the risk factors for lung cancer through analysis of electronic health records, and compare these risk factors with those reported in the literature;

4. Update and validate the existing QCancer (lung) algorithm for lung cancer screening using more recent data linked to HES, ONS mortality, and cancer registry;

5. Compare the QCancer (lung) model with other risk prediction models identified from the literature review, and select the best models for clinical practice where possible.

Chief Investigator

Professor Fergus Gleeson, University of Oxford

Lead Applicant Organisation Name

Sponsor

University of Oxford

Location of research

University of Oxford

Date on which research approved

08-Mar-2021

Project reference ID

OX37

Generic ethics approval reference

18/EM/0400

Are all data accessed in anonymised form?

Yes

Brief summary of the dataset to be released (including any sensitive data)

Linked electronic health records, including primary care records, Hospital Episode Statistics (HES, secondary care records), Office for National Statistics (ONS) mortality data, and cancer registry data, were needed for this study.

What were the main findings?

There were 73 380 incident lung cancer cases in the QResearch derivation cohort, 22 838 cases in the QResearch internal validation cohort, and 16 145 cases in the CPRD external validation cohort during follow-up. The predictors in the final model included sociodemographic characteristics (age, sex, ethnicity, Townsend score), lifestyle factors (BMI, smoking and alcohol status), comorbidities, family history of lung cancer, and personal history of other cancers. Some predictors differed between the models for women and men, but model performance was similar between sexes. The CanPredict (lung) model showed excellent discrimination and calibration in both internal and external validation of the full model, and by sex and ethnicity. The model explained 65% of the variation in time to diagnosis of lung cancer (R²D) in both sexes in the QResearch validation cohort and 59% in both sexes in the CPRD validation cohort. Harrell's C statistics were 0·90 in the QResearch (validation) cohort and 0·87 in the CPRD cohort, and the D statistics were 2·8 in the QResearch (validation) cohort and 2·4 in the CPRD cohort. Compared with seven other lung cancer prediction models, the CanPredict (lung) model had the best performance in discrimination, calibration, and net benefit across three prediction horizons (5, 6, and 10 years) in the two approaches. The CanPredict (lung) model also had higher sensitivity than the currently recommended UK models (LLPv2 and PLCOM2012): it identified more lung cancer cases than those models when the same number of high-risk individuals were screened.
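The reported D statistics and R²D values are linked by a standard formula (Royston and Sauerbrei), and Harrell's C can be illustrated on a few toy records. The following is a minimal sketch, not the study's code: the function names and the toy data are hypothetical, and the formula R²D = (D²/κ²) / (π²/6 + D²/κ²) with κ = √(8/π) is assumed to be the one used for a Cox-type model.

```python
import math

def r2d_from_d(d):
    """Explained variation R²D from the D statistic (assumed Royston-Sauerbrei formula)."""
    kappa_sq = 8 / math.pi            # kappa squared, scaling constant for D
    scaled = d * d / kappa_sq         # D² / kappa²
    return scaled / (math.pi ** 2 / 6 + scaled)

def harrell_c(times, events, risks):
    """Harrell's C: among usable pairs, the share in which the person
    diagnosed earlier has the higher predicted risk (risk ties count 0.5)."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when person i had the event strictly before time j.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / usable

# Hypothetical toy records: follow-up time (years), event flag, predicted risk.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]

print(round(r2d_from_d(2.8), 2))                    # D = 2·8 (QResearch validation)
print(round(r2d_from_d(2.4), 2))                    # D = 2·4 (CPRD)
print(harrell_c(toy_times, toy_events, toy_risks))  # toy data only
```

Under these assumptions, D = 2·8 gives an R²D of about 0.65, matching the reported 65%. D = 2·4 gives about 0.58, close to the reported 59%; the small gap may simply reflect D being rounded to one decimal place.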

Implications and Impact

The CanPredict (lung) model was developed, and internally and externally validated, using data from 19·67 million people from two English primary care databases. Our model has potential utility for risk stratification of the UK primary care population and selection of individuals at high risk of lung cancer for targeted screening. If our model is recommended to be implemented in primary care, each individual's risk can be calculated using information in the primary care electronic health records, and people at high risk can be identified for the lung cancer screening programme.
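The idea of selecting the highest-risk individuals for screening, and of comparing models by how many cases they capture when the same number of people are screened, can be sketched as follows. This is an illustration only: the risk scores, outcomes, and function names are hypothetical and do not come from CanPredict (lung), LLPv2, or PLCOM2012.

```python
def screen_top_n(risks, n):
    """Return the indices of the n people with the highest predicted risk."""
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    return order[:n]

def sensitivity_at_top_n(risks, outcomes, n):
    """Share of all cases that fall among the n highest-risk people."""
    selected = screen_top_n(risks, n)
    detected = sum(outcomes[i] for i in selected)
    return detected / sum(outcomes)

# Hypothetical cohort of eight people; 1 marks a later lung cancer diagnosis.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted risks from two made-up models for the same people.
model_a = [0.30, 0.05, 0.22, 0.02, 0.10, 0.18, 0.01, 0.03]
model_b = [0.25, 0.20, 0.04, 0.02, 0.15, 0.19, 0.01, 0.03]

# Screen the same number of people (three) under each model.
sens_a = sensitivity_at_top_n(model_a, outcomes, 3)
sens_b = sensitivity_at_top_n(model_b, outcomes, 3)
print(sens_a, sens_b)
```

In this toy cohort, the first model captures all three cases among its top three, while the second captures two of three, which is the sense in which one model can have higher sensitivity at a fixed screening volume.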

Funding Source

INNOVATE UK

Public Benefit Statement

Research Team

Professor Julia Hippisley-Cox, University of Oxford

Mr Weiqi Liao, University of Oxford

Professor Sarah Wordsworth, University of Oxford

Dr James Buchanan, University of Oxford

Dr Elizabeth Stokes, University of Oxford

Professor Fergus Gleeson, University of Oxford

Publications

Press Releases

Access Type

Trusted Research Environment (TRE)
