The Integration and Analysis of Data using Artificial Intelligence to Improve Patient Outcomes with Thoracic Diseases (DART)


Development, validation, and evaluation of prediction models to identify high-risk individuals who would benefit from lung cancer screening using low-dose CT (work package 6)

What is the aim of the study and why is it important?

This study aims to develop, validate, and evaluate individualised (personalised) risk prediction models that can identify high-risk patients who will benefit most from a lung health check and/or lung cancer screening using low-dose computerised tomography (LDCT). General practitioners (GPs) can also use such models as clinical decision support tools to assess an individual patient's risk, proactively manage high-risk patients, and make a timely referral to secondary care for prompt investigation when patients present with symptoms indicative of lung cancer.

The DART website is now live at www.dartlunghealth.co.uk

On Twitter @dartlunghealth

How is the research being done?

This study aims to develop, validate, and evaluate risk prediction models for lung cancer screening in the British population and to select the optimal model where possible. Individuals at the highest risk of developing lung cancer are the most likely to benefit from screening. Selecting such individuals from the population and providing them with targeted interventions is a cost-effective approach to improving lung cancer survival outcomes while balancing the risk of over-diagnosis. The research objectives for this specific study are to:

1. Undertake a literature review to identify existing lung cancer prediction models and critically appraise these models using the PROBAST checklist (a tool to assess the risk of bias and applicability of prediction models);

2. Characterise the current epidemiology of the natural history of lung cancer (from a patient's first presentation in primary care, through investigation by the GP, referral to secondary care, lung cancer diagnosis, and treatment, to survival) using linked data from the QResearch database, and explore how the natural history of lung cancer varies by age, sex, ethnicity, deprivation, smoking status, and geographical region, and over time;

3. Identify and quantify the risk factors for lung cancer through analysis of electronic health records, and compare them with those reported in the literature;

4. Update and validate the existing QCancer (lung) algorithm for lung cancer screening using more recent data linked to HES, ONS mortality, and cancer registry data;

5. Compare the QCancer (lung) model with other risk prediction models identified from the literature review, and select the best models for clinical practice where possible.

Chief Investigator

Professor Fergus Gleeson, University of Oxford


University of Oxford

Location of research

University of Oxford

Date on which research approved


Project reference ID


Generic ethics approval reference


Are all data accessed in anonymised form?


Brief summary of the dataset to be released (including any sensitive data)

Linked electronic health records will be needed for this study, including primary care records, Hospital Episode Statistics (HES, secondary care records), Office for National Statistics (ONS) mortality data, and cancer registry data.

Implications and Impact

Lung cancer is a research priority in this country because of its high incidence and mortality and its poor survival. It is the biggest cause of cancer death in the UK and costs NHS England £307 million per year. Early diagnosis is critical to reducing lung cancer mortality and improving patient outcomes (survival and quality of life). By developing, validating, and evaluating personalised risk prediction models, this study can help identify individuals at high risk of developing lung cancer and refer them for a low-dose CT scan, which may result in early diagnosis without unduly burdening the overstretched NHS. In addition, health economic analysis will provide new insight into how to maximise the cost-effectiveness of lung cancer screening. The potential impact includes earlier diagnosis and better survival outcomes for patients, and a reduced disease burden of lung cancer in the UK.

Funding Source


Research Team

Professor Julia Hippisley-Cox, University of Oxford

Mr Weiqi Liao, University of Oxford

Professor Sarah Wordsworth, University of Oxford

Dr James Buchanan, University of Oxford

Dr Elizabeth Stokes, University of Oxford

Professor Fergus Gleeson, University of Oxford

Approval Letter

Download Approval Letter


  • Development, validation, and evaluation of prediction models to identify individuals at high risk of lung cancer for screening in the English primary care population using the QResearch® database: research protocol and statistical analysis plan.
    Authors: Liao W, Burchardt J, Coupland C, Gleeson F, Hippisley-Cox J

Press Releases
