External validation of genetic risk prediction models for incident colorectal cancer using UK Biobank

Talk Code: 
Simon Griffin
Catherine L Saunders, Britt Killian, Deborah Thompson, Antonis C. Antoniou, Simon J Griffin, Jon Emery, Fiona M Walter, Joe Dennis, Xin Yang, Juliet A Usher-Smith
Author institutions: 
Department of Public Health and Primary Care, University of Cambridge


There is evidence that colorectal cancer (CRC) screening programmes reduce CRC incidence and mortality. In most countries, individuals are invited for screening based on their age. Stratifying screening, or changing the age threshold at which someone is invited, according to individual estimated risk could improve efficiency. Using the UK Biobank cohort for external validation, we have previously shown that several risk models including only phenotypic risk factors and/or family history exhibit reasonable discrimination. This study aimed to externally validate and compare previously developed risk scores for future CRC that include genetic risk factors, with or without phenotypic risk factors, and that could potentially be used to stratify the UK population.


We are using data from UK Biobank to perform external validation of 13 risk models for CRC identified through a systematic review (five ‘genes only’ and eight incorporating both genetic and phenotypic information). Across all models, the genetic component includes 95 independent single nucleotide polymorphisms (SNPs), with between six and 43 per model. The phenotypic risk factors in the combined models are age, sex, BMI, family history, smoking, alcohol, physical activity, red meat consumption, aspirin use, and fibre and vegetable consumption. We are using genetic information available in UK Biobank and phenotypic information from the baseline assessment, which was carried out between 2006 and 2010. For the main analysis we are including 373,112 participants with no prior history of CRC and five-year follow-up (the last available cancer registry data linked to UK Biobank comes from September 2014). We are assessing the discrimination of each risk model using the area under the receiver operating characteristic curve (AUC). For those models where absolute risk can be estimated, we are also assessing calibration, both graphically and using the Hosmer-Lemeshow statistic.
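As an illustration only, the two performance measures named above can be sketched in a few lines of Python. This is not the analysis code used in the study: the function names `auc` and `hosmer_lemeshow`, the choice of ten equal-sized risk groups, and the toy inputs are all assumptions made for the example. The AUC is computed from the rank-sum (Mann-Whitney) identity, and the Hosmer-Lemeshow statistic sums the squared difference between observed and expected cases across groups ordered by predicted risk.

```python
from itertools import groupby


def auc(scores, outcomes):
    """AUC via the rank-sum identity; outcomes are 0 (no CRC) or 1 (CRC)."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    pairs = sorted(zip(scores, outcomes), key=lambda p: p[0])
    rank_sum = 0.0
    rank = 1  # rank of the first member of the current tie group
    for _, tied in groupby(pairs, key=lambda p: p[0]):
        tied = list(tied)
        mid_rank = rank + (len(tied) - 1) / 2  # tied scores share a mid-rank
        rank_sum += mid_rank * sum(y for _, y in tied)
        rank += len(tied)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def hosmer_lemeshow(predicted, outcomes, groups=10):
    """Hosmer-Lemeshow statistic over equal-sized groups of predicted risk.

    The number of groups (10 by default) is an assumption of this sketch.
    """
    pairs = sorted(zip(predicted, outcomes), key=lambda p: p[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of cases
        observed = sum(y for _, y in chunk)   # observed number of cases
        mean_p = expected / len(chunk)
        variance = len(chunk) * mean_p * (1 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


# Toy data, invented for illustration (not UK Biobank values).
toy_risk = [0.1, 0.4, 0.35, 0.8]
toy_crc = [0, 0, 1, 1]
print(auc(toy_risk, toy_crc))
```

An AUC of 0.5 corresponds to discrimination no better than chance, and a Hosmer-Lemeshow statistic near zero indicates close agreement between predicted and observed risk within each group.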


At the time of writing this abstract, we have externally assessed the discrimination of the five risk models that estimate CRC risk using genetic information alone: Wang 2013, AUC = 0.51; Ibanez Sanz 2016, AUC = 0.55; Hosono 2016, AUC = 0.54; Frampton 2016, AUC = 0.54; Jenkins 2016, AUC = 0.57. These values are typically lower than the AUCs estimated in the development populations and in previous external validations. Further models are in the process of being evaluated.


Genetic risk estimated through a risk score developed from SNPs typically does have some discriminative ability in a UK population. However, results suggest that this is lower than that of risk scores based on phenotypic risk and of scores incorporating age-related risk. Further health economic modelling work is required to explore whether incorporating genetic or phenotypic risk into national screening programmes would be cost-effective.

Submitted by: 
Catherine Saunders
Funding acknowledgement: