Validation and Public Health Modelling of Risk Prediction Models for Kidney Cancer in UK Biobank
Problem
In the UK, kidney cancer is responsible for 4500 deaths annually. Although early detection is associated with improved survival, 25% of newly diagnosed kidney cancers are metastatic. One barrier to the introduction of a screening programme is the low population prevalence of kidney cancer. Population risk stratification could minimise harms to individuals and improve the efficiency of a screening programme. Stratification requires a model that accurately identifies individuals at high risk of undiagnosed kidney cancer. Although several models have been developed, most have not been externally validated, and the benefits of incorporating them into a screening programme have not been assessed.
Approach
We identified phenotypic risk models in a recent systematic review and validated them in a large population cohort (UK Biobank) with 6-year follow-up. We assessed the discrimination and calibration of the models in men, in women and in the whole cohort. We then undertook a public health modelling analysis using the best-performing models to estimate their accuracy in the UK population (individuals aged 40-70). We accounted for differences in demographics (age and sex) and kidney cancer incidence between the UK Biobank cohort and the general population, using ONS and CRUK data respectively. We compared the models' ability to identify high-risk individuals for screening with that of simple age- and sex-based screening strategies.
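The two validation measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the toy data are made up, and the observed/expected ratio is only one of several possible calibration summaries (the abstract does not state which was used). In practice a library routine such as scikit-learn's roc_auc_score would normally replace the pairwise loop.

```python
from typing import Sequence


def auroc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Discrimination: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count as one half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def observed_expected_ratio(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Calibration-in-the-large: observed events divided by the sum of
    predicted risks. Values near 1 suggest good overall calibration."""
    expected = sum(risks)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(outcomes) / expected


# Made-up toy data (not from UK Biobank): predicted risks and observed outcomes.
toy_risks = [0.1, 0.2, 0.4, 0.8, 0.3, 0.6]
toy_outcomes = [0, 0, 1, 1, 0, 0]

toy_auroc = auroc(toy_risks, toy_outcomes)                  # 7 of 8 pairs ordered correctly
toy_oe = observed_expected_ratio(toy_risks, toy_outcomes)   # 2 observed / 2.4 expected
```

Running the same two functions separately on men, on women and on the whole cohort would mirror the stratified assessment described above.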
Findings
We included 17 studies (corresponding to 30 models) in the review. Eight models had reasonable discrimination (AUROC>0.62) in men, women and the mixed-sex cohort. However, many of the models had poor calibration in the UK Biobank cohort. Public health modelling demonstrated the accuracy of the best models over a range of thresholds (6-year risk: 0.1%-1.0%). At any particular risk threshold, the models performed very similarly. At all thresholds considered, they showed a small improvement in the ability to identify high-risk individuals compared with age- and sex-based screening. At a 6-year risk threshold of 0.4%, the best-performing model would select 12.3% of the population for screening and detect one case for every 180 individuals screened. Screening all men over the age of 60 (14.1% of the population) would detect one case for every 206 individuals screened. All of the models performed less well in women than in men.
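The threshold comparison can be made concrete with a short Python sketch. The function names and the per-person weights are assumptions for illustration only: the weights stand in for whatever adjustment maps cohort members onto the general population, which the abstract does not specify. The second function does no more than arithmetic on the figures reported above, taken at face value.

```python
from typing import Sequence, Tuple


def screening_summary(
    risks: Sequence[float],
    outcomes: Sequence[int],
    weights: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (share of the weighted population at or above the risk
    threshold, individuals screened per case detected)."""
    total = sum(weights)
    screened = sum(w for r, w in zip(risks, weights) if r >= threshold)
    detected = sum(
        w * y for r, y, w in zip(risks, outcomes, weights) if r >= threshold
    )
    share = screened / total
    per_case = screened / detected if detected > 0 else float("inf")
    return share, per_case


def cases_per_100k(share_screened: float, screened_per_case: float) -> float:
    """Cases detected per 100,000 population for a given strategy."""
    return 100_000 * share_screened / screened_per_case


# Made-up toy cohort with hypothetical population weights.
toy_share, toy_per_case = screening_summary(
    risks=[0.002, 0.005, 0.001, 0.006],
    outcomes=[0, 1, 0, 0],
    weights=[1.0, 2.0, 1.0, 1.0],
    threshold=0.004,  # a 0.4% 6-year risk cut-off
)

# Figures reported in the abstract.
model_cases = cases_per_100k(0.123, 180)        # best-performing model at 0.4%
men_over_60_cases = cases_per_100k(0.141, 206)  # all men over 60
```

On the reported figures, both strategies work out to roughly 68 detected cases per 100,000 population, with the model-based strategy screening about 13% fewer people (12.3% versus 14.1% of the population).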
Consequences
This is the first comprehensive external validation of risk prediction models for kidney cancer. Five models showed both reasonable discrimination and good calibration in a UK-based population. The best-performing models could improve the efficiency of screening by similar amounts in a UK population, with the choice of model depending on the availability of data. However, very few people are predicted to have a 6-year risk higher than 1%, and the models perform worse in women. Future research may consider the potential benefits of adding biomarkers or genetic risk factors to phenotypic models.