Do changes in full blood count indices predate symptom reporting in people with undiagnosed bowel cancer? Retrospective analysis using cohort and case control designs.

Tim Holt
Jacqueline Birks, Clare Bankhead, Brian Nicholson, Alice Fuller, Julietta Patnick.
Author institutions: 
Oxford University


Early detection of bowel cancer confers substantial prognostic benefit. Symptoms are non-specific and appear relatively late in the disease process. The ColonFlag is a machine learning algorithm derived by the company Medial EarlySign that calculates a risk score for undiagnosed bowel cancer (adjusted for age and sex) using changes in full blood count (FBC) indices. Such changes also become more evident as the disease progresses. It is not known whether changes in the ColonFlag score predate the reporting of symptoms. We aimed to compare the timescales over which the ColonFlag can predict bowel cancer with the timescales over which relevant symptoms are reported to general practice.


We conducted cohort and case control studies using routine primary care data from the Clinical Practice Research Datalink (CPRD) linked to the National Cancer Registry. Each FBC had an associated ColonFlag score derived during a previous project. We examined the literature for symptoms positively associated with bowel cancer, and identified their codes in CPRD. Main outcomes were: period prevalence of bowel cancer symptoms at six-monthly intervals prior to the index date (date of diagnosis for cases, or a randomly selected date for controls); and odds ratios and Harrell’s c-index for discrimination, estimated using logistic regression on the outcome of bowel cancer diagnosis at a range of time intervals (primary outcome 18-24 months), for both the ColonFlag and symptoms.
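As a rough illustration of the two discrimination measures named above, the following Python sketch computes a c-index (which, for a binary outcome, equals the AUROC) and an unadjusted odds ratio with a Woolf 95% confidence interval. The function names and all numbers are hypothetical and chosen for illustration only; they are not study data, and the study's own logistic regression models are not reproduced here.

```python
import math


def c_index(case_scores, control_scores):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen control; tied pairs count as half."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """Unadjusted odds ratio from a 2x2 table with a Woolf (log-based)
    95% confidence interval. For a single binary predictor this equals
    the odds ratio from an unadjusted logistic regression."""
    estimate = (cases_with * controls_without) / (cases_without * controls_with)
    se_log = math.sqrt(
        1 / cases_with + 1 / cases_without
        + 1 / controls_with + 1 / controls_without
    )
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical risk scores (not study data).
example_case_scores = [0.9, 0.7, 0.6, 0.4]
example_control_scores = [0.8, 0.5, 0.3, 0.2, 0.1]
example_c = c_index(example_case_scores, example_control_scores)

# Hypothetical counts: symptom recorded / not recorded in cases and controls.
example_or, example_lower, example_upper = odds_ratio(30, 70, 15, 85)

print(f"c-index: {example_c:.2f}")
print(f"OR: {example_or:.2f} (95% CI {example_lower:.2f} to {example_upper:.2f})")
```

A c-index of 0.5 indicates no discrimination and 1.0 indicates perfect discrimination, which is the scale on which the ColonFlag and symptom results are reported.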


Our initial dataset included 1,893,641 patients, 10,875,556 full blood counts and 8,918,037 ColonFlag scores. Trajectories of the ColonFlag begin to diverge in cases compared with controls at around 3-4 years before diagnosis. In the cohort study, the AUROC of the ColonFlag for a diagnosis 18-24 months into the future is 0.736 (95% CI 0.715 to 0.759), but this falls to 0.536 when the influence of the age variable is removed through the case control design. At this timescale, symptoms do not add significantly to our ability to predict bowel cancer. The odds ratios for individual symptoms become non-significant earlier than 12 months before the index date, with the exception of abdominal pain (OR 1.29 at 12-18 months) and rectal bleeding (OR at 18-24 months: 2.09 for females, 2.50 for males), p<0.0001.
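The drop in discrimination once age is controlled for can be illustrated with a toy calculation. The Python sketch below assumes a hypothetical score that is simply age plus a small blood-derived component; this is not how the ColonFlag is computed, and every number is invented to make the effect visible. It compares the AUROC against unmatched (younger) controls with the AUROC against age-matched controls.

```python
def auroc(case_scores, control_scores):
    """Share of case-control pairs in which the case scores higher;
    tied pairs count as half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def toy_score(age, blood_points):
    """Hypothetical score: age in years plus a small blood-derived term."""
    return age + blood_points


# Invented (age, blood_points) pairs; not study data.
toy_cases = [(70, 2), (72, 0), (75, 3), (80, 1)]
unmatched_controls = [(50, 1), (55, 2), (60, 0), (65, 3), (71, 1)]
age_matched_controls = [(70, 1), (72, 1), (75, 2), (80, 2)]

case_scores = [toy_score(a, b) for a, b in toy_cases]
unmatched_scores = [toy_score(a, b) for a, b in unmatched_controls]
matched_scores = [toy_score(a, b) for a, b in age_matched_controls]

auroc_unmatched = auroc(case_scores, unmatched_scores)
auroc_matched = auroc(case_scores, matched_scores)

print(f"AUROC, unmatched controls:   {auroc_unmatched:.2f}")
print(f"AUROC, age-matched controls: {auroc_matched:.2f}")
```

With these invented numbers the unmatched comparison gives 0.95 and the age-matched comparison gives 0.50, because once cases and controls share the same age distribution only the blood-derived component can separate them.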


Symptom reporting to general practice increases rapidly in the 12 months prior to a bowel cancer diagnosis, but more than 18 months before diagnosis it is barely different from background reporting. This limits the usefulness of symptoms alone for early-stage detection. The ColonFlag can discriminate usefully at timescales of 18-24 months, although at this stage much of its discriminatory ability comes from the age variable. Its performance increases as the diagnosis approaches, suggesting a place for this algorithm in the primary care setting, supporting other approaches to early detection.

Submitted by: 
Tim Holt
Funding acknowledgement: 
This study was funded by the National Institute for Health Research through the Research for Patient Benefit Programme, grant PB-PG-0817-20025.