Outputs and Growth of Primary Care Databases in the UK: Bibliometric Analysis

Talk Code: 
Zain Chaudhry & Fahmida Mannan
Fahmida Mannan, Angela Gibson-White, Usama Syed, Shirin Ahmed, Azeem Majeed
Author institutions: 
Imperial College School of Public Health


Data held in electronic health databases are now commonly used by researchers. The United Kingdom has several such databases derived from primary care records. The three major ones are the ‘Clinical Practice Research Datalink’ (CPRD), ‘The Health Improvement Network’ (THIN) and ‘QResearch’. Research outputs generated from these databases have increased substantially over time, but have yet to be reviewed.


This study compares research outputs from CPRD, THIN and QResearch to assess growth in publications over a 10-year period (2004-2013). CPRD was also reviewed separately over 20 years as a case study. Publications that used data from CPRD and QResearch were extracted using the Science Citation Index (SCI) of the Thomson Scientific Institute for Scientific Information (Web of Science). Data for THIN were obtained from University College London and validated in Web of Science. All three databases were analysed for their growth in publications, the speciality areas covered and the journals in which their data have been published.


The three databases collectively produced 1,296 publications over the 10-year period (2004-2013), with CPRD representing 63.6% (n=825), THIN 30.4% (n=394) and QResearch 5.9% (n=77). Pharmacoepidemiology and General Medicine were the most common specialities featured. Over this period, publications from THIN and QResearch increased slowly, whereas CPRD publications increased substantially in the last 4 years, with almost 75% of CPRD publications published in the past 9 years.
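
The share of publications attributed to each database can be rechecked from the reported counts. The following is a minimal sketch in Python, not the authors' analysis code; the variable names are illustrative, and the only inputs are the three counts quoted in the results.

```python
# Publication counts as reported in the results (not re-extracted data).
counts = {"CPRD": 825, "THIN": 394, "QResearch": 77}

# Total number of publications across the three databases.
total = sum(counts.values())

# Percentage share of each database, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: n={counts[name]}, {share}% of {total}")
```

With conventional rounding the CPRD share comes out at 63.7% rather than the reported 63.6%, which suggests the reported percentages were truncated to one decimal place; the THIN and QResearch shares match the reported values.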


There is strong evidence that these databases are facilitating and enhancing scientific research and are growing from year to year. The observed variability in research outputs between these healthcare databases may be attributable to inherent differences in the types of Computerised Medical Records (CMR) that contribute to them, namely disparities in the extent of linking a specific, coded clinical problem to a visit episode. The three databases could become more powerful research tools if the National Health Service and general practitioners can provide accurate and comprehensive data for inclusion in these databases.

Submitted by: 
Zain Chaudhry
Funding acknowledgement: