PITCH: How many patients are required to provide a high level of reliability in the Japanese version of the CARE Measure?

Talk Code: 
Takaharu Matsuhisa
Noriyuki Takahashi, Muneyoshi Aomatsu, Kunihiko Takahashi, Jo Nishino, Nobutaro Ban, Stewart W Mercer
Author institutions: 
Nagoya University Graduate School of Medicine; Saku Central Hospital; Aichi Medical University School of Medicine; Institute of Health and Wellbeing, University of Glasgow


Empathy is widely regarded as key to effective consultation in general practice. The Consultation and Relational Empathy (CARE) Measure is a widely used and well-validated patient-rated measure of empathy, originally developed in English. It has been translated and validated in other languages and is used by researchers in various countries, including China, the Netherlands, Sweden and Croatia. Preliminary work has examined the validity and reliability of a Japanese version of the CARE Measure. However, unlike the English and Chinese versions, the ability of the Japanese version to discriminate effectively between individual doctors has not yet been established.


We conducted a secondary analysis of a data set in which 252 patients rated nine attending general practitioners. The key analysis concerned inter-rater reliability: the number of questionnaires required per doctor to attain a reliable mean score, which determines whether the measure can feasibly discriminate between doctors. The inter-rater reliability of the Japanese version of the CARE Measure was assessed using generalizability theory (G-string IV software).
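In a generalizability-theory decision study, reliability of a doctor's mean score rises with the number of patient ratings, because the patient-related error variance is divided by the number of ratings. The sketch below illustrates this logic in Python with purely hypothetical variance components (chosen only so the arithmetic is easy to follow; they are not the study's actual estimates, which came from G-string IV):

```python
def g_coefficient(var_doctor, var_residual, n_patients):
    """Decision-study G coefficient for a doctors-by-patients design:
    reliability of a doctor's mean score over n_patients ratings.
    The residual (patient + error) variance shrinks with averaging."""
    return var_doctor / (var_doctor + var_residual / n_patients)

def patients_needed(var_doctor, var_residual, target=0.8):
    """Smallest number of ratings per doctor reaching the target G."""
    n = 1
    while g_coefficient(var_doctor, var_residual, n) < target:
        n += 1
    return n

# Hypothetical components: between-doctor variance 1.0, residual 10.0.
# A single rating is very unreliable (G = 1/11), but averaging helps:
print(round(g_coefficient(1.0, 10.0, 40), 3))   # G with 40 ratings
print(patients_needed(1.0, 10.0, target=0.8))   # ratings needed for 0.8
```

With these illustrative numbers, 40 ratings per doctor yield G = 0.8, mirroring the pattern reported below; the study's actual result depends on the variance components estimated from the real data.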


The ability of the CARE Measure to discriminate between doctors increased with the number of patients assessed per doctor. A sample size of approximately 40 patients provided an average inter-rater reliability of 0.8.


The current results suggest that the Japanese CARE Measure can effectively differentiate between doctors (on average) with approximately 40 patient ratings per doctor (inter-rater reliability ≥ 0.8). These findings suggest that the measure is feasible for use in routine practice. Further multi-center studies involving larger numbers of doctors are required to confirm these results, as the current study was conducted at a single institution.

Submitted by: 
Takaharu Matsuhisa
Funding acknowledgement: 
This work was supported by JSPS KAKENHI Grant Number JP16K08869.