Multimorbidity research in mental health services requires data on physical
health conditions, which is traditionally limited in mental health care
electronic health records. In this study, we aimed to extract data on
physical health conditions from clinical notes using SemEHR. Data was extracted
from the Clinical Record Interactive Search (CRIS) system at the South London and
Maudsley Biomedical Research Centre (SLaM BRC), and the cohort consisted of all
individuals who had received a primary or secondary diagnosis of severe mental
illness between 2007 and 2018. Three pairs of annotators annotated 2403
documents with an average Cohen's Kappa of 0.757. Results show that the NLP
performance varies across different disease areas (F1 0.601-0.954),
suggesting that the language patterns or terminologies of different condition
groups entail different technical challenges for the same NLP task.

Comment: 4 pages, 2 tables