
From Episodes of Care to Diagnosis Codes: Automatic Text Categorization for Medico-Economic Encoding

By Patrick Ruch, Julien Gobeill, Imad Tbahriti and Antoine Geissbühler


We report on the design and evaluation of an original system to help assign ICD (International Classification of Diseases) codes to clinical narratives. The task is defined as a multi-class multi-document classification task. We combine a set of machine learning and data-poor methods to generate a single automatic text categorizer, which returns a ranked list of ICD codes. The combined ranking system currently obtains a precision of 75% at high ranks and a recall of about 63% for the top twenty returned codes, against a theoretical upper bound of about 79% (inter-coder agreement). The performance of the data-poor classifier is weak, whereas the use of temporal features such as anamnesis and prescription contents results in a statistically significant improvement.
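The abstract reports precision at high ranks and recall over the top twenty codes of a combined ranked list. The following is a minimal sketch of how such figures can be computed, not the authors' implementation: the weighted-sum fusion, the function names, the weights, and the example codes and scores are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def combine_rankings(score_lists: List[Dict[str, float]],
                     weights: List[float]) -> List[str]:
    """Fuse per-classifier scores by weighted sum (an assumed scheme)
    and return the codes ordered from best to worst."""
    combined: Dict[str, float] = {}
    for scores, weight in zip(score_lists, weights):
        for code, score in scores.items():
            combined[code] = combined.get(code, 0.0) + weight * score
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(combined, key=lambda c: (-combined[c], c))


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k returned codes that are correct."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for code in top if code in relevant) / len(top)


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the correct codes that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for code in ranked[:k] if code in relevant) / len(relevant)


# Hypothetical scores from a learned classifier and a data-poor one.
learned = {"I10": 0.9, "E11": 0.4, "J18": 0.2}
data_poor = {"E11": 0.5, "N39": 0.3}

ranked = combine_rankings([learned, data_poor], weights=[0.7, 0.3])
gold = {"I10", "N39"}  # hypothetical codes assigned by a human coder

print(ranked)
print(precision_at_k(ranked, gold, 1))
print(recall_at_k(ranked, gold, 2))
```

With k set to 20, `recall_at_k` corresponds to the "top twenty returned codes" measure mentioned in the abstract, averaged over all episodes of care in an evaluation set.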

Topics: Articles
Publisher: American Medical Informatics Association
Provided by: PubMed Central