The current state of adoption of well-structured electronic health records
and integration of digital methods for storing medical patient data in
structured formats can often be considered inferior to the use of
traditional, unstructured, text-based patient data documentation. Data mining in
the field of medical data analysis often needs to rely solely on the processing of
unstructured data to retrieve relevant information. In natural language processing
(NLP), statistical models have proven successful in various tasks such as
part-of-speech tagging, relation extraction (RE) and named entity recognition
(NER). In this work, we present GERNERMED, the first open, neural NLP model for
NER tasks dedicated to detecting medical entity types in German text data. Here,
we avoid the conflict between protecting sensitive patient data from
training data extraction and publishing the statistical model weights
by training our model on a custom dataset that was translated from publicly
available datasets in a foreign language by a pretrained neural machine
translation model. The sample code and the statistical model are available at:
https://github.com/frankkramer-lab/GERNERME