In this work, we present the first corpus for German Adverse Drug Reaction
(ADR) detection in patient-generated content. The data consists of 4,169 binary-
annotated documents from a German patient forum, where users talk about health
issues and get advice from medical doctors. As is common for social media data
in this domain, the class labels of the corpus are highly imbalanced. This,
together with a strong topic imbalance, makes the dataset very challenging,
because the same symptom can often have several causes and is not always
related to medication intake. We aim to encourage further multi-lingual efforts in the domain of ADR
detection and provide preliminary experiments for binary classification using
different methods of zero- and few-shot learning based on a multi-lingual
model. When fine-tuning XLM-RoBERTa first on English patient forum data and
then on the new German data, we achieve an F1-score of 37.52 for the positive
class. We make the dataset and models publicly available for the community.

Comment: Accepted at LREC 202