18 research outputs found

    MEDBERT.de: A Comprehensive German BERT Model for the Medical Domain

    This paper presents medBERTde, a pre-trained German BERT model specifically designed for the German medical domain. The model has been trained on a large corpus of 4.7 million German medical documents and has been shown to achieve new state-of-the-art performance on eight different medical benchmarks covering a wide range of disciplines and medical document types. In addition to evaluating the overall performance of the model, this paper also conducts a more in-depth analysis of its capabilities. We investigate the impact of data deduplication on the model's performance, as well as the potential benefits of using more efficient tokenization methods. Our results indicate that domain-specific models such as medBERTde are particularly useful for longer texts, and that deduplication of training data does not necessarily lead to improved performance. Furthermore, we found that efficient tokenization plays only a minor role in improving model performance, and attribute most of the improved performance to the large amount of training data. To encourage further research, the pre-trained model weights and new benchmarks based on radiological data are made publicly available for use by the scientific community. Comment: Keno K. Bressem and Jens-Michalis Papaioannou and Paul Grundmann contributed equall

    The German National Registry of Primary Immunodeficiencies (2012-2017)

    Introduction: The German PID-NET registry was founded in 2009, serving as the first national registry of patients with primary immunodeficiencies (PID) in Germany. It is part of the European Society for Immunodeficiencies (ESID) registry. The primary purpose of the registry is to gather data on the epidemiology, diagnostic delay, diagnosis, and treatment of PIDs. Methods: Clinical and laboratory data were collected from 2,453 patients from 36 German PID centres in an online registry. Data were analysed with the software Stata® and Excel. Results: The minimum prevalence of PID in Germany is 2.72 per 100,000 inhabitants. Among patients aged 1–25, there was a clear predominance of males. The median age of living patients ranged between 7 and 40 years, depending on the respective PID. Predominantly antibody disorders were the most prevalent group with 57% of all 2,453 PID patients (including 728 CVID patients). A gene defect was identified in 36% of patients. Familial cases were observed in 21% of patients. The age of onset for presenting symptoms ranged from birth to late adulthood (range 0–88 years). Presenting symptoms comprised infections (74%) and immune dysregulation (22%). Ninety-three patients were diagnosed without prior clinical symptoms. Regarding the general and clinical diagnostic delay, no PID had undergone a slight decrease within the last decade. However, both SCID and hyper IgE syndrome showed a substantial improvement in shortening the time between onset of symptoms and genetic diagnosis. Regarding treatment, 49% of all patients received immunoglobulin G (IgG) substitution (70% subcutaneous; 29% intravenous; 1% unknown). Three hundred patients underwent at least one hematopoietic stem cell transplantation (HSCT). Five patients had gene therapy. Conclusion: The German PID-NET registry is a precious tool for physicians, researchers, the pharmaceutical industry, politicians, and ultimately the patients, for whom the outcomes will eventually lead to a more timely diagnosis and better treatment

    The Extended Clinical Phenotype of 26 Patients with Chronic Mucocutaneous Candidiasis due to Gain-of-Function Mutations in STAT1

    PURPOSE: Gain-of-function (GOF) mutations in the signal transducer and activator of transcription 1 (STAT1) result in unbalanced STAT signaling and cause immune dysregulation and immunodeficiency. The latter is often characterized by the susceptibility to recurrent Candida infections, resulting in the clinical picture of chronic mucocutaneous candidiasis (CMC). This study aims to assess the frequency of GOF STAT1 mutations in a large international cohort of CMC patients. METHODS: STAT1 was sequenced in genomic DNA from 57 CMC patients and 35 healthy family members. The functional relevance of nine different STAT1 variants was shown by flow cytometric analysis of STAT1 phosphorylation in patients' peripheral blood cells (PBMC) after stimulation with interferon (IFN)-α, IFN-γ or interleukin-27 respectively. Extended clinical data sets were collected and summarized for 26 patients. RESULTS: Heterozygous mutations within STAT1 were identified in 35 of 57 CMC patients (61 %). Out of 39 familial cases from 11 families, 26 patients (67 %) from 9 families and out of 18 sporadic cases, 9 patients (50 %) were shown to have heterozygous mutations within STAT1. Thirteen distinct STAT1 mutations are reported in this paper. Eight of these mutations are known to cause CMC (p.M202V, p.A267V, p.R274W, p.R274Q, p.T385M, p.K388E, p.N397D, and p.F404Y). However, five STAT1 variants (p.F172L, p.Y287D, p.P293S, p.T385K and p.S466R) have not been reported before in CMC patients. CONCLUSION: STAT1 mutations are frequently observed in patients suffering from CMC. Thus, sequence analysis of STAT1 in CMC patients is advised. Measurement of IFN- or IL-induced STAT1 phosphorylation in PBMC provides a fast and reliable diagnostic tool and should be carried out in addition to genetic testing

    Jet cross sections and transverse momentum distributions with NNLOJET

    This talk discusses recent results for next-to-next-to-leading order (NNLO) QCD corrections to jet cross sections and transverse momentum distributions. The results are obtained in the NNLOJET code framework, which provides an implementation of the antenna subtraction method for the handling of infrared singular contributions at NNLO. We briefly describe the NNLOJET implementation, with particular emphasis on the construction of the real radiation phase space, which is tailored to ensure stability in all infrared sensitive regions

    Self-supervised attention-based deep learning for pan-cancer mutation prediction from histopathology

    The histopathological phenotype of tumors reflects the underlying genetic makeup. Deep learning can predict genetic alterations from pathology slides, but it is unclear how well these predictions generalize to external datasets. We performed a systematic study on deep-learning-based prediction of genetic alterations from histology, using two large datasets of multiple tumor types. We show that an analysis pipeline that integrates self-supervised feature extraction and attention-based multiple instance learning achieves robust predictability and generalizability

    Nerve Fibers in the Tumor Microenvironment Are Co-Localized with Lymphoid Aggregates in Pancreatic Cancer

    B cells and tertiary lymphoid structures (TLS) are reported to be important in survival in cancer. Pancreatic Cancer (PDAC) is one of the most lethal cancer types, and currently, it is the seventh leading cause of cancer-related death worldwide. A better understanding of tumor biology is pivotal to improve clinical outcome. The desmoplastic stroma is a complex system in which crosstalk takes place between cancer-associated fibroblasts, immune cells and cancer cells. Indirect and direct cellular interactions within the tumor microenvironment (TME) drive key processes such as tumor progression, metastasis formation and treatment resistance. In order to understand the aggressiveness of PDAC and its resistance to therapeutics, the TME needs to be further unraveled. There are some limited data about the influence of nerve fibers on cancer progression. Here we show that small nerve fibers are located at lymphoid aggregates in PDAC. This unravels future pathways and has potential to improve clinical outcome by a rational development of new therapeutic strategies

    Curative Treatment of POMP-Related Autoinflammation and Immune Dysregulation (PRAID) by Hematopoietic Stem Cell Transplantation

    The rhabdoid tumor (RT) predisposition syndromes 1 and 2 (RTPS1 and 2) are rare genetic conditions rendering young children vulnerable to an increased risk of RT, malignant neoplasms affecting the kidney, miscellaneous soft-part tissues, the liver and the central nervous system (Atypical Teratoid Rhabdoid Tumors, ATRT). Both RTPS1 and RTPS2 are due to pathogenic variants (PV) in genes encoding constituents of the BAF chromatin remodeling complex, i.e. SMARCB1 (RTPS1) and SMARCA4 (RTPS2). In contrast to other genetic disorders related to PVs in SMARCB1 and SMARCA4 such as Coffin-Siris Syndrome, RTPS1 and RTPS2 are characterized by a predominance of truncating PVs, terminating transcription thus explaining a specific cancer risk. The penetrance of RTPS1 early in life is high and associated with poor survival. However, a few unaffected carriers may be encountered. Beyond RT, the tumor spectrum may be larger than initially suspected, and cancer surveillance offered to unaffected carriers (siblings or parents) and long-term survivors of RT is still a matter of discussion. RTPS2 exposes female carriers to an ill-defined risk of small cell carcinoma of the ovaries, hypercalcemic type (SCCOHT), which may appear in prepubertal females. RT surveillance protocols for these rare families have not been established. To address unresolved issues in the care of individuals with RTPS and to propose appropriate surveillance guidelines in childhood, the SIOPe Host Genome working group invited pediatric oncologists and geneticists to contribute to an expert meeting. The current manuscript summarizes conclusions of the panel discussion, including consented statements as well as non-evidence-based proposals for validation in the future