6 research outputs found

Prediction models for hormone receptor status in female breast cancer do not extend to males: further evidence of sex-based disparity in breast cancer

Author: Abu-Eid Rasha
Carrero Zunamys Itzell
Chatterji Subarnarekha
Cifci Didem
Kather Jakob Nikolas
Loeffler Chiara Maria Lavinia
Niehues Jan Moritz
Saldanha Oliver Lester
Speirs Valerie
van Treeck Marko
Veldhuizen Gregory Patrick
Publication date: 08/11/2023

MEDBERT.de: A Comprehensive German BERT Model for the Medical Domain

Author: Adams Lisa C.
Aerts Hugo JWL.
Augustin Moritz
Borchert Florian
Bressem Keno K.
Busch Felix
Grosser Lennart
Grundmann Paul
Liu Leonhard
Loyen Jan P.
Löser Alexander
Makowski Marcus R.
Niehues Stefan M.
Papaioannou Jens-Michalis
Xu Lina
Publication date: 24/03/2023

This paper presents medBERTde, a pre-trained German BERT model specifically designed for the German medical domain. The model has been trained on a large corpus of 4.7 million German medical documents and has been shown to achieve new state-of-the-art performance on eight different medical benchmarks covering a wide range of disciplines and medical document types. In addition to evaluating the overall performance of the model, this paper also conducts a more in-depth analysis of its capabilities. We investigate the impact of data deduplication on the model's performance, as well as the potential benefits of using more efficient tokenization methods. Our results indicate that domain-specific models such as medBERTde are particularly useful for longer texts, and that deduplication of training data does not necessarily lead to improved performance. Furthermore, we found that efficient tokenization plays only a minor role in improving model performance, and attribute most of the improved performance to the large amount of training data. To encourage further research, the pre-trained model weights and new benchmarks based on radiological data are made publicly available for use by the scientific community. Comment: Keno K. Bressem, Jens-Michalis Papaioannou, and Paul Grundmann contributed equally.

arXiv.org e-Print Archive

Fully transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study

Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer (CRC). For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC was approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but have not been used for biomarker prediction in cancer at a large scale. In addition, most DL approaches have been trained on small patient cohorts, which limits their clinical utility. Methods: In this study, we developed a new fully transformer-based pipeline for end-to-end biomarker prediction from pathology slides. We combine a pre-trained transformer encoder and a transformer network for patch aggregation, capable of yielding single and multi-target prediction at patient level. We train our pipeline on over 9,000 patients from 10 colorectal cancer cohorts. Results: A fully transformer-based approach massively improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training on a large multicenter cohort, we achieve a sensitivity of 0.97 with a negative predictive value of 0.99 for MSI prediction on surgical resection specimens. We demonstrate for the first time that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem. Interpretation: A fully transformer-based end-to-end pipeline trained on thousands of pathology slides yields clinical-grade performance for biomarker prediction on surgical resections and biopsies. Our new methods are freely available under an open source license.

arXiv.org e-Print Archive

ZORA

A multimodal comparison of latent denoising diffusion probabilistic models and generative adversarial networks for medical image synthesis

Author: Christiane Kuhl
Christoph Haarburger
Daniel Truhn
Firas Khader
Gustav Müller-Franzes
Jakob Nikolas Kather
Jan Moritz Niehues
Soroosh Tayebi Arasteh
Sven Nebelung
Teresa Nolte
Tianci Wang
Tianyu Han
Publication venue: Nature Portfolio
Publication date: 14/12/2022

Abstract: Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have been recently addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 compared to 0.19 in the AIROGS dataset, 0.41 compared to 0.02 (cGAN) and 0.24 (StyleGAN-3) in the CRCDX dataset, and 0.32 compared to 0.17 (ProGAN) and 0.08 (StyleGAN-3) in the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, leading to improved diversity and fewer artifacts in the generated images.

arXiv.org e-Print Archive

Directory of Open Access Journals

Self-supervised attention-based deep learning for pan-cancer mutation prediction from histopathology

Author: Alexander T. Pearson
Chiara M. L. Loeffler
Didem Cifci
Gregory Patrick Veldhuizen
Jakob Nikolas Kather
Jan Moritz Niehues
Katherine Jane Hewitt
Marko van Treeck
Oliver Lester Saldanha
Siddhi Ramesh
Tobias P. Seraphin
Publication venue: Nature Portfolio
Publication date: 01/03/2023

Abstract: The histopathological phenotype of tumors reflects the underlying genetic makeup. Deep learning can predict genetic alterations from pathology slides, but it is unclear how well these predictions generalize to external datasets. We performed a systematic study on deep learning-based prediction of genetic alterations from histology, using two large datasets of multiple tumor types. We show that an analysis pipeline that integrates self-supervised feature extraction and attention-based multiple instance learning achieves robust predictability and generalizability.

Directory of Open Access Journals

Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study

Author: Bonner Joseph D
Boxberg Melanie
Brenner Hermann
Church David N
Foersch Sebastian
Geppert Carol
Grabsch Heike I
Gray Richard
Greenson Joel K
Gruber Stephen B
Hawkins Nicholas J
Hay Jennifer
Hoffmeister Michael
Hutchins Gordon G A
James Jacqueline A
Jenniskens Josien C A
Jonnagaddala Jitendra
Kather Jakob Nikolas
Koelzer Viktor H
Langer Rupert
Loughrey Maurice B
Magill Laura
Matek Christian
Morton Dion
Mueller Wolfram
Niehues Jan Moritz
Nowak Marta
Offermans Kelly
Ouyang Xiaoming
Peng Chaolong
Peng Tingying
Quirke Philip
Reisenbüchler Daniel
Rennert Gad
Richman Susan D
Salto-Tellez Manuel
Schmolze Daniel
Schnabel Julia A
Seymour Matthew
Truhn Daniel
van den Brandt Piet A
Veldhuizen Gregory Patrick
Wagner Sophia J
Ward Robyn L
West Nicholas P
Yuan Tanwei
Zhi Cheng
Zhu Jiefu
Publication date: 23/08/2023

Deep learning (DL) can accelerate the prediction of prognostic biomarkers from routine pathology slides in colorectal cancer (CRC). However, current approaches rely on convolutional neural networks (CNNs) and have mostly been validated on small patient cohorts. Here, we develop a new transformer-based pipeline for end-to-end biomarker prediction from pathology slides by combining a pre-trained transformer encoder with a transformer network for patch aggregation. Our transformer-based approach substantially improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training and evaluating on a large multicenter cohort of over 13,000 patients from 16 colorectal cancer cohorts, we achieve a sensitivity of 0.99 with a negative predictive value of over 0.99 for prediction of microsatellite instability (MSI) on surgical resection specimens. We demonstrate that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem.

Maastricht University Research Portal

Queen's University Belfast Research Portal
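
Note on the metrics quoted in the colorectal cancer abstracts above: sensitivity and negative predictive value (NPV) are both derived from a confusion matrix. The sketch below uses the standard textbook definitions and is not the authors' evaluation code. The function names and the patient counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly positive cases (e.g. MSI tumors) that the test flags."""
    return tp / (tp + fn)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Share of negative test results that are truly negative."""
    return tn / (tn + fn)


# Hypothetical counts for illustration only; they are NOT taken from the study.
tp, fn = 99, 1    # positive cases: detected vs. missed
tn, fp = 850, 50  # negative cases: correctly cleared vs. falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"NPV         = {negative_predictive_value(tn, fn):.3f}")
```

A high NPV means a negative result can be used to rule a condition out, which is why the two figures are reported together for a pre-screening test.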