Search CORE

18 research outputs found

Deployment of Image Analysis Algorithms under Prevalence Shifts

Author: Christodoulou Evangelia
Ferrer Luciana
Godau Patrick
Jäger Paul
Kalinowski Piotr
Maier-Hein Lena
Reinke Annika
Tizabi Minu
Publication venue
Publication date: 24/07/2023
Field of study

Domain gaps are among the most relevant roadblocks in the clinical translation of machine learning (ML)-based solutions for medical image analysis. While current research focuses on new training paradigms and network architectures, little attention is given to the specific effect of prevalence shifts on an algorithm deployed in practice. Such discrepancies between class frequencies in the data used for a method's development/validation and that in its deployment environment(s) are of great importance, for example in the context of artificial intelligence (AI) democratization, as disease prevalences may vary widely across time and location. Our contribution is twofold. First, we empirically demonstrate the potentially severe consequences of missing prevalence handling by analyzing (i) the extent of miscalibration, (ii) the deviation of the decision threshold from the optimum, and (iii) the ability of validation metrics to reflect neural network performance on the deployment population as a function of the discrepancy between development and deployment prevalence. Second, we propose a workflow for prevalence-aware image classification that uses estimated deployment prevalences to adjust a trained classifier to a new environment, without requiring additional annotated deployment data. Comprehensive experiments based on a diverse set of 30 medical classification tasks showcase the benefit of the proposed workflow in generating better classifier decisions and more reliable performance estimates compared to current practice

arXiv.org e-Print Archive

Common Limitations of Image Processing Metrics:A Picture Story

Author: Acion Laura
Antonelli Michela
Arbel Tal
Bakas Spyridon
Bankhead Peter
Baumgartner Michael
Benis Arriel
Cardoso M. Jorge
Cheplygina Veronika
Christodoulou Evangelia
Cimini Beth
Collins Gary S.
Eisenmann Matthias
Farahani Keyvan
Glocker Ben
Godau Patrick
Gutierrez Clarisa Sanchez
Hamprecht Fred
Hashimoto Daniel A.
Heckmann-Nötzel Doreen
Hoffman Michael M.
Huisman Merel
Isensee Fabian
Jannin Pierre
Jäger Paul
Kahn Charles E.
Kainz Bernhard
Karargyris Alexandros
Karthikesalingam Alan
Kavur Emre
Kenngott Hannes
Kleesiek Jens
Kooi Thijs
Kopp-Schneider Annette
Kozubek Michal
Kreshuk Anna
Kurc Tahsin
Landman Bennett A.
Litjens Geert
Madani Amin
Maier-Hein Klaus
Maier-Hein Lena
Martel Anne L.
Mattson Peter
Meijering Erik
Menze Bjoern
Moher David
Moons Karel G. M.
Müller Henning
Nichyporuk Brennan
Nickel Felix
Noyan M. Alican
Petersen Jens
Polat Gorkem
Rajpoot Nasir
Reinke Annika
Reyes Mauricio
Riegler Michael
Rieke Nicola
Rivaz Hassan
Rädsch Tim
Saez-Rodriguez Julio
Saha Anindo
Schroeter Julien
Shetty Shravya
Stieltjes Bram
Sudre Carole H.
Summers Ronald M.
Taha Abdel A.
Tizabi Minu D.
Tsaftaris Sotirios A.
Van Calster Ben
van Ginneken Bram
van Smeden Maarten
Varoquaux Gaël
Wiesenfarth Manuel
Yaniv Ziv R.
Publication venue
Publication date: 01/01/2021
Field of study

While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This living dynamically document has the purpose to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection task. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.Comment: This is a dynamic paper on limitations of commonly used metrics. The current version discusses metrics for image-level classification, semantic segmentation, object detection and instance segmentation. For missing use cases, comments or questions, please contact [email protected] or [email protected]. Substantial contributions to this document will be acknowledged with a co-authorshi

arXiv.org e-Print Archive

Edinburgh Research Explorer

Understanding metric-related pitfalls in image analysis validation

Author: Acion Laura
Antonelli Michela
Arbel Tal
Bakas Spyridon
Baumgartner Michael
Benis Arriel
Blaschko Matthew
Büttner Florian
Calster Ben Van
Cardoso M. Jorge
Chen Jianxu
Cheplygina Veronika
Christodoulou Evangelia
Cimini Beth A.
Collins Gary S.
Eisenmann Matthias
Farahani Keyvan
Ferrer Luciana
Galdran Adrian
Ginneken Bram van
Glocker Ben
Godau Patrick
Haase Robert
Hashimoto Daniel A.
Heckmann-Nötzel Doreen
Hoffman Michael M.
Huisman Merel
Isensee Fabian
Jannin Pierre
Jäger Paul F.
Kahn Charles E.
Kainmueller Dagmar
Kainz Bernhard
Karargyris Alexandros
Karthikesalingam Alan
Kavur A. Emre
Kenngott Hannes
Kleesiek Jens
Kofler Florian
Kooi Thijs
Kopp-Schneider Annette
Kozubek Michal
Kreshuk Anna
Kurc Tahsin
Landman Bennett A.
Litjens Geert
Madani Amin
Maier-Hein Klaus
Maier-Hein Lena
Martel Anne L.
Mattson Peter
Meijering Erik
Menze Bjoern
Moons Karel G. M.
Müller Henning
Nichyporuk Brennan
Nickel Felix
Petersen Jens
Rafelski Susanne M.
Rajpoot Nasir
Reinke Annika
Reyes Mauricio
Riegler Michael A.
Rieke Nicola
Rädsch Tim
Saez-Rodriguez Julio
Shetty Shravya
Smeden Maarten van
Sudre Carole H.
Summers Ronald M.
Sánchez Clara I.
Taha Abdel A.
Tiulpin Aleksei
Tizabi Minu D.
Tsaftaris Sotirios A.
Varoquaux Gaël
Wiesenfarth Manuel
Yaniv Ziv R.
Publication venue
Publication date: 01/01/2023
Field of study

Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.Comment: Shared first authors: Annika Reinke, Minu D. Tizabi; shared senior authors: Paul F. J\"ager, Lena Maier-Hei

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Warwick Research Archives Portal Repository

HAL-CEA

Bern Open Repository and Information System (BORIS)

HAL-Rennes 1

Intraoperative Bildgebung und Visualisierung

Author: Gockel Ines
Maier-Hein Lena
März Keno
Müller-Stich Beat
Navab Nassir
Nickel Felix
Speidel Stefanie
Teber Dogu
Tizabi Minu
Wendler Thomas
Publication venue
Publication date: 01/01/2020
Field of study

OPUS Augsburg

Tattoo tomography: Freehand 3D photoacoustic image reconstruction with an optical pattern

Author: Dreher Kris
Gröhl Janek
Holzwarth Niklas
Maier-Hein Lena
Müller-Stich Beat P
Nölke Jan-Hinrich
Schellenberg Melanie
Seitel Alexander
Tizabi Minu D
Publication venue
Publication date: 11/11/2020
Field of study

Purpose!#!Photoacoustic tomography (PAT) is a novel imaging technique that can spatially resolve both morphological and functional tissue properties, such as vessel topology and tissue oxygenation. While this capacity makes PAT a promising modality for the diagnosis, treatment, and follow-up of various diseases, a current drawback is the limited field of view provided by the conventionally applied 2D probes.!##!Methods!#!In this paper, we present a novel approach to 3D reconstruction of PAT data (Tattoo tomography) that does not require an external tracking system and can smoothly be integrated into clinical workflows. It is based on an optical pattern placed on the region of interest prior to image acquisition. This pattern is designed in a way that a single tomographic image of it enables the recovery of the probe pose relative to the coordinate system of the pattern, which serves as a global coordinate system for image compounding.!##!Results!#!To investigate the feasibility of Tattoo tomography, we assessed the quality of 3D image reconstruction with experimental phantom data and in vivo forearm data. The results obtained with our prototype indicate that the Tattoo method enables the accurate and precise 3D reconstruction of PAT data and may be better suited for this task than the baseline method using optical tracking.!##!Conclusions!#!In contrast to previous approaches to 3D ultrasound (US) or PAT reconstruction, the Tattoo approach neither requires complex external hardware nor training data acquired for a specific application. It could thus become a valuable tool for clinical freehand PAT

arXiv.org e-Print Archive

PubMed Central

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)