1,951 research outputs found

    Out of distribution detection for intra-operative functional imaging

    Multispectral optical imaging is becoming a key tool in the operating room. Recent research has shown that machine learning algorithms can be used to convert pixel-wise reflectance measurements to tissue parameters, such as oxygenation. However, the accuracy of these algorithms can only be guaranteed if the spectra acquired during surgery match the ones seen during training. It is therefore of great interest to detect so-called out of distribution (OoD) spectra to prevent the algorithm from presenting spurious results. In this paper we present an information-theoretic approach to OoD detection based on the widely applicable information criterion (WAIC). Our work builds upon recent methodology related to invertible neural networks (INN). Specifically, we make use of an ensemble of INNs as we need their tractable Jacobians in order to compute the WAIC. Comprehensive experiments with in silico and in vivo multispectral imaging data indicate that our approach is well-suited for OoD detection. Our method could thus be an important step towards reliable functional imaging in the operating room.
    Comment: The final authenticated version is available online at https://doi.org/10.1007/978-3-030-32689-0_
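A minimal sketch of how an ensemble-based WAIC score might be computed for a single spectrum. The abstract does not state the formula, so the sign convention (variance minus mean of the per-member log-likelihoods, higher meaning "more likely OoD"), the use of the population variance, the standard normal latent distribution, and the function names `gaussian_latent_log_likelihood` and `waic_score` are all assumptions made for illustration. The sample numbers are invented.

```python
import math
from statistics import mean, pvariance


def gaussian_latent_log_likelihood(z, log_abs_det_jacobian):
    """log p(x) by change of variables, assuming z = f(x) is standard normal.

    z: latent vector produced by one invertible network for input x.
    log_abs_det_jacobian: log |det df/dx| for that input (tractable for INNs).
    """
    d = len(z)
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * d * math.log(2.0 * math.pi)
    return log_pz + log_abs_det_jacobian


def waic_score(member_log_likelihoods):
    """Assumed WAIC form: variance across ensemble members minus their mean.

    A high score means the members assign low and/or inconsistent likelihoods.
    """
    return pvariance(member_log_likelihoods) - mean(member_log_likelihoods)


# Hypothetical per-member log-likelihoods for two spectra (invented values).
in_distribution = [-3.1, -3.0, -2.9]   # members agree, likelihood is high
out_of_distribution = [-9.0, -4.0, -14.0]  # members disagree, likelihood is low

print(waic_score(in_distribution))
print(waic_score(out_of_distribution))
```

In practice a threshold on this score, chosen on held-out in-distribution data, would decide which spectra are flagged; the abstract does not say how the authors set it.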

    Quantitative performance and optimal regularization parameter in block sequential regularized expectation maximization reconstructions in clinical 68Ga-PSMA PET/MR.

    BACKGROUND: In contrast to ordered subset expectation maximization (OSEM), block sequential regularized expectation maximization (BSREM) positron emission tomography (PET) reconstruction algorithms can run until full convergence while controlling image quality and noise. Recent studies with BSREM and 18F-FDG PET reported higher signal-to-noise ratios and higher standardized uptake values (SUV). In this study, we investigate the optimal regularization parameter (β) for clinical 68Ga-PSMA PET/MR reconstructions in the pelvic region applying time-of-flight (TOF) BSREM in comparison to TOF OSEM. Two-minute emission data from the pelvic region of 25 patients who underwent 68Ga-PSMA PET/MR were retrospectively reconstructed. Reference OSEM reconstructions had 28 subsets and 2 iterations. BSREM reconstructions were performed with 15 β values between 150 and 1200. Regions of interest (ROIs) were drawn around lesions and in uniform background. Background SUVmean (average) and SUVstd (standard deviation), and lesion SUVmax (average of 5 hottest voxels) were calculated. Differences were analyzed using the Wilcoxon matched pairs signed-rank test. RESULTS: A total of 40 lesions were identified in the pelvic region. Background noise (SUVstd) and lesion SUVmax decreased with increasing β. Image reconstructions with β values lower than 400 have higher (p < 0.01) background noise compared to the reference OSEM reconstructions, and are therefore less useful. Lesions with low activity on images reconstructed with β values higher than 600 have a lower (p < 0.05) SUVmax compared to the reference. These reconstructions are likely visually appealing due to the lower background noise, but the lower SUVmax could possibly render small low-uptake lesions invisible. CONCLUSIONS: In our study, we showed that PET images reconstructed with TOF BSREM in combination with the 68Ga-PSMA tracer result in lower background noise and higher SUVmax values in lesions compared to TOF OSEM. Our study indicates that a β value between 400 and 550 might be the optimal compromise between high SUVmax and low background noise.
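The ROI statistics named in this abstract can be sketched as follows. This is only an illustration of the definitions given in the text (SUVmax as the average of the 5 hottest voxels, background SUVmean and SUVstd). The function names, the flat-list representation of an ROI, and the choice of the population standard deviation are assumptions, and the voxel values are invented.

```python
from statistics import mean, pstdev


def lesion_suv_max(roi_voxels, n_hottest=5):
    """SUVmax as defined in the abstract: mean of the n hottest voxels in the ROI."""
    hottest = sorted(roi_voxels, reverse=True)[:n_hottest]
    return mean(hottest)


def background_stats(roi_voxels):
    """Return (SUVmean, SUVstd) for a uniform background ROI.

    The population standard deviation is an assumption; the abstract does not
    say which estimator was used.
    """
    return mean(roi_voxels), pstdev(roi_voxels)


# Invented SUV values for one lesion ROI and one background ROI.
lesion_roi = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
background_roi = [0.9, 1.0, 1.1, 1.0]

print(lesion_suv_max(lesion_roi))
print(background_stats(background_roi))
```

Computing these per patient for each β value and for the OSEM reference yields the paired samples that a signed-rank test would then compare.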

    Intercomparison of global river discharge simulations focusing on dam operation --- Part II: Multiple models analysis in two case-study river basins, Missouri-Mississippi and Green-Colorado

    We performed a twofold intercomparison of river discharge regulated by dams under multiple meteorological forcings among multiple global hydrological models for a historical period by simulation. Paper II provides an intercomparison of river discharge simulated by five hydrological models under four meteorological forcings. This is the first global multimodel intercomparison study on dam-regulated river flow. Although the simulations were conducted globally, the Missouri-Mississippi and Green-Colorado Rivers were chosen as case-study sites in this study. The hydrological models incorporate generic schemes of dam operation, not specific to a certain dam. We examined river discharge on a longitudinal section of river channels to investigate the effects of dams on simulated discharge, especially at the seasonal time scale. We found that the magnitude of dam regulation differed considerably among the hydrological models. The difference was attributable not only to dam operation schemes but also to the magnitude of simulated river discharge flowing into dams. That is, although a similar algorithm of dam operation schemes was incorporated in different hydrological models, the magnitude of dam regulation substantially differed among the models. Intermodel discrepancies tended to decrease toward the lower reaches of these river basins, which means model dependence is less significant toward lower reaches. These case-study results imply that intermodel comparisons of river discharge should be made at different locations along the river’s course to critically examine the performance of hydrological models, because the performance can vary with location.

    Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark

    Purpose: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported for phase recognition on an open data single-center video dataset. In this work we investigated the generalizability of phase recognition algorithms in a multicenter setting including more difficult recognition tasks such as surgical action and surgical skill. Methods: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. Results: F1-scores were achieved for phase recognition between 23.9% and 67.7% (n = 9 teams), for instrument presence detection between 38.5% and 63.8% (n = 8 teams), but for action recognition only between 21.8% and 23.3% (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team). Conclusion: Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery.

    Common Limitations of Image Processing Metrics: A Picture Story

    While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This dynamically updated document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
    Comment: This is a dynamic paper on limitations of commonly used metrics. The current version discusses metrics for image-level classification, semantic segmentation, object detection and instance segmentation. For missing use cases, comments or questions, please contact [email protected] or [email protected]. Substantial contributions to this document will be acknowledged with a co-authorship.
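One of the pitfalls named above, metric behaviour under class imbalance, can be illustrated with a small sketch. The example is not taken from the paper: the data set (5 positives among 100 cases), the trivial "always negative" predictor, and the helper names are invented for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class; defined as 0.0 here when there are no positives at all."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Imbalanced toy data: 5 positive cases, 95 negative cases.
y_true = [1] * 5 + [0] * 95
# A trivial predictor that never reports a positive.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # high, although every positive case is missed
print(f1_score(y_true, y_pred))  # reflects that no positive case was found
```

The two numbers diverge because accuracy is dominated by the majority class, which is the kind of inherent metric property the abstract says is often disregarded.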

    Understanding metric-related pitfalls in image analysis validation

    Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
    Comment: Shared first authors: Annika Reinke, Minu D. Tizabi; shared senior authors: Paul F. Jäger, Lena Maier-Hein

    Adapting Agriculture to Climate Change: A Synopsis of Coordinated National Crop Wild Relative Seed Collecting Programs across Five Continents

    The Adapting Agriculture to Climate Change Project set out to improve the diversity, quantity, and accessibility of germplasm collections of crop wild relatives (CWR). Between 2013 and 2018, partners in 25 countries, heirs to the globetrotting legacy of Nikolai Vavilov, undertook seed collecting expeditions targeting CWR of 28 crops of global significance for agriculture. Here, we describe the implementation of the 25 national collecting programs and present the key results. A total of 4587 unique seed samples from at least 355 CWR taxa were collected, conserved ex situ, safety duplicated in national and international genebanks, and made available through the Multilateral System (MLS) of the International Treaty on Plant Genetic Resources for Food and Agriculture (Plant Treaty). Collections of CWR were made for all 28 targeted crops. Potato and eggplant were the most collected genepools, although the greatest number of primary genepool collections were made for rice. Overall, alfalfa, Bambara groundnut, grass pea and wheat were the genepools for which targets were best achieved. Several of the newly collected samples have already been used in pre-breeding programs to adapt crops to future challenges.

    Twenty-three unsolved problems in hydrology (UPH) – a community perspective

    This paper is the outcome of a community initiative to identify major unsolved scientific problems in hydrology motivated by a need for stronger harmonisation of research efforts. The procedure involved a public consultation through on-line media, followed by two workshops through which a large number of potential science questions were collated, prioritised, and synthesised. In spite of the diversity of the participants (230 scientists in total), the process revealed much about community priorities and the state of our science: a preference for continuity in research questions rather than radical departures or redirections from past and current work. Questions remain focussed on process-based understanding of hydrological variability and causality at all space and time scales. Increased attention to environmental change drives a new emphasis on understanding how change propagates across interfaces within the hydrological system and across disciplinary boundaries. In particular, the expansion of the human footprint raises a new set of questions related to human interactions with nature and water cycle feedbacks in the context of complex water management problems. We hope that this reflection and synthesis of the 23 unsolved problems in hydrology will help guide research efforts for some years to come.