196 research outputs found

    Preoperative Brain Tumor Imaging:Models and Software for Segmentation and Standardized Reporting

    Get PDF
    For patients suffering from brain tumor, prognosis estimation and treatment decisions are made by a multidisciplinary team based on a set of preoperative MR scans. Currently, the lack of standardized and automatic methods for tumor detection and generation of clinical reports, incorporating a wide range of tumor characteristics, represents a major hurdle. In this study, we investigate the most occurring brain tumor types: glioblastomas, lower grade gliomas, meningiomas, and metastases, through four cohorts of up to 4,000 patients. Tumor segmentation models were trained using the AGU-Net architecture with different preprocessing steps and protocols. Segmentation performances were assessed in-depth using a wide-range of voxel and patient-wise metrics covering volume, distance, and probabilistic aspects. Finally, two software solutions have been developed, enabling an easy use of the trained models and standardized generation of clinical reports: Raidionics and Raidionics-Slicer. Segmentation performances were quite homogeneous across the four different brain tumor types, with an average true positive Dice ranging between 80 and 90%, patient-wise recall between 88 and 98%, and patient-wise precision around 95%. In conjunction to Dice, the identified most relevant other metrics were the relative absolute volume difference, the variation of information, and the Hausdorff, Mahalanobis, and object average symmetric surface distances. With our Raidionics software, running on a desktop computer with CPU support, tumor segmentation can be performed in 16-54 s depending on the dimensions of the MRI volume. For the generation of a standardized clinical report, including the tumor segmentation and features computation, 5-15 min are necessary. All trained models have been made open-access together with the source code for both software solutions and validation metrics computation. In the future, a method to convert results from a set of metrics into a final single score would be highly desirable for easier ranking across trained models. In addition, an automatic classification of the brain tumor type would be necessary to replace manual user input. Finally, the inclusion of post-operative segmentation in both software solutions will be key for generating complete post-operative standardized clinical reports

    Challenges and Opportunities of End-to-End Learning in Medical Image Classification

    Get PDF
    Das Paradigma des End-to-End Lernens hat in den letzten Jahren die Bilderkennung revolutioniert, aber die klinische Anwendung hinkt hinterher. Bildbasierte computergestützte Diagnosesysteme basieren immer noch weitgehend auf hochtechnischen und domänen-spezifischen Pipelines, die aus unabhängigen regelbasierten Modellen bestehen, welche die Teilaufgaben der Bildklassifikation wiederspiegeln: Lokalisation von auffälligen Regionen, Merkmalsextraktion und Entscheidungsfindung. Das Versprechen einer überlegenen Entscheidungsfindung beim End-to-End Lernen ergibt sich daraus, dass domänenspezifische Zwangsbedingungen von begrenzter Komplexität entfernt werden und stattdessen alle Systemkomponenten gleichzeitig, direkt anhand der Rohdaten, und im Hinblick auf die letztendliche Aufgabe optimiert werden. Die Gründe dafür, dass diese Vorteile noch nicht den Weg in die Klinik gefunden haben, d.h. die Herausforderungen, die sich bei der Entwicklung Deep Learning-basierter Diagnosesysteme stellen, sind vielfältig: Die Tatsache, dass die Generalisierungsfähigkeit von Lernalgorithmen davon abhängt, wie gut die verfügbaren Trainingsdaten die tatsächliche zugrundeliegende Datenverteilung abbilden, erweist sich in medizinische Anwendungen als tiefgreifendes Problem. Annotierte Datensätze in diesem Bereich sind notorisch klein, da für die Annotation eine kostspielige Beurteilung durch Experten erforderlich ist und die Zusammenlegung kleinerer Datensätze oft durch Datenschutzauflagen und Patientenrechte erschwert wird. Darüber hinaus weisen medizinische Datensätze drastisch unterschiedliche Eigenschaften im Bezug auf Bildmodalitäten, Bildgebungsprotokolle oder Anisotropien auf, und die oft mehrdeutige Evidenz in medizinischen Bildern kann sich auf inkonsistente oder fehlerhafte Trainingsannotationen übertragen. Während die Verschiebung von Datenverteilungen zwischen Forschungsumgebung und Realität zu einer verminderten Modellrobustheit führt und deshalb gegenwärtig als das Haupthindernis für die klinische Anwendung von Lernalgorithmen angesehen wird, wird dieser Graben oft noch durch Störfaktoren wie Hardwarelimitationen oder Granularität von gegebenen Annotation erweitert, die zu Diskrepanzen zwischen der modellierten Aufgabe und der zugrunde liegenden klinischen Fragestellung führen. Diese Arbeit untersucht das Potenzial des End-to-End-Lernens in klinischen Diagnosesystemen und präsentiert Beiträge zu einigen der wichtigsten Herausforderungen, die derzeit eine breite klinische Anwendung verhindern. Zunächst wird der letzten Teil der Klassifikations-Pipeline untersucht, die Kategorisierung in klinische Pathologien. Wir demonstrieren, wie das Ersetzen des gegenwärtigen klinischen Standards regelbasierter Entscheidungen durch eine groß angelegte Merkmalsextraktion gefolgt von lernbasierten Klassifikatoren die Brustkrebsklassifikation im MRT signifikant verbessert und eine Leistung auf menschlichem Level erzielt. Dieser Ansatz wird weiter anhand von kardiologischer Diagnose gezeigt. Zweitens ersetzen wir, dem Paradigma des End-to-End Lernens folgend, das biophysikalische Modell, das für die Bildnormalisierung in der MRT angewandt wird, sowie die Extraktion handgefertigter Merkmale, durch eine designierte CNN-Architektur und liefern eine eingehende Analyse, die das verborgene Potenzial der gelernten Bildnormalisierung und einen Komplementärwert der gelernten Merkmale gegenüber den handgefertigten Merkmalen aufdeckt. Während dieser Ansatz auf markierten Regionen arbeitet und daher auf manuelle Annotation angewiesen ist, beziehen wir im dritten Teil die Aufgabe der Lokalisierung dieser Regionen in den Lernprozess ein, um eine echte End-to-End-Diagnose baserend auf den Rohbildern zu ermöglichen. Dabei identifizieren wir eine weitgehend vernachlässigte Zwangslage zwischen dem Streben nach der Auswertung von Modellen auf klinisch relevanten Skalen auf der einen Seite, und der Optimierung für effizientes Training unter Datenknappheit auf der anderen Seite. Wir präsentieren ein Deep Learning Modell, das zur Auflösung dieses Kompromisses beiträgt, liefern umfangreiche Experimente auf drei medizinischen Datensätzen sowie eine Serie von Toy-Experimenten, die das Verhalten bei begrenzten Trainingsdaten im Detail untersuchen, und publiziren ein umfassendes Framework, das unter anderem die ersten 3D-Implementierungen gängiger Objekterkennungsmodelle umfasst. Wir identifizieren weitere Hebelpunkte in bestehenden End-to-End-Lernsystemen, bei denen Domänenwissen als Zwangsbedingung dienen kann, um die Robustheit von Modellen in der medizinischen Bildanalyse zu erhöhen, die letztendlich dazu beitragen sollen, den Weg für die Anwendung in der klinischen Praxis zu ebnen. Zu diesem Zweck gehen wir die Herausforderung fehlerhafter Trainingsannotationen an, indem wir die Klassifizierungskompnente in der End-to-End-Objekterkennung durch Regression ersetzen, was es ermöglicht, Modelle direkt auf der kontinuierlichen Skala der zugrunde liegenden pathologischen Prozesse zu trainieren und so die Robustheit der Modelle gegenüber fehlerhaften Trainingsannotationen zu erhöhen. Weiter adressieren wir die Herausforderung der Input-Heterogenitäten, mit denen trainierte Modelle konfrontiert sind, wenn sie an verschiedenen klinischen Orten eingesetzt werden, indem wir eine modellbasierte Domänenanpassung vorschlagen, die es ermöglicht, die ursprüngliche Trainingsdomäne aus veränderten Inputs wiederherzustellen und damit eine robuste Generalisierung zu gewährleisten. Schließlich befassen wir uns mit dem höchst unsystematischen, aufwendigen und subjektiven Trial-and-Error-Prozess zum Finden von robusten Hyperparametern für einen gegebene Aufgabe, indem wir Domänenwissen in ein Set systematischer Regeln überführen, die eine automatisierte und robuste Konfiguration von Deep Learning Modellen auf einer Vielzahl von medizinischen Datensetzen ermöglichen. Zusammenfassend zeigt die hier vorgestellte Arbeit das enorme Potenzial von End-to-End Lernalgorithmen im Vergleich zum klinischen Standard mehrteiliger und hochtechnisierter Diagnose-Pipelines auf, und präsentiert Lösungsansätze zu einigen der wichtigsten Herausforderungen für eine breite Anwendung unter realen Bedienungen wie Datenknappheit, Diskrepanz zwischen der vom Modell behandelten Aufgabe und der zugrunde liegenden klinischen Fragestellung, Mehrdeutigkeiten in Trainingsannotationen, oder Verschiebung von Datendomänen zwischen klinischen Standorten. Diese Beiträge können als Teil des übergreifende Zieles der Automatisierung von medizinischer Bildklassifikation gesehen werden - ein integraler Bestandteil des Wandels, der erforderlich ist, um die Zukunft des Gesundheitswesens zu gestalten

    Preoperative Brain Tumor Imaging: Models and Software for Segmentation and Standardized Reporting

    Get PDF
    For patients suffering from brain tumor, prognosis estimation and treatment decisions are made by a multidisciplinary team based on a set of preoperative MR scans. Currently, the lack of standardized and automatic methods for tumor detection and generation of clinical reports, incorporating a wide range of tumor characteristics, represents a major hurdle. In this study, we investigate the most occurring brain tumor types: glioblastomas, lower grade gliomas, meningiomas, and metastases, through four cohorts of up to 4,000 patients. Tumor segmentation models were trained using the AGU-Net architecture with different preprocessing steps and protocols. Segmentation performances were assessed in-depth using a wide-range of voxel and patient-wise metrics covering volume, distance, and probabilistic aspects. Finally, two software solutions have been developed, enabling an easy use of the trained models and standardized generation of clinical reports: Raidionics and Raidionics-Slicer. Segmentation performances were quite homogeneous across the four different brain tumor types, with an average true positive Dice ranging between 80 and 90%, patient-wise recall between 88 and 98%, and patient-wise precision around 95%. In conjunction to Dice, the identified most relevant other metrics were the relative absolute volume difference, the variation of information, and the Hausdorff, Mahalanobis, and object average symmetric surface distances. With our Raidionics software, running on a desktop computer with CPU support, tumor segmentation can be performed in 16–54 s depending on the dimensions of the MRI volume. For the generation of a standardized clinical report, including the tumor segmentation and features computation, 5–15 min are necessary. All trained models have been made open-access together with the source code for both software solutions and validation metrics computation. In the future, a method to convert results from a set of metrics into a final single score would be highly desirable for easier ranking across trained models. In addition, an automatic classification of the brain tumor type would be necessary to replace manual user input. Finally, the inclusion of post-operative segmentation in both software solutions will be key for generating complete post-operative standardized clinical reports.publishedVersio

    Diagnosis, Localization, and Prognosis of Melanoma in WSIs with a Complete Pipeline by Digital Pathology

    Get PDF
    The most dangerous and aggressive form of skin cancer is melanoma, responsible for 90% of skin cancer mortality. Early detection of melanoma plays a crucial role in the prognostic outcome. The diagnostic has to be performed by a pathologist, which is time- consuming. The recent increase in melanoma incidents indicates the growing demand for a more efficient diagnostic process. This thesis’s main objective is to develop a pipeline utilizing two independent pre- trained models built on the VGG16 architecture. This pipeline consists of a diagnostic and a prognostic model. The diagnosis model is responsible for localizing malignant patches in WSIs and giving a patient-level diagnosis. The prognosis model uses the output from the diagnosis model to provide a patient-level prognosis. The complete pipeline provides both a prognostic and a diagnostic tool, which can be used by a pathologist when evaluating Whole Slide Images (WSIs). A total of 243 WSIs were provided by Stavanger University Hospital for this thesis. All have been provided a patient-level label. 203 of the WSIs were used for parameter tuning and 40 were used for testing. The diagnosis model performed with a 100% accuracy when evaluated on the original test, which was provided together with the training set. The prognosis model also per- formed well on the original dataset, with an accuracy of 0.7885. The model’s capability to predict diagnosis and prognosis decreases significantly when being introduced to the new dataset. In addition to developing the pipeline, some parameters for the diagnosis model was found using a ROC cuve. By using the new parameters for the diagnosis model on the validation set, the performance of the diagnosis model increased when using the test set. The prognosis model performed relatively equally in all experiments. A correlation between the number of patches in a WSI and the number of patches predicted malignant was discovered and counteracted by altering the patient-level threshold calculation method.The most dangerous and aggressive form of skin cancer is melanoma, responsible for 90% of skin cancer mortality. Early detection of melanoma plays a crucial role in the prognostic outcome. The diagnostic has to be performed by a pathologist, which is time- consuming. The recent increase in melanoma incidents indicates the growing demand for a more efficient diagnostic process. This thesis’s main objective is to develop a pipeline utilizing two independent pre- trained models built on the VGG16 architecture. This pipeline consists of a diagnostic and a prognostic model. The diagnosis model is responsible for localizing malignant patches in WSIs and giving a patient-level diagnosis. The prognosis model uses the output from the diagnosis model to provide a patient-level prognosis. The complete pipeline provides both a prognostic and a diagnostic tool, which can be used by a pathologist when evaluating Whole Slide Images (WSIs). A total of 243 WSIs were provided by Stavanger University Hospital for this thesis. All have been provided a patient-level label. 203 of the WSIs were used for parameter tuning and 40 were used for testing. The diagnosis model performed with a 100% accuracy when evaluated on the original test, which was provided together with the training set. The prognosis model also per- formed well on the original dataset, with an accuracy of 0.7885. The model’s capability to predict diagnosis and prognosis decreases significantly when being introduced to the new dataset. In addition to developing the pipeline, some parameters for the diagnosis model was found using a ROC cuve. By using the new parameters for the diagnosis model on the validation set, the performance of the diagnosis model increased when using the test set. The prognosis model performed relatively equally in all experiments. A correlation between the number of patches in a WSI and the number of patches predicted malignant was discovered and counteracted by altering the patient-level threshold calculation method

    Artificial intelligence in digital pathology: a diagnostic test accuracy systematic review and meta-analysis

    Full text link
    Ensuring diagnostic performance of AI models before clinical use is key to the safe and successful adoption of these technologies. Studies reporting AI applied to digital pathology images for diagnostic purposes have rapidly increased in number in recent years. The aim of this work is to provide an overview of the diagnostic accuracy of AI in digital pathology images from all areas of pathology. This systematic review and meta-analysis included diagnostic accuracy studies using any type of artificial intelligence applied to whole slide images (WSIs) in any disease type. The reference standard was diagnosis through histopathological assessment and / or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. We identified 2976 studies, of which 100 were included in the review and 48 in the full meta-analysis. Risk of bias and concerns of applicability were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model. 100 studies were identified for inclusion, equating to over 152,000 whole slide images (WSIs) and representing many disease types. Of these, 48 studies were included in the meta-analysis. These studies reported a mean sensitivity of 96.3% (CI 94.1-97.7) and mean specificity of 93.3% (CI 90.5-95.4) for AI. There was substantial heterogeneity in study design and all 100 studies identified for inclusion had at least one area at high or unclear risk of bias. This review provides a broad overview of AI performance across applications in whole slide imaging. However, there is huge variability in study design and available performance data, with details around the conduct of the study and make up of the datasets frequently missing. Overall, AI offers good accuracy when applied to WSIs but requires more rigorous evaluation of its performance.Comment: 26 pages, 5 figures, 8 tables + Supplementary material

    blob loss: instance imbalance aware loss functions for semantic segmentation

    Full text link
    Deep convolutional neural networks have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Sorensen Dice coefficient. By design, DSC can tackle class imbalance; however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor instances and still produce a satisfactory Sorensen Dice coefficient. Nevertheless, missing out on instances will lead to poor detection performance. This represents a critical issue in applications such as disease progression monitoring. For example, it is imperative to locate and surveil small-scale lesions in the follow-up of multiple sclerosis patients. We propose a novel family of loss functions, nicknamed blob loss, primarily aimed at maximizing instance-level detection metrics, such as F1 score and sensitivity. Blob loss is designed for semantic segmentation problems in which the instances are the connected components within a class. We extensively evaluate a DSC-based blob loss in five complex 3D semantic segmentation tasks featuring pronounced instance heterogeneity in terms of texture and morphology. Compared to soft Dice loss, we achieve 5 percent improvement for MS lesions, 3 percent improvement for liver tumor, and an average 2 percent improvement for Microscopy segmentation tasks considering F1 score

    blob loss: instance imbalance aware loss functions for semantic segmentation

    Full text link
    Deep convolutional neural networks have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Sorensen Dice coefficient. By design, DSC can tackle class imbalance; however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor instances and still produce a satisfactory Sorensen Dice coefficient. Nevertheless, missing out on instances will lead to poor detection performance. This represents a critical issue in applications such as disease progression monitoring. For example, it is imperative to locate and surveil small-scale lesions in the follow-up of multiple sclerosis patients. We propose a novel family of loss functions, nicknamed blob loss, primarily aimed at maximizing instance-level detection metrics, such as F1 score and sensitivity. Blob loss is designed for semantic segmentation problems in which the instances are the connected components within a class. We extensively evaluate a DSC-based blob loss in five complex 3D semantic segmentation tasks featuring pronounced instance heterogeneity in terms of texture and morphology. Compared to soft Dice loss, we achieve 5 percent improvement for MS lesions, 3 percent improvement for liver tumor, and an average 2 percent improvement for Microscopy segmentation tasks considering F1 score.Comment: 23 pages, 7 figures // corrected one mistake where it said beta instead of alpha in the tex

    Image Processing and Analysis for Preclinical and Clinical Applications

    Get PDF
    Radiomics is one of the most successful branches of research in the field of image processing and analysis, as it provides valuable quantitative information for the personalized medicine. It has the potential to discover features of the disease that cannot be appreciated with the naked eye in both preclinical and clinical studies. In general, all quantitative approaches based on biomedical images, such as positron emission tomography (PET), computed tomography (CT) and magnetic resonance imaging (MRI), have a positive clinical impact in the detection of biological processes and diseases as well as in predicting response to treatment. This Special Issue, “Image Processing and Analysis for Preclinical and Clinical Applications”, addresses some gaps in this field to improve the quality of research in the clinical and preclinical environment. It consists of fourteen peer-reviewed papers covering a range of topics and applications related to biomedical image processing and analysis
    • …
    corecore