ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain
Transformer design is the de facto standard for natural language processing
tasks. The success of the transformer design in natural language processing has
lately piqued the interest of researchers in the domain of computer vision.
When compared to Convolutional Neural Networks (CNNs), Vision Transformers
(ViTs) are becoming more popular and dominant solutions for many vision
problems. Transformer-based models outperform other types of networks, such as
convolutional and recurrent neural networks, in a range of visual benchmarks.
We evaluate various vision transformer models in this work by grouping them
by task and examining their benefits and drawbacks. ViTs can
overcome several potential difficulties with CNNs. The goal of this survey
is to present the initial uses of ViTs in computer vision (CV). In the
first phase, we categorize the CV applications where ViTs are appropriate:
image classification, object identification, image segmentation, video
transformers, image denoising, and neural architecture search (NAS). Our next
step is to analyze the state of the art in each area and identify the models
that are currently available. In addition, we outline numerous open research
challenges as well as prospective research directions.
Comment: ICCD-2023. arXiv admin note: substantial text overlap with
arXiv:2208.04309 by other authors
Continual Learning in Medical Image Analysis: A Comprehensive Review of Recent Advancements and Future Prospects
Medical image analysis has witnessed remarkable advancements in recent years,
even surpassing human-level performance, driven by the rapid
development of advanced deep-learning algorithms. However, when the inference
dataset differs slightly from what the model saw during one-time training,
model performance is greatly compromised. This situation requires restarting
the training process using both the old and the new data, which is
computationally costly, does not align with the human learning process, and
imposes storage constraints and privacy concerns. Alternatively, continual
learning has emerged as a crucial approach for developing unified and
sustainable deep models to deal with new classes, tasks, and the drifting
nature of data in non-stationary environments for various application areas.
Continual learning techniques enable models to adapt and accumulate knowledge
over time, which is essential for maintaining performance on evolving datasets
and novel tasks. This systematic review paper provides a comprehensive overview
of the state-of-the-art in continual learning techniques applied to medical
imaging analysis. We present an extensive survey of existing research, covering
topics including catastrophic forgetting, data drifts, stability, and
plasticity requirements. Further, an in-depth discussion of key components of a
continual learning framework such as continual learning scenarios, techniques,
evaluation schemes, and metrics is provided. Continual learning techniques
encompass various categories, including rehearsal, regularization,
architectural, and hybrid strategies. We assess the popularity and
applicability of continual learning categories in various medical sub-fields
like radiology and histopathology.
Attention Mechanisms in Medical Image Segmentation: A Survey
Medical image segmentation plays an important role in computer-aided
diagnosis. Attention mechanisms that distinguish important parts from
irrelevant parts have been widely used in medical image segmentation tasks.
This paper systematically reviews the basic principles of attention mechanisms
and their applications in medical image segmentation. First, we review the
basic concepts and formulation of attention mechanisms. Second, we survey over
300 articles related to medical image segmentation and divide them into two
groups based on their attention mechanisms: non-Transformer attention and
Transformer attention. For each group, we analyze the attention mechanisms
in depth from three aspects based on the current literature, i.e., the
principle of the mechanism (what to use), implementation methods (how to use),
and application tasks (where to use). We also thoroughly analyze the
advantages and limitations of their applications to different tasks. Finally,
we summarize the current state of research and shortcomings in the field, and
discuss the potential challenges in the future, including task specificity,
robustness, standard evaluation, etc. We hope that this review can showcase the
overall research context of traditional and Transformer attention methods,
provide a clear reference for subsequent research, and inspire more advanced
attention research, not only in medical image segmentation, but also in other
image analysis scenarios.
Comment: Submitted to Medical Image Analysis, survey paper, 34 pages, over 300
references
CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting
Nuclear detection, segmentation and morphometric profiling are essential in
helping us further understand the relationship between histology and patient
outcome. To drive innovation in this area, we set up a community-wide challenge
using the largest available dataset of its kind to assess nuclear segmentation
and cellular composition. Our challenge, named CoNIC, stimulated the
development of reproducible algorithms for cellular recognition with real-time
result inspection on public leaderboards. We conducted an extensive
post-challenge analysis based on the top-performing models using 1,658
whole-slide images of colon tissue. With around 700 million detected nuclei per
model, associated features were used for dysplasia grading and survival
analysis, where we demonstrated that the challenge's improvement over the
previous state-of-the-art led to significant boosts in downstream performance.
Our findings also suggest that eosinophils and neutrophils play an important
role in the tumour microenvironment. We release challenge models and WSI-level
results to foster the development of further methods for biomarker discovery.
Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries
This two-volume set, LNCS 12962 and 12963, constitutes the thoroughly refereed proceedings of the 7th International MICCAI Brainlesion Workshop, BrainLes 2021, as well as the RSNA-ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge, the Federated Tumor Segmentation (FeTS) Challenge, the Cross-Modality Domain Adaptation (CrossMoDA) Challenge, and the challenge on Quantification of Uncertainties in Biomedical Image Quantification (QUBIQ). These were held jointly at the 23rd Medical Image Computing for Computer Assisted Intervention Conference, MICCAI 2020, in September 2021. The 91 revised papers presented in these volumes were selected from 151 submissions. Due to the COVID-19 pandemic, the conference was held virtually. This is an open access book.
Multiple Object Tracking in Light Microscopy Images Using Graph-based and Deep Learning Methods
Multi-object tracking (MOT) is an image analysis problem that involves localizing objects and linking them over time in an image sequence, with numerous applications in areas such as autonomous driving, robotics, and surveillance. Beyond these technical application areas, there is also a great need for MOT in biomedical applications. For example, experiments recorded by light microscopy over several hours or days can contain hundreds or even thousands of similar-looking objects, which makes manual analysis impossible. However, drawing reliable conclusions from the tracked objects requires high-quality predicted trajectories. Domain-specific MOT approaches are therefore needed that can account for the particularities of light microscopy data. This thesis develops two novel methods for the MOT problem in light microscopy images and presents approaches for comparing tracking methods.
To separate the performance of a tracking method from the quality of the segmentation, an approach is proposed that allows the tracking method to be analyzed independently of the segmentation, which also makes it possible to study the robustness of tracking methods given degraded segmentation data. Furthermore, a graph-based tracking method is proposed that bridges the gap between tracking methods that are easy to apply but perform less well and well-performing tracking methods with many hard-to-tune parameters. The proposed tracking method has only a few manually tunable parameters and is easily applicable to 2D and 3D datasets. By modeling prior knowledge about the shape of the tracking graph, the proposed tracking method is also able to correct certain types of segmentation errors automatically. In addition, a deep learning-based approach is proposed that learns the tasks of instance segmentation and object tracking simultaneously in a single neural network. The proposed approach also learns to predict representations that are understandable to humans. To show the performance of the two proposed tracking methods in comparison with other current domain-specific tracking approaches, they are applied to a domain-specific benchmark. Moreover, further evaluation criteria for tracking methods are introduced and used to compare the two proposed tracking methods.
Leveraging Related Instances for Better Prediction
One fundamental task of machine learning is to predict output responses y from input data x. However, despite significant advances in the past decade, most current predictive models still consider each x in isolation when making predictions, which inevitably limits model performance, as the model may lose the opportunity to extract helpful information from other related instances to better predict x. This dissertation pushes the boundaries of machine learning research by explicitly taking advantage of related instances for better prediction. We find that leveraging multiple learned or intrinsically related instances when making predictions in a data-driven and flexible manner is important for achieving good performance over a myriad of tasks. When x is a single instance, we can flexibly find related instances based on similarity measurements. We develop algorithms that consider related neighborhood instances for a specific given x during prediction. Our assumption is that similar instances can be found near one another in an embedding space and that they locally share a similar predictive function. We develop a model, Meta-Neighborhood, to learn a dictionary of neighbor points during training so that we can retrieve related instances from this dictionary during inference for improved classification and regression. Furthermore, this work is extended to Differentiable Wavetable Synthesis (DWTS), which leverages a dictionary of related basis waveforms for audio synthesis. We show that realistic audio can be synthesized by directly combining those basis waveforms. Next, we consider the case where x is a given collection that contains multiple instances. In this case, x already includes multiple related instances, and we develop methods that learn how to exploit these related instances together to improve the prediction.
Algorithms are developed to discover the instances most related to the prediction task and to encourage the model to focus on these related instances for prediction. We first develop a transparent and human-understandable algorithm, CKME, that summarizes millions of instances into hundreds whilst being comparably accurate for single-cell set classification. Then an algorithm for time series imputation, NRTSI, is developed that treats the time series as a set and imputes missing data by leveraging the related observed data.
Doctor of Philosophy