Attention mechanism in deep neural networks for computer vision tasks
“The attention mechanism, one of the most important algorithms in the deep learning community, was initially designed in natural language processing to enhance the feature representation of key sentence fragments over the context. In recent years, the attention mechanism has been widely adopted in solving computer vision tasks by guiding deep neural networks (DNNs) to focus on specific image features to better understand the semantic information of the image. However, the attention mechanism is not only capable of helping DNNs understand semantics, but also useful for feature fusion, visual cue discovery, and temporal information selection, which are seldom researched. In this study, we take the classic attention mechanism a step further by proposing the Semantic Attention Guidance Unit (SAGU) for multi-level feature fusion to tackle the challenging Biomedical Image Segmentation task. Furthermore, we propose a novel framework that consists of (1) Semantic Attention Unit (SAU), which is an advanced version of SAGU for adaptively bringing high-level semantics to mid-level features, (2) Two-level Spatial Attention Module (TSPAM) for discovering multiple visual cues within the image, and (3) Temporal Attention Module (TAM) for temporal information selection to solve the Video-based Person Re-identification task. To validate our newly proposed attention mechanisms, extensive experiments are conducted on challenging datasets. Our methods obtain competitive performance and outperform state-of-the-art methods. Selected publications are also presented in the Appendix”--Abstract, page iii
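As a rough illustration of the semantic-guidance idea described above, here is a minimal PyTorch sketch of an attention unit in which pooled high-level semantics re-weight mid-level features. The module and parameter names are our own assumptions, not the dissertation's code.

```python
# Minimal sketch (not the dissertation's code): high-level semantics
# produce a per-channel gate that re-weights mid-level features.
import torch
import torch.nn as nn

class SemanticAttentionUnit(nn.Module):
    def __init__(self, mid_channels: int, high_channels: int):
        super().__init__()
        # Project pooled high-level context to a gate over mid-level channels.
        self.gate = nn.Sequential(
            nn.Linear(high_channels, mid_channels),
            nn.Sigmoid(),
        )

    def forward(self, mid_feat, high_feat):
        # mid_feat: (B, C_mid, H, W); high_feat: (B, C_high, H', W')
        ctx = high_feat.mean(dim=(2, 3))                 # global semantic context
        w = self.gate(ctx).unsqueeze(-1).unsqueeze(-1)   # (B, C_mid, 1, 1)
        return mid_feat * w                              # semantics-guided features

# Usage
sau = SemanticAttentionUnit(mid_channels=256, high_channels=512)
out = sau(torch.randn(2, 256, 32, 32), torch.randn(2, 512, 8, 8))  # (2, 256, 32, 32)
```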
A Gaussian mixture model for automated vesicle fusion detection and classification
Accurately detecting and classifying vesicle-plasma membrane fusion events in fluorescence microscopy is of primary interest for studying biological activities in close proximity to the plasma membrane. In this paper, we present a novel Gaussian mixture model for automated identification of vesicle-plasma membrane fusion and partial fusion events in total internal reflection fluorescence microscopy image sequences. Image patches of fusion event candidates are detected in individual images and linked over consecutive frames. A Gaussian mixture model is fit on each image patch of the patch sequence, with outliers rejected for robust Gaussian fitting. The estimated parameters of the Gaussian functions over time are concatenated into feature vectors for classifier training. Applied to three challenging datasets, our method achieved competitive results on detecting and classifying fusion events compared with two state-of-the-art methods. --Abstract, page iii
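A hedged sketch of the fitting step described above, assuming a single 2D Gaussian per patch for brevity (the paper fits a mixture): each patch is fit with residual-based outlier rejection, and the per-frame parameters are concatenated into a feature vector for classifier training. Function names are illustrative.

```python
# Sketch, not the authors' code: robust single-Gaussian fit per patch,
# parameters concatenated over time into an event feature vector.
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(xy, amp, x0, y0, sx, sy, offset):
    x, y = xy
    return (amp * np.exp(-((x - x0) ** 2 / (2 * sx ** 2)
                           + (y - y0) ** 2 / (2 * sy ** 2))) + offset).ravel()

def fit_patch(patch, n_iter=2, clip=2.5):
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.ones(patch.size, dtype=bool)
    p0 = [patch.max() - patch.min(), w / 2, h / 2, w / 4, h / 4, patch.min()]
    for _ in range(n_iter):
        xy = (xx.ravel()[mask], yy.ravel()[mask])
        p0, _ = curve_fit(gauss2d, xy, patch.ravel()[mask], p0=p0, maxfev=5000)
        resid = patch.ravel() - gauss2d((xx.ravel(), yy.ravel()), *p0)
        mask = np.abs(resid) < clip * resid.std()   # reject outlier pixels, refit
    return np.asarray(p0)

def event_features(patch_seq):
    # Concatenate fitted parameters over consecutive frames.
    return np.concatenate([fit_patch(p) for p in patch_seq])
```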
A Deep Learning Framework for Automated Vesicle Fusion Detection
Quantitative analysis of vesicle-plasma membrane fusion events in fluorescence microscopy has been proven to be important in the study of vesicle exocytosis. In this paper, we present a framework to automatically detect fusion events. First, an iterative searching algorithm is developed to extract image patch sequences containing potential events. Then, we propose an event image to integrate the critical image patches of a candidate event into a single-image joint representation as the input to Convolutional Neural Networks (CNNs). According to the duration of candidate events, we design three CNN architectures to automatically learn features for the fusion event classification. Evaluated on 9 challenging datasets, our proposed method showed very competitive performance and outperformed two state-of-the-art methods.
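The event-image idea can be illustrated with a short sketch: tile the critical patches of a candidate event into one joint image and classify it with a small CNN. This is our own minimal PyTorch illustration, not one of the paper's three architectures; all names and layer sizes are assumptions.

```python
# Illustrative sketch of the event-image representation and a small CNN
# classifier; layer sizes are assumptions, not the paper's architectures.
import torch
import torch.nn as nn

def make_event_image(patches):
    # patches: (T, 1, H, W) key frames of one candidate; tile along width.
    return torch.cat(list(patches), dim=-1).unsqueeze(0)  # (1, 1, H, T*W)

class FusionEventCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

patches = torch.randn(5, 1, 32, 32)              # 5 key patches of one candidate
logits = FusionEventCNN()(make_event_image(patches))
```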
Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization
Domain Generalization (DG), a crucial research area, seeks to train models
across multiple domains and test them on unseen ones. In this paper, we
introduce a novel approach, namely, Selective Cross-Modality Distillation for
Domain Generalization (SCMD). SCMD leverages the capabilities of large
vision-language models, specifically CLIP, to train a more efficient model,
ensuring it acquires robust generalization capabilities across unseen domains.
Our primary contribution is a unique selection framework strategically designed
to identify hard-to-learn samples for distillation. In parallel, we introduce a
novel cross-modality module that seamlessly combines the projected features of
the student model with the text embeddings from CLIP, ensuring the alignment of
similarity distributions. We assess SCMD's performance on various benchmarks,
where it empowers a ResNet50 to deliver state-of-the-art performance,
surpassing existing domain generalization methods. Furthermore, we provide a
theoretical analysis of our selection strategy, offering deeper insight into
its effectiveness and potential in the field of DG.
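A minimal sketch of the cross-modality distillation objective as described: project student features into the teacher's embedding space, form a similarity distribution over class text embeddings, match it to the CLIP teacher's distribution with a KL term, and keep only the hardest samples. The dimensions, projection head, and hard-sample rule below are our assumptions, not the paper's exact formulation.

```python
# Hedged sketch of selective cross-modality distillation; not SCMD's code.
import torch
import torch.nn.functional as F

def scmd_like_loss(student_feat, proj, text_emb, teacher_img_emb,
                   tau=0.07, top_frac=0.5):
    s = F.normalize(proj(student_feat), dim=-1)      # (B, D) projected student
    t = F.normalize(teacher_img_emb, dim=-1)         # (B, D) CLIP image features
    txt = F.normalize(text_emb, dim=-1)              # (K, D) class text embeddings
    p_s = F.log_softmax(s @ txt.t() / tau, dim=-1)   # student similarity dist.
    p_t = F.softmax(t @ txt.t() / tau, dim=-1)       # teacher similarity dist.
    per_sample = F.kl_div(p_s, p_t, reduction="none").sum(-1)
    # Selective distillation: keep only the hardest fraction of the batch.
    k = max(1, int(top_frac * per_sample.numel()))
    return torch.topk(per_sample, k).values.mean()

proj = torch.nn.Linear(2048, 512)                    # ResNet50 dim -> CLIP dim
loss = scmd_like_loss(torch.randn(8, 2048), proj,
                      torch.randn(7, 512), torch.randn(8, 512))
```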
Scheduling Mixed-Criticality Real-Time Systems
This dissertation addresses the following question in the design of scheduling policies and resource allocation mechanisms for contemporary embedded systems implemented on integrated computing platforms: in a multitasking system where it is hard to estimate a task's worst-case execution time, how do we assign task priorities so that 1) the safety-critical tasks are guaranteed to be completed within a specified length of time, and 2) the non-critical tasks are also guaranteed to be completed within a predictable length of time if no task actually consumes its worst-case execution time? This dissertation answers this question based on the mixed-criticality real-time system model, which defines multiple worst-case execution scenarios and demands a scheduling policy that provides provable timing guarantees to each level of critical tasks with respect to each type of scenario. Two scheduling algorithms are proposed to serve this model. The OCBP algorithm is aimed at discrete one-shot tasks with an arbitrary number of criticality levels. The EDF-VD algorithm is aimed at recurrent tasks with two criticality levels (safety-critical and non-critical). Both algorithms are proved to optimally minimize the percentage of computational resource waste within two criticality levels. More in-depth investigations into the relationship among the computational resource requirements of the different criticality levels are also provided for both algorithms. Doctor of Philosophy
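For the EDF-VD algorithm mentioned above, the published sufficient schedulability test for implicit-deadline sporadic tasks with two criticality levels can be sketched in a few lines; the variable names are ours.

```python
# Sketch of the EDF-VD sufficient test for dual-criticality systems.
def edf_vd_test(tasks):
    """tasks: list of (crit, u_lo, u_hi) with crit in {'LO', 'HI'}.
    Returns the virtual-deadline scaling factor x in (0, 1], or None
    if the sufficient test fails."""
    u_lo_lo = sum(u_lo for c, u_lo, _ in tasks if c == 'LO')
    u_hi_lo = sum(u_lo for c, u_lo, _ in tasks if c == 'HI')
    u_hi_hi = sum(u_hi for c, _, u_hi in tasks if c == 'HI')
    if u_lo_lo >= 1:
        return None                        # LO tasks alone overload the CPU
    x = u_hi_lo / (1 - u_lo_lo)            # smallest x meeting the LO-mode bound
    if x <= 1 and x * u_lo_lo + u_hi_hi <= 1:
        return x                           # HI jobs get virtual deadlines x * T
    return None

# Example: one HI task (u_lo=0.3, u_hi=0.6), one LO task (u_lo=0.4)
print(edf_vd_test([('HI', 0.3, 0.6), ('LO', 0.4, 0.0)]))   # -> 0.5
```

Intuitively, x shrinks the deadlines of safety-critical tasks in the normal (LO) scenario so that, if a criticality switch occurs, enough slack remains to finish them by their true deadlines even at their HI-level execution budgets.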
The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion
This paper presents a novel monocular depth estimation method, named ECFNet,
for estimating high-quality monocular depth with clear edges and valid overall
structure from a single RGB image. We thoroughly investigate the key factor
that affects the edge depth estimation of MDE networks, and conclude that
edge information itself plays a critical role in predicting depth details.
Driven by this analysis, we propose to explicitly
employ the image edges as input for ECFNet and fuse the initial depths from
different sources to produce the final depth. Specifically, ECFNet first uses a
hybrid edge detection strategy to get the edge map and edge-highlighted image
from the input image, and then leverages a pre-trained MDE network to infer the
initial depths of the aforementioned three images. After that, ECFNet utilizes
a layered fusion module (LFM) to fuse the initial depths, which will be further
updated by a depth consistency module (DCM) to form the final estimation.
Extensive experimental results on public datasets and ablation studies indicate
that our method achieves state-of-the-art performance. Project page:
https://zrealli.github.io/edgedepth. Comment: 17 pages, 19 figures
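A hedged sketch of the pipeline as described, not the released ECFNet code: derive an edge map and an edge-highlighted image, run a pretrained MDE network on all three inputs, and blend the initial depths with an edge-aware weight as a naive stand-in for the LFM/DCM modules. `mde` is a placeholder for any monocular depth estimator that returns an H×W depth map.

```python
# Simplified stand-in for the ECFNet flow; the real LFM/DCM are learned.
import numpy as np
import cv2

def ecf_like_pipeline(img_bgr, mde, alpha=0.5):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)               # hybrid edge step, simplified
    highlighted = img_bgr.copy()
    highlighted[edges > 0] = (0, 0, 255)            # edge-highlighted image
    d_rgb = mde(img_bgr)                            # initial depths of the
    d_edge = mde(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))   # three inputs
    d_high = mde(highlighted)
    # Naive fusion: trust edge-aware depths near edges, RGB depth elsewhere.
    w = cv2.GaussianBlur(edges.astype(np.float32) / 255.0, (9, 9), 0)
    return (1 - w) * d_rgb + w * (alpha * d_edge + (1 - alpha) * d_high)
```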
From Tissue Plane to Organ World: A Benchmark Dataset for Multimodal Biomedical Image Registration using Deep Co-Attention Networks
Correlating neuropathology with neuroimaging findings provides a multiscale
view of pathologic changes in the human organ spanning the meso- to
micro-scales, and is an emerging methodology expected to shed light on numerous
disease states. To gain the most information from this multimodal, multiscale
approach, it is desirable to identify precisely where a histologic tissue
section was taken from within the organ in order to correlate with the tissue
features in exactly the same organ region. Histology-to-organ registration
poses an extra challenge, as any given histologic section can capture only a
small portion of a human organ. The capabilities of state-of-the-art deep
learning models make it possible to address such intricate challenges. We
therefore create the ATOM benchmark dataset, sourced from diverse
institutions, with the primary objective of casting this challenge as a
machine learning problem and delivering results that benefit the biomedical
community. The performance
of our RegisMCAN model demonstrates the potential of deep learning to
accurately predict where a subregion extracted from an organ image was obtained
from within the overall 3D volume. The code and dataset can be found at:
https://github.com/haizailache999/Image-Registration/tree/mai
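To make the co-attention idea concrete, here is a minimal sketch, assuming tokenized features: tokens from the 2D section query tokens from the 3D volume, and a small head regresses the subregion's location. This is our illustration, not the RegisMCAN implementation.

```python
# Sketch of cross-attention between section tokens and volume tokens,
# regressing a 3D location; names and dimensions are assumptions.
import torch
import torch.nn as nn

class CoAttentionRegressor(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 3)        # (x, y, z) of the subregion

    def forward(self, section_tokens, volume_tokens):
        # Section tokens attend to volume tokens.
        attended, _ = self.cross(section_tokens, volume_tokens, volume_tokens)
        return self.head(attended.mean(dim=1))

model = CoAttentionRegressor()
pred = model(torch.randn(2, 64, 128), torch.randn(2, 512, 128))  # (2, 3)
```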
