
    Attention mechanism in deep neural networks for computer vision tasks

    “The attention mechanism, one of the most important algorithms in the deep learning community, was initially designed in natural language processing to enhance the feature representation of key sentence fragments within their context. In recent years, the attention mechanism has been widely adopted for computer vision tasks, guiding deep neural networks (DNNs) to focus on specific image features and thereby better understand the semantic information of an image. However, the attention mechanism is not only capable of helping DNNs understand semantics; it is also useful for feature fusion, visual cue discovery, and temporal information selection, applications that are seldom researched. In this study, we take the classic attention mechanism a step further by proposing the Semantic Attention Guidance Unit (SAGU), which performs multi-level feature fusion to tackle the challenging biomedical image segmentation task. Furthermore, we propose a novel framework that consists of (1) the Semantic Attention Unit (SAU), an advanced version of SAGU that adaptively brings high-level semantics to mid-level features; (2) the Two-level Spatial Attention Module (TSPAM), which discovers multiple visual cues within the image; and (3) the Temporal Attention Module (TAM), which selects temporal information to solve the video-based person re-identification task. To validate the newly proposed attention mechanisms, extensive experiments are conducted on challenging datasets. Our methods obtain competitive performance and outperform state-of-the-art methods. Selected publications are also presented in the appendix.”--Abstract, page iii
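    The abstract does not specify SAGU's internals; a minimal PyTorch sketch of the general pattern it describes (gating mid-level features with high-level semantics, all dimensions illustrative) might look like:

        # Sketch of attention-guided multi-level feature fusion: a high-level
        # semantic map produces a channel gate for mid-level features
        # (an assumed design, not the paper's actual SAGU architecture).
        import torch
        import torch.nn as nn

        class SemanticAttentionFusion(nn.Module):
            def __init__(self, mid_ch, high_ch):
                super().__init__()
                # pool high-level semantics into a per-channel gate
                self.gate = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(high_ch, mid_ch, kernel_size=1),
                    nn.Sigmoid(),
                )

            def forward(self, mid_feat, high_feat):
                attn = self.gate(high_feat)        # (N, mid_ch, 1, 1) gate
                return mid_feat + mid_feat * attn  # residual attention fusion

        # usage: fuse = SemanticAttentionFusion(mid_ch=256, high_ch=512)
        # out = fuse(torch.randn(1, 256, 64, 64), torch.randn(1, 512, 16, 16))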

    A Gaussian mixture model for automated vesicle fusion detection and classification

    Accurately detecting and classifying vesicle-plasma membrane fusion events in fluorescence microscopy is of primary interest for studying biological activities in close proximity to the plasma membrane. In this paper, we present a novel Gaussian mixture model for the automated identification of vesicle-plasma membrane fusion and partial fusion events in total internal reflection fluorescence microscopy image sequences. Image patches of fusion event candidates are detected in individual images and linked over consecutive frames. A Gaussian mixture model is fit to each image patch of the patch sequence, with outliers rejected for robust Gaussian fitting. The estimated parameters of the Gaussian functions over time are concatenated into feature vectors for classifier training. Applied to three challenging datasets, our method achieved competitive results in detecting and classifying fusion events compared with two state-of-the-art methods. --Abstract, page iii
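    A minimal sketch of the robust per-patch fitting step, assuming a single 2D Gaussian per patch and a simple iterative residual-based rejection rule (the paper's exact fitting procedure and thresholds are not given in the abstract):

        # Robust 2D Gaussian fit on an image patch with iterative outlier
        # rejection (the 2.5-sigma cutoff and iteration count are assumptions).
        import numpy as np
        from scipy.optimize import curve_fit

        def gauss2d(coords, amp, x0, y0, sx, sy, offset):
            x, y = coords
            return amp * np.exp(-((x - x0) ** 2 / (2 * sx ** 2)
                                  + (y - y0) ** 2 / (2 * sy ** 2))) + offset

        def fit_patch(patch, n_iter=3):
            h, w = patch.shape
            y, x = np.mgrid[0:h, 0:w]
            xf, yf, zf = x.ravel(), y.ravel(), patch.ravel().astype(float)
            keep = np.ones_like(zf, dtype=bool)
            p0 = [zf.max() - zf.min(), w / 2, h / 2, w / 4, h / 4, zf.min()]
            for _ in range(n_iter):
                popt, _ = curve_fit(gauss2d, (xf[keep], yf[keep]), zf[keep], p0=p0)
                resid = zf - gauss2d((xf, yf), *popt)
                keep = np.abs(resid) < 2.5 * resid[keep].std()  # reject outliers
                p0 = popt
            return popt  # (amp, x0, y0, sx, sy, offset) for one frame

    Per-frame parameter vectors like this one can then be concatenated across the linked patch sequence to form the classifier's feature vector, as the abstract describes.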

    A Deep Learning Framework for Automated Vesicle Fusion Detection

    Quantitative analysis of vesicle-plasma membrane fusion events in fluorescence microscopy has been proven to be important in the study of vesicle exocytosis. In this paper, we present a framework to automatically detect fusion events. First, an iterative searching algorithm is developed to extract image patch sequences containing potential events. Then, we propose an event image that integrates the critical image patches of a candidate event into a single-image joint representation as the input to convolutional neural networks (CNNs). According to the duration of candidate events, we design three CNN architectures to automatically learn features for fusion event classification. Evaluated on nine challenging datasets, our proposed method showed very competitive performance and outperformed two state-of-the-art methods.
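    The tiling layout and patch count of the event image are not specified in the abstract; one plausible sketch, assuming a fixed-width horizontal montage of equally sized patches:

        # Tile the key patches of one candidate event into a single joint
        # image for CNN input (fixed 1-row montage is an assumption).
        import numpy as np

        def make_event_image(patches, n_cols=8):
            # patches: list of equally sized 2D arrays from consecutive frames
            ph, pw = patches[0].shape
            # pad or truncate to a fixed count so the CNN input size is fixed
            patches = (list(patches) + [np.zeros((ph, pw))] * n_cols)[:n_cols]
            return np.hstack(patches)  # shape (ph, pw * n_cols)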

    Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization

    Domain Generalization (DG), a crucial research area, seeks to train models across multiple domains and test them on unseen ones. In this paper, we introduce a novel approach, Selective Cross-Modality Distillation for Domain Generalization (SCMD). SCMD leverages the capabilities of large vision-language models, specifically CLIP, to train a more efficient model, ensuring that it acquires robust generalization capabilities across unseen domains. Our primary contribution is a unique selection framework strategically designed to identify hard-to-learn samples for distillation. In parallel, we introduce a novel cross-modality module that seamlessly combines the projected features of the student model with the text embeddings from CLIP, ensuring the alignment of similarity distributions. We assess SCMD's performance on various benchmarks, where it empowers a ResNet50 to deliver state-of-the-art performance, surpassing existing domain generalization methods. Furthermore, we provide a theoretical analysis of our selection strategy, offering deeper insight into its effectiveness and potential in the field of DG.
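    The abstract does not give SCMD's exact loss; a minimal sketch of the general idea it names (aligning the student's feature/text-embedding similarity distribution with CLIP's, temperature and projection dimension assumed) could be:

        # Cross-modality distillation sketch: KL divergence between the
        # student's and CLIP's image-text similarity distributions
        # (an illustrative loss, not the paper's verified formulation).
        import torch
        import torch.nn.functional as F

        def cross_modality_kd_loss(student_feat, clip_img_feat, clip_text_emb, tau=2.0):
            # student_feat:  (N, D) projected student features
            # clip_img_feat: (N, D) CLIP image features (teacher)
            # clip_text_emb: (C, D) CLIP text embeddings, one per class
            text = F.normalize(clip_text_emb, dim=-1)
            s = F.normalize(student_feat, dim=-1) @ text.T
            t = F.normalize(clip_img_feat, dim=-1) @ text.T
            # KL between temperature-softened similarity distributions
            return F.kl_div(F.log_softmax(s / tau, dim=-1),
                            F.softmax(t / tau, dim=-1),
                            reduction="batchmean") * tau * tau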

    Scheduling Mixed-Criticality Real-Time Systems

    This dissertation addresses the following question in the design of scheduling policies and resource allocation mechanisms for contemporary embedded systems implemented on integrated computing platforms: in a multitasking system where it is hard to estimate a task's worst-case execution time, how do we assign task priorities so that (1) the safety-critical tasks are guaranteed to complete within a specified length of time, and (2) the non-critical tasks are also guaranteed to complete within a predictable length of time if no task actually consumes its worst-case execution time? This dissertation answers the question with the mixed-criticality real-time system model, which defines multiple worst-case execution scenarios and demands a scheduling policy that provides provable timing guarantees to tasks at each criticality level with respect to each type of scenario. Two scheduling algorithms are proposed to serve this model. The OCBP algorithm is aimed at discrete one-shot tasks with an arbitrary number of criticality levels. The EDF-VD algorithm is aimed at recurrent tasks with two criticality levels (safety-critical and non-critical). Both algorithms are proved to optimally minimize the percentage of computational resource waste within two criticality levels. More in-depth investigations into the relationship among the computational resource requirements of different criticality levels are also provided for both algorithms.
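    For concreteness, a sketch of the standard EDF-VD schedulability test and virtual-deadline scaling for implicit-deadline tasks at two criticality levels (this follows the commonly published EDF-VD analysis; the dissertation's exact notation and proofs may differ):

        # EDF-VD: compute the deadline-scaling factor x for HI tasks, or
        # report that this sufficient test fails.
        def edf_vd_scaling(tasks):
            # tasks: list of dicts {"T": period, "C_LO": ..., "C_HI": ..., "HI": bool}
            u_lo_lo = sum(t["C_LO"] / t["T"] for t in tasks if not t["HI"])
            u_hi_lo = sum(t["C_LO"] / t["T"] for t in tasks if t["HI"])
            u_hi_hi = sum(t["C_HI"] / t["T"] for t in tasks if t["HI"])
            if u_lo_lo + u_hi_hi <= 1:
                return 1.0  # plain EDF with true deadlines already suffices
            if u_lo_lo >= 1:
                return None  # LO tasks alone overload the processor
            x = u_hi_lo / (1 - u_lo_lo)  # smallest feasible scaling factor
            if x * u_lo_lo + u_hi_hi <= 1:
                return x  # HI tasks get virtual deadlines x * T_i in LO mode
            return None   # task set not schedulable by this EDF-VD test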

    Evaluation of breakup models for marine diesel spray simulations


    The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion

    This paper presents a novel monocular depth estimation method, named ECFNet, for estimating high-quality monocular depth with clear edges and a valid overall structure from a single RGB image. We thoroughly investigate the key factor that affects edge depth estimation in MDE networks and conclude that the edge information itself plays a critical role in predicting depth details. Driven by this analysis, we propose to explicitly employ image edges as input to ECFNet and to fuse the initial depths from different sources to produce the final depth. Specifically, ECFNet first uses a hybrid edge detection strategy to obtain an edge map and an edge-highlighted image from the input image, and then leverages a pre-trained MDE network to infer the initial depths of the aforementioned three images. After that, ECFNet utilizes a layered fusion module (LFM) to fuse the initial depths, which are further refined by a depth consistency module (DCM) to form the final estimation. Extensive experimental results on public datasets and ablation studies indicate that our method achieves state-of-the-art performance. Project page: https://zrealli.github.io/edgedepth
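    A minimal sketch of the input-preparation step described above, assuming Canny as a stand-in for the paper's unspecified hybrid edge detector and an arbitrary blend weight:

        # Derive the edge map and edge-highlighted image for one RGB frame;
        # together with the original frame these are the three inputs whose
        # initial depths are later fused (detector choice is an assumption).
        import cv2

        def prepare_inputs(rgb, low=100, high=200, alpha=0.5):
            gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
            edges = cv2.Canny(gray, low, high)                  # binary edge map
            edges_rgb = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
            highlighted = cv2.addWeighted(rgb, 1.0, edges_rgb, alpha, 0)
            return rgb, edges_rgb, highlighted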

    From Tissue Plane to Organ World: A Benchmark Dataset for Multimodal Biomedical Image Registration using Deep Co-Attention Networks

    Correlating neuropathology with neuroimaging findings provides a multiscale view of pathologic changes in the human organ, spanning the meso- to micro-scale, and is an emerging methodology expected to shed light on numerous disease states. To gain the most information from this multimodal, multiscale approach, it is desirable to identify precisely where a histologic tissue section was taken from within the organ, so that it can be correlated with tissue features in exactly the same organ region. Histology-to-organ registration poses an extra challenge, as any given histologic section can capture only a small portion of a human organ. Making use of the capabilities of state-of-the-art deep learning models, we unlock the potential to address such intricate challenges. We therefore create the ATOM benchmark dataset, sourced from diverse institutions, with the primary objective of transforming this challenge into a machine learning problem and delivering outstanding outcomes that enlighten the biomedical community. The performance of our RegisMCAN model demonstrates the potential of deep learning to accurately predict where a subregion extracted from an organ image was obtained within the overall 3D volume. The code and dataset can be found at: https://github.com/haizailache999/Image-Registration/tree/mai
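    RegisMCAN's architecture is not given in the abstract; a minimal co-attention sketch in the spirit of the task (2D section features attending over candidate 3D subregion features to score a match, all dimensions illustrative) might be:

        # Co-attention scorer: section tokens query volume tokens and the
        # fused representation is reduced to a match score per candidate
        # (an assumed design, not the paper's verified model).
        import torch
        import torch.nn as nn

        class CoAttentionScorer(nn.Module):
            def __init__(self, dim=256, heads=4):
                super().__init__()
                self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.score = nn.Linear(dim, 1)

            def forward(self, section_tokens, volume_tokens):
                # section_tokens: (N, S, dim) from the 2D histology section
                # volume_tokens:  (N, V, dim) from a candidate 3D subregion
                fused, _ = self.attn(section_tokens, volume_tokens, volume_tokens)
                return self.score(fused.mean(dim=1))  # (N, 1) match score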