
    Blood vessel enhancement via multi-dictionary and sparse coding: Application to retinal vessel enhancing

    Blood vessel images provide considerable information about many diseases and are widely used by ophthalmologists for disease diagnosis and surgical planning. In this paper, we propose a novel method for blood Vessel Enhancement via Multi-dictionary and Sparse Coding (VE-MSC). The proposed method uses two dictionaries to capture vascular structures and details: the Representation Dictionary (RD), generated from the original vascular images, and the Enhancement Dictionary (ED), extracted from the corresponding label images. Sparse coding is used to represent the original target vessel image over RD. The enhanced target vessel image is then reconstructed from the resulting sparse coefficients and ED. The proposed method has been evaluated for retinal vessel enhancement on the DRIVE and STARE databases. Experimental results indicate that it not only effectively improves image contrast but also enhances retinal vascular structures and details.
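
    Read as an algorithm, the pipeline in the abstract is: sparse-code each image patch over RD, then multiply the codes into ED to reconstruct an enhanced patch. Below is a minimal sketch of that idea using scikit-learn's SparseCoder; the random dictionaries, patch size, and sparsity level are illustrative placeholders, not values from the paper.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Toy shapes: dictionaries hold 8x8 patches as 64-dim atoms.
# rd: Representation Dictionary (from original vessel patches).
# ed: Enhancement Dictionary (from label-image patches).
# Both need the same number of atoms so codes transfer one-to-one.
n_atoms, patch_dim = 256, 64
rng = np.random.default_rng(0)
rd = rng.standard_normal((n_atoms, patch_dim))
rd /= np.linalg.norm(rd, axis=1, keepdims=True)  # unit-norm atoms for OMP
ed = rng.standard_normal((n_atoms, patch_dim))

def enhance_patches(patches, rd, ed, n_nonzero=5):
    """Sparse-code patches over RD, then reconstruct with ED."""
    coder = SparseCoder(dictionary=rd,
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=n_nonzero)
    codes = coder.transform(patches)  # (n_patches, n_atoms)
    return codes @ ed                 # enhanced patches

patches = rng.standard_normal((10, patch_dim))  # stand-in vessel patches
print(enhance_patches(patches, rd, ed).shape)   # (10, 64)
```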

    A bioinspired flexible optical sensor for force and orientation sensing

    Flexible optical sensors have become an emerging paradigm for applications in robotics, healthcare, and human-machine interfaces due to their high sensitivity, fast response, and immunity to electromagnetic interference. Recently, Marques reported a bioinspired multifunctional flexible optical sensor (BioMFOS) that achieves a force sensitivity of 13.28 μN and a spatial resolution of 0.02 mm. The BioMFOS is small (around 2 cm) and lightweight (0.8 g), making it suitable for wearable applications and clothing integration. As proof-of-concept demonstrations, monitoring of finger position, trunk movement, and respiration rate is realized, implying prominent applications in remote healthcare, intelligent robots, assistive-device teleoperation, and human-machine interfaces.

    CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection

    Relation modeling between actors and scene context advances video action detection, where the correlation among multiple actors makes action recognition challenging. Existing studies model the relation between each actor and the scene to improve action recognition. However, scene variations and background interference limit the effectiveness of this relation modeling. In this paper, we propose to select actor-related scene context, rather than directly leveraging the raw video scene, to improve relation modeling. We develop a Cycle Actor-Context Relation network (CycleACR), in which a symmetric graph models actor and context relations in a bidirectional form. CycleACR consists of the Actor-to-Context Reorganization (A2C-R) module, which collects actor features for context feature reorganization, and the Context-to-Actor Enhancement (C2A-E) module, which dynamically utilizes the reorganized context features for actor feature enhancement. Compared to existing designs that focus on C2A-E, our CycleACR introduces A2C-R for more effective relation modeling. This modeling advances CycleACR to state-of-the-art performance on two popular action detection datasets (i.e., AVA and UCF101-24). We also provide ablation studies and visualizations to show how our cycle actor-context relation modeling improves video action detection. Code is available at https://github.com/MCG-NJU/CycleACR. Comment: technical report.
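
    A minimal PyTorch sketch of how such a bidirectional cycle could look, with cross-attention standing in for the graph operations; the module layout, dimensions, and residual connection are assumptions drawn from the abstract, not the released code.

```python
import torch
import torch.nn as nn

class CycleACRSketch(nn.Module):
    """Toy bidirectional actor-context relation block.

    A2C-R: actor queries collect and reorganize scene context.
    C2A-E: actors are then enhanced from the reorganized context.
    (An illustrative reading of the abstract, not the official model.)
    """
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.a2c = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.c2a = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, actors, context):
        # actors: (B, N_actors, dim); context: (B, N_ctx, dim)
        # A2C-R: reorganize context around actor-relevant content.
        ctx_reorg, _ = self.a2c(actors, context, context)
        # C2A-E: enhance each actor from the reorganized context.
        enhanced, _ = self.c2a(actors, ctx_reorg, ctx_reorg)
        return actors + enhanced  # residual actor enhancement

block = CycleACRSketch()
actors, context = torch.randn(2, 5, 256), torch.randn(2, 49, 256)
print(block(actors, context).shape)  # torch.Size([2, 5, 256])
```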

    Memory-and-Anticipation Transformer for Online Action Understanding

    Most existing forecasting systems are memory-based methods, which attempt to mimic human forecasting ability by employing various memory mechanisms and have progressed in temporal modeling of memory dependency. Nevertheless, an obvious weakness of this paradigm is that it can only model limited historical dependence and cannot transcend the past. In this paper, we rethink the temporal dependence of event evolution and propose a novel memory-anticipation-based paradigm that models the entire temporal structure, including the past, present, and future. Based on this idea, we present the Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach, to address online action detection and anticipation tasks. Owing to this inherent design, MAT can process online action detection and anticipation in a unified manner. The proposed MAT model is tested on four challenging benchmarks, TVSeries, THUMOS'14, HDD, and EPIC-Kitchens-100, for online action detection and anticipation tasks, and it significantly outperforms all existing methods. Code is available at https://github.com/Echo0125/Memory-and-Anticipation-Transformer. Comment: ICCV 2023 camera-ready.
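
    One way to picture the memory-anticipation layout: past-frame features (memory) and learnable future tokens pass through one transformer, so a single temporal model emits both a present-frame detection and future anticipations. The sketch below is only an illustrative reading of the abstract; the token layout, sizes, and heads are assumptions, not MAT's actual architecture.

```python
import torch
import torch.nn as nn

class MATSketch(nn.Module):
    """Toy memory-and-anticipation layout: past features plus learnable
    future tokens share one transformer, unifying online detection and
    anticipation. (Illustrative assumptions, not the paper's values.)
    """
    def __init__(self, dim=256, n_future=8, n_classes=31):
        super().__init__()
        self.future = nn.Parameter(torch.zeros(1, n_future, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, memory):
        # memory: (B, T_past, dim) features of past/present frames
        b, t = memory.size(0), memory.size(1)
        tokens = torch.cat([memory, self.future.expand(b, -1, -1)], dim=1)
        out = self.encoder(tokens)
        det = self.head(out[:, t - 1])  # present frame -> online detection
        ant = self.head(out[:, t:])     # future tokens -> anticipation
        return det, ant

det, ant = MATSketch()(torch.randn(2, 64, 256))
print(det.shape, ant.shape)  # torch.Size([2, 31]) torch.Size([2, 8, 31])
```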

    BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection

    Temporal action detection (TAD) is extensively studied in the video understanding community, generally following the object detection pipeline for images. However, complex designs are not uncommon in TAD, such as two-stream feature extraction, multi-stage training, complex temporal modeling, and global context fusion. In this paper, we do not aim to introduce any novel technique for TAD. Instead, we study a simple, straightforward, yet must-know baseline, given the current state of complex design and low detection efficiency in TAD. In our simple baseline (termed BasicTAD), we decompose the TAD pipeline into several essential components: data sampling, backbone design, neck construction, and detection head. We extensively investigate existing techniques for each component of this baseline and, more importantly, perform end-to-end training over the entire pipeline thanks to its simple design. As a result, BasicTAD yields an astounding real-time, RGB-only baseline that comes very close to state-of-the-art methods with two-stream inputs. We further improve BasicTAD by preserving more temporal and spatial information in the network representation (termed PlusTAD). Empirical results demonstrate that PlusTAD is very efficient and significantly outperforms previous methods on the THUMOS14 and FineAction datasets. We also perform in-depth visualization and error analysis of our method to provide more insight into the TAD problem. Our approach can serve as a strong baseline for future TAD research. The code and model will be released at https://github.com/MCG-NJU/BasicTAD. Comment: Accepted by CVIU.
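
    The four-component decomposition is concrete enough to sketch. The toy PyTorch pipeline below mirrors the sampling -> backbone -> neck -> detection head split with deliberately small placeholder modules; it illustrates the decomposition only and is not the released BasicTAD code.

```python
import torch
import torch.nn as nn

class BasicTADSketch(nn.Module):
    """Toy end-to-end TAD pipeline in the four parts the abstract names.
    Every module here is a stand-in assumption, chosen for brevity.
    """
    def __init__(self, dim=64, n_classes=20):
        super().__init__()
        # Backbone: per-frame feature extractor (stand-in for a video CNN).
        self.backbone = nn.Conv3d(3, dim, kernel_size=(1, 7, 7), stride=(1, 4, 4))
        # Neck: pool space away, keep the temporal axis.
        self.neck = nn.AdaptiveAvgPool3d((None, 1, 1))
        # Detection head: per-timestep class scores + start/end offsets.
        self.cls = nn.Conv1d(dim, n_classes, 3, padding=1)
        self.reg = nn.Conv1d(dim, 2, 3, padding=1)

    def forward(self, clip):
        # clip: (B, 3, T, H, W) RGB frames; the "data sampling" step would
        # choose these T frames from the untrimmed input video.
        feat = self.neck(self.backbone(clip)).flatten(2)  # (B, dim, T)
        return self.cls(feat), self.reg(feat)

scores, offsets = BasicTADSketch()(torch.randn(2, 3, 16, 96, 96))
print(scores.shape, offsets.shape)  # (2, 20, 16) and (2, 2, 16)
```

    Because the whole pipeline is a single differentiable module, end-to-end training is straightforward, which is the property the abstract credits for BasicTAD's simplicity.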