    An improved classification approach for echocardiograms embedding temporal information

    Cardiovascular disease is an umbrella term for all diseases of the heart. Computer-aided echocardiogram diagnosis is becoming increasingly beneficial. In echocardiography, different cardiac views can be acquired depending on the location and angulation of the ultrasound transducer. Hence, automatic echocardiogram view classification is the first step in echocardiogram diagnosis, especially for computer-aided systems and, in the future, for fully automatic diagnosis. In addition, view classification makes it possible to label images, especially in large-scale echo video collections, which facilitates database management and collection. This thesis presents a framework for automatic cardiac viewpoint classification of echocardiogram video data. In this research, we aim to overcome the challenges of analyzing, recognizing and classifying echocardiogram videos in 3D (2D spatial and 1D temporal) space. Specifically, we extend the 2D KAZE approach into 3D space for feature detection and propose a histogram of acceleration as the feature descriptor. Feature encoding follows, after which an SVM is applied to classify the echo videos. We also compare against state-of-the-art methodologies, including 2D SIFT, 3D SIFT, and an optical flow technique for extracting the temporal information contained in the video images. For the eight view classes of echo videos, 2D KAZE, 2D KAZE with Optical Flow, 3D KAZE, Optical Flow, 2D SIFT and 3D SIFT deliver accuracy rates of 89.4%, 84.3%, 87.9%, 79.4%, 83.8% and 73.8%, respectively.

    A Survey on Global LiDAR Localization

    Knowledge of its own pose is key for any mobile robot application, so pose estimation is a core functionality of mobile robots. In the last two decades, LiDAR scanners have become a standard sensor for robot localization and mapping. This article surveys recent progress and advances in LiDAR-based global localization. We start with the problem formulation and explore the application scope. We then present a methodology review covering various global localization topics, such as maps, descriptor extraction, and consistency checks. The contents are organized under three themes. The first is the combination of global place retrieval and local pose estimation. The second is upgrading single-shot measurements to sequential ones for sequential global localization. The third is extending single-robot global localization to cross-robot localization on multi-robot systems. We end this survey with a discussion of open challenges and promising directions in global LiDAR localization.

    Gaze Guidance, Task-Based Eye Movement Prediction, and Real-World Task Inference using Eye Tracking

    The ability to predict and guide viewer attention has important applications in computer graphics, image understanding, object detection, visual search and training. Human eye movements provide insight into the cognitive processes involved in task performance, and there has been extensive research on what factors guide viewer attention in a scene. It has been shown, for example, that saliency in the image, scene context, and the task at hand play significant roles in guiding attention. This dissertation presents and discusses research on visual attention, with specific focus on the use of subtle visual cues to guide viewer gaze and the development of algorithms to predict the distribution of gaze about a scene. Specific contributions of this work include a framework for gaze guidance to enable problem solving and spatial learning, a novel algorithm for task-based eye movement prediction, and a system for real-world task inference using eye tracking. A gaze guidance approach is presented that combines eye tracking with subtle image-space modulations to guide viewer gaze about a scene. Several experiments were conducted using this approach to examine its impact on short-term spatial information recall, task sequencing, training, and password recollection. A model of human visual attention prediction that uses saliency maps, scene feature maps and task-based eye movements to predict regions of interest was also developed. This model was used to automatically select target regions for active gaze guidance to improve search task performance. Finally, we develop a framework for inferring real-world tasks using image features and eye movement data. Overall, this dissertation leads naturally to an overarching framework that combines all three contributions into a continuous feedback system for improving performance on repeated visual search tasks. This research has important applications in data visualization, problem solving, training, and online education.

    Weakly Supervised Caveline Detection For AUV Navigation Inside Underwater Caves

    Underwater caves are challenging environments that are crucial for water resource management and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data with a weakly supervised formulation in which learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices.