Small Lesions Evaluation Based on Unsupervised Cluster Analysis of Signal-Intensity Time Courses in Dynamic Breast MRI
An application of an unsupervised neural network-based computer-aided diagnosis (CAD) system is reported for the detection
and characterization of small indeterminate breast lesions, average size 1.1 mm, in dynamic contrast-enhanced MRI. This system
enables the extraction of spatial and temporal features of dynamic MRI data and additionally provides a segmentation with regard
to identification and regional subclassification of pathological breast tissue lesions. Lesions with an initial contrast enhancement
≥50% were selected with semiautomatic segmentation. This conventional segmentation analysis is based on the mean initial signal
increase and postinitial course of all voxels included in the lesion. In this paper, we compare the conventional segmentation analysis
with unsupervised classification for the evaluation of signal intensity time courses for the differential diagnosis of enhancing
lesions in breast MRI. The results suggest that the computerized analysis system based on unsupervised clustering has the potential to
increase the diagnostic accuracy of MRI mammography for small lesions and can be used as a basis for computer-aided diagnosis
of breast cancer with MR mammography.
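To illustrate the clustering idea behind this kind of analysis, the sketch below groups per-voxel signal-intensity time courses with k-means on synthetic data. It is a simplified stand-in for the paper's unsupervised neural-network clustering; the array shapes, number of clusters, and handling of the enhancement threshold are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: cluster per-voxel signal-intensity time courses with k-means.
# k-means is a stand-in for the paper's unsupervised neural-network clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic dynamic MRI data: (n_voxels, n_timepoints) signal intensities.
n_voxels, n_timepoints = 2000, 6
time_courses = rng.normal(loc=100.0, scale=5.0, size=(n_voxels, n_timepoints))
time_courses[:, 1:] += rng.uniform(0, 120, size=(n_voxels, 1))  # varied enhancement

# Keep only voxels whose initial enhancement is >= 50% of baseline,
# mirroring the lesion pre-selection described in the abstract.
baseline = time_courses[:, 0]
initial_enhancement = (time_courses[:, 1] - baseline) / baseline
lesion_voxels = time_courses[initial_enhancement >= 0.5]

# Cluster the normalized time courses into prototypical enhancement patterns.
normalized = lesion_voxels / lesion_voxels[:, :1]
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(normalized)

# Each cluster centroid is a prototypical signal-intensity time course that can
# be inspected, e.g. for washout vs. persistent enhancement.
print(kmeans.cluster_centers_.shape)  # (4, n_timepoints)
```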
FlyNet 2.0: Drosophila heart 3D (2D + time) segmentation in optical coherence microscopy images using a convolutional long short-term memory neural network
A custom convolutional neural network (CNN) integrated with convolutional long short-term memory (LSTM) achieves accurate 3D (2D + time) segmentation in cross-sectional videos of the Drosophila heart acquired by an optical coherence microscopy (OCM) system. While our previous FlyNet 1.0 model utilized regular CNNs to extract 2D spatial information from individual video frames, FlyNet 2.0 incorporates convolutional LSTM to utilize both spatial and temporal information and further improve segmentation performance. To train and test FlyNet 2.0, we used 100 datasets including 500,000 fly heart OCM images. OCM videos in three developmental stages and two heartbeat situations were segmented, achieving an intersection-over-union (IOU) accuracy of 92%. This increased segmentation accuracy allows morphological and dynamic cardiac parameters to be better quantified.
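A minimal Keras sketch of the general ConvLSTM-plus-CNN idea is given below. It is not the FlyNet 2.0 architecture; the input shape, layer widths, and training setup are assumptions chosen only to show how spatial and temporal information can be combined for per-frame segmentation.

```python
# Minimal sketch of a ConvLSTM-based video segmentation model in Keras.
# Layer sizes and the input shape are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_convlstm_segmenter(frames=8, height=128, width=128, channels=1):
    inputs = layers.Input(shape=(frames, height, width, channels))
    # Spatio-temporal feature extraction across the frame sequence.
    x = layers.ConvLSTM2D(16, (3, 3), padding="same", return_sequences=True)(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    # Per-frame binary mask: heart vs. background for every frame in the clip.
    outputs = layers.TimeDistributed(
        layers.Conv2D(1, (1, 1), activation="sigmoid"))(x)
    return models.Model(inputs, outputs)

model = build_convlstm_segmenter()
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```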
Hierarchical Attention Network for Action Segmentation
The temporal segmentation of events is an essential task and a precursor for
the automatic recognition of human actions in video. Several attempts have
been made to capture frame-level salient aspects through attention, but they
lack the capacity to effectively map the temporal relationships between
frames, as they capture only a limited span of temporal dependencies. To this
end, we propose a complete end-to-end supervised learning approach that can
better learn relationships between actions over time, thus improving the
overall segmentation performance. The proposed hierarchical recurrent attention
framework analyses the input video at multiple temporal scales to form
embeddings at the frame and segment levels and to perform fine-grained action
segmentation. This generates a simple, lightweight, yet extremely effective
architecture for segmenting continuous video streams and has multiple
application domains. We evaluate our system on multiple challenging public
benchmark datasets, including the MERL Shopping, 50 Salads, and Georgia Tech
Egocentric datasets, and achieve state-of-the-art performance. The evaluated
datasets encompass numerous video capture settings, including static overhead
camera views and dynamic, egocentric head-mounted camera views, demonstrating
the direct applicability of the proposed framework in a variety of settings.
Comment: Published in Pattern Recognition Letters
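The sketch below illustrates a generic two-level (frame- and segment-level) recurrent attention model for per-frame action labeling in PyTorch. It is loosely inspired by the hierarchical design described above, not a reproduction of it; the layer sizes, the fixed chunk length, and the attention-pooling scheme are assumptions.

```python
# Minimal PyTorch sketch: frame-level RNN + attention pooling into segment
# embeddings + segment-level RNN, then per-frame action classification.
import torch
import torch.nn as nn

class TwoLevelActionSegmenter(nn.Module):
    def __init__(self, feat_dim=64, hidden=64, n_classes=10, chunk=16):
        super().__init__()
        self.chunk = chunk
        self.frame_rnn = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)          # attention scores within a chunk
        self.segment_rnn = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden + hidden, n_classes)

    def forward(self, frames):                        # frames: (B, T, feat_dim), T divisible by chunk
        B, T, _ = frames.shape
        frame_emb, _ = self.frame_rnn(frames)         # (B, T, 2*hidden)
        # Attention-pool frame embeddings inside fixed-length chunks -> segment embeddings.
        chunks = frame_emb.reshape(B, T // self.chunk, self.chunk, -1)
        weights = torch.softmax(self.attn(chunks), dim=2)
        seg_emb = (weights * chunks).sum(dim=2)       # (B, T/chunk, 2*hidden)
        seg_ctx, _ = self.segment_rnn(seg_emb)        # (B, T/chunk, hidden)
        # Broadcast segment context back to its frames and classify each frame.
        seg_ctx = seg_ctx.repeat_interleave(self.chunk, dim=1)
        return self.classifier(torch.cat([frame_emb, seg_ctx], dim=-1))

x = torch.randn(2, 64, 64)                            # 2 clips, 64 frames, 64-d features
logits = TwoLevelActionSegmenter()(x)                  # (2, 64, 10) per-frame class scores
print(logits.shape)
```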
Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification
Dynamic Textures (DTs) are sequences of images of moving scenes that exhibit
certain stationarity properties in time such as smoke, vegetation and fire. The
analysis of DT is important for recognition, segmentation, synthesis or
retrieval for a range of applications including surveillance, medical imaging
and remote sensing. Deep learning methods have shown impressive results and are
now the new state of the art for a wide range of computer vision tasks
including image and video recognition and segmentation. In particular,
Convolutional Neural Networks (CNNs) have recently proven to be well suited for
texture analysis with a design similar to a filter bank approach. In this
paper, we develop a new approach to DT analysis based on a CNN method applied
on three orthogonal planes xy, xt and yt. We train CNNs on spatial frames
and temporal slices extracted from the DT sequences and combine their outputs
to obtain a competitive DT classifier. Our results on a wide range of commonly
used DT classification benchmark datasets prove the robustness of our approach.
Significant improvement of the state of the art is shown on the larger
datasets.
Comment: 19 pages, 10 figures
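The core idea of slicing a dynamic-texture volume into three orthogonal planes can be sketched as follows. The volume shape and slice-sampling strategy are illustrative assumptions; each plane's slices would then be fed to its own 2D CNN before fusing the three predictions.

```python
# Minimal sketch: extracting the three orthogonal planes (xy, xt, yt) from a
# dynamic-texture volume. Each set of 2D slices would be classified by a
# plane-specific CNN, and the plane-wise predictions combined (e.g. averaged).
import numpy as np

rng = np.random.default_rng(0)
video = rng.random((32, 64, 64))            # dynamic-texture volume: (t, y, x)

xy_slices = [video[t, :, :] for t in range(video.shape[0])]   # spatial frames
xt_slices = [video[:, y, :] for y in range(video.shape[1])]   # temporal slices along rows
yt_slices = [video[:, :, x] for x in range(video.shape[2])]   # temporal slices along columns

print(len(xy_slices), xy_slices[0].shape)   # 32 (64, 64)
print(len(xt_slices), xt_slices[0].shape)   # 64 (32, 64)
print(len(yt_slices), yt_slices[0].shape)   # 64 (32, 64)
```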
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics have
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.
Comment: Accepted to the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2020
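As a loose illustration of object-based, action-conditioned dynamics learning in general (not MAOP's three-level architecture), the sketch below predicts each object's motion from its mask and a one-hot action; all shapes and layer choices are assumptions.

```python
# Minimal sketch of a generic object-conditioned, action-conditioned dynamics
# predictor: given per-object masks and an action, predict each object's motion.
import torch
import torch.nn as nn

class ObjectDynamics(nn.Module):
    def __init__(self, n_actions=4, hidden=32):
        super().__init__()
        # Encode one object's mask (1-channel image) into a feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Predict the object's (dx, dy) displacement from its features + action.
        self.head = nn.Sequential(
            nn.Linear(32 + n_actions, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, masks, action_onehot):          # masks: (B, K, H, W)
        B, K, H, W = masks.shape
        feats = self.encoder(masks.reshape(B * K, 1, H, W)).reshape(B, K, -1)
        action = action_onehot.unsqueeze(1).expand(B, K, -1)
        return self.head(torch.cat([feats, action], dim=-1))  # (B, K, 2) motions

masks = torch.rand(2, 3, 64, 64)                      # 2 scenes, 3 objects each
action = torch.eye(4)[torch.tensor([0, 2])]           # one-hot actions
print(ObjectDynamics()(masks, action).shape)          # torch.Size([2, 3, 2])
```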
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
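As a small illustration of working with event-camera output, the sketch below accumulates a stream of (timestamp, x, y, polarity) events into a signed event-count frame, one of the simplest representations for feeding events to standard vision algorithms. The sensor resolution and window length are illustrative assumptions.

```python
# Minimal sketch: accumulate events (t, x, y, polarity) into a 2D frame.
import numpy as np

rng = np.random.default_rng(0)
H, W = 180, 240                                   # e.g. a DAVIS240-class sensor
n_events = 10000
events = np.stack([
    rng.uniform(0.0, 0.05, n_events),             # timestamp in seconds
    rng.integers(0, W, n_events),                 # x coordinate
    rng.integers(0, H, n_events),                 # y coordinate
    rng.choice([-1, 1], n_events),                # polarity (sign of brightness change)
], axis=1)

def accumulate(events, t0, t1):
    """Sum signed events inside the window [t0, t1) into a single frame."""
    frame = np.zeros((H, W), dtype=np.float32)
    window = events[(events[:, 0] >= t0) & (events[:, 0] < t1)]
    np.add.at(frame, (window[:, 2].astype(int), window[:, 1].astype(int)), window[:, 3])
    return frame

frame = accumulate(events, 0.0, 0.01)             # 10 ms accumulation window
print(frame.shape, frame.min(), frame.max())
```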