1,151 research outputs found
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to today's advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared. Comment: Published 201
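The matching step this abstract describes (comparing feature vectors under a similarity or dissimilarity measure) can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the feature values, the identity names, and the helper `rank_gallery` are all hypothetical, and plain Euclidean distance stands in for the "simple constant metrics" the abstract mentions.

```python
import math


def euclidean(a, b):
    """Fixed (non-learned) dissimilarity between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(probe, gallery, metric=euclidean):
    """Return gallery identities ordered from most to least similar to the probe.

    `gallery` maps an identity label to a feature vector extracted from an
    image seen by another camera (hypothetical data layout).
    """
    scored = [(metric(probe, feat), ident) for ident, feat in gallery.items()]
    scored.sort()  # smallest distance first
    return [ident for _, ident in scored]


# Hypothetical 3-dimensional appearance features, for illustration only.
gallery = {
    "person_a": [0.90, 0.10, 0.30],
    "person_b": [0.20, 0.80, 0.50],
    "person_c": [0.85, 0.15, 0.35],
}
probe = [0.88, 0.12, 0.32]

ranking = rank_gallery(probe, gallery)
print(ranking)
```

A learned metric would replace `euclidean` with a function fitted on labelled pairs; the ranking step itself would stay the same.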
Visual region understanding: unsupervised extraction and abstraction
The ability to gain a conceptual understanding of the world in uncontrolled environments is the ultimate goal of vision-based computer systems. Technological
societies today are heavily reliant on surveillance and security infrastructure, robotics, medical image analysis, visual data categorisation and search, and smart device user interaction, to name a few. Out of all the complex problems tackled
by computer vision today in the context of these technologies, that which lies closest to the original goals of the field is the subarea of unsupervised scene analysis or scene modelling. However, its common use of low level features does not provide
a good balance between generality and discriminative ability, both a result and a symptom of the sensory and semantic gaps existing between low level computer
representations and high level human descriptions.
In this research we explore a general framework that addresses the fundamental
problem of universal unsupervised extraction of semantically meaningful visual
regions and their behaviours. For this purpose we address issues related to
(i) spatial and spatiotemporal segmentation for region extraction, (ii) region shape modelling, and (iii) the online categorisation of visual object classes and the spatiotemporal analysis of their behaviours. Under this framework we propose (a)
a unified region merging method and spatiotemporal region reduction, (b) shape
representation by the optimisation and novel simplification of contour-based growing neural gases, and (c) a foundation for the analysis of visual object motion properties using a shape and appearance based nearest-centroid classification algorithm
and trajectory plots for the obtained region classes.
Specifically, we formulate a region merging spatial segmentation mechanism
that combines and adapts features shown previously to be individually useful,
namely parallel region growing, the best merge criterion, a time adaptive threshold, and region reduction techniques. For spatiotemporal region refinement we
consider both scalar intensity differences and vector optical flow. To model the shapes of the visual regions thus obtained, we adapt the growing neural gas for
rapid region contour representation and propose a contour simplification technique. A fast unsupervised nearest-centroid online learning technique next groups observed region instances into classes, for which we are then able to analyse spatial
presence and spatiotemporal trajectories. The analysis results show semantic correlations to real world object behaviour. Performance evaluation of all steps across
standard metrics and datasets validates their performance.
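The unsupervised nearest-centroid online grouping mentioned in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the thesis's algorithm: the class name `OnlineNearestCentroid`, the distance threshold for opening a new class, and the running-mean centroid update are all hypothetical choices made to show the general idea.

```python
import math


class OnlineNearestCentroid:
    """Assign each incoming feature vector to its nearest centroid.

    A new class is opened when no centroid lies within `new_class_threshold`
    (a hypothetical parameter); otherwise the winning centroid is moved
    towards the sample with a running-mean update.
    """

    def __init__(self, new_class_threshold):
        self.threshold = new_class_threshold
        self.centroids = []  # one feature vector per class
        self.counts = []     # number of samples seen per class

    def observe(self, x):
        best, best_d = None, math.inf
        for i, c in enumerate(self.centroids):
            d = math.dist(x, c)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.threshold:
            # Nothing close enough: start a new class at this sample.
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Running mean: c <- c + (x - c) / n
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [
            c + (v - c) / n for c, v in zip(self.centroids[best], x)
        ]
        return best


# Hypothetical 2-D region descriptors arriving one at a time.
clf = OnlineNearestCentroid(new_class_threshold=1.0)
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0)]
labels = [clf.observe(p) for p in stream]
print(labels)
```

Because each sample is processed once and only centroids and counts are stored, memory grows with the number of classes rather than the number of observations, which is what makes this kind of scheme suitable for online use.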
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypothesis assumed and thus, the constraints imposed on the type of video
that each technique is able to address. Stating the hypothesis and
constraints explicitly makes the framework particularly useful for selecting a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
AI-enhanced diagnosis of challenging lesions in breast MRI: a methodology and application primer
Computer-aided diagnosis (CAD) systems have become an important tool in the assessment of breast tumors with magnetic resonance imaging (MRI). CAD systems can be used for the detection and diagnosis of breast tumors as a "second opinion" review complementing the radiologist's review. CAD systems have many common parts, such as image pre-processing, tumor feature extraction and data classification, that are mostly based on machine learning (ML) techniques. In this review paper, we describe the application of ML-based CAD systems in MRI of the breast covering the detection of diagnostically challenging lesions such as non-mass enhancing (NME) lesions, multiparametric MRI, neo-adjuvant chemotherapy (NAC) and radiomics, all applied to NME. Since ML has been widely used in the medical imaging community, we provide an overview of the state-of-the-art and novel techniques applied as classifiers to CAD systems. The differences in the CAD systems in MRI of the breast for several standard and novel applications for NME are explained in detail to provide important examples illustrating: (i) CAD for the detection and diagnosis, (ii) CAD in multi-parametric imaging, (iii) CAD in NAC and (iv) breast cancer radiomics. We aim to provide a comparison between these CAD applications and to illustrate a global view on intelligent CAD systems based on ANN in MRI of the breast.