A Survey Of Activity Recognition And Understanding The Behavior In Video Survelliance
This paper presents a review of human activity recognition and behaviour
understanding in video sequences. The key objective of this paper is to provide
a general review of the overall process of surveillance systems in
current use. Visual surveillance systems are directed at the automatic
identification of events of interest, especially the tracking and classification
of moving objects. The processing pipeline of a video surveillance system
includes the following stages: surrounding model, object representation, object
tracking, activity recognition and behaviour understanding. The paper describes
techniques used to define a general set of activities that are applicable
to a wide range of scenes and environments in video sequences.
Comment: 14 pages, 5 figures, 5 table
Revisiting Active Perception
Despite the recent successes in robotics, artificial intelligence and
computer vision, a complete artificial agent necessarily must include active
perception. A multitude of ideas and methods for how to accomplish this have
already appeared in the past, their broader utility perhaps impeded by
insufficient computational power or costly hardware. The history of these
ideas, perhaps selective due to our perspectives, is presented with the goal of
organizing the past literature and highlighting the seminal contributions. We
argue that those contributions are as relevant today as they were decades ago
and, with the state of modern computational tools, are poised to find new life
in the robotic perception systems of the next decade.
A Survey on Food Computing
Food is essential for human life and fundamental to the human
experience. Food-related study may support multifarious applications and
services, such as guiding human behavior, improving human health and
understanding culinary culture. With the rapid development of social
networks, mobile networks, and Internet of Things (IoT), people commonly
upload, share, and record food images, recipes, cooking videos, and food
diaries, leading to large-scale food data. Large-scale food data offers rich
knowledge about food and can help tackle many central issues of human society.
Therefore, it is time to group several disparate issues related to food
computing. Food computing acquires and analyzes heterogeneous food data from
disparate sources for perception, recognition, retrieval, recommendation, and
monitoring of food. In food computing, computational approaches are applied to
address food related issues in medicine, biology, gastronomy and agronomy. Both
large-scale food data and recent breakthroughs in computer science are
transforming the way we analyze food data. Therefore, a vast amount of work has
been conducted in the food area, targeting different food-oriented tasks and
applications. However, there are very few systematic reviews that shape this
area well and provide a comprehensive and in-depth summary of current efforts
or detail open problems in this area. In this paper, we formalize food
computing and present such a comprehensive overview of various emerging
concepts, methods, and tasks. We summarize key challenges and future directions
ahead for food computing. This is the first comprehensive survey that targets
the study of computing technology for the food area and also offers a
collection of research studies and technologies to benefit researchers and
practitioners working in different food-related fields.
Comment: Accepted by ACM Computing Survey
A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT
Automated Lane Detection in Crowds using Proximity Graphs
Studying the behavior of crowds is vital for understanding and predicting
human interactions in public areas. Research has shown that, under certain
conditions, large groups of people can form collective behavior patterns: local
interactions between individuals result in global movement patterns. To
detect these patterns in a crowd, we assume each person is carrying an on-body
device that acts as a local proximity sensor, e.g., a smartphone or Bluetooth badge,
and represent the texture of the crowd as a proximity graph. Our goal is to
extract information about crowds from these proximity graphs. In this work, we
focus on one particular type of pattern: lane formation. We present a formal
definition of a lane, propose a simple probabilistic model that simulates
lanes moving through a stationary crowd, and present an automated
lane-detection method. Our preliminary results show that our method is able to
detect lanes of different shapes and sizes. We see our work as an initial step
towards rich pattern recognition using proximity graphs.
Comment: Presented at the 6th International Workshop on Urban Computing
(UrbComp 2017) held in conjunction with the 23rd ACM SIGKD
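To make the notion of a proximity graph concrete, here is a minimal Python sketch. It is not the authors' lane-detection method. It assumes simulated 2D positions and a hypothetical sensing radius, whereas in the setting described above the edges would come from what the on-body devices detect. The names `build_proximity_graph`, `positions`, and `radius` are illustrative only.

```python
from itertools import combinations
from math import dist


def build_proximity_graph(positions, radius):
    """Build an undirected proximity graph as a dict of adjacency sets.

    positions: dict mapping a person id to an (x, y) tuple (simulated here;
               real devices would report detected neighbours instead).
    radius:    hypothetical sensing range; two people are linked when their
               Euclidean distance is at most this value.
    """
    graph = {person: set() for person in positions}
    for a, b in combinations(positions, 2):
        if dist(positions[a], positions[b]) <= radius:
            graph[a].add(b)
            graph[b].add(a)
    return graph


if __name__ == "__main__":
    # Three people on a line; only the first two are within range of each other.
    crowd = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 0.0)}
    print(build_proximity_graph(crowd, radius=1.5))
```

The pairwise loop is quadratic in the number of people, which is acceptable for a small illustration but would need a spatial index for large crowds.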
A Survey on Deep Learning Methods for Robot Vision
Deep learning has allowed a paradigm shift in pattern recognition, from using
hand-crafted features together with statistical classifiers to using
general-purpose learning procedures for learning data-driven representations,
features, and classifiers together. The application of this new paradigm has
been particularly successful in computer vision, in which the development of
deep learning methods for vision applications has become a hot research topic.
Given that deep learning has already attracted the attention of the robot
vision community, the main purpose of this survey is to address the use of deep
learning in robot vision. To achieve this, a comprehensive overview of deep
learning and its usage in computer vision is given, which includes a description
of the most frequently used neural models and their main application areas.
Then, the standard methodology and tools used for designing deep-learning based
vision systems are presented. Afterwards, a review of the principal work using
deep learning in robot vision is presented, as well as current and future
trends related to the use of deep learning in robotics. This survey is intended
to be a guide for the developers of robot vision systems.
Indexing Multivariate Mobile Data through Spatio-Temporal Event Detection and Clustering.
Mobile and wearable devices are capable of quantifying user behaviors based on their contextual sensor data. However, few indexing and annotation mechanisms are available, due to difficulties inherent in raw multivariate data types and the relative sparsity of sensor data. These issues have slowed the development of higher level human-centric searching and querying mechanisms. Here, we propose a pipeline of three algorithms. First, we introduce a spatio-temporal event detection algorithm. Then, we introduce a clustering algorithm based on mobile contextual data. Our spatio-temporal clustering approach can be used as an annotation on raw sensor data. It improves information retrieval by reducing the search space and is based on searching only the related clusters. To further improve behavior quantification, the third algorithm identifies contrasting events within a cluster's content. Two large real-world smartphone datasets have been used to evaluate our algorithms and demonstrate the utility and resource efficiency of our approach to search.
OpenEDS2020: Open Eyes Dataset
We present the second edition of the OpenEDS dataset, OpenEDS2020, a novel
dataset of eye-image sequences captured at a frame rate of 100 Hz under
controlled illumination, using a virtual-reality head-mounted display mounted
with two synchronized eye-facing cameras. The dataset, which is anonymized to
remove any personally identifiable information on participants, consists of 80
participants of varied appearance performing several gaze-elicited tasks, and
is divided into two subsets: 1) Gaze Prediction Dataset, with up to 66,560
sequences containing 550,400 eye-images and respective gaze vectors, created to
foster research in spatio-temporal gaze estimation and prediction approaches;
and 2) Eye Segmentation Dataset, consisting of 200 sequences sampled at 5 Hz,
with up to 29,500 images, of which 5% contain a semantic segmentation label,
devised to encourage the use of temporal information to propagate labels to
contiguous frames. Baseline experiments have been evaluated on OpenEDS2020, one
for each task, with average angular error of 5.37 degrees when performing gaze
prediction on 1 to 5 frames into the future, and a mean intersection over union
score of 84.1% for semantic segmentation. As with its predecessor, the OpenEDS dataset,
we anticipate that this new dataset will continue creating opportunities for
researchers in the eye tracking, machine learning and computer vision communities
to advance the state of the art for virtual reality applications. The dataset
is available for download upon request at
http://research.fb.com/programs/openeds-2020-challenge/.
Comment: Description of dataset used in OpenEDS2020 challenge:
https://research.fb.com/programs/openeds-2020-challenge
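The two baseline figures quoted above are an angular error in degrees and a mean intersection over union. The following Python sketch shows one common way such metrics are computed. It is an illustration under assumptions, not the challenge's evaluation code: the function names are hypothetical, gaze directions are taken to be 3D vectors, segmentation labels are flattened to lists of class indices, and classes absent from both prediction and ground truth are skipped, which the official protocol may handle differently.

```python
import math


def angular_error_deg(predicted, target):
    """Angle in degrees between two non-zero gaze direction vectors."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norms = math.sqrt(sum(p * p for p in predicted)) * math.sqrt(
        sum(t * t for t in target)
    )
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))


def mean_iou(predicted_labels, target_labels, num_classes):
    """Mean intersection over union across classes that occur at least once.

    Both inputs are equal-length sequences of integer class labels.
    Assumption: a class missing from both sequences is ignored.
    """
    ious = []
    for cls in range(num_classes):
        pairs = list(zip(predicted_labels, target_labels))
        intersection = sum(1 for p, t in pairs if p == cls and t == cls)
        union = sum(1 for p, t in pairs if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    print(angular_error_deg((0.0, 0.0, 1.0), (0.0, 1.0, 0.0)))
    print(mean_iou([0, 1, 1, 0], [0, 1, 0, 0], num_classes=2))
```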
Towards Storytelling from Visual Lifelogging: An Overview
Visual lifelogging consists of acquiring images that capture the daily
experiences of the user by wearing a camera over a long period of time. The
pictures taken offer considerable potential for knowledge mining concerning how
people live their lives; hence, they open up new opportunities for many
potential applications in fields including healthcare, security, leisure and
the quantified self. However, automatically building a story from a huge
collection of unstructured egocentric data presents major challenges. This
paper provides a thorough review of advances made so far in egocentric data
analysis, and in view of the current state of the art, indicates new lines of
research to move us towards storytelling from visual lifelogging.
Comment: 16 pages, 11 figures, Submitted to IEEE Transactions on Human-Machine
System
Constant Space Complexity Environment Representation for Vision-based Navigation
This paper presents a preliminary conceptual investigation into an
environment representation that has constant space complexity with respect to
the camera image space. This type of representation allows the planning
algorithms of a mobile agent to bypass what are often complex and noisy
transformations between camera image space and Euclidean space. The approach is
to compute per-pixel potential values directly from processed camera data,
which results in a discrete potential field that has constant space complexity
with respect to the image plane. This can enable planning and control
algorithms, whose complexity often depends on the size of the environment
representation, to be defined with constant run-time. This type of approach can
be particularly useful for platforms with strict resource constraints, such as
embedded and real-time systems.
Comment: IROS 2017: 9th Workshop on Planning, Perception and Navigation for
Intelligent Vehicle
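As a rough illustration of a per-pixel potential field whose size is fixed by the image dimensions, consider the Python sketch below. The abstract does not state how potentials are computed, so everything here is an assumption: the input is a hypothetical binary obstacle mask derived from processed camera data, the potential is a simple repulsive term `1 / (1 + d)` where `d` is the 4-connected grid distance to the nearest obstacle pixel, and the name `potential_field` is illustrative.

```python
from collections import deque


def potential_field(obstacle_mask):
    """Return a grid of potentials with the same shape as the input mask.

    obstacle_mask: list of rows, truthy where a pixel is an obstacle.
    The output always has rows x cols entries, so its size depends only on
    the image resolution, not on how large the observed environment is.
    """
    rows, cols = len(obstacle_mask), len(obstacle_mask[0])
    distance = [[None] * cols for _ in range(rows)]
    queue = deque()

    # Seed a multi-source breadth-first search from every obstacle pixel.
    for r in range(rows):
        for c in range(cols):
            if obstacle_mask[r][c]:
                distance[r][c] = 0
                queue.append((r, c))

    # Propagate 4-connected grid distances outward from the obstacles.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and distance[nr][nc] is None:
                distance[nr][nc] = distance[r][c] + 1
                queue.append((nr, nc))

    # Assumed repulsive potential: highest on obstacles, zero if none exist.
    return [
        [0.0 if d is None else 1.0 / (1 + d) for d in row] for row in distance
    ]


if __name__ == "__main__":
    mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    for row in potential_field(mask):
        print(row)
```

A planner could then follow decreasing potential directly in image space, which is the kind of use the abstract points to when it mentions bypassing transformations into Euclidean space.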