A Survey Of Activity Recognition And Understanding The Behavior In Video Survelliance

Authors: Kumar Dhananjay; Revathi A. R.
Publication date: 29/07/2012

This paper reviews human activity recognition and behaviour understanding in video sequences. Its key objective is to give a general review of the overall process of current surveillance systems. Visual surveillance systems aim at the automatic identification of events of interest, especially the tracking and classification of moving objects. The processing pipeline of a video surveillance system comprises the following stages: surrounding model, object representation, object tracking, activity recognition, and behaviour understanding. The paper describes techniques used to define a general set of activities that are applicable to a wide range of scenes and environments in video sequences.

Comment: 14 pages, 5 figures, 5 tables

arXiv.org e-Print Archive

Revisiting Active Perception

Authors: Aloimonos Yiannis; Bajcsy Ruzena; Tsotsos John K.
Publication date: 13/03/2016

Despite the recent successes in robotics, artificial intelligence and computer vision, a complete artificial agent necessarily must include active perception. A multitude of ideas and methods for how to accomplish this have already appeared in the past, their broader utility perhaps impeded by insufficient computational power or costly hardware. The history of these ideas, perhaps selective due to our perspectives, is presented with the goal of organizing the past literature and highlighting the seminal contributions. We argue that those contributions are as relevant today as they were decades ago and, with the state of modern computational tools, are poised to find new life in the robotic perception systems of the next decade.

arXiv.org e-Print Archive

A Survey on Food Computing

Authors: Jain Ramesh; Jiang Shuqiang; Liu Linhu; Min Weiqing; Rui Yong
Publication date: 16/07/2019

Food is essential to human life and fundamental to the human experience. Food-related study may support a wide range of applications and services, such as guiding human behavior, improving human health, and understanding culinary culture. With the rapid development of social networks, mobile networks, and the Internet of Things (IoT), people commonly upload, share, and record food images, recipes, cooking videos, and food diaries, leading to large-scale food data. Large-scale food data offers rich knowledge about food and can help tackle many central issues of human society. Therefore, it is time to group several disparate issues related to food computing. Food computing acquires and analyzes heterogeneous food data from disparate sources for perception, recognition, retrieval, recommendation, and monitoring of food. In food computing, computational approaches are applied to address food-related issues in medicine, biology, gastronomy and agronomy. Both large-scale food data and recent breakthroughs in computer science are transforming the way we analyze food data. Therefore, a vast amount of work has been conducted in the food area, targeting different food-oriented tasks and applications. However, there are very few systematic reviews that shape this area well and provide a comprehensive and in-depth summary of current efforts or detail open problems in this area. In this paper, we formalize food computing and present such a comprehensive overview of various emerging concepts, methods, and tasks. We summarize key challenges and future directions ahead for food computing. This is the first comprehensive survey that targets the study of computing technology for the food area and also offers a collection of research studies and technologies to benefit researchers and practitioners working in different food-related fields.

Comment: Accepted by ACM Computing Surveys

arXiv.org e-Print Archive

A Survey on Content-Aware Video Analysis for Sports

Author: Shih Huang-Chia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/03/2017

Sports data analysis is becoming increasingly large-scale, diversified, and shared, but difficulty persists in rapidly accessing the most crucial information. Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. This study develops a deeper interpretation of content-aware sports video analysis by examining the insight offered by research into the structure of content under different scenarios. On the basis of this insight, we provide an overview of the themes particularly relevant to the research on content-aware systems for broadcast sports. Specifically, we focus on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges. Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups. In each group, the gap between sensation and content excitement must be bridged using proper strategies. In this regard, a content-aware approach is required to determine user demands. Finally, the paper summarizes the future trends and challenges for sports video analysis. We believe that our findings can advance the field of research on content-aware video analysis for broadcast sports.

Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

arXiv.org e-Print Archive

Automated Lane Detection in Crowds using Proximity Graphs

Authors: Heldens Stijn; Litvak Nelly; Martella Claudio; van Steen Maarten
Publication date: 06/07/2017

Studying the behavior of crowds is vital for understanding and predicting human interactions in public areas. Research has shown that, under certain conditions, large groups of people can form collective behavior patterns: local interactions between individuals result in global movement patterns. To detect these patterns in a crowd, we assume each person is carrying an on-body device that acts as a local proximity sensor, e.g., a smartphone or Bluetooth badge, and represent the texture of the crowd as a proximity graph. Our goal is to extract information about crowds from these proximity graphs. In this work, we focus on one particular type of pattern: lane formation. We present a formal definition of a lane, propose a simple probabilistic model that simulates lanes moving through a stationary crowd, and present an automated lane-detection method. Our preliminary results show that our method is able to detect lanes of different shapes and sizes. We see our work as an initial step towards rich pattern recognition using proximity graphs.

Comment: Presented at the 6th International Workshop on Urban Computing (UrbComp 2017) held in conjunction with the 23rd ACM SIGKDD

arXiv.org e-Print Archive
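
As a rough illustration of the proximity-graph representation described in the abstract above, the following minimal sketch builds an undirected graph from simulated 2D positions. The positions, the `radius` threshold, and both function names are assumptions made for this example; the abstract assumes proximity is sensed by on-body devices, and the grouping step here is plain connected components, not the paper's lane-detection method.

```python
import math
from collections import defaultdict

def proximity_graph(positions, radius):
    """Undirected graph: an edge joins two people within `radius` of each other.

    `positions` maps a (hypothetical) person id to simulated (x, y) coordinates,
    standing in for real proximity-sensor readings.
    """
    graph = defaultdict(set)
    ids = sorted(positions)
    for i, a in enumerate(ids):
        graph[a]  # touch the key so isolated people still appear as nodes
        for b in ids[i + 1:]:
            if math.dist(positions[a], positions[b]) <= radius:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)

def connected_groups(graph):
    """Connected components via iterative depth-first search."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups

# Three people walking close together and one person standing apart.
people = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (10, 10)}
graph = proximity_graph(people, radius=1.5)
groups = connected_groups(graph)
```

Here `a`, `b` and `c` end up in one group and `d` in its own. A real lane detector would need more than connectivity, for example how the graph changes over time.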

A Survey on Deep Learning Methods for Robot Vision

Authors: Loncomilla Patricio; Ruiz-del-Solar Javier; Soto Naiomi
Publication date: 28/03/2018

Deep learning has allowed a paradigm shift in pattern recognition, from using hand-crafted features together with statistical classifiers to using general-purpose learning procedures for learning data-driven representations, features, and classifiers together. The application of this new paradigm has been particularly successful in computer vision, in which the development of deep learning methods for vision applications has become a hot research topic. Given that deep learning has already attracted the attention of the robot vision community, the main purpose of this survey is to address the use of deep learning in robot vision. To achieve this, a comprehensive overview of deep learning and its usage in computer vision is given, which includes a description of the most frequently used neural models and their main application areas. Then, the standard methodology and tools used for designing deep-learning-based vision systems are presented. Afterwards, a review of the principal work using deep learning in robot vision is presented, as well as current and future trends related to the use of deep learning in robotics. This survey is intended to be a guide for the developers of robot vision systems.

arXiv.org e-Print Archive

Indexing Multivariate Mobile Data through Spatio-Temporal Event Detection and Clustering.

Authors: Akbari Mohammad; Dobbins Chelsea; Pazzani Michael; Rawassizadeh Reza
Publication venue: eScholarship, University of California
Publication date: 22/01/2019

Mobile and wearable devices are capable of quantifying user behaviors based on their contextual sensor data. However, few indexing and annotation mechanisms are available, due to difficulties inherent in raw multivariate data types and the relative sparsity of sensor data. These issues have slowed the development of higher-level human-centric searching and querying mechanisms. Here, we propose a pipeline of three algorithms. First, we introduce a spatio-temporal event detection algorithm. Then, we introduce a clustering algorithm based on mobile contextual data. Our spatio-temporal clustering approach can be used as an annotation on raw sensor data. It improves information retrieval by reducing the search space and is based on searching only the related clusters. To further improve behavior quantification, the third algorithm identifies contrasting events within a cluster's content. Two large real-world smartphone datasets have been used to evaluate our algorithms and demonstrate the utility and resource efficiency of our approach to search.

eScholarship - University of California

OpenEDS2020: Open Eyes Dataset

Authors: Behrendt Karsten; Komogortsev Oleg V.; Krishnakumar Kapil; Palmero Cristina; Sharma Abhishek; Talathi Sachin S.
Publication date: 08/05/2020

We present the second edition of the OpenEDS dataset, OpenEDS2020, a novel dataset of eye-image sequences captured at a frame rate of 100 Hz under controlled illumination, using a virtual-reality head-mounted display mounted with two synchronized eye-facing cameras. The dataset, which is anonymized to remove any personally identifiable information on participants, consists of 80 participants of varied appearance performing several gaze-elicited tasks, and is divided into two subsets: 1) Gaze Prediction Dataset, with up to 66,560 sequences containing 550,400 eye-images and respective gaze vectors, created to foster research in spatio-temporal gaze estimation and prediction approaches; and 2) Eye Segmentation Dataset, consisting of 200 sequences sampled at 5 Hz, with up to 29,500 images, of which 5% contain a semantic segmentation label, devised to encourage the use of temporal information to propagate labels to contiguous frames. Baseline experiments have been evaluated on OpenEDS2020, one for each task, with an average angular error of 5.37 degrees when performing gaze prediction on 1 to 5 frames into the future, and a mean intersection over union score of 84.1% for semantic segmentation. As with its predecessor, the OpenEDS dataset, we anticipate that this new dataset will continue creating opportunities for researchers in the eye tracking, machine learning and computer vision communities to advance the state of the art for virtual reality applications. The dataset is available for download upon request at http://research.fb.com/programs/openeds-2020-challenge/.

Comment: Description of dataset used in OpenEDS2020 challenge: https://research.fb.com/programs/openeds-2020-challenge

arXiv.org e-Print Archive
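
The two baseline metrics quoted in the OpenEDS2020 abstract above, angular error and mean intersection over union, can be sketched as follows. This is a minimal illustration using the standard textbook definitions. The function names, the flat label lists, and the choice to skip classes absent from both labellings are assumptions made here, and the challenge's exact evaluation protocol may differ.

```python
import math

def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze vectors (need not be unit length)."""
    dot = sum(p * t for p, t in zip(pred, true))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in true))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def mean_iou(pred_labels, true_labels, num_classes):
    """Mean intersection over union, averaged over classes present in either labelling."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_labels, true_labels))
        inter = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union:  # skip classes absent from both labellings (an assumption)
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy inputs: orthogonal gaze vectors and a four-pixel, two-class labelling.
error = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy labelling, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. As a side calculation from the abstract's own figures, 5% of up to 29,500 images corresponds to at most 1,475 labelled images.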

Towards Storytelling from Visual Lifelogging: An Overview

Authors: Bolaños Marc; Dimiccoli Mariella; Radeva Petia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 20/07/2016

Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives; hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis and, in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging.

Comment: 16 pages, 11 figures, Submitted to IEEE Transactions on Human-Machine Systems

arXiv.org e-Print Archive

Constant Space Complexity Environment Representation for Vision-based Navigation

Author: Johnson Jeffrey Kane
Publication date: 12/09/2017

This paper presents a preliminary conceptual investigation into an environment representation that has constant space complexity with respect to the camera image space. This type of representation allows the planning algorithms of a mobile agent to bypass what are often complex and noisy transformations between camera image space and Euclidean space. The approach is to compute per-pixel potential values directly from processed camera data, which results in a discrete potential field that has constant space complexity with respect to the image plane. This can enable planning and control algorithms, whose complexity often depends on the size of the environment representation, to be defined with constant run-time. This type of approach can be particularly useful for platforms with strict resource constraints, such as embedded and real-time systems.

Comment: IROS 2017: 9th Workshop on Planning, Perception and Navigation for Intelligent Vehicles

arXiv.org e-Print Archive
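
To make the idea of a per-pixel potential field in the abstract above more concrete, here is a small hypothetical sketch. The abstract does not say how potentials are computed, so the `obstacle_score` grid, the goal-column penalty, the `goal_weight` parameter, and the column-selection rule are all assumptions invented for illustration. The one property it is meant to show is that the field has exactly one value per image pixel, so its size depends on image resolution and not on the extent of the scene.

```python
def potential_field(obstacle_score, goal_col, goal_weight=0.01):
    """Per-pixel potential: obstacle score plus a penalty for distance from a goal column.

    `obstacle_score` is a hypothetical H x W grid of values in [0, 1] taken to be
    the output of some earlier image-processing step.
    """
    return [
        [score + goal_weight * abs(col - goal_col) for col, score in enumerate(row)]
        for row in obstacle_score
    ]

def best_column(field):
    """Image column with the lowest total potential (a toy steering choice)."""
    width = len(field[0])
    totals = [sum(row[col] for row in field) for col in range(width)]
    return min(range(width), key=totals.__getitem__)

# A 2 x 4 "image" with a strong obstacle response in column 1.
scores = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.8, 0.1, 0.0],
]
field = potential_field(scores, goal_col=1)
column = best_column(field)
```

With these toy values the goal column is blocked, so the cheapest column is the neighbouring column 0. The field stays a 2 x 4 grid regardless of how large the observed scene is.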