Search CORE

877 research outputs found

WATCHING PEOPLE: ALGORITHMS TO STUDY HUMAN MOTION AND ACTIVITIES

Author: MONTELEONE Vito
Publication venue: place:Palermo
Publication date: 21/02/2020
Field of study

Nowadays human motion analysis is one of the most active research topics in Computer Vision and it is receiving an increasing attention from both the industrial and scientific communities. The growing interest in human motion analysis is motivated by the increasing number of promising applications, ranging from surveillance, human–computer interaction, virtual reality to healthcare, sports, computer games and video conferencing, just to name a few. The aim of this thesis is to give an overview of the various tasks involved in visual motion analysis of the human body and to present the issues and possible solutions related to it. In this thesis, visual motion analysis is categorized into three major areas related to the interpretation of human motion: tracking of human motion using virtual pan-tilt-zoom (vPTZ) camera, recognition of human motions and human behaviors segmentation. In the field of human motion tracking, a virtual environment for PTZ cameras (vPTZ) is presented to overcame the mechanical limitations of PTZ cameras. The vPTZ is built on equirectangular images acquired by 360° cameras and it allows not only the development of pedestrian tracking algorithms but also the comparison of their performances. On the basis of this virtual environment, three novel pedestrian tracking algorithms for 360° cameras were developed, two of which adopt a tracking-by-detection approach while the last adopts a Bayesian approach. The action recognition problem is addressed by an algorithm that represents actions in terms of multinomial distributions of frequent sequential patterns of different length. Frequent sequential patterns are series of data descriptors that occur many times in the data. The proposed method learns a codebook of frequent sequential patterns by means of an apriori-like algorithm. An action is then represented with a Bag-of-Frequent-Sequential-Patterns approach. In the last part of this thesis a methodology to semi-automatically annotate behavioral data given a small set of manually annotated data is presented. The resulting methodology is not only effective in the semi-automated annotation task but can also be used in presence of abnormal behaviors, as demonstrated empirically by testing the system on data collected from children affected by neuro-developmental disorders

Archivio istituzionale della ricerca - Università di Palermo

Recommended from our members

Semantic localisation via globally unique instance segmentation

Author: Budvytis I
Cipolla R
Sauer P
Publication venue: British Machine Vision Conference 2018, BMVC 2018
Publication date: 01/01/2019
Field of study

In this work we propose a novel approach to semantic localisation. Our work is motivated by the need for environment perception techniques which not only perform self-localisation within a map but also simultaneously recognise surrounding objects. Such capabilities are crucial for computer vision applications which interact with the environment: autonomous driving, augmented reality or robotics. In order to achieve this goal we propose a solution which consists of three key steps. Firstly, a database of panoramic RGB images and corresponding globally unique, per-pixel object instance labels is built for the desired environment where we typically consider objects from static categories such as "building" or "tree". Secondly, a semantic segmentation network capable of predicting more than 3000 labels is trained on the collected data. Finally, for a given panoramic query image, the corresponding instance label image predicted by the network is used for semantic matching within the database. The matching is performed in two stages: (i) a fast retrieval of a small subset of database images (~100) with highly overlapping instance label histograms, followed by (ii) an explicit approximate 3 DoF (yaw, pitch, roll) alignment of the selected subset of images and the query image. We evaluate our approach in challenging indoor and outdoor navigation scenarios, achieving better or similar performance when compared to state-of-the-art image retrieval-based localisation approaches using key-point matching and image level embedding. Our contribution includes: (i) a description of a novel semantic localisation approach using globally unique instance segmentation, (ii) corresponding quantitative and qualitative analysis and (iii) a novel CamVid-360 dataset containing 986 labelled instances of buildings, trees, road signs and poles

Apollo (Cambridge)

Dataset of Panoramic Images for People Tracking in Service Robotics

Author
Publication venue
Publication date
Field of study

We provide a framework for constructing a guided robot for usage in hospitals in this thesis. The omnidirectional camera on the robot allows it to recognize and track the person who is following it. Furthermore, when directing the individual to their preferred position in the hospital, the robot must be aware of its surroundings and avoid accidents with other people or items. To train and evaluate our robot's performance, we developed an auto-labeling framework for creating a dataset of panoramic videos captured by the robot's omnidirectional camera. We labeled each person in the video and their real position in the robot's frame, enabling us to evaluate the accuracy of our tracking system and guide the development of the robot's navigation algorithms. Our research expands on earlier work that has established a framework for tracking individuals using omnidirectional cameras. We want to contribute to the continuing work to enhance the precision and dependability of these tracking systems, which is essential for the creation of efficient guiding robots in healthcare facilities, by developing a benchmark dataset. Our research has the potential to improve the patient experience and increase the efficiency of healthcare institutions by reducing staff time spent guiding patients through the facility.We provide a framework for constructing a guided robot for usage in hospitals in this thesis. The omnidirectional camera on the robot allows it to recognize and track the person who is following it. Furthermore, when directing the individual to their preferred position in the hospital, the robot must be aware of its surroundings and avoid accidents with other people or items. To train and evaluate our robot's performance, we developed an auto-labeling framework for creating a dataset of panoramic videos captured by the robot's omnidirectional camera. We labeled each person in the video and their real position in the robot's frame, enabling us to evaluate the accuracy of our tracking system and guide the development of the robot's navigation algorithms. Our research expands on earlier work that has established a framework for tracking individuals using omnidirectional cameras. We want to contribute to the continuing work to enhance the precision and dependability of these tracking systems, which is essential for the creation of efficient guiding robots in healthcare facilities, by developing a benchmark dataset. Our research has the potential to improve the patient experience and increase the efficiency of healthcare institutions by reducing staff time spent guiding patients through the facility

Padua Thesis and Dissertation Archive

Depth- and Semantics-aware Multi-modal Domain Translation: Generating 3D Panoramic Color Images from LiDAR Point Clouds

Author: Aksoy Eren Erdal
Cortinhal Tiago
Publication venue
Publication date: 15/02/2023
Field of study

This work presents a new depth- and semantics-aware conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors. The proposed model leverages scene semantics as a mid-level representation and is able to translate raw LiDAR point clouds to RGB-D camera images by solely relying on semantic scene segments. We claim that this is the first framework of its kind and it has practical applications in autonomous vehicles such as providing a fail-safe mechanism and augmenting available data in the target image domain. The proposed model is evaluated on the large-scale and challenging Semantic-KITTI dataset, and experimental findings show that it considerably outperforms the original TITAN-Net and other strong baselines by 23.7

\%

margin in terms of IoU

arXiv.org e-Print Archive

PASS: Panoramic Annular Semantic Segmentation

Author: Bergasa Pascual Luis M.
Hu Xinxin
Kaiwei Wang
Romera Carmena Eduardo
Yang Kailun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/09/2019
Field of study

Pixel-wise semantic segmentation is capable of unifying most of driving scene perception tasks, and has enabled striking progress in the context of navigation assistance, where an entire surrounding sensing is vital. However, current mainstream semantic segmenters are predominantly benchmarked against datasets featuring narrow Field of View (FoV), and a large part of vision-based intelligent vehicles use only a forward-facing camera. In this paper, we propose a Panoramic Annular Semantic Segmentation (PASS) framework to perceive the whole surrounding based on a compact panoramic annular lens system and an online panorama unfolding process. To facilitate the training of PASS models, we leverage conventional FoV imaging datasets, bypassing the efforts entailed to create fully dense panoramic annotations. To consistently exploit the rich contextual cues in the unfolded panorama, we adapt our real-time ERF-PSPNet to predict semantically meaningful feature maps in different segments, and fuse them to fulfill panoramic scene parsing. The innovation lies in the network adaptation to enable smooth and seamless segmentation, combined with an extended set of heterogeneous data augmentations to attain robustness in panoramic imagery. A comprehensive variety of experiments demonstrates the effectiveness for real-world surrounding perception in a single PASS, while the adaptation proposal is exceptionally positive for state-of-the-art efficient networks.Ministerio de Economía y CompetitividadComunidad de Madri

e_Buah - Biblioteca Digital de la Universidad de Alcalá

Multi-Clip Video Editing from a Single Viewpoint

Author: Alton John
Laszlo Andrew
Mascelli Joseph
Mavlankar Aditya
Mercado Gustavo
Thomson Roy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/11/2014
Field of study

International audienceWe propose a framework for automatically generating multiple clips suitable for video editing by simulating pan-tilt-zoom camera movements within the frame of a single static camera. Assuming important actors and objects can be localized using computer vision techniques, our method requires only minimal user input to define the subject matter of each sub-clip. The composition of each sub-clip is automatically computed in a novel L1-norm optimization framework. Our approach encodes several common cinematographic practices into a single convex cost function minimization problem, resulting in aesthetically pleasing sub-clips which can easily be edited together using off-the-shelf multi-clip video editing software. We demonstrate our approach on five video sequences of a live theatre performance by generating multiple synchronized subclips for each sequence

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

A Survey on Human-aware Robot Navigation

Author: Battiato Sebastiano
Farinella Giovanni Maria
Furnari Antonino
Härmä Aki
Möller Ronja
Publication venue
Publication date: 22/06/2021
Field of study

Intelligent systems are increasingly part of our everyday lives and have been integrated seamlessly to the point where it is difficult to imagine a world without them. Physical manifestations of those systems on the other hand, in the form of embodied agents or robots, have so far been used only for specific applications and are often limited to functional roles (e.g. in the industry, entertainment and military fields). Given the current growth and innovation in the research communities concerned with the topics of robot navigation, human-robot-interaction and human activity recognition, it seems like this might soon change. Robots are increasingly easy to obtain and use and the acceptance of them in general is growing. However, the design of a socially compliant robot that can function as a companion needs to take various areas of research into account. This paper is concerned with the navigation aspect of a socially-compliant robot and provides a survey of existing solutions for the relevant areas of research as well as an outlook on possible future directions.Comment: Robotics and Autonomous Systems, 202

arXiv.org e-Print Archive

Maastricht University Research Portal