1,219 research outputs found
Probabilistic three-dimensional object tracking based on adaptive depth segmentation
Object tracking is one of the fundamental topics of computer vision with diverse applications. The arising challenges in tracking, i.e., cluttered scenes, occlusion, complex motion, and illumination variations have motivated utilization of depth information from 3D sensors. However, current 3D trackers are not applicable to unconstrained environments without a priori knowledge. As an important object detection module in tracking, segmentation subdivides an image into its constituent regions. Nevertheless, the existing range segmentation methods in literature are difficult to implement in real-time due to their slow performance. In this thesis, a 3D object tracking method based on adaptive depth segmentation and particle filtering is presented. In this approach, the segmentation method as the bottom-up process is combined with the particle filter as the top-down process to achieve efficient tracking results under challenging circumstances. The experimental results demonstrate the efficiency, as well as robustness of the tracking algorithm utilizing real-world range information
Recommended from our members
Recognizing human activity using RGBD data
textTraditional computer vision algorithms try to understand the world using visible light cameras. However, there are inherent limitations of this type of data source. First, visible light images are sensitive to illumination changes and background clutter. Second, the 3D structural information of the scene is lost when projecting the 3D world to 2D images. Recovering the 3D information from 2D images is a challenging problem. Range sensors have existed for over thirty years, which capture 3D characteristics of the scene. However, earlier range sensors were either too expensive, difficult to use in human environments, slow at acquiring data, or provided a poor estimation of distance. Recently, the easy access to the RGBD data at real-time frame rate is leading to a revolution in perception and inspired many new research using RGBD data. I propose algorithms to detect persons and understand the activities using RGBD data. I demonstrate the solutions to many computer vision problems may be improved with the added depth channel. The 3D structural information may give rise to algorithms with real-time and view-invariant properties in a faster and easier fashion. When both data sources are available, the features extracted from the depth channel may be combined with traditional features computed from RGB channels to generate more robust systems with enhanced recognition abilities, which may be able to deal with more challenging scenarios. As a starting point, the first problem is to find the persons of various poses in the scene, including moving or static persons. Localizing humans from RGB images is limited by the lighting conditions and background clutter. Depth image gives alternative ways to find the humans in the scene. In the past, detection of humans from range data is usually achieved by tracking, which does not work for indoor person detection. In this thesis, I propose a model based approach to detect the persons using the structural information embedded in the depth image. I propose a 2D head contour model and a 3D head surface model to look for the head-shoulder part of the person. Then, a segmentation scheme is proposed to segment the full human body from the background and extract the contour. I also give a tracking algorithm based on the detection result. I further research on recognizing human actions and activities. I propose two features for recognizing human activities. The first feature is drawn from the skeletal joint locations estimated from a depth image. It is a compact representation of the human posture called histograms of 3D joint locations (HOJ3D). This representation is view-invariant and the whole algorithm runs at real-time. This feature may benefit many applications to get a fast estimation of the posture and action of the human subject. The second feature is a spatio-temporal feature for depth video, which is called Depth Cuboid Similarity Feature (DCSF). The interest points are extracted using an algorithm that effectively suppresses the noise and finds salient human motions. DCSF is extracted centered on each interest point, which forms the description of the video contents. This descriptor can be used to recognize the activities with no dependence on skeleton information or pre-processing steps such as motion segmentation, tracking, or even image de-noising or hole-filling. It is more flexible and widely applicable to many scenarios. Finally, all the features herein developed are combined to solve a novel problem: first-person human activity recognition using RGBD data. Traditional activity recognition algorithms focus on recognizing activities from a third-person perspective. I propose to recognize activities from a first-person perspective with RGBD data. This task is very novel and extremely challenging due to the large amount of camera motion either due to self exploration or the response of the interaction. I extracted 3D optical flow features as the motion descriptor, 3D skeletal joints features as posture descriptors, spatio-temporal features as local appearance descriptors to describe the first-person videos. To address the ego-motion of the camera, I propose an attention mask to guide the recognition procedures and separate the features on the ego-motion region and independent-motion region. The 3D features are very useful at summarizing the discerning information of the activities. In addition, the combination of the 3D features with existing 2D features brings more robust recognition results and make the algorithm capable of dealing with more challenging cases.Electrical and Computer Engineerin
An unsupervised acoustic fall detection system using source separation for sound interference suppression
We present a novel unsupervised fall detection system that employs the collected acoustic signals (footstep sound signals) from an elderly person’s normal activities to construct a data description model to distinguish falls from non-falls. The measured acoustic signals are initially processed with a source separation (SS) technique to remove
the possible interferences from other background sound sources. Mel-frequency cepstral coefficient (MFCC) features are next extracted from the processed signals and used to construct a data description model based on a one class support vector machine (OCSVM) method, which is finally applied to distinguish fall from non-fall sounds. Experiments on a recorded dataset confirm that our proposed fall detection system can achieve better performance, especially with high level of interference from other sound sources, as compared with existing single microphone based methods
Computer vision based techniques for fall detection with application towards assisted living
In this thesis, new computer vision based techniques are proposed to detect falls of an elderly person living alone. This is an important problem in assisted living.
Different types of information extracted from video recordings are exploited for fall detection using both analytical and machine learning techniques. Initially, a particle filter is used to extract a 2D cue, head velocity,
to determine a likely fall event. The human body region is then extracted with a modern background subtraction algorithm. Ellipse fitting is used to represent this shape and its orientation angle is employed for fall detection. An analytical method is used by setting proper thresholds against which the head velocity and orientation angle are compared for fall discrimination. Movement amplitude is then integrated into the fall detector to reduce false
alarms.
Since 2D features can generate false alarms and are not invariant to different directions, more robust 3D features are next extracted from a 3D person representation formed from video measurements from multiple calibrated cameras. Instead of using thresholds, different data fitting methods
are applied to construct models corresponding to fall activities. These are then used to distinguish falls and non-falls.
In the final works, two practical fall detection schemes which use only one un-calibrated camera are tested in a real home environment. These approaches are based on 2D features which describe human body posture. These extracted features are then applied to construct either a supervised method for posture classification or an unsupervised method for abnormal posture detection. Certain rules which are set according to the characteristics of fall activities are lastly used to build robust fall detection methods. Extensive evaluation studies are included to confirm the efficiency of the
schemes
An unsupervised acoustic fall detection system using source separation for sound interference suppression
We present a novel unsupervised fall detection system that employs the collected acoustic signals (footstep sound signals) from an elderly person׳s normal activities to construct a data description model to distinguish falls from non-falls. The measured acoustic signals are initially processed with a source separation (SS) technique to remove the possible interferences from other background sound sources. Mel-frequency cepstral coefficient (MFCC) features are next extracted from the processed signals and used to construct a data description model based on a one class support vector machine (OCSVM) method, which is finally applied to distinguish fall from non-fall sounds. Experiments on a recorded dataset confirm that our proposed fall detection system can achieve better performance, especially with high level of interference from other sound sources, as compared with existing single microphone based methods
Development of a simulation tool for measurements and analysis of simulated and real data to identify ADLs and behavioral trends through statistics techniques and ML algorithms
openCon una popolazione di anziani in crescita, il numero di soggetti a rischio di patologia è in rapido aumento. Molti gruppi di ricerca stanno studiando soluzioni pervasive per monitorare continuamente e discretamente i soggetti fragili nelle loro case, riducendo i costi sanitari e supportando la diagnosi medica. Comportamenti anomali durante l'esecuzione di attività di vita quotidiana (ADL) o variazioni sulle tendenze comportamentali sono di grande importanza.With a growing population of elderly people, the number of subjects at risk of pathology is rapidly increasing. Many research groups are studying pervasive solutions to continuously and unobtrusively monitor fragile subjects in their homes, reducing health-care costs and supporting the medical diagnosis. Anomalous behaviors while performing activities of daily living (ADLs) or variations on behavioral trends are of great importance. To measure ADLs a significant number of parameters need to be considering affecting the measurement such as sensors and environment characteristics or sensors disposition. To face the impossibility to study in the real context the best configuration of sensors able to minimize costs and maximize accuracy, simulation tools are being developed as powerful means. This thesis presents several contributions on this topic. In the following research work, a study of a measurement chain aimed to measure ADLs and represented by PIRs sensors and ML algorithm is conducted and a simulation tool in form of Web Application has been developed to generate datasets and to simulate how the measurement chain reacts varying the configuration of the sensors. Starting from eWare project results, the simulation tool has been thought to provide support for technicians, developers and installers being able to speed up analysis and monitoring times, to allow rapid identification of changes in behavioral trends, to guarantee system performance monitoring and to study the best configuration of the sensors network for a given environment. The UNIVPM Home Care Web App offers the chance to create ad hoc datasets related to ADLs and to conduct analysis thanks to statistical algorithms applied on data. To measure ADLs, machine learning algorithms have been implemented in the tool. Five different tasks have been identified. To test the validity of the developed instrument six case studies divided into two categories have been considered. To the first category belong those studies related to: 1) discover the best configuration of the sensors keeping environmental characteristics and user behavior as constants; 2) define the most performant ML algorithms. The second category aims to proof the stability of the algorithm implemented and its collapse condition by varying user habits. Noise perturbation on data has been applied to all case studies. Results show the validity of the generated datasets. By maximizing the sensors network is it possible to minimize the ML error to 0.8%. Due to cost is a key factor in this scenario, the fourth case studied considered has shown that minimizing the configuration of the sensors it is possible to reduce drastically the cost with a more than reasonable value for the ML error around 11.8%. Results in ADLs measurement can be considered more than satisfactory.INGEGNERIA INDUSTRIALEopenPirozzi, Michel
Multiple human tracking in RGB-depth data: A survey
© The Institution of Engineering and Technology. Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained and affected by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, and many of these integrate colour and depth cues to improve each and every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based (a) on how they implement this pipeline and (b) on what role depth plays within each stage of it. They identify and introduce existing, publicly available, benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets
Video surveillance systems-current status and future trends
Within this survey an attempt is made to document the present status of video surveillance systems. The main components of a surveillance system are presented and studied thoroughly. Algorithms for image enhancement, object detection, object tracking, object recognition and item re-identification are presented. The most common modalities utilized by surveillance systems are discussed, putting emphasis on video, in terms of available resolutions and new imaging approaches, like High Dynamic Range video. The most important features and analytics are presented, along with the most common approaches for image / video quality enhancement. Distributed computational infrastructures are discussed (Cloud, Fog and Edge Computing), describing the advantages and disadvantages of each approach. The most important deep learning algorithms are presented, along with the smart analytics that they utilize. Augmented reality and the role it can play to a surveillance system is reported, just before discussing the challenges and the future trends of surveillance
- …