145 research outputs found

    Multi-Sensor Data Fusion for Robust Environment Reconstruction in Autonomous Vehicle Applications

    Get PDF
    In autonomous vehicle systems, understanding the surrounding environment is mandatory for an intelligent vehicle to make every decision of movement on the road. Knowledge about the neighboring environment enables the vehicle to detect moving objects, especially irregular events such as jaywalking, sudden lane change of the vehicle etc. to avoid collision. This local situation awareness mostly depends on the advanced sensors (e.g. camera, LIDAR, RADAR) added to the vehicle. The main focus of this work is to formulate a problem of reconstructing the vehicle environment using point cloud data from the LIDAR and RGB color images from the camera. Based on a widely used point cloud registration tool such as iterated closest point (ICP), an expectation-maximization (EM)-ICP technique has been proposed to automatically mosaic multiple point cloud sets into a larger one. Motion trajectories of the moving objects are analyzed to address the issue of irregularity detection. Another contribution of this work is the utilization of fusion of color information (from RGB color images captured by the camera) with the three-dimensional point cloud data for better representation of the environment. For better understanding of the surrounding environment, histogram of oriented gradient (HOG) based techniques are exploited to detect pedestrians and vehicles.;Using both camera and LIDAR, an autonomous vehicle can gather information and reconstruct the map of the surrounding environment up to a certain distance. Capability of communicating and cooperating among vehicles can improve the automated driving decisions by providing extended and more precise view of the surroundings. In this work, a transmission power control algorithm is studied along with the adaptive content control algorithm to achieve a more accurate map of the vehicle environment. To exchange the local sensor data among the vehicles, an adaptive communication scheme is proposed that controls the lengths and the contents of the messages depending on the load of the communication channel. The exchange of this information can extend the tracking region of a vehicle beyond the area sensed by its own sensors. In this experiment, a combined effect of power control, and message length and content control algorithm is exploited to improve the map\u27s accuracy of the surroundings in a cooperative automated vehicle system

    Computer vision in target pursuit using a UAV

    Get PDF
    Research in target pursuit using Unmanned Aerial Vehicle (UAV) has gained attention in recent years, this is primarily due to decrease in cost and increase in demand of small UAVs in many sectors. In computer vision, target pursuit is a complex problem as it involves the solving of many sub-problems which are typically concerned with the detection, tracking and following of the object of interest. At present, the majority of related existing methods are developed using computer simulation with the assumption of ideal environmental factors, while the remaining few practical methods are mainly developed to track and follow simple objects that contain monochromatic colours with very little texture variances. Current research in this topic is lacking of practical vision based approaches. Thus the aim of this research is to fill the gap by developing a real-time algorithm capable of following a person continuously given only a photo input. As this research considers the whole procedure as an autonomous system, therefore the drone is activated automatically upon receiving a photo of a person through Wi-Fi. This means that the whole system can be triggered by simply emailing a single photo from any device anywhere. This is done by first implementing image fetching to automatically connect to WIFI, download the image and decode it. Then, human detection is performed to extract the template from the upper body of the person, the intended target is acquired using both human detection and template matching. Finally, target pursuit is achieved by tracking the template continuously while sending the motion commands to the drone. In the target pursuit system, the detection is mainly accomplished using a proposed human detection method that is capable of detecting, extracting and segmenting the human body figure robustly from the background without prior training. This involves detecting face, head and shoulder separately, mainly using gradient maps. While the tracking is mainly accomplished using a proposed generic and non-learning template matching method, this involves combining intensity template matching with colour histogram model and employing a three-tier system for template management. A flight controller is also developed, it supports three types of controls: keyboard, mouse and text messages. Furthermore, the drone is programmed with three different modes: standby, sentry and search. To improve the detection and tracking of colour objects, this research has also proposed several colour related methods. One of them is a colour model for colour detection which consists of three colour components: hue, purity and brightness. Hue represents the colour angle, purity represents the colourfulness and brightness represents intensity. It can be represented in three different geometric shapes: sphere, hemisphere and cylinder, each of these shapes also contains two variations. Experimental results have shown that the target pursuit algorithm is capable of identifying and following the target person robustly given only a photo input. This can be evidenced by the live tracking and mapping of the intended targets with different clothing in both indoor and outdoor environments. Additionally, the various methods developed in this research could enhance the performance of practical vision based applications especially in detecting and tracking of objects

    Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning

    Get PDF
    The abundant spatial and contextual information provided by the advanced remote sensing technology has facilitated subsequent automatic interpretation of the optical remote sensing images (RSIs). In this paper, a novel and effective geospatial object detection framework is proposed by combining the weakly supervised learning (WSL) and high-level feature learning. First, deep Boltzmann machine is adopted to infer the spatial and structural information encoded in the low-level and middle-level features to effectively describe objects in optical RSIs. Then, a novel WSL approach is presented to object detection where the training sets require only binary labels indicating whether an image contains the target object or not. Based on the learnt high-level features, it jointly integrates saliency, intraclass compactness, and interclass separability in a Bayesian framework to initialize a set of training examples from weakly labeled images and start iterative learning of the object detector. A novel evaluation criterion is also developed to detect model drift and cease the iterative learning. Comprehensive experiments on three optical RSI data sets have demonstrated the efficacy of the proposed approach in benchmarking with several state-of-the-art supervised-learning-based object detection approaches

    Time-slice analysis of dyadic human activity

    Get PDF
    La reconnaissance d’activités humaines à partir de données vidéo est utilisée pour la surveillance ainsi que pour des applications d’interaction homme-machine. Le principal objectif est de classer les vidéos dans l’une des k classes d’actions à partir de vidéos entièrement observées. Cependant, de tout temps, les systèmes intelligents sont améliorés afin de prendre des décisions basées sur des incertitudes et ou des informations incomplètes. Ce besoin nous motive à introduire le problème de l’analyse de l’incertitude associée aux activités humaines et de pouvoir passer à un nouveau niveau de généralité lié aux problèmes d’analyse d’actions. Nous allons également présenter le problème de reconnaissance d’activités par intervalle de temps, qui vise à explorer l’activité humaine dans un intervalle de temps court. Il a été démontré que l’analyse par intervalle de temps est utile pour la caractérisation des mouvements et en général pour l’analyse de contenus vidéo. Ces études nous encouragent à utiliser ces intervalles de temps afin d’analyser l’incertitude associée aux activités humaines. Nous allons détailler à quel degré de certitude chaque activité se produit au cours de la vidéo. Dans cette thèse, l’analyse par intervalle de temps d’activités humaines avec incertitudes sera structurée en 3 parties. i) Nous présentons une nouvelle famille de descripteurs spatiotemporels optimisés pour la prédiction précoce avec annotations d’intervalle de temps. Notre représentation prédictive du point d’intérêt spatiotemporel (Predict-STIP) est basée sur l’idée de la contingence entre intervalles de temps. ii) Nous exploitons des techniques de pointe pour extraire des points d’intérêts afin de représenter ces intervalles de temps. iii) Nous utilisons des relations (uniformes et par paires) basées sur les réseaux neuronaux convolutionnels entre les différentes parties du corps de l’individu dans chaque intervalle de temps. Les relations uniformes enregistrent l’apparence locale de la partie du corps tandis que les relations par paires captent les relations contextuelles locales entre les parties du corps. Nous extrayons les spécificités de chaque image dans l’intervalle de temps et examinons différentes façons de les agréger temporellement afin de générer un descripteur pour tout l’intervalle de temps. En outre, nous créons une nouvelle base de données qui est annotée à de multiples intervalles de temps courts, permettant la modélisation de l’incertitude inhérente à la reconnaissance d’activités par intervalle de temps. Les résultats expérimentaux montrent l’efficience de notre stratégie dans l’analyse des mouvements humains avec incertitude.Recognizing human activities from video data is routinely leveraged for surveillance and human-computer interaction applications. The main focus has been classifying videos into one of k action classes from fully observed videos. However, intelligent systems must to make decisions under uncertainty, and based on incomplete information. This need motivates us to introduce the problem of analysing the uncertainty associated with human activities and move to a new level of generality in the action analysis problem. We also present the problem of time-slice activity recognition which aims to explore human activity at a small temporal granularity. Time-slice recognition is able to infer human behaviours from a short temporal window. It has been shown that temporal slice analysis is helpful for motion characterization and for video content representation in general. These studies motivate us to consider timeslices for analysing the uncertainty associated with human activities. We report to what degree of certainty each activity is occurring throughout the video from definitely not occurring to definitely occurring. In this research, we propose three frameworks for time-slice analysis of dyadic human activity under uncertainty. i) We present a new family of spatio-temporal descriptors which are optimized for early prediction with time-slice action annotations. Our predictive spatiotemporal interest point (Predict-STIP) representation is based on the intuition of temporal contingency between time-slices. ii) we exploit state-of-the art techniques to extract interest points in order to represent time-slices. We also present an accumulative uncertainty to depict the uncertainty associated with partially observed videos for the task of early activity recognition. iii) we use Convolutional Neural Networks-based unary and pairwise relations between human body joints in each time-slice. The unary term captures the local appearance of the joints while the pairwise term captures the local contextual relations between the parts. We extract these features from each frame in a time-slice and examine different temporal aggregations to generate a descriptor for the whole time-slice. Furthermore, we create a novel dataset which is annotated at multiple short temporal windows, allowing the modelling of the inherent uncertainty in time-slice activity recognition. All the three methods have been evaluated on TAP dataset. Experimental results demonstrate the effectiveness of our framework in the analysis of dyadic activities under uncertaint

    15th SC@RUG 2018 proceedings 2017-2018

    Get PDF

    15th SC@RUG 2018 proceedings 2017-2018

    Get PDF

    15th SC@RUG 2018 proceedings 2017-2018

    Get PDF
    • …
    corecore