2,887 research outputs found

    Long Range Automated Persistent Surveillance

    This dissertation addresses long range automated persistent surveillance with a focus on three topics: sensor planning, size preserving tracking, and high magnification imaging. In sensor planning, sufficient overlapping field of view should be reserved so that camera handoff can be executed successfully before the object of interest becomes unidentifiable or untraceable. We design a sensor planning algorithm that not only maximizes coverage but also ensures uniform and sufficient overlap between the cameras’ fields of view for an optimal handoff success rate. This algorithm works for environments with multiple dynamic targets using different types of cameras. Significantly improved handoff success rates are illustrated via experiments using floor plans of various scales. Size preserving tracking automatically adjusts the camera’s zoom for a consistent view of the object of interest. Target scale estimation is carried out based on the paraperspective projection model, which compensates for the center offset and considers system latency and tracking errors. A computationally efficient foreground segmentation strategy, 3D affine shapes, is proposed. The 3D affine shapes feature direct and real-time implementation and improved flexibility in accommodating the target’s 3D motion, including off-plane rotations. The effectiveness of the scale estimation and foreground segmentation algorithms is validated via both offline and real-time tracking of pedestrians at various resolution levels. Face image quality assessment and enhancement compensate for the degradation in face recognition rates caused by high system magnifications and long observation distances. A class of adaptive sharpness measures is proposed to evaluate and predict this degradation. A wavelet based enhancement algorithm with automated frame selection is developed and proves effective, considerably raising the face recognition rate for severely blurred long range face images.
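The size preserving idea in the abstract above can be illustrated with a much simpler model than the one the dissertation uses. The sketch below assumes a plain pinhole camera, where the image size of a target scales linearly with focal length, rather than the paraperspective model with latency and offset compensation. The function name, parameters, and lens limits are hypothetical.

```python
def zoom_update(focal_length, observed_size, target_size, f_min, f_max):
    """Return the focal length that restores the desired image size.

    Assumes a pinhole camera: image size is proportional to focal length,
    so scaling the focal length by target_size / observed_size restores
    the target's apparent size. The result is clamped to the lens limits.
    """
    if observed_size <= 0:
        raise ValueError("observed_size must be positive")
    desired = focal_length * target_size / observed_size
    # Clamp to the zoom range the (hypothetical) lens can actually reach.
    return max(f_min, min(f_max, desired))


if __name__ == "__main__":
    # Target appears at half the desired height, so the focal length doubles.
    print(zoom_update(10.0, 50.0, 100.0, 5.0, 100.0))
```

A real tracker would also have to smooth the observed size over time, since a noisy size estimate would otherwise make the zoom oscillate.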

    A multi-agent architecture based on the BDI model for data fusion in visual sensor networks

    30 pages, 18 figures. Article in press. The newest surveillance applications are attempting more complex tasks, such as the analysis of the behavior of individuals and crowds. These complex tasks may use a distributed visual sensor network in order to gain coverage and exploit the inherent redundancy of the overlapping fields of view. This article presents a multi-agent architecture based on the Belief-Desire-Intention (BDI) model for processing the information and fusing the data in a distributed visual sensor network. Instead of exchanging raw images between the agents involved in the visual network, local signal processing is performed and only the key observed features are shared. After a registration or calibration phase, the proposed architecture performs tracking, data fusion and coordination. Using the proposed multi-agent architecture, we focus on the means of fusing the positions on the ground plane that different agents estimate for the same object. This fusion process is used for two different purposes: (1) to obtain continuity in the tracking across the fields of view of the cameras involved in the distributed network, and (2) to improve the quality of the tracking by means of data fusion techniques and by discarding unreliable sensors. Experimental results on two different scenarios show that the designed architecture can successfully track an object even when occlusions or sensor errors take place. The sensor errors are reduced by exploiting the inherent redundancy of a visual sensor network with overlapping fields of view. This work was partially supported by projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, SINPROB, CAM MADRINET S-0505/TIC/0255 and DPS2008-07029-C02-02.
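As a rough illustration of fusing ground-plane position estimates while discarding unreliable sensors, the sketch below uses inverse-variance weighting with a median-based outlier gate. This is a generic technique chosen for illustration, not the fusion rule of the article; the function name, the per-sensor variance input, and the `max_dist` threshold are all assumptions.

```python
from math import hypot
from statistics import median


def fuse_positions(estimates, max_dist=2.0):
    """Fuse ground-plane estimates given as (x, y, variance) tuples.

    Estimates farther than max_dist from the component-wise median are
    treated as unreliable and dropped; the rest are combined with
    inverse-variance weights. Returns the fused (x, y) and the kept list.
    """
    mx = median(e[0] for e in estimates)
    my = median(e[1] for e in estimates)
    kept = [e for e in estimates if hypot(e[0] - mx, e[1] - my) <= max_dist]
    if not kept:
        raise ValueError("all estimates were rejected as outliers")
    weight_sum = sum(1.0 / v for _, _, v in kept)
    fused_x = sum(x / v for x, _, v in kept) / weight_sum
    fused_y = sum(y / v for _, y, v in kept) / weight_sum
    return (fused_x, fused_y), kept


if __name__ == "__main__":
    # Two cameras agree; the third reports a position far away and is dropped.
    readings = [(1.0, 1.0, 0.1), (1.2, 0.8, 0.1), (9.0, 9.0, 0.1)]
    print(fuse_positions(readings))
```

With equal variances the fused position is the mean of the kept estimates; a sensor with a larger variance contributes proportionally less.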

    Camera Planning and Fusion in a Heterogeneous Camera Network

    Wide-area camera networks are becoming more and more common. They have a wide range of commercial and military applications, from video surveillance to smart homes and from traffic monitoring to anti-terrorism. The design of such a camera network is a challenging problem due to the complexity of the environment, self and mutual occlusion of moving objects, diverse sensor properties and a myriad of performance metrics for different applications. In this dissertation, we consider two such challenges: camera planning and camera fusion. Camera planning is to determine the optimal number and placement of cameras for a target cost function. Camera fusion describes the task of combining images collected by heterogeneous cameras in the network to extract information pertinent to a target application. I tackle the camera planning problem by developing a new unified framework based on binary integer programming (BIP) to relate the network design parameters and the performance goals of a variety of camera network tasks. Most of the BIP formulations are NP-hard problems, and various approximation algorithms have been proposed in the literature. In this dissertation, I develop a comprehensive framework for comparing the entire spectrum of approximation algorithms, from Greedy and Markov Chain Monte Carlo (MCMC) to various relaxation techniques. The key contribution is to provide not only a generic formulation of the camera planning problem but also novel approaches to adapt the formulation to powerful approximation schemes including Simulated Annealing (SA) and Semi-Definite Programming (SDP). The accuracy, efficiency and scalability of each technique are analyzed and compared in depth. Extensive experimental results are provided to illustrate the strengths and weaknesses of each method. The second problem, heterogeneous camera fusion, is a very complex problem. Information can be fused at different levels, from pixel or voxel to semantic objects, with large variation in accuracy, communication and computation costs. My focus is on the geometric transformation of shapes between objects observed at different camera planes. This so-called geometric fusion approach usually provides the most reliable fusion at the expense of high computation and communication costs. To tackle the complexity, a hierarchy of camera models with different levels of complexity is proposed to balance the effectiveness and efficiency of the camera network operation. Then different calibration and registration methods are proposed for each camera model. Finally, I provide two specific examples to demonstrate the effectiveness of the model: 1) a fusion system to improve the segmentation of the human body in a camera network consisting of thermal and regular visible light cameras and 2) a view dependent rendering system that combines the information from depth and regular cameras to collect the scene information and generate new views in real time.
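The greedy end of the approximation spectrum mentioned in the abstract above can be sketched as a maximum-coverage heuristic. This is a minimal illustration under stated assumptions, not the dissertation's BIP formulation: the environment is assumed to be discretized into grid cells, each candidate camera pose is assumed to come with a precomputed set of visible cells, and the names `greedy_camera_selection`, `coverage`, and `budget` are hypothetical.

```python
def greedy_camera_selection(coverage, budget):
    """Pick up to `budget` camera candidates by greedy maximum coverage.

    coverage maps a candidate id to the set of grid cells it can see.
    At each step the candidate that adds the most uncovered cells is
    chosen; ties go to the first candidate in dictionary order.
    """
    chosen, covered = [], set()
    if not coverage:
        return chosen, covered
    for _ in range(budget):
        best = max(coverage, key=lambda c: len(coverage[c] - covered))
        gain = coverage[best] - covered
        if not gain:
            break  # no candidate adds new cells, so stop early
        chosen.append(best)
        covered |= gain
    return chosen, covered


if __name__ == "__main__":
    candidates = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1}}
    print(greedy_camera_selection(candidates, budget=2))
```

The greedy choice is fast but carries no optimality guarantee for the full planning problem, which is one reason to compare it against sampling and relaxation methods.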

    A wireless sensor network-based approach to large-scale dimensional metrology

    In many branches of industry, dimensional measurements have become an important part of the production cycle, in order to check product compliance with specifications. This task is not trivial, especially when dealing with large-scale dimensional measurements: the bigger the measurement dimensions are, the harder it is to achieve high accuracies. Nowadays, the problem can be handled using many metrological systems, based on different technologies (e.g. optical, mechanical, electromagnetic). Each of these systems is more or less adequate, depending upon measuring conditions, user's experience and skill, or other factors such as time, cost, accuracy and portability. This article focuses on a new possible approach to large-scale dimensional metrology based on wireless sensor networks. Advantages and drawbacks of such an approach are analysed and discussed in depth. Then, the article briefly presents a recent prototype system - the Mobile Spatial Coordinate-Measuring System (MScMS-II) - which has been developed at the Industrial Metrology and Quality Laboratory of DISPEA - Politecnico di Torino. The system seems to be suitable for performing dimensional measurements of large-size objects (sizes on the order of several meters). Owing to its distributed nature, the system - based on a wireless network of optical devices - is portable, fully scalable with respect to dimensions and shapes and easily adaptable to different working environments. Preliminary results of experimental tests, aimed at evaluating system performance as well as research perspectives for further improvements, are discussed.

    Evaluating indoor positioning systems in a shopping mall : the lessons learned from the IPIN 2018 competition

    The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics both for real-time (on-site) and post-processing (off-site) situations, in a realistic environment unfamiliar to the prototype developers. For the IPIN 2018 conference, this competition was held on September 22nd, 2018, in Atlantis, a large shopping mall in Nantes (France). Four competition tracks (two on-site and two off-site) were designed. They consisted of several 1 km routes traversing several floors of the mall. Along these paths, 180 points were topographically surveyed with a 10 cm accuracy, to serve as ground truth landmarks, combining theodolite measurements, differential global navigation satellite system (GNSS) and 3D scanner systems. 34 teams effectively competed. The accuracy score corresponds to the third quartile (75th percentile) of an error metric that combines the horizontal positioning error and the floor detection. The best results for the on-site tracks showed an accuracy score of 11.70 m (Track 1) and 5.50 m (Track 2), while the best results for the off-site tracks showed an accuracy score of 0.90 m (Track 3) and 1.30 m (Track 4). These results showed that it is possible to obtain high-accuracy indoor positioning solutions in large, realistic environments using wearable light-weight sensors without deploying any beacon. This paper describes the organization work of the tracks, analyzes the methodology used to quantify the results, reviews the lessons learned from the competition and discusses its future.
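The accuracy score described above, a third quartile over a combined error, can be sketched as follows. The abstract does not say how floor detection enters the metric, so this sketch assumes a fixed penalty in metres per floor of error (`floor_penalty`, set here to 15.0 as an illustrative value) added to the horizontal error, and it assumes linear interpolation between ranks for the percentile. Both choices and all names are assumptions.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def accuracy_score(horizontal_errors, floor_errors, floor_penalty=15.0):
    """Third quartile of horizontal error plus a per-floor penalty.

    horizontal_errors: metres, one value per evaluation point.
    floor_errors: absolute difference between estimated and true floor.
    floor_penalty: assumed metres added per floor of error.
    """
    combined = [h + floor_penalty * f
                for h, f in zip(horizontal_errors, floor_errors)]
    return percentile(combined, 75)


if __name__ == "__main__":
    # Five evaluation points; the last one is placed on the wrong floor.
    print(accuracy_score([1.0, 2.0, 3.0, 4.0, 5.0], [0, 0, 0, 0, 1]))
```

Using the third quartile rather than the mean keeps a few gross failures from dominating the score while still penalizing systems that fail on a quarter of the points.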

    Contextual Human Trajectory Forecasting within Indoor Environments and Its Applications

    A human trajectory is the likely path a human subject would take to get to a destination. Human trajectory forecasting algorithms try to estimate or predict this path. Such algorithms have wide applications in robotics, computer vision and video surveillance. Understanding human behavior can provide useful information towards the design of these algorithms. Human trajectory forecasting is an interesting problem because the outcome is influenced by many factors, of which we believe that the destination, the geometry of the environment, and the humans in it play a significant role. In addressing this problem, we propose a model to estimate the occupancy behavior of humans based on the geometry and behavioral norms. We also develop a trajectory forecasting algorithm that understands this occupancy and leverages it for trajectory forecasting in previously unseen geometries. The algorithm can be useful in a variety of applications. In this work, we show its utility in three applications, namely person re-identification, camera placement optimization, and human tracking. Experiments were performed with real world data and compared to state-of-the-art methods to assess the quality of the forecasting algorithm and the enhancement in the quality of the applications. The results suggest a significant enhancement in the accuracy of trajectory forecasting and of the computer vision applications. Computer Science, Department of

    SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

    Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation as well as accurate and real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture. Comment: IROS camera-ready

    Deployment, Coverage And Network Optimization In Wireless Video Sensor Networks For 3D Indoor Monitoring

    As a result of extensive research over the past decade or so, wireless sensor networks (WSNs) have evolved into a well established technology for industry, environmental and medical applications. However, traditional WSNs employ such sensors as thermal or photo light resistors that are often modeled with simple omni-directional sensing ranges, which focus only on scalar data within the sensing environment. In contrast, the sensing range of a wireless video sensor is directional and capable of providing more detailed video information about the sensing field. Additionally, with the introduction of modern features in non-fixed focus cameras such as pan, tilt and zoom (PTZ), the sensing range of a video sensor can be further regarded as fan-shaped in 2D and pyramid-shaped in 3D. These unique properties of wireless video sensors, together with the deployment restrictions of indoor monitoring, make the traditional sensor coverage, deployment and networking solutions built on 2D sensing models for WSNs ineffective and inapplicable to wireless video sensor network (WVSN) problems in 3D indoor space, thus calling for novel solutions. In this dissertation, we propose optimization techniques and develop solutions that address the coverage, deployment and network issues associated with wireless video sensor networks in a 3D indoor environment. We first model the general problem in a continuous 3D space to minimize the total number of video sensors required to monitor a given 3D indoor region. We then convert it into a discrete version of the problem by incorporating 3D grids, which can achieve arbitrary approximation precision by adjusting the grid granularity. Due in part to the uniqueness of the visual sensor's directional sensing range, we propose to exploit the directional feature to determine the optimal angular coverage of each deployed visual sensor. Thus, we propose to deploy the visual sensors from divergent directional angles and further extend k-coverage to "k-angular-coverage", while ensuring connectivity within the network. We then propose a series of mechanisms to handle obstacles in the 3D environment. We develop efficient greedy heuristic solutions that integrate all of these considerations one by one and can yield high quality results. Based on this, we also propose enhanced depth first search (DFS) algorithms that can not only further improve the solution quality, but also return optimal results if given enough time. Our extensive simulations demonstrate the superiority of both our greedy heuristic and enhanced DFS solutions. Finally, this dissertation discusses some future research directions such as in-network traffic routing and scheduling issues.
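The fan-shaped sensing range and the grid discretization described above can be illustrated in a simplified 2D setting. The sketch below checks whether a point falls inside a sensor's sector and then measures what fraction of grid points are seen by at least k sensors. It ignores obstacles, connectivity, the 3D pyramid model, and the angular-diversity requirement of k-angular-coverage; the tuple layout of a sensor and all function names are assumptions.

```python
from math import atan2, hypot, pi


def in_sector(sensor, point):
    """True if point lies in the sensor's fan-shaped range.

    sensor is (x, y, heading, fov, sensing_range) with angles in radians;
    heading is the direction of the sector's centre line.
    """
    sx, sy, heading, fov, sensing_range = sensor
    dx, dy = point[0] - sx, point[1] - sy
    dist = hypot(dx, dy)
    if dist > sensing_range:
        return False
    if dist == 0:
        return True
    # Wrap the angular difference into [-pi, pi) before comparing.
    diff = (atan2(dy, dx) - heading + pi) % (2 * pi) - pi
    return abs(diff) <= fov / 2


def k_covered_fraction(sensors, grid_points, k):
    """Fraction of grid points seen by at least k sensors."""
    hits = sum(1 for p in grid_points
               if sum(in_sector(s, p) for s in sensors) >= k)
    return hits / len(grid_points)


if __name__ == "__main__":
    sensors = [(0.0, 0.0, 0.0, pi / 2, 5.0), (6.0, 0.0, pi, pi / 2, 5.0)]
    grid = [(3.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
    print(k_covered_fraction(sensors, grid, k=2))
```

Refining the grid makes the discrete coverage estimate approach the continuous one, at the cost of more point-in-sector tests per candidate deployment.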

    Learning to Prevent Monocular SLAM Failure using Reinforcement Learning

    Monocular SLAM refers to using a single camera to estimate robot ego motion while building a map of the environment. While Monocular SLAM is a well-studied problem, automating Monocular SLAM by integrating it with trajectory planning frameworks is particularly challenging. This paper presents a novel formulation based on Reinforcement Learning (RL) that generates fail-safe trajectories wherein the SLAM generated outputs do not deviate largely from their true values. In essence, the RL framework successfully learns the otherwise complex relation between perceptual inputs and motor actions and uses this knowledge to generate trajectories that do not cause failure of SLAM. We show systematically in simulations how the quality of the SLAM dramatically improves when trajectories are computed using RL. Our method scales effectively across Monocular SLAM frameworks both in simulation and in real world experiments with a mobile robot. Comment: Accepted at the 11th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 2018. More info can be found at the project page at https://robotics.iiit.ac.in/people/vignesh.prasad/SLAMSafePlanner.html and the supplementary video can be found at https://www.youtube.com/watch?v=420QmM_Z8v

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined, with many moving obstacles. There are many solutions for localization in GNSS-denied environments, and many different technologies are used. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for the sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map; thus, the amount of data and the amount of time spent collecting data are reduced, as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to address the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of navigation for UAVs in structured environments. Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small simple objects in small environments. An assessment was performed in tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker called ViSP (Visual Servoing Platform) was applied in tracking outdoor and indoor buildings using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost, and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations around 10 metres. These errors were considered to be large, considering that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation is to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved, while reducing the number of tracking losses throughout the image sequence. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAM's loop-closing procedure to improve runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracies relative to ViSP. The novelty of this algorithm is the implementation of an efficient matching process that identifies corresponding linear features from the UAV's RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and improved from an unattainable result using ViSP to 2 cm positional accuracies in large indoor environments. The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame. Initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to accomplish a registration procedure of the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved and that a proposed enhancement of conventional geometric hashing produced more correct matches: 75% of the correct matches were identified, compared to 11%. Furthermore, the number of incorrect matches was reduced by 80%.