902 research outputs found

    PeopleNet: A Novel People Counting Framework for Head-Mounted Moving Camera Videos

    Traditional crowd-counting techniques based on optical flow or feature matching have largely been superseded by deep learning (DL) models because they lack automatic feature extraction and yield low-precision results. Most DL models, however, have been evaluated on surveillance crowd datasets captured with stationary cameras. Counting people in videos shot with a head-mounted moving camera is considerably harder, mainly because the temporal information of the moving crowd is mixed with the induced camera motion. This study proposes a transfer-learning-based PeopleNet model to tackle this problem. We modify the standard VGG16 network by disabling its top convolutional blocks and replacing its standard fully connected layers with new fully connected and dense layers. The strong transfer-learning capability of the VGG16 backbone allows PeopleNet to produce high-quality density maps and, in turn, highly accurate crowd estimates. The proposed model is evaluated on a self-generated image database prepared from moving-camera video clips, as no public benchmark dataset exists for this task. The framework gives promising results across crowd categories such as dense, sparse, and average. To assess versatility, we perform self- and cross-evaluation against other crowd-counting models and datasets, which demonstrates the value of the PeopleNet model for public-safety and defense applications.
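
    A minimal sketch of the kind of VGG16 transfer-learning regressor described above, assuming a Keras/TensorFlow setup; the layer sizes, the choice to freeze the pretrained blocks, and the single-count output head are illustrative assumptions, not the authors' exact PeopleNet architecture.

```python
# Illustrative VGG16 transfer-learning head for crowd counting
# (layer sizes and frozen blocks are assumptions, not the published model).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_peoplenet_like(input_shape=(224, 224, 3)):
    # Pretrained VGG16 backbone without its original classifier head.
    backbone = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    for layer in backbone.layers:
        layer.trainable = False      # keep pretrained convolutional features fixed

    x = layers.Flatten()(backbone.output)
    # New fully connected / dense layers replacing VGG16's standard head.
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(128, activation="relu")(x)
    # Single regression output: estimated head count. A density-map variant
    # would instead end in a 1x1 convolution over the backbone feature map.
    count = layers.Dense(1, activation="linear")(x)

    model = models.Model(inputs=backbone.input, outputs=count)
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

model = build_peoplenet_like()
model.summary()
```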

    Proceedings of the Augmented VIsual Display (AVID) Research Workshop

    These papers, abstracts, and presentations were delivered at a three-day workshop focused on sensor modeling and simulation and on image enhancement, processing, and fusion. The technical sessions emphasized how sensor technology can be used to create visual imagery adequate for aircraft control and operations. Participants from industry, government, and academic laboratories contributed to panels on Sensor Systems, Sensor Modeling, Sensor Fusion, Image Processing (Computer and Human Vision), and Image Evaluation and Metrics.

    Expansion-based passive ranging

    A new passive-ranging technique is described that is based on utilizing the image-plane expansion experienced by every object as its distance from the sensor decreases. This technique belongs to the feature/object-based family. The motion and shape of a small window, assumed to be fully contained inside the boundaries of some object, is approximated by an affine transformation. The parameters of the transformation matrix are derived by initially comparing successive images and then progressively increasing the image time separation so as to achieve a much larger triangulation baseline than is currently possible. Depth is derived directly from the expansion part of the transformation. To a first approximation, image-plane expansion is independent of image-plane location with respect to the focus of expansion (FOE) and of platform maneuvers. Thus, an expansion-based method has the potential of providing a reliable range in the difficult image area around the FOE. In areas far from the FOE, the shift parameters of the affine transformation can provide more accurate depth information than the expansion alone, and can thus be used much as they have been used in conjunction with the Inertial Navigation Unit (INU) and Kalman filtering. However, the performance of a shift-based algorithm, when the shifts are derived from the affine transformation, would be much improved compared to current algorithms because the shifts, as well as the other parameters, can be obtained between widely separated images. Thus, the main advantage of this new approach is that allowing the tracked window to expand and rotate, in addition to moving laterally, enables one to correlate images over a very long time span, which in turn translates into a large spatial baseline, resulting in a proportionately higher depth accuracy.
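
    A minimal numerical sketch of the depth-from-expansion idea described above: given an affine transform estimated between two widely separated frames (constructed synthetically here), the isotropic expansion factor is recovered from the linear part of the transform, and depth follows from the platform's closing speed. The affine-estimation step, the closing speed, and the frame separation are illustrative assumptions, not the paper's implementation.

```python
# Sketch: depth and time-to-contact from the expansion part of an affine
# transform between two frames (illustrative; the affine matrix would in
# practice be estimated by tracking a small window across frames).
import numpy as np

def expansion_factor(affine_2x3):
    """Isotropic expansion encoded in the linear part of a 2x3 affine matrix."""
    A = np.asarray(affine_2x3, dtype=float)[:, :2]
    # The square root of the determinant gives the average linear scale change.
    return float(np.sqrt(np.linalg.det(A)))

def depth_from_expansion(scale, closing_speed, dt):
    """Range to the object at the time of the first frame.

    scale: image expansion between frames (size2 / size1 = Z1 / Z2)
    closing_speed: platform speed toward the object [m/s] (assumed known)
    dt: time separation between the two frames [s]
    """
    return closing_speed * dt * scale / (scale - 1.0)

def time_to_contact(scale, dt):
    """Time to contact at the first frame, independent of speed and range."""
    return dt * scale / (scale - 1.0)

# Synthetic example: object at 100 m, closing speed 20 m/s, frames 1 s apart.
Z1, v, dt = 100.0, 20.0, 1.0
true_scale = Z1 / (Z1 - v * dt)                 # 100/80 = 1.25
affine = np.array([[true_scale, 0.0, 2.0],      # expansion plus a small shift
                   [0.0, true_scale, -1.0]])
s = expansion_factor(affine)
print(depth_from_expansion(s, v, dt))  # ~100.0 m
print(time_to_contact(s, dt))          # ~5.0 s
```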

    Visual-Inertial Sensor Fusion Models and Algorithms for Context-Aware Indoor Navigation

    Positioning in navigation systems is predominantly performed by Global Navigation Satellite Systems (GNSSs). However, while GNSS-enabled devices have become commonplace for outdoor navigation, their use indoors is hindered by GNSS signal degradation or blockage. The development of alternative positioning approaches and techniques for navigation systems is therefore an ongoing research topic. In this dissertation, I present a new approach that addresses three major navigational problems: indoor positioning, obstacle detection, and keyframe detection. The proposed approach utilizes the inertial and visual sensors available on smartphones and focuses on developing: a framework for monocular visual-inertial odometry (VIO) that positions a person or object using sensor fusion and deep learning in tandem; an unsupervised algorithm that detects obstacles from a sequence of visual data; and a supervised, context-aware keyframe detector. The underlying technique for monocular VIO is a recurrent convolutional neural network that computes six-degree-of-freedom (6DoF) pose in an end-to-end fashion, together with an extended Kalman filter module that fine-tunes the scale parameter based on inertial observations and manages errors. I compare the results of my featureless technique with those of conventional feature-based VIO techniques and with manually scaled results. The comparison shows that while the framework improves on other featureless methods in accuracy, feature-based methods still outperform the proposed approach. The obstacle-detection approach processes two consecutive images to detect obstacles. Experiments comparing it with two other widely used algorithms show that it performs better, achieving 82% precision compared with 69%. To determine a suitable frame-extraction rate from the video stream, I analyze the camera's movement patterns and infer the user's context to build a model that associates movement anomalies with an appropriate frame-extraction rate. The output of this model is used to set the keyframe-extraction rate in visual odometry (VO). I define and compute the effective frames for VO and use this approach for context-aware keyframe detection. The results show that using inertial data to infer suitable frames reduces the number of frames required.
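
    A minimal sketch of the kind of scale correction described above, in which a Kalman filter fuses the up-to-scale translation from a monocular VO/VIO network with a metric displacement derived from inertial data. The constant-scale process model, the noise values, and the way the inertial displacement is obtained are illustrative assumptions, not the dissertation's implementation.

```python
# Sketch: scalar Kalman filter estimating the metric scale of monocular VO
# translations from inertially derived displacements (illustrative only;
# process/measurement noise values and the IMU integration are assumptions).
import numpy as np

class ScaleFilter:
    def __init__(self, s0=1.0, p0=1.0, q=1e-4, r=0.05):
        self.s = s0   # current scale estimate
        self.p = p0   # estimate variance
        self.q = q    # process noise (scale drifts slowly)
        self.r = r    # measurement noise of the inertial displacement

    def update(self, vo_translation, imu_displacement):
        """vo_translation: up-to-scale frame-to-frame translation (3-vector).
        imu_displacement: metric displacement over the same interval [m]."""
        # Predict: scale assumed (nearly) constant between frames.
        self.p += self.q
        # Measurement model: imu_displacement ~= s * ||vo_translation||.
        h = float(np.linalg.norm(vo_translation))
        if h < 1e-9:
            return self.s                      # no motion, nothing to update
        innovation = imu_displacement - self.s * h
        s_var = h * self.p * h + self.r        # innovation variance
        k = self.p * h / s_var                 # Kalman gain
        self.s += k * innovation
        self.p *= (1.0 - k * h)
        return self.s

# Usage: true scale 2.5, noisy inertial displacements.
rng = np.random.default_rng(0)
f = ScaleFilter()
for _ in range(200):
    t_vo = rng.normal(size=3) * 0.1            # up-to-scale VO translation
    d_imu = 2.5 * np.linalg.norm(t_vo) + rng.normal(scale=0.02)
    f.update(t_vo, d_imu)
print(f.s)  # converges near 2.5
```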

    Bioinspired Autonomous Visual Vertical Control of a Quadrotor Unmanned Aerial Vehicle

    Near-ground maneuvers, such as hover, approach, and landing, are key elements of autonomy in unmanned aerial vehicles. Such maneuvers have conventionally been tackled by measuring or estimating the velocity and the height above the ground, often using ultrasonic or laser range finders. Near-ground maneuvers are naturally mastered by flying birds and insects because objects below may be of interest for food or shelter. These animals perform such maneuvers efficiently using only the available visual and vestibular sensory information. In this paper, the time-to-contact (tau) theory, which conceptualizes the visual strategy with which many species are believed to approach objects, is presented as a solution for relative ground-distance control in unmanned aerial vehicles. The paper shows how such an approach can be visually guided without knowledge of height and velocity relative to the ground. A control scheme that implements the tau strategy is developed using only visual information from a monocular camera and an inertial measurement unit. To achieve reliable visual information at a high rate, a novel filtering system is proposed to complement the control system. The proposed system is implemented onboard an experimental quadrotor unmanned aerial vehicle and is shown not only to approach and land on the ground successfully, but also to let the user choose the dynamic characteristics of the approach. The methods presented in this paper are applicable to both aerial and space autonomous vehicles.
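
    A minimal simulation sketch of the constant tau-dot strategy that tau theory prescribes: with tau defined as the ratio of the height to its closing rate, commanding an acceleration that holds tau-dot at a constant k in (0, 0.5] closes the gap with vanishing speed at contact. The gain, initial conditions, and time step below are illustrative assumptions, not the paper's controller or filtering system.

```python
# Sketch: vertical landing by regulating tau-dot (d/dt of height / descent rate)
# to a constant k; illustrative simulation, not the paper's control scheme.

def simulate_tau_landing(z0=10.0, v0=-1.0, k=0.4, dt=0.01, t_max=120.0):
    """z0: initial height [m]; v0: vertical speed [m/s] (negative = descending).
    Holding tau_dot = k with 0 < k <= 0.5 yields a soft touchdown."""
    z, v, t = z0, v0, 0.0
    while z > 0.01 and t < t_max:
        # tau = z / v; setting tau_dot = k gives the commanded acceleration:
        #   tau_dot = 1 - z*a/v**2 = k  =>  a = (1 - k) * v**2 / z
        a = (1.0 - k) * v * v / z
        v += a * dt
        z += v * dt
        t += dt
    return t, z, v

t_land, z_final, v_final = simulate_tau_landing()
print(f"touchdown after {t_land:.1f} s, residual speed {abs(v_final):.3f} m/s")
```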

    Receiver algorithms that enable multi-mode baseband terminals


    Fusion-layer-based machine vision for intelligent transportation systems

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 307-317). Environment-understanding technology is vital for intelligent vehicles, which are expected to respond automatically to fast-changing environments and dangerous situations. To obtain such perception abilities, static and dynamic obstacles must be detected automatically, along with related information such as location, speed, collision/occlusion possibility, and other current and historical dynamic information. Conventional methods detect each piece of information independently, which is normally noisy and not very reliable. Instead, we propose a fusion-based and layer-based information-retrieval methodology to systematically detect obstacles and obtain their location and timing information from visible and infrared sequences. The proposed obstacle-detection methodologies take advantage of the connections between different pieces of information and increase the accuracy of obstacle-information estimation, thus improving environment-understanding abilities and driving safety. By Yajun Fang. Ph.D.

    UAV or Drones for Remote Sensing Applications in GPS/GNSS Enabled and GPS/GNSS Denied Environments

    The design of novel UAV systems and the use of UAV platforms integrated with robotic sensing and imaging techniques, as well as the development of processing workflows and the capacity for ultra-high temporal and spatial resolution data, have enabled a rapid uptake of UAVs and drones across several industries and application domains. This book provides a forum for high-quality peer-reviewed papers that broaden awareness and understanding of single- and multiple-UAV developments for remote sensing applications, and associated developments in sensor technology, data processing and communications, and UAV system design and sensing capabilities in GPS-enabled and, more broadly, Global Navigation Satellite System (GNSS)-enabled and GPS/GNSS-denied environments. Contributions include: UAV-based photogrammetry, laser scanning, multispectral imaging, hyperspectral imaging, and thermal imaging; UAV sensor applications; spatial ecology; pest detection; reef; forestry; volcanology; precision agriculture; wildlife species tracking; search and rescue; target tracking; atmosphere monitoring; chemical, biological, and natural disaster phenomena; fire prevention; flood prevention; volcanic monitoring; pollution monitoring; microclimates; and land use; wildlife and target detection and recognition from UAV imagery using deep learning and machine learning techniques; and UAV-based change detection.