
    Action recognition using single-pixel time-of-flight detection

    Action recognition is a challenging task that plays an important role in many robotic systems, which depend heavily on visual input feeds. However, due to privacy concerns, it is important to find methods that can recognise actions without using a visual feed. In this paper, we propose a concept for detecting actions while preserving the test subject's privacy. Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene. The data trace recorded for one action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at a 1 GHz repetition rate. Information about both the distance to the object and its shape is embedded in the traces. We apply machine learning in the form of recurrent neural networks for data analysis and demonstrate successful action recognition. The experimental results show that our proposed method achieves, on average, 96.47% accuracy on the actions walking forward, walking backwards, sitting down, standing up and waving a hand.
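    As a rough illustration of this kind of pipeline (not the authors' code), the sketch below classifies a recording, treated as a sequence of 1-D voltage traces, with an LSTM. PyTorch, the trace length of 256 samples and the hidden size are assumptions for illustration; only the five action classes come from the abstract.

    import torch
    import torch.nn as nn

    class TraceClassifier(nn.Module):
        def __init__(self, trace_len=256, hidden=128, n_actions=5):
            super().__init__()
            # Each time step of the input sequence is one full voltage trace.
            self.rnn = nn.LSTM(input_size=trace_len, hidden_size=hidden,
                               batch_first=True)
            self.head = nn.Linear(hidden, n_actions)

        def forward(self, x):            # x: (batch, n_traces, trace_len)
            _, (h, _) = self.rnn(x)      # h: (1, batch, hidden)
            return self.head(h[-1])      # logits over the action classes

    logits = TraceClassifier()(torch.randn(8, 100, 256))  # 8 example recordings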

    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still presents many challenges due to the need to perform several tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., the disappearance of persons) that may have occurred in the mosaic using an algorithm based on histogram equalization and RGB Local Binary Patterns (RGB-LBP); if changes are found, the mosaic is updated. The second mode performs real-time classification, again using our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real-time and performs the mosaicking and change-detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. Evaluation by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as change detection and object detection.
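    As a rough sketch of the change-detection idea (the LBP parameters and the distance threshold are illustrative assumptions, not the thesis' values), the snippet below compares histogram-equalised RGB-LBP descriptors of the same patch in the stored mosaic and in a new frame, assuming OpenCV and scikit-image:

    import cv2
    import numpy as np
    from skimage.feature import local_binary_pattern

    def rgb_lbp_hist(patch, P=8, R=1.0):
        feats = []
        for c in range(3):                      # one LBP map per RGB channel
            chan = cv2.equalizeHist(patch[:, :, c])
            lbp = local_binary_pattern(chan, P, R, method="uniform")
            hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2),
                                   density=True)
            feats.append(hist)
        return np.concatenate(feats)

    def changed(mosaic_patch, frame_patch, thresh=0.25):
        d = np.linalg.norm(rgb_lbp_hist(mosaic_patch)
                           - rgb_lbp_hist(frame_patch), ord=1)
        return d > thresh                        # flag patch for mosaic update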

    A New Vehicle Localization Scheme Based on Combined Optical Camera Communication and Photogrammetry

    The demand for autonomous vehicles is increasing gradually owing to their enormous potential benefits. However, several challenges, such as vehicle localization, are involved in the development of autonomous vehicles. A simple and secure algorithm for vehicle positioning is proposed herein that does not require massive modification of the existing transportation infrastructure. For vehicle localization, vehicles on the road are classified into two categories: host vehicles (HVs), which estimate other vehicles' positions, and forwarding vehicles (FVs), which move in front of the HVs. The FV transmits modulated data from its tail (or back) light, and the camera of the HV receives that signal using optical camera communication (OCC). In addition, streetlight (SL) data are used to ensure the position accuracy of the HV. Determining the HV position minimizes the relative position variation between the HV and the FV. Using photogrammetry, the distance between the FV or SL and the camera of the HV is calculated by measuring the area the object occupies on the image sensor. By comparing the change in distance between the HV and SLs with the change in distance between the HV and the FV, the positions of the FVs are determined. The performance of the proposed technique is analyzed, and the results indicate a significant improvement. Experimental distance measurements validated the feasibility of the proposed scheme.
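    The photogrammetric range step can be illustrated with a pinhole-camera sketch: the projected area of an object scales as (f/d)^2, so d = f * sqrt(A_real / A_image). The taillight area and focal length below are assumed example values, not figures from the paper.

    import math

    def distance_from_area(focal_px: float, real_area_m2: float,
                           image_area_px2: float) -> float:
        # Pinhole model: projected area scales as (f / d)^2,
        # hence d = f * sqrt(A_real / A_image).
        return focal_px * math.sqrt(real_area_m2 / image_area_px2)

    # e.g. a 0.02 m^2 taillight covering 450 px^2 with a 1200 px focal length
    print(distance_from_area(1200.0, 0.02, 450.0))  # ≈ 8.0 m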

    RGBD Datasets: Past, Present and Future

    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes. Comment: 8 pages excluding references (CVPR style).

    Bio-inspired vision-based leader-follower formation flying in the presence of delays

    Flocking starlings at dusk are known for the mesmerizing and intricate shapes they generate, as well as for how fluidly these shapes change. They seem to do this effortlessly. Real-life vision-based flocking has not been achieved in micro-UAVs (micro Unmanned Aerial Vehicles) to date. Towards this goal, we make three contributions in this paper: (i) we use a computational approach to develop a bio-inspired architecture for vision-based Leader-Follower formation flying on two micro-UAVs. We believe that the minimal computational cost of the resulting algorithm makes it suitable for object detection and tracking during high-speed flocking; (ii) we show that, provided delays in the control loop of a micro-UAV are below a critical value, Kalman filter-based estimation algorithms are not required to achieve Leader-Follower formation flying; (iii) unlike previous approaches, we do not use external observers, such as GPS signals or synchronized communication with flock members. These three contributions could be useful in achieving vision-based flocking in GPS-denied environments on computationally limited agents.
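    A minimal sketch of the Leader-Follower idea under these constraints (the gains and image geometry are illustrative assumptions, not the paper's controller): the follower steers from the leader's pixel offset and regulates range from the leader's apparent size, with no GPS, no inter-agent communication and no Kalman filter.

    def follower_command(cx, cy, size, img_w=640, img_h=480,
                         target_size=80.0, k_yaw=0.002, k_alt=0.002,
                         k_fwd=0.01):
        yaw_rate = k_yaw * (cx - img_w / 2)      # centre leader horizontally
        climb    = -k_alt * (cy - img_h / 2)     # centre leader vertically
        forward  = k_fwd * (target_size - size)  # hold apparent size constant
        return forward, climb, yaw_rate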

    Cross-calibration of Time-of-flight and Colour Cameras

    Time-of-flight cameras provide depth information, which is complementary to the photometric appearance of the scene in ordinary images. It is desirable to merge the depth and colour information, in order to obtain a coherent scene representation. However, the individual cameras will have different viewpoints, resolutions and fields of view, which means that they must be mutually calibrated. This paper presents a geometric framework for this multi-view and multi-modal calibration problem. It is shown that three-dimensional projective transformations can be used to align depth and parallax-based representations of the scene, with or without Euclidean reconstruction. A new evaluation procedure is also developed; this allows the reprojection error to be decomposed into calibration and sensor-dependent components. The complete approach is demonstrated on a network of three time-of-flight and six colour cameras. The applications of such a system, to a range of automatic scene-interpretation problems, are discussed. Comment: 18 pages, 12 figures, 3 tables.
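    The core alignment operation can be sketched as follows, assuming the 4x4 projective transformation has already been estimated; this illustrates applying such a transform to homogeneous 3-D points, and is not the paper's calibration code.

    import numpy as np

    def apply_projective_3d(H, pts):        # H: (4, 4), pts: (N, 3)
        homog = np.hstack([pts, np.ones((len(pts), 1))])
        mapped = homog @ H.T                # transform in homogeneous coords
        return mapped[:, :3] / mapped[:, 3:4]   # de-homogenise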

    Automatic Detection of Calibration Grids in Time-of-Flight Images

    It is convenient to calibrate time-of-flight cameras by established methods, using images of a chequerboard pattern. The low resolution of the amplitude image, however, makes it difficult to detect the board reliably. Heuristic detection methods, based on connected image-components, perform very poorly on this data. An alternative, geometrically-principled method is introduced here, based on the Hough transform. The projection of a chequerboard is represented by two pencils of lines, which are identified as oriented clusters in the gradient-data of the image. A projective Hough transform is applied to each of the two clusters, in axis-aligned coordinates. The range of each transform is properly bounded, because the corresponding gradient vectors are approximately parallel. Each of the two transforms contains a series of collinear peaks, one for every line in the given pencil. This pattern is easily detected by sweeping a dual line through the transform. The proposed Hough-based method is compared to the standard OpenCV detection routine, by application to several hundred time-of-flight images. It is shown that the new method detects significantly more calibration boards, over a greater variety of poses, without any overall loss of accuracy. This conclusion is based on an analysis of both geometric and photometric error. Comment: 11 pages, 11 figures, 1 table.
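    A minimal sketch of the first stage described above, assuming OpenCV: strong gradients of the amplitude image are split into two roughly orthogonal orientation clusters, one per pencil of board lines. The projective Hough voting itself is omitted, and the magnitude threshold is an illustrative assumption.

    import cv2
    import numpy as np

    def pencil_masks(amplitude):                 # 8-bit amplitude image
        gx = cv2.Sobel(amplitude, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(amplitude, cv2.CV_32F, 0, 1)
        mag = cv2.magnitude(gx, gy)
        theta = np.arctan2(gy, gx) % np.pi       # line orientation mod pi
        strong = mag > 0.5 * mag.max()
        # Gradients within one pencil are approximately parallel, so a
        # simple split of orientation into two half-ranges separates them.
        near_vertical = np.abs(theta - np.pi / 2) < np.pi / 4
        return strong & near_vertical, strong & ~near_vertical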

    Improved depth recovery in consumer depth cameras via disparity space fusion within cross-spectral stereo

    We address the issue of improving depth coverage in consumer depth cameras based on the combined use of cross-spectral stereo and near-infrared structured-light sensing. Specifically, we show that fusing disparity over these modalities, within the disparity space image, prior to disparity optimization facilitates the recovery of scene depth information in regions where structured-light sensing fails. We show that this joint approach, leveraging disparity information from both structured light and cross-spectral sensing, facilitates the recovery of global scene depth comprising both texture-less object depth, where conventional stereo otherwise fails, and highly reflective object depth, where structured light (and similar) active sensing commonly fails. The proposed solution is illustrated using dense gradient feature matching and shown to outperform prior approaches that use late-stage fused cross-spectral stereo depth as a facet of improved sensing for consumer depth cameras.
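    A minimal sketch of fusion in the disparity space image, assuming NumPy cost volumes of shape (H, W, D) with NaN marking sensing failures; the weighting and the winner-takes-all readout are illustrative simplifications, not the paper's optimisation.

    import numpy as np

    def fuse_and_solve(cost_stereo, cost_sl, w=0.5):
        # Merge the two modalities per pixel and disparity, falling back
        # to cross-spectral stereo where structured light has no estimate.
        fused = np.where(np.isnan(cost_sl), cost_stereo,
                         w * cost_stereo + (1 - w) * cost_sl)
        fused = np.nan_to_num(fused, nan=np.inf)  # pixels neither sensor covers
        return np.argmin(fused, axis=2)           # per-pixel disparity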