Action recognition using single-pixel time-of-flight detection
Action recognition is a challenging task that plays an important role in many robotic systems, which depend heavily on visual input feeds. However, due to privacy concerns, it is important to find a method that can recognise actions without using a visual feed. In this paper, we propose a concept for detecting actions while preserving the test subject's privacy. Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene. The data trace recorded for one action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at a 1 GHz repetition rate. Information about both the distance to the object and its shape is embedded in the traces. We apply machine learning in the form of recurrent neural networks for data analysis and demonstrate successful action recognition. The experimental results show that our proposed method can achieve on average 96.47% accuracy on the actions walking forward, walking backwards, sitting down, standing up and waving hand, using a recurrent neural network.
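The abstract notes that the distance to the object is embedded in each back-scattered trace. A minimal sketch of that time-of-flight relationship follows. It assumes, purely for illustration, that successive samples are 1 ns apart and that the pulse leaves the emitter at sample 0. The names `peak_index`, `distance_from_trace`, `sample_period_s` and `emit_index` are hypothetical, and simple peak picking is not the paper's method, which feeds the traces to a recurrent neural network.

```python
# Speed of light in vacuum, in metres per second.
C = 299_792_458.0


def peak_index(trace):
    """Return the index of the largest sample in a 1-D voltage trace."""
    return max(range(len(trace)), key=lambda i: trace[i])


def distance_from_trace(trace, sample_period_s=1e-9, emit_index=0):
    """Estimate the target distance from the round-trip delay of the echo peak.

    sample_period_s and emit_index are illustrative assumptions, not values
    taken from the paper.
    """
    delay_s = (peak_index(trace) - emit_index) * sample_period_s
    # The pulse travels to the target and back, so halve the path length.
    return C * delay_s / 2.0


# Synthetic trace: a single echo peak 20 samples after emission.
trace = [0.0] * 64
trace[20] = 1.0
estimated_distance_m = distance_from_trace(trace)
```

With a 1 ns sample spacing, each sample of delay corresponds to roughly 15 cm of range, which gives a sense of the distance resolution such a trace could carry.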
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, improvements to small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still poses many challenges because several tasks must be performed in real time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic using an algorithm based on histogram equalization and RGB-Local Binary Pattern (RGB-LBP). If changes are present, the mosaic is updated. The second mode performs real-time classification, again using our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to different design features, the system works in real time and performs mosaicking and change detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection.
A New Vehicle Localization Scheme Based on Combined Optical Camera Communication and Photogrammetry
The demand for autonomous vehicles is increasing gradually owing to their
enormous potential benefits. However, several challenges, such as vehicle
localization, are involved in the development of autonomous vehicles. A simple
and secure algorithm for vehicle positioning is proposed herein without
massively modifying the existing transportation infrastructure. For vehicle
localization, vehicles on the road are classified into two categories: host
vehicles (HVs) are the ones used to estimate other vehicles' positions and
forwarding vehicles (FVs) are the ones that move in front of the HVs. The FV
transmits modulated data from the tail (or back) light, and the camera of the
HV receives that signal using optical camera communication (OCC). In addition,
the streetlight (SL) data are considered to ensure the position accuracy of the
HV. Determining the HV position minimizes the relative position variation
between the HV and FV. Using photogrammetry, the distance between FV or SL and
the camera of the HV is calculated by measuring the occupied image area on the
image sensor. Comparing the change in distance between HV and SLs with the
change in distance between HV and FV, the positions of FVs are determined. The
performance of the proposed technique is analyzed, and the results indicate a
significant improvement in performance. The experimental distance measurement
validated the feasibility of the proposed scheme.
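The abstract says the distance is obtained from the area an object occupies on the image sensor, but it does not give the formula. A minimal sketch under an assumed ideal pinhole camera model follows: an object of real area A, facing the camera at distance d, projects to an image area a = A (f / d)^2, so d = f sqrt(A / a). The function name and all parameter values are hypothetical.

```python
import math


def distance_from_image_area(real_area_m2, pixel_count, pixel_pitch_m, focal_length_m):
    """Estimate the distance to a fronto-parallel object from its imaged area.

    Assumes an ideal pinhole camera with square pixels; this is a textbook
    model used for illustration, not the formula from the paper.
    """
    if pixel_count <= 0:
        raise ValueError("the object must occupy at least one pixel")
    # Area occupied on the sensor, in square metres.
    image_area_m2 = pixel_count * pixel_pitch_m ** 2
    return focal_length_m * math.sqrt(real_area_m2 / image_area_m2)


# Hypothetical numbers: a 0.2 m x 0.2 m light source covering 1600 pixels
# on a sensor with 2 micrometre pixels behind a 4 mm lens.
estimated_distance_m = distance_from_image_area(0.04, 1600, 2e-6, 0.004)
```

Because the imaged area falls with the square of the distance, the estimate becomes more sensitive to pixel-counting errors as the object recedes.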
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.
Comment: 8 pages excluding references (CVPR style)
Bio-inspired vision-based leader-follower formation flying in the presence of delays
Flocking starlings at dusk are known for the mesmerizing and intricate shapes they generate, as well as for how fluidly these shapes change. They seem to do this effortlessly. Real-life vision-based flocking has not been achieved in micro-UAVs (micro Unmanned Aerial Vehicles) to date. Towards this goal, we make three contributions in this paper: (i) we use a computational approach to develop a bio-inspired architecture for vision-based Leader-Follower formation flying on two micro-UAVs. We believe that the minimal computational cost of the resulting algorithm makes it suitable for object detection and tracking during high-speed flocking; (ii) we show that, provided delays in the control loop of a micro-UAV are below a critical value, Kalman filter-based estimation algorithms are not required to achieve Leader-Follower formation flying; (iii) unlike previous approaches, we do not use external observers, such as GPS signals or synchronized communication with flock members. These three contributions could be useful in achieving vision-based flocking in GPS-denied environments on computationally limited agents.
Cross-calibration of Time-of-flight and Colour Cameras
Time-of-flight cameras provide depth information, which is complementary to
the photometric appearance of the scene in ordinary images. It is desirable to
merge the depth and colour information, in order to obtain a coherent scene
representation. However, the individual cameras will have different viewpoints,
resolutions and fields of view, which means that they must be mutually
calibrated. This paper presents a geometric framework for this multi-view and
multi-modal calibration problem. It is shown that three-dimensional projective
transformations can be used to align depth and parallax-based representations
of the scene, with or without Euclidean reconstruction. A new evaluation
procedure is also developed; this allows the reprojection error to be
decomposed into calibration and sensor-dependent components. The complete
approach is demonstrated on a network of three time-of-flight and six colour
cameras. The applications of such a system, to a range of automatic
scene-interpretation problems, are discussed.
Comment: 18 pages, 12 figures, 3 tables
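The abstract relies on three-dimensional projective transformations to align the two representations of the scene. As a minimal sketch of what applying such a transformation means, the code below multiplies a 3-D point in homogeneous coordinates by a 4x4 matrix and divides by the last coordinate. The function name and the example matrix are hypothetical; estimating the matrix from calibration data, which is the subject of the paper, is not shown.

```python
def apply_projective_3d(H, point):
    """Apply a 4x4 projective transformation H to a 3-D point (x, y, z)."""
    x, y, z = point
    homogeneous = (x, y, z, 1.0)
    # Matrix-vector product in homogeneous coordinates.
    out = [sum(H[r][c] * homogeneous[c] for c in range(4)) for r in range(4)]
    w = out[3]
    if w == 0:
        raise ValueError("point maps to the plane at infinity")
    # Divide by the last coordinate to return to ordinary 3-D coordinates.
    return (out[0] / w, out[1] / w, out[2] / w)


# Hypothetical example: a pure translation by (1, 2, 3), which is a special
# case of a projective transformation.
H_translate = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
moved = apply_projective_3d(H_translate, (1.0, 1.0, 1.0))
```

A general projective transformation has a non-trivial last row, in which case the division by `w` is what distinguishes it from an affine map.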
Automatic Detection of Calibration Grids in Time-of-Flight Images
It is convenient to calibrate time-of-flight cameras by established methods,
using images of a chequerboard pattern. The low resolution of the amplitude
image, however, makes it difficult to detect the board reliably. Heuristic
detection methods, based on connected image-components, perform very poorly on
this data. An alternative, geometrically-principled method is introduced here,
based on the Hough transform. The projection of a chequerboard is represented
by two pencils of lines, which are identified as oriented clusters in the
gradient-data of the image. A projective Hough transform is applied to each of
the two clusters, in axis-aligned coordinates. The range of each transform is
properly bounded, because the corresponding gradient vectors are approximately
parallel. Each of the two transforms contains a series of collinear peaks; one
for every line in the given pencil. This pattern is easily detected, by
sweeping a dual line through the transform. The proposed Hough-based method is
compared to the standard OpenCV detection routine, by application to several
hundred time-of-flight images. It is shown that the new method detects
significantly more calibration boards, over a greater variety of poses, without
any overall loss of accuracy. This conclusion is based on an analysis of both
geometric and photometric error.
Comment: 11 pages, 11 figures, 1 table
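For readers unfamiliar with Hough voting, the sketch below shows the standard (rho, theta) line transform on a set of 2-D points. It is not the projective, gradient-clustered transform described in the abstract; it only illustrates the underlying idea that collinear points accumulate votes in a single cell. The function names and the bin sizes are hypothetical.

```python
import math


def hough_votes(points, n_theta=180, rho_step=1.0):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = rho."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            cell = (round(rho / rho_step), t)
            votes[cell] = votes.get(cell, 0) + 1
    return votes


def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta, vote_count) for the most-voted accumulator cell."""
    votes = hough_votes(points, n_theta, rho_step)
    (rho_bin, t), count = max(votes.items(), key=lambda item: item[1])
    return rho_bin * rho_step, math.pi * t / n_theta, count


# Ten collinear points on the horizontal line y = 5.
points = [(float(x), 5.0) for x in range(10)]
rho, theta, count = strongest_line(points)
```

With coarse bins, several neighbouring cells can tie for the maximum, so the recovered angle is only accurate to within a few bin widths; a practical detector would refine the peak.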
Improved depth recovery in consumer depth cameras via disparity space fusion within cross-spectral stereo.
We address the issue of improving depth coverage in consumer depth cameras based on the combined use of cross-spectral stereo and near infra-red structured light sensing. Specifically, we show that fusing disparity over these modalities within the disparity space image, prior to disparity optimization, facilitates the recovery of scene depth information in regions where structured light sensing fails. We show that this joint approach, leveraging disparity information from both structured light and cross-spectral sensing, facilitates the joint recovery of global scene depth comprising both texture-less object depth, where conventional stereo otherwise fails, and highly reflective object depth, where structured light (and similar) active sensing commonly fails. The proposed solution is illustrated using dense gradient feature matching and shown to outperform prior approaches that use late-stage fused cross-spectral stereo depth as a facet of improved sensing for consumer depth cameras.