
    RGB-D datasets using microsoft kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image and video datasets dedicated to various applications have become available, and they are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.

    Object tracking using log-polar transformation

    In this thesis, we use the log-polar transform to solve object tracking. Object tracking in video sequences is a fundamental problem in computer vision. Even though object tracking has been studied extensively, some challenges still need to be addressed, such as appearance variations, large scale and rotation variations, and occlusion. We implemented a novel tracking algorithm that works robustly in the presence of large scale changes, rotation, occlusion, illumination changes, perspective transformations and some appearance changes. The log-polar transformation is used to achieve robustness to scale and rotation. Our object tracking approach is based on template matching: an example image (the template) of an object is extracted from the first frame, and the region that best suits this template is then found in the subsequent frames. We implemented a fixed template algorithm and a template update algorithm. The fixed template algorithm uses the same template for the entire image sequence, whereas the template update algorithm updates the template according to changes in the object image. The fixed template algorithm is faster; the template update algorithm is more robust to appearance changes in the object being tracked. The proposed object tracking is highly robust to scale, rotation, illumination changes and occlusion, with good implementation speed.
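
    The robustness to scale and rotation claimed above rests on a basic property of log-polar coordinates: scaling and rotating about the transform centre turn into plain shifts along the log-radius and angle axes, which template matching can absorb as translations. The sketch below checks this on a single point. The function names are illustrative assumptions, and the code does not reproduce the thesis's tracker.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map Cartesian (x, y) to (log-radius, angle) about the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("log-polar coordinates are undefined at the centre")
    return math.log(r), math.atan2(dy, dx)


def scale_and_rotate(x, y, scale, angle):
    """Scale by `scale` and rotate by `angle` radians about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


# One point before and after a 2x zoom combined with a 30 degree rotation.
point = (3.0, 4.0)
rho0, theta0 = to_log_polar(*point)
moved = scale_and_rotate(*point, scale=2.0, angle=math.pi / 6)
rho1, theta1 = to_log_polar(*moved)

# The zoom appears as a shift of log(2) in log-radius, and the rotation as a
# shift of pi/6 in angle (in general the angle shift holds modulo 2*pi).
d_rho = rho1 - rho0
d_theta = theta1 - theta0
```

    Because both shifts are independent of where the point lies, every pixel of a scaled and rotated object moves by the same offset in the log-polar image.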

    Activity Monitoring Made Easier by Smart 360-degree Cameras

    This paper proposes the use of smart 360-degree cameras for activity monitoring. By exploiting the geometric properties of these cameras and adopting off-the-shelf tracking algorithms adapted to equirectangular images, this paper shows how simple it becomes to deploy a camera network and to detect the presence of pedestrians in predefined regions of interest with minimal information about the camera, namely its height. The paper further shows that smart 360-degree cameras can enhance motion understanding in the environment and proposes a simple method to estimate a heatmap of the scene that highlights regions where pedestrians are more often present. Quantitative and qualitative results demonstrate the effectiveness of the proposed approach.
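
    One way to see why the camera height alone can be enough is that, in an equirectangular image, each pixel corresponds to a known azimuth and elevation, so a point on the floor can be located by intersecting its viewing ray with the ground plane. The sketch below is a hedged illustration of that geometry and of a simple occupancy heatmap. It assumes an upright camera, a flat floor and the horizon on the middle image row; the function names, grid parameters and sample detections are hypothetical and are not taken from the paper.

```python
import math


def pixel_to_ground(u, v, width, height, camera_height):
    """Project an equirectangular pixel onto the floor plane.

    Assumes an upright camera `camera_height` metres above a flat floor,
    with the horizon on the middle image row. Returns (x, y) in metres
    relative to the point below the camera, or None at or above the horizon.
    """
    azimuth = (u / width - 0.5) * 2.0 * math.pi
    elevation = (0.5 - v / height) * math.pi  # negative below the horizon
    if elevation >= 0.0:
        return None
    distance = camera_height / math.tan(-elevation)
    return distance * math.cos(azimuth), distance * math.sin(azimuth)


def accumulate_heatmap(foot_pixels, width, height, camera_height,
                       cell=0.5, half_extent=5.0):
    """Count projected foot points per floor cell (rows index y, columns x)."""
    n = int(round(2.0 * half_extent / cell))
    grid = [[0] * n for _ in range(n)]
    for u, v in foot_pixels:
        position = pixel_to_ground(u, v, width, height, camera_height)
        if position is None:
            continue  # the point is not on the floor
        col = math.floor((position[0] + half_extent) / cell)
        row = math.floor((position[1] + half_extent) / cell)
        if 0 <= row < n and 0 <= col < n:
            grid[row][col] += 1
    return grid


# Hypothetical foot points (bottom centre of tracked boxes) in a 2000x1000 frame.
detections = [(1000, 750), (1000, 760), (1500, 800), (400, 300)]
heatmap = accumulate_heatmap(detections, 2000, 1000, camera_height=2.2)
```

    Cells with higher counts mark floor regions where pedestrians were seen more often; a region-of-interest test can reuse `pixel_to_ground` and check whether the projected point falls inside a floor polygon.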

    DroTrack: High-speed Drone-based Object Tracking Under Uncertainty

    We present DroTrack, a high-speed visual single-object tracking framework for drone-captured video sequences. Most existing object tracking methods are designed to tackle well-known challenges, such as occlusion and cluttered backgrounds. The complex motion of drones, i.e., multiple degrees of freedom in three-dimensional space, causes high uncertainty. This uncertainty leads to inaccurate location predictions and fuzziness in scale estimations. DroTrack solves such issues by discovering the dependency between object representation and motion geometry. We implement an effective object segmentation based on Fuzzy C Means (FCM). We incorporate the spatial information into the membership function to cluster the most discriminative segments. We then enhance the object segmentation by using a pre-trained Convolutional Neural Network (CNN) model. DroTrack also leverages the geometrical angular motion to estimate a reliable object scale. We discuss the experimental results and performance evaluation using two datasets of 51,462 drone-captured frames. The combination of the FCM segmentation and the angular scaling increased DroTrack precision by up to 9% and decreased the centre location error by 162 pixels on average. DroTrack outperforms all the high-speed trackers and achieves results comparable to those of deep learning trackers. DroTrack offers high frame rates of up to 1000 frames per second (fps) with the best location precision among a set of state-of-the-art real-time trackers. Comment: 10 pages, 12 figures, FUZZ-IEEE 202
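
    For readers unfamiliar with Fuzzy C Means, the sketch below shows the standard algorithm on one-dimensional values such as pixel intensities: each value receives a graded membership in every cluster, and cluster centres are membership-weighted means. This is plain FCM only. It does not include the spatial term that DroTrack adds to the membership function, nor the CNN refinement, and the function names and sample data are assumptions made for illustration.

```python
def memberships_for(values, centers, m):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for v in values:
        dists = [abs(v - c) for c in centers]
        if min(dists) == 0.0:
            # The value sits exactly on a centre: share membership among the hits.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** -power for d in dists]
            total = sum(inv)
            rows.append([w / total for w in inv])
    return rows


def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships_for(values, centers, m)
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total > 0.0:  # keep the old centre if the cluster is empty
                centers[k] = sum(w * v for w, v in zip(weights, values)) / total
    return centers, memberships_for(values, centers, m)


# Two well-separated groups of hypothetical intensity values.
intensities = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
centers, u = fuzzy_c_means(intensities, [0.0, 6.0])
```

    The fuzzifier `m` controls how soft the assignment is: values close to 1 approach hard clustering, while larger values spread membership more evenly across clusters.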

    Foveated Path Tracing with Fast Reconstruction and Efficient Sample Distribution

    Photo-realistic offline rendering is currently done with path tracing, because it naturally produces many real-life light effects such as reflections, refractions and caustics. These effects are hard to achieve with other rendering techniques. However, path tracing in real time is complicated by its high computational demand. Therefore, current real-time path tracing systems can only generate a very noisy estimate of the final frame, which is then denoised with a post-processing reconstruction filter. A path tracing-based rendering system capable of meeting the high-resolution and low-latency requirements of mixed reality devices would generate a very immersive user experience. One possible solution for fulfilling these requirements could be foveated path tracing, wherein the rendering resolution is reduced in the periphery of the human visual system. The key challenge is that the foveated path tracing result in the periphery is both sparse and noisy, placing high demands on the reconstruction filter. This thesis proposes the first regression-based reconstruction filter for path tracing that runs in real time. The filter is designed for highly noisy one-sample-per-pixel inputs. The fast execution is accomplished with blockwise processing and a fast implementation of the regression. In addition, a novel Visual-Polar coordinate space, which distributes the samples according to the contrast sensitivity model of the human visual system, is proposed. The advantage of Visual-Polar space is that it reduces both path tracing and reconstruction work, because both can be done at a smaller resolution. These techniques enable a working prototype of a foveated path tracing system and may work as a stepping stone towards wider commercial adoption of photo-realistic real-time path tracing.

    Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

    Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT- interacts with MSTv via an attentive feedback loop to compute accurate estimates of speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance. Funding: National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016)
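
    The idea that goals act as attractors and obstacles as repellers of heading can be illustrated with a very small dynamical system. The sketch below is a generic toy model and not the ViSTARS steering stage: the goal pulls the heading towards its direction in proportion to the angular error, and each obstacle pushes the heading away with a strength that decays with angular distance. All names, gains and the decay width are assumptions, and distance to the goal or obstacles is ignored.

```python
import math


def turn_rate(heading, goal, obstacles, k_goal=1.0, k_obstacle=2.0, width=0.5):
    """Rate of change of heading (radians per second) in a toy steering model."""
    # Attractor: turn towards the goal direction.
    rate = -k_goal * (heading - goal)
    # Repellers: turn away from each obstacle direction, more weakly when the
    # obstacle is far from the current heading.
    for obstacle in obstacles:
        offset = heading - obstacle
        rate += k_obstacle * offset * math.exp(-abs(offset) / width)
    return rate


def simulate(heading, goal, obstacles, dt=0.05, steps=400):
    """Integrate the heading with explicit Euler steps and return the result."""
    for _ in range(steps):
        heading += dt * turn_rate(heading, goal, obstacles)
    return heading


# With no obstacles the heading settles on the goal direction.
free_heading = simulate(1.0, goal=0.0, obstacles=[])

# An obstacle just to one side of the goal (negative angles) pushes the
# settled heading to the other side.
avoiding_heading = simulate(0.0, goal=0.0, obstacles=[-0.1])
```

    In this toy version the obstacle never fades with distance, so the settled heading stays offset from the goal; a fuller model would weight each term by distance as well as angle.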

    Object Detection and Tracking for ASV

    In this thesis, an automatic object detection system is presented. Object detection is performed by different algorithms. From the literature, we observed that detecting objects in video sequences or surveillance camera footage is a challenging computer vision task, because at sea the atmospheric conditions strongly affect detection. This leaves a wide range of possibilities open for improving detection. To improve object detection, we developed image stabilization software on top of the image acquisition. Image stabilization is first performed on the raw data from ROAZ II. Once stabilized video or images are obtained, object detection is performed using color-based segmentation. Field tests performed with a data set from ROAZ II show the effectiveness of the approach. The system is able to detect objects in video or images with high accuracy.