12,691 research outputs found

    Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality

    Real-time occlusion handling is a major problem in outdoor mixed reality systems because it incurs a high computational cost, mainly due to the complexity of the scene. Using segmentation alone, it is difficult to accurately render a virtual object occluded by complex objects such as trees and bushes. In this paper, we propose a novel occlusion handling method for real-time, outdoor, omni-directional mixed reality systems using only the information from a monocular image sequence. We first present a semantic segmentation scheme for predicting the amount of visibility for different types of objects in the scene. We simultaneously calculate a foreground probability map using depth estimation derived from optical flow. Finally, we combine the segmentation result and the probability map to render the computer-generated object and the real scene using a visibility-based rendering method. Our results show a clear improvement in handling occlusions compared to existing blending-based methods.

    Automatic annotation for weakly supervised learning of detectors

    PhD. Object detection in images and action detection in videos are among the most widely studied computer vision problems, with applications in consumer photography, surveillance, and automatic media tagging. Typically, these standard detectors are fully supervised; that is, they require a large body of training data in which the locations of the objects/actions in images/videos have been manually annotated. With the emergence of digital media and the rise of high-speed internet, raw images and video are available at little to no cost. However, the manual annotation of object and action locations remains tedious, slow, and expensive. As a result, there has been great interest in training detectors with weak supervision, where only the presence or absence of an object/action in an image/video is needed, not its location. This thesis presents approaches for weakly supervised learning of object/action detectors, with a focus on automatically annotating object and action locations in images/videos using only binary weak labels indicating the presence or absence of the object/action. First, a framework for weakly supervised learning of object detectors in images is presented. The proposed approach uses a variation of the multiple instance learning (MIL) technique to automatically annotate object locations in weakly labelled data; unlike existing approaches, it uses inter-class and intra-class cue fusion to obtain the initial annotation. The initial annotation is then used to start an iterative process in which standard object detectors are used to refine the location annotation. Finally, to ensure that the iterative training of detectors does not drift from the object of interest, a scheme for detecting model drift is also presented. Furthermore, unlike most other methods, our weakly supervised approach is evaluated on data without manual pose (object orientation) annotation.
    Second, an analysis of the initial annotation of objects, using inter-class and intra-class cues, is carried out. From the analysis, a new method based on negative mining (NegMine) is presented for the initial annotation of both object and action data. The NegMine-based approach is a much simpler formulation that uses only an inter-class measure and requires no complex combinatorial optimisation, yet can still meet or outperform existing approaches, including the previously presented inter-intra class cue fusion approach. Furthermore, NegMine can be fused with existing approaches to boost their performance. Finally, the thesis takes a step back and looks at the use of generic object detectors as prior knowledge in weakly supervised learning of object detectors. These generic object detectors are typically based on sampling saliency maps that indicate whether a pixel belongs to the background or foreground. A new approach to generating saliency maps is presented that, unlike existing approaches, looks beyond the current image of interest and into images similar to the current image. We show that our generic object proposal method can be used by itself to annotate the weakly labelled object data with surprisingly high accuracy.

    LiDAR aided simulation pipeline for wireless communication in vehicular traffic scenarios

    Abstract. Integrated Sensing and Communication (ISAC) is a modern technology under development for Sixth Generation (6G) systems. This thesis focuses on creating a simulation pipeline for dynamic vehicular traffic scenarios and on a novel approach to reducing wireless communication overhead with a Light Detection and Ranging (LiDAR) based system. The simulation pipeline can be used to generate data sets for numerous problems. Additionally, the developed error model for vehicle detection algorithms can be used to characterize LiDAR performance with respect to parameters such as LiDAR height, range, and laser point density. LiDAR behavior in traffic environments is provided as part of the results of this study. A periodic beam index map is developed by capturing the antenna azimuth and elevation angles that give the maximum Reference Signal Receive Power (RSRP) for a simulated receiver grid on the road, and by classifying areas using a Support Vector Machine (SVM) algorithm, to reduce the number of Synchronization Signal Blocks (SSBs) that need to be sent in Vehicle to Infrastructure (V2I) communication. This approach effectively reduces the wireless communication overhead in V2I communication.

    Collaborative searching for video using the Físchlár system and a DiamondTouch table

    Fischlar-DT is one of a family of systems which support interactive searching and browsing through an archive of digital video information. Previous Fischlar systems have used a conventional screen, keyboard and mouse interface, but Fischlar-DT operates using a horizontal, multiuser, touch-sensitive tabletop known as a DiamondTouch. We present the Fischlar-DT system partly from a systems perspective, but mostly in terms of how its design and functionality support collaborative searching. The contribution of the paper is thus the introduction of Fischlar-DT and a description of how design concerns for supporting collaborative search can be realised on a tabletop interface.

    Complete Vision-Based Traffic Sign Recognition Supported by an I2V Communication System

    This paper presents a complete traffic sign recognition system based on a vision sensor onboard a moving vehicle, which detects and recognizes up to one hundred of the most important road signs, including circular and triangular signs. A restricted Hough transform is used as the detection method on information extracted from contour images, while the proposed recognition system is based on Support Vector Machines (SVM). A novel solution to the problem of discarding detected signs that do not pertain to the host road is proposed. For that purpose, infrastructure-to-vehicle (I2V) communication and a stereo vision sensor are used. Furthermore, the outputs provided by the vision sensor and the data supplied by the CAN Bus and a GPS sensor are combined to obtain the global position of the detected traffic signs, which is used to identify a traffic sign in the I2V communication. This paper presents extensive tests in real driving conditions, both day and night, in which an average detection rate over 95% and an average recognition rate around 93% were obtained with an average runtime of 35 ms, which allows real-time performance.

    GCE AS and A level subject criteria for science
