Selected Topics in Bayesian Image/Video Processing
In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework.
Camera shake, motion or defocus during exposure leads to image blur. Single-image deblurring has achieved remarkable results by solving a maximum a posteriori (MAP) problem, but there is no perfect solution because the image prior and the estimator are inaccurate. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture (GSM) model, which is estimated from non-blurry training images. Our experimental results on a total of twelve natural images show that more details are restored than with previous deblurring algorithms.
In augmented reality, it is challenging to insert virtual content into video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on building facades in street-view video streams. Without knowing camera positions, the geometry model of a building facade is established using a combined detection and tracking strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmentation performance of this automatic VCI system.
Coding efficiency is an important objective in video coding. In recent years, video coding standards have evolved by adding new tools. However, this requires numerous modifications to complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches that do not modify the codec structure. In the third part, an exemplar-based data pruning video compression scheme for intra frames is introduced. Data pruning is used as a pre-processing tool to remove part of the video data before it is encoded. At the decoder, missing data is reconstructed by a sparse linear combination of similar patches. The novelty is the creation of a patch library to exploit the similarity of patches. The scheme achieves an average 4% bit rate reduction on some high-definition videos.
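The decoder-side step described above, rebuilding a pruned patch as a sparse linear combination of similar library patches, can be sketched as follows. The abstract does not say how the sparse weights are computed, so this sketch swaps in plain matching pursuit as an assumption; the function name `reconstruct_patch`, the flattened-list patch layout, and the `known` mask are all hypothetical.

```python
import math


def reconstruct_patch(target, known, library, max_atoms=3, tol=1e-9):
    """Fill the missing pixels of `target` from a few library patches.

    target  : flattened patch; values at missing positions are ignored
    known   : booleans, True where the pixel survived pruning
    library : list of flattened exemplar patches (the hypothetical patch library)
    Weights are fitted on the known pixels only (matching pursuit, an assumed
    stand-in for the unspecified sparse solver), then applied to whole patches.
    """
    idx = [i for i, k in enumerate(known) if k]
    residual = [target[i] for i in idx]
    coeffs = {}  # library index -> accumulated weight (at most max_atoms entries)

    for _ in range(max_atoms):
        best, best_score = None, 0.0
        for j, atom in enumerate(library):
            masked = [atom[i] for i in idx]
            energy = sum(v * v for v in masked)
            if energy == 0:
                continue
            corr = sum(r * v for r, v in zip(residual, masked))
            score = abs(corr) / math.sqrt(energy)
            if score > best_score:
                best, best_score = (j, corr / energy), score
        if best is None or best_score <= tol:
            break  # residual is already explained by the chosen patches
        j, c = best
        coeffs[j] = coeffs.get(j, 0.0) + c
        residual = [r - c * library[j][i] for r, i in zip(residual, idx)]

    estimate = [sum(c * library[j][i] for j, c in coeffs.items())
                for i in range(len(target))]
    # Keep the pixels that were transmitted; fill only the pruned ones.
    filled = [t if k else e for t, k, e in zip(target, known, estimate)]
    return filled, coeffs
```

Fitting the weights on the known pixels only is what lets the same weights predict the pruned pixels; a real codec would also need a rule for which patches to prune, which the abstract does not describe.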
Component-based Attention for Large-scale Trademark Retrieval
The demand for large-scale trademark retrieval (TR) systems has significantly
increased to combat the rise in international trademark infringement.
Unfortunately, the ranking accuracy of current approaches using either
hand-crafted or pre-trained deep convolutional neural network (DCNN) features is
inadequate for large-scale deployments. We show in this paper that the ranking
accuracy of TR systems can be significantly improved by incorporating hard and
soft attention mechanisms, which direct attention to critical information such
as figurative elements and reduce attention given to distracting and
uninformative elements such as text and background. Our proposed approach
achieves state-of-the-art results on a challenging large-scale trademark
dataset.
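As a rough illustration of how hard and soft attention can be combined when pooling region features, the sketch below first drops regions flagged as distracting (hard attention) and then takes a softmax-weighted average of the rest (soft attention). The abstract gives no architectural details, so the function `attention_pool`, the per-region `scores`, and the `keep` mask are assumptions for illustration only, not the paper's model.

```python
import math


def attention_pool(features, scores, keep=None):
    """Pool per-region feature vectors into a single descriptor.

    features : list of equal-length feature vectors, one per image region
    scores   : one relevance score per region (assumed to come from some
               learned scorer that is not shown here)
    keep     : optional booleans; False removes a region entirely
               (hard attention), e.g. a region detected as text
    """
    if keep is None:
        keep = [True] * len(features)
    kept = [(f, s) for f, s, k in zip(features, scores, keep) if k]
    if not kept:
        raise ValueError("hard attention removed every region")

    # Soft attention: numerically stable softmax over the remaining scores.
    top = max(s for _, s in kept)
    exps = [math.exp(s - top) for _, s in kept]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(kept[0][0])
    return [sum(w * f[d] for w, (f, _) in zip(weights, kept))
            for d in range(dim)]
```

With equal scores the result is the plain average of the kept regions; raising one region's score shifts the descriptor towards that region.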
A computational approach for obstruction-free photography
We present a unified computational approach for taking photos through reflecting or occluding elements such as windows and fences. Rather than capturing a single image, we instruct the user to take a short image sequence while slightly moving the camera. Differences that often exist in the relative position of the background and the obstructing elements from the camera allow us to separate them based on their motions, and to recover the desired background scene as if the visual obstructions were not there. We show results on controlled experiments and many real and practical scenarios, including shooting through reflections, fences, and raindrop-covered windows.
Sponsors: Shell Research; United States. Office of Naval Research (Navy Fund 6923196
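The intuition that an obstruction moves differently from the background across the sequence can be illustrated with a much simpler stand-in than the paper's method: if the frames have already been aligned to the background (an assumption, and the hard part, which is not shown), an occluder lands on different pixels in each frame, so a per-pixel temporal median suppresses it. The function name `background_from_aligned` and the nested-list image layout are hypothetical.

```python
from statistics import median


def background_from_aligned(frames):
    """Estimate the background from frames pre-aligned to the background.

    frames : list of images, each a list of rows of grey values.
    A pixel is recovered as long as it is obstructed in fewer than half
    of the frames. This is a simplified illustration, not the layer
    decomposition used in the paper.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(frame[y][x] for frame in frames) for x in range(width)]
            for y in range(height)]
```

A temporal median only handles opaque occluders such as fences; semi-transparent reflections mix with the background in every frame and need a different treatment.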
Recent advances in deep learning for object detection
Object detection is a fundamental visual recognition problem in computer
vision and has been widely studied in the past decades. Visual object detection
aims to find objects of certain target classes with precise localization in a
given image and assign each object instance a corresponding class label. Due to
the tremendous successes of deep learning based image classification, object
detection techniques using deep learning have been actively studied in recent
years. In this paper, we give a comprehensive survey of recent advances in
visual object detection with deep learning. By reviewing a large body of recent
related work in literature, we systematically analyze the existing object
detection frameworks and organize the survey into three major parts: (i)
detection components, (ii) learning strategies, and (iii) applications &
benchmarks. In the survey, we cover a variety of factors affecting the
detection performance in detail, such as detector architectures, feature
learning, proposal generation, sampling strategies, etc. Finally, we discuss
several future directions to facilitate and spur future research for visual
object detection with deep learning.
Keywords: Object Detection, Deep Learning, Deep Convolutional Neural Network
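Localization quality in object detection is commonly measured with intersection over union (IoU) between a predicted box and a ground-truth box. The abstract does not name a metric, so the sketch below is offered only as background; the `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7.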
DSAAR: distributed software architecture for autonomous robots
Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa to obtain the degree of Master in Electrical Engineering.
This dissertation presents a software architecture called the Distributed Software Architecture for Autonomous Robots (DSAAR), which is designed to enable fast development and prototyping of multi-robot systems. The DSAAR building blocks allow engineers to focus on the behavioural model of robots and collectives. This architecture is of special interest in domains where several human, robot, and software agents have to interact continuously, so fast prototyping and reusability are a must. DSAAR tries to meet these requirements, working towards an advanced solution to the n-humans and m-robots problem with a set of good design practices and development tools.
This dissertation also focuses on Human-Robot Interaction, mainly on the subject of teleoperation. In teleoperation, human judgement is an integral part of the process and is heavily influenced by the telemetry data received from the remote environment. The speed at which commands are given and telemetry data is received is therefore of crucial importance. A teleoperation approach based on the DSAAR architecture is proposed. This approach was designed to give all entities present in the network a shared reality, where every entity is an information source, in an approach similar to the distributed blackboard. This solution was designed to achieve a real-time response as well as the most complete perception of the robots’ surroundings.
Experimental results obtained with the physical robot suggest that the system is able to guarantee a close interaction between users and robot
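The blackboard idea mentioned above, where every entity is an information source and all entities share one view of the world, can be sketched in a few lines. This is a single-process illustration only: the class `Blackboard`, its method names, and the topic strings are hypothetical, and the networking that would make it distributed in DSAAR is not shown.

```python
class Blackboard:
    """Minimal shared store: any entity can post, any entity can listen."""

    def __init__(self):
        self._entries = {}       # topic -> (source, latest value)
        self._subscribers = {}   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register callback(source, value) for future posts on a topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def post(self, source, topic, value):
        """Store the latest value and notify every subscriber of the topic."""
        self._entries[topic] = (source, value)
        for callback in self._subscribers.get(topic, []):
            callback(source, value)

    def read(self, topic):
        """Return (source, value) for a topic, or None if nothing was posted."""
        return self._entries.get(topic)


# Usage: an operator console listens to telemetry posted by a robot.
board = Blackboard()
received = []
board.subscribe("telemetry/battery", lambda src, val: received.append((src, val)))
board.post("robot1", "telemetry/battery", 0.8)
```

Pushing updates to subscribers, rather than having them poll, is one way to keep the delay between a telemetry change and the operator seeing it small.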