An Efficient Algorithm for Video Super-Resolution Based On a Sequential Model
In this work, we propose a novel procedure for video super-resolution, that
is, the recovery of a sequence of high-resolution images from its low-resolution
counterpart. Our approach is based on a "sequential" model (i.e., each
high-resolution frame is supposed to be a displaced version of the preceding
one) and considers the use of sparsity-enforcing priors. Both the recovery of
the high-resolution images and the motion fields relating them are tackled. This
leads to a large-dimensional, non-convex and non-smooth problem. We propose an
algorithmic framework to address the latter. Our approach relies on fast
gradient evaluation methods and modern optimization techniques for
non-differentiable/non-convex problems. Unlike some other previous works, we
show that there exists a provably-convergent method with a complexity linear in
the problem dimensions. We assess the proposed optimization method on several
video benchmarks and emphasize its good performance with respect to the state
of the art.
Comment: 37 pages, SIAM Journal on Imaging Sciences, 201
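The "sequential" model described in this abstract can be illustrated with a toy forward model. The sketch below is an assumption-laden simplification, not the authors' method: it uses integer displacements with wrap-around borders instead of dense motion fields, and plain block averaging as the low-resolution observation operator. The function names `shift_frame`, `downsample` and `simulate_sequence` are hypothetical.

```python
def shift_frame(frame, dx, dy):
    """Displace a 2D frame by integer (dx, dy), wrapping around at the borders."""
    h, w = len(frame), len(frame[0])
    return [[frame[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def downsample(frame, factor):
    """Block-average a 2D frame; height and width must be multiples of factor."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [frame[y + j][x + i]
                     for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def simulate_sequence(first_hr, displacements, factor):
    """Build HR frames one after another, then derive their LR observations."""
    hr_frames = [first_hr]
    for dx, dy in displacements:
        # Each HR frame is a displaced copy of the preceding one.
        hr_frames.append(shift_frame(hr_frames[-1], dx, dy))
    lr_frames = [downsample(f, factor) for f in hr_frames]
    return hr_frames, lr_frames
```

Super-resolution is the inverse of this simulation: given only `lr_frames`, estimate both `hr_frames` and the displacements, which is what makes the problem large and non-convex.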
Magnetic-Visual Sensor Fusion-based Dense 3D Reconstruction and Localization for Endoscopic Capsule Robots
Reliable and real-time 3D reconstruction and localization functionality is a
crucial prerequisite for the navigation of actively controlled capsule
endoscopic robots as an emerging, minimally invasive diagnostic and therapeutic
technology for use in the gastrointestinal (GI) tract. In this study, we
propose a fully dense, non-rigidly deformable, strictly real-time,
intraoperative map fusion approach for actively controlled endoscopic capsule
robot applications that combines magnetic and vision-based localization with
non-rigid-deformation-based frame-to-model map fusion. The performance of the
proposed method is demonstrated using four different ex-vivo porcine stomach
models. Across different trajectories of varying speed and complexity, and four
different endoscopic cameras, the root mean square surface reconstruction
errors range from 1.58 to 2.17 cm.
Comment: submitted to IROS 201
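For reference, a root mean square error of the kind reported here is typically computed from per-point distances between the reconstructed surface and a ground-truth surface. The sketch below assumes those distances are already available as a list; how the paper obtains them is not stated in the abstract, and the function name `rms_error` is hypothetical.

```python
import math


def rms_error(distances):
    """Root mean square of per-point errors, in the same unit as the input."""
    if not distances:
        raise ValueError("need at least one distance")
    return math.sqrt(sum(d * d for d in distances) / len(distances))
```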
LRMM: Learning to Recommend with Missing Modalities
Multimodal learning has shown promising performance in content-based
recommendation due to the auxiliary user and item information of multiple
modalities such as text and images. However, the problem of incomplete and
missing modalities is rarely explored, and most existing methods fail to learn
a recommendation model with missing or corrupted modalities. In this paper, we
propose LRMM, a novel framework that mitigates not only the problem of missing
modalities but also more generally the cold-start problem of recommender
systems. We propose modality dropout (m-drop) and a multimodal sequential
autoencoder (m-auto) to learn multimodal representations for complementing and
imputing missing modalities. Extensive experiments on real-world Amazon data
show that LRMM achieves state-of-the-art performance on rating prediction
tasks. More importantly, LRMM is more robust than previous methods in alleviating
data-sparsity and the cold-start problem.
Comment: 11 pages, EMNLP 201
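The general idea behind modality dropout can be sketched as follows: during training, entire modalities are randomly blanked so that the model learns to cope when one is absent. This is a minimal illustration under stated assumptions, not LRMM's actual m-drop: modalities are represented as a dictionary of feature lists, dropped modalities are replaced by zeros, and at least one modality is always kept. The function name `modality_dropout` and its parameters are hypothetical.

```python
import random


def modality_dropout(features, drop_prob, rng):
    """Zero out whole modalities at random, always keeping at least one."""
    names = list(features)
    # Each modality survives independently with probability 1 - drop_prob.
    kept = [n for n in names if rng.random() >= drop_prob]
    if not kept:
        # Assumption: never feed the model an example with no modality at all.
        kept = [rng.choice(names)]
    return {n: list(features[n]) if n in kept else [0.0] * len(features[n])
            for n in names}
```

A usage example: `modality_dropout({"text": [0.2, 0.7], "image": [0.9]}, 0.5, random.Random(0))` returns a dictionary with the same keys and vector lengths, in which some modalities may have been replaced by zeros.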
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given their widespread adoption in
many web applications, along with their potential to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also to the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender systems is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we devise a taxonomy of deep learning based recommendation models,
along with a comprehensive summary of the state of the art. Finally,
we expand on current trends and provide new perspectives pertaining to this new
exciting development of the field.
Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
Image enhancement from a stabilised video sequence
The aim of video stabilisation is to create a new video sequence where the motions (i.e. rotations, translations) and scale differences between frames (or parts of a frame) have effectively been removed. These stabilisation effects can be obtained via digital video processing techniques which use the information extracted from the video sequence itself, with no need for additional hardware or knowledge about camera physical motion.
A video sequence usually contains a large overlap between successive frames, and regions of the same scene are sampled at different positions. In this paper, these multiple samples are combined to achieve images with a higher spatial resolution. Higher-resolution imagery plays an important role in assisting in the identification of people, vehicles, structures or objects of interest captured by surveillance cameras or by video cameras used in face recognition, traffic monitoring, traffic law enforcement, driver assistance and automatic vehicle guidance systems.
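One simple way to combine samples of the same scene taken at different positions is the classic shift-and-add scheme, sketched below in one dimension. This is a swapped-in illustration, not necessarily the method used in the paper: it assumes the frames are already registered, that each frame's offset is a known integer number of high-resolution pixels between 0 and `factor - 1`, and that no blur is present. The function name `shift_and_add` is hypothetical.

```python
def shift_and_add(lr_frames, offsets, factor):
    """Fuse registered 1D LR frames onto an HR grid, averaging repeated samples."""
    n_hr = len(lr_frames[0]) * factor
    sums = [0.0] * n_hr
    counts = [0] * n_hr
    for frame, off in zip(lr_frames, offsets):
        for i, value in enumerate(frame):
            pos = i * factor + off  # HR position this LR sample came from
            sums[pos] += value
            counts[pos] += 1
    # HR positions that no frame observed are left as None.
    return [s / c if c else None for s, c in zip(sums, counts)]
```

With two frames sampled at offsets 0 and 1 and `factor=2`, every position of the high-resolution grid is observed once, so the fused signal has twice the sampling density of either input frame.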
- …