2,753 research outputs found
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that incorporates in an end-to-end Visual Question Answering system and evaluated it on two visual question answering datasets for compositional language and elementary visual reasoning
Neural NILM: Deep Neural Networks Applied to Energy Disaggregation
Energy disaggregation estimates appliance-by-appliance electricity
consumption from a single meter that measures the whole home's electricity
demand. Recently, deep neural networks have driven remarkable improvements in
classification performance in neighbouring machine learning fields such as
image classification and automatic speech recognition. In this paper, we adapt
three deep neural network architectures to energy disaggregation: 1) a form of
recurrent neural network called `long short-term memory' (LSTM); 2) denoising
autoencoders; and 3) a network which regresses the start time, end time and
average power demand of each appliance activation. We use seven metrics to test
the performance of these algorithms on real aggregate power data from five
appliances. Tests are performed against a house not seen during training and
against houses seen during training. We find that all three neural nets achieve
better F1 scores (averaged over all five appliances) than either combinatorial
optimisation or factorial hidden Markov models and that our neural net
algorithms generalise well to an unseen house.Comment: To appear in ACM BuildSys'15, November 4--5, 2015, Seou
Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing
This paper tackles a new problem setting: reinforcement learning with
pixel-wise rewards (pixelRL) for image processing. After the introduction of
the deep Q-network, deep RL has been achieving great success. However, the
applications of deep RL for image processing are still limited. Therefore, we
extend deep RL to pixelRL for various image processing applications. In
pixelRL, each pixel has an agent, and the agent changes the pixel value by
taking an action. We also propose an effective learning method for pixelRL that
significantly improves the performance by considering not only the future
states of the own pixel but also those of the neighbor pixels. The proposed
method can be applied to some image processing tasks that require pixel-wise
manipulations, where deep RL has never been applied. We apply the proposed
method to three image processing tasks: image denoising, image restoration, and
local color enhancement. Our experimental results demonstrate that the proposed
method achieves comparable or better performance, compared with the
state-of-the-art methods based on supervised learning.Comment: Accepted to AAAI 201
Discrete Denoising with Shifts
We introduce S-DUDE, a new algorithm for denoising DMC-corrupted data. The
algorithm, which generalizes the recently introduced DUDE (Discrete Universal
DEnoiser) of Weissman et al., aims to compete with a genie that has access, in
addition to the noisy data, also to the underlying clean data, and can choose
to switch, up to times, between sliding window denoisers in a way that
minimizes the overall loss. When the underlying data form an individual
sequence, we show that the S-DUDE performs essentially as well as this genie,
provided that is sub-linear in the size of the data. When the clean data is
emitted by a piecewise stationary process, we show that the S-DUDE achieves the
optimum distribution-dependent performance, provided that the same
sub-linearity condition is imposed on the number of switches. To further
substantiate the universal optimality of the S-DUDE, we show that when the
number of switches is allowed to grow linearly with the size of the data,
\emph{any} (sequence of) scheme(s) fails to compete in the above senses. Using
dynamic programming, we derive an efficient implementation of the S-DUDE, which
has complexity (time and memory) growing only linearly with the data size and
the number of switches . Preliminary experimental results are presented,
suggesting that S-DUDE has the capacity to significantly improve on the
performance attained by the original DUDE in applications where the nature of
the data abruptly changes in time (or space), as is often the case in practice.Comment: 30 pages, 3 figures, submitted to IEEE Trans. Inform. Theor
Image Restoration Using Joint Statistical Modeling in Space-Transform Domain
This paper presents a novel strategy for high-fidelity image restoration by
characterizing both local smoothness and nonlocal self-similarity of natural
images in a unified statistical manner. The main contributions are three-folds.
First, from the perspective of image statistics, a joint statistical modeling
(JSM) in an adaptive hybrid space-transform domain is established, which offers
a powerful mechanism of combining local smoothness and nonlocal self-similarity
simultaneously to ensure a more reliable and robust estimation. Second, a new
form of minimization functional for solving image inverse problem is formulated
using JSM under regularization-based framework. Finally, in order to make JSM
tractable and robust, a new Split-Bregman based algorithm is developed to
efficiently solve the above severely underdetermined inverse problem associated
with theoretical proof of convergence. Extensive experiments on image
inpainting, image deblurring and mixed Gaussian plus salt-and-pepper noise
removal applications verify the effectiveness of the proposed algorithm.Comment: 14 pages, 18 figures, 7 Tables, to be published in IEEE Transactions
on Circuits System and Video Technology (TCSVT). High resolution pdf version
and Code can be found at: http://idm.pku.edu.cn/staff/zhangjian/IRJSM
- …