739 research outputs found

Deep Learning for Single Image Super-Resolution: A Brief Review

Author: Liao Q
Tian Y
Wang W
Xue J-H
Yang W
Zhang X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/07/2019

Single image super-resolution (SISR) is a notoriously challenging ill-posed problem that aims to obtain a high-resolution (HR) output from one of its low-resolution (LR) versions. To solve the SISR problem, powerful deep learning algorithms have recently been employed and have achieved state-of-the-art performance. In this survey, we review representative deep learning-based SISR methods and group them into two categories according to their major contributions to two essential aspects of SISR: the exploration of efficient neural network architectures for SISR, and the development of effective optimization objectives for deep SISR learning. For each category, a baseline is first established and several critical limitations of the baseline are summarized. Then representative works on overcoming these limitations are presented based on their original contents as well as our critical understandings and analyses, and relevant comparisons are conducted from a variety of perspectives. Finally, we conclude this review with some vital current challenges and future trends in SISR leveraging deep learning algorithms.

Comment: Accepted by IEEE Transactions on Multimedia (TMM)

arXiv.org e-Print Archive

UCL Discovery

Hidden Two-Stream Convolutional Networks for Action Recognition

Author: Hauptmann Alexander G.
Lan Zhenzhong
Newsam Shawn
Zhu Yi
Publication date: 30/10/2018

Analyzing videos of human actions involves understanding the temporal relationships among video frames. State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs. Such a two-stage approach is computationally expensive, storage demanding, and not end-to-end trainable. In this paper, we present a novel CNN architecture that implicitly captures motion information between adjacent frames. We name our approach hidden two-stream CNNs because it takes only raw video frames as input and directly predicts action classes without explicitly computing optical flow. Our end-to-end approach is 10x faster than its two-stage baseline. Experimental results on four challenging action recognition datasets (UCF101, HMDB51, THUMOS14 and ActivityNet v1.2) show that our approach significantly outperforms the previous best real-time approaches.

Comment: Accepted at ACCV 2018, camera ready. Code available at https://github.com/bryanyzhu/Hidden-Two-Strea

arXiv.org e-Print Archive

Crossref

Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians

Author: Hu Peiyun
Ramanan Deva
Publication date: 04/05/2016

Convolutional neural nets (CNNs) have demonstrated remarkable performance in recent history. Such approaches tend to work in a unidirectional bottom-up feed-forward fashion. However, practical experience and biological evidence tell us that feedback plays a crucial role, particularly for detailed spatial understanding tasks. This work explores bidirectional architectures that also reason with top-down feedback: neural units are influenced by both lower and higher-level units. We do so by treating units as rectified latent variables in a quadratic energy function, which can be seen as a hierarchical Rectified Gaussian model (RGs). We show that RGs can be optimized with a quadratic program (QP), which can in turn be optimized with a recurrent neural network (with rectified linear units). This allows RGs to be trained with GPU-optimized gradient descent. From a theoretical perspective, RGs help establish a connection between CNNs and hierarchical probabilistic models. From a practical perspective, RGs are well suited for detailed spatial tasks that can benefit from top-down reasoning. We illustrate them on the challenging task of keypoint localization under occlusions, where local bottom-up evidence may be misleading. We demonstrate state-of-the-art results on challenging benchmarks.

Comment: To appear in CVPR 201
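
The link the abstract draws between a quadratic program over non-negative variables and a recurrent network of rectified linear units can be illustrated with a generic sketch. This is not the authors' model: it is plain projected gradient descent on a small non-negative QP, where each iteration is a linear update followed by a ReLU, so unrolling the loop resembles a recurrent ReLU network. The names `solve_nonneg_qp`, `A`, `b`, `step` and `iters` are hypothetical and chosen only for illustration.

```python
def relu(v):
    """Element-wise rectification: project a vector onto x >= 0."""
    return [max(0.0, x) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def solve_nonneg_qp(A, b, step=0.1, iters=500):
    """Minimize 0.5 * x^T A x - b^T x subject to x >= 0.

    Assumes A is symmetric positive definite and step is small enough
    for convergence. Each iteration is a gradient step followed by a
    ReLU, i.e. one layer of an unrolled recurrent ReLU network.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        grad = [g - bi for g, bi in zip(matvec(A, x), b)]
        x = relu([xi - step * gi for xi, gi in zip(x, grad)])
    return x


# The unconstrained minimum is [1, -1]; the constraint clips the second entry.
x = solve_nonneg_qp([[2.0, 0.0], [0.0, 2.0]], [2.0, -2.0])
print(x)  # approximately [1.0, 0.0]
```

The rectification step is what enforces non-negativity of the latent variables; without it the loop would be ordinary gradient descent on an unconstrained quadratic.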

arXiv.org e-Print Archive

Crossref

Representation Learning: A Review and New Perspectives

Author: Bengio Yoshua
Courville Aaron
Vincent Pascal
Publication date: 01/01/2014

The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.

arXiv.org e-Print Archive

CiteSeerX

Spatial Learning and Action Planning in a Prefrontal Cortical Network Model

Author: Angelo Arleo
Denis Sheynikhovich
Karim Benchenane
Louis-Emmanuel Martinet
Publication venue: Public Library of Science
Publication date: 01/01/2011

The interplay between hippocampus and prefrontal cortex (PFC) is fundamental to spatial cognition. Complementing hippocampal place coding, prefrontal representations provide more abstract and hierarchically organized memories suitable for decision making. We model a prefrontal network mediating distributed information processing for spatial learning and action planning. Specific connectivity and synaptic adaptation principles shape the recurrent dynamics of the network arranged in cortical minicolumns. We show how the PFC columnar organization is suitable for learning sparse topological-metrical representations from redundant hippocampal inputs. The recurrent nature of the network supports multilevel spatial processing, allowing structural features of the environment to be encoded. An activation diffusion mechanism spreads the neural activity through the column population, leading to trajectory planning. The model provides a functional framework for interpreting the activity of PFC neurons recorded during navigation tasks. We illustrate the link from single unit activity to behavioral responses. The results suggest plausible neural mechanisms subserving the cognitive “insight” capability originally attributed to rodents by Tolman & Honzik. Our time course analysis of neural responses shows how the interaction between hippocampus and PFC can yield the encoding of manifold information pertinent to spatial planning, including prospective coding and distance-to-goal correlates.

CiteSeerX

Crossref

Directory of Open Access Journals

HAL-Inserm

PubMed Central

Hal-Diderot

Unsupervised Training of Deep Neural Networks for Motion Estimation

Author: Ahmadi A
Publication venue: 'Queen Mary University of London'
Publication date: 22/07/2019

PhD
This thesis addresses the problem of motion estimation, that is, the estimation of a field that describes how pixels move from a reference frame to a target frame, using Deep Neural Networks (DNNs). In contrast to classic methods, we do not solve an optimization problem at test time. We train DNNs once and apply them in a single pass at test time, which reduces the computational complexity. The major contribution is that, in contrast to supervised methods, we train our DNNs in an unsupervised way. By unsupervised, we mean without the need for ground-truth motion fields, which are expensive to obtain for real scenes. More specifically, we train our networks by designing cost functions inspired by classical optical flow estimation schemes and generative methods in computer vision. We first propose a straightforward CNN method that is trained to optimize the brightness constancy constraint, and we embed it in a classical multiscale scheme in order to predict motions that are large in magnitude (GradNet). We show that GradNet generalizes well to an unknown dataset and performed comparably with state-of-the-art unsupervised methods at that time. Second, we propose a convolutional Siamese architecture that embeds a new soft warping scheme applied in a multiscale framework and is trained to optimize a higher-level feature constancy constraint (LikeNet). The architecture of LikeNet allows a trade-off between computational load and memory and is 98% smaller than other state-of-the-art (SOA) methods in terms of learned parameters. We show that LikeNet performs on par with SOA approaches and is the best among uni-directional methods, that is, methods that calculate the motion field in one pass. Third, we propose a novel approach to distill the slower LikeNet into a much faster regression neural network without losing much accuracy (QLikeNet). The results show that using DNNs is a promising direction for motion estimation, although further improvements are required, as classical methods still perform best.
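
As a rough illustration of the brightness constancy idea named in the abstract, the sketch below scores a candidate motion field by warping the target frame back to the reference frame and averaging the absolute intensity difference. It is an assumption-laden toy, not the cost function used in the thesis: the function names, the bilinear sampling with border clamping, and the L1 penalty are all choices made here for illustration. An unsupervised method would minimize a differentiable loss of this kind over network parameters, with no ground-truth motion required.

```python
import math


def bilinear_sample(img, y, x):
    """Sample img (a list of rows) at real-valued (y, x), clamping to the border."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * img[y0][x0] + wx * img[y0][x1]
    bot = (1 - wx) * img[y1][x0] + wx * img[y1][x1]
    return (1 - wy) * top + wy * bot


def brightness_constancy_loss(ref, tgt, flow):
    """Mean absolute difference between ref and tgt sampled at the flowed positions.

    flow[y][x] = (dy, dx) is the hypothesized displacement of the reference
    pixel (y, x) into the target frame.
    """
    h, w = len(ref), len(ref[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            total += abs(ref[y][x] - bilinear_sample(tgt, y + dy, x + dx))
    return total / (h * w)


# Toy frames: a single bright pixel moves one step to the right.
ref = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
tgt = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
zero_flow = [[(0.0, 0.0)] * 4 for _ in range(3)]
true_flow = [[(0.0, 1.0)] * 4 for _ in range(3)]
print(brightness_constancy_loss(ref, tgt, zero_flow))  # positive: frames disagree
print(brightness_constancy_loss(ref, tgt, true_flow))  # 0.0: the flow explains the motion
```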

Queen Mary Research Online

An improved algorithm for learning long-term dependency problems in adaptive processing of data structures

Author: Chi ZG
Cho SY
Siu WC
Tsoi AC
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/12/2014

PolyU Institutional Repository