Search CORE

1,086 research outputs found

Evaluating Two-Stream CNN for Video Classification

Author: Jain M.
Ji S.
Krizhevsky A.
LeCun Y.
Mikolov T.
Peng X.
Schmidhuber J.
Simonyan K.
Simonyan K.
Socher R.
Soomro K.
Sutskever I.
Szegedy C.
Venugopalan S.
Ye G.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/04/2015

Videos contain very rich semantic information. Traditional hand-crafted features are known to be inadequate in analyzing complex video semantics. Inspired by the huge success of the deep learning methods in analyzing image, audio and text data, significant efforts are recently being devoted to the design of deep nets for video analytics. Among the many practical needs, classifying videos (or video clips) based on their major semantic categories (e.g., "skiing") is useful in many applications. In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification. Our evaluations are conducted on top of a recent two-stream convolutional neural network (CNN) pipeline, which uses both static frames and motion optical flows, and has demonstrated competitive performance against the state-of-the-art methods. In order to gain insights and to arrive at a practical guideline, many important options are studied, including network architectures, model fusion, learning parameters and the final prediction methods. Based on the evaluations, very competitive results are attained on two popular video classification benchmarks. We hope that the discussions and conclusions from this work can help researchers in related fields to quickly set up a good basis for further investigations along this very promising direction. Comment: ACM ICMR'1

arXiv.org e-Print Archive

Crossref

Receptive Field Block Net for Accurate and Fast Object Detection

Author: Brian A. Wandell
D Huang
D Weng
J. R. R. Uijlings
Karen Simonyan
M Brown
Mark Everingham
Olga Russakovsky
T-Y Lin
W Liu
Publication date: 26/07/2018

Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some lightweight model based detectors fulfil real time processing, while their accuracies are often criticized. In this paper, we explore an alternative to build a fast and accurate detector by strengthening lightweight features using a hand-crafted mechanism. Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the feature discriminability and robustness. We further assemble RFB to the top of SSD, constructing the RFB Net detector. To evaluate its effectiveness, experiments are conducted on two major benchmarks and the results show that RFB Net is able to reach the performance of advanced very deep detectors while keeping the real-time speed. Code is available at https://github.com/ruinmessi/RFBNet. Comment: Accepted by ECCV 201

arXiv.org e-Print Archive

Crossref

Towards Bottom-Up Analysis of Social Food

Author: Deng J.
Fard M. A.
Hu Y.
Layne R.
Reed S.
Simonyan K.
Sukhbaatar S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/03/2016

in ACM Digital Health Conference 201

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Queen Mary Research Online

Single Shot Temporal Action Detection

Author: Abadi M.
Escorcia V.
Gemert J.
Glorot X.
Glorot X.
He K.
Kingma D.
Kuehne H.
Liu W.
Oneata D.
Qiu Z.
Simonyan K.
Szegedy C.
Wang R.
Yuan J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/10/2017

Temporal action detection is a very important yet challenging problem, since videos in real applications are usually long, untrimmed and contain multiple action instances. This problem requires not only recognizing action categories but also detecting the start time and end time of each action instance. Many state-of-the-art methods adopt the "detection by classification" framework: first generate proposals, and then classify them. The main drawback of this framework is that the boundaries of action instance proposals are fixed during the classification step. To address this issue, we propose a novel Single Shot Action Detector (SSAD) network based on 1D temporal convolutional layers, which skips the proposal generation step by directly detecting action instances in untrimmed video. In pursuit of a particular SSAD network that can work effectively for temporal action detection, we empirically search for the best network architecture of SSAD, owing to the lack of existing models that can be directly adopted. Moreover, we investigate input feature types and fusion strategies to further improve detection accuracy. We conduct extensive experiments on two challenging datasets: THUMOS 2014 and MEXaction2. When setting the Intersection-over-Union threshold to 0.5 during evaluation, SSAD significantly outperforms other state-of-the-art systems by increasing mAP from 19.0% to 24.6% on THUMOS 2014 and from 7.4% to 11.0% on MEXaction2. Comment: ACM Multimedia 201
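The Intersection-over-Union criterion mentioned in this abstract can be illustrated for 1D time intervals. The following is a minimal sketch, not the authors' evaluation code: the names `Segment`, `temporal_iou` and `is_match` are hypothetical, and a full mAP evaluation would also require matching action classes and ranking detections by confidence.

```python
from typing import Tuple

# Hypothetical representation: (start_time, end_time) in seconds.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, truth: Segment) -> float:
    """Intersection-over-Union of two 1D time intervals."""
    # Overlap length is zero when the intervals are disjoint.
    inter = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Segment, truth: Segment, threshold: float = 0.5) -> bool:
    """True when the predicted interval overlaps the ground truth enough."""
    return temporal_iou(pred, truth) >= threshold
```

With a threshold of 0.5, a prediction of (2.0, 10.0) against a ground-truth instance of (0.0, 10.0) has an IoU of 0.8 and counts as a match, while (0.0, 10.0) against (5.0, 15.0) has an IoU of 1/3 and does not.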

arXiv.org e-Print Archive

Crossref

The age of data-driven proteomics : how machine learning enables novel workflows

Author: Bergstra J.
Bojarski M.
Boser B. E.
Deutsch E. W.
Ho T. K.
Montufar G. F.
Searle B. C.
Serrano G.
Silva A. S. C.
Simonyan K.
Young S. R.
Zimmer D.
Publication venue: 'Wiley'
Publication date: 01/01/2020

A lot of energy in the field of proteomics is dedicated to the application of challenging experimental workflows, which include metaproteomics, proteogenomics, data independent acquisition (DIA), non-specific proteolysis, immunopeptidomics, and open modification searches. These workflows are all challenging because of ambiguity in the identification stage; they either expand the search space and thus increase the ambiguity of identifications, or, in the case of DIA, they generate data that is inherently more ambiguous. In this context, machine learning-based predictive models are now generating considerable excitement in the field of proteomics because these predictive models hold great potential to drastically reduce the ambiguity in the identification process of the above-mentioned workflows. Indeed, the field has already produced classical machine learning and deep learning models to predict almost every aspect of a liquid chromatography-mass spectrometry (LC-MS) experiment. Yet despite all the excitement, thorough integration of predictive models in these challenging LC-MS workflows is still limited, and further improvements to the modeling and validation procedures can still be made. In this viewpoint we therefore point out highly promising recent machine learning developments in proteomics, alongside some of the remaining challenges.

Crossref

Ghent University Academic Bibliography

Bose-Einstein Condensation of Helium and Hydrogen inside Bundles of Carbon Nanotubes

Author: A. D. Migone
A. Griffin
A. Griffin
D. C. Mattis
F. Ancilotto
G. Vidali
J. M. Kosterlitz
J. M. Ziman
L. W. Bruch
M. C. Gordillo
M. K. Kostov
M. M. Calbi
M. Muris
M. W. Cole
R. K. Pathria
S. A. Moskalenko
S. G. Brush
S. M. Gatica
S. M. Gatica
T. Wilson
V. Ginzburg
V. V. Simonyan
Publication venue: 'American Physical Society (APS)'
Publication date: 03/10/2003

Helium atoms or hydrogen molecules are believed to be strongly bound within the interstitial channels (between three carbon nanotubes) within a bundle of many nanotubes. The effects on adsorption of a nonuniform distribution of tubes are evaluated. The energy of a single particle state is the sum of a discrete transverse energy Et (that depends on the radii of neighboring tubes) and a quasicontinuous energy Ez of relatively free motion parallel to the axis of the tubes. At low temperature, the particles occupy the lowest energy states, the focus of this study. The transverse energy attains a global minimum value (Et=Emin) for radii near Rmin=9.95 Ang. for H2 and 8.48 Ang. for He-4. The density of states N(E) near the lowest energy is found to vary linearly above this threshold value, i.e. N(E) is proportional to (E-Emin). As a result, there occurs a Bose-Einstein condensation of the molecules into the channel with the lowest transverse energy. The transition is characterized approximately as that of a four dimensional gas, neglecting the interactions between the adsorbed particles. The phenomenon is observable, in principle, from a singular heat capacity. The existence of this transition depends on the sample having a relatively broad distribution of radii values that include some near Rmin. Comment: 21 pages, 9 figures
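The link between a linear density of states and a "four dimensional gas" can be made explicit with the standard ideal Bose gas argument. The sketch below is textbook reasoning, not necessarily the authors' own derivation; the prefactor C, the dimension d and the shifted energy ε = E - Emin are symbols introduced here for illustration.

```latex
% Free particles in d dimensions have a power-law density of states:
\[
  g_d(\varepsilon) \propto \varepsilon^{d/2 - 1},
  \qquad \varepsilon = E - E_{\min} .
\]
% The abstract's N(E) \propto (E - E_{\min}) therefore corresponds to
\[
  \frac{d}{2} - 1 = 1 \quad\Longrightarrow\quad d = 4 .
\]
% With g(\varepsilon) = C\,\varepsilon (C an unspecified constant), the number of
% particles the excited states can hold at zero chemical potential is finite:
\[
  N_c(T) = \int_0^\infty \frac{C\,\varepsilon \, d\varepsilon}{e^{\varepsilon / k_B T} - 1}
         = C\,(k_B T)^2\,\Gamma(2)\,\zeta(2)
         = \frac{\pi^2}{6}\, C\,(k_B T)^2 .
\]
% Any particles beyond N_c(T) must occupy the lowest state, i.e. they condense.
```

Because this integral converges, a non-interacting gas with a linear density of states has a finite condensation temperature, scaling as the square root of the particle number under these assumptions.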

arXiv.org e-Print Archive

Crossref

Generic 3D Representation via Pose Estimation and Matching

Author: B Caprile
B Li
C Xu
D Tell
DG Lowe
EJ Gibson
H Bay
J Matas
J Weston
JM Morel
K Köser
K Mikolajczyk
Karen Simonyan
L Smith
L Van der Maaten
M Brown
MJ Tarr
N Silberman
Nancy Rader
P Denis
P Moreels
R Hartley
R Held
R Kümmerle
S Agarwal
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/10/2017

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate our representation achieves state-of-the-art wide baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learned features). We also show 6DOF camera pose estimation given a pair of local image patches. The accuracy on both supervised tasks is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions. Comment: Published in ECCV16. See the project website http://3drepresentation.stanford.edu/ and dataset website https://github.com/amir32002/3D_Street_Vie

arXiv.org e-Print Archive

Crossref

Deep Learning for Single-Molecule Science

Author: Bengio Y
Bishop C M
Chang S
Chang S
Coates A
Deng L
Eduardo Alonso
Glorot X
Goodfellow I
Gregory Slabaugh
Hebb D O
Hinton G E
Hinton G E
Minsky M
Mitchell T
Nair V
Saon G
Schwenk H
Simonyan K
Simonyan K
SM Masudur R Al-Arif
Tim Albrecht
Werbos P
Widrow B
Yosinski J
Zeiler M D
Publication venue: 'IOP Publishing'
Publication date: 18/09/2017

Exploring and making predictions based on single-molecule data can be challenging, not only due to the sheer size of the datasets, but also because a priori knowledge about the signal characteristics is typically limited and the signal-to-noise ratio is poor. For example, hypothesis-driven data exploration, informed by an expectation of the signal characteristics, can lead to interpretation bias or loss of information. Equally, even when the different data categories are known, e.g., the four bases in DNA sequencing, it is often difficult to know how to make best use of the available information content. The latest development in Machine Learning (ML), so-called Deep Learning (DL), offers interesting new avenues to address such challenges. In some applications, such as speech and image recognition, DL has been able to outperform conventional Machine Learning strategies and even human performance. However, to date DL has not been applied much in single-molecule science, presumably in part because relatively little is known about the 'internal workings' of such DL tools within single-molecule science as a field. In this Tutorial, we make an attempt to illustrate in a step-by-step guide how one of those, a Convolutional Neural Network, may be used for base calling in DNA sequencing applications. We compare it with a Support Vector Machine as a more conventional ML method, and discuss some of the strengths and weaknesses of the approach. In particular, a 'deep' neural network has many features of a 'black box', which has important implications on how we look at and interpret data.

City Research Online

Crossref

University of Birmingham Research Portal

Isotopic and spin selectivity of H_2 adsorbed in bundles of carbon nanotubes

Author: A. Kuznetsova
A. Thess
A.A. Lucas
A.C. Dillon
A.C. Dillon
A.D. Novaco
B. C Hathorn
C.M. Brown
D. Basmadjian
D. Basmadjian
D. Goulding
D.G. Narehood
F. Stephanie-Victoire
G. Gao
G. Stan
G. Stan
G. Vidali
H. Cheng
I.S. Averbukh
J.J.M. Beenakker
K.A. Williams
K.A. Williams
M. Boninsegni
M. K. Kostov
M. W. Cole
M.C. Gordillo
M.K. Kostov
M.M. Calbi
M.S. Dresselhaus
M.W. Cole
M.W. Cole
P. Chen
P.T. Greenland
Q. Wang
Q. Wang
Q. Wang
R. A. Trasca
R.R. Schlittler
R.S. Hansen
S. Inoue
S.A. FitzGerald
S.B. Sinnott
S.E. Weber
S.R. Challa
T. Wilson
V. Meregalli
V.V. Simonyan
Y. Ye
Y. Ye
Y.F. Yin
Publication venue: 'American Physical Society (APS)'
Publication date: 02/07/2002

Due to its large surface area and strongly attractive potential, a bundle of carbon nanotubes is an ideal substrate material for gas storage. In addition, adsorption in nanotubes can be exploited in order to separate the components of a mixture. In this paper, we investigate the preferential adsorption of D_2 versus H_2 (isotope selectivity) and of ortho versus para (spin selectivity) molecules confined in the one-dimensional grooves and interstitial channels of carbon nanotube bundles. We perform selectivity calculations in the low coverage regime, neglecting interactions between adsorbate molecules. We find substantial spin selectivity for a range of temperatures up to 100 K, and even greater isotope selectivity for an extended range of temperatures, up to 300 K. This isotope selectivity is consistent with recent experimental data, which exhibit a large difference between the isosteric heats of D_2 and H_2 adsorbed in these bundles. Comment: Paper submitted to Phys. Rev. B; 17 pages, 2 tables, 6 figures

arXiv.org e-Print Archive

Crossref

Towards a resolution of the proton form factor problem: new electron and positron scattering data

Author: Adhikari K P
Adikaram D
Afanasev A V
Amaryan M J
Anderson M D
Arrington J
Ball J
Battaglieri M
Bedlinskiy I
Bennet R P
Biselli A S
Boiarinov S
Bono J
Briscoe W J
Brooks W K
Burkert V D
Carman D S
Celentano A
Chandavar S
Charles G
Colaneri L
Cole P L
Contalbrigo M
D'Angelo A
Dashyan N
De Sanctis E
De Vita R
Deur A
Djalali C
Dodge G E
Dupre R
Egiyan H
El Alaoui A
El Fassi L
Eugenio P
Fedotov G
Fegan S
Filippi A
Fleming J A
Fradi A
Gilfoyle G P
Giovanetti K L
Girod F X
Goetz J T
Gohn W
Golovatch E
Gothe R W
Griffioen K A
Guidal M
Guo L
Hafidi K
Hakobyan H
Harrison N
Hattawy M
Hicks K
Holtrop M
Hughes S M
Hyde C E
Ilieva Y
Ireland D.G.
Ishkhanov B S
Jenkins D
Jiang H
Joo K
Joosten S
Khandaker M
Khetarpal P
Kim W
Klein A
Klein F J
Koirala S
Kubarovsky V
Kuhn S E
Livingston K.
Lu H Y
MacGregor I J D
Markov N
Mayer M
McKinnon B
Mestayer M D
Meyer C A
Mirazita M
Mokeev V
Montgomery R
Moody C I
Moutarde H
Movsisyan A
Munoz Camacho C
Nadel-Turonski P
Niccolai S
Niculescu G
Osipenko M
Ostrovidov A I
Park K
Pasyuk E
Pisano S
Pogorelko O
Procureur S
Prok Y
Protopopescu D
Puckett A J R
Raue B
Rimal D
Ripani M
Rizzo A
Rosner G
Rossi P
Sabatie F
Schott D
Schumacher R A
Sharabian Y G
Simonyan A
Skorodumina I
Smith E S
Smith G D
Sober D I
Sokhan D.
Sparveris N
Stepanyan S
Strauch S
Sytnik V
Taiuti M
Tian Y
Trivedi A
Ungaro M
Voskanyan H
Voutier E
Walford N K
Watts D P
Wei X
Weinstein L B
Wood M H
Zachariou N
Zana L
Zhang J
Zhao Z W
Zonta I
Publication venue: 'American Physical Society (APS)'
Publication date: 25/11/2014

There is a significant discrepancy between the values of the proton electric form factor, G_E^p, extracted using unpolarized and polarized electron scattering. Calculations predict that small two-photon exchange (TPE) contributions can significantly affect the extraction of G_E^p from the unpolarized electron-proton cross sections. We determined the TPE contribution by measuring the ratio of positron-proton to electron-proton elastic scattering cross sections using a simultaneous, tertiary electron-positron beam incident on a liquid hydrogen target and detecting the scattered particles in the Jefferson Lab CLAS detector. This novel technique allowed us to cover a wide range in virtual photon polarization (