1,546 research outputs found
A Comprehensive Survey on Graph Neural Networks
Deep learning has revolutionized many machine learning tasks in recent years,
ranging from image classification and video processing to speech recognition
and natural language understanding. The data in these tasks are typically
represented in the Euclidean space. However, there is an increasing number of
applications where data are generated from non-Euclidean domains and are
represented as graphs with complex relationships and interdependency between
objects. The complexity of graph data has imposed significant challenges on
existing machine learning algorithms. Recently, many studies on extending deep
learning approaches for graph data have emerged. In this survey, we provide a
comprehensive overview of graph neural networks (GNNs) in data mining and
machine learning fields. We propose a new taxonomy to divide the
state-of-the-art graph neural networks into four categories, namely recurrent
graph neural networks, convolutional graph neural networks, graph autoencoders,
and spatial-temporal graph neural networks. We further discuss the applications
of graph neural networks across various domains and summarize the open source
codes, benchmark data sets, and model evaluation of graph neural networks.
Finally, we propose potential research directions in this rapidly growing
field.
Comment: Minor revision (updated tables and references).
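As a concrete illustration of the convolutional category (a sketch, not code from the survey itself), a minimal GCN-style layer can be written in NumPy; the symmetric degree normalization and ReLU used below are common choices, assumed here rather than prescribed by this abstract:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolutional layer: ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                   # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees (>= 1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy graph: 3 nodes in a path, 2 input features, 4 output features
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 2)
W = np.random.randn(2, 4)
print(gcn_layer(A, H, W).shape)  # (3, 4)
```

Stacking such layers (with different weight matrices) propagates information over increasingly large graph neighborhoods.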
Harnessing Deep Neural Networks with Logic Rules
Combining deep neural networks with structured logic rules is desirable to
harness flexibility and reduce uninterpretability of the neural models. We
propose a general framework capable of enhancing various types of neural
networks (e.g., CNNs and RNNs) with declarative first-order logic rules.
Specifically, we develop an iterative distillation method that transfers the
structured information of logic rules into the weights of neural networks. We
deploy the framework on a CNN for sentiment analysis, and an RNN for named
entity recognition. With a few highly intuitive rules, we obtain substantial
improvements and achieve state-of-the-art or comparable results to previous
best-performing systems.
Comment: Fix typos in appendix. ACL 2016.
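The distillation idea can be sketched as a loss that mixes the ground-truth cross-entropy with imitation of a rule-constrained teacher distribution; the mixing weight `pi` and the teacher `q_teacher` below are illustrative placeholders, not the paper's exact formulation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(logits, y_true, q_teacher, pi=0.5):
    """Mix ground-truth cross-entropy with cross-entropy against a
    rule-constrained teacher distribution q_teacher (illustrative)."""
    p = softmax(logits)
    ce_true = -np.log(p[np.arange(len(y_true)), y_true] + 1e-12).mean()
    ce_teacher = -(q_teacher * np.log(p + 1e-12)).sum(axis=1).mean()
    return (1 - pi) * ce_true + pi * ce_teacher
```

With `pi = 0` this reduces to ordinary supervised training; larger `pi` pushes the student network's predictions toward the teacher, which is how structured rule information can be transferred into the weights.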
Machine Learning Methods for Data Association in Multi-Object Tracking
Data association is a key step within the multi-object tracking pipeline that
is notoriously challenging due to its combinatorial nature. A popular and
general way to formulate data association is as the NP-hard multidimensional
assignment problem (MDAP). Over the last few years, data-driven approaches to
assignment have become increasingly prevalent as these techniques have started
to mature. We focus this survey solely on learning algorithms for the
assignment step of multi-object tracking, and we attempt to unify various
methods by highlighting their connections to linear assignment as well as to
the MDAP. First, we review probabilistic and end-to-end optimization approaches
to data association, followed by methods that learn association affinities from
data. We then compare the performance of the methods presented in this survey,
and conclude by discussing future research directions.
Comment: Accepted for publication in ACM Computing Surveys.
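The linear assignment problem these methods connect to can be illustrated with a tiny brute-force solver; real trackers use the Hungarian algorithm (e.g. `scipy.optimize.linear_sum_assignment`), but for a small cost matrix exhaustive search makes the objective explicit:

```python
from itertools import permutations
import numpy as np

def best_assignment(cost):
    """Exhaustive linear assignment: match row i (e.g. a track) to
    column perm[i] (e.g. a detection) minimizing total cost."""
    n = cost.shape[0]
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        c = sum(cost[i, j] for i, j in enumerate(perm))
        if c < best:
            best, best_perm = c, perm
    return best_perm, float(best)

cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])
print(best_assignment(cost))  # ((1, 0, 2), 5.0)
```

The learning-based methods surveyed replace hand-crafted entries of `cost` with association affinities predicted from data.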
Semistochastic Quadratic Bound Methods
Partition functions arise in a variety of settings, including conditional
random fields, logistic regression, and latent gaussian models. In this paper,
we consider semistochastic quadratic bound (SQB) methods for maximum likelihood
inference based on partition function optimization. Batch methods based on the
quadratic bound were recently proposed for this class of problems, and
performed favorably in comparison to state-of-the-art techniques.
Semistochastic methods fall in between batch algorithms, which use all the
data, and stochastic gradient type methods, which use small random selections
at each iteration. We build semistochastic quadratic bound-based methods, and
prove both global convergence (to a stationary point) under very weak
assumptions, and linear convergence rate under stronger assumptions on the
objective. To make the proposed methods faster and more stable, we consider
inexact subproblem minimization and batch-size selection schemes. The efficacy
of SQB methods is demonstrated via comparison with several state-of-the-art
techniques on commonly used datasets.
Comment: 11 pages, 1 figure.
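The batch-versus-stochastic interpolation can be sketched with a growing mini-batch schedule for plain logistic regression; this illustrates only the semistochastic sampling idea, not the paper's quadratic-bound machinery:

```python
import numpy as np

def semistochastic_gd(X, y, steps=200, lr=0.5, seed=0):
    """Gradient steps on logistic loss where the random sample size
    grows each iteration, moving from stochastic towards batch
    behavior (an illustrative schedule, not the paper's SQB method)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(steps):
        m = min(n, 2 + t)                  # growing sample size
        idx = rng.choice(n, size=m, replace=False)
        z = X[idx] @ w
        g = X[idx].T @ (1 / (1 + np.exp(-z)) - y[idx]) / m
        w -= lr * g
    return w

# Separable toy data: label 1 when the single feature is positive
X = np.array([[1.0], [2.0], [-1.0], [-2.0], [0.5], [-0.5]])
y = np.array([1, 1, 0, 0, 1, 0])
w = semistochastic_gd(X, y)
preds = (1 / (1 + np.exp(-(X @ w))) > 0.5).astype(int)
print((preds == y).all())  # True
```

Early iterations are cheap and noisy like SGD; late iterations use (nearly) all the data, like a batch method, which is what enables the stronger convergence guarantees.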
AdaNet: Adaptive Structural Learning of Artificial Neural Networks
We present new algorithms for adaptively learning artificial neural networks.
Our algorithms (AdaNet) adaptively learn both the structure of the network and
its weights. They are based on a solid theoretical analysis, including
data-dependent generalization guarantees that we prove and discuss in detail.
We report the results of large-scale experiments with one of our algorithms on
several binary classification tasks extracted from the CIFAR-10 dataset. The
results demonstrate that our algorithm can automatically learn network
structures that achieve very competitive accuracy compared with neural
networks found by standard approaches.
Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision
A number of results have recently demonstrated the benefits of incorporating
various constraints when training deep architectures in vision and machine
learning. The advantages range from guarantees for statistical generalization
to better accuracy to compression. But support for general constraints within
widely used libraries remains scarce and their broader deployment within many
applications that can benefit from them remains under-explored. Part of the
reason is that stochastic gradient descent (SGD), the workhorse for training
deep neural networks, does not natively deal with constraints with global scope
very well. In this paper, we revisit a classical first-order scheme from
numerical optimization, Conditional Gradients (CG), that has thus far had
limited applicability in training deep models. We show via rigorous analysis
how various constraints can be naturally handled by modifications of this
algorithm. We provide convergence guarantees and show a suite of immediate
benefits that are possible -- from training ResNets with fewer layers but
better accuracy simply by substituting in our version of CG, to faster training
of GANs with 50% fewer epochs in image inpainting applications, to provably
better generalization guarantees using efficiently implementable forms of
recently proposed regularizers.
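A minimal conditional-gradient loop over an ℓ1-ball shows why constraints are natural for this scheme: the linear subproblem has a closed-form vertex solution, so no projection step is needed. This is an illustrative sketch on a toy quadratic, not the paper's training algorithm:

```python
import numpy as np

def frank_wolfe_l1(grad_f, x0, radius=1.0, steps=500):
    """Conditional-gradient (Frank-Wolfe) iterations over an l1-ball.
    The linear subproblem min <g, s> over ||s||_1 <= r is solved by
    putting all mass on the coordinate with the largest |gradient|."""
    x = x0.copy()
    for t in range(steps):
        g = grad_f(x)
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])   # minimizing vertex of the ball
        gamma = 2.0 / (t + 2.0)          # standard step-size schedule
        x = (1 - gamma) * x + gamma * s  # iterate stays feasible
    return x

# Minimize ||x - b||^2 over the l1-ball of radius 1, with b outside
# the ball; the constrained optimum is the vertex (1, 0).
b = np.array([2.0, 0.5])
x = frank_wolfe_l1(lambda x: 2 * (x - b), np.zeros(2), radius=1.0)
```

Because every iterate is a convex combination of feasible points, the constraint holds by construction at every step, unlike SGD with a post-hoc projection.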
End-to-end representation learning for Correlation Filter based tracking
The Correlation Filter is an algorithm that trains a linear template to
discriminate between images and their translations. It is well suited to object
tracking because its formulation in the Fourier domain provides a fast
solution, enabling the detector to be re-trained once per frame. Previous works
that use the Correlation Filter, however, have adopted features that were
either manually designed or trained for a different task. This work is the
first to overcome this limitation by interpreting the Correlation Filter
learner, which has a closed-form solution, as a differentiable layer in a deep
neural network. This enables learning deep features that are tightly coupled to
the Correlation Filter. Experiments illustrate that our method has the
important practical benefit of allowing lightweight architectures to achieve
state-of-the-art performance at high framerates.
Comment: To appear at CVPR 2017.
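The closed-form solution referred to above can be sketched in one dimension: ridge regression in the Fourier domain yields the filter directly, which is why per-frame re-training is cheap. This is a toy sketch of the classical correlation filter, not the paper's deep version:

```python
import numpy as np

def train_correlation_filter(x, y, lam=1e-2):
    """Closed-form filter in the Fourier domain:
    w_hat = conj(x_hat) * y_hat / (conj(x_hat) * x_hat + lam)."""
    x_hat, y_hat = np.fft.fft(x), np.fft.fft(y)
    return np.conj(x_hat) * y_hat / (np.conj(x_hat) * x_hat + lam)

def detect(w_hat, z):
    """Response map for a new signal z: correlate via the FFT."""
    return np.real(np.fft.ifft(w_hat * np.fft.fft(z)))

# 1-D toy: the desired response is a peak at the target's position
x = np.random.default_rng(0).standard_normal(64)
y = np.zeros(64); y[0] = 1.0          # peak at zero shift
w_hat = train_correlation_filter(x, y)
resp = detect(w_hat, np.roll(x, 5))   # shift the signal by 5 samples
print(int(np.argmax(resp)))           # 5: the detector finds the shift
```

The paper's contribution is to treat this closed-form solve as a differentiable layer, so the features feeding `x` can be learned end-to-end rather than hand-designed.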
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
The paper gives futuristic challenges discussed in the cvpaper.challenge. In
2015 and 2016, we thoroughly studied 1,600+ papers in several
conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.
Length bias in Encoder Decoder Models and a Case for Global Conditioning
Encoder-decoder networks are popular for modeling sequences probabilistically
in many applications. These models use the power of the Long Short-Term Memory
(LSTM) architecture to capture the full dependence among variables, unlike
earlier models like CRFs that typically assumed conditional independence among
non-adjacent variables. However, in practice encoder-decoder models exhibit a
bias towards short sequences that, surprisingly, gets worse with increasing
beam size.
In this paper we show that this phenomenon is due to a discrepancy between
the full-sequence margin and the per-element margin enforced by the locally
conditioned training objective of an encoder-decoder model. The discrepancy more
adversely impacts long sequences, explaining the bias towards predicting short
sequences.
For the case where the predicted sequences come from a closed set, we show
that a globally conditioned model alleviates the above problems of
encoder-decoder models. From a practical point of view, our proposed model also
eliminates the need for a beam-search during inference, which reduces to an
efficient dot-product based search in a vector space.
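The closed-set decoding idea can be sketched directly: with candidate-sequence embeddings precomputed, inference is a single matrix-vector product followed by an argmax, with no beam search. The embeddings below are toy placeholders, not the paper's learned representations:

```python
import numpy as np

def closed_set_decode(query_vec, candidate_vecs, candidates):
    """Score every candidate embedding against the encoder output by
    dot product and return the best-scoring candidate sequence."""
    scores = candidate_vecs @ query_vec
    return candidates[int(np.argmax(scores))]

rng = np.random.default_rng(1)
cands = ["alpha", "beta", "gamma"]
C = np.eye(3, 4)                       # toy orthonormal candidate embeddings
q = C[2] + 0.1 * rng.standard_normal(4)  # query near "gamma"'s vector
print(closed_set_decode(q, C, cands))  # prints "gamma"
```

Since the scoring is a plain maximum inner-product search, it can also be accelerated with standard nearest-neighbor indexing when the closed set is large.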
Frank-Wolfe Network: An Interpretable Deep Structure for Non-Sparse Coding
The problem of ℓ_p-norm constrained coding is to convert a signal into code
that lies inside an ℓ_p-ball and most faithfully reconstructs the signal.
Previous works under the name of sparse coding considered the cases of the
ℓ_0 and ℓ_1 norms. The cases with 1 < p < ∞, i.e. the non-sparse coding
studied in this paper, remain a difficulty. We propose an interpretable deep
structure namely Frank-Wolfe Network (F-W Net), whose architecture is inspired
by unrolling and truncating the Frank-Wolfe algorithm for solving an ℓ_p-norm
constrained problem with p ≥ 1. We show that the Frank-Wolfe solver for the
ℓ_p-norm constraint leads to a novel closed-form nonlinear unit, which is
parameterized by p and termed pool_p. The pool_p unit links the
conventional pooling, activation, and normalization operations, making F-W Net
distinct from existing deep networks either heuristically designed or converted
from projected gradient descent algorithms. We further show that the
hyper-parameter p can be made learnable instead of pre-chosen in F-W Net,
which gracefully solves the non-sparse coding problem even with unknown p. We
evaluate the performance of F-W Net on an extensive range of simulations as
well as the task of handwritten digit recognition, where F-W Net exhibits
strong learning capability. We then propose a convolutional version of F-W Net,
and apply the convolutional F-W Net into image denoising and super-resolution
tasks, where F-W Net all demonstrates impressive effectiveness, flexibility,
and robustness.
Comment: Accepted to IEEE Transactions on Circuits and Systems for Video
Technology. Code and pretrained models: https://github.com/sunke123/FW-Net
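The ℓ_p-ball linear subproblem that such unrolled Frank-Wolfe iterations solve has a closed form, which can be sketched as follows; this is an illustrative version derived from Hölder's inequality, not necessarily the exact parameterization of the paper's closed-form unit:

```python
import numpy as np

def lp_ball_lmo(g, p=1.5, radius=1.0):
    """Closed-form linear minimization oracle over an lp-ball:
    argmin_{||s||_p <= r} <g, s> = -r * sign(g) * |g|^(q-1) / ||g||_q^(q-1),
    where 1/p + 1/q = 1 (tightness case of Hoelder's inequality)."""
    q = p / (p - 1.0)
    a = np.abs(g) ** (q - 1.0)
    norm = np.sum(np.abs(g) ** q) ** ((q - 1.0) / q)
    return -radius * np.sign(g) * a / norm

g = np.array([3.0, -4.0])
s = lp_ball_lmo(g, p=1.5, radius=1.0)
# The oracle output lies on the lp-sphere and anti-aligns with g:
print(np.sum(np.abs(s) ** 1.5) ** (1 / 1.5))  # ≈ 1.0
```

Because the oracle output depends smoothly on p, the exponent can be treated as a learnable parameter, which is the mechanism that lets the network handle an unknown p.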