
    Convolutional Networks with Adaptive Inference Graphs

    Do convolutional networks really need a fixed feed-forward structure? What if, after identifying the high-level concept of an image, a network could move directly to a layer that can distinguish fine-grained differences? Currently, a network would first need to execute sometimes hundreds of intermediate layers that specialize in unrelated aspects. Ideally, the more a network already knows about an image, the better it should be at deciding which layer to compute next. In this work, we propose convolutional networks with adaptive inference graphs (ConvNet-AIG) that adaptively define their network topology conditioned on the input image. Following a high-level structure similar to residual networks (ResNets), ConvNet-AIG decides for each input image on the fly which layers are needed. In experiments on ImageNet we show that ConvNet-AIG learns distinct inference graphs for different categories. Both ConvNet-AIG with 50 and 101 layers outperform their ResNet counterparts, while using 20% and 38% fewer computations, respectively. By grouping parameters into layers for related classes and only executing relevant layers, ConvNet-AIG improves both efficiency and overall classification quality. Lastly, we also study the effect of adaptive inference graphs on the susceptibility to adversarial examples. We observe that ConvNet-AIG shows higher robustness than ResNets, complementing other known defense mechanisms. Comment: IJCV 201
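
    The gating mechanism can be pictured with a short sketch. The block below is a hedged illustration, not the authors' exact architecture: a tiny gating network pools the block input and emits execute/skip logits, and a Gumbel-softmax sample keeps the discrete decision differentiable; all layer sizes are illustrative.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class GatedResidualBlock(nn.Module):
            def __init__(self, channels):
                super().__init__()
                self.body = nn.Sequential(
                    nn.Conv2d(channels, channels, 3, padding=1),
                    nn.BatchNorm2d(channels),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(channels, channels, 3, padding=1),
                    nn.BatchNorm2d(channels),
                )
                # Tiny gating network: global average pooling -> two logits (skip, execute).
                self.gate = nn.Sequential(
                    nn.Linear(channels, 16), nn.ReLU(inplace=True), nn.Linear(16, 2)
                )

            def forward(self, x):
                logits = self.gate(x.mean(dim=(2, 3)))                 # (N, 2)
                # Gumbel-softmax keeps the hard execute/skip decision differentiable.
                execute = F.gumbel_softmax(logits, tau=1.0, hard=True)[:, 1]
                out = self.body(x) * execute.view(-1, 1, 1, 1)         # zero out skipped samples
                return F.relu(x + out)

        x = torch.randn(4, 64, 32, 32)
        print(GatedResidualBlock(64)(x).shape)                         # torch.Size([4, 64, 32, 32])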

    Topology and Prediction Focused Research on Graph Convolutional Neural Networks

    Important advances have been made using convolutional neural network (CNN) approaches to solve complicated problems in areas that rely on grid-structured data such as image processing and object classification. Recently, research on graph convolutional neural networks (GCNN) has increased dramatically as researchers try to replicate the success of CNNs for graph-structured data. Unfortunately, traditional CNN methods are not readily transferable to GCNNs, given the irregularity and geometric complexity of graphs. The emerging field of GCNN is further complicated by research papers that differ greatly in their scope, detail, and the level of academic sophistication needed by the reader. The present paper provides a review of some basic properties of GCNNs. As a guide to the interested reader, recent examples of GCNN research are then grouped according to techniques that attempt to uncover the underlying topology of the graph model and those that seek to generalize traditional CNN methods on graph data to improve prediction of class membership. Discrete Signal Processing on Graphs (DSPg) is used as a theoretical framework to better understand some of the performance gains and limitations of these recent GCNN approaches. A brief discussion of Topology Adaptive Graph Convolutional Networks (TAGCN) is presented as an approach motivated by DSPg, and future research directions using this approach are briefly discussed.
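
    As a concrete anchor for the DSPg view, the sketch below implements a K-hop polynomial graph filter of the kind TAGCN builds on, computing sum_k (A^k X) W_k; the normalization of A and the choice of K are left to the caller and are assumptions here.

        import torch
        import torch.nn as nn

        class PolyGraphFilter(nn.Module):
            """K-hop polynomial graph filter: out = sum_k (A^k X) W_k."""
            def __init__(self, in_dim, out_dim, K=2):
                super().__init__()
                self.weights = nn.ModuleList(
                    nn.Linear(in_dim, out_dim, bias=False) for _ in range(K + 1)
                )

            def forward(self, A, X):
                # A: (N, N) normalized adjacency (graph shift operator), X: (N, in_dim).
                out, AkX = 0, X
                for lin in self.weights:
                    out = out + lin(AkX)   # add the k-hop term (A^k X) W_k
                    AkX = A @ AkX          # advance to the next power of A
                return out

        A = torch.eye(5)                   # trivial 5-node graph for a shape check
        X = torch.randn(5, 8)
        print(PolyGraphFilter(8, 4)(A, X).shape)   # torch.Size([5, 4])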

    Graph Attribute Aggregation Network with Progressive Margin Folding

    Graph convolutional neural networks (GCNNs) have been attracting increasing research attention due to their great potential for inference over graph structures. However, insufficient effort has been devoted to the aggregation methods between different graph convolution layers. In this paper, we introduce a graph attribute aggregation network (GAAN) architecture. Different from conventional pooling operations, a graph-transformation-based aggregation strategy, progressive margin folding (PMF), is proposed for integrating graph features. By distinguishing internal and margin elements, we provide an approach for implementing the folding iteratively, and a mechanism is also devised for preserving local structures during progressive folding. In addition, a hypergraph-based representation is introduced for transferring the aggregated information between different layers. Our experiments on public molecule datasets demonstrate that the proposed GAAN significantly outperforms existing GCNN models.
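
    The abstract gives only the outline of PMF, so the following is a heavily hedged sketch of a single folding step: lowest-degree nodes stand in for the "margin" elements (a placeholder criterion) and their attributes are folded into an internal neighbour. The paper's actual margin definition, structure-preservation mechanism and hypergraph transfer are not reproduced.

        import numpy as np

        def fold_once(A, X):
            # A: (N, N) 0/1 adjacency, X: (N, d) node attributes.
            deg = A.sum(axis=1)
            margin = deg == deg[deg > 0].min()     # placeholder margin criterion
            keep = np.where(~margin)[0]
            X = X.copy()
            for v in np.where(margin)[0]:
                nbrs = np.where(A[v] > 0)[0]
                nbrs = nbrs[~margin[nbrs]]         # fold only into internal nodes
                if len(nbrs):
                    X[nbrs[0]] += X[v]             # aggregate the folded attributes
            return A[np.ix_(keep, keep)], X[keep]

        A = np.array([[0, 1, 0, 0], [1, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]])
        X = np.ones((4, 2))
        A2, X2 = fold_once(A, X)
        print(A2.shape, X2)   # (3, 3); an internal node absorbed the margin node's attributes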

    Representation Learning on Visual-Symbolic Graphs for Video Understanding

    Events in natural videos typically arise from spatio-temporal interactions between actors and objects and involve multiple co-occurring activities and object classes. To capture this rich visual and semantic context, we propose using two graphs: (1) an attributed spatio-temporal visual graph whose nodes correspond to actors and objects and whose edges encode different types of interactions, and (2) a symbolic graph that models semantic relationships. We further propose a graph neural network for refining the representations of actors, objects and their interactions on the resulting hybrid graph. Our model goes beyond current approaches that assume nodes and edges are of the same type, operate on graphs with fixed edge weights and do not use a symbolic graph. In particular, our framework: a) has specialized attention-based message functions for different node and edge types; b) uses visual edge features; c) integrates visual evidence with label relationships; and d) performs global reasoning in the semantic space. Experiments on challenging video understanding tasks, such as temporal action localization on the Charades dataset, show that the proposed method leads to state-of-the-art performance. Comment: ECCV 2020
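
    Point (a) above, type-specialized attention-based messages, can be sketched as follows. This is a minimal illustration assuming two edge types and a single attention head; it is not the authors' message-passing layer.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class TypedMessagePassing(nn.Module):
            def __init__(self, dim, edge_types=("actor-object", "object-object")):
                super().__init__()
                # One message and one attention function per edge type.
                self.msg = nn.ModuleDict({t: nn.Linear(2 * dim, dim) for t in edge_types})
                self.att = nn.ModuleDict({t: nn.Linear(2 * dim, 1) for t in edge_types})

            def forward(self, h, edges):
                # h: (N, dim) node features; edges: list of (src, dst, type) triples.
                agg = torch.zeros_like(h)
                for s, d, t in edges:
                    pair = torch.cat([h[s], h[d]])
                    a = torch.sigmoid(self.att[t](pair))       # attention weight
                    agg[d] = agg[d] + a * self.msg[t](pair)    # type-specific message
                return F.relu(h + agg)                         # residual node update

        h = torch.randn(3, 16)
        edges = [(0, 1, "actor-object"), (1, 2, "object-object")]
        print(TypedMessagePassing(16)(h, edges).shape)         # torch.Size([3, 16])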

    Resolution Adaptive Networks for Efficient Inference

    Adaptive inference is an effective mechanism to achieve a dynamic tradeoff between accuracy and computational cost in deep networks. Existing works mainly exploit architecture redundancy in network depth or width. In this paper, we focus on spatial redundancy of input samples and propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs containing large objects with prototypical features, while only some "hard" samples need spatially detailed information. In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations, and those samples with high prediction confidence will exit early from the network without being further processed. Meanwhile, high-resolution paths in the network maintain the capability to recognize the "hard" samples. Therefore, RANet can effectively reduce the spatial redundancy involved in inferring high-resolution inputs. Empirically, we demonstrate the effectiveness of the proposed RANet on the CIFAR-10, CIFAR-100 and ImageNet datasets in both the anytime prediction setting and the budgeted batch classification setting. Comment: CVPR 2020
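
    The early-exit routing can be illustrated with a short sketch: each example passes through progressively heavier sub-networks and exits as soon as a classifier head is confident. The stand-in sub-networks and the fixed confidence threshold below are assumptions, not RANet's actual multi-resolution architecture.

        import torch
        import torch.nn as nn

        def adaptive_predict(x, subnets, classifiers, threshold=0.9):
            """Run progressively heavier sub-networks; exit once a head is confident."""
            feats = None
            for net, head in zip(subnets, classifiers):
                feats = net(x if feats is None else feats)
                probs = head(feats.mean(dim=(2, 3))).softmax(dim=1)
                conf, pred = probs.max(dim=1)
                if conf.item() >= threshold:   # confident enough: exit early
                    return pred, conf
            return pred, conf                  # otherwise use the final exit

        # Tiny stand-in sub-networks for a single 32x32 image (batch size 1).
        subnets = [nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)]
        classifiers = [nn.Linear(8, 10), nn.Linear(8, 10)]
        x = torch.randn(1, 3, 32, 32)
        print(adaptive_predict(x, subnets, classifiers))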

    Exploring Visual Relationship for Image Captioning

    It is widely believed that modeling relationships between objects would be helpful for representing and eventually describing an image. Nevertheless, there has been no evidence supporting this idea in image description generation. In this paper, we introduce a new design to explore the connections between objects for image captioning under the umbrella of an attention-based encoder-decoder framework. Specifically, we present a Graph Convolutional Networks plus Long Short-Term Memory (dubbed GCN-LSTM) architecture that integrates both semantic and spatial object relationships into the image encoder. Technically, we build graphs over the detected objects in an image based on their spatial and semantic connections. The representations of each region proposed on objects are then refined by leveraging graph structure through GCN. With the learnt region-level features, our GCN-LSTM capitalizes on an LSTM-based captioning framework with an attention mechanism for sentence generation. Extensive experiments are conducted on the COCO image captioning dataset, and superior results are reported when comparing to state-of-the-art approaches. More remarkably, GCN-LSTM increases CIDEr-D performance from 120.1% to 128.7% on the COCO testing set. Comment: ECCV 2018
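
    The region-refinement step, a GCN over detected-object features ahead of the LSTM decoder, can be sketched as below. The row-normalized propagation rule and single relation type are simplifications; the decoder is omitted.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class RegionGCN(nn.Module):
            def __init__(self, dim):
                super().__init__()
                self.lin = nn.Linear(dim, dim)

            def forward(self, A, R):
                # A: (K, K) relationship graph over K detected regions, R: (K, dim) features.
                A_hat = A + torch.eye(A.size(0))            # add self-loops
                deg = A_hat.sum(dim=1, keepdim=True)
                return F.relu(self.lin((A_hat / deg) @ R))  # row-normalized propagation

        K, dim = 6, 2048                                    # e.g. region-detector features
        A = (torch.rand(K, K) > 0.5).float()
        R = torch.randn(K, dim)
        print(RegionGCN(dim)(A, R).shape)                   # torch.Size([6, 2048])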

    Edge-labeling Graph Neural Network for Few-shot Learning

    In this paper, we propose a novel edge-labeling graph neural network (EGNN), which adapts a deep neural network on an edge-labeling graph for few-shot learning. Previous graph neural network (GNN) approaches in few-shot learning have been based on the node-labeling framework, which implicitly models intra-cluster similarity and inter-cluster dissimilarity. In contrast, the proposed EGNN learns to predict edge-labels rather than node-labels on the graph, which enables the evolution of an explicit clustering by iteratively updating the edge-labels with direct exploitation of both intra-cluster similarity and inter-cluster dissimilarity. It is also well suited for performing on various numbers of classes without retraining, and can be easily extended to perform transductive inference. The parameters of the EGNN are learned by episodic training with an edge-labeling loss to obtain a well-generalizable model for unseen low-data problems. On both supervised and semi-supervised few-shot image classification tasks with two benchmark datasets, the proposed EGNN significantly improves performance over the existing GNNs. Comment: accepted to CVPR 2019
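
    The alternating edge/node updates can be pictured with a minimal sketch: edge scores are refreshed from pairwise node similarity, then nodes aggregate neighbours weighted by those scores. The update rules below are simplified placeholders for the paper's learned metric networks.

        import torch
        import torch.nn as nn

        class EdgeNodeUpdate(nn.Module):
            def __init__(self, dim):
                super().__init__()
                self.metric = nn.Linear(dim, dim)   # learned embedding for similarity
                self.node = nn.Linear(dim, dim)

            def forward(self, H, E):
                # H: (N, dim) node features; E: (N, N) edge "same-class" scores.
                Z = self.metric(H)
                sim = -torch.cdist(Z, Z)            # similarity as negative distance
                E = torch.sigmoid(sim) * E          # edge-label update
                E = E / E.sum(dim=1, keepdim=True)  # normalize over neighbours
                H = torch.relu(self.node(E @ H))    # edge-weighted node update
                return H, E

        H, E = torch.randn(5, 32), torch.ones(5, 5)
        H2, E2 = EdgeNodeUpdate(32)(H, E)
        print(H2.shape, E2.shape)                   # torch.Size([5, 32]) torch.Size([5, 5])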

    Looking back to lower-level information in few-shot learning

    Humans are capable of learning new concepts from small numbers of examples. In contrast, supervised deep learning models usually lack the ability to extract reliable predictive rules from limited data when attempting to classify new examples. This challenging scenario is commonly known as few-shot learning. Few-shot learning has garnered increased attention in recent years due to its significance for many real-world problems. Recently, new methods relying on meta-learning paradigms combined with graph-based structures, which model the relationship between examples, have shown promising results on a variety of few-shot classification tasks. However, existing work on few-shot learning focuses only on the feature embeddings produced by the last layer of the neural network. In this work, we propose the utilization of lower-level, supporting information, namely the feature embeddings of the hidden neural network layers, to improve classifier accuracy. Based on a graph-based meta-learning framework, we develop a method called Looking-Back, in which such lower-level information is used to construct additional graphs for label propagation in limited data settings. Our experiments on two popular few-shot learning datasets, miniImageNet and tieredImageNet, show that our method can utilize the lower-level information in the network to improve state-of-the-art classification performance. Comment: 13 pages, 2 figures; fixed typographic errors and added journal reference
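
    A minimal sketch of the underlying idea, label propagation run over one similarity graph per embedding level and fused by averaging, is shown below; the cosine-affinity graph construction and the fusion rule are assumptions, not the paper's exact procedure.

        import torch

        def propagate(W, Y, alpha=0.5, iters=10):
            """Label propagation: Y <- alpha * W_norm @ Y + (1 - alpha) * Y0."""
            W = W / W.sum(dim=1, keepdim=True)
            Y0, Yt = Y.clone(), Y.clone()
            for _ in range(iters):
                Yt = alpha * (W @ Yt) + (1 - alpha) * Y0
            return Yt

        def looking_back_scores(embeddings_per_layer, Y):
            # One cosine-similarity graph per layer's embeddings; propagate labels
            # on each graph and average the resulting label scores.
            scores = []
            for Z in embeddings_per_layer:
                Zn = torch.nn.functional.normalize(Z, dim=1)
                W = (Zn @ Zn.T).clamp(min=0)       # non-negative cosine affinities
                scores.append(propagate(W, Y))
            return torch.stack(scores).mean(dim=0)

        # 10 examples, 5 classes; embeddings from the last and one hidden layer.
        Y = torch.zeros(10, 5)
        Y[:5] = torch.eye(5)                       # 5 labelled support examples
        layers = [torch.randn(10, 64), torch.randn(10, 32)]
        print(looking_back_scores(layers, Y).shape)   # torch.Size([10, 5])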

    Adaptive Hierarchical Down-Sampling for Point Cloud Classification

    While several convolution-like operators have recently been proposed for extracting features out of point clouds, down-sampling an unordered point cloud in a deep neural network has not been rigorously studied. Existing methods down-sample the points regardless of their importance for the output. As a result, some important points in the point cloud may be removed, while less valuable points may be passed to the next layers. In contrast, adaptive down-sampling methods sample the points by taking into account the importance of each point, which varies based on the application, task and training data. In this paper, we propose a permutation-invariant learning-based adaptive down-sampling layer, called Critical Points Layer (CPL), which reduces the number of points in an unordered point cloud while retaining the important points. Unlike most graph-based point cloud down-sampling methods that use a k-NN search algorithm to find the neighbouring points, CPL is a global down-sampling method, rendering it computationally very efficient. The proposed layer can be used along with any graph-based point cloud convolution layer to form a convolutional neural network, dubbed CP-Net in this paper. We introduce a CP-Net for 3D object classification that achieves the best accuracy on the ModelNet40 dataset among point cloud-based methods, which validates the effectiveness of the CPL.
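
    A hedged sketch of a global, permutation-invariant down-sampling step in this spirit: each point is scored by how many feature channels it maximizes under global max-pooling, and the top-k points are kept. The scoring rule is an assumption inferred from the abstract, not necessarily the CPL's.

        import torch

        def critical_points_layer(feats, k):
            # feats: (N, C) per-point features of an unordered point cloud.
            winners = feats.argmax(dim=0)          # (C,) index of the max point per channel
            score = torch.bincount(winners, minlength=feats.size(0)).float()
            keep = score.topk(k).indices           # k most "critical" points
            return keep, feats[keep]

        feats = torch.randn(1024, 64)
        idx, kept = critical_points_layer(feats, 128)
        print(idx.shape, kept.shape)               # torch.Size([128]) torch.Size([128, 64])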

    Adaptive Neural Networks for Efficient Inference

    We present an approach to adaptively utilize deep neural networks in order to reduce the evaluation time on new examples without loss of accuracy. Rather than attempting to redesign or approximate existing networks, we propose two schemes that adaptively utilize networks. We first pose an adaptive network evaluation scheme, where we learn a system to adaptively choose the components of a deep network to be evaluated for each example. By allowing examples correctly classified using early layers of the system to exit, we avoid the computational time associated with full evaluation of the network. We extend this to learn a network selection system that adaptively selects the network to be evaluated for each example. We show that computational time can be dramatically reduced by exploiting the fact that many examples can be correctly classified using relatively efficient networks and that complex, computationally costly networks are only necessary for a small fraction of examples. We pose a global objective for learning an adaptive early-exit or network selection policy and solve it by reducing the policy learning problem to a layer-by-layer weighted binary classification problem. Empirically, these approaches yield dramatic reductions in computational cost, with up to a 2.8x speedup on state-of-the-art networks from the ImageNet image recognition challenge with minimal (<1%) loss of top-5 accuracy.
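
    The network selection scheme can be sketched as a two-model cascade. In the illustration below, a stand-in policy simply thresholds the efficient model's confidence rather than using the learned weighted binary classifiers described above, and routes only "hard" examples to the expensive network.

        import torch
        import torch.nn as nn

        class NetworkSelector(nn.Module):
            def __init__(self, cheap, expensive, threshold=0.85):
                super().__init__()
                self.cheap, self.expensive, self.t = cheap, expensive, threshold

            def forward(self, x):
                probs = self.cheap(x).softmax(dim=1)
                conf, pred = probs.max(dim=1)
                hard = conf < self.t                 # only "hard" examples go on
                if hard.any():
                    pred[hard] = self.expensive(x[hard]).argmax(dim=1)
                return pred, hard

        cheap = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
        expensive = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                                  nn.ReLU(), nn.Linear(256, 10))
        x = torch.randn(8, 3, 32, 32)
        pred, routed = NetworkSelector(cheap, expensive)(x)
        print(pred.shape, routed.sum().item())       # predictions and #examples routed on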