34,713 research outputs found

    Learning Two-layer Neural Networks with Symmetric Inputs

    We give a new algorithm for learning a two-layer neural network under a general class of input distributions. Assuming there is a ground-truth two-layer network y = Aσ(Wx) + ξ, where A, W are weight matrices, ξ represents noise, and the number of neurons in the hidden layer is no larger than the input or output, our algorithm is guaranteed to recover the parameters A, W of the ground-truth network. The only requirement on the input x is that it is symmetric, which still allows highly complicated and structured input. Our algorithm is based on the method-of-moments framework and extends several results in tensor decompositions. We use spectral algorithms to avoid the complicated non-convex optimization in learning neural networks. Experiments show that our algorithm can robustly learn the ground-truth neural network with a small number of samples for many symmetric input distributions.
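    To make the setup concrete, here is a minimal sketch (not the paper's algorithm) of the ground-truth model y = Aσ(Wx) + ξ and of a symmetric input distribution, i.e. one where x and -x are equally likely. The activation (ReLU), noise model (Gaussian), and all dimensions are illustrative assumptions; the mixture-of-Gaussians input is just one example of a structured distribution that a sign flip makes symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, k, d_out = 10, 5, 10   # hidden width k <= input/output dims, per the abstract
W = rng.standard_normal((k, d_in))
A = rng.standard_normal((d_out, k))

def sample_symmetric_x(n):
    """Draw x from a symmetric distribution: x and -x are equally likely.

    A Gaussian mixture (structured, multi-modal) symmetrized by a random
    sign flip -- one example of a complicated but symmetric input.
    """
    centers = rng.standard_normal((3, d_in))
    idx = rng.integers(0, 3, size=n)
    x = centers[idx] + 0.3 * rng.standard_normal((n, d_in))
    signs = rng.choice([-1.0, 1.0], size=(n, 1))  # enforce x ~ -x symmetry
    return signs * x

def ground_truth(x, noise_std=0.01):
    """y = A sigma(W x) + xi, with sigma = ReLU (assumed) and Gaussian noise xi."""
    hidden = np.maximum(W @ x.T, 0.0)             # sigma(Wx)
    y = (A @ hidden).T
    return y + noise_std * rng.standard_normal(y.shape)

X = sample_symmetric_x(1000)
Y = ground_truth(X)
print(X.shape, Y.shape)  # (1000, 10) (1000, 10)
```

    The paper's contribution is recovering A and W from such (x, y) samples via method-of-moments and spectral steps rather than gradient-based non-convex optimization; the sketch above only defines the data-generating process those moments are computed from.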

    Convolutional Neural Networks over Tree Structures for Programming Language Processing

    Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, unlike a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.
    Comment: Accepted at AAAI-16
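    The core idea is a convolution window slid over an AST: each window covers a node and its children, and the window outputs are max-pooled into a fixed-size program vector. The sketch below is a simplified illustrative variant, not the paper's implementation: TBCNN's actual kernel uses "continuous binary tree" position-dependent weights, whereas here a single parent matrix and an averaged child matrix stand in for it; EMB_DIM, W_parent, W_child, and the random node-type embeddings are all hypothetical.

```python
import ast
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, CONV_DIM = 8, 16

def node_embedding(node, table={}):
    """Toy embedding lookup: one random vector per AST node type (hypothetical)."""
    key = type(node).__name__
    if key not in table:
        table[key] = rng.standard_normal(EMB_DIM)
    return table[key]

# One tree-convolution kernel: separate weights for the window's parent node
# and its children (an illustrative simplification of TBCNN's kernel).
W_parent = rng.standard_normal((CONV_DIM, EMB_DIM))
W_child = rng.standard_normal((CONV_DIM, EMB_DIM))
b = rng.standard_normal(CONV_DIM)

def tree_conv(node):
    """Return one feature vector per window (node + direct children) in the subtree."""
    children = list(ast.iter_child_nodes(node))
    h = W_parent @ node_embedding(node)
    for c in children:
        h = h + (W_child @ node_embedding(c)) / len(children)  # average child term
    feats = [np.tanh(h + b)]
    for c in children:
        feats.extend(tree_conv(c))  # slide the window over the whole tree
    return feats

tree = ast.parse("def f(x):\n    return x * x\n")
features = tree_conv(tree)
pooled = np.max(features, axis=0)   # dynamic max pooling over all windows
print(len(features), pooled.shape)  # number of windows, and a (16,) program vector
```

    The pooled vector would then feed a classifier for tasks like functionality classification; parsing here uses Python's ast module purely for a self-contained demo, while the paper evaluates on other source languages.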