19,420 research outputs found
Argument Component Classification for Classroom Discussions
This paper focuses on argument component classification for transcribed
spoken classroom discussions, with the goal of automatically classifying
student utterances into claims, evidence, and warrants. We show that an
existing method for argument component classification developed for another
educationally-oriented domain performs poorly on our dataset. We then show that
feature sets from prior work on argument mining for student essays and online
dialogues can be used to improve performance considerably. We also provide a
comparison between convolutional neural networks and recurrent neural networks
when trained under different conditions to classify argument components in
classroom discussions. While neural network models are not always able to
outperform a logistic regression model, we were able to gain some useful
insights: convolutional networks are more robust than recurrent networks both
at the character and at the word level, and specificity information can help
boost performance in multi-task training.
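As a toy illustration of this kind of pipeline, the sketch below trains a multiclass perceptron over bag-of-words features to label utterances as claim, evidence, or warrant. The data, features, and classifier are illustrative stand-ins, not the paper's actual feature sets, models, or dataset.

```python
# Toy argument component classifier (claim/evidence/warrant).
# All utterances and labels below are invented for illustration.
from collections import defaultdict

LABELS = ["claim", "evidence", "warrant"]

def featurize(utterance):
    """Unigram counts; prior work adds richer cues (e.g. specificity)."""
    feats = defaultdict(float)
    for tok in utterance.lower().split():
        feats[tok] += 1.0
    return feats

def score(weights, lab, feats):
    return sum(weights[lab][f] * v for f, v in feats.items())

def train_perceptron(data, epochs=10):
    """Standard multiclass perceptron updates over sparse features."""
    weights = {lab: defaultdict(float) for lab in LABELS}
    for _ in range(epochs):
        for text, gold in data:
            feats = featurize(text)
            pred = max(LABELS, key=lambda lab: score(weights, lab, feats))
            if pred != gold:
                for f, v in feats.items():
                    weights[gold][f] += v
                    weights[pred][f] -= v
    return weights

def predict(weights, text):
    feats = featurize(text)
    return max(LABELS, key=lambda lab: score(weights, lab, feats))

data = [
    ("i think the character was wrong", "claim"),
    ("on page twelve the text says he left", "evidence"),
    ("that shows he cared because he returned", "warrant"),
]
w = train_perceptron(data)
```

A logistic regression with the same features, as used as the baseline in the abstract, would replace the perceptron update with gradient steps on the cross-entropy loss.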
Seeing Convolution Through the Eyes of Finite Transformation Semigroup Theory: An Abstract Algebraic Interpretation of Convolutional Neural Networks
Researchers are actively trying to gain better insights into the
representational properties of convolutional neural networks for guiding better
network designs and for interpreting a network's computational nature. Gaining
such insights can be an arduous task due to the number of parameters in a
network and the complexity of a network's architecture. Current approaches of
neural network interpretation include Bayesian probabilistic interpretations
and information theoretic interpretations. In this study, we take a different
approach to studying convolutional neural networks by proposing an abstract
algebraic interpretation using finite transformation semigroup theory.
Specifically, convolutional layers are broken up and mapped to a finite space.
The state space of the proposed finite transformation semigroup is then defined
as a single element within the convolutional layer, with the acting elements
defined by surrounding state elements combined with convolution kernel
elements. Generators of the finite transformation semigroup are defined to
complete the interpretation. We leverage this approach to analyze the basic
properties of the resulting finite transformation semigroup to gain insights on
the representational properties of convolutional neural networks, including
insights into quantized network representation. Such a finite transformation
semigroup interpretation can also enable better understanding outside of the
confines of fixed lattice data structures, making it useful for handling data that
lie on irregular lattices. Furthermore, the proposed abstract algebraic
interpretation is shown to be viable for interpreting convolutional operations
within a variety of convolutional neural network architectures.
Comment: 9 pages
Topology and Prediction Focused Research on Graph Convolutional Neural Networks
Important advances have been made using convolutional neural network (CNN)
approaches to solve complicated problems in areas that rely on grid structured
data such as image processing and object classification. Recently, research on
graph convolutional neural networks (GCNN) has increased dramatically as
researchers try to replicate the success of CNN for graph structured data.
Unfortunately, traditional CNN methods are not readily transferable to GCNN,
given the irregularity and geometric complexity of graphs. The emerging field
of GCNN is further complicated by research papers that differ greatly in their
scope, detail, and level of academic sophistication needed by the reader.
The present paper provides a review of some basic properties of GCNN. As a
guide to the interested reader, recent examples of GCNN research are then
grouped according to techniques that attempt to uncover the underlying topology
of the graph model and those that seek to generalize traditional CNN methods on
graph data to improve prediction of class membership. Discrete Signal
Processing on Graphs (DSPg) is used as a theoretical framework to better
understand some of the performance gains and limitations of these recent GCNN
approaches. A brief discussion of Topology Adaptive Graph Convolutional
Networks (TAGCN) is presented as an approach motivated by DSPg and future
research directions using this approach are briefly discussed.
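The DSPg view underlying approaches like TAGCN treats graph convolution as a polynomial in the graph shift (adjacency) matrix applied to a node signal, y = Σₖ hₖ Aᵏ x. The sketch below applies such a filter to a tiny illustrative graph; the graph, signal, and filter taps are invented for demonstration.

```python
import numpy as np

# DSPg-style graph filter: y = sum_k h_k * A^k x, with A the adjacency
# ("shift") matrix of the graph. Illustrative example, not from the paper.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # path graph on 3 nodes
x = np.array([1.0, 0.0, 0.0])            # impulse signal at node 0
h = [0.5, 0.25, 0.125]                   # filter taps h_0, h_1, h_2

def graph_filter(A, x, taps):
    """Apply y = sum_k taps[k] * A^k x (with A^0 = I)."""
    y = np.zeros_like(x)
    Ak_x = x.copy()
    for hk in taps:
        y += hk * Ak_x
        Ak_x = A @ Ak_x   # advance to the next power of A applied to x
    return y

y = graph_filter(A, x, h)
```

Each tap hₖ weights information k hops away, which is how such filters generalize the local receptive field of a grid CNN to irregular graphs.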
Deep Learning on Operational Facility Data Related to Large-Scale Distributed Area Scientific Workflows
Distributed computing platforms provide a robust mechanism to perform
large-scale computations by splitting the task and data among multiple
locations, possibly located thousands of miles apart geographically. Although
such distribution of resources can lead to benefits, it also comes with its
associated problems such as rampant duplication of file transfers increasing
congestion, long job completion times, unexpected site crashing, suboptimal
data transfer rates, unpredictable reliability in a time range, and suboptimal
usage of storage elements. In addition, each sub-system becomes a potential
failure node that can trigger system wide disruptions. In this vision paper, we
outline our approach to leveraging Deep Learning algorithms to discover
solutions to unique problems that arise in a system with computational
infrastructure that is spread over a wide area. The presented vision, motivated
by a real scientific use case from Belle II experiments, is to develop
multilayer neural networks to tackle forecasting, anomaly detection and
optimization challenges in a complex and distributed data movement environment.
Through this vision based on Deep Learning principles, we aim to achieve
reduced congestion events, faster file transfer rates, and enhanced site
reliability.
Building effective deep neural network architectures one feature at a time
Successful training of convolutional neural networks is often associated with
sufficiently deep architectures composed of high amounts of features. These
networks typically rely on a variety of regularization and pruning techniques
to converge to less redundant states. We introduce a novel bottom-up approach
to expand representations in fixed-depth architectures. These architectures
start from just a single feature per layer and greedily increase the width of
individual layers to attain effective representational capacities needed for a
specific task. While network growth can rely on a family of metrics, we propose
a computationally efficient version based on feature time evolution and
demonstrate its potency in determining feature importance and a network's
effective capacity. We demonstrate how automatically expanded architectures
converge to similar topologies that benefit from fewer parameters or
improved accuracy and exhibit systematic correspondence in representational
complexity with the specified task. In contrast to conventional design patterns
with a typical monotonic increase in the amount of features with increased
depth, we observe that CNNs perform better when more learnable parameters are
concentrated in intermediate layers, with falloffs toward earlier and later layers.
Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding
Short-term demand forecasting models commonly combine convolutional and
recurrent layers to extract complex spatiotemporal patterns in data. Long-term
histories are also used to consider periodicity and seasonality patterns as
time series data. In this study, we propose an efficient architecture,
Temporal-Guided Network (TGNet), which utilizes graph networks and
temporal-guided embedding. Graph networks, rather than convolutional layers,
extract features that are invariant to permutations of adjacent regions.
Temporal-guided embedding explicitly learns temporal contexts from training
data and is substituted for the input of long-term histories from days/weeks
ago. TGNet learns an autoregressive model, conditioned on temporal contexts of
forecasting targets from temporal-guided embedding. Finally, our model achieves
competitive performance with other baselines on three real-world
spatiotemporal demand datasets, but the number of trainable parameters is about 20
times smaller than a state-of-the-art baseline. We also show that
temporal-guided embedding learns temporal contexts as intended and TGNet has
robust forecasting performance even in atypical event situations.
Comment: NeurIPS 2018 Workshop on Modeling and Decision-Making in the Spatiotemporal Domain
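The temporal-guided-embedding idea can be sketched as a lookup-and-concatenate step: instead of feeding long histories from days or weeks ago, the model looks up a learned vector for the forecasting target's temporal context and appends it to the recent-demand features. The embedding dimensions, context factors (hour of day, day of week), and random "learned" tables below are illustrative assumptions, not TGNet's actual configuration.

```python
import numpy as np

# Illustrative temporal-guided embedding: a learned vector per temporal
# context, concatenated with recent features in place of long histories.
rng = np.random.default_rng(0)
hour_emb = rng.normal(size=(24, 4))   # one 4-d vector per hour of day
dow_emb = rng.normal(size=(7, 4))     # one 4-d vector per day of week

def temporal_guided_input(recent_demand, hour, dow):
    """Concatenate recent features with the target time's context embedding."""
    context = np.concatenate([hour_emb[hour], dow_emb[dow]])
    return np.concatenate([recent_demand, context])

# Recent demand for one region over the last 3 intervals, target 18:00 Friday.
x = temporal_guided_input(np.array([5.0, 7.0, 6.0]), hour=18, dow=4)
```

Because the context vectors are learned from training data, periodicity is captured without storing or processing week-long input sequences, which is where the parameter savings come from.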
Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding
Recently, there has been increasing interest in and research on the interpretability
of machine learning models, for example how they transform and internally
represent EEG signals in Brain-Computer Interface (BCI) applications. This can
help to understand the limits of the model and how it may be improved, in
addition to possibly providing insight into the data itself. Schirrmeister et
al. (2017) have recently reported promising results for EEG decoding with deep
convolutional neural networks (ConvNets) trained in an end-to-end manner and,
with a causal visualization approach, showed that they learn to use spectral
amplitude changes in the input. In this study, we investigate how ConvNets
represent spectral features through the sequence of intermediate stages of the
network. We show higher sensitivity to EEG phase features at earlier stages and
higher sensitivity to EEG amplitude features at later stages. Intriguingly, we
observed a specialization of individual stages of the network to the classical
EEG frequency bands alpha, beta, and high gamma. Furthermore, we find first
evidence that particularly in the last convolutional layer, the network learns
to detect more complex oscillatory patterns beyond spectral phase and
amplitude, reminiscent of the representation of complex visual features in
later layers of ConvNets in computer vision tasks. Our findings thus provide
insights into how ConvNets hierarchically represent spectral EEG features in
their intermediate layers and suggest that ConvNets can exploit, and might help
to better understand, the compositional structure of EEG time series.
Comment: 6 pages, 7 figures, The 6th International Winter Conference on Brain-Computer Interface
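Phase versus amplitude sensitivity of the kind probed above can be measured by perturbing a signal in the frequency domain and comparing a layer's activations before and after. The sketch below builds the two perturbations on a synthetic oscillation: scrambling phases while keeping amplitudes, and rescaling amplitudes while keeping phases. This is a generic illustration of the idea, not the paper's exact procedure.

```python
import numpy as np

# Frequency-domain perturbations for probing phase vs amplitude sensitivity.
# Synthetic 10 Hz oscillation standing in for an EEG channel.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 256, endpoint=False)
signal = np.sin(2 * np.pi * 10 * t)

spectrum = np.fft.rfft(signal)
amps, phases = np.abs(spectrum), np.angle(spectrum)

# Phase perturbation: random phases, original amplitudes preserved.
phase_scrambled = np.fft.irfft(
    amps * np.exp(1j * rng.uniform(0, 2 * np.pi, amps.shape)), n=signal.size)

# Amplitude perturbation: original phases, amplitudes doubled.
amp_scaled = np.fft.irfft(2.0 * amps * np.exp(1j * phases), n=signal.size)
```

Feeding the original and each perturbed signal through the network and comparing activations layer by layer then indicates which stages respond mainly to phase and which mainly to amplitude.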
Object Classification using Ensemble of Local and Deep Features
In this paper we propose an ensemble of local and deep features for object
classification. We also compare and contrast effectiveness of feature
representation capability of various layers of convolutional neural network. We
demonstrate with extensive experiments for object classification that the
representation capability of features from deep networks can be complemented
with information captured from local features. We also find that features
from various deep convolutional networks encode distinctive characteristic
information. We establish that, as opposed to conventional practice,
intermediate layers of deep networks can augment the classification
capabilities of features obtained from fully connected layers.
Comment: Accepted for publication at Ninth International Conference on Advances in Pattern Recognition
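A common way to build such an ensemble representation is to L2-normalize each feature source so no single one dominates, then concatenate them before classification. The vectors below are illustrative placeholders for a local-descriptor encoding and activations from intermediate and fully connected CNN layers.

```python
import numpy as np

# Fuse local and deep features: normalize each source, then concatenate.
def l2_normalize(v, eps=1e-12):
    return v / (np.linalg.norm(v) + eps)

def ensemble_features(*feature_vectors):
    """Normalize each feature source, then stack into one vector."""
    return np.concatenate([l2_normalize(v) for v in feature_vectors])

local_feat = np.array([3.0, 4.0])      # stand-in: local descriptor encoding
deep_mid = np.array([1.0, 0.0, 0.0])   # stand-in: intermediate-layer activations
deep_fc = np.array([0.0, 2.0])         # stand-in: fully connected layer output

fused = ensemble_features(local_feat, deep_mid, deep_fc)
```

The fused vector would then feed a standard classifier, allowing the complementary information in each source to contribute on equal footing.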
Mapping Auto-context Decision Forests to Deep ConvNets for Semantic Segmentation
We consider the task of pixel-wise semantic segmentation given a small set of
labeled training images. Two of the most popular techniques to address
this task are Decision Forests (DF) and Neural Networks (NN). In this work, we
explore the relationship between two special forms of these techniques: stacked
DFs (namely Auto-context) and deep Convolutional Neural Networks (ConvNet). Our
main contribution is to show that Auto-context can be mapped to a deep ConvNet
with novel architecture, and thereby trained end-to-end. This mapping can be
used as an initialization of a deep ConvNet, enabling training even in the face
of very limited amounts of training data. We also demonstrate an approximate
mapping back from the refined ConvNet to a second stacked DF, with improved
performance over the original. We experimentally verify that these mappings
outperform stacked DFs for two different applications in computer vision and
biology: Kinect-based body part labeling from depth images, and somite
segmentation in microscopy images of developing zebrafish. Finally, we revisit
the core mapping from a Decision Tree (DT) to a NN, and show that it is also
possible to map a fuzzy DT, with sigmoidal split decisions, to a NN. This
addresses multiple limitations of the previous mapping, and yields new insights
into the popular Rectified Linear Unit (ReLU), and more recently proposed
concatenated ReLU (CReLU), activation functions
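The fuzzy-DT idea behind the mapping can be seen in a single split: the hard decision x > t becomes a sigmoidal gate, and the two leaf values are blended by that gate, so a tree node turns into a neuron. As the temperature shrinks, the soft split recovers the hard tree. Thresholds, leaf values, and the temperature below are illustrative.

```python
import math

# One fuzzy decision-tree split realized as a sigmoidal neuron.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_split(x, threshold, left_value, right_value, temperature=0.1):
    """Blend the two leaf values with a sigmoidal split decision."""
    gate = sigmoid((x - threshold) / temperature)  # ~0 -> left, ~1 -> right
    return (1.0 - gate) * left_value + gate * right_value

# Hard split for comparison: x > 2.0 yields 5.0, otherwise 1.0.
hard = lambda x: 5.0 if x > 2.0 else 1.0

soft_far = soft_split(3.0, 2.0, 1.0, 5.0)    # far from threshold: ~hard value
soft_near = soft_split(2.05, 2.0, 1.0, 5.0)  # near threshold: blended value
```

Because the soft split is differentiable everywhere, a whole stacked forest expressed this way can be refined end-to-end by gradient descent, which is what makes the DF-to-ConvNet mapping trainable.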
An In-Depth Analysis of Visual Tracking with Siamese Neural Networks
This survey presents a deep analysis of the learning and inference
capabilities in nine popular trackers. It is neither intended to study the
whole literature nor is it an attempt to review all kinds of neural networks
proposed for visual tracking. We focus instead on Siamese neural networks which
are a promising starting point for studying the challenging problem of
tracking. These networks efficiently integrate feature learning and temporal
matching and have so far shown state-of-the-art performance. In particular,
the branches of Siamese networks, the layers connecting these
branches, specific aspects of training and the embedding of these networks into
the tracker are highlighted. Quantitative results from existing papers are
compared, leading to the conclusion that the current evaluation methodology has
problems with the reproducibility and comparability of results. The paper
proposes a novel Lisp-like formalism for a better comparison of trackers. This
assumes a certain functional design and functional decomposition of trackers.
The paper tries to give foundation for tracker design by a formulation of the
problem based on the theory of machine learning and by the interpretation of a
tracker as a decision function. The work concludes with promising lines of
research and suggests future work.
Comment: submitted to IEEE TPAMI
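The matching step shared by many Siamese trackers can be sketched in one dimension: both template and search region pass through the same branch (here, the identity, standing in for a learned convolutional embedding), the template is cross-correlated over the search features, and the peak of the response map gives the predicted target location. All arrays are illustrative.

```python
import numpy as np

# Minimal Siamese-style matching: cross-correlate a template's features over
# a search region's features and take the response peak as the target position.
def response_map(template, search):
    """Scores for every valid placement of template within search."""
    t = template.shape[0]
    return np.array([
        float(np.sum(template * search[i:i + t]))
        for i in range(search.shape[0] - t + 1)
    ])

template = np.array([1.0, 2.0, 1.0])                      # target appearance
search = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])    # current frame region

scores = response_map(template, search)
peak = int(np.argmax(scores))   # offset of the best match in the search region
```

In a real tracker both inputs are 2-D feature maps from a shared learned branch, and the correlation is the final layer of the network, but the decision function has exactly this shape.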