6,041 research outputs found
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
With recent breakthroughs in artificial neural networks, deep generative
models have become one of the leading techniques for computational creativity.
Despite very promising progress on image and short sequence generation,
symbolic music generation remains a challenging problem since the structure of
compositions are usually complicated. In this study, we attempt to solve the
melody generation problem constrained by the given chord progression. This
music meta-creation problem can also be incorporated into a plan recognition
system with user inputs and predictive structural outputs. In particular, we
explore the effect of explicit architectural encoding of musical structure via
comparing two sequential generative models: LSTM (a type of RNN) and WaveNet
(dilated temporal-CNN). As far as we know, this is the first study of applying
WaveNet to symbolic music generation, as well as the first systematic
comparison between temporal-CNN and RNN for music generation. We conduct a
survey for evaluation in our generations and implemented Variable Markov Oracle
in music pattern discovery. Experimental results show that to encode structure
more explicitly using a stack of dilated convolution layers improved the
performance significantly, and a global encoding of underlying chord
progression into the generation procedure gains even more.Comment: 8 pages, 13 figure
SOM-VAE: Interpretable Discrete Representation Learning on Time Series
High-dimensional time series are common in many domains. Since human
cognition is not optimized to work well in high-dimensional spaces, these areas
could benefit from interpretable low-dimensional representations. However, most
representation learning algorithms for time series data are difficult to
interpret. This is due to non-intuitive mappings from data features to salient
properties of the representation and non-smoothness over time. To address this
problem, we propose a new representation learning framework building on ideas
from interpretable discrete dimensionality reduction and deep generative
modeling. This framework allows us to learn discrete representations of time
series, which give rise to smooth and interpretable embeddings with superior
clustering performance. We introduce a new way to overcome the
non-differentiability in discrete representation learning and present a
gradient-based version of the traditional self-organizing map algorithm that is
more performant than the original. Furthermore, to allow for a probabilistic
interpretation of our method, we integrate a Markov model in the representation
space. This model uncovers the temporal transition structure, improves
clustering performance even further and provides additional explanatory
insights as well as a natural representation of uncertainty. We evaluate our
model in terms of clustering performance and interpretability on static
(Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST
images, a chaotic Lorenz attractor system with two macro states, as well as on
a challenging real world medical time series application on the eICU data set.
Our learned representations compare favorably with competitor methods and
facilitate downstream tasks on the real world data.Comment: Accepted for publication at the Seventh International Conference on
Learning Representations (ICLR 2019
Deep representation learning for human motion prediction and classification
Generative models of 3D human motion are often restricted to a small number
of activities and can therefore not generalize well to novel movements or
applications. In this work we propose a deep learning framework for human
motion capture data that learns a generic representation from a large corpus of
motion capture data and generalizes well to new, unseen, motions. Using an
encoding-decoding network that learns to predict future 3D poses from the most
recent past, we extract a feature representation of human motion. Most work on
deep learning for sequence prediction focuses on video and speech. Since
skeletal data has a different structure, we present and evaluate different
network architectures that make different assumptions about time dependencies
and limb correlations. To quantify the learned features, we use the output of
different layers for action classification and visualize the receptive fields
of the network units. Our method outperforms the recent state of the art in
skeletal motion prediction even though these use action specific training data.
Our results show that deep feedforward networks, trained from a generic mocap
database, can successfully be used for feature extraction from human motion
data and that this representation can be used as a foundation for
classification and prediction.Comment: This paper is published at the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 201
Activity driven modeling of time varying networks
Network modeling plays a critical role in identifying statistical
regularities and structural principles common to many systems. The large
majority of recent modeling approaches are connectivity driven. The structural
patterns of the network are at the basis of the mechanisms ruling the network
formation. Connectivity driven models necessarily provide a time-aggregated
representation that may fail to describe the instantaneous and fluctuating
dynamics of many networks. We address this challenge by defining the activity
potential, a time invariant function characterizing the agents' interactions
and constructing an activity driven model capable of encoding the instantaneous
time description of the network dynamics. The model provides an explanation of
structural features such as the presence of hubs, which simply originate from
the heterogeneous activity of agents. Within this framework, highly dynamical
networks can be described analytically, allowing a quantitative discussion of
the biases induced by the time-aggregated representations in the analysis of
dynamical processes.Comment: 10 pages, 4 figure
MULTIVARIATE MODELING OF COGNITIVE PERFORMANCE AND CATEGORICAL PERCEPTION FROM NEUROIMAGING DATA
State-of-the-art cognitive-neuroscience mainly uses hypothesis-driven statistical testing to characterize and model neural disorders and diseases. While such techniques have proven to be powerful in understanding diseases and disorders, they are inadequate in explaining causal relationships as well as individuality and variations. In this study, we proposed multivariate data-driven approaches for predictive modeling of cognitive events and disorders. We developed network descriptions of both structural and functional connectivities that are critical in multivariate modeling of cognitive performance (i.e., fluency, attention, and working memory) and categorical perceptions (i.e., emotion, speech perception). We also performed dynamic network analysis on brain connectivity measures to determine the role of different functional areas in relation to categorical perceptions and cognitive events. Our empirical studies of structural connectivity were performed using Diffusion Tensor Imaging (DTI). The main objective was to discover the role of structural connectivity in selecting clinically interpretable features that are consistent over a large range of model parameters in classifying cognitive performances in relation to Acute Lymphoblastic Leukemia (ALL). The proposed approach substantially improved accuracy (13% - 26%) over existing models and also selected a relevant, small subset of features that were verified by domain experts. In summary, the proposed approach produced interpretable models with better generalization.Functional connectivity is related to similar patterns of activation in different brain regions regardless of the apparent physical connectedness of the regions. The proposed data-driven approach to the source localized electroencephalogram (EEG) data includes an array of tools such as graph mining, feature selection, and multivariate analysis to determine the functional connectivity in categorical perceptions. We used the network description to correctly classify listeners behavioral responses with an accuracy over 92% on 35 participants. State-of-the-art network description of human brain assumes static connectivities. However, brain networks in relation to perception and cognition are complex and dynamic. Analysis of transient functional networks with spatiotemporal variations to understand cognitive functions remains challenging. One of the critical missing links is the lack of sophisticated methodologies in understanding dynamics neural activity patterns. We proposed a clustering-based complex dynamic network analysis on source localized EEG data to understand the commonality and differences in gender-specific emotion processing. Besides, we also adopted Bayesian nonparametric framework for segmentation neural activity with a finite number of microstates. This approach enabled us to find the default network and transient pattern of the underlying neural mechanism in relation to categorical perception. In summary, multivariate and dynamic network analysis methods developed in this dissertation to analyze structural and functional connectivities will have a far-reaching impact on computational neuroscience to identify meaningful changes in spatiotemporal brain activities
Persistent Homology of Attractors For Action Recognition
In this paper, we propose a novel framework for dynamical analysis of human
actions from 3D motion capture data using topological data analysis. We model
human actions using the topological features of the attractor of the dynamical
system. We reconstruct the phase-space of time series corresponding to actions
using time-delay embedding, and compute the persistent homology of the
phase-space reconstruction. In order to better represent the topological
properties of the phase-space, we incorporate the temporal adjacency
information when computing the homology groups. The persistence of these
homology groups encoded using persistence diagrams are used as features for the
actions. Our experiments with action recognition using these features
demonstrate that the proposed approach outperforms other baseline methods.Comment: 5 pages, Under review in International Conference on Image Processin
Image-based methods to investigate synchronization between time series relevant for plasma fusion diagnostics
Advanced time series analysis and causality detection techniques have been successfully applied to the assessment of synchronization experiments in tokamaks, such as Edge Localized Modes (ELMs) and sawtooth pacing. Lag synchronization is a typical strategy for fusion plasma instability control by pace-making techniques. The major difficulty, in evaluating the efficiency of the pacing methods, is the coexistence of the causal effects with the periodic or quasi-periodic nature of the plasma instabilities. In the present work, a set of methods based on the image representation of time series, are investigated as tools for evaluating the efficiency of the pace-making techniques. The main options rely on the Gramian Angular Field (GAF), the Markov Transition Field (MTF), previously used for time series classification, and the Chaos Game Representation (CGR), employed for the visualization of large collections of long time series. The paper proposes an original variation of the Markov Transition Matrix, defined for a couple of time series. Additionally, a recently proposed method, based on the mapping of time series as cross-visibility networks and their representation as images, is included in this study. The performances of the method are evaluated on synthetic data and applied to JET measurements
- …