50,892 research outputs found
Neural activity classification with machine learning models trained on interspike interval series data
The flow of information through the brain is reflected by the activity
patterns of neural cells. Indeed, these firing patterns are widely used as
input data to predictive models that relate stimuli and animal behavior to the
activity of a population of neurons. However, relatively little attention was
paid to single neuron spike trains as predictors of cell or network properties
in the brain. In this work, we introduce an approach to neuronal spike train
data mining which enables effective classification and clustering of neuron
types and network activity states based on single-cell spiking patterns. This
approach is centered around applying state-of-the-art time series
classification/clustering methods to sequences of interspike intervals recorded
from single neurons. We demonstrate good performance of these methods in tasks
involving classification of neuron type (e.g. excitatory vs. inhibitory cells)
and/or neural circuit activity state (e.g. awake vs. REM sleep vs. nonREM sleep
states) on an open-access cortical spiking activity dataset
Finding Statistically Significant Interactions between Continuous Features
The search for higher-order feature interactions that are statistically
significantly associated with a class variable is of high relevance in fields
such as Genetics or Healthcare, but the combinatorial explosion of the
candidate space makes this problem extremely challenging in terms of
computational efficiency and proper correction for multiple testing. While
recent progress has been made regarding this challenge for binary features, we
here present the first solution for continuous features. We propose an
algorithm which overcomes the combinatorial explosion of the search space of
higher-order interactions by deriving a lower bound on the p-value for each
interaction, which enables us to massively prune interactions that can never
reach significance and to thereby gain more statistical power. In our
experiments, our approach efficiently detects all significant interactions in a
variety of synthetic and real-world datasets.Comment: 13 pages, 5 figures, 2 tables, accepted to the 28th International
Joint Conference on Artificial Intelligence (IJCAI 2019
Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system has been originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
of our dataset is initially associated with a solubility degree and it is
represented as a sequence of symbols, denoting the 20 amino acid residues. The
herein obtained computational results, which we stress that have been achieved
with no context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification.Comment: 10 pages, 49 reference
Nonparametric Hierarchical Clustering of Functional Data
In this paper, we deal with the problem of curves clustering. We propose a
nonparametric method which partitions the curves into clusters and discretizes
the dimensions of the curve points into intervals. The cross-product of these
partitions forms a data-grid which is obtained using a Bayesian model selection
approach while making no assumptions regarding the curves. Finally, a
post-processing technique, aiming at reducing the number of clusters in order
to improve the interpretability of the clustering, is proposed. It consists in
optimally merging the clusters step by step, which corresponds to an
agglomerative hierarchical classification whose dissimilarity measure is the
variation of the criterion. Interestingly this measure is none other than the
sum of the Kullback-Leibler divergences between clusters distributions before
and after the merges. The practical interest of the approach for functional
data exploratory analysis is presented and compared with an alternative
approach on an artificial and a real world data set
Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams
Studying IP traffic is crucial for many applications. We focus here on the
detection of (structurally and temporally) dense sequences of interactions,
that may indicate botnets or coordinated network scans. More precisely, we
model a MAWI capture of IP traffic as a link streams, i.e. a sequence of
interactions meaning that devices and exchanged
packets from time to time . This traffic is captured on a single
router and so has a bipartite structure: links occur only between nodes in two
disjoint sets. We design a method for finding interesting bipartite cliques in
such link streams, i.e. two sets of nodes and a time interval such that all
nodes in the first set are linked to all nodes in the second set throughout the
time interval. We then explore the bipartite cliques present in the considered
trace. Comparison with the MAWILab classification of anomalous IP addresses
shows that the found cliques succeed in detecting anomalous network activity
Inferring Social Status and Rich Club Effects in Enterprise Communication Networks
Social status, defined as the relative rank or position that an individual
holds in a social hierarchy, is known to be among the most important motivating
forces in social behaviors. In this paper, we consider the notion of status
from the perspective of a position or title held by a person in an enterprise.
We study the intersection of social status and social networks in an
enterprise. We study whether enterprise communication logs can help reveal how
social interactions and individual status manifest themselves in social
networks. To that end, we use two enterprise datasets with three communication
channels --- voice call, short message, and email --- to demonstrate the
social-behavioral differences among individuals with different status. We have
several interesting findings and based on these findings we also develop a
model to predict social status. On the individual level, high-status
individuals are more likely to be spanned as structural holes by linking to
people in parts of the enterprise networks that are otherwise not well
connected to one another. On the community level, the principle of homophily,
social balance and clique theory generally indicate a "rich club" maintained by
high-status individuals, in the sense that this community is much more
connected, balanced and dense. Our model can predict social status of
individuals with 93% accuracy.Comment: 13 pages, 4 figure
- âŠ