Generalized Shortest Path Kernel on Graphs
We consider the problem of classifying graphs using graph kernels. We define
a new graph kernel, called the generalized shortest path kernel, based on the
number and length of shortest paths between nodes. For our example
classification problem, we consider the task of classifying random graphs from
two well-known families, by the number of clusters they contain. We verify
empirically that the generalized shortest path kernel outperforms the original
shortest path kernel on a number of datasets. We give a theoretical analysis
for explaining our experimental results. In particular, we estimate
distributions of the expected feature vectors for the shortest path kernel and
the generalized shortest path kernel, and we show some evidence explaining why
our graph kernel outperforms the shortest path kernel for our graph
classification problem.
Comment: Short version presented at Discovery Science 2015 in Banff
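The difference between the two kernels can be illustrated with a minimal sketch. The abstract only states that the generalized kernel is based on the number and length of shortest paths; the feature definitions below (pair counts indexed by length, versus pair counts indexed by length and number of shortest paths) and all function names are assumptions made for illustration, not the authors' exact formulation.

```python
from collections import Counter, deque


def shortest_path_counts(adj, source):
    """BFS from source: map each reachable node to (distance, number of shortest paths)."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is reached: inherit the path count of u.
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another shortest route into v through u.
                count[v] += count[u]
    return {v: (dist[v], count[v]) for v in dist if v != source}


def feature_vectors(adj):
    """Return (sp, gsp) feature counters for an undirected graph.

    sp  counts node pairs by shortest path length (assumed SP kernel features).
    gsp counts node pairs by (length, number of shortest paths) (assumed GSP features).
    """
    sp, gsp = Counter(), Counter()
    for s in adj:
        for t, (d, c) in shortest_path_counts(adj, s).items():
            if s < t:  # count each unordered pair once
                sp[d] += 1
                gsp[(d, c)] += 1
    return sp, gsp


def kernel(f, g):
    """Inner product of two sparse feature vectors."""
    return sum(f[k] * g[k] for k in f if k in g)


# A 4-cycle and a triangle with a pendant node: same distance profile,
# but the 4-cycle has two shortest paths between opposite nodes.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
paw = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

sp_cycle, gsp_cycle = feature_vectors(cycle)
sp_paw, gsp_paw = feature_vectors(paw)
print(kernel(sp_cycle, sp_paw), kernel(gsp_cycle, gsp_paw))  # 20 16
```

In this toy pair the length-only features are identical for both graphs, so only the features that also record the number of shortest paths separate them.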
Learning what matters - Sampling interesting patterns
In the field of exploratory data mining, local structure in data can be
described by patterns and discovered by mining algorithms. Although many
solutions have been proposed to address the redundancy problems in pattern
mining, most of them either provide succinct pattern sets or take the interests
of the user into account, but not both. Consequently, the analyst has to invest
substantial effort in identifying those patterns that are relevant to her
specific interests and goals. To address this problem, we propose a novel
approach that combines pattern sampling with interactive data mining. In
particular, we introduce the LetSIP algorithm, which builds upon recent
advances in 1) weighted sampling in SAT and 2) learning to rank in interactive
pattern mining. Specifically, it exploits user feedback to directly learn the
parameters of the sampling distribution that represents the user's interests.
We compare the performance of the proposed algorithm to the state-of-the-art in
interactive pattern mining by emulating the interests of a user. The resulting
system allows efficient and interleaved learning and sampling, thus enabling
user-specific anytime data exploration. Finally, LetSIP demonstrates favourable
trade-offs concerning both quality-diversity and exploitation-exploration when
compared to existing methods.
Comment: PAKDD 2017, extended version
A sufficient condition for a number to be the order of a nonsingular derivation of a Lie algebra
A study of the set N_p of positive integers which occur as orders of
nonsingular derivations of finite-dimensional non-nilpotent Lie algebras of
characteristic p>0 was initiated by Shalev and continued by the present author.
The main goal of this paper is to show the abundance of elements of N_p. Our
main result shows that any divisor n of q-1, where q is a power of p, such that
, belongs to N_p. This extends its special
case for p=2, which was proved in a previous paper by a different method.
Comment: 10 pages. This version has been revised according to a referee's
suggestions. The additions include a discussion of the (lower) density of the
set N_p, and the results of more extensive machine computations. Note that
the title has also changed. To appear in Israel J. Math.
The Computational Power of Optimization in Online Learning
We consider the fundamental problem of prediction with expert advice where
the experts are "optimizable": there is a black-box optimization oracle that
can be used to compute, in constant time, the leading expert in retrospect at
any point in time. In this setting, we give a novel online algorithm that
attains vanishing regret with respect to experts in total
computation time. We also give a lower bound showing
that this running time cannot be improved (up to log factors) in the oracle
model, thereby exhibiting a quadratic speedup as compared to the standard,
oracle-free setting where the required time for vanishing regret is
. These results demonstrate an exponential gap between
the power of optimization in online learning and its power in statistical
learning: in the latter, an optimization oracle, i.e., an efficient empirical
risk minimizer, allows one to learn a finite hypothesis class of size in time
. We also study the implications of our results to learning in
repeated zero-sum games, in a setting where the players have access to oracles
that compute, in constant time, their best-response to any mixed strategy of
their opponent. We show that the runtime required for approximating the minimax
value of the game in this setting is , yielding
again a quadratic improvement upon the oracle-free setting, where
is known to be tight
Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates
In this paper, we provide a novel construction of the linear-sized spectral
sparsifiers of Batson, Spielman and Srivastava [BSS14]. While previous
constructions required running time [BSS14, Zou12], our
sparsification routine can be implemented in almost-quadratic running time
.
The fundamental conceptual novelty of our work is the leveraging of a strong
connection between sparsification and a regret minimization problem over
density matrices. This connection was known to provide an interpretation of the
randomized sparsifiers of Spielman and Srivastava [SS11] via the application of
matrix multiplicative weight updates (MWU) [CHS11, Vis14]. In this paper, we
explain how matrix MWU naturally arises as an instance of the
Follow-the-Regularized-Leader framework and generalize this approach to yield a
larger class of updates. This new class allows us to accelerate the
construction of linear-sized spectral sparsifiers, and give novel insights on
the motivation behind Batson, Spielman and Srivastava [BSS14].
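For readers unfamiliar with the connection mentioned above, the standard way matrix MWU appears as Follow-the-Regularized-Leader with an entropy regularizer over density matrices can be written as follows. The symbols $F_s$ (loss matrices), $\eta$ (step size) and $X_t$ (the density matrix played at round $t$) are notation assumed here for illustration; they do not come from the abstract.

```latex
X_t \;=\; \arg\min_{X \succeq 0,\ \operatorname{Tr} X = 1}
  \Big\{ \eta \sum_{s<t} \langle F_s, X \rangle + \operatorname{Tr}(X \log X) \Big\}
\;=\; \frac{\exp\!\big(-\eta \sum_{s<t} F_s\big)}
           {\operatorname{Tr} \exp\!\big(-\eta \sum_{s<t} F_s\big)} .
```

Replacing the entropy term with a different regularizer yields a different update rule, which is the kind of generalization the abstract refers to.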
The spatio-temporal distribution of lightning over Israel and the neighboring area and its relation to regional synoptic systems
The spatio-temporal distribution of lightning flashes over Israel and the neighboring area and its relation to the regional synoptic systems have been studied, based on data obtained from the Israel Lightning Location System (ILLS) operated by the Israel Electric Corporation (IEC). The system detects cloud-to-ground lightning discharges in a range of ~500 km around central Israel (32.5° N, 35° E). The study period was defined for annual activity from August through July, for 5 seasons in the period 2004–2010.
The spatial distribution of lightning flash density indicates the highest concentration over the Mediterranean Sea, attributed to the contribution of moisture as well as sensible and latent heat fluxes from the sea surface. Other centers of high density appear along the coastal plain, orographic barriers, especially in northern Israel, and downwind from the metropolitan area of Tel Aviv, Israel. The intra-annual distribution shows an absence of lightning during the summer months (JJA) due to the persistent subsidence over the region. The vast majority of lightning activity occurs during 7 months, October to April. Although over 65 % of the rainfall in Israel is obtained during the winter months (DJF), only 35 % of lightning flashes occur in these months. October is the richest month, with 40 % of total annual flashes. This is attributed both to tropical intrusions, i.e., Red Sea Troughs (RST), which are characterized by intense static instability and convection, and to Cyprus Lows (CLs) arriving from the west.
Based on daily study of the spatial distribution of lightning, three patterns have been defined: "land", "maritime" and "hybrid". CLs cause high flash density over the Mediterranean Sea, whereas some of the RST days are typified by flashes over land. The pattern defined as "hybrid" is a combination of the other 2 patterns. On CL days, only the maritime pattern was noted, whereas on RST days all 3 patterns were found, including the maritime pattern. It is suggested that atmospheric processes associated with RST produce the land pattern. Hence, the occurrence of a maritime pattern on days identified as RST reflects an "apparent RST". The hybrid pattern was associated with an RST located east of Israel. This synoptic type produced the typical flash maximum over the land, but the upper-level trough together with the onshore winds it induced over the eastern coast of the Mediterranean resulted in lightning activity over the sea as well, similar to that of CLs.
It is suggested that the spatial distribution patterns of lightning may better identify the synoptic system responsible: a CL, an "active RST" or an "apparent RST". The electrical activity thus serves as a "fingerprint" for the synoptic situation responsible for its generation.
Subgraphs and network motifs in geometric networks
Many real-world networks describe systems in which interactions decay with
the distance between nodes. Examples include systems constrained in real space
such as transportation and communication networks, as well as systems
constrained in abstract spaces such as multivariate biological or economic
datasets and models of social networks. These networks often display network
motifs: subgraphs that recur in the network much more often than in randomized
networks. To understand the origin of the network motifs in these networks, it
is important to study the subgraphs and network motifs that arise solely from
geometric constraints. To address this, we analyze geometric network models, in
which nodes are arranged on a lattice and edges are formed with a probability
that decays with the distance between nodes. We present analytical solutions
for the numbers of all 3- and 4-node subgraphs, in both directed and
non-directed geometric networks. We also analyze geometric networks with
arbitrary degree sequences, and models with a field that biases for directed
edges in one direction. Scaling rules for subgraph numbers with
system size, lattice dimension and interaction range are given. Several
invariant measures are found, such as the ratio of feedback and feed-forward
loops, which do not depend on system size, dimension or connectivity function.
We find that network motifs in many real-world networks, including social
networks and neuronal networks, are not captured solely by these geometric
models. This is in line with recent evidence that biological network motifs
were selected as basic circuit elements with defined information-processing
functions.
Comment: 9 pages, 6 figures
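The kind of model described in this abstract can be simulated with a short sketch. The one-dimensional periodic lattice, the exponential connectivity function exp(-d/R), and every function name below are assumptions chosen for illustration; the abstract does not specify them. The counters report non-induced occurrences of two 3-node directed subgraphs, the feedback loop and the feed-forward loop.

```python
import math
import random


def geometric_network(n, interaction_range, rng):
    """Directed edges on a ring lattice of n nodes.

    Edge i -> j is drawn independently with probability exp(-d / interaction_range),
    where d is the periodic lattice distance (an assumed connectivity function).
    """
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = min(abs(i - j), n - abs(i - j))
            if rng.random() < math.exp(-d / interaction_range):
                edges.add((i, j))
    return edges


def count_feedback_loops(edges, n):
    """Count directed 3-cycles a -> b -> c -> a."""
    total = 0
    for a, b in edges:
        for c in range(n):
            if c != a and c != b and (b, c) in edges and (c, a) in edges:
                total += 1
    return total // 3  # each cycle is found once per starting edge


def count_feedforward_loops(edges, n):
    """Count triples with a -> b, b -> c and a -> c."""
    total = 0
    for a, b in edges:
        for c in range(n):
            if c != a and c != b and (b, c) in edges and (a, c) in edges:
                total += 1
    return total


rng = random.Random(0)  # fixed seed so the run is repeatable
edges = geometric_network(n=40, interaction_range=2.0, rng=rng)
fbl = count_feedback_loops(edges, 40)
ffl = count_feedforward_loops(edges, 40)
print(len(edges), fbl, ffl)
```

Repeating the run for several values of `n` and `interaction_range` gives an empirical view of how such subgraph counts scale, which is the kind of quantity the abstract treats analytically.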
Contextual Object Detection with a Few Relevant Neighbors
A natural way to improve the detection of objects is to consider the
contextual constraints imposed by the detection of additional objects in a
given scene. In this work, we exploit the spatial relations between objects in
order to improve detection capacity, as well as analyze various properties of
the contextual object detection problem. To precisely calculate context-based
probabilities of objects, we developed a model that examines the interactions
between objects in an exact probabilistic setting, in contrast to previous
methods that typically utilize approximations based on pairwise interactions.
Such a scheme is facilitated by the realistic assumption that the existence of
an object in any given location is influenced by only a few informative locations
in space. Based on this assumption, we suggest a method for identifying these
relevant locations and integrating them into a mostly exact calculation of
probability based on their raw detector responses. This scheme is shown to
improve detection results and provides unique insights about the process of
contextual inference for object detection. We show that it is generally
difficult to learn that a particular object reduces the probability of another,
and that in cases when the context and detector strongly disagree this learning
becomes virtually impossible for the purposes of improving the results of an
object detector. Finally, we demonstrate improved detection results through use
of our approach as applied to the PASCAL VOC and COCO datasets.
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machine (OC-SVM) has long been one of the
most effective anomaly detection methods and has been extensively adopted in both
research and industrial applications. The biggest remaining issue for OC-SVM is
its limited ability to operate on large and high-dimensional datasets due to
optimization complexity. These problems might be mitigated via dimensionality
reduction techniques such as manifold learning or autoencoders. However,
previous work often treats representation learning and anomaly prediction
separately. In this paper, we propose an autoencoder-based one-class support
vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier
features to approximate the radial basis kernel, into the deep learning context by
combining it with a representation learning architecture and jointly exploiting
stochastic gradient descent to obtain end-to-end training. Interestingly, this
also opens up the possible use of gradient-based attribution methods to explain
the decision making for anomaly detection, which has long been challenging as a
result of the implicit mappings between the input space and the kernel space.
To the best of our knowledge, this is the first work to study the
interpretability of deep learning in anomaly detection. We evaluate our method
on a wide range of unsupervised anomaly detection tasks in which our end-to-end
training architecture achieves a performance significantly better than the
previous work using separate training.
Comment: Accepted at European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
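The random Fourier feature approximation of the radial basis kernel mentioned in this abstract can be sketched in a few lines. This is a generic standard-library illustration of the technique, not the AE-1SVM implementation; the kernel parameterization exp(-gamma * ||x - y||^2) and all names below are assumptions.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact radial basis kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def make_rff(dim, n_features, gamma, rng):
    """Build an explicit feature map z with z(x) . z(y) approximating the RBF kernel.

    Frequencies are drawn from N(0, 2 * gamma * I) and phases from U[0, 2 * pi].
    """
    sigma = math.sqrt(2.0 * gamma)
    ws = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)
        ]

    return transform


rng = random.Random(0)  # fixed seed so the run is repeatable
gamma = 0.5
transform = make_rff(dim=2, n_features=5000, gamma=gamma, rng=rng)

x, y = [0.1, 0.2], [0.4, -0.3]
exact = rbf_kernel(x, y, gamma)
approx = sum(a * b for a, b in zip(transform(x), transform(y)))
print(exact, approx)
```

Because the feature map is explicit, a linear model on top of `transform(x)` can be trained with stochastic gradient descent, which is what makes the kernel compatible with end-to-end training of the kind the abstract describes.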