
    Generalized Shortest Path Kernel on Graphs

    We consider the problem of classifying graphs using graph kernels. We define a new graph kernel, called the generalized shortest path kernel, based on the number and length of shortest paths between nodes. For our example classification problem, we consider the task of classifying random graphs from two well-known families, by the number of clusters they contain. We verify empirically that the generalized shortest path kernel outperforms the original shortest path kernel on a number of datasets. We give a theoretical analysis for explaining our experimental results. In particular, we estimate distributions of the expected feature vectors for the shortest path kernel and the generalized shortest path kernel, and we show some evidence explaining why our graph kernel outperforms the shortest path kernel for our graph classification problem. Comment: Short version presented at Discovery Science 2015 in Banf
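
    The contrast between the two kernels can be illustrated with a small sketch. It assumes one plausible reading of the abstract: the shortest path kernel counts node pairs by shortest-path length, while the generalized variant counts node pairs by (length, number of shortest paths). The function names and the dot-product kernel below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, deque


def shortest_path_info(adj, source):
    """BFS from source: return (distance per node, number of shortest paths per node)."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is reached: inherit the path count of u.
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another predecessor on a shortest path to v.
                count[v] += count[u]
    return dist, count


def feature_vectors(adj):
    """Return (sp, gsp) feature counters for an undirected, unweighted graph.

    sp  maps length -> number of node pairs at that distance.
    gsp maps (length, number of shortest paths) -> number of node pairs.
    """
    sp, gsp = Counter(), Counter()
    nodes = sorted(adj)
    for s in nodes:
        dist, count = shortest_path_info(adj, s)
        for t in nodes:
            if s < t and t in dist:  # each unordered, connected pair once
                sp[dist[t]] += 1
                gsp[(dist[t], count[t])] += 1
    return sp, gsp


def kernel(f, g):
    """Dot product of two sparse feature counters."""
    return sum(value * g[key] for key, value in f.items())


# Hypothetical toy graphs: a 4-cycle and a 4-node path.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

sp_cycle, gsp_cycle = feature_vectors(cycle)
sp_path, gsp_path = feature_vectors(path)
print(kernel(sp_cycle, sp_path), kernel(gsp_cycle, gsp_path))
```

    In this toy case the generalized features separate the two graphs more sharply: opposite corners of the cycle are joined by two shortest paths, a feature the path graph never produces.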

    Learning what matters - Sampling interesting patterns

    In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, and thus user-specific anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods. Comment: PAKDD 2017, extended versio

    A sufficient condition for a number to be the order of a nonsingular derivation of a Lie algebra

    A study of the set N_p of positive integers which occur as orders of nonsingular derivations of finite-dimensional non-nilpotent Lie algebras of characteristic p>0 was initiated by Shalev and continued by the present author. The main goal of this paper is to show the abundance of elements of N_p. Our main result shows that any divisor n of q-1, where q is a power of p, such that $n \ge (p-1)^{1/p} (q-1)^{1-1/(2p)}$, belongs to N_p. This extends its special case for p=2 which was proved in a previous paper by a different method. Comment: 10 pages. This version has been revised according to a referee's suggestions. The additions include a discussion of the (lower) density of the set N_p, and the results of more extensive machine computations. Note that the title has also changed. To appear in Israel J. Mat

    The Computational Power of Optimization in Online Learning

    We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to $N$ experts in total $\widetilde{O}(\sqrt{N})$ computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is $\widetilde{\Theta}(N)$. These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size $N$ in time $O(\log N)$. We also study the implications of our results to learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best-response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is $\widetilde{\Theta}(\sqrt{N})$, yielding again a quadratic improvement upon the oracle-free setting, where $\widetilde{\Theta}(N)$ is known to be tight.

    Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates

    In this paper, we provide a novel construction of the linear-sized spectral sparsifiers of Batson, Spielman and Srivastava [BSS14]. While previous constructions required $\Omega(n^4)$ running time [BSS14, Zou12], our sparsification routine can be implemented in almost-quadratic running time $O(n^{2+\varepsilon})$. The fundamental conceptual novelty of our work is the leveraging of a strong connection between sparsification and a regret minimization problem over density matrices. This connection was known to provide an interpretation of the randomized sparsifiers of Spielman and Srivastava [SS11] via the application of matrix multiplicative weight updates (MWU) [CHS11, Vis14]. In this paper, we explain how matrix MWU naturally arises as an instance of the Follow-the-Regularized-Leader framework and generalize this approach to yield a larger class of updates. This new class allows us to accelerate the construction of linear-sized spectral sparsifiers, and give novel insights on the motivation behind Batson, Spielman and Srivastava [BSS14].

    The spatio-temporal distribution of lightning over Israel and the neighboring area and its relation to regional synoptic systems

    The spatio-temporal distribution of lightning flashes over Israel and the neighboring area and its relation to the regional synoptic systems has been studied, based on data obtained from the Israel Lightning Location System (ILLS) operated by the Israel Electric Corporation (IEC). The system detects cloud-to-ground lightning discharges in a range of ~500 km around central Israel (32.5° N, 35° E). The study period was defined for annual activity from August through July, for 5 seasons in the period 2004–2010. The spatial distribution of lightning flash density indicates the highest concentration over the Mediterranean Sea, attributed to the contribution of moisture as well as sensible and latent heat fluxes from the sea surface. Other centers of high density appear along the coastal plain, orographic barriers, especially in northern Israel, and downwind from the metropolitan area of Tel Aviv, Israel. The intra-annual distribution shows an absence of lightning during the summer months (JJA) due to the persistent subsidence over the region. The vast majority of lightning activity occurs during 7 months, October to April. Although over 65 % of the rainfall in Israel is obtained during the winter months (DJF), only 35 % of lightning flashes occur in these months. October is the richest month, with 40 % of total annual flashes. This is attributed both to tropical intrusions, i.e., Red Sea Troughs (RST), which are characterized by intense static instability and convection, and to Cyprus Lows (CLs) arriving from the west. Based on daily study of the spatial distribution of lightning, three patterns have been defined: "land", "maritime" and "hybrid". CLs cause high flash density over the Mediterranean Sea, whereas some of the RST days are typified by flashes over land. The "hybrid" pattern is a combination of the other two patterns. On CL days, only the maritime pattern was noted, whereas on RST days all three patterns were found, including the maritime pattern. It is suggested that atmospheric processes associated with RST produce the land pattern. Hence, the occurrence of a maritime pattern on days identified as RST reflects an "apparent RST". The hybrid pattern was associated with an RST located east of Israel. This synoptic type produced the typical flash maximum over the land, but the upper-level trough together with the onshore winds it induced over the eastern coast of the Mediterranean resulted in lightning activity over the sea as well, similar to that of CLs. It is suggested that the spatial distribution patterns of lightning may better identify the synoptic system responsible: a CL, an "active RST" or an "apparent RST". The electrical activity thus serves as a "fingerprint" for the synoptic situation responsible for its generation.

    Subgraphs and network motifs in geometric networks

    Many real-world networks describe systems in which interactions decay with the distance between nodes. Examples include systems constrained in real space such as transportation and communication networks, as well as systems constrained in abstract spaces such as multivariate biological or economic datasets and models of social networks. These networks often display network motifs: subgraphs that recur in the network much more often than in randomized networks. To understand the origin of the network motifs in these networks, it is important to study the subgraphs and network motifs that arise solely from geometric constraints. To address this, we analyze geometric network models, in which nodes are arranged on a lattice and edges are formed with a probability that decays with the distance between nodes. We present analytical solutions for the numbers of all 3- and 4-node subgraphs, in both directed and non-directed geometric networks. We also analyze geometric networks with arbitrary degree sequences, and models with a field that biases for directed edges in one direction. Scaling rules for subgraph numbers with system size, lattice dimension and interaction range are given. Several invariant measures are found, such as the ratio of feedback and feed-forward loops, which do not depend on system size, dimension or connectivity function. We find that network motifs in many real-world networks, including social networks and neuronal networks, are not captured solely by these geometric models. This is in line with recent evidence that biological network motifs were selected as basic circuit elements with defined information-processing functions. Comment: 9 pages, 6 figure
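
    The kind of model described here can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: a one-dimensional ring lattice, an exponentially decaying connection probability, and undirected edges are choices made here, and the names `geometric_network` and `count_three_node_subgraphs` are hypothetical. The abstract itself reports analytical results and does not specify a simulation.

```python
import math
import random
from itertools import combinations


def ring_distance(i, j, n):
    """Distance between sites i and j on a ring lattice with n sites."""
    d = abs(i - j)
    return min(d, n - d)


def geometric_network(n, interaction_range, rng):
    """Undirected graph on a ring; edge probability is exp(-distance / range)."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        p = math.exp(-ring_distance(i, j, n) / interaction_range)
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_three_node_subgraphs(adj):
    """Return (open_paths, triangles) among connected 3-node induced subgraphs."""
    open_paths = 0
    triangle_corners = 0
    for center, neighbors in adj.items():
        for a, b in combinations(sorted(neighbors), 2):
            if b in adj[a]:
                triangle_corners += 1  # each triangle is seen from all 3 corners
            else:
                open_paths += 1  # a - center - b with no edge a-b
    return open_paths, triangle_corners // 3


# Deterministic check graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_counts = count_three_node_subgraphs(toy)

# Random geometric network with a fixed seed for repeatability.
graph = geometric_network(n=200, interaction_range=2.0, rng=random.Random(0))
print(toy_counts, count_three_node_subgraphs(graph))
```

    Repeating the random run for several values of `n` or `interaction_range` gives an empirical view of how subgraph counts scale, which is the quantity the abstract treats analytically.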

    Contextual Object Detection with a Few Relevant Neighbors

    A natural way to improve the detection of objects is to consider the contextual constraints imposed by the detection of additional objects in a given scene. In this work, we exploit the spatial relations between objects in order to improve detection capacity, as well as analyze various properties of the contextual object detection problem. To precisely calculate context-based probabilities of objects, we developed a model that examines the interactions between objects in an exact probabilistic setting, in contrast to previous methods that typically utilize approximations based on pairwise interactions. Such a scheme is facilitated by the realistic assumption that the existence of an object in any given location is influenced by only a few informative locations in space. Based on this assumption, we suggest a method for identifying these relevant locations and integrating them into a mostly exact calculation of probability based on their raw detector responses. This scheme is shown to improve detection results and provides unique insights about the process of contextual inference for object detection. We show that it is generally difficult to learn that a particular object reduces the probability of another, and that in cases when the context and detector strongly disagree this learning becomes virtually impossible for the purposes of improving the results of an object detector. Finally, we demonstrate improved detection results through use of our approach as applied to the PASCAL VOC and COCO datasets.

    Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features

    The one-class support vector machine (OC-SVM) has long been one of the most effective anomaly detection methods and has been extensively adopted in both research and industrial applications. The biggest issue for OC-SVM remains its ability to operate on large and high-dimensional datasets, owing to optimization complexity. Those problems might be mitigated via dimensionality reduction techniques such as manifold learning or autoencoders. However, previous work often treats representation learning and anomaly prediction separately. In this paper, we propose an autoencoder-based one-class support vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier features to approximate the radial basis kernel, into the deep learning context by combining it with a representation learning architecture and jointly exploiting stochastic gradient descent to obtain end-to-end training. Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging as a result of the implicit mappings between the input space and the kernel space. To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection. We evaluate our method on a wide range of unsupervised anomaly detection tasks in which our end-to-end training architecture achieves a performance significantly better than the previous work using separate training. Comment: Accepted at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
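
    The random Fourier feature step mentioned in this abstract can be sketched on its own. The code below shows only the standard construction for approximating the radial basis kernel k(x, y) = exp(-gamma * ||x - y||^2) with an explicit feature map. It is not the AE-1SVM architecture, and `make_rff_map`, `gamma`, and the sample points are assumptions introduced for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact radial basis kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def make_rff_map(dim, n_features, gamma, rng):
    """Build a random Fourier feature map whose inner products approximate rbf_kernel.

    Frequencies are drawn from N(0, 2 * gamma) per coordinate and phases
    uniformly from [0, 2 * pi), which matches the Fourier transform of the kernel.
    """
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
            for w, b in zip(weights, phases)
        ]

    return feature_map


# Hypothetical sample points; the seed only makes the run repeatable.
gamma = 0.5
phi = make_rff_map(dim=3, n_features=5000, gamma=gamma, rng=random.Random(0))
x = [0.1, 0.2, 0.3]
y = [0.4, 0.0, -0.1]

approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = rbf_kernel(x, y, gamma)
print(f"approx={approx:.3f} exact={exact:.3f}")
```

    Because the map is explicit, a linear model on `phi(x)` stands in for the kernel machine, which is what makes gradient-based training and attribution through the feature map possible in principle.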