2,317 research outputs found
A generative power-law search tree model
It is now a well-established fact that search algorithms can exhibit heavy-tailed behavior. However, the reasons behind this fact are not well understood. We provide a generative search tree model whose distribution of the number of nodes visited during search is formally heavy-tailed. Our model allows us to generate search trees with any degree of heavy-tailedness. We also show how the different regimes observed for the runtime distributions of backtrack search methods across different constrainedness regions of random CSP models can be captured by a mixture of the so-called stable distributions.info:eu-repo/semantics/publishedVersio
Learning the Structure for Structured Sparsity
Structured sparsity has recently emerged in statistics, machine learning and
signal processing as a promising paradigm for learning in high-dimensional
settings. All existing methods for learning under the assumption of structured
sparsity rely on prior knowledge on how to weight (or how to penalize)
individual subsets of variables during the subset selection process, which is
not available in general. Inferring group weights from data is a key open
research problem in structured sparsity.In this paper, we propose a Bayesian
approach to the problem of group weight learning. We model the group weights as
hyperparameters of heavy-tailed priors on groups of variables and derive an
approximate inference scheme to infer these hyperparameters. We empirically
show that we are able to recover the model hyperparameters when the data are
generated from the model, and we demonstrate the utility of learning weights in
synthetic and real denoising problems
Stochastic network formation and homophily
This is a chapter of the forthcoming Oxford Handbook on the Economics of
Networks
ASlib: A Benchmark Library for Algorithm Selection
The task of algorithm selection involves choosing an algorithm from a set of
algorithms on a per-instance basis in order to exploit the varying performance
of algorithms over a set of instances. The algorithm selection problem is
attracting increasing attention from researchers and practitioners in AI. Years
of fruitful applications in a number of domains have resulted in a large amount
of data, but the community lacks a standard format or repository for this data.
This situation makes it difficult to share and compare different approaches
effectively, as is done in other, more established fields. It also
unnecessarily hinders new researchers who want to work in this area. To address
this problem, we introduce a standardized format for representing algorithm
selection scenarios and a repository that contains a growing number of data
sets from the literature. Our format has been designed to be able to express a
wide variety of different scenarios. Demonstrating the breadth and power of our
platform, we describe a set of example experiments that build and evaluate
algorithm selection models through a common interface. The results display the
potential of algorithm selection to achieve significant performance
improvements across a broad range of problems and algorithms.Comment: Accepted to be published in Artificial Intelligence Journa
Solving constraint-satisfaction problems with distributed neocortical-like neuronal networks
Finding actions that satisfy the constraints imposed by both external inputs
and internal representations is central to decision making. We demonstrate that
some important classes of constraint satisfaction problems (CSPs) can be solved
by networks composed of homogeneous cooperative-competitive modules that have
connectivity similar to motifs observed in the superficial layers of neocortex.
The winner-take-all modules are sparsely coupled by programming neurons that
embed the constraints onto the otherwise homogeneous modular computational
substrate. We show rules that embed any instance of the CSPs planar four-color
graph coloring, maximum independent set, and Sudoku on this substrate, and
provide mathematical proofs that guarantee these graph coloring problems will
convergence to a solution. The network is composed of non-saturating linear
threshold neurons. Their lack of right saturation allows the overall network to
explore the problem space driven through the unstable dynamics generated by
recurrent excitation. The direction of exploration is steered by the constraint
neurons. While many problems can be solved using only linear inhibitory
constraints, network performance on hard problems benefits significantly when
these negative constraints are implemented by non-linear multiplicative
inhibition. Overall, our results demonstrate the importance of instability
rather than stability in network computation, and also offer insight into the
computational role of dual inhibitory mechanisms in neural circuits.Comment: Accepted manuscript, in press, Neural Computation (2018
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
Degree-degree correlations in random graphs with heavy-tailed degrees
We investigate degree-degree correlations for scale-free graph sequences. The main conclusion of this paper is that the assortativity coefficient is not the appropriate way to describe degree-dependences in scale-free random graphs. Indeed, we study the infinite volume limit of the assortativity coefficient, and show that this limit is always non-negative when the degrees have finite first but infinite third moment, i.e., when the degree exponent of the density satisfies . More generally, our results show that the correlation coefficient is inappropriate to describe dependencies between random variables having infinite variance. We start with a simple model of the sample correlation of random variables and , which are linear combinations with non-negative coefficients of the same infinite variance random variables. In this case, the correlation coefficient of and is not defined, and the sample covariance converges to a proper random variable with support that is a subinterval of . Further, for any joint distribution with equal marginals being non-negative power-law distributions with infinite variance (as in the case of degree-degree correlations), we show that the limit is non-negative. We next adapt these results to the assortativity in networks as described by the degree-degree correlation coefficient, and show that it is non-negative in the large graph limit when the degree distribution has an infinite third moment. We illustrate these results with several examples where the assortativity behaves in a non-sensible way. We further discuss alternatives for describing assortativity in networks based on rank correlations that are appropriate for infinite variance variables. We support these mathematical results by simulations
- âŠ