The IBMAP approach for Markov networks structure learning
In this work we consider the problem of learning the structure of Markov
networks from data. We present an approach for tackling this problem called
IBMAP, together with an efficient instantiation of the approach: the IBMAP-HC
algorithm, designed for avoiding important limitations of existing
independence-based algorithms. These algorithms proceed by performing
statistical independence tests on data, trusting completely the outcome of each
test. In practice tests may be incorrect, resulting in potential cascading
errors and the consequent reduction in the quality of the structures learned.
IBMAP contemplates this uncertainty in the outcome of the tests through a
probabilistic maximum-a-posteriori approach. The approach is instantiated in
the IBMAP-HC algorithm, a structure selection strategy that performs a
polynomial heuristic local search in the space of possible structures. We
present an extensive empirical evaluation on synthetic and real data, showing
that our algorithm significantly outperforms current independence-based
algorithms, in terms of data efficiency and quality of learned structures, with
equivalent computational complexities. We also show the performance of IBMAP-HC
in a real-world application of knowledge discovery: EDAs, which are
evolutionary algorithms that use structure learning on each generation for
modeling the distribution of populations. The experiments show that when
IBMAP-HC is used to learn the structure, EDAs improve the convergence to the
optimum.
Describing the complexity of systems: multi-variable "set complexity" and the information basis of systems biology
Context dependence is central to the description of complexity. Keying on the
pairwise definition of "set complexity" we use an information theory approach
to formulate general measures of systems complexity. We examine the properties
of multi-variable dependency starting with the concept of interaction
information. We then present a new measure for unbiased detection of
multi-variable dependency, "differential interaction information." This
quantity for two variables reduces to the pairwise "set complexity" previously
proposed as a context-dependent measure of information in biological systems.
We generalize it here to an arbitrary number of variables. Critical limiting
properties of the "differential interaction information" are key to the
generalization. This measure extends previous ideas about biological
information and provides a more sophisticated basis for study of complexity.
The properties of "differential interaction information" also suggest new
approaches to data analysis. Given a data set of system measurements,
differential interaction information can provide a measure of collective
dependence, which can be represented in hypergraphs describing complex system
interaction patterns. We investigate this kind of analysis using simulated data
sets. The conjoining of a generalized set complexity measure, multi-variable
dependency analysis, and hypergraphs is our central result. While our focus is
on complex biological systems, our results are applicable to any complex
system.
Comment: 44 pages, 12 figures; made revisions after peer review
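As a rough illustration of the interaction information that this abstract takes as its starting point, the sketch below estimates the three-variable quantity from discrete samples using plug-in entropies. The sign convention I(X;Y|Z) - I(X;Y) is an assumption (the abstract does not state one), the function names are hypothetical, and the paper's own "differential interaction information" is not implemented here.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (in bits) of a list of hashable observations."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def interaction_information(xs, ys, zs):
    """Estimate I(X;Y|Z) - I(X;Y) from three aligned sample lists.

    Assumed sign convention: positive values indicate that conditioning
    on Z increases the dependence between X and Y (synergy).
    """
    xy = list(zip(xs, ys))
    xz = list(zip(xs, zs))
    yz = list(zip(ys, zs))
    xyz = list(zip(xs, ys, zs))
    mutual_info_xy = entropy(xs) + entropy(ys) - entropy(xy)
    cond_mutual_info_xy_given_z = (
        entropy(xz) + entropy(yz) - entropy(xyz) - entropy(zs)
    )
    return cond_mutual_info_xy_given_z - mutual_info_xy


# Example: Z is the XOR of two independent fair bits X and Y.
xor_x = [0, 0, 1, 1]
xor_y = [0, 1, 0, 1]
xor_z = [x ^ y for x, y in zip(xor_x, xor_y)]
xor_result = interaction_information(xor_x, xor_y, xor_z)
```

In the XOR example, X and Y are pairwise independent yet jointly determine Z, which is the kind of purely collective dependence that pairwise measures miss.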
Testing for Collusion in Asymmetric First-Price Auctions
This paper proposes fully nonparametric tests to detect possible collusion in first-price procurement auctions. The aim of the tests is to detect possible collusion before knowing whether or not bidders are colluding. Thus we do not rely on data from anti-competitive hearings, and in that sense the approach is "ex-ante". We propose a two-step (model selection) procedure. First, we use a reduced-form test of independence and symmetry to shortlist bidders whose bidding behavior is at odds with competitive bidding. Second, the recovered (latent) cost for these bidders must be higher under collusion than under competition, because collusion dwarfs competition; hence detecting collusion boils down to testing whether the estimated cost distribution under collusion first-order stochastically dominates that under competition. We propose rank-based and Kolmogorov-Smirnov (K-S) tests. We implement the tests for highway procurement data in California and conclude that there is no evidence of collusion even though the reduced-form test supports collusion.
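The stochastic-dominance comparison described in this abstract can be illustrated with a one-sided Kolmogorov-Smirnov-type statistic on two samples. This is only a minimal sketch: the function names and sample values are hypothetical, no critical values or p-values are computed, and it is not the paper's actual test procedure, which works on recovered latent costs.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def one_sided_ks(candidate_dominant, other):
    """sup_x [F_candidate(x) - F_other(x)] over the pooled sample points.

    If the candidate sample first-order stochastically dominates the
    other, its CDF lies at or below the other CDF everywhere, so this
    statistic is close to zero; large positive values speak against
    dominance.
    """
    f_candidate = ecdf(candidate_dominant)
    f_other = ecdf(other)
    grid = sorted(set(candidate_dominant) | set(other))
    return max(f_candidate(x) - f_other(x) for x in grid)


# Hypothetical cost samples: the second is the first shifted upward.
costs_competition = [1.0, 2.0, 3.0, 4.0]
costs_collusion = [2.0, 3.0, 4.0, 5.0]
stat_dominance_holds = one_sided_ks(costs_collusion, costs_competition)
stat_dominance_fails = one_sided_ks(costs_competition, costs_collusion)
```

A full test would compare the statistic against a critical value obtained from its asymptotic or bootstrap distribution.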
Learning Material-Aware Local Descriptors for 3D Shapes
Material understanding is critical for design, geometric modeling, and
analysis of functional objects. We enable material-aware 3D shape analysis by
employing a projective convolutional neural network architecture to learn
material-aware descriptors from view-based representations of 3D points for
point-wise material classification or material-aware retrieval. Unfortunately,
only a small fraction of shapes in 3D repositories are labeled with physical
materials, posing a challenge for learning methods. To address this
challenge, we crowdsource a dataset of 3080 3D shapes with part-wise material
labels. We focus on furniture models which exhibit interesting structure and
material variability. In addition, we also contribute a high-quality
expert-labeled benchmark of 115 shapes from Herman-Miller and IKEA for evaluation. We
further apply a mesh-aware conditional random field, which incorporates
rotational and reflective symmetries, to smooth our local material
predictions across neighboring surface patches. We demonstrate the effectiveness of
our learned descriptors for automatic texturing, material-aware retrieval, and
physical simulation. The dataset and code will be publicly available.
Comment: 3DV 201
Normalized Information Distance
The normalized information distance is a universal distance measure for
objects of all kinds. It is based on Kolmogorov complexity and thus
uncomputable, but there are ways to utilize it. First, compression algorithms
can be used to approximate the Kolmogorov complexity if the objects have a
string representation. Second, for names and abstract concepts, page count
statistics from the World Wide Web can be used. These practical realizations of
the normalized information distance can then be applied to machine learning
tasks, especially clustering, to perform feature-free and parameter-free data
mining. This chapter discusses the theoretical foundations of the normalized
information distance and both practical realizations. It presents numerous
examples of successful real-world applications based on these distance
measures, ranging from bioinformatics to music clustering to machine
translation.
Comment: 33 pages, 12 figures, pdf, in: Normalized information distance, in:
Information Theory and Statistical Learning, Eds. M. Dehmer, F.
Emmert-Streib, Springer-Verlag, New-York, To appear
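The compression-based approximation mentioned in this abstract is commonly written as the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed length. The sketch below assumes zlib as the compressor, which is a choice made here for illustration and is not named in the abstract; the function names and sample data are likewise hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Illustrative inputs: two near-identical texts and one noise string.
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = text_a + b"and then rests"
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = bytes(rng.randrange(256) for _ in range(1000))

distance_similar = ncd(text_a, text_b)
distance_dissimilar = ncd(text_a, noise)
```

With a real compressor the values are only approximate: similar inputs score near 0 and unrelated inputs score near 1, but results can fall slightly outside that range and depend on the compressor chosen.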