On the Hardness of Entropy Minimization and Related Problems
We investigate certain optimization problems for Shannon information
measures, namely, minimization of joint and conditional entropies $H(X,Y)$,
$H(X|Y)$, $H(Y|X)$, and maximization of mutual information $I(X;Y)$, over
convex regions. When restricted to the so-called transportation polytopes (sets
of distributions with fixed marginals), very simple proofs of NP-hardness are
obtained for these problems because in that case they are all equivalent, and
their connection to the well-known \textsc{Subset sum} and \textsc{Partition}
problems is revealed. The computational intractability of the more general
problems over arbitrary polytopes is then a simple consequence. Further, a
simple class of polytopes is shown over which the above problems are not
equivalent and their complexity differs sharply, namely, minimization of
$H(X,Y)$ and $H(Y|X)$ is trivial, while minimization of $H(X|Y)$ and
maximization of $I(X;Y)$ are strongly NP-hard problems. Finally, two new
(pseudo)metrics on the space of discrete probability distributions are
introduced, based on the so-called variation of information quantity, and
NP-hardness of their computation is shown.
Comment: IEEE Information Theory Workshop (ITW) 201
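The link to \textsc{Partition} mentioned in this abstract can be illustrated with a small sketch. It is not the paper's proof; it assumes the simplest case, where one marginal is $p_i = w_i / \sum_j w_j$ for positive integer weights $w_i$ and the other marginal is $(1/2, 1/2)$. Since $H(X,Y) \ge H(X)$, with equality exactly when $Y$ is a function of $X$, the lower bound $H(p)$ is attained on this transportation polytope if and only if the weights can be split into two halves of equal sum. The function names below are hypothetical, and the search is brute force (exponential time), for illustration only.

```python
from itertools import combinations
from math import log2


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * log2(x) for x in p if x > 0)


def find_partition(weights):
    """Brute-force search for indices whose weights sum to half the total.

    Returns a tuple of indices, or None if no such subset exists.
    """
    total = sum(weights)
    if total % 2:
        return None
    for r in range(len(weights) + 1):
        for idx in combinations(range(len(weights)), r):
            if 2 * sum(weights[i] for i in idx) == total:
                return idx
    return None


def deterministic_coupling(weights):
    """Joint distribution with marginals p and (1/2, 1/2) in which Y is a
    function of X, so that H(X,Y) = H(p). Returns None if no partition exists.
    """
    idx = find_partition(weights)
    if idx is None:
        return None
    total = sum(weights)
    chosen = set(idx)
    # Row i holds [P(X=i, Y=0), P(X=i, Y=1)].
    return [
        [w / total, 0.0] if i in chosen else [0.0, w / total]
        for i, w in enumerate(weights)
    ]
```

For example, `deterministic_coupling([1, 2, 3])` returns a joint distribution whose entropy equals `entropy([1/6, 2/6, 3/6])`, while `deterministic_coupling([2, 2, 2])` returns `None` because no balanced split exists.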
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotate
Simplest random K-satisfiability problem
We study a simple and exactly solvable model for the generation of random
satisfiability problems. These consist of random boolean constraints
which are to be satisfied simultaneously by logical variables. In
statistical-mechanics language, the considered model can be seen as a diluted
p-spin model at zero temperature. While such problems become extraordinarily
hard to solve by local search methods in a large region of the parameter space,
still at least one solution may be superimposed by construction. The
statistical properties of the model can be studied exactly by the replica
method and each single instance can be analyzed in polynomial time by a simple
global solution method. The geometrical/topological structures responsible for
dynamic and static phase transitions as well as for the onset of computational
complexity in local search methods are thoroughly analyzed. Numerical analysis
on very large samples allows for a precise characterization of the critical
scaling behaviour.
Comment: 14 pages, 5 figures, to appear in Phys. Rev. E (Feb 2001). v2: minor
errors and references corrected
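The abstract above does not name its "simple global solution method". As a sketch under that caveat, assume each constraint is a parity (XOR) condition on a subset of the variables, in which case Gaussian elimination over GF(2) decides any instance in polynomial time. The function name `solve_xorsat` and the input format are hypothetical, chosen only for this illustration.

```python
def solve_xorsat(equations, n_vars):
    """Solve a system of XOR constraints by Gaussian elimination over GF(2).

    equations: iterable of (variables, parity), where variables is an iterable
    of 0-based indices below n_vars and parity is 0 or 1.
    Returns one satisfying assignment as a list of 0/1 values, or None if the
    system is inconsistent. Free variables are set to 0.
    """
    pivots = {}  # highest set bit of a reduced row -> (row bitmask, rhs)
    for variables, parity in equations:
        mask = 0
        for v in variables:
            mask ^= 1 << v  # a variable listed twice cancels out
        rhs = parity & 1
        while mask:
            high = mask.bit_length() - 1
            if high not in pivots:
                pivots[high] = (mask, rhs)
                break
            pmask, prhs = pivots[high]
            mask ^= pmask
            rhs ^= prhs
        else:
            # Row reduced to 0 = rhs; rhs == 1 is a contradiction.
            if rhs:
                return None

    # Back-substitution: each stored row only involves its pivot variable and
    # lower-indexed variables, so ascending pivot order resolves everything.
    x = [0] * n_vars
    for high in sorted(pivots):
        mask, rhs = pivots[high]
        lower = mask ^ (1 << high)
        value = rhs
        while lower:
            j = lower.bit_length() - 1
            value ^= x[j]
            lower ^= 1 << j
        x[high] = value
    return x
```

A call such as `solve_xorsat([((0, 1, 2), 1), ((0, 1, 3), 0)], 4)` returns an assignment satisfying both constraints, and a contradictory pair such as `[((0, 1), 0), ((0, 1), 1)]` yields `None`. Local search, by contrast, only ever sees the number of violated constraints, which is consistent with the hardness for local methods that the abstract describes.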
Exploration of the High Entropy Alloy Space as a Constraint Satisfaction Problem
High Entropy Alloys (HEAs), Multi-principal Component Alloys (MCA), or
Compositionally Complex Alloys (CCAs) are alloys that contain multiple
principal alloying elements. While many HEAs have been shown to have unique
properties, their discovery has been largely done through costly and
time-consuming trial-and-error approaches, with only an infinitesimally small
fraction of the entire possible composition space having been explored. In this
work, the exploration of the HEA composition space is framed as a Continuous
Constraint Satisfaction Problem (CCSP) and solved using a novel Constraint
Satisfaction Algorithm (CSA) for the rapid and robust exploration of alloy
thermodynamic spaces. The algorithm is used to discover regions in the HEA
Composition-Temperature space that satisfy desired phase constitution
requirements. The algorithm is demonstrated against a new (TCHEA1) CALPHAD HEA
thermodynamic database. The database is first validated by comparing phase
stability predictions against experiments and then the CSA is deployed and
tested against design tasks consisting of identifying not only single-phase
solid solution regions in ternary, quaternary and quinary composition spaces
but also regions that are likely to yield
precipitation-strengthened HEAs.
Comment: 14 pages, 13 figures
Phase coexistence and finite-size scaling in random combinatorial problems
We study an exactly solvable version of the famous random Boolean
satisfiability problem, the so-called random XOR-SAT problem. Rare events are
shown to affect the combinatorial "phase diagram", leading to a coexistence of
solvable and unsolvable instances of the combinatorial problem in a certain
region of the parameters characterizing the model. Such instances differ by a
non-extensive quantity in the ground state energy of the associated diluted
spin-glass model. We also show that the critical exponent $\nu$, controlling
the size of the critical window where the probability of having solutions
vanishes, depends on the model parameters, shedding light on the link between
random hyper-graph topology and universality classes. In the case of random
satisfiability, a similar behavior was conjectured to be connected to the onset
of computational intractability.
Comment: 10 pages, 5 figures, to appear in J. Phys. A. v2: link to the XOR-SAT
problem added