
    On the Hardness of Entropy Minimization and Related Problems

    We investigate certain optimization problems for Shannon information measures, namely, minimization of joint and conditional entropies $H(X,Y)$, $H(X|Y)$, $H(Y|X)$, and maximization of mutual information $I(X;Y)$, over convex regions. When restricted to the so-called transportation polytopes (sets of distributions with fixed marginals), very simple proofs of NP-hardness are obtained for these problems because in that case they are all equivalent, and their connection to the well-known Subset Sum and Partition problems is revealed. The computational intractability of the more general problems over arbitrary polytopes is then a simple consequence. Further, a simple class of polytopes is shown over which the above problems are not equivalent and their complexity differs sharply, namely, minimization of $H(X,Y)$ and $H(Y|X)$ is trivial, while minimization of $H(X|Y)$ and maximization of $I(X;Y)$ are strongly NP-hard problems. Finally, two new (pseudo)metrics on the space of discrete probability distributions are introduced, based on the so-called variation of information quantity, and NP-hardness of their computation is shown. Comment: IEEE Information Theory Workshop (ITW) 201
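The equivalence claimed for fixed marginals follows from the standard identities $H(X|Y) = H(X,Y) - H(Y)$, $H(Y|X) = H(X,Y) - H(X)$ and $I(X;Y) = H(X) + H(Y) - H(X,Y)$: when $H(X)$ and $H(Y)$ are constants, all four objectives differ only by a constant or a sign. A minimal sketch of this, assuming a joint distribution stored as a list of rows (the helper names `entropy` and `measures` are illustrative and not taken from the paper):

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def measures(joint):
    """Return H(X,Y), H(X|Y), H(Y|X), I(X;Y) for a joint pmf given as rows."""
    px = [sum(row) for row in joint]        # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]  # marginal of Y (column sums)
    hxy = entropy(p for row in joint for p in row)
    hx, hy = entropy(px), entropy(py)
    return {
        "H(X,Y)": hxy,
        "H(X|Y)": hxy - hy,
        "H(Y|X)": hxy - hx,
        "I(X;Y)": hx + hy - hxy,
    }

# Two points of the same transportation polytope: both have uniform marginals.
independent = [[0.25, 0.25], [0.25, 0.25]]
diagonal = [[0.5, 0.0], [0.0, 0.5]]

for name, joint in (("independent", independent), ("diagonal", diagonal)):
    print(name, measures(joint))
```

Both couplings share the marginals (1/2, 1/2), so the one with smaller joint entropy is also the one with smaller conditional entropies and larger mutual information. The sketch only evaluates the measures at given points; it does not attempt the minimization itself, which the abstract states is NP-hard.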

    Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotate

    Simplest random K-satisfiability problem

    We study a simple and exactly solvable model for the generation of random satisfiability problems. These consist of $\gamma N$ random boolean constraints which are to be satisfied simultaneously by $N$ logical variables. In statistical-mechanics language, the considered model can be seen as a diluted p-spin model at zero temperature. While such problems become extraordinarily hard to solve by local search methods in a large region of the parameter space, still at least one solution may be superimposed by construction. The statistical properties of the model can be studied exactly by the replica method and each single instance can be analyzed in polynomial time by a simple global solution method. The geometrical/topological structures responsible for dynamic and static phase transitions as well as for the onset of computational complexity in local search methods are thoroughly analyzed. Numerical analysis on very large samples allows for a precise characterization of the critical scaling behaviour. Comment: 14 pages, 5 figures, to appear in Phys. Rev. E (Feb 2001). v2: minor errors and references correcte

    Exploration of the High Entropy Alloy Space as a Constraint Satisfaction Problem

    High Entropy Alloys (HEAs), Multi-principal Component Alloys (MCA), or Compositionally Complex Alloys (CCAs) are alloys that contain multiple principal alloying elements. While many HEAs have been shown to have unique properties, their discovery has been largely done through costly and time-consuming trial-and-error approaches, with only an infinitesimally small fraction of the entire possible composition space having been explored. In this work, the exploration of the HEA composition space is framed as a Continuous Constraint Satisfaction Problem (CCSP) and solved using a novel Constraint Satisfaction Algorithm (CSA) for the rapid and robust exploration of alloy thermodynamic spaces. The algorithm is used to discover regions in the HEA Composition-Temperature space that satisfy desired phase constitution requirements. The algorithm is demonstrated against a new (TCHEA1) CALPHAD HEA thermodynamic database. The database is first validated by comparing phase stability predictions against experiments and then the CSA is deployed and tested against design tasks consisting of identifying not only single-phase solid solution regions in ternary, quaternary and quinary composition spaces but also regions that are likely to yield precipitation-strengthened HEAs. Comment: 14 pages, 13 figure

    Phase coexistence and finite-size scaling in random combinatorial problems

    We study an exactly solvable version of the famous random Boolean satisfiability problem, the so-called random XOR-SAT problem. Rare events are shown to affect the combinatorial "phase diagram", leading to a coexistence of solvable and unsolvable instances of the combinatorial problem in a certain region of the parameters characterizing the model. Such instances differ by a non-extensive quantity in the ground state energy of the associated diluted spin-glass model. We also show that the critical exponent $\nu$, controlling the size of the critical window where the probability of having solutions vanishes, depends on the model parameters, shedding light on the link between random hyper-graph topology and universality classes. In the case of random satisfiability, a similar behavior was conjectured to be connected to the onset of computational intractability. Comment: 10 pages, 5 figures, to appear in J. Phys. A. v2: link to the XOR-SAT problem adde
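One standard reason XOR-SAT instances are tractable individually (a general fact, not a method described in this abstract) is that each clause is a linear equation over GF(2), so satisfiability can be decided by Gaussian elimination. A minimal sketch, assuming clauses are given as `(variables, parity)` pairs; the function name `xorsat_satisfiable` and this input encoding are illustrative choices:

```python
def xorsat_satisfiable(clauses):
    """Decide a XOR-SAT instance by Gaussian elimination over GF(2).

    clauses: iterable of (variables, parity), where the XOR of the listed
    variable indices must equal parity (0 or 1).
    """
    pivot_rows = {}  # leading bit -> (bitmask of variables, parity)
    for variables, parity in clauses:
        mask = 0
        for v in variables:
            mask ^= 1 << v  # a variable listed twice cancels out
        parity &= 1
        # Reduce the new equation against the pivots found so far.
        while mask:
            bit = mask.bit_length() - 1
            if bit not in pivot_rows:
                pivot_rows[bit] = (mask, parity)  # new independent equation
                break
            pmask, pparity = pivot_rows[bit]
            mask ^= pmask
            parity ^= pparity
        else:
            # The equation reduced to "0 = parity"; contradiction if parity is 1.
            if parity:
                return False
    return True

# x0^x1 = 1, x1^x2 = 1, x0^x2 = 0 is consistent (e.g. x0=0, x1=1, x2=0).
print(xorsat_satisfiable([((0, 1), 1), ((1, 2), 1), ((0, 2), 0)]))
# Changing the last parity makes the three equations sum to 0 = 1.
print(xorsat_satisfiable([((0, 1), 1), ((1, 2), 1), ((0, 2), 1)]))
```

Storing each equation as an integer bitmask keeps the elimination step to a single XOR per pivot; the sketch only decides satisfiability and does not return an assignment.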