
    Learning by stochastic serializations

    Complex structures are typical in machine learning. Tailoring learning algorithms for every structure requires an effort that may be saved by defining a generic learning procedure adaptive to any complex structure. In this paper, we propose to map any complex structure onto a generic form, called a serialization, over which we can apply any sequence-based density estimator. We then show how to transfer the learned density back onto the space of original structures. To expose the learning procedure to the structural particularities of the original structures, we take care that the serializations reflect the structures' properties accurately. Enumerating all serializations is infeasible; we therefore propose an effective way to sample representative serializations that preserves the statistics of the complete set. Our method is competitive with, or better than, state-of-the-art learning algorithms that have been specifically designed for given structures. In addition, since the serialization involves sampling from a combinatorial process, it provides considerable protection from overfitting, which we demonstrate clearly in a number of experiments.
    Comment: Submission to NeurIPS 201
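    As a concrete illustration of the idea (our sketch, not the paper's exact mapping), a labeled tree can be serialized by a depth-first traversal, and shuffling the child order yields different, equally valid serializations of the same structure; the resulting token sequences can then be fed to any sequence-based density estimator. The function name and bracket tokens below are illustrative assumptions:

        import random

        def sample_serialization(node, rng):
            # Randomized depth-first traversal: shuffling the child order
            # produces a different serialization of the same structure, so
            # repeated calls sample from the set of all serializations.
            tokens = ["(", node["label"]]
            children = list(node.get("children", []))
            rng.shuffle(children)  # the stochastic step
            for child in children:
                tokens.extend(sample_serialization(child, rng))
            tokens.append(")")
            return tokens

        # A tiny labeled tree and three sampled serializations.
        tree = {"label": "a", "children": [
            {"label": "b"},
            {"label": "c", "children": [{"label": "d"}]},
        ]}
        rng = random.Random(0)
        for _ in range(3):
            print(" ".join(sample_serialization(tree, rng)))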

    Exotic trees

    We discuss the scaling properties of free branched polymers. The scaling behaviour of the model is classified by the Hausdorff dimensions for the internal geometry, d_L and d_H, and for the external one, D_L and D_H. The dimensions d_H and D_H characterize the behaviour at long distances, while d_L and D_L at short distances. We show that the internal Hausdorff dimension is d_L = 2 for both generic and scale-free trees, in contrast to d_H, which is known to equal two for generic trees and to vary between two and infinity for scale-free trees. We show that the external Hausdorff dimension D_H is directly related to the internal one as D_H = \alpha d_H, where \alpha is the stability index of the embedding weights for the nearest-vertex interactions. The index is \alpha = 2 for weights from the Gaussian domain of attraction and 0 < \alpha < 2 for those from the Lévy domain of attraction. If the dimension D of the target space is larger than D_H, one finds D_L = D_H; otherwise D_L = D. The latter result means that the fractal structure cannot develop in a target space of too low a dimension.
    Comment: 33 pages, 6 eps figure
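    Collecting the abstract's scaling relations in one display (the min() form is our compact restatement of the last two sentences; notation as in the text):

        \begin{align*}
          d_L &= 2 \quad \text{(generic and scale-free trees)}, \\
          D_H &= \alpha\, d_H
            \quad \text{with } \alpha = 2 \text{ (Gaussian domain)},\
            0 < \alpha < 2 \text{ (L\'evy domain)}, \\
          D_L &= \min(D,\, D_H).
        \end{align*}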

    The Grow-Shrink strategy for learning Markov network structures constrained by context-specific independences

    Markov networks are models for compactly representing complex probability distributions. They are composed of a structure and a set of numerical weights. The structure qualitatively describes independences in the distribution, which can be exploited to factorize the distribution into a set of compact functions. A key application of learning structures from data is to automatically discover knowledge. In practice, structure learning algorithms focused on "knowledge discovery" have a limitation: they use a coarse-grained representation of the structure, which cannot describe context-specific independences. Very recently, an algorithm called CSPC was designed to overcome this limitation, but it has a high computational complexity. This work mitigates that downside by presenting CSGS, an algorithm that uses the Grow-Shrink strategy to avoid unnecessary computations. In an empirical evaluation, the structures learned by CSGS achieve accuracies competitive with those obtained by CSPC, at lower computational cost.
    Comment: 12 pages and 8 figures. This work was presented at IBERAMIA 201
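    For readers unfamiliar with the strategy, below is a minimal sketch of the classic Grow-Shrink search for the Markov blanket of a single variable, the building block the name refers to; CSGS itself targets context-specific structures, which this sketch does not capture. `indep` is a hypothetical conditional-independence test supplied by the caller:

        def grow_shrink_blanket(target, variables, indep):
            # indep(x, y, Z) -> True iff x is independent of y given set Z,
            # e.g. a statistical test on data (hypothetical callback).
            blanket = set()
            # Grow phase: add any variable still dependent on the target
            # given the current candidate blanket.
            changed = True
            while changed:
                changed = False
                for v in variables:
                    if v != target and v not in blanket \
                            and not indep(target, v, blanket):
                        blanket.add(v)
                        changed = True
            # Shrink phase: remove false positives that became independent
            # of the target once the rest of the blanket was included.
            for v in sorted(blanket):
                if indep(target, v, blanket - {v}):
                    blanket.remove(v)
            return blanket

    The grow phase may admit spurious variables (it conditions only on the partial blanket built so far); the shrink phase removes them afterwards, which is what keeps the overall number of tests small.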

    Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing

    We present a novel technique to remove spurious ambiguity from transition systems for dependency parsing. Our technique chooses a canonical sequence of transition operations (a computation) for a given dependency tree. It can be applied to a large class of bottom-up transition systems, including, for instance, those of Nivre (2004) and Attardi (2006).
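    As one concrete instance (our illustration, not necessarily the paper's construction), a static oracle for the arc-standard system selects a canonical computation by always preferring an arc transition over SHIFT whenever the arc is safe to build:

        def canonical_computation(heads):
            # heads[i] is the head of token i (tokens 1..n; 0 is the root).
            # Assumes a projective tree. Preferring arcs over SHIFT fixes
            # exactly one computation per tree, removing spurious ambiguity.
            n = len(heads)
            n_deps = {i: 0 for i in range(n + 1)}
            for h in heads.values():
                n_deps[h] += 1                    # outstanding dependents
            stack, buffer, ops = [0], list(range(1, n + 1)), []
            while buffer or len(stack) > 1:
                if len(stack) >= 2:
                    top, below = stack[-1], stack[-2]
                    # LEFT-ARC: `below` depends on `top` and has already
                    # collected all of its own dependents.
                    if heads.get(below) == top and n_deps[below] == 0:
                        ops.append("LEFT-ARC(%d<-%d)" % (below, top))
                        stack.pop(-2)
                        n_deps[top] -= 1
                        continue
                    # RIGHT-ARC: the symmetric case.
                    if heads.get(top) == below and n_deps[top] == 0:
                        ops.append("RIGHT-ARC(%d->%d)" % (below, top))
                        stack.pop()
                        n_deps[below] -= 1
                        continue
                ops.append("SHIFT(%d)" % buffer[0])
                stack.append(buffer.pop(0))
            return ops

        # Token 2 heads token 1 and attaches to the root:
        print(canonical_computation({1: 2, 2: 0}))
        # ['SHIFT(1)', 'SHIFT(2)', 'LEFT-ARC(1<-2)', 'RIGHT-ARC(0->2)']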

    Causal and homogeneous networks

    Growing networks have a causal structure. We show that this causality strongly influences the scaling and geometrical properties of the network: in particular, the average distance between nodes is smaller for causal networks than for the corresponding homogeneous networks. We explain the origin of this effect and illustrate it using a solvable model of random trees as an example. We also discuss the stability of the scale-free node degree distribution. We show that a surplus of links may lead to the emergence of a singular node with degree proportional to the total number of links. This effect is closely related to the backgammon condensation known from the balls-in-boxes model.
    Comment: short review submitted to AIP proceedings, CNET2004 conference; changes in the discussion of the distance distribution for growing trees, Fig. 6-right change
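    A quick numerical check of the distance claim (our sketch, not code from the paper): grow a causal tree by attaching each new node to a uniformly chosen earlier node, sample a homogeneous tree uniformly via a Prüfer sequence, and estimate the mean distance between random node pairs. The causal tree's mean distance should come out noticeably smaller:

        import heapq
        import random
        from collections import deque

        def growing_tree(n, rng):
            # Causal tree: node t attaches to a uniformly chosen earlier node.
            return [(t, rng.randrange(t)) for t in range(1, n)]

        def homogeneous_tree(n, rng):
            # Uniform random labeled tree via Pruefer-sequence decoding.
            prufer = [rng.randrange(n) for _ in range(n - 2)]
            degree = [1] * n
            for v in prufer:
                degree[v] += 1
            leaves = [v for v in range(n) if degree[v] == 1]
            heapq.heapify(leaves)
            edges = []
            for v in prufer:
                edges.append((heapq.heappop(leaves), v))
                degree[v] -= 1
                if degree[v] == 1:
                    heapq.heappush(leaves, v)
            edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
            return edges

        def mean_distance(n, edges, rng, samples=200):
            adj = [[] for _ in range(n)]
            for a, b in edges:
                adj[a].append(b)
                adj[b].append(a)
            total = 0
            for _ in range(samples):
                src = rng.randrange(n)
                dist, q = {src: 0}, deque([src])
                while q:                      # BFS from a random source
                    v = q.popleft()
                    for w in adj[v]:
                        if w not in dist:
                            dist[w] = dist[v] + 1
                            q.append(w)
                total += dist[rng.randrange(n)]
            return total / samples

        rng, n = random.Random(1), 2000
        print("causal:     ", mean_distance(n, growing_tree(n, rng), rng))
        print("homogeneous:", mean_distance(n, homogeneous_tree(n, rng), rng))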