    Predicting Complex Word Emotions and Topics through a Hierarchical Bayesian Network

    In this paper, we provide a Word Emotion Topic (WET) model to predict complex word emotion information from text and to discover the distribution of emotions among different topics. A complex emotion is defined as the combination of one or more singular emotions from the following 8 basic emotion categories: joy, love, expectation, surprise, anxiety, sorrow, anger and hate. We use a hierarchical Bayesian network to model the emotions and topics in the text. Both the complex emotions and the topics are drawn from raw text, without considering any complicated language features. Our experiments show promising results for word emotion prediction, outperforming traditional parsing methods such as the Hidden Markov Model and Conditional Random Fields (CRFs) on raw text. We also explore the topic distribution by examining the emotion-topic variation in an emotion-topic diagram.
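
    The abstract above describes a generative, hierarchical view of word emotions and topics. As a rough illustration of that kind of structure, and not the authors' WET model, the following Python sketch samples a topic per word from document-level proportions and then forms a "complex emotion" as the subset of the eight basic emotions that fire under topic-specific Bernoulli probabilities; the priors, sizes, and variable names are assumptions made only for this example.

    # Toy generative sketch (not the authors' WET model): each topic carries
    # per-emotion Bernoulli probabilities, and a word's "complex emotion" is the
    # subset of the 8 basic emotions that fire for its topic.  All names, sizes,
    # and priors are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    EMOTIONS = ["joy", "love", "expectation", "surprise",
                "anxiety", "sorrow", "anger", "hate"]
    n_topics, n_words = 3, 10

    theta = rng.dirichlet(alpha=np.ones(n_topics))             # document's topic proportions
    phi = rng.beta(1.0, 3.0, size=(n_topics, len(EMOTIONS)))   # per-topic emotion probabilities

    for _ in range(n_words):
        z = rng.choice(n_topics, p=theta)                      # topic assignment for this word
        fires = rng.random(len(EMOTIONS)) < phi[z]             # which basic emotions fire
        complex_emotion = [e for e, f in zip(EMOTIONS, fires) if f] or ["neutral"]
        print(f"topic {z}: {'+'.join(complex_emotion)}")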

    Computing the Cramer-Rao bound of Markov random field parameters: Application to the Ising and the Potts models

    This report considers the problem of computing the Cramer-Rao bound for the parameters of a Markov random field. Computation of the exact bound is not feasible for most fields of interest because their likelihoods are intractable and have intractable derivatives. We show here how it is possible to formulate the computation of the bound as a statistical inference problem that can be solved approximately, but with arbitrarily high accuracy, by using a Monte Carlo method. The proposed methodology is successfully applied to the Ising and the Potts models, where it is used to assess the performance of three state-of-the-art estimators of the parameter of these Markov random fields.
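
    Because the Ising and Potts models form exponential families, the Fisher information of the interaction parameter equals the variance of the sufficient statistic under the model, so a Monte Carlo estimate of that variance gives an approximate Cramer-Rao bound. The Python sketch below illustrates this idea for a small periodic Ising lattice with a single-site Gibbs sampler; it is a minimal sketch of the general approach rather than the report's exact procedure, and the lattice size, parameter value and iteration counts are assumptions.

    # Minimal sketch: for p(x) proportional to exp(beta * sum_{i~j} x_i x_j), the identity
    # I(beta) = Var[ sum_{i~j} X_i X_j ] lets a Gibbs sampler approximate the
    # Fisher information and hence the Cramer-Rao bound 1 / I(beta).
    import numpy as np

    rng = np.random.default_rng(1)
    L, beta, n_sweeps, burn_in = 8, 0.4, 1500, 500
    x = rng.choice([-1, 1], size=(L, L))

    def neighbour_sum(s, i, j):
        # 4-neighbour sum with periodic boundary conditions.
        return (s[(i + 1) % L, j] + s[(i - 1) % L, j] +
                s[i, (j + 1) % L] + s[i, (j - 1) % L])

    def sufficient_statistic(s):
        # Sum of x_i * x_j over unique nearest-neighbour pairs.
        return np.sum(s * np.roll(s, -1, axis=0)) + np.sum(s * np.roll(s, -1, axis=1))

    stats = []
    for sweep in range(n_sweeps):
        for i in range(L):
            for j in range(L):
                # Gibbs update: P(x_ij = +1 | rest) = 1 / (1 + exp(-2*beta*neighbour_sum)).
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * neighbour_sum(x, i, j)))
                x[i, j] = 1 if rng.random() < p_plus else -1
        if sweep >= burn_in:
            stats.append(sufficient_statistic(x))

    fisher_info = np.var(stats)   # Monte Carlo estimate of I(beta)
    print("Approximate Cramer-Rao bound on var(beta_hat):", 1.0 / fisher_info)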

    Two-parameter Poisson-Dirichlet measures and reversible exchangeable fragmentation-coalescence processes

    We show that for $0 < \alpha < 1$ and $\theta > -\alpha$, the Poisson-Dirichlet distribution with parameter $(\alpha, \theta)$ is the unique reversible distribution of a rather natural fragmentation-coalescence process. This completes earlier results in the literature for certain split and merge transformations and the parameter $\alpha = 0$.
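
    For readers unfamiliar with the two-parameter family, the Python sketch below draws an approximate PD(alpha, theta) sample through the standard truncated stick-breaking (GEM) construction and ranks the resulting sizes in decreasing order; the truncation level, seed and parameter values are illustrative assumptions, and the construction is background material, not part of the paper itself.

    # Approximate PD(alpha, theta) draw via truncated two-parameter stick-breaking.
    import numpy as np

    def sample_pd(alpha, theta, K=1000, seed=2):
        # Requires 0 <= alpha < 1 and theta > -alpha.
        assert 0 <= alpha < 1 and theta > -alpha
        rng = np.random.default_rng(seed)
        ks = np.arange(1, K + 1)
        v = rng.beta(1.0 - alpha, theta + ks * alpha)          # V_k ~ Beta(1-alpha, theta+k*alpha)
        remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
        weights = v * remaining                                # stick lengths
        return np.sort(weights)[::-1]                          # ranked sizes, approx. PD(alpha, theta)

    print(sample_pd(alpha=0.5, theta=1.0)[:5])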

    Learning Reputation in an Authorship Network

    The problem of searching for experts in a given academic field is hugely important in both industry and academia. We study exactly this issue with respect to a database of authors and their publications. The idea is to use Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to perform topic modelling in order to find authors who have worked in a query field. We then construct a coauthorship graph and motivate the use of influence maximisation and a variety of graph centrality measures to obtain a ranked list of experts. The ranked lists are further improved using a Markov chain-based rank aggregation approach. The complete method is readily scalable to large datasets. To demonstrate the efficacy of the approach, we report on an extensive set of computational simulations using the Arnetminer dataset. An improvement in mean average precision is demonstrated over the baseline case of simply using the order of authors found by the topic models.
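
    The Python sketch below mirrors the general pipeline described above on a toy corpus: scikit-learn's LDA supplies topic relevance, NetworkX PageRank stands in for the centrality measures, and a simple Borda count replaces the Markov chain-based rank aggregation. The corpus, author names and parameters are assumptions made for illustration; this is not the paper's implementation.

    # Hedged sketch of the expert-ranking pipeline on toy data.
    import networkx as nx
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    papers = {
        "alice": "bayesian inference graphical models",
        "bob":   "deep learning image recognition",
        "carol": "markov chain monte carlo bayesian sampling",
    }
    coauthorships = [("alice", "carol"), ("alice", "bob")]

    # Topic relevance: fit LDA and score each author's text on the query topic.
    X = CountVectorizer().fit_transform(papers.values())
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    topic_scores = lda.fit_transform(X)                 # authors x topics
    relevance = dict(zip(papers, topic_scores[:, 0]))   # relevance to query topic 0

    # Centrality on the coauthorship graph.
    centrality = nx.pagerank(nx.Graph(coauthorships))

    # Borda count as a simple stand-in for Markov chain rank aggregation.
    def borda(ranking):
        ordered = sorted(ranking, key=ranking.get, reverse=True)
        return {a: len(ordered) - i for i, a in enumerate(ordered)}

    scores = {a: borda(relevance)[a] + borda(centrality)[a] for a in papers}
    print(sorted(papers, key=scores.get, reverse=True))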

    Learning loopy graphical models with latent variables: Efficient methods and guarantees

    The problem of structure estimation in graphical models with latent variables is considered. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider models where the underlying Markov graph is locally tree-like, and the model is in the regime of correlation decay. For the special case of the Ising model, the number of samples $n$ required for structural consistency of our method scales as $n=\Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., the distance from a hidden node to its nearest observed nodes), and $\eta$ is a parameter which depends on the bounds on the node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived, and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph.
    Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/12-AOS1070.
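
    One ingredient behind guarantees of this kind is that, under correlation decay, information distances of the form d_ij = -log |corr(X_i, X_j)| are small between neighbours and large between distant nodes, so thresholding empirical distances can recover edges among observed variables. The Python sketch below illustrates only that ingredient on a toy chain of ±1 variables; it is not the authors' latent-variable algorithm, and the data, threshold and names are assumptions.

    # Edge recovery by thresholding empirical information distances (illustrative only).
    import numpy as np

    def estimate_graph(samples, threshold):
        # samples: n x p matrix of +/-1 observations of the observed variables.
        corr = np.corrcoef(samples, rowvar=False)
        with np.errstate(divide="ignore"):
            dist = -np.log(np.abs(corr))                 # information distances
        p = samples.shape[1]
        return [(i, j) for i in range(p) for j in range(i + 1, p)
                if dist[i, j] < threshold]

    # Toy chain x0 - x1 - x2: each spin copies its left neighbour, flipping w.p. 0.2.
    rng = np.random.default_rng(3)
    n = 5000
    x0 = rng.choice([-1, 1], size=n)
    x1 = x0 * rng.choice([1, -1], p=[0.8, 0.2], size=n)
    x2 = x1 * rng.choice([1, -1], p=[0.8, 0.2], size=n)
    print(estimate_graph(np.column_stack([x0, x1, x2]), threshold=0.75))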