
    A survey of max-type recursive distributional equations

    In certain problems in a variety of applied probability settings (from probabilistic analysis of algorithms to statistical physics), the central requirement is to solve a recursive distributional equation of the form X =^d g((\xi_i,X_i),i\geq 1). Here (\xi_i) and g(\cdot) are given and the X_i are independent copies of the unknown distribution X. We survey this area, emphasizing examples where the function g(\cdot) is essentially a ``maximum'' or ``minimum'' function. We draw attention to the theoretical question of endogeny: in the associated recursive tree process X_i, are the X_i measurable functions of the innovations process (\xi_i)? Comment: Published at http://dx.doi.org/10.1214/105051605000000142 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
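
One common numerical approach to an equation of the form X =^d g((\xi_i,X_i),i\geq 1) is pool iteration (often called population dynamics): keep a large sample standing in for the law of X and repeatedly push it through g. The sketch below is only an illustration under stated assumptions. The toy instance X =^d \xi + c \max(X_1, X_2), with \xi uniform on [0,1) and c = 0.5, is a hypothetical example chosen because the map is a contraction; it is not claimed to be an example taken from the survey, and the function names are made up for this sketch.

```python
import random


def iterate_rde(pool, sample_innovation, g, n_children, rng):
    """One pool update: each new value is g(xi, [X_1, ..., X_k]),
    where the X_i are drawn independently from the current pool."""
    new_pool = []
    for _ in range(len(pool)):
        xi = sample_innovation(rng)
        children = [rng.choice(pool) for _ in range(n_children)]
        new_pool.append(g(xi, children))
    return new_pool


def solve_discounted_max(pool_size=5000, n_iter=40, discount=0.5, seed=0):
    """Approximate the fixed point of the toy equation
    X =^d xi + discount * max(X_1, X_2), xi ~ Uniform[0, 1).
    (Hypothetical instance, used for illustration only.)"""
    rng = random.Random(seed)
    pool = [0.0] * pool_size  # arbitrary starting law: point mass at 0

    def g(xi, xs):
        return xi + discount * max(xs)

    for _ in range(n_iter):
        pool = iterate_rde(pool, lambda r: r.random(), g, 2, rng)
    return pool


pool = solve_discounted_max()
mean = sum(pool) / len(pool)
print(f"estimated mean of X: {mean:.3f}")
```

For this toy instance every pool value stays in [0, 2), and the fixed-point mean must be at least E[\xi]/(1-c) = 1 because \max(X_1, X_2) \geq X_1. Pool iteration says nothing about endogeny, which concerns the recursive tree process rather than the marginal law.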

    Percolation-like Scaling Exponents for Minimal Paths and Trees in the Stochastic Mean Field Model

    In the mean field (or random link) model there are n points and inter-point distances are independent random variables. For 0 < \ell < \infty and in the n \to \infty limit, let \delta(\ell) = 1/n \times (maximum number of steps in a path whose average step-length is \leq \ell). The function \delta(\ell) is analogous to the percolation function in percolation theory: there is a critical value \ell_* = e^{-1} at which \delta(\cdot) becomes non-zero, and (presumably) a scaling exponent \beta in the sense \delta(\ell) \asymp (\ell - \ell_*)^\beta. Recently developed probabilistic methodology (in some sense a rephrasing of the cavity method of Mézard-Parisi) provides a simple albeit non-rigorous way of writing down such functions in terms of solutions of fixed-point equations for probability distributions. Solving numerically gives convincing evidence that \beta = 3. A parallel study with trees instead of paths gives scaling exponent \beta = 2. The new exponents coincide with those found in a different context (comparing optimal and near-optimal solutions of mean-field TSP and MST) and reinforce the suggestion that these scaling exponents determine universality classes for optimization problems on random points. Comment: 19 page
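
A relation of the form \delta(\ell) \asymp (\ell - \ell_*)^\beta is usually checked by a straight-line fit on log-log axes, where the slope estimates \beta. The sketch below shows that fit only; it is not the paper's numerical procedure. The input values are synthetic placeholders generated from an exact cubic law with an arbitrary prefactor, not data from the paper, and `fit_scaling_exponent` is a hypothetical helper name.

```python
import math


def fit_scaling_exponent(ells, deltas, ell_star):
    """Least-squares slope of log(delta) against log(ell - ell_star)."""
    xs = [math.log(ell - ell_star) for ell in ells]
    ys = [math.log(d) for d in deltas]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


ELL_STAR = math.exp(-1)  # critical value e^{-1} stated in the abstract

# Synthetic placeholder inputs (NOT results from the paper): points just
# above the critical value, with delta following an exact power law.
ells = [ELL_STAR + 0.01 * k for k in range(1, 11)]
deltas = [0.7 * (ell - ELL_STAR) ** 3 for ell in ells]

beta_hat = fit_scaling_exponent(ells, deltas, ELL_STAR)
print(f"fitted exponent: {beta_hat:.6f}")
```

With real numerical solutions the fit would be restricted to values of \ell close to \ell_*, since the power law is only asymptotic there.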

    Belief propagation for optimal edge cover in the random complete graph

    We apply the objective method of Aldous to the problem of finding the minimum-cost edge cover of the complete graph with random independent and identically distributed edge costs. The limit, as the number of vertices goes to infinity, of the expected minimum cost for this problem is known via a combinatorial approach of Hessler and Wästlund. We provide a proof of this result using the machinery of the objective method and local weak convergence, which was used to prove the \zeta(2) limit of the random assignment problem. A proof via the objective method is useful because it provides us with more information on the nature of the edges incident on a typical root in the minimum-cost edge cover. We further show that a belief propagation algorithm converges asymptotically to the optimal solution. This can be applied in a computational linguistics problem of semantic projection. The belief propagation algorithm yields a near-optimal solution with lower complexity than the best known algorithms designed for optimality in worst-case settings. Comment: Published at http://dx.doi.org/10.1214/13-AAP981 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A Socio-mathematical and Structure-Based Approach to Model Sentiment Dynamics in Event-Based Text

    Natural language texts are often meant to express or impact the emotions of individuals. Recognizing the underlying emotions expressed in or triggered by textual content is essential if one is to arrive at an understanding of the full meaning that textual content conveys. Sentiment analysis (SA) researchers are becoming increasingly interested in investigating natural language processing techniques as well as emotion theory in order to detect, extract, and classify the sentiments that natural language text expresses. Most SA research is focused on the analysis of subjective documents from the writer’s perspective and their classification into categorical labels or sentiment polarity, in which text is associated with a descriptive label or a point on a continuum between two polarities. Researchers often perform sentiment or polarity classification tasks using machine learning (ML) techniques, sentiment lexicons, or hybrid-based approaches. Most ML methods rely on count-based word representations that fail to take word order into account. Despite the successful use of these flat word representations in topic-modelling problems, SA problems require a deeper understanding of sentence structure, since the entire meaning of words can be reversed through negations or word modifiers. On the other hand, approaches based on semantic lexicons are limited by the relatively small number of words they contain, which do not begin to embody the extensive and growing vocabulary on the Internet. The research presented in this thesis represents an effort to tackle the problem of sentiment analysis from a different viewpoint than those underlying current mainstream studies in this research area. A cross-disciplinary approach is proposed that incorporates affect control theory (ACT) into a structured model for determining the sentiment polarity of event-based articles from the perspectives of readers and interactants. 
A socio-mathematical theory, ACT provides valuable resources for handling interactions between words (event entities) and for predicting situational sentiments triggered by social events. ACT models human emotions arising from social event terms through the use of multidimensional representations that have been verified both empirically and theoretically. To model human emotions regarding textual content, the first step was to develop a fine-grained event extraction algorithm that extracts events and their entities from event-based textual information using semantic and syntactic parsing techniques. The results of the event extraction method were compared against a supervised learning approach on two human-coded corpora (a grammatically correct and a grammatically incorrect structured corpus). For both corpora, the semantic-syntactic event extraction method yielded a higher degree of accuracy than the supervised learning approach. The three-dimensional ACT lexicon was also augmented in a semi-supervised fashion using graph-based label propagation built from semantic and neural network word embeddings. The word embeddings were obtained through the training of commonly used count-based and neural-network-based algorithms on a single corpus, and each method was evaluated with respect to the reconstruction of a sentiment lexicon. The results show that, relative to other word embeddings and state-of-the-art methods, combining both semantic and neural word embeddings yielded the highest correlation scores and lowest error rates. Using the augmented lexicon and ACT mathematical equations, human emotions were modelled according to different levels of granularity (i.e., at the sentence and document levels). The initial stage involved the development of a proposed entity-based SA approach that models reader emotions triggered by event-based sentences. 
The emotions are modelled in a three-dimensional space based on reader sentiment toward different entities (e.g., subject and object) in the sentence. The new approach was evaluated using a human-annotated news-headline corpus; the results revealed the proposed method to be competitive with benchmark ML techniques. The second phase entailed the creation of a proposed ACT-based model for predicting the temporal progression of the emotions of the interactants and their optimal behaviour over a sequence of interactions. The model was evaluated using three different corpora: fairy tales, news articles, and a handcrafted corpus. The results produced by the proposed model demonstrate that, despite the challenging sentence structure, a reasonable agreement was achieved between the estimated emotions and behaviours and the corresponding ground truth

    Topics in random graphs, combinatorial optimization, and statistical inference

    The manuscript is made of three chapters presenting three different topics on which I worked with Ph.D. students. Each chapter can be read independently of the others and should be relatively self-contained. Chapter 1 is a gentle introduction to the theory of random graphs with an emphasis on contagions on such networks. In Chapter 2, I explain the main ideas of the objective method developed by Aldous and Steele applied to the spectral measure of random graphs and the monomer-dimer problem. This topic is dear to me and I hope that this chapter will convince the reader that it is an exciting field of research. Chapter 3 deals with problems in high-dimensional statistics which now occupy a large proportion of my time. Unlike Chapters 1 and 2, which could be easily extended into lecture notes, I felt that the material in Chapter 3 was not ready for such a treatment. This field of research is currently very active and I decided to present two of my recent contributions.

    Limiting behaviour of random spatial graphs and asymptotically homogeneous RWRE

    We consider several random spatial graphs of the nearest-neighbour type, including the k-nearest neighbours graph, the on-line nearest-neighbour graph, and the minimal directed spanning tree. We study the large sample asymptotic behaviour of the total length of these graphs, with power-weighted edges. We give laws of large numbers and weak convergence results. We evaluate limiting constants explicitly. In Bhatt and Roy's minimal directed spanning tree (MDST) construction on random points in (0,1)^2, each point is joined to its nearest neighbour in the south-westerly direction. We show that the limiting total length (with power-weighted edges) of the edges joined to the origin converges in distribution to a Dickman-type random variable. We also study the length of the longest edge in the MDST. For the total weight of the MDST, we give a weak convergence result. The limiting distribution is given by a normal component plus a contribution due to boundary effects, which can be characterized by a fixed point equation. There is a phase transition in the limit law as the weight exponent increases. In the second part of this thesis, we give criteria for ergodicity, transience and null recurrence for the random walk in random environment (RWRE) on Z^+ = {0,1,2,...}, with reflection at the origin, where the random environment is subject to a vanishing perturbation from the so-called Sinai's regime. Our results complement existing criteria for random walks in random environments and for Markov chains with asymptotically zero drift, and are significantly different to these previously studied cases. Our method is based on a martingale technique, the method of Lyapunov functions.
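
As a small illustration of a limit law characterized by a fixed point equation, the standard Dickman distribution satisfies D =^d U(1 + D), with U uniform on (0,1) and independent of D. The abstract speaks only of a "Dickman-type" variable, so the exact equation used in the thesis may differ; the sketch below assumes the standard case. It samples approximately from the fixed point by iterating the map from 0, and `sample_dickman` with its `depth` parameter are names invented for this example.

```python
import random


def sample_dickman(rng, depth=60):
    """Approximate draw from the standard Dickman law via the recursion
    d <- U * (1 + d), started at d = 0 and applied `depth` times.
    The result has the law of U1 + U1*U2 + ... truncated at `depth`
    terms, so the truncation error has mean 2**(-depth)."""
    d = 0.0
    for _ in range(depth):
        d = rng.random() * (1.0 + d)
    return d


rng = random.Random(1)
samples = [sample_dickman(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")
```

Taking expectations in D =^d U(1 + D) gives E[D] = 1, and the same calculation with second moments gives Var(D) = 1/2, which the sample moments should approximately reproduce.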