
    Kneser Ranks of Random Graphs and Minimum Difference Representations

    Every graph $G=(V,E)$ is an induced subgraph of some Kneser graph of rank $k$, i.e., there is an assignment of (distinct) $k$-sets $v \mapsto A_v$ to the vertices $v\in V$ such that $A_u$ and $A_v$ are disjoint if and only if $uv\in E$. The smallest such $k$ is called the Kneser rank of $G$ and denoted by $f_{\rm Kneser}(G)$. As an application of a result of Frieze and Reed concerning the clique cover number of random graphs, we show that for constant $0<p<1$ and $G\in G(n,p)$ there exist constants $c_i = c_i(p) > 0$, $i=1,2$, such that with high probability $c_1 n/(\log n) < f_{\rm Kneser}(G) < c_2 n/(\log n)$. We apply this to other graph representations defined by Boros, Gurvich and Meshulam. A $k$-min-difference representation of a graph $G$ is an assignment of a set $A_i$ to each vertex $i\in V(G)$ such that $ij\in E(G) \Leftrightarrow \min\{|A_i\setminus A_j|, |A_j\setminus A_i|\} \geq k$. The smallest $k$ such that there exists a $k$-min-difference representation of $G$ is denoted by $f_{\min}(G)$. Balogh and Prince proved in 2009 that for every $k$ there is a graph $G$ with $f_{\min}(G)\geq k$. We prove that there are constants $c''_1, c''_2 > 0$ such that $c''_1 n/(\log n) < f_{\min}(G) < c''_2 n/(\log n)$ holds for almost all bipartite graphs $G$ on $n+n$ vertices.
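    To make the two notions concrete, the following minimal Python sketch (function names are illustrative) verifies whether a given assignment of sets realizes a graph as a Kneser representation or as a $k$-min-difference representation; computing the smallest admissible $k$, i.e. $f_{\rm Kneser}(G)$ or $f_{\min}(G)$, is the hard part the bounds above address.

```python
from itertools import combinations

def is_kneser_representation(edges, assignment):
    """Check that the (distinct) sets A_v realize the graph as an induced
    subgraph of a Kneser graph: A_u, A_v disjoint iff uv is an edge."""
    for u, v in combinations(assignment, 2):
        if assignment[u].isdisjoint(assignment[v]) != (frozenset((u, v)) in edges):
            return False
    return True

def is_min_difference_representation(edges, assignment, k):
    """Check a k-min-difference representation:
    uv is an edge iff min(|A_u \\ A_v|, |A_v \\ A_u|) >= k."""
    for u, v in combinations(assignment, 2):
        d = min(len(assignment[u] - assignment[v]),
                len(assignment[v] - assignment[u]))
        if (d >= k) != (frozenset((u, v)) in edges):
            return False
    return True

# The triangle K3 has Kneser rank 1: three pairwise disjoint singletons.
k3 = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3)]}
print(is_kneser_representation(k3, {1: {"a"}, 2: {"b"}, 3: {"c"}}))  # True
```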

    Graph-based Patterns for Local Coherence Modeling

    Coherence is an essential property of well-written texts. It distinguishes a multi-sentence text from a sequence of randomly strung sentences. The task of local coherence modeling concerns the way sentences in a text link up with one another. Solving this task is beneficial for assessing the quality of texts. Moreover, a coherence model can be integrated into text generation systems such as text summarizers to produce coherent texts. In this dissertation, we present a graph-based approach to local coherence modeling that accounts for the connectivity structure among sentences in a text. Graphs give our model the capability to take into account relations between non-adjacent sentences as well as those between adjacent sentences. Moreover, the connectivity style among nodes in a graph reflects the relationships among sentences in the corresponding text. We first employ the entity graph approach, proposed by Guinaudeau and Strube (2013), to represent a text via a graph. In the entity graph representation of a text, nodes encode sentences and edges depict the existence of a pair of coreferent mentions in sentences. We then devise graph-based features to capture the connectivity structure of nodes in a graph, and accordingly the connectivity structure of sentences in the corresponding text. We extract all subgraphs of entity graphs as features which encode the connectivity structure of graphs. Frequencies of subgraphs correlate with the perceived coherence of their corresponding texts. Therefore, we refer to these subgraphs as coherence patterns. In order to complete our approach to coherence modeling, we propose a new graph representation of texts, rather than the entity graph. Our approach employs lexico-semantic relations among words in sentences, instead of only entity coreference relations, to model relationships between sentences via a graph. This new lexical graph representation of texts, together with our method for mining coherence patterns, constitutes our coherence model. We evaluate our approach on the readability assessment task because a primary factor of readability is coherence. Coherent texts are easy to read and consequently demand less effort from their readers. Our extensive experiments on two separate readability assessment datasets show that frequencies of coherence patterns in texts correlate with the readability ratings assigned by human judges. By training a machine learning method on our coherence patterns, our model outperforms its counterparts on ranking texts with respect to their readability. As one of the ultimate goals of coherence models is to be used in text generation systems, we show how our coherence patterns can be integrated into a graph-based text summarizer to produce informative and coherent summaries. Our coherence patterns improve the performance of the summarization system based on both standard summarization metrics and human evaluations. An implementation of the approaches discussed in this dissertation is publicly available.
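    As a sketch of the first step, a simplified (unweighted) entity graph in the style of Guinaudeau and Strube (2013) can be built as follows, assuming per-sentence entity mentions have already been produced by a coreference resolver (the toy input below is made up). Coherence patterns are then the frequent subgraphs of such graphs.

```python
import networkx as nx

def build_entity_graph(sentence_entities):
    """One node per sentence; an edge whenever two sentences share a mention
    of the same entity. `sentence_entities` maps sentence index -> set of
    entity mentions, as produced by a coreference resolver."""
    g = nx.Graph()
    g.add_nodes_from(sentence_entities)
    sents = sorted(sentence_entities)
    for i, s in enumerate(sents):
        for t in sents[i + 1:]:
            if sentence_entities[s] & sentence_entities[t]:
                g.add_edge(s, t)
    return g

# Toy example: sentences 0 and 1 share "Obama"; sentences 1 and 2 share "law".
g = build_entity_graph({0: {"Obama", "speech"},
                        1: {"Obama", "law"},
                        2: {"law", "senate"}})
print(sorted(g.edges()))  # [(0, 1), (1, 2)]
```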

    Information overload in structured data

    Information overload refers to the difficulty of making decisions caused by too much information. In this dissertation, we address the information overload problem in two separate structured domains, namely, graphs and text. Graph kernels have been proposed as an efficient and theoretically sound approach to computing graph similarity. They decompose graphs into certain sub-structures, such as subtrees or subgraphs. However, existing graph kernels suffer from a few drawbacks. First, the dimension of the feature space associated with the kernel often grows exponentially as the complexity of the sub-structures increases. One immediate consequence of this behavior is that small, non-informative sub-structures occur more frequently and cause information overload. Second, as the number of features increases, we encounter sparsity: only a few informative sub-structures will co-occur in multiple graphs. In the first part of this dissertation, we propose to tackle the above problems by exploiting the dependency relationships among sub-structures. First, we propose a novel framework that learns latent representations of sub-structures by leveraging recent advancements in deep learning. Second, we propose a general smoothing framework that takes structural similarity into account, inspired by state-of-the-art smoothing techniques used in natural language processing. Both of the proposed frameworks are applicable to popular graph kernel families, and achieve significant performance improvements over state-of-the-art graph kernels. In the second part of this dissertation, we tackle information overload in text. We first focus on a popular social news aggregation website, Reddit, and design a submodular recommender system that tailors a personalized frontpage for individual users. Second, we propose a novel submodular framework to summarize videos, where both transcripts and comments are available. Third, we demonstrate how to apply filtering techniques to select a small subset of informative features from virtual machine logs in order to predict resource usage.
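    For intuition, the sketch below computes features for one popular kernel family, Weisfeiler-Lehman subtree kernels, used here as a stand-in for the kernels discussed above, without the proposed latent representations or smoothing. Every iteration multiplies the number of distinct features, which is exactly the overload and sparsity problem described.

```python
from collections import Counter

def wl_features(adj, labels, iterations=2):
    """Weisfeiler-Lehman subtree features: repeatedly relabel each vertex by
    hashing its own label with the sorted labels of its neighbours, counting
    every (re)label seen along the way."""
    feats = Counter(labels.values())
    for _ in range(iterations):
        labels = {v: hash((labels[v], tuple(sorted(labels[u] for u in adj[v]))))
                  for v in adj}
        feats.update(labels.values())
    return feats

def kernel(f, g):
    """Linear kernel between two sparse substructure-count vectors."""
    return sum(c * g[s] for s, c in f.items())

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
f = wl_features(triangle, {0: "a", 1: "a", 2: "a"})
print(kernel(f, f))
```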

    Gitter und Anwendungen (Lattices and Applications)

    The meeting focussed on lattices and their applications in mathematics and information technology. The research interests of the participants ranged over engineering sciences, algebraic and analytic number theory, coding theory, and algebraic geometry, to name only a few.

    Finding Planted Subgraphs with Few Eigenvalues using the Schur-Horn Relaxation

    Extracting structured subgraphs inside large graphs, often known as the planted subgraph problem, is a fundamental question that arises in a range of application domains. This problem is NP-hard in general and, as a result, significant efforts have been directed towards the development of tractable procedures that succeed on specific families of problem instances. We propose a new computationally efficient convex relaxation for solving the planted subgraph problem; our approach is based on tractable semidefinite descriptions of majorization inequalities on the spectrum of a symmetric matrix. This procedure is effective at finding planted subgraphs that consist of few distinct eigenvalues, and it generalizes previous convex relaxation techniques for finding planted cliques. Our analysis relies prominently on the notion of spectrally comonotone matrices, which are pairs of symmetric matrices that can be transformed to diagonal matrices with sorted diagonal entries upon conjugation by the same orthogonal matrix.
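    The convex body behind the relaxation, the Schur-Horn orbitope SH(λ) (the convex hull of all symmetric matrices with spectrum λ), admits a tractable description: M ∈ SH(λ) iff trace(M) equals the sum of λ and, for every k, the sum of the k largest eigenvalues of M is at most the sum of the k largest entries of λ. A minimal cvxpy sketch of this description, here used to project onto SH(λ); the paper's full planted-subgraph relaxation builds on this body but is not reproduced here:

```python
import cvxpy as cp
import numpy as np

def project_onto_schur_horn(T, lam):
    """Project a symmetric matrix T onto SH(lam) via the majorization-based
    semidefinite description of the orbitope."""
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]  # sort descending
    n = len(lam)
    M = cp.Variable((n, n), symmetric=True)
    constraints = [cp.trace(M) == lam.sum()]
    # Sum of k largest eigenvalues of M bounded by sum of k largest lam.
    constraints += [cp.lambda_sum_largest(M, k) <= lam[:k].sum()
                    for k in range(1, n)]
    cp.Problem(cp.Minimize(cp.norm(M - T, "fro")), constraints).solve()
    return M.value
```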

    Streaming and Sketch Algorithms for Large Data NLP

    The availability of large and rich quantities of text data is due to the emergence of the World Wide Web, social media, and mobile devices. Such vast data sets have led to leaps in the performance of many statistically-based problems. Given the magnitude of text data available, it is computationally prohibitive to train many complex Natural Language Processing (NLP) models on large data. This motivates the hypothesis that simple models trained on big data can outperform more complex models trained on small data. My dissertation provides a solution to effectively and efficiently exploit large data in many NLP applications. Datasets are growing at an exponential rate, much faster than memory capacity. To provide a memory-efficient solution for handling large datasets, this dissertation shows the limitations of existing streaming and sketch algorithms when applied to canonical NLP problems and proposes several new variants to overcome those shortcomings. Streaming and sketch algorithms process large data sets in one pass and represent a large data set with a compact summary, much smaller than the full size of the input. These algorithms can easily be implemented in a distributed setting and provide a solution that is both memory- and time-efficient. However, the memory and time savings come at the expense of approximate solutions. In this dissertation, I demonstrate that approximate solutions achieved on large data are comparable to exact solutions on large data and outperform exact solutions on smaller data. I focus on many NLP problems that boil down to tracking many statistics, like storing approximate counts, computing approximate association scores such as pointwise mutual information (PMI), finding frequent items (like n-grams), building streaming language models, and measuring distributional similarity. First, I introduce the concept of approximate streaming large-scale language models in NLP. Second, I present a novel variant of the Count-Min sketch that maintains approximate counts of all items. Third, I conduct a systematic study and compare many sketch algorithms that approximate counts of items, with a focus on large-scale NLP tasks. Last, I develop fast large-scale approximate graph (FLAG), a system that quickly constructs a large-scale approximate nearest-neighbor graph from a large corpus.
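    A plain, textbook Count-Min sketch, the data structure the dissertation's novel variant builds on, fits in a few lines. Estimates never undercount, and the overestimate is bounded with high probability, which is what makes approximate counts, PMI, and frequent n-gram mining feasible in one pass:

```python
import random

class CountMinSketch:
    """Approximate item counts using depth x width counters, one hash per row."""
    def __init__(self, width=2048, depth=4, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def add(self, item, count=1):
        for row, salt in zip(self.rows, self.salts):
            row[hash((salt, item)) % self.width] += count

    def query(self, item):
        # The minimum over rows is the tightest (never-too-small) estimate.
        return min(row[hash((salt, item)) % self.width]
                   for row, salt in zip(self.rows, self.salts))

cms = CountMinSketch()
for token in ["the", "cat", "the"]:
    cms.add(token)
print(cms.query("the"), cms.query("dog"))  # 2 0 (counts may be overestimated)
```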

    Variations on a Theme: Graph Homomorphisms

    This thesis investigates three areas of the theory of graph homomorphisms: cores of graphs, the homomorphism order, and quantum homomorphisms. A core of a graph X is a vertex-minimal subgraph to which X admits a homomorphism. Hahn and Tardif have shown that, for vertex transitive graphs, the size of the core must divide the size of the graph. This motivates the following question: when can the vertex set of a vertex transitive graph be partitioned into sets which each induce a copy of its core? We show that normal Cayley graphs and vertex transitive graphs with cores half their size always admit such partitions. We also show that the vertex sets of vertex transitive graphs with cores less than half their size do not, in general, have such partitions. Next we examine the restriction of the homomorphism order of graphs to line graphs. Our main focus is on comparing this restriction to the whole order. The primary tool we use in our investigation is that, as a consequence of Vizing's theorem, this partial order can be partitioned into intervals which can then be studied independently. We denote the line graph of X by L(X). We show that for all n ≥ 2, for any line graph Y strictly greater than the complete graph Kₙ, there exists a line graph X sitting strictly between Kₙ and Y. In contrast, we prove that there does not exist any connected line graph which sits strictly between L(Kₙ) and Kₙ, for n odd. We refer to this property as being "n-maximal", and we show that any such line graph must be a core and the line graph of a regular graph of degree n. Finally, we introduce quantum homomorphisms as a generalization of, and framework for, quantum colorings. Using quantum homomorphisms, we are able to define several other quantum parameters in addition to the previously defined quantum chromatic number. We also define two other parameters, projective rank and projective packing number, which satisfy a reciprocal relationship similar to that of fractional chromatic number and independence number, and are closely related to quantum homomorphisms. Using the projective packing number, we show that there exists a quantum homomorphism from X to Y if and only if the quantum independence number of a certain product graph achieves |V(X)|. This parallels a well-known classical result, and allows us to construct examples of graphs whose independence and quantum independence numbers differ. Most importantly, we show that if there exists a quantum homomorphism from a graph X to a graph Y, then ϑ̄(X) ≤ ϑ̄(Y), where ϑ̄ denotes the Lovász theta function of the complement. We prove similar monotonicity results for projective rank and the projective packing number of the complement, as well as for two variants of ϑ̄. These immediately imply that all of these parameters lie between the quantum clique and quantum chromatic numbers, in particular yielding a quantum analog of the well-known "sandwich theorem". We also briefly investigate the quantum homomorphism order of graphs.
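    The notions above rest on a single definition: a homomorphism from X to Y is a vertex map sending edges of X to edges of Y, and X ≤ Y in the homomorphism order iff such a map exists. A brute-force sketch for small graphs, with graphs given as adjacency dictionaries:

```python
from itertools import product

def has_homomorphism(X, Y):
    """Exhaustively search for a map f with f(u)f(v) an edge of Y
    whenever uv is an edge of X. Exponential; small examples only."""
    vx, vy = sorted(X), sorted(Y)
    for image in product(vy, repeat=len(vx)):
        f = dict(zip(vx, image))
        if all(f[v] in Y[f[u]] for u in X for v in X[u]):
            return True
    return False

# C5 -> K3 exists (the 5-cycle is 3-colourable) but K3 -> C5 does not
# (C5 is triangle-free): C5 <= K3 holds while K3 <= C5 fails.
C5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
K3 = {i: {j for j in range(3) if j != i} for i in range(3)}
print(has_homomorphism(C5, K3), has_homomorphism(K3, C5))  # True False
```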

    Apprentissage discriminant des modèles continus en traduction automatique

    Over the past few years, neural network (NN) architectures have been successfully applied to many Natural Language Processing (NLP) applications, such as Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT). For the language modeling task, these models consider linguistic units (i.e., words and phrases) through their projections into a continuous (multi-dimensional) space, and the estimated distribution is a function of these projections. Also known as continuous-space models (CSMs), their peculiarity hence lies in this exploitation of a continuous representation, which can be seen as an attempt to address the sparsity issue of conventional discrete models. In the context of SMT, these techniques have been applied to neural network-based language models (NNLMs) included in SMT systems, and to continuous-space translation models (CSTMs). These models have led to significant and consistent gains in SMT performance, but are also considered very expensive in training and inference, especially for systems involving large vocabularies. To overcome this issue, the Structured Output Layer (SOUL) and Noise Contrastive Estimation (NCE) have been proposed; the former modifies the standard structure of the output layer, while the latter approximates maximum-likelihood estimation (MLE) by a sampling method. All these approaches share the same estimation criterion, the MLE; however, using this procedure results in an inconsistency between the objective function defined for parameter estimation and the way models are used in the SMT application. The work presented in this dissertation aims to design new performance-oriented and global training procedures for CSMs to overcome these issues. The main contributions lie in the investigation and evaluation of efficient training methods for (large-vocabulary) CSMs which aim: (a) to reduce the total training cost, and (b) to improve the efficiency of these models when used within the SMT application. On the one hand, the training and inference cost can be reduced using the SOUL structure or the NCE algorithm, or by reducing the number of iterations via faster convergence. This thesis provides an empirical analysis of these solutions on different large-scale SMT tasks. On the other hand, we propose a discriminative training framework which optimizes the performance of the whole system containing the CSM as a component model.
The experimental results show that this framework is effective for both training and adapting CSMs within SMT systems, opening promising research perspectives.
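    For illustration, the per-word NCE objective mentioned above can be written in a few lines of numpy, assuming the model's unnormalized log-score is treated as a log-probability and q is the noise distribution (variable names are illustrative):

```python
import numpy as np

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)

def nce_loss(score_word, log_q_word, noise_scores, noise_log_q, k):
    """NCE for one observed word against k noise samples: classify data vs
    noise with logit s(w) - log(k * q(w)) instead of a full-vocabulary softmax."""
    loss = -log_sigmoid(score_word - np.log(k) - log_q_word)
    for s, lq in zip(noise_scores, noise_log_q):
        loss -= log_sigmoid(-(s - np.log(k) - lq))
    return loss
```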

    Convex Relaxations for Graph and Inverse Eigenvalue Problems

    This thesis is concerned with presenting convex optimization based tractable solutions for three fundamental problems: 1. Planted subgraph problem: Given two graphs, identifying the subset of vertices of the larger graph corresponding to the smaller one. 2. Graph edit distance problem: Given two graphs, calculating the number of edge/vertex additions and deletions required to transform one graph into the other. 3. Affine inverse eigenvalue problem: Given a subspace ε ⊂ 𝕊ⁿ and a vector of eigenvalues λ ∈ ℝⁿ, finding a symmetric matrix with spectrum λ contained in ε. These combinatorial and algebraic problems frequently arise in various application domains such as social networks, computational biology, chemoinformatics, and control theory. Nevertheless, exactly solving them in practice is only possible for very small instances due to their complexity. For each of these problems, we introduce convex relaxations which succeed in providing exact or approximate solutions in a computationally tractable manner. Our relaxations for the two graph problems are based on convex graph invariants, which are functions of graphs that do not depend on a particular labeling. One of these convex relaxations, coined the Schur-Horn orbitope, corresponds to the convex hull of all matrices with a given spectrum, and plays a prominent role in this thesis. Specifically, we utilize relaxations based on the Schur-Horn orbitope in the context of the planted subgraph problem and the graph edit distance problem. For both of these problems, we identify conditions under which the Schur-Horn orbitope based relaxations exactly solve the corresponding problem with overwhelming probability. Specifically, we demonstrate that these relaxations turn out to be particularly effective when the underlying graph has a spectrum comprised of few distinct eigenvalues with high multiplicities. In addition to relaxations based on the Schur-Horn orbitope, we also consider outer-approximations based on other convex graph invariants such as the stability number and the maximum-cut value for the graph edit distance problem. On the other hand, for the inverse eigenvalue problem, we investigate two relaxations arising from a sum of squares hierarchy. These relaxations have different approximation qualities, and accordingly induce different computational costs. We utilize our framework to generate solutions for, or certify the unsolvability of, the underlying inverse eigenvalue problem. We particularly emphasize the computational aspect of our relaxations throughout this thesis. We corroborate the utility of our methods with various numerical experiments.
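    For intuition about the affine inverse eigenvalue problem only (this is a simple heuristic, not the sum-of-squares relaxations studied in the thesis), one can alternate between the affine subspace ε and the set of symmetric matrices with spectrum λ; `project_affine` is assumed to be a caller-supplied orthogonal projector onto ε:

```python
import numpy as np

def alternating_iep(project_affine, lam, n, iters=500, seed=0):
    """Heuristic search for a matrix in the affine subspace with spectrum lam:
    alternately project onto the subspace and reset the eigenvalues."""
    rng = np.random.default_rng(seed)
    lam = np.sort(np.asarray(lam, dtype=float))
    A = rng.standard_normal((n, n))
    A = (A + A.T) / 2
    for _ in range(iters):
        A = project_affine(A)        # enforce membership in the subspace
        w, V = np.linalg.eigh(A)     # eigh returns ascending eigenvalues
        A = (V * lam) @ V.T          # keep eigenvectors, impose spectrum lam
    return A
```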