96 research outputs found

    Adversarial Evaluation for Models of Natural Language

    We now have a rich and growing set of modeling tools and algorithms for inducing linguistic structure from text that is less than fully annotated. In this paper, we discuss some of the weaknesses of our current methodology. We present a new abstract framework for evaluating natural language processing (NLP) models in general and unsupervised NLP models in particular. The central idea is to make explicit certain adversarial roles among researchers, so that the different roles in an evaluation are more clearly defined and performers of all roles are offered ways to make measurable contributions to the larger goal. Adopting this approach may help to characterize model successes and failures by encouraging earlier consideration of error analysis. The framework can be instantiated in a variety of ways, simulating some familiar intrinsic and extrinsic evaluations as well as some new evaluations.

    Text-Driven Forecasting

    Forecasting the future hinges on understanding the present. The web—particularly the social web—now gives us an up-to-the-minute snapshot of the world as it is and as it is perceived by many people, right now, but that snapshot is distributed in a way that is incomprehensible to a human. Much of this data is encoded in text, which is noisy, unstructured, and sparse; yet recent developments in natural language processing now permit us to analyze text and connect it to real-world measurable phenomena through statistical models. We propose text-driven forecasting as a challenge for natural language processing and machine learning: given a body of text T pertinent to a social phenomenon, make a concrete prediction about a measurement M of that phenomenon, obtainable only in the future, that rivals the best-known methods for forecasting M. We seek methods that work in many settings, for many kinds of text and many kinds of measurements. Accurate text-driven forecasting will be of use to the intelligence community, policymakers, and businesses. Statistical models are the norm in natural language processing, making it straightforward to develop models that provide posterior probabilities over measurements. Evaluation and comparison of forecasting algorithms is straightforward and inexpensive. We present encouraging recent results across several domains, emphasizing that a broad suite of forecasting problems and text sources will best support progress on this task. Further, advances in text-driven forecasting will have broad impact in natural language processing, giving a concrete, theory-independent platform that encourages exploration of new ideas for tackling various aspects of text-oriented computational intelligence.
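    As a concrete illustration of the task setup, here is a minimal sketch that fits a regression from bag-of-words features of T to a future measurement M. The texts, measurements, and the TF-IDF-plus-ridge pipeline are all illustrative assumptions, not the methods evaluated in the paper.

```python
# Toy text-driven forecasting: predict a future measurement M from text T.
# All data and the TF-IDF + ridge model are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# Hypothetical reviews (T) paired with later-observed opening grosses (M).
texts = ["gripping from start to finish, a career-best performance",
         "flat characters and a predictable, tired plot"]
future_measurements = [61.2, 8.4]  # made-up figures, in millions of dollars

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)            # sparse bag-of-words features
model = Ridge(alpha=1.0).fit(X, future_measurements)

# Forecast M for a new text before the measurement becomes observable.
print(model.predict(vectorizer.transform(["a gripping, career-best film"])))
```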

    Making the Most of Bag of Words: Sentence Regularization with Alternating Direction Method of Multipliers

    In many high-dimensional learning problems, only some parts of an observation are important to the prediction task; for example, the cues to correctly categorizing a document may lie in a handful of its sentences. We introduce a learning algorithm that exploits this intuition by encoding it in a regularizer. Specifically, we apply the sparse overlapping group lasso with one group for every bundle of features occurring together in a training-data sentence, leading to thousands to millions of overlapping groups. We show how to efficiently solve the resulting optimization challenge using the alternating direction method of multipliers. We find that the resulting method significantly outperforms competitive baselines (standard ridge, lasso, and elastic net regularizers) on a suite of real-world text categorization problems.
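    The sketch below works through the ADMM decomposition for this kind of regularizer under simplifying assumptions: squared loss, a fixed penalty weight per group, and a dense solve for the weight update. In the paper each group is the bundle of features occurring in one training sentence; the toy groups, lam, and rho here are ours.

```python
# A minimal sketch of ADMM for the sparse overlapping group lasso with
# squared loss: min_w 0.5*||Xw - y||^2 + lam * sum_g ||w_g||_2, handled by
# giving each group its own copy z_g of its weights. Groups, lam, and rho
# are illustrative assumptions.
import numpy as np

def group_shrink(v, kappa):
    """Prox of kappa*||.||_2: block soft-thresholding of one group copy."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= kappa else (1.0 - kappa / norm) * v

def admm_group_lasso(X, y, groups, lam=0.1, rho=1.0, iters=200):
    n, d = X.shape
    z = [np.zeros(len(g)) for g in groups]  # per-group weight copies
    u = [np.zeros(len(g)) for g in groups]  # scaled dual variables
    counts = np.zeros(d)                    # how many groups cover feature j
    for g in groups:
        counts[g] += 1
    A = X.T @ X + rho * np.diag(counts)     # fixed system for the w-update
    Xty = X.T @ y
    w = np.zeros(d)
    for _ in range(iters):
        rhs = Xty.copy()
        for g, zg, ug in zip(groups, z, u):
            rhs[g] += rho * (zg - ug)
        w = np.linalg.solve(A, rhs)         # ridge-like w-update
        for i, g in enumerate(groups):      # independent per-group prox steps
            v = w[g] + u[i]
            z[i] = group_shrink(v, lam / rho)
            u[i] = v - z[i]                 # dual update on constraint w_g = z_g
    return w

# Toy usage: 6 features, two overlapping "sentence" groups.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(40, 6)), rng.normal(size=40)
groups = [np.array([0, 1, 2, 3]), np.array([2, 3, 4, 5])]
print(admm_group_lasso(X, y, groups))
```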

    Concavity and Initialization for Unsupervised Dependency Parsing

    We investigate models for unsupervised learning with concave log-likelihood functions. We begin with the best-known example, IBM Model 1 for word alignment (Brown et al., 1993), and analyze its properties, discussing why other models for unsupervised learning are so seldom concave. We then present concave models for dependency grammar induction and validate them experimentally. We find our concave models to be effective initializers for the dependency model of Klein and Manning (2004) and show that we can encode linguistic knowledge in them for improved performance.
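    To make the concavity point concrete, here is a minimal EM implementation of standard IBM Model 1 itself (not the paper's concave dependency models). Because the log-likelihood is concave in the translation parameters, EM climbs to a global optimum rather than getting trapped in a bad local one; the toy bitext is assumed.

```python
# EM for IBM Model 1. The log-likelihood is concave in t(f|e), so unlike
# most unsupervised models, initialization cannot trap EM in a bad local
# optimum. The toy bitext below is an illustrative assumption.
from collections import defaultdict

def ibm_model1(bitext, iters=10):
    f_vocab = {f for _, fs in bitext for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # t[(f, e)] = P(f | e), uniform
    for _ in range(iters):
        count = defaultdict(float)               # expected counts c(f, e)
        total = defaultdict(float)               # expected counts c(e)
        for es, fs in bitext:
            src = ["<NULL>"] + es                # allow alignment to NULL
            for f in fs:
                z = sum(t[(f, e)] for e in src)  # normalizer over alignments
                for e in src:
                    p = t[(f, e)] / z            # posterior that e generated f
                    count[(f, e)] += p
                    total[e] += p
        for (f, e), c in count.items():          # M-step: renormalize counts
            t[(f, e)] = c / total[e]
    return t

bitext = [(["the", "house"], ["la", "maison"]),
          (["the", "book"], ["le", "livre"])]
t = ibm_model1(bitext)
print(t[("maison", "house")])  # translation mass concentrates on the right pairs
```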

    Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties

    We present novel methods to construct compact natural language lexicons within a graph-based semi-supervised learning framework, an attractive platform suited for propagating soft labels onto new natural language types from seed data. To achieve compactness, we induce sparse measures at graph vertices by incorporating sparsity-inducing penalties in Gaussian and entropic pairwise Markov networks constructed from labeled and unlabeled data. Sparse measures are desirable for high-dimensional multi-class learning problems such as the induction of labels on natural language types, which typically associate with only a few labels. On two lexicon expansion problems, our approach produces significantly smaller lexicons and obtains better predictive performance than standard graph-based learning methods.
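    The toy below illustrates the effect of sparse vertex measures in the simplest setting: ordinary graph label propagation with an l1-style soft-threshold and renormalization at each vertex. The paper instead places the sparsity-inducing penalties inside Gaussian and entropic pairwise Markov networks, so this update is only an assumed stand-in for that behavior.

```python
# Toy sparse label propagation on a similarity graph W (n x n) with seed
# label distributions Y0 (n x k). The soft-threshold step is an assumed
# stand-in for the paper's penalized Markov network objectives.
import numpy as np

def sparse_label_propagation(W, Y0, alpha=0.8, tau=0.01, iters=50):
    S = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * (S @ Y) + (1.0 - alpha) * Y0   # propagate, clamp to seeds
        Y = np.maximum(Y - tau, 0.0)               # drop negligible label mass
        Y /= np.maximum(Y.sum(axis=1, keepdims=True), 1e-12)  # back to measures
    return Y

# Three vertices on a chain; only vertex 0 is seeded (with label 0).
W = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
Y0 = np.array([[1., 0.], [0., 0.], [0., 0.]])
print(sparse_label_propagation(W, Y0))
```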

    Generative Models of Monolingual and Bilingual Gappy Patterns

    A growing body of machine translation research aims to exploit lexical patterns (e.g., n-grams and phrase pairs) with gaps (Simard et al., 2005; Chiang, 2005; Xiong et al., 2011). Typically, these “gappy patterns” are discovered using heuristics based on word alignments or local statistics such as mutual information. In this paper, we develop generative models of monolingual and parallel text that build sentences using gappy patterns of arbitrary length and with arbitrarily many gaps. We exploit Bayesian nonparametrics and collapsed Gibbs sampling to discover salient patterns in a corpus. We evaluate the patterns qualitatively and also add them as features to an MT system, reporting promising preliminary results.
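    For readers unfamiliar with the pattern space, the toy enumerator below makes "gappy pattern" concrete: it lists bounded-length subsequences of a sentence with a gap marker wherever words are skipped. The paper's models learn which such patterns are salient rather than enumerating them, and the length and gap bounds here are our assumptions.

```python
# Enumerate bounded "gappy patterns" from one sentence, writing "_" where
# one or more words are skipped. A toy illustration of the pattern space
# only; the paper discovers salient patterns with Bayesian nonparametric
# models and collapsed Gibbs sampling instead of exhaustive enumeration.
from itertools import combinations

def gappy_patterns(sentence, max_len=3, max_gaps=2):
    patterns = set()
    for k in range(2, max_len + 1):
        for idx in combinations(range(len(sentence)), k):
            gaps = sum(1 for a, b in zip(idx, idx[1:]) if b - a > 1)
            if gaps == 0 or gaps > max_gaps:
                continue                       # keep only gappy patterns
            out, prev = [sentence[idx[0]]], idx[0]
            for i in idx[1:]:
                if i - prev > 1:
                    out.append("_")            # mark the gap
                out.append(sentence[i])
                prev = i
            patterns.add(" ".join(out))
    return patterns

print(sorted(gappy_patterns("he gave the book to her".split())))
# e.g. "gave _ to" covers "gave ... to" with an arbitrary object in the gap
```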

    Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions

    We describe tree edit models for representing sequences of tree transformations involving complex reordering phenomena and demonstrate that they offer a simple, intuitive, and effective method for modeling pairs of semantically related sentences. To efficiently extract sequences of edits, we employ a tree kernel as a heuristic in a greedy search routine. We describe a logistic regression model that uses 33 syntactic features of edit sequences to classify the sentence pairs. The approach leads to competitive performance in recognizing textual entailment, paraphrase identification, and answer selection for question answering.
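    As a minimal sketch of the classification stage, the snippet below counts edit-operation types in a given edit sequence and trains a logistic regression over sentence pairs. The four coarse edit types and the toy data are stand-ins for the paper's 33 syntactic features, and the edit sequences themselves would come from the tree-kernel-guided greedy search rather than being given.

```python
# Classify sentence pairs from features of their tree edit sequences.
# The 4 coarse edit types and the toy pairs are illustrative stand-ins
# for the paper's 33 syntactic features of edit sequences.
from collections import Counter
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

EDIT_TYPES = ["insert", "delete", "relabel", "move"]

def featurize(edit_sequence):
    counts = Counter(op for op, _ in edit_sequence)
    return {t: counts.get(t, 0) for t in EDIT_TYPES}

# Hypothetical (edit sequence, entails?) training pairs.
pairs = [
    ([("delete", "a"), ("relabel", "dog->animal")], 1),
    ([("insert", "not"), ("move", "subj")], 0),
    ([("relabel", "buy->purchase")], 1),
    ([("insert", "never"), ("delete", "all")], 0),
]
vec = DictVectorizer()
X = vec.fit_transform([featurize(seq) for seq, _ in pairs])
y = [label for _, label in pairs]
clf = LogisticRegression().fit(X, y)
print(clf.predict(vec.transform([featurize([("relabel", "car->vehicle")])])))
```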

    Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

    We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and Urdu-English translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.

    Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features

    Recent research has shown clear improvement in translation quality by exploiting linguistic syntax for either the source or target language. However, when using syntax for both languages (“tree-to-tree” translation), there is evidence that syntactic divergence can hamper the extraction of useful rules (Ding and Palmer, 2005). Smith and Eisner (2006) introduced quasi-synchronous grammar, a formalism that treats non-isomorphic structure softly, using features rather than hard constraints. Although a natural fit for translation modeling, its flexibility has proved challenging for building real-world systems. In this article, we present a tree-to-tree machine translation system inspired by quasi-synchronous grammar. The core of our approach is a new model that combines phrases and dependency syntax, integrating the advantages of phrase-based and syntax-based translation. We report statistically significant improvements over a phrase-based baseline on five of seven test sets across four language pairs. We also present encouraging preliminary results on the use of unsupervised dependency parsing for syntax-based machine translation.
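    To make "treating non-isomorphic structure softly" concrete, the sketch below scores a target dependency tree against a source tree through a word alignment, classifying each aligned target edge by the configuration its endpoints form on the source side. The configuration names echo Smith and Eisner (2006); the parent-array encoding, alignment format, and weights are illustrative assumptions.

```python
# Soft tree-to-tree scoring in the quasi-synchronous spirit: instead of
# requiring isomorphic trees, each aligned target dependency edge fires a
# feature for the configuration of its endpoints in the source tree.
# Encodings and weights below are illustrative assumptions.
def configuration(src_parent, s_head, s_child):
    if src_parent[s_child] == s_head:
        return "parent-child"       # target edge mirrors a source edge
    if src_parent[s_head] == s_child:
        return "child-parent"       # dependency direction is reversed
    if src_parent[s_child] == src_parent[s_head]:
        return "sibling"
    return "other"                  # divergent: penalized softly, not banned

def qg_score(tgt_parent, src_parent, align, weights):
    """tgt_parent/src_parent: parent index per word (-1 for the root);
    align: target word index -> source word index for aligned words."""
    score = 0.0
    for child, head in enumerate(tgt_parent):
        if head < 0 or child not in align or head not in align:
            continue                # unaligned material contributes nothing
        score += weights[configuration(src_parent, align[head], align[child])]
    return score

weights = {"parent-child": 1.0, "child-parent": 0.3, "sibling": 0.2, "other": -0.5}
# 3-word toy trees: word 1 is the root on both sides, aligned monotonically.
print(qg_score([1, -1, 1], [1, -1, 1], {0: 0, 1: 1, 2: 2}, weights))
```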
    • …