527 research outputs found

    Econometrics meets sentiment: an overview of methodology and applications

    The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.
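    As a rough illustration of the pipeline this survey covers, the sketch below turns document-level lexicon scores into a daily sentiment index and relates it to a target series with ordinary least squares. The lexicon, the aggregation rule, and the simple regression are illustrative assumptions, not methods prescribed by the survey.

```python
# Minimal sentometrics-style sketch: score documents with a sentiment
# lexicon, aggregate into a daily index, regress a target series on it.
# Lexicon, documents, and target values below are hypothetical.
import numpy as np

LEXICON = {"gain": 1.0, "growth": 0.5, "loss": -1.0, "risk": -0.5}

def doc_sentiment(text: str) -> float:
    """Average lexicon score of the tokens in one document."""
    tokens = text.lower().split()
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def daily_index(docs_by_day: dict) -> dict:
    """Aggregate document-level sentiment into one index value per day."""
    return {day: float(np.mean([doc_sentiment(d) for d in docs]))
            for day, docs in docs_by_day.items()}

def ols(sentiment: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Regress the target series on the sentiment index (intercept + slope)."""
    X = np.column_stack([np.ones_like(sentiment), sentiment])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta

docs = {"2024-01-02": ["strong growth and gain", "some risk remains"],
        "2024-01-03": ["heavy loss reported"]}
index = daily_index(docs)
print(ols(np.array(list(index.values())), np.array([0.4, -0.9])))
```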

    Computational Approaches to the Syntax–Prosody Interface: Using Prosody to Improve Parsing

    Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing, hence there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing. Chapter 2 examines the effect of syntactic phrase length on prosody in double center-embedded sentences in French. Data collected in a previous study were reanalyzed using native speaker judgment and automatic methods (forced alignment). Results demonstrate prosodic splitting behavior similar to that found in English, in contradiction to the original study’s findings. Chapter 3 presents a number of studies examining whether syntactic ambiguity can yield different prosodic patterns, allowing humans and/or computers to resolve the ambiguity. In an experimental study, humans disambiguated sentences with prepositional phrase (PP) attachment ambiguity with 49% accuracy when presented as text, and 63% when presented as audio. Machine learning on the same data yielded an accuracy of 63-73%. A corpus study on the Switchboard corpus used both prosodic breaks and phrase lengths to predict the attachment, with an accuracy of 63.5% for PP-attachment sentences, and 71.2% for relative clause attachment. Chapter 4 aims to identify aspects of syntax that relate to prosody and use these in combination with prosodic cues to improve parsing. The aspects identified (dependency configurations) are based on dependency structure, reflecting the relative head location of two consecutive words, and are used as syntactic features in an ensemble system based on Recurrent Neural Networks, to score parse hypotheses and select the most likely parse for a given sentence. Using syntactic features alone, the system achieved an improvement of 1.1% absolute in Unlabelled Attachment Score (UAS) on the test set over the best parser in the ensemble, while using syntactic features combined with prosodic features (pauses and normalized duration) led to a further improvement of 0.4% absolute. The results achieved demonstrate the relationship between syntax, syntactic phrase length, and prosody, and indicate the ability and future potential of prosody to resolve ambiguity and improve parsing.
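    The sketch below mirrors, in spirit, the machine-learning experiments on PP-attachment described above: a simple classifier predicts high (verb) versus low (noun) attachment from prosodic break and phrase-length features. The feature set, the toy data, and the use of scikit-learn's logistic regression are assumptions for illustration, not the dissertation's actual models.

```python
# Illustrative PP-attachment disambiguation from prosodic and length cues.
# Feature values and labels are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [break strength before the PP, normalized duration of the word
#            before the PP, verb-phrase length (words), noun-phrase length (words)]
X = np.array([
    [3.0, 1.4, 4, 2],   # strong break before PP -> often high (verb) attachment
    [0.0, 0.9, 3, 5],   # no break, long NP      -> often low (noun) attachment
    [2.0, 1.2, 5, 2],
    [1.0, 1.0, 2, 6],
])
y = np.array([1, 0, 1, 0])  # 1 = verb (high) attachment, 0 = noun (low) attachment

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2.5, 1.3, 4, 3]]))  # classify a new ambiguous sentence
```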

    Online learning of latent linguistic structure with approximate search

    Automatic analysis of natural language data is a frequently occurring application of machine learning systems. These analyses often revolve around some linguistic structure, for instance a syntactic analysis of a sentence by means of a tree. Machine learning models that carry out structured prediction, as opposed to simpler machine learning tasks such as classification or regression, have therefore received considerable attention in the language processing literature. As an additional twist, the sought linguistic structures are sometimes not directly modeled themselves. Rather, prediction takes place in a different space where the same linguistic structure can be represented in more than one way. However, in a standard supervised learning setting, these prediction structures are not available in the training data; only the linguistic structure is. Since multiple prediction structures may correspond to the same linguistic structure, it is thus unclear which prediction structure to use for learning. One option is to treat the prediction structure as latent and let the machine learning algorithm guide this selection. In this dissertation we present an abstract framework for structured prediction. This framework supports latent structures and is agnostic of the particular language processing task. It defines a set of hyperparameters and task-specific functions which a user must implement in order to apply it to a new task. The advantage of this modularization is that it permits comparisons and reuse across tasks in a common framework. The framework we devise is based on the structured perceptron for learning. The perceptron is an online learning algorithm which considers one training instance at a time, makes a prediction, and carries out an update if the prediction was wrong. We couple the structured perceptron with beam search, which is a general-purpose search algorithm. Beam search is, however, only approximate, meaning that there is no guarantee that it will find the optimal structure in a large search space. Therefore special attention is required to handle search errors during training. This has led to the development of special update methods such as early and max-violation updates. The contributions of this dissertation sit at the intersection of machine learning and natural language processing. With regard to language processing, we consider three tasks: coreference resolution, dependency parsing, and joint sentence segmentation and dependency parsing. For coreference resolution, we start from an existing latent tree model and extend it to accommodate non-local features drawn from a greater structural context. This requires us to sacrifice exact search for approximate search, but we show that, provided sufficiently advanced update methods are used for the structured perceptron, the richer scope of features yields a stronger coreference model. We take a transition-based approach to dependency parsing, where dependency trees are constructed incrementally by a transition system. Latent structures for transition-based parsing have previously not received much attention, partly because the characterization of the prediction space is non-trivial. We provide a thorough analysis of this space with regard to the ArcStandard with Swap transition system. This characterization enables us to evaluate the role of latent structures in transition-based dependency parsing.
Empirically, we find that the utility of latent structures depends on the choice of approximate search: for greedy search they improve performance, whereas with beam search they are on par with, or sometimes slightly ahead of, previous approaches. We then go on to extend this transition system to do joint sentence segmentation and dependency parsing. We develop a transition system capable of handling this task and evaluate it on noisy, non-edited texts. With a set of carefully selected baselines and data sets, we employ this system to measure the effectiveness of syntactic information for sentence segmentation. We show that, in the absence of obvious orthographic clues such as punctuation and capitalization, syntactic information can be used to improve sentence segmentation. With regard to machine learning, our contributions of course include the framework itself. The task-specific evaluations, however, allow us to probe the learning machinery along certain boundary points and draw more general conclusions. A recurring observation is that some of the standard update methods for the structured perceptron with approximate search -- e.g., early and max-violation updates -- are inadequate when the predicted structure reaches a certain size. We show that the primary problem with these updates is that they may discard training data and that this effect increases as the structure size increases. This problem can be handled by using more advanced update methods that commit to using all the available training data. Here, we propose a new update method, DLaSO, which consistently outperforms all other update methods we compare to. Moreover, while this problem could potentially be handled by an increased beam size, we also show that this cannot fully compensate for the structure size and that the more advanced methods are indeed required.
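    A compact sketch of the learning setup described in this abstract appears below: a structured perceptron trained with beam search, using an early update whenever the gold action sequence falls out of the beam. The feature function, the transition inventory, and the fixed-length action sequences are placeholder assumptions; more advanced updates such as max-violation or DLaSO would replace the update rule shown here.

```python
# Structured perceptron with beam search and early updates (sketch).
# featurize(sent, history, action) -> sparse feature dict; gold is the
# oracle action sequence for a sentence; both are placeholders here.
from collections import defaultdict

def beam_search(weights, sent, gold, beam_size, featurize, actions):
    """Return (best hypothesis, violating gold prefix or None)."""
    beam = [([], 0.0)]                                # (action sequence, score)
    for i in range(len(gold)):
        expanded = []
        for seq, score in beam:
            for a in actions:
                feats = featurize(sent, seq, a)
                step = sum(weights[k] * v for k, v in feats.items())
                expanded.append((seq + [a], score + step))
        beam = sorted(expanded, key=lambda h: h[1], reverse=True)[:beam_size]
        if not any(seq == gold[:i + 1] for seq, _ in beam):
            return beam[0], gold[:i + 1]              # gold fell out: early-update point
    return beam[0], None

def update(weights, sent, predicted, gold, featurize):
    """Perceptron update: reward gold features, penalize predicted features."""
    for seq, sign in ((gold, +1.0), (predicted, -1.0)):
        for i, a in enumerate(seq):
            for k, v in featurize(sent, seq[:i], a).items():
                weights[k] += sign * v

def train(data, actions, featurize, beam_size=8, epochs=5):
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent, gold in data:
            (pred, _), violation = beam_search(weights, sent, gold, beam_size, featurize, actions)
            if violation is not None:                 # early update on the violating prefix
                update(weights, sent, pred[:len(violation)], violation, featurize)
            elif pred != gold:                        # otherwise a standard full update
                update(weights, sent, pred, gold, featurize)
    return weights
```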

    Movie Description

    Audio Description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset which contains transcribed ADs that are temporally aligned to full-length movies. In addition, we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions. In total, the Large Scale Movie Description Challenge (LSMDC) contains a parallel corpus of 118,114 sentences and video clips from 202 movies. First we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are indeed more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in a challenge organized in the context of the workshop "Describing and Understanding Video & The Large Scale Movie Description Challenge (LSMDC)" at ICCV 2015.

    Deep Generative Models for Natural Language

    Generative models aim to simulate the process by which a set of data is generated. They are intuitive, interpretable, and naturally suited to learning from unlabelled data. This is particularly appealing in natural language processing, where labels are often costly to obtain and can require significant manual input from trained annotators. However, traditional generative modelling approaches can often be inflexible due to the need to maintain tractable maximum likelihood training. On the other hand, deep learning methods are powerful, flexible, and have achieved significant success on a wide variety of natural language processing tasks. In recent years, algorithms have been developed for training generative models that incorporate neural networks to parametrise their conditional distributions. These approaches aim to take advantage of the intuitiveness and interpretability of generative models as well as the power and flexibility of deep learning. In this work, we investigate how to leverage such algorithms in order to develop deep generative models for natural language. Firstly, we present an attention-based latent variable model, trained using unlabelled data, for learning representations of sentences. Experiments such as missing word imputation and sentence similarity matching suggest that the representations are able to learn semantic information about the sentences. We then present an RNN-based latent variable model for performing machine translation. Trained using semi-supervised learning, our approach achieves strong results even with very limited labelled data. Finally, we present a locally-contextual conditional random field for performing sequence labelling tasks. Our method consistently outperforms the linear chain conditional random field and achieves state-of-the-art performance on two out of the four tasks evaluated.
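    As a minimal illustration of the latent-variable models discussed above, the sketch below defines a VAE-style sentence model: an encoder maps a sentence to a latent code, a decoder reconstructs it, and training would maximize the ELBO. The bag-of-words encoder and decoder and all layer sizes are simplifying assumptions, not the architectures used in this work.

```python
# VAE-style sentence model sketch; sizes and the bag-of-words encoder/decoder
# are illustrative assumptions only.
import torch
import torch.nn as nn

class SentenceVAE(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 128, latent: int = 32):
        super().__init__()
        self.encode = nn.EmbeddingBag(vocab_size, hidden)  # mean of token embeddings
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.decode = nn.Linear(latent, vocab_size)        # bag-of-words reconstruction

    def forward(self, token_ids):                          # token_ids: (batch, length)
        h = self.encode(token_ids)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        logits = self.decode(z)                                 # feed to a reconstruction loss
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return logits, kl                                   # ELBO = reconstruction term - kl

model = SentenceVAE(vocab_size=10000)
logits, kl = model(torch.randint(0, 10000, (4, 12)))        # a batch of 4 toy "sentences"
```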

    FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework

    This paper integrates graph-to-sequence modelling into an end-to-end text-to-speech framework for syntax-aware modelling of the syntactic information in the input text. Specifically, the input text is parsed by a dependency parsing module to form a syntactic graph. The syntactic graph is then encoded by a graph encoder to extract syntactic hidden information, which is concatenated with phoneme embeddings and input to the alignment and flow-based decoding modules to generate the raw audio waveform. The model is evaluated on two languages, English and Mandarin, using single-speaker, few-sample target-speaker, and multi-speaker datasets, respectively. Experimental results show better prosodic consistency between the input text and the generated audio, higher scores in the subjective prosodic evaluation, and an ability to perform voice conversion. In addition, the efficiency of the model is greatly boosted through the design of an AI-chip operator, yielding 5x acceleration. Comment: Accepted by the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2023).
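    The sketch below illustrates the syntax-aware front end described above: word-level features are passed through one graph-encoder step over the dependency graph, and the resulting syntactic hidden states are concatenated with phoneme embeddings before the (omitted) alignment and decoding modules. The single GCN-like layer, the tensor sizes, and the phoneme-to-word mapping are assumptions for illustration.

```python
# Syntax-aware TTS front end sketch: graph-encoded syntax + phoneme embeddings.
import torch
import torch.nn as nn

class SyntaxAwareFrontEnd(nn.Module):
    def __init__(self, n_phonemes: int, word_dim: int = 64, syn_dim: int = 64, ph_dim: int = 192):
        super().__init__()
        self.graph_proj = nn.Linear(word_dim, syn_dim)     # stand-in for a graph encoder layer
        self.phoneme_embed = nn.Embedding(n_phonemes, ph_dim)

    def forward(self, word_feats, adjacency, phoneme_ids, word_of_phoneme):
        # One graph-convolution-like step: aggregate each word's dependency
        # neighbours, then project to syntactic hidden states.
        syn = torch.tanh(self.graph_proj(adjacency @ word_feats))   # (n_words, syn_dim)
        ph = self.phoneme_embed(phoneme_ids)                        # (n_phonemes, ph_dim)
        # Attach each phoneme's parent-word syntax vector and concatenate;
        # the result would feed the alignment and flow-based decoder.
        return torch.cat([ph, syn[word_of_phoneme]], dim=-1)

frontend = SyntaxAwareFrontEnd(n_phonemes=70)
out = frontend(torch.randn(5, 64), torch.eye(5),
               torch.randint(0, 70, (12,)), torch.randint(0, 5, (12,)))
print(out.shape)                                            # (12, 192 + 64)
```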

    Predictability effects in language acquisition

    Human language has two fundamental requirements: it must allow competent speakers to exchange messages efficiently, and it must be readily learned by children. Recent work has examined effects of language predictability on language production, with many researchers arguing that so-called “predictability effects” function towards the efficiency requirement. Specifically, recent work has found that talkers tend to reduce more probable linguistic forms more heavily. This dissertation proposes the “Predictability Bootstrapping Hypothesis”: that predictability effects also make language more learnable. There is a great deal of evidence that adult grammars have substantial statistical components. Since predictability effects result in heavier reduction for more probable words and hidden structure, they provide infants with direct cues to the statistical components of the grammars they are trying to learn. The corpus studies and computational modeling experiments in this dissertation show that predictability effects could be a substantial source of information to language-learning infants, focusing on the potential utility of phonetic reduction in terms of word duration for syntax acquisition. First, corpora of spontaneous adult-directed and child-directed speech (ADS and CDS, respectively) are compared to verify that predictability effects actually exist in CDS. While revealing some differences, mixed-effects regressions on those corpora indicate that predictability effects in CDS are largely similar (in kind and magnitude) to predictability effects in ADS. This result indicates that predictability effects are available to infants, however useful they may be. Second, this dissertation builds probabilistic, unsupervised, and lexicalized models for learning about syntax from words and durational cues. One series of models is based on Hidden Markov Models and learns shallow constituency structure, while the other series is based on the Dependency Model with Valence and learns dependency structure. These models are then used to measure how useful durational cues are for syntax acquisition, and to what extent their utility in this task can be attributed to effects of syntactic predictability on word duration. As part of this investigation, these models are also used to explore the venerable “Prosodic Bootstrapping Hypothesis” that prosodic structure, which is cued in part by word duration, may be useful for syntax acquisition. The empirical evaluations of these models provide evidence that effects of syntactic predictability on word duration are easier to discover and exploit than effects of prosodic structure, and that even gold-standard annotations of prosodic structure provide at most a relatively small improvement in parsing performance over raw word duration. Taken together, this work indicates that predictability effects provide useful information about syntax to infants, showing that the Predictability Bootstrapping Hypothesis for syntax acquisition is computationally plausible and motivating future behavioural investigation. Additionally, as talkers consider the probability of many different aspects of linguistic structure when reducing according to predictability effects, this result also motivates investigation of Predictability Bootstrapping of other aspects of linguistic knowledge.
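    The sketch below gives a toy version of the predictability effect at issue: each word's predictability is estimated with a smoothed bigram model, and its surprisal is correlated with word duration. The corpus, the durations, and the use of a plain correlation in place of the dissertation's mixed-effects regressions are illustrative simplifications.

```python
# Toy predictability-effect check: more surprising words should be longer.
# Corpus tokens and word durations below are hypothetical.
import math
from collections import Counter
import numpy as np

def bigram_surprisal(tokens):
    """-log2 P(w_i | w_{i-1}) for each token after the first, add-one smoothed."""
    unigrams, bigrams = Counter(tokens), Counter(zip(tokens, tokens[1:]))
    vocab = len(unigrams)
    out = []
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        out.append(-math.log2(p))
    return out

tokens = "the dog chased the cat the dog saw the cat".split()
# Durations (seconds) for tokens[1:], aligned with the surprisal values.
durations = np.array([0.31, 0.42, 0.12, 0.30, 0.13, 0.33, 0.40, 0.12, 0.29])
surprisal = np.array(bigram_surprisal(tokens))
print(np.corrcoef(surprisal, durations)[0, 1])  # positive: less predictable -> longer
```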

    Construction Grammar and Language Models

    Recent progress in deep learning and natural language processing has given rise to powerful models that are primarily trained on a cloze-like task and show some evidence of having access to substantial linguistic information, including some constructional knowledge. This groundbreaking discovery presents an exciting opportunity for a synergistic relationship between computational methods and Construction Grammar research. In this chapter, we explore three distinct approaches to the interplay between computational methods and Construction Grammar: (i) computational methods for text analysis, (ii) computational Construction Grammar, and (iii) deep learning models, with a particular focus on language models. We touch upon the first two approaches as a contextual foundation for the use of computational methods before providing an accessible, yet comprehensive overview of deep learning models, which also addresses reservations construction grammarians may have. Additionally, we delve into experiments that explore the emergence of constructionally relevant information within these models while also examining the aspects of Construction Grammar that may pose challenges for these models. This chapter aims to foster collaboration between researchers in the fields of natural language processing and Construction Grammar. By doing so, we hope to pave the way for new insights and advancements in both these fields. (Accepted for publication in The Cambridge Handbook of Construction Grammar, edited by Mirjam Fried and Kiki Nikiforidou.)
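    A minimal example of the cloze-style probing mentioned above: a masked language model fills the open slot of a construction, and the ranked fillers give a rough picture of whatever constructional knowledge it has acquired. The probe sentence and the choice of bert-base-uncased are illustrative assumptions, not the chapter's experiments.

```python
# Cloze probe of a construction with a masked language model (sketch).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
# Probe the comparative correlative construction "the Xer ..., the Yer ...".
for cand in fill("The harder you work, the [MASK] you get."):
    print(f"{cand['token_str']:>10}  {cand['score']:.3f}")
```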
