10 research outputs found

    Evolution of Italian Treebank and Dependency Parsing towards Universal Dependencies

    We highlight the main changes recently undergone by the Italian Dependency Treebank in the transition to an extended and revised edition, compliant with the annotation schema of Universal Dependencies. We explore how these changes affect the accuracy of dependency parsers, performing comparative tests on various versions of the treebank. Despite significant changes in the annotation style, statistical parsers seem to cope well and mostly improve.
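    The comparative tests mentioned above reduce to computing attachment scores against gold-standard trees. As a rough illustration (not the authors' evaluation code; the function name and the simplified sentence encoding are ours), a minimal UAS/LAS computation might look like this:

```python
def attachment_scores(gold, predicted):
    """Compute UAS/LAS over two parallel lists of sentences.

    Each sentence is a list of (head, deprel) pairs, one per token,
    as could be read from a CoNLL-U style treebank.
    """
    total = uas_hits = las_hits = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        for (g_head, g_rel), (p_head, p_rel) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1          # correct head: unlabelled hit
                if g_rel == p_rel:
                    las_hits += 1      # correct head and label: labelled hit
    return uas_hits / total, las_hits / total

# Example: a two-token sentence where one head is right but its label is not.
gold = [[(2, "nsubj"), (0, "root")]]
pred = [[(2, "obj"), (0, "root")]]
print(attachment_scores(gold, pred))  # (1.0, 0.5)
```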

    State-of-the-art Italian dependency parsers based on neural and ensemble systems

    In this paper we present work that aims to test the most advanced, state-of-the-art syntactic dependency parsers based on deep neural networks (DNNs) on Italian. We conducted a large set of experiments using two Italian treebanks containing different text types, downloaded from the Universal Dependencies project, and propose a new solution based on ensemble systems. We implemented the proposed ensemble solutions by testing different techniques described in the literature, obtaining very good parsing results, well above the state of the art for Italian.
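    One ensemble technique commonly described in the literature, and plausibly among those tested here, is per-token (optionally weighted) voting over the heads proposed by the individual parsers. The sketch below is a generic illustration under that assumption, not the paper's implementation:

```python
from collections import Counter

def vote_heads(predictions, weights=None):
    """Combine head predictions from several parsers by (weighted) voting.

    predictions[p][i] is the head index that parser p assigns to token i;
    all parsers are assumed to share the same tokenization.
    """
    weights = weights or [1.0] * len(predictions)
    combined = []
    for i in range(len(predictions[0])):
        votes = Counter()
        for parser_heads, w in zip(predictions, weights):
            votes[parser_heads[i]] += w
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three parsers agree on tokens 1 and 2 and split on token 3;
# the majority head wins at each position.
print(vote_heads([[2, 0, 2], [2, 0, 1], [2, 0, 2]]))  # [2, 0, 2]
```

    Note that per-token voting can produce a structure that is not a well-formed tree; published ensemble parsers typically repair this with a maximum-spanning-tree step (e.g. Chu-Liu/Edmonds) over the accumulated vote scores.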

    Online learning of latent linguistic structure with approximate search

    Automatic analysis of natural language data is a frequently occurring application of machine learning systems. These analyses often revolve around some linguistic structure, for instance a syntactic analysis of a sentence by means of a tree. Machine learning models that carry out structured prediction, as opposed to simpler machine learning tasks such as classification or regression, have therefore received considerable attention in the language processing literature. As an additional twist, the sought linguistic structures are sometimes not directly modeled themselves. Rather, prediction takes place in a different space where the same linguistic structure can be represented in more than one way. However, in a standard supervised learning setting, these prediction structures are not available in the training data; only the linguistic structure is. Since multiple prediction structures may correspond to the same linguistic structure, it is unclear which prediction structure to use for learning. One option is to treat the prediction structure as latent and let the machine learning algorithm guide this selection.

    In this dissertation we present an abstract framework for structured prediction. This framework supports latent structures and is agnostic of the particular language processing task. It defines a set of hyperparameters and task-specific functions which a user must implement in order to apply it to a new task. The advantage of this modularization is that it permits comparisons and reuse across tasks in a common framework. The framework we devise is based on the structured perceptron for learning. The perceptron is an online learning algorithm which considers one training instance at a time, makes a prediction, and carries out an update if the prediction was wrong. We couple the structured perceptron with beam search, which is a general-purpose search algorithm. Beam search is, however, only approximate, meaning that there is no guarantee that it will find the optimal structure in a large search space. Therefore special attention is required to handle search errors during training. This has led to the development of special update methods such as early and max-violation updates.

    The contributions of this dissertation sit at the intersection of machine learning and natural language processing. With regard to language processing, we consider three tasks: coreference resolution, dependency parsing, and joint sentence segmentation and dependency parsing. For coreference resolution, we start from an existing latent tree model and extend it to accommodate non-local features drawn from a greater structural context. This requires us to sacrifice exact for approximate search, but we show that, assuming sufficiently advanced update methods are used for the structured perceptron, the richer scope of features yields a stronger coreference model. We take a transition-based approach to dependency parsing, where dependency trees are constructed incrementally by a transition system. Latent structures for transition-based parsing have previously not received enough attention, partly because the characterization of the prediction space is non-trivial. We provide a thorough analysis of this space with regard to the ArcStandard with Swap transition system. This characterization enables us to evaluate the role of latent structures in transition-based dependency parsing. Empirically we find that the utility of latent structures depends on the choice of approximate search: for greedy search they improve performance, whereas with beam search they are on par with, or sometimes slightly ahead of, previous approaches. We then extend this transition system to do joint sentence segmentation and dependency parsing. We develop a transition system capable of handling this task and evaluate it on noisy, non-edited texts. With a set of carefully selected baselines and data sets we employ this system to measure the effectiveness of syntactic information for sentence segmentation. We show that, in the absence of obvious orthographic clues such as punctuation and capitalization, syntactic information can be used to improve sentence segmentation.

    With regard to machine learning, our contributions of course include the framework itself. The task-specific evaluations, however, allow us to probe the learning machinery along certain boundary points and draw more general conclusions. A recurring observation is that some of the standard update methods for the structured perceptron with approximate search, e.g. early and max-violation updates, are inadequate when the predicted structure reaches a certain size. We show that the primary problem with these updates is that they may discard training data, and that this effect increases as the structure size increases. This problem can be handled by using more advanced update methods that commit to using all the available training data. Here, we propose a new update method, DLaSO, which consistently outperforms all other update methods we compare to. Moreover, while this problem could potentially be handled by an increased beam size, we also show that this cannot fully compensate for the structure size and that the more advanced methods indeed are required.
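    To make the training setup concrete, here is a schematic sketch, in our own notation, of a structured perceptron trained with beam search and early updates. The four callbacks are placeholders mirroring the framework's modular design; this is not the dissertation's code, and DLaSO itself is not shown:

```python
from collections import defaultdict

def train_early_update(instances, epochs, beam_size,
                       expand, features, is_gold, is_final):
    """Structured perceptron with beam search and early updates (schematic).

    expand(state) yields successor states, features(state) maps a partial
    structure to a feature-count dict, is_gold(state) says whether the
    partial structure is still compatible with the gold structure, and
    is_final(state) marks complete structures. We assume a gold-compatible
    state always has at least one gold-compatible successor.
    """
    w = defaultdict(float)

    def score(state):
        return sum(w[f] * c for f, c in features(state).items())

    def update(gold_state, pred_state):
        for f, c in features(gold_state).items():
            w[f] += c
        for f, c in features(pred_state).items():
            w[f] -= c

    for _ in range(epochs):
        for start in instances:
            beam = [start]
            while not all(is_final(s) for s in beam):
                candidates = [s2 for s in beam for s2 in expand(s)]
                beam = sorted(candidates, key=score, reverse=True)[:beam_size]
                if not any(is_gold(s) for s in beam):
                    # Early update: the gold path fell out of the beam.
                    # Update against the best partial prediction and skip
                    # the rest of the instance -- exactly the discarding of
                    # training data criticised above for large structures.
                    best_gold = max((s for s in candidates if is_gold(s)),
                                    key=score)
                    update(best_gold, beam[0])
                    break
            else:
                # Search finished with the gold path still in the beam;
                # update only if the top-scoring structure is not gold.
                if not is_gold(beam[0]):
                    best_gold = next(s for s in beam if is_gold(s))
                    update(best_gold, beam[0])
    return w
```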

    Analyzing, enhancing, optimizing and applying dependency analysis

    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial; defended on 19/12/2012. Statistical dependency parsing accuracy has improved substantially during the last years. One of the main reasons is the inclusion of data-driven (or machine learning) based methods. Machine learning allows the development of parsers for every language that has an adequate training corpus without requiring a great effort. MaltParser is one such system. In the present thesis we have used state-of-the-art systems (mainly MaltParser) to show some contributions in four different areas inherently related to natural language processing (NLP) and dependency parsing: (i) We studied the parsing problem, demonstrating the homogeneity of the performance and showing interesting contributions about sentence length, corpora size and how we normally evaluate the parsers. (ii) We have also tried some ways of improving the parsing accuracy by modifying the flow of analysis, parsing some segments of the sentences separately and finally constructing a parsing combination problem. We also studied the modification of the internal behavior of the parsers, focusing on the root of dependency structures, which is an important part of what a dependency parser parses and worth studying. (iii) We have researched automatic feature selection and parsing optimization for transition-based parsers, which we consider an important problem and something that definitely needs to be done in dependency parsing in order to solve parsing problems in a more successful way. And (iv) we have applied syntactic dependency structures and dependency parsing to solve some NLP problems such as text simplification and inferring the scope of negation cues. Furthermore, the knowledge acquired when developing this thesis could be used to implement more robust dependency parsing-based applications in different NLP (or related) areas, as we demonstrate in the present thesis.
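    For contribution (iii), a generic greedy forward feature-selection loop conveys the core idea; the thesis's actual optimization procedure is more elaborate, and train_and_score is a placeholder for training a parser (e.g. a MaltParser model) and measuring its accuracy on development data:

```python
def greedy_forward_selection(candidates, train_and_score):
    """Greedy forward feature selection for a data-driven parser (sketch).

    train_and_score(feature_set) is a placeholder: it should train a parser
    with exactly those features and return its accuracy (e.g. LAS) on a
    held-out development set.
    """
    selected = []
    best = train_and_score(selected)       # baseline with no extra features
    remaining = list(candidates)
    while remaining:
        # Try adding each remaining feature; the index breaks score ties
        # without comparing the feature objects themselves.
        scored = [(train_and_score(selected + [f]), i, f)
                  for i, f in enumerate(remaining)]
        score, _, feature = max(scored)
        if score <= best:
            break                          # no feature improves the model
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best
```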

    Task-based parser output combination : workflow and infrastructure

    This dissertation introduces the method of task-based parser output combination as a device to enhance the reliability of automatically generated syntactic information for further processing tasks. Parsers, i.e. tools generating syntactic analyses, are usually based on reference data. Typically these are modern news texts. However, the data relevant for applications or tasks beyond parsing often differs from this standard domain, or only specific phenomena from the syntactic analysis are actually relevant for further processing. In these cases, the reliability of the parsing output might deviate substantially from the expected outcome on standard news text. Studies for several levels of analysis in natural language processing have shown that combining systems from the same analysis level outperforms the best involved single system. This is due to the different error distributions of the involved systems, which can be exploited, e.g. in a majority voting approach. In other words: for an effective combination, the involved systems have to be sufficiently different. In these combination studies, usually the complete analyses are combined and evaluated. However, to be able to combine the analyses completely, a full mapping of their structures and tagsets has to be found. The need for a full mapping either restricts the degree to which the participating systems are allowed to differ or it results in information loss. Moreover, the evaluation of the combined complete analyses does not reflect the reliability achieved in the analysis of the specific aspects needed to resolve a given task. This work presents an abstract workflow which can be instantiated based on the respective task and the available parsers. The approach focusses on the task-relevant aspects and aims at increasing the reliability of their analysis. Moreover, this focus allows a combination of more diverging systems, since no full mapping of the structures and tagsets from the single systems is needed. The usability of this method is also increased by focussing on the output of the parsers: it is not necessary for the users to reengineer the tools. Instead, off-the-shelf parsers and parsers for which no configuration options or sources are available to the users can be included. Based on this, the method is applicable to a broad range of applications. For instance, it can be applied to tasks from the growing field of Digital Humanities, where the focus is often on tasks different from syntactic analysis.
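    The core combination idea can be sketched as follows: reduce each parser's output to the task-relevant items only, then let the systems vote, so no full mapping between structures and tagsets is needed. The encoding of items as hashable tuples and the quorum rule are our illustrative assumptions, not the dissertation's exact workflow:

```python
from collections import Counter

def combine_task_relevant(outputs, extract, quorum=None):
    """Task-based parser output combination by majority voting (sketch).

    outputs are the raw analyses of the individual parsers; extract reduces
    each analysis to the task-relevant items only, e.g. hashable tuples
    like (dependent_index, head_index) for one phenomenon of interest.
    """
    votes = Counter()
    for analysis in outputs:
        votes.update(set(extract(analysis)))       # one vote per parser
    quorum = quorum if quorum is not None else len(outputs) // 2 + 1
    return {item for item, n in votes.items() if n >= quorum}

# Three parsers, reduced to (dependent, head) pairs for the relevant tokens;
# only pairs proposed by a majority of the parsers survive.
outs = [[(1, 2), (3, 2)], [(1, 2), (3, 0)], [(1, 2), (3, 2)]]
print(combine_task_relevant(outs, extract=lambda a: a))  # {(1, 2), (3, 2)}
```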

    Preliminary proceedings of the 2001 ACM SIGPLAN Haskell workshop

    This volume contains the preliminary proceedings of the 2001 ACM SIGPLAN Haskell Workshop, which was held on 2nd September 2001 in Firenze, Italy. The final proceedings will be published by Elsevier Science as an issue of Electronic Notes in Theoretical Computer Science (Volume 59). The Haskell Workshop was sponsored by ACM SIGPLAN and formed part of the PLI 2001 colloquium on Principles, Logics, and Implementations of high-level programming languages, which comprised the ICFP/PPDP conferences and associated workshops. Previous Haskell Workshops have been held in La Jolla (1995), Amsterdam (1997), Paris (1999), and Montréal (2000). The purpose of the Haskell Workshop was to discuss experience with Haskell and possible future developments for the language. The scope of the workshop included all aspects of the design, semantics, theory, application, implementation, and teaching of Haskell. Submissions that discussed limitations of Haskell at present and/or proposed new ideas for future versions of Haskell were particularly encouraged. Adopting an idea from ICFP 2000, the workshop also solicited two special classes of submissions, application letters and functional pearls, described below.

    The mat sat on the cat : investigating structure in the evaluation of order in machine translation

    We present a multifaceted investigation into the relevance of word order in machine translation. We introduce two tools, DTED and DERP, each using dependency structure to detect differences between the structures of machine-produced translations and human-produced references. DTED applies the principle of Tree Edit Distance to calculate the edit operations required to convert one structure into another. Four variants of DTED have been produced, differing in the importance they place on words which match between the two sentences. DERP represents a more detailed procedure, making use of the dependency relations between words when evaluating the disparities between paths connecting matching nodes. In order to empirically evaluate DTED and DERP, and as a standalone contribution, we have produced WOJ-DB, a database of human judgments. Containing scores relating to translation adequacy and more specifically to word order quality, it is intended to support investigations into a wide range of translation phenomena. We report an internal evaluation of the information in WOJ-DB, then use it to evaluate variants of DTED and DERP, both to determine their relative merits and to gauge their strength against third-party baselines. We present our conclusions about the importance of structure to the tools and their relevance to word order specifically, then propose further related avenues of research suggested or enabled by our work.
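    DTED's cost model and word-matching variants are specific to the paper, but the underlying principle, tree edit distance over dependency structures, can be illustrated with a naive memoised ordered-forest edit distance with unit costs (our simplification, not DTED itself):

```python
from functools import lru_cache

# A tree is (label, children) with children a tuple of trees; a forest is
# a tuple of trees. A dependency parse maps onto this by rooting the tree
# at the token whose head is 0 and ordering children by surface position.

def forest_size(forest):
    return sum(1 + forest_size(children) for _, children in forest)

@lru_cache(maxsize=None)
def ted(f, g):
    """Unit-cost ordered forest edit distance (insert/delete/relabel)."""
    if not f and not g:
        return 0
    if not f:
        return forest_size(g)            # insert everything in g
    if not g:
        return forest_size(f)            # delete everything in f
    (lv, cv), (lw, cw) = f[-1], g[-1]
    return min(
        ted(f[:-1] + cv, g) + 1,         # delete rightmost root of f
        ted(f, g[:-1] + cw) + 1,         # insert rightmost root of g
        ted(f[:-1], g[:-1]) + ted(cv, cw)    # match the two roots,
            + (0 if lv == lw else 1),        # relabelling if needed
    )

# Toy dependency-ish trees for the title sentences: same shape, two leaves
# swapped, hence two relabel operations.
t1 = ("sat", (("mat", (("the", ()),)), ("on", (("cat", (("the", ()),)),))))
t2 = ("sat", (("cat", (("the", ()),)), ("on", (("mat", (("the", ()),)),))))
print(ted((t1,), (t2,)))  # 2
```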

    Foundations of Multi-Paradigm Modelling for Cyber-Physical Systems

    This open access book coherently gathers well-founded information on the fundamentals of and formalisms for modelling cyber-physical systems (CPS). Highlighting the cross-disciplinary nature of CPS modelling, it also serves as a bridge for anyone entering CPS from related areas of computer science or engineering. Truly complex, engineered systems that integrate physical, software, and network aspects, known as cyber-physical systems, are now on the rise. However, there is no unifying theory, nor are there systematic design methods, techniques or tools, for these systems. Individual (mechanical, electrical, network or software) engineering disciplines only offer partial solutions. A technique known as Multi-Paradigm Modelling has recently emerged; it proposes modelling every part and aspect of a system explicitly, at the most appropriate level(s) of abstraction, using the most appropriate modelling formalism(s), and then weaving the results together to form a representation of the system. If properly applied, it enables, among other global aspects, performance analysis, exhaustive simulation, and verification. This book is the first systematic attempt to bring together these formalisms for anyone starting in the field of CPS who seeks solid modelling foundations and a comprehensive introduction to the distinct existing techniques that are multi-paradigmatic. Though chiefly intended for master and post-graduate level students in computer science and engineering, it can also be used as a reference text for practitioners.