71 research outputs found

    Explanation and Downscalability of Google's Dependency Parser Parsey McParseface

    Get PDF
    Using the data collected during the hyperparameter tuning for Google's dependency parser Parsey McParseface, feedforward neural networks and the correlations among their hyperparameters during training are explained and analysed in depth. Contents:
    1 Introduction to Neural Networks
      1.1 History of AI
      1.2 The Role of Neural Networks in AI Research (Artificial Intelligence; Machine Learning; Neural Network)
      1.3 Structure of Neural Networks (Biology Analogy of Artificial Neural Networks; Architecture of Artificial Neural Networks; Biological Model of Nodes – Neurons; Structure of Artificial Neurons)
      1.4 Training a Neural Network (Data; Hyperparameters; Training Process; Overfitting)
    2 Natural Language Processing (NLP)
      2.1 Data Preparation (Text Preprocessing; Part-of-Speech Tagging)
      2.2 Dependency Parsing (Dependency Grammar; Rule-Based & Data-Driven Approach; Syntactic Parser)
      2.3 Parsey McParseface (SyntaxNet; Corpus; Architecture; Improvements to the Feed-Forward Neural Network)
    3 Training of Parsey's Cousins
      3.1 Training a Model (Building the Framework; Corpus; Training Process; Settings for the Training)
      3.2 Results and Analysis (Results from Google's Models; Effect of Hyperparameters)
    4 Conclusion
    5 Bibliography
    6 Appendix

    On Multilingual Training of Neural Dependency Parsers

    Full text link
    We show that a recently proposed neural dependency parser can be improved by joint training on multiple languages from the same family. The parser is implemented as a deep neural network whose only input is orthographic representations of words. In order to parse successfully, the network has to discover how linguistically relevant concepts can be inferred from word spellings. We analyze the representations of characters and words that are learned by the network to establish which properties of languages were accounted for. In particular, we show that the parser has approximately learned to associate Latin characters with their Cyrillic counterparts and that it can group Polish and Russian words that have a similar grammatical function. Finally, we evaluate the parser on selected languages from the Universal Dependencies dataset and show that it is competitive with other recently proposed state-of-the-art methods, while having a simple structure. Comment: preprint accepted into the TSD201
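    The idea of inferring linguistic similarity from spelling alone can be illustrated with a much simpler orthography-only representation. The sketch below (hypothetical; the paper's actual model learns character embeddings with a neural network, not n-gram overlap) encodes each word purely from its characters as a bag of character trigrams, so words with related spellings, such as Polish inflected forms, score as similar:

    ```python
    from collections import Counter

    def char_ngrams(word, n=3):
        """Character n-grams of a word, padded with boundary markers."""
        padded = f"^{word}$"
        return [padded[i:i + n] for i in range(len(padded) - n + 1)]

    def similarity(w1, w2):
        """Jaccard overlap of the two words' character-trigram multisets."""
        a, b = Counter(char_ngrams(w1)), Counter(char_ngrams(w2))
        shared = sum((a & b).values())   # multiset intersection
        total = sum((a | b).values())    # multiset union
        return shared / total if total else 0.0

    # Polish "kot" / "kota" (cat, nominative/genitive) share most trigrams;
    # "kot" / "pies" (cat / dog) share none.
    print(similarity("kot", "kota"))   # high
    print(similarity("kot", "pies"))   # zero
    ```

    Unlike this fixed overlap measure, a learned character model can also place Latin and Cyrillic spellings of corresponding words near each other, which pure n-gram matching cannot do.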

    Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing

    Get PDF
    We present a novel technique to remove spurious ambiguity from transition systems for dependency parsing. Our technique chooses a canonical sequence of transition operations (computation) for a given dependency tree. Our technique can be applied to a large class of bottom-up transition systems, including for instance Nivre (2004) and Attardi (2006).
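    To see what "canonical sequence" means here, consider the standard arc-standard oracle: among the several transition sequences that derive the same tree, always attach a dependent as early as possible. The sketch below (an illustrative oracle under that convention, not the paper's construction) maps every projective tree to exactly one sequence, eliminating spurious ambiguity:

    ```python
    def canonical_sequence(heads):
        """heads[i] is the head of token i (tokens 1..n); heads[0] is a
        placeholder for the artificial root.  Assumes a projective tree."""
        n = len(heads) - 1
        # number of still-unattached dependents of each token
        pending = [sum(1 for j in range(1, n + 1) if heads[j] == i)
                   for i in range(n + 1)]
        stack, buffer, seq = [0], list(range(1, n + 1)), []
        while buffer or len(stack) > 1:
            if len(stack) >= 2:
                s0, s1 = stack[-1], stack[-2]
                if s1 != 0 and heads[s1] == s0:            # left dependent: attach now
                    seq.append(f"LEFT-ARC({s1}<-{s0})")
                    stack.pop(-2)
                    pending[s0] -= 1
                    continue
                if heads[s0] == s1 and pending[s0] == 0:   # right dependent: attach
                    seq.append(f"RIGHT-ARC({s1}->{s0})")   # once s0 is complete
                    stack.pop()
                    pending[s1] -= 1
                    continue
            seq.append(f"SHIFT({buffer[0]})")
            stack.append(buffer.pop(0))
        return seq

    # "He ate fish": He<-ate, ate<-root, fish<-ate
    print(canonical_sequence([-1, 2, 0, 2]))
    ```

    Because the tie between "attach now" and "shift first" is always resolved the same way, two distinct action sequences can never produce the same tree under this oracle.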

    Deterministic choices in a data-driven parser.

    Get PDF
    Data-driven parsers rely on recommendations from parse models, which are generated from a set of training data using a machine learning classifier, to perform parse operations. However, a parse model cannot recommend a parse action unless it has learned from the training data what action(s) to take in every possible situation. It is therefore hard for a parser to make an informed decision when the parse model recommends either no parse action or several. Here we examine the effect of various deterministic choices on a data-driven parser when it is presented with no or several recommendations from a parse model.
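    One family of such deterministic choices can be sketched as a fixed fallback policy (a hypothetical illustration, not the paper's specific strategies): when the classifier offers no legal action, fall back to a fixed default; when several actions tie, break the tie by a fixed priority order.

    ```python
    def choose_action(scores, legal, default="SHIFT"):
        """scores: classifier scores per action (may be empty or contain ties).
        legal: actions currently permitted by the transition system."""
        candidates = [a for a in legal if a in scores]
        if not candidates:                      # no recommendation: fixed default
            return default if default in legal else legal[0]
        best = max(scores[a] for a in candidates)
        tied = [a for a in candidates if scores[a] == best]
        # several recommendations: break the tie by a fixed priority order
        priority = ["LEFT-ARC", "RIGHT-ARC", "SHIFT", "REDUCE"]
        return min(tied, key=lambda a: priority.index(a))

    print(choose_action({}, ["SHIFT", "REDUCE"]))                      # no scores
    print(choose_action({"SHIFT": 1.0, "RIGHT-ARC": 1.0},
                        ["SHIFT", "RIGHT-ARC"]))                       # tie
    ```

    Because both fallbacks are fixed rather than sampled, the parser's output is reproducible, which makes the effect of each choice measurable in isolation.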

    OCRonym: Entity Extraction and Retrieval for Scanned Books

    Get PDF
    In the past five years, massive book-scanning projects have produced an explosion in the number of sources for the humanities, available on-line to the broadest possible audiences. Transcribing page images by optical character recognition makes many searching and browsing tasks practical for scholars. But even low OCR error rates compound into a high probability of error in a given sentence, and the error rate is even higher for names. We propose to build a prototype system for information extraction and retrieval over noisy OCR. In particular, we will optimize the extraction and retrieval of names, which are highly informative features for detecting topics and events in documents. We will build statistical models of characters and words from scanned books to improve lexical coverage, and we will improve name categorization and disambiguation by linking document contexts to external sources such as Wikipedia. Our testbed comes from over one million scanned books from the Internet Archive.

    Simple voting algorithms for Italian parsing

    Get PDF