
    Discontinuous grammar as a foreign language

    In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show competitive performance, these text-to-parse transducers still lag behind classic techniques in terms of accuracy, coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture to improve their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.
    Funding: Xunta de Galicia (ED431G 2019/01; ED431C 2020/11). We acknowledge the European Research Council (ERC), which has funded this research under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and the Horizon Europe research and innovation programme (SALSA, grant agreement No 101100615), ERDF/MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de Investigación de Galicia “CITIC”, funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014–2020 Program), by grant ED431G 2019/01. Funding for open access charge: Universidade da Coruña/CISUG.
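
    As a rough illustration of the idea (not the paper's actual linearizations), the sketch below turns a small constituency tree into a flat symbol sequence that a sequence-to-sequence model could be trained to predict; discontinuity is handled by attaching explicit word indices to the leaves, so a constituent may dominate non-adjacent positions. The tree, labels, and index scheme are all hypothetical.

```python
# Minimal sketch: bracket-style linearization of a (possibly discontinuous)
# constituency tree for use as a seq2seq target. Illustrative only.

def linearize(tree):
    """tree is either a leaf (word, index) or a node (label, [children])."""
    if isinstance(tree[1], int):          # leaf: (word, index)
        word, idx = tree
        return [f"{word}~{idx}"]
    label, children = tree
    out = [f"({label}"]
    for child in children:
        out.extend(linearize(child))
    out.append(f"){label}")
    return out

# Toy discontinuous example: the VP dominates words 1 and 3, skipping word 2.
tree = ("S", [
    ("NP", [("Hans", 0)]),
    ("VP", [("hat", 1), ("gelacht", 3)]),
    ("ADV", [("gestern", 2)]),
])
print(" ".join(linearize(tree)))
# (S (NP Hans~0 )NP (VP hat~1 gelacht~3 )VP (ADV gestern~2 )ADV )S
```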

    Modular lifelong machine learning

    Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem, and the overall training cost increases further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge-transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
    Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
    First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm: notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
    Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
    Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
    Overall, this thesis identifies a number of important LML properties which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
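
    A toy sketch of the modular-reuse idea follows (it is not the HOUDINI or PICLE implementation): a library accumulates frozen modules from earlier problems, and a new problem is solved by searching over compositions of those modules and keeping the best-scoring program. The module names, data, and scoring function are invented for illustration.

```python
# Illustrative only: exhaustive search over compositions of frozen "modules",
# standing in for the program synthesis / probabilistic search used by
# HOUDINI and PICLE.
from itertools import permutations

# Hypothetical frozen modules learned on earlier problems (toy functions here).
library = {
    "scale":  lambda x: 2 * x,
    "shift":  lambda x: x + 1,
    "square": lambda x: x * x,
}

def compose(fs):
    def run(x):
        for f in fs:
            x = f(x)
        return x
    return run

def score(program, data):
    """Negative squared error of the composed program on (input, target) pairs."""
    return -sum((program(x) - y) ** 2 for x, y in data)

# New problem: y = (2x + 1) ** 2, solvable purely by reusing library modules.
data = [(x, (2 * x + 1) ** 2) for x in range(-3, 4)]

best = max(
    (compose([library[n] for n in names])
     for k in range(1, 4) for names in permutations(library, k)),
    key=lambda prog: score(prog, data),
)
print(score(best, data))  # 0  (perfect reuse: scale -> shift -> square)
```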

    Towards A Practical High-Assurance Systems Programming Language

    Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity, requiring considerable expertise in both systems programming and formal verification. Without appropriate tools that provide abstraction and automation, development can be extremely costly due to the sheer complexity of the systems and the nuances within them. Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code. To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof via a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which gives users a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
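
    The sketch below is only meant to convey the flavour of the property-based testing idea described above; Cogent's framework targets the generated Haskell/C artifacts, not Python. Here a pure functional "spec" is checked against a lower-level, stateful "implementation" on randomly generated inputs using the hypothesis library, so refinement bugs can surface before (or instead of) a full formal proof. The functions are hypothetical.

```python
# Illustrative property-based refinement check (not Cogent's framework).
from hypothesis import given, strategies as st

def spec_sum(xs):
    """Purely functional specification."""
    return sum(xs)

def impl_sum(xs):
    """Stand-in for a low-level implementation (e.g. a compiled C loop)."""
    acc = 0
    for v in xs:
        acc += v
    return acc

@given(st.lists(st.integers()))
def check_refinement(xs):
    # The implementation must agree with the spec on every generated input.
    assert impl_sum(xs) == spec_sum(xs)

if __name__ == "__main__":
    check_refinement()  # hypothesis generates and shrinks test cases
    print("refinement property held on all generated inputs")
```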

    On regular copying languages

    This paper proposes a formal model of regular languages enriched with unbounded copying. We augment finite-state machinery with the ability to recognize copied strings by adding an unbounded memory buffer with a restricted form of first-in-first-out storage. The newly introduced computational devices, finite-state buffered machines (FS-BMs), characterize the class of regular languages and languages derived from them through a primitive copying operation. We name this language class regular copying languages (RCLs). We prove a pumping lemma and examine the closure properties of this language class. As suggested by previous literature (Gazdar and Pullum 1985, p. 278), regular copying languages should approach the correct characterization of natural language word sets.
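
    To make the buffering idea concrete, here is a deliberately simplified sketch (not the formal FS-BM construction): the first half of the input is pushed into a FIFO buffer and the second half is matched against it front-to-front. A real FS-BM does not know the midpoint in advance and handles the split nondeterministically; fixing it at len(s)//2 here is a simplification for illustration.

```python
# Illustrative recognizer for the copy language { w w : w over some alphabet }
# using first-in-first-out storage, mimicking the restricted buffer that
# FS-BMs add to finite-state machinery.
from collections import deque

def is_copy(s):
    if len(s) % 2:                 # odd length can never be w.w
        return False
    half = len(s) // 2
    buffer = deque()
    for ch in s[:half]:            # buffering phase: enqueue the first half
        buffer.append(ch)
    for ch in s[half:]:            # matching phase: dequeue and compare
        if not buffer or buffer.popleft() != ch:
            return False
    return not buffer

print(is_copy("abab"))   # True  (w = "ab")
print(is_copy("abba"))   # False (a palindrome, not a copy)
```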

    Old and New Minimalism: a Hopf algebra comparison

    In this paper we compare some older formulations of Minimalism, in particular Stabler's computational minimalism, with Chomsky's new formulation of Merge and Minimalism, from the point of view of their mathematical description in terms of Hopf algebras. We show that the newer formulation has a clear advantage purely in terms of the underlying mathematical structure. More precisely, in the case of Stabler's computational minimalism, External Merge can be described in terms of a partially defined operated algebra with a binary operation, while Internal Merge determines a system of right-ideal coideals of the Loday-Ronco Hopf algebra and corresponding right-module coalgebra quotients. This mathematical structure shows that Internal and External Merge play significantly different roles in the old formulations of Minimalism, and they are more difficult to reconcile as facets of a single algebraic operation, as would be desirable linguistically. On the other hand, we show that the newer formulation of Minimalism naturally carries a Hopf algebra structure in which Internal and External Merge directly arise from the same operation. We also compare, at the level of algebraic properties, the externalization model of the new Minimalism with proposals for assignments of planar embeddings based on heads of trees.
    Comment: 27 pages, LaTeX, 3 figures
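
    As a schematic reminder of the linguistic operation at stake (standard Minimalist notation, not the paper's Hopf-algebraic formalism): both cases of Merge build the same unordered set and differ only in where the second argument comes from.

```latex
% Schematic only; the paper's contribution is the algebraic structure, not this definition.
\[
  \mathrm{Merge}(\alpha,\beta) \;=\; \{\alpha,\beta\}
\]
\[
  \text{External Merge: } \beta \text{ is a separate object in the workspace;}
  \qquad
  \text{Internal Merge: } \beta \text{ is a subterm of } \alpha .
\]
```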

    Artificial Intelligence, Robots, and Philosophy

    This book is a collection of all the papers published in the special issue “Artificial Intelligence, Robots, and Philosophy,” Journal of Philosophy of Life, Vol. 13, No. 1, 2023, pp. 1-146. The authors discuss a variety of topics such as science fiction and space ethics, the philosophy of artificial intelligence, the ethics of autonomous agents, and virtuous robots. Through their discussions, readers are able to think deeply about the essence of modern technology and the future of humanity. All papers were invited and completed in spring 2020, though because of the Covid-19 pandemic and other problems, publication was delayed until this year. I apologize to the authors and potential readers for the delay. I hope that readers will enjoy these arguments on digital technology and its relationship with philosophy.
    Contents:
    - Introduction: Descartes and Artificial Intelligence; Masahiro Morioka
    - Isaac Asimov and the Current State of Space Science Fiction: In the Light of Space Ethics; Shin-ichiro Inaba
    - Artificial Intelligence and Contemporary Philosophy: Heidegger, Jonas, and Slime Mold; Masahiro Morioka
    - Implications of Automating Science: The Possibility of Artificial Creativity and the Future of Science; Makoto Kureha
    - Why Autonomous Agents Should Not Be Built for War; István Zoltán Zárdai
    - Wheat and Pepper: Interactions Between Technology and Humans; Minao Kukita
    - Clockwork Courage: A Defense of Virtuous Robots; Shimpei Okamoto
    - Reconstructing Agency from Choice; Yuko Murakami
    - Gushing Prose: Will Machines Ever be Able to Translate as Badly as Humans?; Rossa Ó Muireartaigh

    The automatic processing of multiword expressions in Irish

    It is well documented that Multiword Expressions (MWEs) pose a unique challenge to a variety of NLP tasks such as machine translation, parsing, information retrieval, and more. For low-resource languages such as Irish, these challenges can be exacerbated by the scarcity of data and a lack of research on this topic. In order to improve the handling of MWEs in various NLP tasks for Irish, this thesis addresses both the lack of resources specifically targeting MWEs in Irish and the question of how these resources can be applied to said NLP tasks. We report on the creation and analysis of a number of lexical resources as part of this PhD research. Ilfhocail, a lexicon of Irish MWEs, is created by extracting MWEs from other lexical resources such as dictionaries. A corpus annotated with verbal MWEs in Irish is created for the inclusion of Irish in the PARSEME Shared Task 1.2. Additionally, MWEs were tagged in a bilingual EN-GA corpus for inclusion in machine translation experiments. For the purposes of annotation, a categorisation scheme for nine categories of MWEs in Irish is created, based on combining linguistic analysis of these types of constructions with cross-lingual frameworks for defining MWEs. A case study in applying MWEs to NLP tasks is undertaken, exploring the incorporation of MWE information while training Neural Machine Translation systems. Finally, the topic of automatic identification of Irish MWEs is explored, documenting the training of a system capable of automatically identifying Irish MWEs from a variety of categories, and the challenges associated with developing such a system. This research contributes towards a greater understanding of Irish MWEs and their applications in NLP, and provides a foundation for future work in exploring other methods for the automatic discovery and identification of Irish MWEs, and in further developing the MWE resources described above.
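
    As a small illustration of MWE identification with a lexicon (this is not the thesis's identification system, and the entries below are hypothetical), the sketch performs longest-match lookup of contiguous MWEs in tokenised text; real Irish verbal MWEs are often discontinuous, which is precisely why learned identifiers are needed.

```python
# Illustrative longest-match MWE tagger over a toy lexicon in the spirit of
# Ilfhocail. Entries and labels are invented for the example.
LEXICON = {
    ("ar", "bith"): "FIXED",        # "any / at all"
    ("cur", "síos"): "VERBAL_MWE",  # "description"
}
MAX_LEN = max(len(k) for k in LEXICON)

def tag_mwes(tokens):
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):  # longest first
            cand = tuple(t.lower() for t in tokens[i:i + n])
            if cand in LEXICON:
                spans.append((i, i + n, LEXICON[cand]))
                i += n
                break
        else:
            i += 1
    return spans

print(tag_mwes("Níl ciall ar bith leis".split()))
# [(2, 4, 'FIXED')]
```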

    On the intricate relation between theory and description: A linguist’s look at The Cambridge Grammar of the English Language


    Intelligent computing: the latest advances, challenges and future

    Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting the digital revolution in the era of big data, artificial intelligence and the internet of things, with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing have long followed different paths of evolution and development, but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing.