55 research outputs found

    Grammatical Triples Extraction for the Distant Reading of Textual Corpora

    Grammatical triples extraction has become increasingly important for the analysis of large textual corpora. By providing insight into the sentence-level linguistic features of a corpus, extracted triples have supported interpretations of some of the most relevant problems of our time. The growing importance of triples extraction for analyzing large corpora has, however, put the quality of extracted triples under new scrutiny. Triples outputs are known to contain large numbers of erroneous triples. The extraction of erroneous triples poses a risk for understanding a textual corpus because erroneous triples can be nonfactual and even analogous to misinformation. Disciplines such as the social sciences, history, and literature rely on accurate representations of events. In some cases, misrepresentations of language can be as problematic as describing a historical event that never occurred. The present research proposes a method of triples extraction designed to meet the increasing need for high-accuracy triples outputs for the analysis of text. We propose a solution aimed at reducing errors related to: a) ungrammatical extractions; b) double counting; and c) the missed detection of triples. To improve the accuracy of triples extraction, we implement a series of 12 linguistic rules that leverage syntactic dependency parsing. For its case studies, this dissertation draws upon three data sets: a) Wikipedia; b) the 19th-century British Parliamentary debates, also known as Hansard; and c) half a year of online news articles (Aug. 2021 - Dec. 2021) from FOX News and NPR. In its final chapter, this dissertation offers a pedagogical piece that applies triples extraction to teach concepts related to data analysis.
    Extracted triples are thus evaluated through two means: a) in Chapter 1, precision and recall are used to vet the accuracy of the present method and b) in Chapters 2 and 3, we use human observation to show how the present method of triples extraction can give an accurate and insightful perspective into textual corpora that rivals and, in some cases, exceeds existing methods.

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Preprint arXiv: 2208.12253 Submitted on 25 Aug 2022

    Sampling from a quantum distribution can be exponentially hard for classical computers and yet could be performed efficiently by a noisy intermediate-scale quantum device. A prime example of a distribution that is hard to sample is given by the output states of a linear interferometer traversed by $N$ identical boson particles. Here, we propose a scheme to implement such a boson sampling machine with ultracold atoms in a polarization-synthesized optical lattice. We experimentally demonstrate the basic building block of such a machine by revealing the Hong-Ou-Mandel interference of two bosonic atoms in a four-mode interferometer. To estimate the sampling rate for large $N$, we develop a theoretical model based on a master equation. Our results show that a quantum advantage compared to today's best supercomputers can be reached with $N \gtrsim 40$.

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access book constitutes the proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2022, which was held during April 2-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 46 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 159 submissions. The proceedings also contain 16 tool papers of the affiliated competition SV-Comp and 1 paper consisting of the competition report. TACAS is a forum for researchers, developers, and users interested in rigorously based tools and algorithms for the construction and analysis of systems. The conference aims to bridge the gaps between different communities with this common interest and to support them in their quest to improve the utility, reliability, flexibility, and efficiency of tools and algorithms for building computer-controlled systems.

    Generating networks of genetic processors

    [EN] Networks of Genetic Processors (NGPs) are non-conventional models of computation based on genetic operations over strings, namely the mutation and crossover operations established in genetic algorithms. They were initially proposed as acceptor machines, that is, as decision problem solvers. In that setting, they have been shown to be universal computing models equivalent to Turing machines. In this work, we propose NGPs as enumeration devices and analyze their computational power. First, we define the model and propose its definition as parallel genetic algorithms. Once the correspondence between the two formalisms has been established, we study the generation capacity of NGPs within the framework of formal language theory. We investigate the relationship between the number of processors of the model and its generative power. Our results show that the number of processors is important to increase the generative capability of the model up to an upper bound, and that NGPs are universal models of computation if they are formulated as generation devices. This allows us to affirm that parallel genetic algorithms working under certain restrictions can be considered equivalent to Turing machines and, therefore, that they are universal models of computation. This research was partially supported by TAILOR, a project funded by the EU Horizon 2020 research and innovation programme under GA No 952215. Campos Frances, M.; Sempere Luna, JM. (2022). Generating networks of genetic processors. Genetic Programming and Evolvable Machines. 23(1):133-155. https://doi.org/10.1007/s10710-021-09423-7

    In Memoriam, Solomon Marcus

    This book commemorates the fifth anniversary of Solomon Marcus’s death with a selection of articles in mathematics, theoretical computer science, and physics written by authors who work in Marcus’s research fields, some of whom have been influenced by his results and/or have collaborated with him.

    Graph based representation of the music symbolic level. A music information retrieval application

    In this work, a new representation system for the symbolic level of music is described. It has been tested in two information retrieval tasks: measuring similarity between segments of music and detecting the genre of a given segment. It can include both harmonic and contrapuntal information. Moreover, a new large dataset consisting of more than 5000 leadsheets is presented, with meta-information taken from different web databases, including author information, year of first performance, lyrics, genre, etc.

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 31st European Symposium on Programming, ESOP 2022, which was held during April 5-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 21 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.

    Yavaa: supporting data workflows from discovery to visualization

    Recent years have witnessed an increasing number of data silos being opened up both within organizations and to the general public: Scientists publish their raw data as supplements to articles or even as standalone artifacts to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information holds many challenges for users, though. Oftentimes data is provided as tables whose seemingly endless rows of daunting numbers are barely accessible. InfoVis can mitigate this gap. However, the visualization options offered are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling: only very few options exist to adjust the data to the current needs, and barely any protection is in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation gets even bleaker. Only recently have tools emerged to search reasonably for datasets across institutional borders. Easy-to-use ways to combine these datasets are still missing, though. Finally, results generally lack proper documentation of their provenance, so even the most compelling visualizations can be called into question when their origin remains unclear. The foundations for a vivid exchange and exploitation of open data are set, but the barrier of entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance, reducing the amount of prior experience and skills required. It covers the whole workflow, from identifying proper datasets, through possible transformations, to the export of the result in the form of suitable visualizations.

    Automated Deduction – CADE 28

    This open access book constitutes the proceedings of the 28th International Conference on Automated Deduction, CADE 28, held virtually in July 2021. The 29 full papers and 7 system descriptions presented together with 2 invited papers were carefully reviewed and selected from 76 submissions. CADE is the major forum for the presentation of research in all aspects of automated deduction, including foundations, applications, implementations, and practical experience. The papers are organized in the following topics: Logical foundations; theory and principles; implementation and application; ATP and AI; and system descriptions.