
    Machine Learning Algorithm for the Scansion of Old Saxon Poetry

    Several scholars have designed tools to perform the automatic scansion of poetry in many languages, but none of these tools deals with Old Saxon or Old English. This project is a first attempt to create a tool for these languages. We implemented a Bidirectional Long Short-Term Memory (BiLSTM) model to perform the automatic scansion of Old Saxon and Old English poems. Since this model uses supervised learning, we manually annotated the Heliand manuscript and used the resulting corpus as a labeled dataset to train the model. In the evaluation, the algorithm reached 97% accuracy and a 99% weighted average for precision, recall, and F1 score. In addition, we tested the model on some verses from the Old Saxon Genesis and some from The Battle of Brunanburh, and observed that the model predicted almost all Old Saxon metrical patterns correctly but misclassified the majority of the Old English input verses.
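    The reported figures (accuracy plus support-weighted precision, recall, and F1) can be recomputed from gold and predicted label sequences. A minimal sketch in plain Python; the labels below are invented stand-ins for the metrical-pattern classes, not the paper's data:

```python
from collections import Counter

def weighted_prf(gold, pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    classes = sorted(set(gold))
    support = Counter(gold)
    n = len(gold)
    p_avg = r_avg = f_avg = 0.0
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = support[c] - tp
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        w = support[c] / n          # weight each class by its support
        p_avg += w * prec
        r_avg += w * rec
        f_avg += w * f1
    acc = sum(1 for g, p in zip(gold, pred) if g == p) / n
    return acc, p_avg, r_avg, f_avg

# Hypothetical half-line labels (Sievers-style types), for illustration only.
gold = ["A", "A", "B", "C", "A", "B"]
pred = ["A", "A", "B", "C", "B", "B"]
print(weighted_prf(gold, pred))
```

    Weighting by support is what lets the headline number stay high when the frequent patterns are predicted well, even if a rare class is missed.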

    Static Analysis of Graph Database Transformations

    We investigate graph transformations, defined using Datalog-like rules based on acyclic conjunctive two-way regular path queries (acyclic C2RPQs), and we study two fundamental static analysis problems: type checking and equivalence of transformations in the presence of graph schemas. Additionally, we investigate the problem of target schema elicitation, which aims to construct a schema that closely captures all outputs of a transformation over graphs conforming to the input schema. We show that all these problems are in EXPTIME by reducing them to C2RPQ containment modulo schema; we also provide matching lower bounds. We use cycle reversing to reduce query containment to the problem of unrestricted (finite or infinite) satisfiability of C2RPQs modulo a theory expressed in a description logic.
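    The building block here, a two-way regular path query, is classically evaluated by running an automaton for the path expression in lockstep with the graph (the product construction). A small illustrative sketch, not the paper's algorithm; the graph, the query automaton, and the label convention (a trailing "-" marks inverse traversal) are all invented:

```python
from collections import deque

def eval_rpq(edges, start_nodes, delta, q0, final_states):
    """Evaluate a 2RPQ given as an NFA over edge labels.
    delta: state -> list of (next_state, label); a label 'a-'
    traverses an 'a' edge backwards. Returns the graph nodes
    reachable from start_nodes in an accepting automaton state."""
    adj = {}
    for u, a, v in edges:
        adj.setdefault((u, a), set()).add(v)
        adj.setdefault((v, a + "-"), set()).add(u)  # inverse edge
    seen = {(n, q0) for n in start_nodes}
    queue = deque(seen)
    answers = set()
    while queue:          # BFS over the graph x automaton product
        n, q = queue.popleft()
        if q in final_states:
            answers.add(n)
        for q2, a in delta.get(q, []):
            for n2 in adj.get((n, a), ()):
                if (n2, q2) not in seen:
                    seen.add((n2, q2))
                    queue.append((n2, q2))
    return answers

# Toy query "a then b backwards": automaton 0 -a-> 1 -b--> 2
edges = [(1, "a", 2), (3, "b", 2), (1, "a", 4)]
delta = {0: [(1, "a")], 1: [(2, "b-")]}
print(eval_rpq(edges, {1}, delta, 0, {2}))
```

    The product stays polynomial in graph and automaton size; the hardness in the paper comes from reasoning about containment of such queries modulo a schema, not from evaluation.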

    AIUCD 2022 - Proceedings

    The eleventh edition of the National Conference of the AIUCD (Associazione di Informatica Umanistica) is titled Culture digitali. Intersezioni: filosofia, arti, media (Digital Cultures. Intersections: Philosophy, Arts, Media). The title explicitly calls for a methodological and theoretical reflection on the interrelation between digital technologies, information science, philosophical disciplines, the world of the arts, and cultural studies.

    Ontology-based transformation of natural language queries into SPARQL queries by evolutionary algorithms

    This thesis presents an ontology-driven evolutionary learning system for natural language querying of RDF graphs. The learning system does not answer the query itself; instead, it generates a SPARQL query against the database. For this purpose, the Evolutionary Dataflow Agents framework is introduced: a general learning framework that, based on evolutionary algorithms, creates agents that learn to solve a problem. The main idea of the framework is to support problems that combine a medium-sized search space (use case: analysis of natural language queries) of strictly, formally structured solutions (use case: synthesis of database queries) with rather local, classical structural and algorithmic aspects. For this, the agents combine local algorithmic functionality of nodes with a flexible dataflow between the nodes into a global problem-solving process. Roughly, some nodes generate informational fragments by combining input data and/or earlier fragments, often using heuristics-based guessing. Other nodes combine, collect, and reduce such fragments towards possible solutions, narrowing these down to the final solution. For this, informational items flow through the agents. The configuration of these agents (which nodes they combine, and where exactly the data items flow) is the subject of learning. Training starts with simple agents which, as is usual in learning frameworks, solve a set of tasks and are evaluated on them. Since the produced answers usually have complex structures, the framework employs a novel fine-grained, energy-based evaluation and selection step. The selected agents then form the basis of the next round's population. Evolution is provided, as usual, by mutations and agent fusion.
    As a use case, EvolNLQ, a system for answering natural language queries against RDF databases, has been implemented. For this, the underlying ontology metadata is (externally) algorithmically preprocessed. For the agents, appropriate data item types and node types are defined that break down the processes of language analysis and query synthesis into more or less elementary operations. The "size" of the operations is determined by the border between computations, i.e., purely algorithmic steps (implemented in individual powerful nodes) and simple heuristic steps (also realized by simple nodes), and free dataflow, which allows arbitrary chaining and branching configurations of the agents. EvolNLQ is compared with several other approaches and shows competitive results.
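    The evaluate-select-mutate loop described above is the standard skeleton of an evolutionary algorithm. A generic sketch in plain Python; the toy task (matching a hidden vector) merely stands in for "agents learning to assemble a query" and none of the names below come from the framework:

```python
import random

def evolve(score, init, mutate, pop_size=20, rounds=50, seed=0):
    """Generic evolutionary loop: each round, keep the best half of
    the population and refill it with mutated copies of survivors."""
    rng = random.Random(seed)
    pop = [init(rng) for _ in range(pop_size)]
    for _ in range(rounds):
        pop.sort(key=score, reverse=True)      # evaluate
        best = pop[: pop_size // 2]            # select
        pop = best + [mutate(rng.choice(best), rng) for _ in best]
    return max(pop, key=score)

# Toy fitness task: recover a hidden target vector coordinate-wise.
target = [3, 1, 4, 1, 5]
score = lambda a: sum(x == t for x, t in zip(a, target))
init = lambda rng: [rng.randint(0, 9) for _ in target]

def mutate(a, rng):
    b = list(a)
    b[rng.randrange(len(b))] = rng.randint(0, 9)  # point mutation
    return b

print(evolve(score, init, mutate))
```

    The framework's distinctive parts (dataflow agents as individuals, fine-grained energy-based scoring, agent fusion) would replace the vector individuals and the simple point mutation in this sketch.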

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    Synthesis of Scientific Workflows: Theory and Practice of an Instance-Aware Approach

    The last two decades have brought an explosion of computational tools and processes in many scientific domains (e.g., life, social, and geo science). Scientific workflows, i.e., computational pipelines, accompanied by workflow management systems, were soon adopted as a de facto standard among non-computer scientists for orchestrating such computational processes. The goal of this dissertation is to provide a framework that automates the orchestration of such computational pipelines in practice. We refer to such problems as scientific workflow synthesis problems. This dissertation introduces the temporal logic SLTLx, and presents a novel SLTLx-based synthesis approach that overcomes limitations in handling data object dependencies present in existing synthesis approaches. The new approach uses transducers and temporal goals, which keep track of the data objects in the synthesised workflow. The proposed SLTLx-based synthesis includes a bounded and a dynamic variant, which are shown in Chapter 3 to be NP-complete and PSPACE-complete, respectively. Chapter 4 introduces a transformation algorithm that translates the bounded SLTLx-based synthesis problem into propositional logic. The transformation is implemented as part of the APE (Automated Pipeline Explorer) framework, presented in Chapter 5. It relies on highly efficient SAT solving techniques, using an off-the-shelf SAT solver to synthesise a solution for the given propositional encoding. The framework provides an API (application programming interface), a CLI (command line interface), and a web-based GUI (graphical user interface). The development of APE was accompanied by four concrete application scenarios as case studies for automated workflow composition. The studies were conducted in collaboration with domain experts and are presented in Chapter 6. Each of the case studies is used to assess and illustrate specific features of the SLTLx-based synthesis approach.
    (1) A case study on cartographic map generation demonstrates the ability to distinguish data objects as a key feature of the framework. It illustrates the process of annotating a new domain, and presents the iterative workflow synthesis approach, where the user narrows down the desired specification of the problem in a few intuitive steps. (2) A case study on geo-analytical question answering as part of the QuAnGIS project shows the benefits of using data flow dependencies to describe a synthesis problem. (3) A proteomics case study demonstrates the usability of APE as an "off-the-shelf" synthesiser, providing direct integration with existing semantic domain annotations. In addition, a manual evaluation of the synthesised workflows shows promising results even on large real-life domains, such as the EDAM ontology and the complete bio.tools registry. (4) A geo-event question-answering study demonstrates the usability of APE within a larger question-answering system. This dissertation achieves the goals it set out to solve. It provides a formal framework, accompanied by a lightweight library, which can solve real-life scientific workflow synthesis problems. Finally, the development of the library motivated an upcoming collaborative project in the life sciences domain. The aim of that project is to develop a platform which would automatically compose (using APE) and benchmark workflows in computational proteomics.
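    At its core, workflow synthesis composes typed tools so that the outputs of earlier steps feed the inputs of later ones. APE's actual approach is far richer (SLTLx goals compiled to SAT, data-object tracking), but the basic idea can be sketched as a breadth-first search over available data types; the tool names and types below are invented for illustration:

```python
from collections import deque

def synthesise(tools, have, goal, max_len=5):
    """BFS for a shortest tool pipeline producing the goal type from
    the initially available types.
    tools: {name: (set_of_input_types, set_of_output_types)}"""
    start = frozenset(have)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        types, pipeline = queue.popleft()
        if goal in types:
            return pipeline
        if len(pipeline) == max_len:
            continue
        for name, (ins, outs) in tools.items():
            if ins <= types:                   # tool is applicable
                nxt = frozenset(types | outs)  # its outputs become available
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, pipeline + [name]))
    return None

# Invented toy domain loosely echoing the map-generation case study.
tools = {
    "parse_gpx":    ({"gpx_file"}, {"track"}),
    "rasterise":    ({"track"}, {"raster"}),
    "style_map":    ({"raster", "palette"}, {"map"}),
    "pick_palette": (set(), {"palette"}),
}
print(synthesise(tools, {"gpx_file"}, "map"))
```

    Encoding the bounded version of this search as propositional constraints, as Chapter 4 does for SLTLx goals, lets an off-the-shelf SAT solver do the exploration instead of explicit BFS.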

    28th International Symposium on Temporal Representation and Reasoning (TIME 2021)

    The 28th International Symposium on Temporal Representation and Reasoning (TIME 2021) was planned to take place in Klagenfurt, Austria, but had to move to an online conference due to the uncertainties and restrictions caused by the pandemic. Since its first edition in 1994, the TIME Symposium has been unique in the panorama of scientific conferences, as its main goal is to bring together researchers from distinct research areas involving the management and representation of temporal data as well as reasoning about temporal aspects of information. Moreover, the TIME Symposium aims to bridge theoretical and applied research, and to serve as an interdisciplinary forum for exchange among researchers from the areas of artificial intelligence, database management, logic and verification, and beyond.

    Memorias del Congreso Argentino en Ciencias de la Computación - CACIC 2021

    Papers presented at the XXVII Congreso Argentino de Ciencias de la Computación (CACIC), held in the city of Salta from 4 to 8 October 2021 and organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Universidad Nacional de Salta (UNSA).

    Modeling of query languages and applications in code refactoring and code optimization

    The query containment problem is one of the fundamental problems in computer science, originally defined for relational queries. With the growing popularity of the SPARQL query language, the problem has become relevant in this new context as well. This thesis introduces a new approach for solving the problem, based on a reduction to satisfiability in first-order logic. The approach covers containment under the RDF Schema entailment regime, and it can deal with the subsumption relation, a weaker form of containment. The thesis proves soundness and completeness of the approach for a wide range of language constructs. It also describes an implementation of the approach as SPECS, an open-source solver whose code is publicly available. An experimental evaluation on relevant benchmarks shows that SPECS is efficient and that, compared to state-of-the-art solvers, it gives more precise results in a shorter amount of time while supporting a larger fragment of SPARQL constructs. Query language modeling can also be useful when refactoring database-driven applications, where simultaneous changes to both a query and the host-language code that invokes it are very common. Such changes can preserve the overall equivalence of the code without preserving the equivalence of the two parts considered separately. Because they can guarantee the absence of behavioral differences between two versions of the code, tools that automatically verify code equivalence are of great benefit for reliable software development. With this motivation, the thesis also develops a custom first-order-logic modeling of SQL queries, which enables automated reasoning about the equivalence of C/C++ programs with embedded SQL. The approach is implemented within SQLAV, a publicly available open-source framework.
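    For plain conjunctive queries (the simplest relative of the SPARQL fragments treated in the thesis), containment has a classical decision procedure: Q1 is contained in Q2 iff Q2 maps homomorphically into Q1's canonical database, i.e. Q1's body with its variables frozen as constants. A brute-force sketch, not the thesis's first-order-logic reduction; the triple patterns are invented:

```python
from itertools import product

def contained(q1, q2):
    """Q1 subseteq Q2 for conjunctive queries over triple patterns iff
    there is a homomorphism from Q2 into Q1's canonical database
    (Q1's variables frozen as constants) that maps head to head.
    A query is (head_vars, body_atoms); variables start with '?'."""
    head1, body1 = q1
    head2, body2 = q2
    canon = set(body1)                                  # frozen body of Q1
    terms = {t for atom in body1 for t in atom}         # candidate images
    vars2 = sorted({t for atom in body2 for t in atom if t.startswith("?")})
    for image in product(terms, repeat=len(vars2)):     # try every mapping
        h = dict(zip(vars2, image))
        f = lambda t: h.get(t, t)
        if all(tuple(map(f, a)) in canon for a in body2) \
                and [f(v) for v in head2] == list(head1):
            return True
    return False

# Toy SPARQL-like triple patterns, for illustration only.
q1 = (["?x"], [("?x", "knows", "?y"), ("?y", "knows", "?z")])
q2 = (["?x"], [("?x", "knows", "?y")])
print(contained(q1, q2), contained(q2, q1))
```

    The homomorphism test is NP-complete already for this fragment; handling SPARQL's richer operators and RDF Schema entailment is what motivates the reduction to first-order satisfiability instead.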