10 research outputs found

ON EXPRESSIVENESS, INFERENCE, AND PARAMETER ESTIMATION OF DISCRETE SEQUENCE MODELS

Author: Lin Chu-Cheng
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 30/01/2023

Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently, and also perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability; and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5). In the second part of the thesis, we study practical sequence model families and algorithms based on theoretical findings in the first part of the thesis. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite state transducers with the introduction of mark strings, allowing scoring transduction paths in a finite state transducer with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs, and combine these weighted relations together with various operations

JScholarship

Algebraic Methods in Language Processing: Proceedings of the twenty-first Twente workshop on language technology

Publication venue: 'University Library/University of Twente'
Publication date: 15/08/2003

University of Twente Research Information

Tirer parti de la structure des données incertaines

Author: Amarilli Antoine
Publication venue: HAL CCSD
Publication date: 14/03/2016

The management of data uncertainty can lead to intractability, in the case of probabilistic databases, or even undecidability, in the case of open-world reasoning under logical rules. My thesis studies how to mitigate these problems by restricting the structure of uncertain data and rules. My first contribution investigates conditions on probabilistic relational instances that ensure the tractability of query evaluation and lineage computation. I show that these tasks are tractable when we bound the treewidth of instances, for various probabilistic frameworks and provenance representations. Conversely, I show intractability under mild assumptions for any other condition on instances. The second contribution concerns query evaluation on incomplete data under logical rules, and under the finiteness assumption usually made in database theory. I show that this task is decidable for unary inclusion dependencies and functional dependencies. This establishes the first positive result for finite open-world query answering on an arbitrary-arity language featuring both referential constraints and number restrictions

Thèses en Ligne

thèses en ligne de ParisTech

Programming Languages and Systems

Publication venue: 'Springer Science and Business Media LLC'

This open access book constitutes the proceedings of the 30th European Symposium on Programming, ESOP 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 24 papers included in this volume were carefully reviewed and selected from 79 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

OAPEN Library

Towards The Efficient Use Of Fine-Grained Provenance In Datascience Applications

Author: Wu Yinjun
Publication venue: ScholarlyCommons
Publication date: 01/01/2021

Recent years have witnessed increased demand for users to be able to interpret the results of data science pipelines, locate erroneous data items in the input, evaluate the importance of individual input data items, and acknowledge the contributions of data curators. Such applications often involve the use of provenance at a fine-grained level, and require very fast response times. To address this issue, my goal is to expedite the use of fine-grained provenance in applications within both the database and machine learning domains, which are ubiquitous in contemporary data science pipelines. In applications from the database domain, I focus on the problem of data citation and provide two different types of solutions, rewriting-based solutions and provenance-based solutions, to generate fine-grained citations to database query results by implicitly or explicitly leveraging provenance information. In applications from the ML domain, the first application considers the problem of incrementally updating ML models after the deletion of a small subset of training samples. This is critical for understanding the importance of individual training samples to ML models, especially in online pipelines. For this problem, I provide two solutions, PrIU and DeltaGrad, to incrementally update ML models constructed by SGD/GD methods, which utilize provenance information collected during the training phase on the full dataset before the deletion requests. The second application from the ML domain that I focus on is to explore how to clean label uncertainties located in the ML training dataset in a more efficient and cheaper manner. To address this problem, I propose a solution, CHEF, to reduce the cost and the overhead at each phase of the label cleaning pipeline while maintaining the overall model performance. I also propose initial ideas for how to remove some assumptions used in these solutions to extend them to more general scenarios

ScholarlyCommons@Penn

Inclusion problems for one-counter systems

Author: Totzke Patrick
Publication venue: The University of Edinburgh
Publication date: 27/11/2014

We study the decidability and complexity of verification problems for infinite-state systems. A fundamental question in formal verification is whether the behaviour of one process is reproducible by another. This inclusion problem can be studied for various models of computation and behavioural preorders. It is generally intractable or even undecidable already for very limited computational models. The aim of this work is to clarify the status of the decidability and complexity of some well-known inclusion problems for suitably restricted computational models. In particular, we address the problems of checking strong and weak simulation and trace inclusion for processes definable by one-counter automata (OCA), which consist of a finite control and a single counter ranging over the non-negative integers. We take special interest in the subclass of one-counter nets (OCNs), which cannot fully test the counter for zero and which is subsumed both by pushdown automata and Petri nets / vector addition systems. Our new results include the PSPACE-completeness of strong and weak simulation, and the undecidability of trace inclusion for OCNs. Moreover, we consider semantic preorders between OCA/OCN and finite systems and close some gaps regarding their complexity. Finally, we study deterministic processes, for which simulation and trace inclusion coincide

Edinburgh Research Archive

Critical Thinking Skills Profile of High School Students In Learning Science-Physics

Author: Khaeruddin Khaeruddin
Nur Mohammad
Wasis Wasis
Publication date: 16/05/2016

This study aims to describe the critical thinking skills of high school students in the city of Makassar. To this end, the researchers analyzed the test results of 200 students spread across six schools in the city. The quantitative descriptive analysis found that students' average scores for interpretation, analysis, and inference were 1.53, 1.15, and 1.52, respectively. These values are very low compared with the maximum attainable score of 10.00, which shows that the critical thinking skills of high school students are still very low. One of the competency standards for the science (physics) subject is demonstrating the ability to think logically, critically, and creatively with the guidance of teachers, and demonstrating the ability to solve simple problems in daily life. According to Michael Scriven, the main task of education is to train students to think critically, because the demands of work in the global economy, the survival of a democratic society, and personal decisions in an increasingly complex society require people who can think well and make good judgments. Therefore, teachers need learning-device scenarios such as: driving question or problem, authentic investigation: science processes

Repository Universitas Negeri Makassar

Constraint propagation in Mozart

Author: Müller Tobias
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2001

This thesis presents constraint propagation in Mozart, which is based on computational agents called propagators. The thesis designs, implements, and evaluates propagator-based propagation engines. A propagation engine is split into generic propagation services and domain-specific domain solvers, which are connected by a constraint programming interface. Propagators use filters to perform constraint propagation. The interface isolates filters from propagators so that they can be shared among various systems. This thesis presents the design and implementation of a finite integer set domain solver for Mozart, which reasons over bound and cardinality approximations of sets. The solver cooperates with a finite domain solver to improve its propagation and expressiveness. This thesis promotes constraints to first-class citizens and thus provides extra control over constraints. Novel programming techniques taking advantage of the first-class status of constraints are developed and illustrated

Constraint propagation in Mozart

Author: Müller Tobias
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 23/09/2004

Universaar

LIPIcs, Volume 261, ICALP 2023, Complete Volume

Author: Etessami Kousha
Feige Uriel
Puppis Gabriele
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)
Publication date: 01/01/2023

LIPIcs, Volume 261, ICALP 2023, Complete Volume

Dagstuhl Research Online Publication Server