Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
Human annotation for syntactic parsing is expensive, and large resources are
available only for a fraction of languages. A question we ask is whether one
can leverage abundant unlabeled texts to improve syntactic parsers, beyond just
using the texts to obtain more generalisable lexical features (i.e. beyond word
embeddings). To this end, we propose a novel latent-variable generative model
for semi-supervised syntactic dependency parsing. As exact inference is
intractable, we introduce a differentiable relaxation to obtain approximate
samples and compute gradients with respect to the parser parameters. Our method
(Differentiable Perturb-and-Parse) relies on differentiable dynamic programming
over stochastically perturbed edge scores. We demonstrate the effectiveness of
our approach with experiments on English, French, and Swedish.
Comment: Accepted at ICLR 2019
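The core trick, perturb-and-parse, can be sketched compactly: perturb the arc scores with Gumbel noise, then replace the hard argmax inside the parsing dynamic program with a continuous relaxation so gradients can flow back to the parser parameters. The JAX sketch below is illustrative only: it uses a per-word softmax over candidate heads rather than the paper's relaxed structured dynamic program, and all names and shapes are made up.

```python
import jax
import jax.numpy as jnp

def perturb_scores(key, edge_scores):
    """Perturb-and-MAP: add i.i.d. Gumbel(0, 1) noise to the arc scores."""
    return edge_scores + jax.random.gumbel(key, edge_scores.shape)

def soft_heads(perturbed_scores, temperature=1.0):
    """Differentiable relaxation: a softmax over candidate heads per word.

    The paper relaxes the argmax inside a structured dynamic program;
    this per-word softmax ignores the tree constraint and only shows
    how gradients flow through the sampling step.
    """
    return jax.nn.softmax(perturbed_scores / temperature, axis=-1)

key_scores, key_noise = jax.random.split(jax.random.PRNGKey(0))
scores = jax.random.normal(key_scores, (5, 5))  # scores[d, h]: word h heads word d
relaxed = soft_heads(perturb_scores(key_noise, scores), temperature=0.5)
# `relaxed` is a soft adjacency matrix; as temperature -> 0 it approaches
# a hard per-word argmax sample from the perturbed scores.
```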
LAST: Scalable Lattice-Based Speech Modelling in JAX
We introduce LAST, a LAttice-based Speech Transducer library in JAX. With an
emphasis on flexibility, ease-of-use, and scalability, LAST implements
differentiable weighted finite state automaton (WFSA) algorithms needed for
training & inference that scale to a large WFSA, such as a recognition lattice
over the entire utterance. Although these WFSA algorithms are well known in
the literature, new challenges arise from the performance characteristics of
modern architectures and from nuances in automatic differentiation. We describe a
suite of generally applicable techniques employed in LAST to address these
challenges, and demonstrate their effectiveness with benchmarks on TPUv3 and
V100 GPUs.
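The central primitive in this setting is a differentiable shortest-distance (forward) computation over a WFSA. The sketch below is not LAST's API; it is a minimal, generic JAX implementation of the forward algorithm for a dense automaton in the log semiring, included to show why such recurrences are directly differentiable.

```python
import jax
import jax.numpy as jnp

def forward_log_weight(transition_logits, init, final, num_steps):
    """Total log-weight of all length-`num_steps` paths through a dense WFSA.

    transition_logits: [S, S] log-weights for state i -> state j.
    init, final:       [S] log-weights for starting / ending in each state.
    logsumexp-based matrix "products" implement the log semiring, so the
    whole computation is differentiable end to end.
    """
    def step(alpha, _):
        # alpha'[j] = logsum_i alpha[i] + w(i -> j)
        alpha = jax.scipy.special.logsumexp(alpha[:, None] + transition_logits, axis=0)
        return alpha, None

    alpha, _ = jax.lax.scan(step, init, None, length=num_steps)
    return jax.scipy.special.logsumexp(alpha + final)

S = 4
w = jax.random.normal(jax.random.PRNGKey(0), (S, S))
init, final = jnp.zeros(S), jnp.zeros(S)
val, grads = jax.value_and_grad(forward_log_weight)(w, init, final, 8)
# grads[i, j] is the expected number of uses of transition i -> j under the
# path distribution, a standard identity for log-partition gradients.
```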
Simple Hardware-Efficient PCFGs with Independent Left and Right Productions
Scaling dense PCFGs to thousands of nonterminals via a low-rank
parameterization of the rule probability tensor has been shown to be beneficial
for unsupervised parsing. However, PCFGs scaled this way still perform poorly
as a language model, and even underperform similarly-sized HMMs. This work
introduces SimplePCFG, a simple PCFG formalism with independent left and
right productions. Despite imposing a stronger independence assumption than the
low-rank approach, we find that this formalism scales more effectively both as
a language model and as an unsupervised parser. As an unsupervised parser, our
simple PCFG obtains an average F1 of 65.1 on the English PTB, and as a language
model, it obtains a perplexity of 119.0, outperforming similarly-sized low-rank
PCFGs. We further introduce FlashInside, a hardware IO-aware
implementation of the inside algorithm for efficiently scaling simple PCFGs.
Comment: Accepted to Findings of EMNLP, 2023
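The efficiency of the formalism is easy to see in the inside algorithm: when the binary rule probability factors as p(A -> B C) = pL(B|A) * pR(C|A), each split point reduces to two matrix-vector products instead of a rank-3 tensor contraction. A minimal sketch under that assumption (not the paper's FlashInside kernel; names and shapes are illustrative):

```python
import jax.numpy as jnp

def inside_simple_pcfg(pL, pR, leaf):
    """Inside algorithm for binary rules that factor as
    p(A -> B C) = pL[A, B] * pR[A, C] (independent left/right children).

    pL, pR: [NT, NT] left-/right-child probabilities given the parent.
    leaf:   [n, NT]  preterminal scores for each of the n words.
    Spans are kept in a dict for clarity; the paper's FlashInside instead
    vectorizes this recursion in an IO-aware way.
    """
    n = leaf.shape[0]
    inside = {(i, i + 1): leaf[i] for i in range(n)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            total = jnp.zeros(pL.shape[0])
            for k in range(i + 1, j):
                # factored rules: two mat-vecs replace a rank-3 contraction
                total = total + (pL @ inside[(i, k)]) * (pR @ inside[(k, j)])
            inside[(i, j)] = total
    return inside
```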
Algorithms for Optimal Paths of One, Many, and an Infinite Number of Agents
In this dissertation, we provide efficient algorithms for modeling the behavior of a single agent, multiple agents, and a continuum of agents. For a single agent, we combine the modeling framework of optimal control with advances in optimization splitting to efficiently find optimal paths for problems in very high dimensions, thus alleviating the curse of dimensionality. For a finite number of agents, we take the framework of multi-agent reinforcement learning and use imitation learning to decentralize a centralized expert, obtaining agents that act optimally in a decentralized fashion. For a continuum of agents, we take the framework of mean-field games and use two neural networks, trained in an alternating scheme, to efficiently find optimal paths for high-dimensional and stochastic problems. These tools cover a wide variety of use cases and can be immediately deployed in practical applications.
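The abstract only states that the mean-field-game approach trains two networks in alternation, so the shape of such a loop can be sketched as a generic saddle-point update. Everything below (the objective, the parameter shapes, the learning rate) is a hypothetical stand-in, not the dissertation's actual formulation.

```python
import jax
import jax.numpy as jnp

def saddle_objective(phi_params, g_params, batch):
    """Placeholder saddle-point objective coupling the two networks.

    A toy bilinear coupling, standing in for a mean-field-game loss in
    which one network plays a value-like role and the other a
    density/path-like role.
    """
    return jnp.mean((batch @ phi_params) * (batch @ g_params))

@jax.jit
def alternating_step(phi_params, g_params, batch, lr=1e-2):
    # ascend in phi, then descend in g: one round of the alternating scheme
    d_phi = jax.grad(saddle_objective, argnums=0)(phi_params, g_params, batch)
    phi_params = phi_params + lr * d_phi
    d_g = jax.grad(saddle_objective, argnums=1)(phi_params, g_params, batch)
    g_params = g_params - lr * d_g
    return phi_params, g_params

k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
phi, g = jax.random.normal(k1, (3,)), jax.random.normal(k2, (3,))
batch = jax.random.normal(k3, (16, 3))
phi, g = alternating_step(phi, g, batch)
```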
On the relationship between predictive coding and backpropagation
Artificial neural networks are often interpreted as abstract models of
biological neuronal networks, but they are typically trained using the
biologically unrealistic backpropagation algorithm and its variants. Predictive
coding has been offered as a potentially more biologically realistic
alternative to backpropagation for training neural networks. In this
manuscript, I review and extend recent work on the mathematical relationship
between predictive coding and backpropagation for training feedforward
artificial neural networks on supervised learning tasks. I discuss some
implications of these results for the interpretation of predictive coding and
deep neural networks as models of biological learning, and I describe a
repository of functions, Torch2PC, for performing predictive coding with
PyTorch neural network models.
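The gist of the correspondence can be sketched numerically: clamp the output of a feedforward network to the target, relax the hidden activities to minimize the prediction-error energy, and compare the resulting weight gradients with backpropagation's. The JAX sketch below is a generic illustration of this construction, not Torch2PC's interface; the approximate agreement it exhibits holds in the small-error regime analyzed in this line of work.

```python
import jax
import jax.numpy as jnp

def energy(params, z1, x, y):
    """Predictive-coding energy for a 2-layer net with the output clamped
    to the target: the sum of squared prediction errors at each layer."""
    W1, W2 = params
    e1 = z1 - jnp.tanh(W1 @ x)
    e2 = y - jnp.tanh(W2 @ z1)
    return 0.5 * (e1 @ e1) + 0.5 * (e2 @ e2)

def pc_infer(params, x, y, steps=200, lr=0.1):
    """Relax the hidden activity z1 to (approximately) minimize the energy."""
    z1 = jnp.tanh(params[0] @ x)  # feedforward initialization
    dE_dz1 = jax.grad(energy, argnums=1)
    for _ in range(steps):
        z1 = z1 - lr * dE_dz1(params, z1, x, y)
    return z1

def loss(params, x, y):
    W1, W2 = params
    return 0.5 * jnp.sum((y - jnp.tanh(W2 @ jnp.tanh(W1 @ x))) ** 2)

k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
params = (0.1 * jax.random.normal(k1, (4, 3)), 0.1 * jax.random.normal(k2, (2, 4)))
x, y = jax.random.normal(k3, (3,)), jax.random.normal(k4, (2,))

z1 = pc_infer(params, x, y)
pc_grads = jax.grad(energy, argnums=0)(params, z1, x, y)
bp_grads = jax.grad(loss)(params, x, y)
# With small weights/errors the two gradient pytrees are close,
# illustrating the correspondence the manuscript analyzes.
```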