8 research outputs found

    Leveraging Language to Learn Program Abstractions and Search Heuristics

    Full text link
    Inductive program synthesis, or inferring programs from examples of desired behavior, offers a general paradigm for building interpretable, robust, and generalizable machine learning systems. Effective program synthesis depends on two key ingredients: a strong library of functions from which to build programs, and an efficient search strategy for finding programs that solve a given task. We introduce LAPS (Language for Abstraction and Program Search), a technique for using natural language annotations to guide joint learning of libraries and neurally-guided search models for synthesis. When integrated into a state-of-the-art library learning system (DreamCoder), LAPS produces higher-quality libraries and improves search efficiency and generalization on three domains -- string editing, image composition, and abstract reasoning about scenes -- even when no natural language hints are available at test time.
    Comment: appeared in the Thirty-eighth International Conference on Machine Learning (ICML 2021).

    Enriching the Swedish Sign Language Corpus with Part of Speech Tags Using Joint Bayesian Word Alignment and Annotation Transfer

    Get PDF
    Abstract We have used a novel Bayesian model of joint word alignment and part of speech (PoS) annotation transfer to enrich the Swedish Sign Language Corpus with PoS tags. The annotations were then hand-corrected in order to both improve annotation quality for the corpus, and allow the empirical evaluation presented herein.

    A Bayesian model for joint word alignment and part-of-speech transfer

    Get PDF
    Current methods for word alignment require considerable amounts of parallel text to deliver accurate results, a requirement which is met only for a small minority of the world's approximately 7,000 languages. We show that by jointly performing word alignment and annotation transfer in a novel Bayesian model, alignment accuracy can be improved for language pairs where annotations are available for only one of the languages---a finding which could facilitate the study and processing of a vast number of low-resource languages. We also present an evaluation where our method is used to perform single-source and multi-source part-of-speech transfer with 22 translations of the same text in four different languages. This allows us to quantify the considerable variation in accuracy depending on the specific source text(s) used, even with different translations into the same language.
    Non peer reviewed.

    A Systematic Bayesian Treatment of the IBM Alignment Models

    No full text
    The dominant yet ageing IBM and HMM word alignment models underpin most popular Statistical Machine Translation implementations in use today. Though beset by the limitations of implausible independence assumptions, intractable optimisation problems, and an excess of tunable parameters, these models provide a scalable and reliable starting point for inducing translation systems. In this paper we build upon this venerable base by recasting these models in the non-parametric Bayesian framework. By replacing the categorical distributions at their core with hierarchical Pitman-Yor processes, and through the use of collapsed Gibbs sampling, we provide a more flexible formulation and sidestep the original heuristic optimisation techniques. The resulting models are highly extendible, naturally permitting the introduction of phrasal dependencies. We present extensive experimental results showing improvements in both AER and BLEU when benchmarked against Giza++, including significant improvements over IBM model 4.
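    The core move described in this abstract -- replacing a categorical distribution with a hierarchical Pitman-Yor process and sampling with collapsed Gibbs -- rests on the Pitman-Yor predictive rule, which is usually presented as a Chinese restaurant process. The sketch below is a generic, minimal implementation of that predictive rule only, not the paper's alignment model; the function name and parameters (`discount`, `concentration`) are illustrative choices, and the hierarchical and phrasal extensions the paper describes are omitted.

    ```python
    import random

    def pyp_table_assignments(n_customers, discount, concentration, rng):
        """Seat customers by the Pitman-Yor Chinese restaurant process.

        Customer n+1 joins existing table k with probability proportional to
        (count_k - discount), or opens a new table with probability
        proportional to (concentration + discount * num_tables).
        Returns the list of per-table customer counts.
        """
        tables = []  # tables[k] = number of customers seated at table k
        for _ in range(n_customers):
            total = sum(tables)
            # Unnormalised weight for each existing table, plus one entry
            # for opening a new table; the weights sum to total + concentration.
            weights = [c - discount for c in tables]
            weights.append(concentration + discount * len(tables))
            r = rng.random() * (total + concentration)
            acc = 0.0
            for k, w in enumerate(weights):
                acc += w
                if r < acc:
                    break
            if k == len(tables):
                tables.append(1)   # open a new table
            else:
                tables[k] += 1     # join table k
        return tables
    ```

    The discount parameter is what gives the Pitman-Yor process its power-law table-size distribution, which is a better fit for word frequencies than the Dirichlet process it generalises (recovered by setting the discount to zero).
    
    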