A Defense of Pure Connectionism
Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production.
Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again on the basis mostly of recent developments, for a continuity in representation and processing across natural language, image processing and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e., serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association.
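As a rough illustration of two ideas named in this abstract — the principle that nearby vectors carry similar semantic values, and (a) vector symbolic architectures — here is a minimal sketch, not taken from the dissertation, using random vectors, cosine similarity, and circular-convolution binding in the style of holographic reduced representations. All names, values, and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024  # dimensionality of the (hypothetical) semantic space

def cosine(u, v):
    """Similarity of two vectors; proximal vectors get values near 1."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy random word/role vectors; in practice these would be learned embeddings.
agent, patient, alice, bob = (rng.standard_normal(d) / np.sqrt(d) for _ in range(4))

def bind(a, b):
    """Circular convolution: the classic holographic-reduced-representation binding."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)

def unbind(a, b):
    """Approximate inverse of bind: correlate with the involution of b."""
    return bind(a, np.concatenate(([b[0]], b[:0:-1])))

# "alice chases bob" vs "bob chases alice": same fillers, different role bindings.
s1 = bind(agent, alice) + bind(patient, bob)
s2 = bind(agent, bob) + bind(patient, alice)

# Unbinding the agent role from s1 recovers a vector close to `alice`,
# illustrating how role/filler structure can live in a flat vector space.
print(cosine(unbind(s1, agent), alice))  # relatively high
print(cosine(unbind(s1, agent), bob))    # near zero
print(cosine(s1, s2))                    # distinct structures despite shared fillers
```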
Online Versus Offline NMT Quality: An In-depth Analysis on English-German and German-English
In this work we conduct an evaluation study comparing offline and online neural machine translation architectures. Two sequence-to-sequence models are considered: the convolutional Pervasive Attention model (Elbayad et al., 2018) and the attention-based Transformer (Vaswani et al., 2017). For both architectures, we investigate the impact of online decoding constraints on translation quality through a carefully designed human evaluation on the English-German and German-English language pairs, the latter being particularly sensitive to latency constraints. The evaluation results allow us to identify the strengths and shortcomings of each model when we shift to the online setup.
Comment: Accepted at COLING 2020.
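To make the offline/online contrast concrete, here is a hedged sketch (not the authors' code) of the structural difference: in offline decoding every target token may condition on the complete source sentence, while in online decoding a read/write policy determines how much of the source prefix is visible at each step. The `NextToken` and `Policy` callables are hypothetical stand-ins for a trained model and its decision rule.

```python
from typing import Callable, List

# Hypothetical next-token function: (visible source tokens, target prefix) -> next target token.
NextToken = Callable[[List[str], List[str]], str]
# Hypothetical read/write policy: (visible source tokens, target prefix) -> "READ" or "WRITE".
Policy = Callable[[List[str], List[str]], str]

def offline_decode(step: NextToken, source: List[str], max_len: int = 100) -> List[str]:
    """Offline decoding: every target token conditions on the complete source sentence."""
    target: List[str] = []
    while len(target) < max_len:
        tok = step(source, target)              # full source always visible
        if tok == "</s>":
            break
        target.append(tok)
    return target

def online_decode(step: NextToken, policy: Policy, source: List[str],
                  max_len: int = 100) -> List[str]:
    """Online decoding: each target token conditions only on the source prefix
    that the read/write policy has consumed so far."""
    target: List[str] = []
    read = 1                                    # at least one source token is read first
    while len(target) < max_len:
        if read < len(source) and policy(source[:read], target) == "READ":
            read += 1                           # consume one more source token
            continue
        tok = step(source[:read], target)       # only the visible prefix is used
        if tok == "</s>":
            break
        target.append(tok)
    return target
```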
Efficient Wait-k Models for Simultaneous Machine Translation
Simultaneous machine translation consists in starting output generation before the entire input sequence is available. Wait-k decoders offer a simple but efficient approach to this problem: they first read k source tokens, after which they alternate between producing a target token and reading another source token. We investigate the behavior of wait-k decoding in low-resource settings for spoken corpora using IWSLT datasets. We improve the training of these models by using unidirectional encoders and by training across multiple values of k. Experiments with Transformer and 2D-convolutional architectures show that our wait-k models generalize well across a wide range of latency levels. We also show that the 2D-convolution architecture is competitive with Transformers for simultaneous translation of spoken language.
Comment: Accepted at INTERSPEECH 2020.
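The wait-k schedule described in the abstract can be written down directly. The sketch below is an illustration under stated assumptions, not the paper's implementation: it reads k source tokens, then alternates writing one target token and reading one more source token until the source is exhausted. The `step` callable is a hypothetical stand-in for the trained model. A plausible reason unidirectional encoders are attractive in this setting is that previously computed encoder states need not be recomputed as new source tokens arrive.

```python
from typing import Callable, List

# Hypothetical next-token function: (visible source tokens, target prefix) -> next target token.
NextToken = Callable[[List[str], List[str]], str]

def wait_k_decode(step: NextToken, source: List[str], k: int,
                  max_len: int = 100) -> List[str]:
    """Wait-k schedule: read k source tokens, then alternate between writing one
    target token and reading one more source token (reads stop at end of source)."""
    target: List[str] = []
    read = min(k, len(source))                # initial wait: k source tokens
    while len(target) < max_len:
        tok = step(source[:read], target)     # condition on the visible prefix only
        if tok == "</s>":
            break
        target.append(tok)
        read = min(read + 1, len(source))     # then read one more token per write
    return target

# Example with a dummy "model" that merely echoes the last visible source token,
# just to show how the read/write schedule interleaves.
if __name__ == "__main__":
    dummy = lambda src, tgt: src[-1] if len(tgt) < len(src) else "</s>"
    print(wait_k_decode(dummy, "wir haben ein Modell trainiert".split(), k=2))
```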
Incremental Prediction of Sentence-Final Verbs with Attentive Recurrent Neural Networks
Sentence-final verb prediction has garnered attention both in computational linguistics and in psycholinguistics. It is indispensable for understanding human processing of verb-final languages. More recently, it has been used in computational approaches to simultaneous interpretation, i.e., real-time translation from verb-final to verb-medial languages. While previous approaches use classical statistical methods, including pattern-matching rules, n-gram language models, or logistic regression with linguistic features, we introduce an attention-based neural model, Attentive Neural Verb Inference for Incremental Language (ANVIIL), to incrementally predict final verbs from incomplete sentences. Our approach better predicts the final verbs in Japanese and German and provides more interpretable explanations of why those verbs are selected.
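As a rough picture of how such a model can be structured (the abstract does not give enough detail to reproduce ANVIIL itself), the following sketch encodes the sentence prefix with a unidirectional GRU, applies additive attention over the hidden states, and scores candidate final verbs from the attended summary. All layer sizes, names, and the dummy input are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentiveVerbPredictor(nn.Module):
    """Hypothetical sketch (not the paper's code): encode the incomplete sentence
    with a unidirectional RNN, attend over its hidden states, and score candidate
    sentence-final verbs from the attended summary."""

    def __init__(self, vocab_size: int, n_verbs: int, emb: int = 128, hid: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)    # unidirectional: prefixes only
        self.attn_score = nn.Linear(hid, 1)              # additive attention scorer
        self.verb_out = nn.Linear(hid, n_verbs)          # logits over candidate verbs

    def forward(self, prefix_ids: torch.Tensor) -> torch.Tensor:
        # prefix_ids: (batch, t) token ids of the sentence read so far
        states, _ = self.rnn(self.embed(prefix_ids))                          # (batch, t, hid)
        weights = torch.softmax(self.attn_score(torch.tanh(states)), dim=1)   # (batch, t, 1)
        context = (weights * states).sum(dim=1)                               # (batch, hid)
        return self.verb_out(context)

# Incremental use: re-score candidate verbs each time a new token arrives.
model = AttentiveVerbPredictor(vocab_size=10_000, n_verbs=500)
sentence = torch.randint(0, 10_000, (1, 12))              # dummy token ids
for t in range(1, sentence.size(1) + 1):
    logits = model(sentence[:, :t])                        # predict from the prefix
    print(t, int(logits.argmax(dim=-1)))                   # current best verb guess
```

The attention weights over prefix positions are also what would make the model's choices inspectable, in the spirit of the interpretability claim above.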
Layer-wise Representation Fusion for Compositional Generalization
Despite successes across a broad range of applications, the solutions constructed by sequence-to-sequence models are argued to be less compositional than human-like generalization. There is mounting evidence that one of the reasons hindering compositional generalization is that the representations in the uppermost encoder and decoder layers are entangled; in other words, the syntactic and semantic representations of sequences are twisted together inappropriately. However, most previous studies concentrate mainly on enhancing token-level semantic information to alleviate this representation entanglement problem, rather than on composing and using the syntactic and semantic representations of sequences appropriately, as humans do. In addition, we explain why the entanglement problem arises, drawing on recent studies of training deeper Transformers: it is mainly due to the "shallow" residual connections and their simple, one-step operations, which fail to fuse previous layers' information effectively. Starting from this finding and inspired by humans' strategies, we propose FuSion (Fusing Syntactic and Semantic Representations), an extension to sequence-to-sequence models that learns to fuse previous layers' information back into the encoding and decoding process by introducing a fuse-attention module at each encoder and decoder layer. FuSion achieves competitive and even state-of-the-art results on two realistic benchmarks, which empirically demonstrates the effectiveness of our proposal.
Comment: work in progress. arXiv admin note: substantial text overlap with arXiv:2305.1216
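To indicate what "fusing previous layers' information back into the encoding and decoding process" could look like mechanically, here is a hedged sketch of a fuse-attention-style layer in which the current representation attends over the concatenated outputs of all earlier layers. This illustrates the general idea under assumed shapes and names; it is not the paper's FuSion implementation.

```python
from typing import List

import torch
import torch.nn as nn

class FuseAttention(nn.Module):
    """Illustrative fuse-attention layer: the current layer's representation attends
    back over all previous layers' outputs, so lower-layer and higher-layer
    information can be recombined rather than carried only through the shallow,
    one-step residual connection. Names and details are assumptions."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.fuse = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, layer_outputs: List[torch.Tensor]) -> torch.Tensor:
        # x: (batch, seq, d_model) current layer's representation
        # layer_outputs: outputs of all previous layers, each (batch, seq, d_model)
        memory = torch.cat(layer_outputs, dim=1)          # stack layers along a memory axis
        fused, _ = self.fuse(query=x, key=memory, value=memory)
        return self.norm(x + fused)                       # residual + norm around the fusion
```

In a full model, one such module would sit inside each encoder and decoder layer, with `layer_outputs` growing as the stack is traversed.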
Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages
The Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages contain 17 papers that were presented at the conference, organised in Dubrovnik, Croatia, 4-6 October 2010.