Cooperative Learning of Zero-Shot Machine Reading Comprehension
Pretrained language models have significantly improved the performance of
downstream language understanding tasks, including extractive question
answering, by providing high-quality contextualized word embeddings. However,
training question answering models still requires large-scale data annotation
in specific domains. In this work, we propose a cooperative, self-play learning
framework, REGEX, for question generation and answering. REGEX is built upon a
masked answer extraction task with an interactive learning environment
containing an answer entity REcognizer, a question Generator, and an answer
EXtractor. Given a passage with a masked entity, the generator generates a
question around the entity, and the extractor is trained to extract the masked
entity given the generated question and the raw text. The framework allows
question generation and answering models to be trained on any text corpus
without annotation. We further leverage a reinforcement learning technique to
reward the generation of high-quality questions and to improve the answer
extraction model's performance. Experimental results show that REGEX
outperforms state-of-the-art (SOTA) pretrained language models and zero-shot
approaches on standard question-answering benchmarks, and achieves new SOTA
performance under the zero-shot setting.
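To make the described interaction concrete, below is a minimal, runnable toy sketch of the REcognizer-Generator-EXtractor self-play loop. All three components are hypothetical rule-based stand-ins (the actual framework uses pretrained language models trained with reinforcement learning); only the loop structure, the entity masking, and the exact-match reward follow the abstract.

# Toy sketch of a REGEX-style self-play step; components are illustrative
# stand-ins, not the authors' released models.
import random
import re

def recognize_entities(passage: str) -> list[str]:
    # Toy recognizer: treat capitalized tokens as candidate answer entities.
    return re.findall(r"\b[A-Z][a-z]+\b", passage)

def generate_question(masked_passage: str) -> str:
    # Toy generator: ask about whatever was masked.
    return "What does [MASK] refer to in: " + masked_passage

def extract_answer(question: str, passage: str) -> str:
    # Toy extractor: guess a candidate entity from the raw passage.
    candidates = recognize_entities(passage)
    return random.choice(candidates) if candidates else ""

def self_play_step(passage: str) -> float:
    entities = recognize_entities(passage)
    if not entities:
        return 0.0
    answer = random.choice(entities)               # pick an entity
    masked = passage.replace(answer, "[MASK]", 1)  # masked answer extraction task
    question = generate_question(masked)           # generator turn
    prediction = extract_answer(question, passage) # extractor turn
    # Exact-match reward; in the paper, a signal like this trains the
    # generator and extractor via reinforcement learning.
    return 1.0 if prediction == answer else 0.0

print(self_play_step("Marie Curie won the Nobel Prize in Paris."))

In the full framework, the returned reward would update the generator's policy rather than simply being printed.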
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
In this paper, we show that representations capturing syllabic units emerge
when training a self-supervised speech model with a visually-grounded training
objective. We demonstrate that a nearly identical model architecture (HuBERT)
trained with a masked language modeling loss does not exhibit this same
ability, suggesting that the visual grounding objective is responsible for the
emergence of this phenomenon. We propose the use of a minimum cut algorithm to
automatically predict syllable boundaries in speech, followed by a 2-stage
clustering method to group identical syllables together. We show that our model
not only outperforms a state-of-the-art syllabic segmentation method on the
language it was trained on (English), but also generalizes in a zero-shot
fashion to Estonian. Finally, we show that the same model is capable of
zero-shot generalization for a word segmentation task on 4 other languages from
the Zerospeech Challenge, in some cases beating the previous state of the art.
Comment: Interspeech 2023. Code & Model: https://github.com/jasonppy/syllable-discovery
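As a rough illustration of the pipeline sketched in the abstract, the following toy Python code segments frame-level speech features into syllable-like units and then groups them with two-stage clustering. The adjacent-frame similarity heuristic is a simplified stand-in for the paper's minimum cut algorithm, and the random features replace the visually grounded model's representations; see the linked repository for the actual implementation.

# Simplified illustration: boundary prediction followed by two-stage
# clustering of syllable-segment embeddings. Not the released code.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

def segment_boundaries(feats: np.ndarray, num_segments: int) -> list[int]:
    # Stand-in for the minimum-cut segmentation: place boundaries at the
    # frames where adjacent-frame similarity (dot product) is lowest.
    sims = (feats[:-1] * feats[1:]).sum(axis=1)
    cuts = np.argsort(sims)[: num_segments - 1] + 1    # weakest links
    return [0] + sorted(cuts.tolist()) + [len(feats)]

def two_stage_cluster(seg_feats: np.ndarray, fine_k: int = 30, coarse_k: int = 10):
    # Stage 1: over-cluster segment embeddings with k-means.
    fine = KMeans(n_clusters=fine_k, n_init=10).fit(seg_feats)
    # Stage 2: agglomeratively merge the k-means centroids so that
    # duplicate syllable clusters collapse together.
    coarse = AgglomerativeClustering(n_clusters=coarse_k).fit(fine.cluster_centers_)
    return coarse.labels_[fine.labels_]                # syllable id per segment

feats = np.random.randn(600, 64).astype(np.float32)   # fake frame features
bounds = segment_boundaries(feats, num_segments=60)
segs = np.stack([feats[a:b].mean(axis=0) for a, b in zip(bounds, bounds[1:])])
print(two_stage_cluster(segs))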
Bis[(E)-4-bromo-2-(ethoxyiminomethyl)phenolato-κ²N,O¹]copper(II)
The title compound, [Cu(C9H9BrNO2)2], is a centrosymmetric mononuclear copper(II) complex. The Cu atom is four-coordinated in a trans-CuN2O2 square-planar geometry by two phenolate O and two oxime N atoms from two symmetry-related N,O-bidentate (E)-4-bromo-2-(ethoxyiminomethyl)phenolate oxime-type ligands. An interesting feature of the crystal structure is the centrosymmetric intermolecular Cu⋯O interaction [3.382 (1) Å], which establishes an infinite chain structure along the b axis.
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Spoken Language Understanding (SLU) is a task that aims to extract semantic
information from spoken utterances. Previous research has made progress in
end-to-end SLU by leveraging paired speech-text data, for example through
pre-trained Automatic Speech Recognition (ASR) models or by using paired text
as intermediate targets.
However, acquiring paired transcripts is expensive and impractical for
unwritten languages. On the other hand, Textless SLU extracts semantic
information from speech without utilizing paired transcripts. However, the
absence of intermediate targets and training guidance for textless SLU often
results in suboptimal performance. In this work, inspired by the
content-disentangled discrete units from self-supervised speech models, we
propose using discrete units as intermediate guidance to improve textless SLU
performance. Our method surpasses the baseline method on five SLU benchmark
corpora. Additionally, we find that unit guidance facilitates few-shot learning
and enhances the model's ability to handle noise.
Comment: Accepted by Interspeech 2023
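As a hedged sketch of intermediate unit guidance, the toy model below adds an auxiliary discrete-unit prediction head alongside the end-task (intent) head. The architecture, dimensions, unit inventory size, and loss weighting are illustrative assumptions rather than the paper's configuration; in practice the unit targets would come from k-means clustering of self-supervised speech features.

# Toy textless SLU model with an auxiliary unit-prediction objective.
# All sizes and the loss weight alpha are assumptions for illustration.
import torch
import torch.nn as nn

class TextlessSLU(nn.Module):
    def __init__(self, feat_dim=768, num_units=100, num_intents=31):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, 256, batch_first=True)
        self.unit_head = nn.Linear(256, num_units)      # intermediate target
        self.intent_head = nn.Linear(256, num_intents)  # end-task prediction

    def forward(self, feats):
        hidden, _ = self.encoder(feats)                   # (B, T, 256)
        unit_logits = self.unit_head(hidden)              # (B, T, num_units)
        intent_logits = self.intent_head(hidden.mean(dim=1))  # (B, num_intents)
        return unit_logits, intent_logits

def loss_fn(unit_logits, intent_logits, unit_targets, intent_targets, alpha=0.5):
    # The auxiliary unit loss provides the intermediate guidance;
    # the intent loss is the end task.
    unit_loss = nn.functional.cross_entropy(
        unit_logits.transpose(1, 2), unit_targets)
    intent_loss = nn.functional.cross_entropy(intent_logits, intent_targets)
    return intent_loss + alpha * unit_loss

model = TextlessSLU()
feats = torch.randn(2, 50, 768)          # toy self-supervised speech features
units = torch.randint(0, 100, (2, 50))   # toy k-means unit ids per frame
intents = torch.randint(0, 31, (2,))     # toy intent labels
unit_logits, intent_logits = model(feats)
print(loss_fn(unit_logits, intent_logits, units, intents).item())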