
    TeaForN: Teacher-Forcing with N-grams

    Sequence generation models trained with teacher-forcing suffer from exposure bias and a lack of differentiability across timesteps. Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both problems directly through a stack of N decoders trained to decode along a secondary time axis, which allows model parameter updates based on N prediction steps. TeaForN can be used with a wide class of decoder architectures and requires minimal modifications from a standard teacher-forcing setup. Empirically, we show that TeaForN boosts generation quality on one Machine Translation benchmark, WMT 2014 English-French, and two News Summarization benchmarks, CNN/Dailymail and Gigaword.
    Comment: to be published in EMNLP 202
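    The data-side idea can be sketched simply: instead of a single next-token target per position, each position gets up to N future tokens, one per decoder in the stack. The following is a minimal, hypothetical illustration (the function name and toy sequence are not from the paper):

    ```python
    def nstep_targets(tokens, n):
        """For each position t, the up-to-n future tokens the decoder stack predicts."""
        return [tokens[t + 1 : t + 1 + n] for t in range(len(tokens) - 1)]

    seq = ["<s>", "the", "cat", "sat", "</s>"]
    print(nstep_targets(seq, 2))
    # → [['the', 'cat'], ['cat', 'sat'], ['sat', '</s>'], ['</s>']]
    ```

    Training on these multi-step targets is what lets gradient updates reflect N prediction steps rather than one.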

    CausalLM is not optimal for in-context learning

    Recent empirical evidence indicates that transformer-based in-context learning performs better when using a prefix language model (prefixLM), in which in-context samples can all attend to each other, compared to a causal language model (causalLM), which uses auto-regressive attention that prevents in-context samples from attending to future samples. While this result is intuitive, it is not understood from a theoretical perspective. In this paper we take a theoretical approach and analyze the convergence behavior of prefixLM and causalLM under a certain parameter construction. Our analysis shows that both LM types converge to their stationary points at a linear rate, but that while prefixLM converges to the optimal solution of linear regression, causalLM's convergence dynamics follow those of an online gradient descent algorithm, which is not guaranteed to be optimal even as the number of samples grows to infinity. We supplement our theoretical claims with empirical experiments over synthetic and real tasks and using various types of transformers. Our experiments verify that causalLM consistently underperforms prefixLM in all settings.
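    The attention-mask distinction the abstract rests on can be sketched directly. In a prefixLM, the in-context samples occupy a bidirectional prefix; in a causalLM, every position is masked from all future positions. The function names below are illustrative, not from the paper:

    ```python
    def causal_mask(n):
        # Position i may attend only to positions j <= i (auto-regressive).
        return [[j <= i for j in range(n)] for i in range(n)]

    def prefix_mask(n, prefix_len):
        # The first prefix_len positions (the in-context samples) attend
        # bidirectionally to each other; the remainder stays causal.
        return [[j < prefix_len or j <= i for j in range(n)] for i in range(n)]
    ```

    Under the causal mask, earlier in-context samples can never see later ones; that one-directional flow of information is what gives causalLM its online-gradient-descent-like dynamics in the analysis.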

    Antioxidant generation during coffee roasting: a comparison and interpretation from three complementary assays

    Coffee is a major source of dietary antioxidants; some are present in the green bean, whereas others are generated during roasting. However, there is no single accepted analytical method for their routine determination. This paper describes the adaptation of three complementary assays (Folin-Ciocalteu (FC), ABTS and ORAC) for the routine assessment of the antioxidant capacity of beverages, their validation, and their use for determining the antioxidant capacities of extracts from coffee beans at different stages in the roasting process. All assays showed a progressive increase in antioxidant capacity during roasting to a light roast state, consistent with the production of melanoidins having a greater antioxidant effect than the degradation of chlorogenic acids (CGAs). However, the three assays gave different values for the total antioxidant capacity of green beans relative to gallic acid (GA), although the range of values was much smaller when CGA was used as the reference. Therefore, all three assays indicated an increase in antioxidant activity during coffee roasting, but their large differences in response to GA and CGA illustrate their different sensitivities to different types of antioxidant molecule.

    PreSTU: Pre-Training for Scene-Text Understanding

    The ability to recognize and reason about text embedded in visual inputs is often lacking in vision-and-language (V&L) models, perhaps because V&L pre-training methods have often failed to include such an ability in their training objective. In this paper, we propose PreSTU, a novel pre-training recipe dedicated to scene-text understanding (STU). PreSTU introduces OCR-aware pre-training objectives that encourage the model to recognize text from an image and connect it to the rest of the image content. We implement PreSTU using a simple transformer-based encoder-decoder architecture, combined with large-scale image-text datasets with scene text obtained from an off-the-shelf OCR system. We empirically demonstrate the effectiveness of this pre-training approach on eight visual question answering and four image captioning benchmarks.
    Comment: Accepted to ICCV 202
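    One way an OCR-aware objective like this can be shaped is as a split-and-generate task: the model is prompted with part of the OCR-extracted text and must generate the remainder conditioned on the image. This is a sketch under that assumption, not the paper's exact objective, and the function name is hypothetical:

    ```python
    def make_split_ocr_example(ocr_tokens, k):
        """Build a (prompt, target) pair from OCR tokens: the model sees the
        first k tokens and must generate the rest (alongside image features)."""
        prompt = " ".join(ocr_tokens[:k])
        target = " ".join(ocr_tokens[k:])
        return prompt, target

    print(make_split_ocr_example(["SPEED", "LIMIT", "30"], 1))
    # → ('SPEED', 'LIMIT 30')
    ```

    Forcing the decoder to produce text it cannot copy from the prompt is what ties the generated tokens back to the pixels.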