88 research outputs found
Robustification of Multilingual Language Models to Real-world Noise with Robust Contrastive Pretraining
Advances in neural modeling have achieved state-of-the-art (SOTA) results on
public natural language processing (NLP) benchmarks, at times surpassing human
performance. However, there is a gap between public benchmarks and real-world
applications where noise such as typos or grammatical mistakes is abundant,
resulting in degraded performance. Unfortunately, works that assess the
robustness of neural models on noisy data and suggest improvements are limited
to the English language. Upon analyzing noise in different languages, we
observe that noise types vary across languages and thus require their own
investigation. To benchmark the performance of pretrained multilingual
models, we construct noisy datasets covering five languages and four NLP tasks.
We see a gap in performance between clean and noisy data. After investigating
ways to boost the zero-shot cross-lingual robustness of multilingual pretrained
models, we propose Robust Contrastive Pretraining (RCP). RCP combines data
augmentation with a contrastive loss term at the pretraining stage and achieves
large improvements on both noisy and original test data across two multilingual
sentence-level classification tasks (+3.2%) and two multilingual
sequence-labeling tasks (+10 F1 score).
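The pretraining objective described above pairs each clean sentence with a noised version and pulls their representations together. A minimal sketch of that kind of contrastive (InfoNCE-style) loss is below; the function names and the plain-Python cosine similarity are illustrative, not the paper's implementation, which operates on transformer encoder outputs during pretraining.

```python
import math

def cosine(u, v):
    # Plain cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(clean_embs, noisy_embs, temperature=0.1):
    """InfoNCE-style loss: each clean embedding should be closest to
    the noisy embedding of the *same* sentence (the positive), with
    the other sentences in the batch acting as negatives."""
    loss, n = 0.0, len(clean_embs)
    for i in range(n):
        sims = [cosine(clean_embs[i], noisy_embs[j]) / temperature
                for j in range(n)]
        log_denom = math.log(sum(math.exp(s) for s in sims))
        loss += log_denom - sims[i]  # -log softmax of the positive
    return loss / n
```

When the noisy embedding of each sentence matches its clean embedding, the loss is near zero; a mismatched pairing drives it up, which is the signal that encourages noise-invariant representations.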
Knowledge-driven slot constraints for goal-oriented dialogue systems
In goal-oriented dialogue systems, users provide information through slot values to achieve specific goals. In practice, some combinations of slot values can be invalid according to external knowledge. For example, the combination of "cheese pizza" (a menu item) and "oreo cookies" (a topping) from the input utterance "Can I order a cheese pizza with oreo cookies on top?" is invalid according to the menu of a restaurant business. Traditional dialogue systems execute validation rules as a post-processing step after slots have been filled, which can lead to error accumulation. In this paper, we formalize knowledge-driven slot constraints and present a new task of constraint violation detection accompanied by benchmarking data. We then propose methods to integrate the external knowledge into the system, model constraint violation detection as an end-to-end classification task, and compare it to the traditional rule-based pipeline approach. Experiments on two domains of the MultiDoGO dataset reveal the challenges of constraint violation detection and set the stage for future work and improvements.
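The rule-based pipeline baseline the abstract contrasts with can be sketched as a lookup against an external knowledge table after slot filling. The knowledge table and slot names below are illustrative, not from the MultiDoGO data.

```python
# Illustrative external knowledge: which toppings are valid for which
# menu items at a hypothetical restaurant.
VALID_TOPPINGS = {
    "cheese pizza": {"mushrooms", "olives", "extra cheese"},
    "ice cream sundae": {"oreo cookies", "sprinkles"},
}

def violates_constraint(slots):
    """Rule-based post-processing check: return True if the filled
    slot combination is invalid according to the knowledge table.
    Runs only after both slots are filled, which is exactly where
    upstream slot-filling errors can accumulate."""
    item = slots.get("menu_item")
    topping = slots.get("topping")
    if item is None or topping is None:
        return False  # nothing to validate yet
    return topping not in VALID_TOPPINGS.get(item, set())
```

The end-to-end alternative proposed in the paper instead classifies the utterance (with the knowledge integrated into the model input) as violating or not, avoiding the dependence on perfectly filled slots.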
The Influence of Fabrication Conditions on the Porosity of Carbon-Based Tablets
The structure of the work: the final qualifying work consists of four parts. The first part provides an overview of various types of fuel and methods of hydrogen storage; an analysis of the properties of carbon materials used in hydrogen sorption; ways to provide a developed inner surface of the tablets; and substances that act as plasticizers in the fabrication of carbon tablets.
The second part describes the preparation of the press powders, as well as the fabrication conditions and the properties of the obtained tablets.
The third part gives an economic calculation of the costs of conducting the study and a work schedule.
The fourth part considers labor protection and safety measures for carrying out the research work.
Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic
The use of multilingual language models for tasks in low and high-resource
languages has been a success story in deep learning. In recent times, Arabic
has been receiving widespread attention on account of its dialectal variance.
While prior research studies have tried to adapt these multilingual models for
dialectal variants of Arabic, it still remains a challenging problem owing to
the lack of sufficient monolingual dialectal data and parallel translation data
of such dialectal variants. It remains an open problem whether the limited
dialectal data can be used to improve models trained on Arabic for its
dialectal variants. First, we show that multilingual-BERT (mBERT) incrementally
pretrained on Arabic monolingual data takes less training time and yields
comparable accuracy when compared to our custom monolingual Arabic model, and
beats existing models (by an avg metric of +). We then explore two
continual pre-training methods -- (1) using small amounts of dialectal data
for continual finetuning and (2) using parallel Arabic-to-English data with a
Translation Language Modeling loss function. We show that both approaches help
improve performance on dialectal classification tasks ( avg. gain) when
used on monolingual models.
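The second continual pre-training method relies on Translation Language Modeling (TLM, introduced in XLM): a parallel sentence pair is concatenated and tokens are masked on both sides, so the model can use the translation to recover a masked word. A minimal sketch of the example construction is below; the tokenization, transliterated Arabic, separator token, and deterministic seeding are all simplifications for illustration.

```python
import random

MASK = "[MASK]"

def make_tlm_example(src_tokens, tgt_tokens, mask_prob=0.15, seed=0):
    """TLM-style example construction: concatenate a parallel pair
    with a separator and randomly mask tokens in both languages.
    Masked positions carry the original token as the label; all
    other positions are ignored by the loss (None here)."""
    rng = random.Random(seed)
    tokens = src_tokens + ["[SEP]"] + tgt_tokens
    inputs, labels = [], []
    for tok in tokens:
        if tok != "[SEP]" and rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # model must predict the original token
        else:
            inputs.append(tok)
            labels.append(None)  # position ignored by the loss
    return inputs, labels
```

Because the English half of the pair stays (partially) visible, the model can ground a masked dialectal Arabic token in its translation, which is the mechanism the loss exploits.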
User Simulation with Large Language Models for Evaluating Task-Oriented Dialogue
One of the major impediments to the development of new task-oriented dialogue
(TOD) systems is the need for human evaluation at multiple stages and
iterations of the development process. In an effort to move toward automated
evaluation of TOD, we propose a novel user simulator built using recently
developed large pretrained language models (LLMs). In order to increase the
linguistic diversity of our system relative to the related previous work, we do
not fine-tune the LLMs used by our system on existing TOD datasets; rather we
use in-context learning to prompt the LLMs to generate robust and
linguistically diverse output with the goal of simulating the behavior of human
interlocutors. Unlike previous work, which sought to maximize goal success rate
(GSR) as the primary metric of simulator performance, our goal is a system
which achieves a GSR similar to that observed in human interactions with TOD
systems. Using this approach, our current simulator is effectively able to
interact with several TOD systems, especially on single-intent conversational
goals, while generating lexically and syntactically diverse output relative to
previous simulators that rely upon fine-tuned models. Finally, we collect a
Human2Bot dataset of humans interacting with the same TOD systems with which we
experimented in order to better quantify these achievements.
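The in-context learning approach above amounts to assembling a prompt from demonstration dialogues, the user's goal, and the conversation so far, then asking the LLM for the next user turn. A sketch of that prompt assembly is below; the template wording and function name are illustrative, not the prompt used in the paper.

```python
def build_simulator_prompt(goal, history, examples):
    """Assemble an in-context prompt asking an LLM to play the user
    in a task-oriented dialogue. `examples` is a list of
    (goal, dialogue_turns) demonstrations shown before the live
    conversation; no fine-tuning on TOD data is involved."""
    parts = ["You are a user talking to a task-oriented dialogue system."]
    for ex_goal, ex_turns in examples:
        parts.append(f"Goal: {ex_goal}")
        parts.extend(ex_turns)           # demonstration dialogue turns
    parts.append(f"Goal: {goal}")        # the live goal
    parts.extend(history)                # conversation so far
    parts.append("User:")                # cue the model's next user turn
    return "\n".join(parts)
```

Because the demonstrations are free-form dialogues rather than a fine-tuning corpus, swapping them changes the simulated user's behavior and phrasing without any retraining, which is what drives the lexical diversity the abstract reports.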
Conversation Style Transfer using Few-Shot Learning
Conventional text style transfer approaches for natural language focus on
sentence-level style transfer without considering contextual information, and
the style is described with attributes (e.g., formality). When applying style
transfer on conversations such as task-oriented dialogues, existing approaches
suffer from these limitations as context can play an important role and the
style attributes are often difficult to define in conversations. In this paper,
we introduce conversation style transfer as a few-shot learning problem, where
the model learns to perform style transfer by observing only the target-style
dialogue examples. We propose a novel in-context learning approach to solve the
task with style-free dialogues as a pivot. Human evaluation shows that by
incorporating multi-turn context, the model is able to match the target style
while having better appropriateness and semantic correctness compared to
utterance-level style transfer. Additionally, we show that conversation style
transfer can also benefit downstream tasks. Results on multi-domain intent
classification tasks show improvement in F1 scores after transferring the style
of training data to match the style of test data.
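The style-free pivot described above is a two-step prompting scheme: first rewrite the source utterance into a neutral, style-free form, then rewrite that neutral form into the target style using only target-style dialogue examples. The templates below are an illustrative sketch of that pipeline, not the paper's actual prompts.

```python
def neutralize_prompt(context, utterance):
    """Step 1: ask the model to strip style while preserving meaning,
    given the multi-turn conversation context (the pivot step)."""
    return (
        "Rewrite the last utterance in a plain, neutral style, "
        "preserving its meaning and the conversation context.\n"
        + "\n".join(context)
        + f"\nUtterance: {utterance}\nNeutral:"
    )

def stylize_prompt(target_examples, context, neutral_utterance):
    """Step 2: few-shot rewrite from the style-free pivot into the
    target style, which is shown only via example dialogue turns."""
    demos = "\n".join(target_examples)
    return (
        "Rewrite the neutral utterance in the style of these examples:\n"
        f"{demos}\n"
        + "\n".join(context)
        + f"\nNeutral: {neutral_utterance}\nStyled:"
    )
```

Passing the conversation context into both steps is what distinguishes this from utterance-level style transfer: the model can keep the rewrite appropriate to the surrounding turns, not just the single sentence.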
Results of the Numerical Solution of the Laplace Equation for Figures of Revolution with Shapes from −0.0025 to −0.2500
Search For Charged Higgs Decays of the Top Quark Using Hadronic Decays of the Tau Lepton
This Letter describes a direct search for charged Higgs boson production in
proton-antiproton collisions at sqrt(s)=1.8 TeV recorded by the Collider
Detector at Fermilab. Two-Higgs-doublet extensions to the standard model
predict the existence of charged Higgs bosons. In such models, the branching
fraction for top quarks B(t --> H+ b --> tau nu b) can be large. This search
uses the hadronic decays of the tau lepton in this channel to significantly
extend previous limits on charged Higgs production.