997 research outputs found
Document-Level Machine Translation with Large Language Models
Large language models (LLMs) such as Chat-GPT can produce coherent, cohesive,
relevant, and fluent answers for various natural language processing (NLP)
tasks. Taking document-level machine translation (MT) as a testbed, this paper
provides an in-depth evaluation of LLMs' ability on discourse modeling. The
study fo-cuses on three aspects: 1) Effects of Discourse-Aware Prompts, where
we investigate the impact of different prompts on document-level translation
quality and discourse phenomena; 2) Comparison of Translation Models, where we
compare the translation performance of Chat-GPT with commercial MT systems and
advanced document-level MT methods; 3) Analysis of Discourse Modelling
Abilities, where we further probe discourse knowledge encoded in LLMs and
examine the impact of training techniques on discourse modeling. By evaluating
a number of benchmarks, we surprisingly find that 1) leveraging their powerful
long-text mod-eling capabilities, ChatGPT outperforms commercial MT systems in
terms of human evaluation. 2) GPT-4 demonstrates a strong ability to explain
discourse knowledge, even through it may select incorrect translation
candidates in contrastive testing. 3) ChatGPT and GPT-4 have demonstrated
superior performance and show potential to become a new and promising paradigm
for document-level translation. This work highlights the challenges and
opportunities of discourse modeling for LLMs, which we hope can inspire the
future design and evaluation of LLMs
Usefulness, localizability, humanness, and language-benefit: additional evaluation criteria for natural language dialogue systems
Humanâcomputer dialogue systems interact with human users using natural language. We used the ALICE/AIML chatbot architecture as a platform to develop a range of chatbots covering different languages, genres, text-types, and user-groups, to illustrate qualitative aspects of natural language dialogue system evaluation. We present some of the different evaluation techniques used in natural language dialogue systems, including black box and glass box, comparative, quantitative, and qualitative evaluation. Four aspects of NLP dialogue system evaluation are often overlooked: âusefulnessâ in terms of a userâs qualitative needs, âlocalizabilityâ to new genres and languages, âhumannessâ or ânaturalnessâ compared to humanâhuman dialogues, and âlanguage benefitâ compared to alternative interfaces. We illustrated these aspects with respect to our work on machine-learnt chatbot dialogue systems; we believe these aspects are worthwhile in impressing potential new users and customers
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyzed five semantic processing tasks, e.g., word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missed in the published version due
to the publication policies. Please contact Prof. Erik Cambria for detail
Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania.
It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition.
Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue itâs easier than ever to do so: this document is accessible on the âinformation superhighwayâ. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html
In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authorsâ abstracts in the web version of this report.
The abstracts describe the researchersâ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn
Discourse-aware neural machine translation
Machine translation (MT) models usually translate a text by considering isolated sentences
based on a strict assumption that the sentences in a text are independent of one another.
However, it is a truism that texts have properties of connectedness that go beyond those of
their individual sentences. Disregarding dependencies across sentences will harm translation quality especially in terms of coherence, cohesion, and consistency. Previously,
some discourse-aware approaches have been investigated for conventional statistical machine translation (SMT). However, this is a serious obstacle for the state-of-the-art neural
machine translation (NMT), which recently has surpassed the performance of SMT.
In this thesis, we try to incorporate useful discourse information for enhancing NMT
models. More specifically, we conduct research on two main parts: 1) exploiting novel
document-level NMT architecture; and 2) dealing with a specific discourse phenomenon
for translation models.
Firstly, we investigate the influence of historical contextual information on the perfor-
mance of NMT models. A cross-sentence context-aware NMT model is proposed to consider the influence of previous sentences in the same document. Specifically, this history
is summarized using an additional hierarchical encoder. The historical representations are
then integrated into the standard NMT model in different strategies. Experimental results
on a ChineseâEnglish document-level translation task show that the approach significantly
improves upon a strong attention-based NMT system by up to +2.1 BLEU points. In addition, analysis and comparison also give insightful discussions and conclusions for this
research direction.
Secondly, we explore the impact of discourse phenomena on the performance of MT.
In this thesis, we focus on the phenomenon of pronoun-dropping (pro-drop), where, in pro-drop languages, pronouns can be omitted when it is possible to infer the referent from the
context. As the data for training a dropped pronoun (DP) generator is scarce, we propose to
automatically annotate DPs using alignment information from a large parallel corpus. We
then introduce a hybrid approach: building a neural-based DP generator and integrating it
into the SMT model. Experimental results on both ChineseâEnglish and JapaneseâEnglish
translation tasks demonstrate that our approach achieves a significant improvement of up to
+1.58 BLEU points with 66% F-score for DP generation accuracy.
Motivated by this promising result, we further exploit the DP translation approach for
advanced NMT models. A novel reconstruction-based model is proposed to reconstruct the
DP-annotated source sentence from the hidden states of either encoder or decoder, or both
components. Experimental results on the same translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT
baseline, which is trained on DP-annotated parallel data.
To avoid the errors propagated from an external DP prediction model, we finally investigate an end-to-end DP translation model. Specifically, we improve the reconstruction-based
model from three perspectives. We first employ a shared reconstructor to better exploit encoder and decoder representations. Secondly, we propose to jointly learn to translate and
predict DPs. In order to capture discourse information for DP prediction, we finally combine the hierarchical encoder with the DP translation model. Experimental results on the
same translation tasks show that our approach significantly improves both translation performance and DP prediction accuracy
Recommended from our members
Enabling Structured Navigation of Longform Spoken Dialog with Automatic Summarization
Longform spoken dialog is a rich source of information that is present in all facets of everyday life, taking the form of podcasts, debates, and interviews; these mediums contain important topics ranging from healthcare and diversity to current events, economics and politics. Individuals need to digest informative content to know how to vote, decide how to stay safe from COVID-19, and how to increase diversity in the workplace.
Unfortunately compared to text, spoken dialog can be challenging to consume as it is slower than reading and difficult to skim or navigate. Although an individual may be interested in a given topic, they may be unwilling to commit the required time necessary to consume long form auditory media given the uncertainty as to whether such content will live up to their expectations. Clearly, there exists a need to provide access to the information spoken dialog provides in a manner through which individuals can quickly and intuitively access areas of interest without investing large amounts of time.
From Human Computer Interaction, we apply the idea of information foraging, which theorizes how people browse and navigate to satisfy an information need, to the longform spoken dialog domain. Information foraging states that people do not browse linearly. Rather people âforageâ for information similar to how animals sniff around for food, scanning from area to area, constantly deciding whether to keep investigating their current area or to move on to greener pastures. This is an instance of the classic breadth vs. depth dilemma. People rely on perceived structure and information cues to make these decisions. Unfortunately speech, either spoken or transcribed, is unstructured and lacks information cues, making it difficult for users to browse and navigate.
We create a longform spoken dialog browsing system that utilizes automatic summarization and speech modeling to structure longform dialog to present information in a manner that is both intuitive and flexible towards different user browsing needs. Leveraging summarization models to automatically and hierarchically structure spoken dialog, the system is able to distill information into increasingly salient and abstract summaries, allowing for a tiered representation that, if interested, users can progressively explore. Additionally, we address spoken dialogâs own set of technical challenges to speech modeling that are not present in written text, such as disfluencies, improper punctuation, lack of annotated speech data, and inherent lack of structure.
We create a longform spoken dialog browsing system that utilizes automatic summarization and speech modeling to structure longform dialog to present information in a manner that is both intuitive and flexible towards different user browsing needs. Leveraging summarization models to automatically and hierarchically structure spoken dialog, the system is able to distill information into increasingly salient and abstract summaries, allowing for a tiered representation that, if interested, users can progressively explore. Additionally, we address spoken dialogâs own set of technical challenges to speech modeling that are not present in written text, such as disfluencies, improper punctuation, lack of annotated speech data, and inherent lack of structure. Since summarization is a lossy compression of information, the system provides users with information cues to signal how much additional information is contained on a topic.
This thesis makes the following contributions:
1. We applied the HCI concept of information foraging to longform speech, enabling people to browse and navigate information in podcasts, interviews, panels, and meetings.
2. We created a system that structures longform dialog into hierarchical summaries which help users to 1) skim (browse) audio and 2) navigate and drill down into interesting sections to read full details.
3. We created a human annotated hierarchical dataset to quantitatively evaluate the effectiveness of our systemâs hierarchical text generation performance.
4. Lastly, we developed a suite of dialog oriented processing optimizations to improve the user experience of summaries: enhanced readability and fluency of short summaries through better topic chunking and pronoun imputation, and reliable indication of semantic coverage within short summaries to help direct navigation towards interesting information.
We discuss future research in extending the browsing and navigating system to more challenging domains such as lectures, which contain many external references, or workplace conversations, which contain uncontextualized background information and are far less structured than podcasts and interviews
- âŠ