1,546 research outputs found
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).
Semantic chunking
Long sentences pose a challenge for natural language processing (NLP) applications. They are associated with a complex information structure leading to increased requirements for processing resources. Although the issue is present in many areas of research, there is little uniformity in the solutions used by research communities dedicated to individual NLP applications. Different aspects of the problem are addressed by different tasks, such as sentence simplification or shallow chunking.
The main contribution of this thesis is the introduction of the task of semantic chunking as a general approach to reducing the cost of processing long sentences. The goal of semantic chunking is to find semantically contained fragments of a sentence representation that can be processed independently and recombined without loss of information. We anchor its principles in established concepts of semantic theory, in particular event and situation semantics. Most of the experiments in this thesis focus on semantic chunking defined on complex semantic representations in Dependency Minimal Recursion Semantics (DMRS), but we also demonstrate that the task can be performed on sentence strings. We present three chunking models: a) a rule-based proof-of-concept DMRS chunking system; b) a semi-supervised sequence labelling neural model for surface semantic chunking; c) a system capable of finding semantic chunk boundaries based on the inherent structure of DMRS graphs, generalisable in the form of descriptive templates. We show how semantic chunking can be applied within a divide-and-conquer processing paradigm, using as an example the task of realization from DMRS. The application of semantic chunking yields noticeable efficiency gains without decreasing the quality of results.
Maximum Entropy Models For Natural Language Ambiguity Resolution
This thesis demonstrates that several important kinds of natural language ambiguities can be resolved to state-of-the-art accuracies using a single statistical modeling technique based on the principle of maximum entropy.
We discuss the problems of sentence boundary detection, part-of-speech tagging, prepositional phrase attachment, natural language parsing, and text categorization under the maximum entropy framework. In practice, we have found that maximum entropy models offer the following advantages:
State-of-the-art Accuracy: The probability models for all of the tasks discussed perform at or near state-of-the-art accuracies, or outperform competing learning algorithms when trained and tested under similar conditions. Methods which outperform those presented here require much more supervision in the form of additional human involvement or additional supporting resources.
Knowledge-Poor Features: The facts used to model the data, or features, are linguistically very simple, or knowledge-poor, yet they succeed in approximating complex linguistic relationships.
Reusable Software Technology: The mathematics of the maximum entropy framework are essentially independent of any particular task, and a single software implementation can be used for all of the probability models in this thesis.
The experiments in this thesis suggest that experimenters can obtain state-of-the-art accuracies on a wide range of natural language tasks, with little task-specific effort, by using maximum entropy probability models.
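For orientation, the textbook form of a conditional maximum entropy model can be written as below. This is a general sketch, not a formula taken from the abstract: the symbols (outcome a, context b, binary features f_j, weights \lambda_j, normaliser Z) are notation assumed here for illustration.

```latex
p(a \mid b) = \frac{1}{Z(b)} \exp\Big( \sum_{j} \lambda_j f_j(a, b) \Big),
\qquad
Z(b) = \sum_{a'} \exp\Big( \sum_{j} \lambda_j f_j(a', b) \Big)
```

Only the feature functions f_j change from task to task (for example, tagging versus attachment decisions), which is consistent with the abstract's claim that a single implementation can serve all of the models.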
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics
Exploratory Search on Mobile Devices
The goal of this thesis is to provide a general framework (MobEx) for exploratory search, especially on mobile devices. The central part is the design, implementation, and evaluation of several core modules for on-demand unsupervised information extraction that are well suited for exploratory search on mobile devices and that together form the MobEx framework. These core processing elements, combined with a multitouch-able user interface specially designed for two families of mobile devices, i.e. smartphones and tablets, have been implemented in a research prototype. The initial information request, in the form of a query topic description, is issued online by a user to the system. The system then retrieves web snippets by using standard search engines. These snippets are passed through a chain of NLP components which perform on-demand or ad-hoc interactive Query Disambiguation, Named Entity Recognition, and Relation Extraction tasks. By on-demand or ad-hoc we mean that the components are capable of performing their operations on an unrestricted open domain within special time constraints. The result of the whole process is a topic graph containing the detected associated topics as nodes and the extracted relationships as labelled edges between the nodes. The topic graph is presented to the user in different ways depending on the size of the device she is using. Various evaluations have been conducted that help us to understand the potential and limitations of the framework and the prototype.
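The topic graph described above (topics as nodes, extracted relationships as labelled edges) can be illustrated with a minimal Python sketch. The class name `TopicGraph`, its methods, and the sample topics are hypothetical and chosen only for illustration; they are not taken from the MobEx implementation.

```python
from collections import defaultdict


class TopicGraph:
    """Hypothetical sketch: topics are nodes, relations are labelled directed edges."""

    def __init__(self):
        # Maps each source topic to a dict of {target topic: relation label}.
        self._edges = defaultdict(dict)

    def add_relation(self, source, target, label):
        """Record a labelled edge and make sure both topics exist as nodes."""
        self._edges[source][target] = label
        self._edges.setdefault(target, {})

    def topics(self):
        """Return all topics (nodes) in alphabetical order."""
        return sorted(self._edges)

    def relations(self, topic):
        """Return (target, label) pairs for edges leaving the given topic."""
        return sorted(self._edges.get(topic, {}).items())


if __name__ == "__main__":
    graph = TopicGraph()
    # Sample data invented for the example, standing in for extracted relations.
    graph.add_relation("Marie Curie", "Nobel Prize", "won")
    graph.add_relation("Marie Curie", "Warsaw", "born in")
    for topic in graph.topics():
        print(topic, graph.relations(topic))
```

An adjacency map of this kind keeps lookups per topic cheap, which matters if a small-screen view shows only one topic and its neighbours at a time; that design motivation is an assumption rather than something the abstract states.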
Towards Usable End-user Authentication
Authentication is the process of validating the identity of an entity, e.g., a person, a machine, etc.; the entity usually provides a proof of identity in order to be authenticated. When the entity to be authenticated is a human, the authentication process is called end-user authentication. Making end-user authentication usable entails making it easy for a human to obtain, manage, and input the proof of identity in a secure manner. In machine-to-machine authentication, both ends have comparable memory and computational power to securely carry out the authentication process using cryptographic primitives and protocols. By contrast, as a human has limited memory and computational power, cryptography is of little use in end-user authentication. Although password-based end-user authentication has many well-known security and usability problems, it is the de facto standard. Almost half a century of research effort has produced a multitude of end-user authentication methods more sophisticated than passwords; yet, none has come close to replacing passwords. In this dissertation, taking advantage of the built-in sensing capability of smartphones, we propose an end-user authentication framework for smartphones, called ePet, which does not require any active participation from the user most of the time; thus the proposed framework is highly usable. Using data collected from subjects, we validate a part of the authentication framework for the Android platform. For web authentication, we propose a novel password creation interface, which helps a user remember a newly created password with more confidence by allowing her to perform various memory tasks built upon her new password. Declarative and motor memory help the user remember and efficiently input a password.
From a within-subjects study we show that declarative memory is sufficient for passwords, while motor memory mostly facilitates the input process; the memory tasks have therefore been designed to help cement the declarative memory for a newly created password. This dissertation concludes with an evaluation of the increased usability of the proposed interface through a between-subjects study.
MEGA: Multilingual Evaluation of Generative AI
Generative AI models have shown impressive performance on many Natural
Language Processing tasks such as language understanding, reasoning, and
language generation. An important question being asked by the AI community
today is about the capabilities and limits of these models, and it is clear
that evaluating generative AI is very challenging. Most studies on generative
LLMs have been restricted to English and it is unclear how capable these models
are at understanding and generating text in other languages. We present the
first comprehensive benchmarking of generative LLMs - MEGA, which evaluates
models on standard NLP benchmarks, covering 16 NLP datasets across 70
typologically diverse languages. We compare the performance of generative LLMs
including Chat-GPT and GPT-4 to State of the Art (SOTA) non-autoregressive
models on these tasks to determine how well generative models perform compared
to the previous generation of LLMs. We present a thorough analysis of the
performance of models across languages and tasks and discuss challenges in
improving the performance of generative LLMs on low-resource languages. We
create a framework for evaluating generative LLMs in the multilingual setting
and provide directions for future progress in the field. Comment: EMNLP 202
ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change
This paper introduces ClimateGPT, a model family of domain-specific large
language models that synthesize interdisciplinary research on climate change.
We trained two 7B models from scratch on a science-oriented dataset of 300B
tokens. For the first model, the 4.2B domain-specific tokens were included
during pre-training and the second was adapted to the climate domain after
pre-training. Additionally, ClimateGPT-7B, 13B and 70B are continuously
pre-trained from Llama 2 on a domain-specific dataset of 4.2B tokens. Each
model is instruction fine-tuned on a high-quality and human-generated
domain-specific dataset that has been created in close cooperation with climate
scientists. To reduce the number of hallucinations, we optimize the model for
retrieval augmentation and propose a hierarchical retrieval strategy. To
increase the accessibility of our model to non-English speakers, we propose to
make use of cascaded machine translation and show that this approach can
perform comparably to natively multilingual models while being easier to scale
to a large number of languages. Further, to address the intrinsic
interdisciplinary aspect of climate change we consider different research
perspectives. Therefore, the model can produce in-depth answers focusing on
different perspectives in addition to an overall answer. We propose a suite of
automatic climate-specific benchmarks to evaluate LLMs. On these benchmarks,
ClimateGPT-7B performs on par with the ten times larger Llama-2-70B Chat model
while not degrading results on general domain benchmarks. Our human evaluation
confirms the trends we saw in our benchmarks. All models were trained and
evaluated using renewable energy and are released publicly.