Comparing knowledge sources for nominal anaphora resolution
We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora
and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links
encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora
by means of shallow lexico-semantic patterns. As corpora we use the British National
Corpus (BNC), as well as the Web, which has not been previously used for this task. Our
results show that (a) the knowledge encoded in WordNet is often insufficient, especially for
anaphor-antecedent relations that exploit subjective or context-dependent knowledge; (b) for
other-anaphora, the Web-based method outperforms the WordNet-based method; (c) for definite
NP coreference, the Web-based method yields results comparable to those obtained using
WordNet over the whole dataset and outperforms the WordNet-based method on subsets of the
dataset; (d) in both case studies, the BNC-based method is worse than the other methods because
of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge
gap often encountered in anaphora resolution, and handled examples with context-dependent relations
between anaphor and antecedent. Because it is inexpensive and needs no hand-modelling
of lexical knowledge, it is a promising knowledge source to integrate into anaphora resolution systems.
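The shallow lexico-semantic pattern method described above lends itself to a compact sketch. The pattern template, the naive "+s" pluralization, and the toy count table below are illustrative assumptions, not the authors' exact implementation:

```python
def instantiate_pattern(candidate: str, anaphor_head: str) -> str:
    """Build a lexico-semantic query string such as 'dogs and other animals'."""
    return f"{candidate}s and other {anaphor_head}s"

def rank_antecedents(candidates, anaphor_head, hit_count):
    """Rank candidate antecedents by pattern frequency, highest first.
    `hit_count` stands in for a BNC or Web frequency lookup."""
    scored = sorted(((hit_count(instantiate_pattern(c, anaphor_head)), c)
                     for c in candidates), reverse=True)
    return [c for _, c in scored]

# Toy frequency table standing in for corpus or Web hit counts (made up).
toy_counts = {
    "dogs and other animals": 120000,
    "tables and other animals": 3,
}
ranking = rank_antecedents(["table", "dog"], "animal",
                           lambda q: toy_counts.get(q, 0))
print(ranking)  # candidate with the higher pattern count comes first
```

The key design point is that the lexical relation is never modelled explicitly: frequency of the instantiated pattern in a large corpus serves as a proxy for it, which is why the method can also capture context-dependent relations absent from WordNet.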
Crowdsourcing Multiple Choice Science Questions
We present a novel method for obtaining high-quality, domain-targeted
multiple choice questions from crowd workers. Generating these questions can be
difficult without trading away originality, relevance or diversity in the
answer options. Our method addresses these problems by leveraging a large
corpus of domain-specific text and a small set of existing questions. It
produces model suggestions for document selection and answer distractor choice
which aid the human question generation process. With this method we have
assembled SciQ, a dataset of 13.7K multiple choice science exam questions
(Dataset available at http://allenai.org/data.html). We demonstrate that the
method produces in-domain questions by providing an analysis of this new
dataset and by showing that humans cannot distinguish the crowdsourced
questions from original questions. When using SciQ as additional training data
alongside existing questions, we observe accuracy improvements on real science exams.
Comment: accepted for the Workshop on Noisy User-generated Text (W-NUT) 201
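The paper does not spell out here how its "model suggestions for answer distractor choice" are computed; a hedged sketch of one common approach is nearest-neighbour selection over word vectors, where terms close to the correct answer make plausible-but-wrong options. The 2-d toy vectors and the similarity criterion are assumptions for illustration, not the paper's actual model:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def suggest_distractors(answer, vectors, k=3):
    """Return the k terms most similar to the answer, excluding the answer."""
    scored = sorted(((cosine(vectors[answer], vectors[w]), w)
                     for w in vectors if w != answer), reverse=True)
    return [w for _, w in scored[:k]]

# Toy 2-d embeddings (made up) for a biology question.
vecs = {"mitosis": [0.9, 0.1], "meiosis": [0.85, 0.2],
        "osmosis": [0.6, 0.4], "gravity": [0.0, 1.0]}
print(suggest_distractors("mitosis", vecs, k=2))  # the two nearest terms
```

In practice the candidate pool would come from the domain-specific corpus the abstract mentions, so suggestions stay in-domain.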
Boundaries of Semantic Distraction: Dominance and Lexicality Act at Retrieval
Three experiments investigated memory for semantic information with the goal of determining boundary conditions for the manifestation of semantic auditory distraction. Irrelevant speech disrupted the free recall of semantic category-exemplars to an equal degree regardless of whether the speech coincided with presentation or test phases of the task (Experiment 1) and occurred regardless of whether it comprised random words or coherent sentences (Experiment 2). The effects of background speech were greater when the irrelevant speech was semantically related to the to-be-remembered material, but only when the irrelevant words were high in output dominance (Experiment 3). The implications of these findings in relation to the processing of task material and the processing of background speech are discussed.
Co-Attention Hierarchical Network: Generating Coherent Long Distractors for Reading Comprehension
In reading comprehension, generating sentence-level distractors is a
significant task, which requires a deep understanding of the article and
question. The traditional entity-centered methods can only generate word-level
or phrase-level distractors. Although recently proposed neural-based methods
like sequence-to-sequence (Seq2Seq) model show great potential in generating
creative text, the previous neural methods for distractor generation ignore two
important aspects. First, they do not model the interactions between the
article and the question, so the generated distractors tend to be too general
or irrelevant to the question context. Second, they do not emphasize the
relationship between the distractor and the article, so the generated
distractors are not semantically relevant to the article and thus fail to form
a set of meaningful options. To solve the first problem, we propose a
co-attention-enhanced hierarchical architecture that better captures the
interactions between the article and question, thus guiding the decoder to
generate more coherent distractors. To alleviate the second problem, we add an
additional semantic similarity loss to push the generated distractors more
relevant to the article. Experimental results show that our model outperforms
several strong baselines on automatic metrics, achieving state-of-the-art
performance. Further human evaluation indicates that our generated distractors
are more coherent and more educative compared with those distractors generated
by baselines.
Comment: 8 pages, 3 figures. Accepted by AAAI202
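The "additional semantic similarity loss" admits a minimal sketch: the usual generation loss is augmented with a penalty that grows as the distractor representation drifts away from the article representation. The cosine formulation and the weight `lam` below are illustrative assumptions, not the paper's exact objective:

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def total_loss(generation_nll, distractor_vec, article_vec, lam=0.5):
    """Generation loss plus lam * (1 - cosine similarity):
    identical representations add no penalty, dissimilar ones add up to lam."""
    return generation_nll + lam * (1.0 - cos_sim(distractor_vec, article_vec))

print(total_loss(2.0, [1.0, 0.0], [1.0, 0.0]))  # aligned: 2.0, no penalty
print(total_loss(2.0, [1.0, 0.0], [0.0, 1.0]))  # orthogonal: 2.5
```

Because the penalty term is differentiable in the representations, it can be minimized jointly with the Seq2Seq loss, pushing generated distractors toward the article's semantics.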
InDEX: Indonesian Idiom and Expression Dataset for Cloze Test
We propose InDEX, an Indonesian Idiom and Expression dataset for cloze test.
The dataset contains 10438 unique sentences for 289 idioms and expressions for
which we generate 15 different types of distractors, resulting in a large
cloze-style corpus. Many baseline models for cloze-test reading comprehension
apply BERT with random initialization to learn embedding representations. But
idioms and fixed expressions differ in that the literal meaning of a phrase
may or may not be consistent with its contextual meaning. Therefore, we
explore different ways to combine static and contextual representations for a
stronger baseline model. Experiments show that combining definition and random
initialization better supports cloze-test model performance for idioms,
whether they occur independently or mixed with fixed expressions, while for
fixed expressions with no special meaning, a static embedding with random
initialization is sufficient for the cloze-test model.
Comment: Accepted to "2022 International Conference on Asian Language
Processing (IALP)"
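Combining static and contextual representations can be sketched with two simple fusion schemes, concatenation and element-wise averaging. The scheme names and vector values are illustrative; the paper's actual combinations (e.g. with definition embeddings) are more elaborate:

```python
def combine(static_vec, contextual_vec, mode="concat"):
    """Fuse a static phrase embedding with a contextual (BERT-style) one."""
    if mode == "concat":
        return static_vec + contextual_vec          # doubles dimensionality
    if mode == "avg":                               # requires equal lengths
        return [(s + c) / 2 for s, c in zip(static_vec, contextual_vec)]
    raise ValueError(f"unknown mode: {mode}")

print(combine([1.0, 2.0], [3.0, 4.0]))           # [1.0, 2.0, 3.0, 4.0]
print(combine([1.0, 2.0], [3.0, 4.0], "avg"))    # [2.0, 3.0]
```

Concatenation preserves both views at the cost of a wider input layer, while averaging keeps the dimensionality fixed but can blur the idiom's non-literal signal; which fusion wins is an empirical question like the one the abstract investigates.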
Automatic Distractor Generation for Multiple Choice Questions in Standard Tests
To assess the knowledge proficiency of a learner, the multiple choice question
is an efficient and widespread form in standard tests. However, composing a
multiple choice question, especially constructing its distractors, is quite
challenging. The distractors are required to be both incorrect and plausible
enough to confuse learners who have not mastered the knowledge. Currently,
distractors are written by domain experts, which is both expensive and
time-consuming. This motivates automatic distractor generation, which can
benefit various standard tests in a wide range of domains. In this
paper, we propose a question and answer guided distractor generation (EDGE)
framework to automate distractor generation. EDGE consists of three major
modules: (1) the Reforming Question Module and (2) the Reforming Passage
Module apply gate layers to guarantee the inherent incorrectness of the
generated distractors, while (3) the Distractor Generator Module applies an
attention mechanism to control the level of plausibility. Experimental results
on a large-scale public
dataset demonstrate that our model significantly outperforms existing models
and achieves a new state-of-the-art.
Comment: accepted by COLING202
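A gate layer of the kind the Reforming modules are said to apply can be sketched as a sigmoid gate, computed from the hidden state and a control vector, that scales the hidden state and so can suppress (gate toward zero) content that would make a distractor correct. The scalar-gate parameterization here is an illustrative assumption, not EDGE's exact layer:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate(hidden, control, w_h, w_c, b):
    """Scale `hidden` by g = sigmoid(w_h . hidden + w_c . control + b),
    where `control` might encode the question or the correct answer."""
    g = sigmoid(sum(a * h for a, h in zip(w_h, hidden)) +
                sum(a * c for a, c in zip(w_c, control)) + b)
    return [g * h for h in hidden]

# With zero weights and zero bias the gate is 0.5, halving the hidden state;
# a strongly negative pre-activation would drive the gate toward 0,
# effectively erasing answer-revealing content.
print(gate([1.0, 2.0], [0.0], [0.0, 0.0], [0.0], 0.0))  # [0.5, 1.0]
```

In a trained model the weights would be learned so the gate closes exactly when the hidden content overlaps with the correct answer.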