A Mention-Ranking Model for Abstract Anaphora Resolution
Resolving abstract anaphora is an important but difficult task for text
understanding. Yet, with recent advances in representation learning this task
becomes a more tangible aim. A central property of abstract anaphora is that it
establishes a relation between the anaphor embedded in the anaphoric sentence
and its (typically non-nominal) antecedent. We propose a mention-ranking model
that learns how abstract anaphors relate to their antecedents with an
LSTM-Siamese Net. We overcome the lack of training data by generating
artificial anaphoric sentence--antecedent pairs. Our model outperforms
state-of-the-art results on shell noun resolution. We also report first
benchmark results on an abstract anaphora subset of the ARRAU corpus. This
corpus presents a greater challenge due to a mixture of nominal and pronominal
anaphors and a greater range of confounders. We found model variants that
outperform the baselines for nominal anaphors, without training on individual
anaphor data, but still lag behind for pronominal anaphors. Our model selects
syntactically plausible candidates and -- if disregarding syntax --
discriminates candidates using deeper features.
Comment: In Proceedings of the 2017 Conference on Empirical Methods in Natural
Language Processing (EMNLP). Copenhagen, Denmark
Recipe instruction semantics corpus (RISeC): resolving semantic structure and zero anaphora in recipes
We propose a newly annotated dataset for information extraction on recipes. Unlike previous approaches to machine comprehension of procedural texts, we avoid pre-defining domain-specific predicates to recognize (e.g., the primitive instructions in MILK) and focus on basic understanding of the expressed semantics rather than directly reducing them to a simplified state representation (e.g., ProPara). We thus frame the semantic comprehension of procedural text, such as recipes, as fairly generic NLP subtasks, covering (i) entity recognition (ingredients, tools, and actions), (ii) relation extraction (which ingredients and tools are involved in the actions), and (iii) zero anaphora resolution (linking actions to implicit arguments, e.g., results from previous recipe steps). Further, our Recipe Instruction Semantic Corpus (RISeC) dataset includes textual descriptions for the zero anaphora, to facilitate language generation thereof. Besides the dataset itself, we contribute a pipeline neural architecture that addresses entity and relation extraction as well as identification of zero anaphora. These basic building blocks can facilitate more advanced downstream applications (e.g., question answering, conversational agents).