German Perception Verbs: Automatic Classification of Prototypical and Multiple Non-literal Meanings
This paper presents a token-based automatic classification of German perception verbs into literal vs. multiple non-literal senses. Based on a corpus-based dataset of German perception verbs and their systematic meaning shifts, we identify one verb for each of the four perception classes (optical, acoustic, olfactory, and haptic) and use Decision Trees relying on syntactic and semantic corpus-based features to classify the verb uses into 3-4 senses each. Our classifier reaches accuracies between 45.5% and 69.4%, compared to baselines between 27.5% and 39.0%. In three of the four cases analyzed, our classifier’s accuracy is significantly higher than the corresponding baseline.
Learning Sentence-internal Temporal Relations
In this paper we propose a data intensive approach for inferring
sentence-internal temporal relations. Temporal inference is relevant for
practical NLP applications which either extract or synthesize temporal
information (e.g., summarisation, question answering). Our method bypasses the
need for manual coding by exploiting the presence of markers like "after", which
overtly signal a temporal relation. We first show that models trained on main
and subordinate clauses connected with a temporal marker achieve good
performance on a pseudo-disambiguation task simulating temporal inference
(during testing the temporal marker is treated as unseen and the models must
select the right marker from a set of possible candidates). Secondly, we assess
whether the proposed approach holds promise for the semi-automatic creation of
temporal annotations. Specifically, we use a model trained on noisy and
approximate data (i.e., main and subordinate clauses) to predict
intra-sentential relations present in TimeBank, a corpus annotated with rich
temporal information. Our experiments compare and contrast several
probabilistic models differing in their feature space, linguistic assumptions
and data requirements. We evaluate performance against gold standard corpora
and also against human subjects.
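
As an illustration of the pseudo-disambiguation task described in this abstract, the following minimal sketch hides the temporal marker and asks a model to recover it from the two clauses. It is not the authors' implementation: the toy sentences, the bag-of-words features, the function names, and the choice of a naive Bayes model with add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (main clause, subordinate clause, temporal marker).
TRAIN = [
    ("she left the room", "the meeting ended", "after"),
    ("he called home", "the plane landed", "after"),
    ("they locked the door", "the guests left", "after"),
    ("she read the report", "the meeting started", "before"),
    ("he packed his bags", "the plane departed", "before"),
    ("they cooked dinner", "the guests arrived", "before"),
]


def features(main, sub):
    """Bag-of-words features, kept separate for main and subordinate clause."""
    return [f"main={w}" for w in main.split()] + [f"sub={w}" for w in sub.split()]


def train(examples):
    """Count how often each marker and each (marker, feature) pair occurs."""
    marker_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for main, sub, marker in examples:
        marker_counts[marker] += 1
        for feat in features(main, sub):
            feature_counts[marker][feat] += 1
            vocab.add(feat)
    return marker_counts, feature_counts, vocab


def predict(model, main, sub):
    """Pick the candidate marker with the highest naive Bayes log score."""
    marker_counts, feature_counts, vocab = model
    total = sum(marker_counts.values())
    best_marker, best_score = None, -math.inf
    for marker, count in marker_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[marker].values()) + len(vocab)
        for feat in features(main, sub):
            if feat in vocab:  # features never seen in training are skipped
                score += math.log((feature_counts[marker][feat] + 1) / denom)
        if score > best_score:
            best_marker, best_score = marker, score
    return best_marker


model = train(TRAIN)
# At test time the marker is withheld and must be chosen from the candidates.
print(predict(model, "she left the building", "the meeting ended"))
```

Keeping main-clause and subordinate-clause words as distinct features lets the sketch capture which clause a word appears in, one of the simplest feature spaces a model of this kind could use.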
Automatic Identification of AltLexes using Monolingual Parallel Corpora
The automatic identification of discourse relations is still a challenging
task in natural language processing. Discourse connectives, such as "since" or
"but", are the most informative cues to identify explicit relations; however,
discourse parsers typically use a closed inventory of such connectives. As a
result, discourse relations signaled by markers outside these inventories (i.e.
AltLexes) are not detected as effectively. In this paper, we propose a novel
method to leverage parallel corpora in text simplification and lexical
resources to automatically identify alternative lexicalizations that signal
discourse relations. When applied to the Simple Wikipedia and Newsela corpora
along with WordNet and the PPDB, the method allowed the automatic discovery of
91 AltLexes.
Comment: 6 pages, Proceedings of Recent Advances in Natural Language
Processing (RANLP 2017)