Unsupervised Sense-Aware Hypernymy Extraction

Author: Biemann Chris
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication date: 17/09/2018

In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction. We present a method for extracting disambiguated hypernymy relationships that propagates hypernyms to sets of synonyms (synsets), constructs embeddings for these sets, and establishes sense-aware relationships between matching synsets. Evaluation on two gold standard datasets for English and Russian shows that the method successfully recognizes hypernymy relationships that cannot be found with standard Hearst patterns and Wiktionary datasets for the respective languages.

Comment: In Proceedings of the 14th Conference on Natural Language Processing (KONVENS 2018). Vienna, Austria

arXiv.org e-Print Archive

Fighting with the Sparsity of Synonymy Dictionaries

Author: A Panchenko
C Fellbaum
D Hope
DJ Herrmann
H Gonçalo Oliveira
NV Loukachevitch
R Navigli
S Lappin
XM Zeng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/08/2017

Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of the sparsity of the synonymy dictionaries.

Comment: In Proceedings of the 6th Conference on Analysis of Images, Social Networks, and Texts (AIST'2017): Springer Lecture Notes in Computer Science (LNCS

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

MAnnheim DOCument Server

Unsupervised Semantic Frame Induction using Triclustering

Author: Biemann Chris
Kutuzov Andrei
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2018

We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction. We cast the frame induction problem as a triclustering problem, that is, a generalization of clustering for triadic data. Our replicable benchmarks demonstrate that the proposed graph-based approach, Triframes, shows state-of-the-art results on this task on a FrameNet-derived dataset and performs on par with competitive methods on a verb class clustering task.

Comment: 8 pages, 1 figure, 4 tables, accepted at ACL 201

arXiv.org e-Print Archive

Crossref

Watset: automatic induction of synsets from a graph of synonyms

Author: Biemann Chris
Panchenko Alexander
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clustering approach lets us use an efficient hard clustering algorithm to perform a fuzzy clustering of the graph. Despite its simplicity, our approach shows excellent results, outperforming five competitive state-of-the-art methods in terms of F-score on three gold standard datasets for English and Russian derived from large-scale manually constructed lexical resources.

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

MAnnheim DOCument Server
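
The three steps named in the Watset abstract above (sense induction, disambiguation of the graph, clustering of the disambiguated graph) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an unweighted graph without self-loops, uses connected components as a stand-in for the hard clustering algorithm, and disambiguates by context membership rather than by a similarity measure. The names `components`, `induce_synsets`, and `EXAMPLE_EDGES` are hypothetical.

```python
from collections import defaultdict


def components(nodes, edges):
    """Stand-in hard clustering: connected components of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(comp)
    return clusters


def induce_synsets(edges):
    """Toy meta-clustering over (word, word) synonymy pairs (assumes no self-loops)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Step 1: sense induction. Cluster each word's neighbourhood (the word
    # itself excluded); every resulting cluster is the context of one sense.
    senses = {}
    for word, nbrs in adj.items():
        ego_edges = [(a, b) for a in nbrs for b in adj[a] if b in nbrs and a < b]
        senses[word] = components(sorted(nbrs), ego_edges)

    # Step 2: disambiguation. Map each endpoint of an edge to the sense
    # whose context contains the other endpoint.
    def sense_of(word, neighbour):
        for index, context in enumerate(senses[word]):
            if neighbour in context:
                return (word, index)

    sense_edges = [(sense_of(u, v), sense_of(v, u)) for u, v in edges]
    sense_nodes = sorted((w, i) for w in senses for i in range(len(senses[w])))

    # Step 3: hard-cluster the sense graph, then drop the sense indices.
    # A word with several senses can now appear in several synsets.
    return [frozenset(w for w, _ in comp)
            for comp in components(sense_nodes, sense_edges)]


EXAMPLE_EDGES = [
    ("bank", "streambank"), ("bank", "riverbank"), ("streambank", "riverbank"),
    ("bank", "building"), ("bank", "bank building"), ("building", "bank building"),
]
```

On `EXAMPLE_EDGES`, the word "bank" receives two senses, so it ends up in two different synsets even though the clustering applied at each stage is a hard one.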

Unsupervised semantic frame induction using triclustering

Author: Biemann Chris
Kutuzov Andrei
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2018

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

NORA - Norwegian Open Research Archives

Mining Entity Synonyms with Efficient Neural Set Generation

Author: Han Jiawei
Lyu Ruiliang
Ren Xiang
Sadler Brian
Shen Jiaming
Vanni Michelle
Publication date: 16/11/2018

Mining entity synonym sets (i.e., sets of terms referring to the same entity) is an important task for many entity-leveraging applications. Previous work either ranks terms based on their similarity to a given query term, or treats the problem as a two-phase task (i.e., detecting synonymy pairs, followed by organizing these pairs into synonym sets). However, these approaches fail to model the holistic semantics of a set and suffer from the error propagation issue. Here we propose a new framework, named SynSetMine, that efficiently generates entity synonym sets from a given vocabulary, using example sets from external knowledge bases as distant supervision. SynSetMine consists of two novel modules: (1) a set-instance classifier that jointly learns how to represent a permutation invariant synonym set and whether to include a new instance (i.e., a term) into the set, and (2) a set generation algorithm that enumerates the vocabulary only once and applies the learned set-instance classifier to detect all entity synonym sets in it. Experiments on three real datasets from different domains demonstrate both effectiveness and efficiency of SynSetMine for mining entity synonym sets.

Comment: AAAI 2019 camera-ready version

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
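
The single-pass set generation idea in the SynSetMine abstract above can be sketched as a greedy loop. This is a hedged illustration, not the published algorithm: the learned set-instance classifier is replaced by a hypothetical hand-written scorer (`toy_fit`), and the threshold rule and the names `generate_sets` and `normalise` are assumptions made for the example.

```python
def normalise(term):
    """Hypothetical normalisation: lowercase and keep only letters and digits."""
    return "".join(ch for ch in term.lower() if ch.isalnum())


def toy_fit(synonym_set, term):
    """Stand-in for a learned set-instance classifier.

    Returns 1.0 if the term normalises to the same string as any member
    of the set, else 0.0. A trained model would return a probability.
    """
    key = normalise(term)
    return 1.0 if any(normalise(member) == key for member in synonym_set) else 0.0


def generate_sets(vocabulary, fit=toy_fit, threshold=0.5):
    """Visit every term exactly once and place it in the best-scoring set.

    A term whose best score does not exceed the threshold starts a new set.
    """
    sets = []
    for term in vocabulary:
        best, best_score = None, threshold
        for candidate in sets:
            score = fit(candidate, term)
            if score > best_score:
                best, best_score = candidate, score
        if best is None:
            sets.append([term])
        else:
            best.append(term)
    return sets
```

Because each term is compared against whole sets rather than against individual terms, the scorer can in principle use set-level information, which is the property the abstract contrasts with pairwise approaches.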

Unsupervised sense-aware hypernymy extraction

Author: Biemann Chris
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: ÖGAI, Österreichische Akademie der Wissenschaften
Publication date: 01/01/2018

In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction. We present a method for extracting disambiguated hypernymy relationships that propagates hypernyms to sets of synonyms (synsets), constructs embeddings for these sets, and establishes sense-aware relationships between matching synsets. Evaluation on two gold standard datasets for English and Russian shows that the method successfully recognizes hypernymy relationships that cannot be found with standard Hearst patterns and Wiktionary datasets for the respective languages.

arXiv.org e-Print Archive

MAnnheim DOCument Server

Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation

Author: Biemann Chris
Faralli Stefano
Marten Fide
Panchenko Alexander
Ponzetto Simone Paolo
Ruppert Eugen
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Interpretability of a predictive model is a powerful feature that gains the trust of users in the correctness of the predictions. In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images. We present a WSD system that bridges the gap between these two so far disconnected groups of methods. Namely, our system, providing access to several state-of-the-art WSD models, aims to be interpretable as a knowledge-based system while it remains completely unsupervised and knowledge-free. The presented tool features a Web interface for all-word disambiguation of texts that makes the sense predictions human readable by providing interpretable word sense inventories, sense representations, and disambiguation results. We provide a public API, enabling seamless integration.

Comment: In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). 2017. Copenhagen, Denmark. Association for Computational Linguistics

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

MAnnheim DOCument Server

Semantic frame induction as a community detection problem

Author: Charles J. Fillmore
D Das
DM Blei
J Lang
M Marcus
Publication venue: Springer
Publication date: 01/01/2020

Resources such as FrameNet provide semantic information that is important for multiple tasks. However, they are expensive to build and, consequently, are unavailable for many languages and domains. Thus, approaches able to induce semantic frames in an unsupervised manner are highly valuable. In this paper, we approach that task from a network perspective as a community detection problem that targets the identification of groups of verb instances that evoke the same semantic frame. To do so, we apply a graph-clustering algorithm to a graph with contextualized representations of verb instances as nodes, connected by an edge if the distance between them is below a threshold that defines the granularity of the induced frames. By applying this approach to the benchmark dataset defined in the context of the SemEval shared task, we outperformed all the previous approaches to the task.

Crossref

Repositório Institucional do ISCTE-IUL
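
The graph construction described in the abstract above (nodes are instance vectors, an edge joins two nodes whose distance is below a threshold) can be sketched as follows. The sketch makes assumptions the abstract does not state: cosine distance is used as the distance, the vectors are non-zero, and connected components (via union-find) stand in for the graph-clustering algorithm. The names `cosine_distance` and `induce_frames` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def induce_frames(vectors, threshold):
    """Group instance indices whose vectors are linked by below-threshold edges."""
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (merge the two groups) whenever the distance is below the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_distance(vectors[i], vectors[j]) < threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Raising the threshold adds edges and merges groups, so the same parameter that decides which edges exist also controls how coarse the induced groups are.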

Graph clustering for natural language processing

Author: Ustalov Dmitry
Publication date: 01/01/2018

Graph-based representations have proven to be an effective approach for a variety of Natural Language Processing (NLP) tasks. Graph clustering makes it possible to extract useful knowledge by exploiting the implicit structure of the data. In this tutorial, we will present several efficient graph clustering algorithms and show their strengths and weaknesses as well as their implementations and applications. Then, the evaluation methodology in unsupervised NLP tasks will be discussed.

ZENODO

MAnnheim DOCument Server