6 research outputs found
Fighting with the Sparsity of Synonymy Dictionaries
Graph-based synset induction methods, such as MaxMax and Watset, induce
synsets by performing a global clustering of a synonymy graph. However, such
methods are sensitive to the structure of the input synonymy graph: sparseness
of the input dictionary can substantially reduce the quality of the extracted
synsets. In this paper, we propose two different approaches designed to
alleviate the incompleteness of the input dictionaries. The first one performs
a pre-processing of the graph by adding missing edges, while the second one
performs a post-processing by merging similar synset clusters. We evaluate
these approaches on two datasets for the Russian language and discuss their
impact on the performance of synset induction methods. Finally, we perform an
extensive error analysis of each approach and discuss prominent alternative
methods for coping with the problem of the sparsity of the synonymy
dictionaries.
Comment: In Proceedings of the 6th Conference on Analysis of Images, Social
Networks, and Texts (AIST'2017): Springer Lecture Notes in Computer Science
(LNCS)
Unsupervised Sense-Aware Hypernymy Extraction
In this paper, we show how unsupervised sense representations can be used to
improve hypernymy extraction. We present a method for extracting disambiguated
hypernymy relationships that propagates hypernyms to sets of synonyms
(synsets), constructs embeddings for these sets, and establishes sense-aware
relationships between matching synsets. Evaluation on two gold standard
datasets for English and Russian shows that the method successfully recognizes
hypernymy relationships that cannot be found with standard Hearst patterns and
Wiktionary datasets for the respective languages.
Comment: In Proceedings of the 14th Conference on Natural Language Processing
(KONVENS 2018). Vienna, Austria.
Learning Word Subsumption Projections for the Russian Language
The semantic relations of hypernymy and hyponymy are widely used in natural language processing tasks for modelling subsumption in common-sense reasoning. Since the popularisation of distributional semantics, significant attention has been paid to applying word embeddings to induce relations between words. In this paper, we show our preliminary results on adopting the projection learning technique for computing hypernyms from hyponyms using word embeddings. We also conduct a series of experiments on the Russian language and release open-source software for learning hyponym-hypernym projections on both CPUs and GPUs, implemented with the TensorFlow machine learning framework.
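As an illustration of the projection learning idea described in this abstract, the following is a minimal sketch, not the released TensorFlow software. It assumes a single linear projection matrix W fitted by plain gradient descent on squared error; the two-dimensional toy embeddings, the function names (`train_projection`, `project`, `nearest`) and the hyperparameters are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import random

# Toy 2-d "embeddings" (hypothetical values, for illustration only).
embeddings = {
    "cat": [1.0, 0.1], "dog": [1.0, -0.1],
    "rose": [0.1, 1.0], "tulip": [-0.1, 1.0],
    "animal": [0.0, 1.0], "plant": [1.0, 0.0],
}
# Training pairs (hyponym, hypernym).
train_pairs = [("cat", "animal"), ("dog", "animal"),
               ("rose", "plant"), ("tulip", "plant")]

def project(W, x):
    """Apply the projection matrix W to the vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def train_projection(pairs, dim, epochs=500, lr=0.1, seed=0):
    """Fit W by gradient descent on the mean of ||W x - y||^2."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
    n = len(pairs)
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(dim)]
        for x, y in pairs:
            pred = project(W, x)
            for i in range(dim):
                err = pred[i] - y[i]
                for j in range(dim):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * grad[i][j]
    return W

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(p * q for p, q in zip(a, b)) / (na * nb)

def nearest(vec, vocab):
    """Return the vocabulary word whose vector is closest to vec."""
    return max(vocab, key=lambda w: cosine(vec, vocab[w]))

vector_pairs = [(embeddings[h], embeddings[g]) for h, g in train_pairs]
W = train_projection(vector_pairs, dim=2)

# Project an unseen hyponym vector (hypothetical "hamster") and look up
# the nearest word in the toy vocabulary.
print(nearest(project(W, [0.9, 0.0]), embeddings))
```

With real embeddings the same procedure would be run over hundreds of dimensions and thousands of training pairs, which is where a GPU-capable framework becomes useful.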
A Semantic Similarity Measure Based on Lexico-Syntactic Patterns
This paper presents a novel semantic similarity measure based on lexico-syntactic patterns such as those proposed by Hearst (1992). The measure achieves a correlation with human judgements of up to 0.739. Additionally, we evaluate it on the tasks of semantic relation ranking and extraction. Our results show that the measure performs comparably to the baselines without requiring any fine-grained semantic resource such as WordNet.
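To make the idea of a pattern-based similarity measure concrete, here is a rough sketch under strong assumptions. It uses two simplified Hearst-style patterns ("X such as A, B and C" and "A, B and other X") over single-word terms, and scores a pair of terms by their co-occurrence count in pattern matches, normalised by term frequency. The regular expressions, the function names and the scoring formula are illustrative assumptions, not the measure defined in the paper.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified patterns over single-word terms (an assumption; practical
# systems match noun phrases and use many more patterns).
SUCH_AS = re.compile(r"(\w+) such as ((?:\w+, )*\w+(?: (?:and|or) \w+)?)")
AND_OTHER = re.compile(r"((?:\w+, )*\w+),? (?:and|or) other (\w+)")

def split_terms(text):
    """Split an enumeration like 'a, b and c' into its terms."""
    return [t for t in re.split(r", | and | or ", text) if t]

def extract_concordances(sentence):
    """Return one set of related terms per pattern match."""
    s = sentence.lower()
    found = []
    for m in SUCH_AS.finditer(s):
        found.append({m.group(1), *split_terms(m.group(2))})
    for m in AND_OTHER.finditer(s):
        found.append({m.group(2), *split_terms(m.group(1))})
    return found

def build_counts(sentences):
    """Count how often terms, and pairs of terms, occur in matches."""
    pair_counts, term_counts = Counter(), Counter()
    for sentence in sentences:
        for terms in extract_concordances(sentence):
            term_counts.update(terms)
            for a, b in combinations(sorted(terms), 2):
                pair_counts[(a, b)] += 1
    return pair_counts, term_counts

def similarity(a, b, pair_counts, term_counts):
    """Co-occurrence count normalised by the geometric mean of term counts."""
    key = tuple(sorted((a, b)))
    denom = (term_counts[a] * term_counts[b]) ** 0.5
    return pair_counts[key] / denom if denom else 0.0

corpus = [
    "Fruits such as apples, pears and plums are sweet.",
    "They sell apples, pears and other fruits.",
    "Metals such as iron and copper conduct heat.",
]
pair_counts, term_counts = build_counts(corpus)
print(similarity("apples", "pears", pair_counts, term_counts))
print(similarity("apples", "iron", pair_counts, term_counts))
```

Terms that repeatedly appear in the same pattern matches receive a high score, while terms that never share a match score zero, and no hand-built resource such as WordNet is consulted.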