Similarity Models in Distributional Semantics using Task Specific Information
In distributional semantics, unsupervised learning has been widely used for a large number of tasks, whereas supervised learning has received less coverage.
In this dissertation, we investigate the supervised learning approach for semantic relatedness tasks in distributional semantics. The investigation mainly considers semantic similarity and semantic classification tasks. Existing and newly constructed datasets serve as input for the experiments. The new datasets are constructed from thesauri such as Eurovoc, a multilingual thesaurus maintained by the Publications Office of the European Union. The meaning of the words in the datasets is represented using a distributional semantic approach.
The distributional semantic approach collects co-occurrence information from large text corpora and represents words as high-dimensional vectors. English words are represented using the ukWaC corpus, while German words are represented using the deWaC corpus. After each word has been represented as a high-dimensional vector, different supervised machine learning methods are applied to the selected tasks. The outputs of the supervised methods are evaluated by comparing their task performance and accuracy with the results of state-of-the-art unsupervised machine learning methods. In addition, multi-relational matrix factorization is introduced as a supervised learning method in distributional semantics. This dissertation shows that multi-relational matrix factorization is a good alternative method for integrating different sources of information about words in distributional semantics.
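The idea of representing words by their co-occurrence counts can be illustrated with a minimal sketch. The window size, the toy corpus, and the function names `cooccurrence_vectors` and `cosine` are assumptions made for illustration only; the dissertation's actual pipeline over large web corpora is not described here.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of context words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus; real experiments would use a large corpus.
toy_corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the car needs fuel".split(),
]
vectors = cooccurrence_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
```

In this toy setting, "cat" and "dog" share more contexts than "cat" and "car", so their vectors are closer. Vectors of this kind could then be passed as features to a supervised learner.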
The dissertation also introduces some new applications. One analyzes the text of a German company's website and presents information about the company as a concept cloud visualization. The others are automatic recognition and disambiguation of Library of Congress Subject Headings and automatic identification of synonym relations in the Dutch Parliament thesaurus.
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
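The view of paraphrasing as bidirectional textual entailment can be sketched in a few lines. The function `toy_entails` is a deliberately naive word-overlap stand-in, assumed here purely for illustration; it is not a method from the survey, and a real recognizer would take its place.

```python
def toy_entails(premise, hypothesis):
    """Toy stand-in for an entailment recognizer (assumption, not a real method):
    the hypothesis is 'entailed' if all of its words occur in the premise."""
    return set(hypothesis.lower().split()) <= set(premise.lower().split())


def is_paraphrase(a, b, entails=toy_entails):
    """Treat two expressions as paraphrases when entailment holds in both directions."""
    return entails(a, b) and entails(b, a)


forward = toy_entails("a black cat sat", "a cat sat")
backward = toy_entails("a cat sat", "a black cat sat")
one_way_only = is_paraphrase("a black cat sat", "a cat sat")
both_ways = is_paraphrase("the cat sat", "sat the cat")
```

The point of the design is that any entailment recognizer passed as `entails` yields a paraphrase recognizer, which is why methods from the two areas are often similar.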
Automatic taxonomy evaluation
This thesis would not have been possible without the generous support of IATA.
Taxonomies are an essential knowledge representation and play a central role in many knowledge-rich applications. Nevertheless, constructing them is laborious, whether done manually or automatically, and the quantitative evaluation of taxonomies is a neglected topic. When researchers focus on building a taxonomy from large unstructured corpora, evaluation is often done manually, which introduces bias and often results in limited reproducibility. Companies that wish to improve their taxonomy often lack a benchmark or reference, that is, a well-optimized taxonomy that could serve as a reference.
Consequently, specialized knowledge and effort are required to evaluate a taxonomy.
In this work, we argue that evaluating a taxonomy automatically and reproducibly is as important as generating such taxonomies automatically. We propose two new evaluation methods that produce less biased scores: a classification model of the taxonomy extracted from a labelled corpus, and an unsupervised language model that serves as a knowledge source for evaluating hypernymy relations. We find that our evaluation proxies correlate with human judgments and that language models could imitate human experts in knowledge-rich tasks.
Taxonomies are an essential knowledge representation and play an important role in classification and numerous knowledge-rich applications, yet quantitative taxonomy evaluation remains overlooked and leaves much to be desired. While studies focus on automatic taxonomy construction (ATC) for extracting meaningful structures and semantics from large corpora, their evaluation is usually manual and subject to bias and low reproducibility. Companies wishing to improve their domain-focused taxonomies also suffer from a lack of ground truths. In fact, manual taxonomy evaluation requires substantial labour and expert knowledge.
As a result, we argue in this thesis that automatic taxonomy evaluation (ATE) is just as important as taxonomy construction. We propose two novel taxonomy evaluation methods for automatic taxonomy scoring, leveraging supervised classification for labelled corpora and unsupervised language modelling as a knowledge source for unlabelled data. We show that our evaluation proxies correlate well with human judgments and that language models can imitate human experts on knowledge-rich tasks.
Sentiment polarity shifters: creating lexical resources through manual annotation and bootstrapped machine learning
Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like "alleviate" and "abandon" affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focussed almost exclusively on a small handful of closed-class negation words, such as "not", "no" and "without". A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them. We seek to remedy this lack of shifter resources. Our most central step towards this is the creation of a large lexicon of polarity shifters that covers verbs, nouns and adjectives. To reduce the prohibitive cost of such a large annotation task, we develop a bootstrapping approach that combines automatic classification with human verification. This ensures the high quality of our lexicon while reducing annotation cost by over 70%. In designing the bootstrap classifier we develop a variety of features which use both existing semantic resources and linguistically informed text patterns. In addition, we investigate how knowledge about polarity shifters might be shared across different parts of speech, highlighting both the potential and limitations of such an approach. The applicability of our bootstrapping approach extends beyond the creation of a single resource. We show how it can further be used to introduce polarity shifter resources for other languages. Through the example case of German we show that all our features are transferable to other languages. Keeping in mind the requirements of under-resourced languages, we also explore how well a classifier would do when relying only on data-driven but not resource-driven features.
We also introduce ways to use cross-lingual information, leveraging the shifter resources we previously created for other languages. Apart from the general question of which words can be polarity shifters, we also explore a number of other factors. One of these is the matter of shifting direction, which indicates whether a shifter affects positive polarities, negative polarities or whether it can shift in either direction. Using a supervised classifier we add shifting direction information to our bootstrapped lexicon. For other aspects of polarity shifting, manual annotation is preferable to automatic classification. Not every word that can cause polarity shifting does so in each of its word senses. As word sense disambiguation technology is not robust enough to allow the automatic handling of such nuances, we manually create a complete sense-level annotation of verbal polarity shifters. To verify the usefulness of the lexica which we create, we provide an extrinsic evaluation in which we apply them to a sentiment analysis task. In this task the different lexica are compared not only with each other, but also against a state-of-the-art compositional polarity neural network classifier that has been shown to be able to implicitly learn the negating effect of negation words from a training corpus. However, we find that the same is not true for the far more lexically diverse polarity shifters. Instead, the use of the explicit knowledge provided by our shifter lexica brings clear gains in performance.
Deutsche Forschungsgesellschaf
Inferring unobserved co-occurrence events in Anchored Packed Trees
Anchored Packed Trees (APTs) are a novel approach to distributional semantics that takes distributional composition to be a process of lexeme contextualisation. A lexeme’s meaning, characterised as knowledge concerning co-occurrences involving that lexeme, is represented with a higher-order dependency-typed structure (the APT) where paths associated with higher-order dependencies connect vertices associated with weighted lexeme multisets. The central innovation in the compositional theory is that the APT’s type structure enables the precise alignment of the semantic representation of each of the lexemes being composed.
Like other count-based distributional spaces, however, Anchored Packed Trees are prone to considerable data sparsity, caused by not observing all plausible co-occurrences in the given data. This problem is amplified for models like APTs that take the grammatical type of a co-occurrence into account. The result is a very sparse distributional space, which requires a mechanism for inferring missing knowledge. Most methods address this challenge in ways that render the resulting word representations uninterpretable, with the consequence that distributional composition becomes difficult to model and reason about.
In this thesis, I will present a practical evaluation of the APT theory, including a large-scale hyperparameter sensitivity study and a characterisation of the distributional space that APTs give rise to. Based on the empirical analysis, the impact of the problem of data sparsity is investigated. In order to address the data sparsity challenge and retain the interpretability of the model, I explore an alternative algorithm, distributional inference, for improving elementary representations. The algorithm involves explicitly inferring unobserved co-occurrence events by leveraging the distributional neighbourhood of the semantic space. I then leverage the rich type structure in APTs and propose a generalisation of the distributional inference algorithm. I empirically show that distributional inference improves elementary word representations and is especially beneficial when combined with an intersective composition function, which is due to the complementary nature of inference and composition. Lastly, I qualitatively analyse the proposed algorithms in order to characterise the knowledge that they are able to infer, as well as their impact on the distributional APT space.
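A heavily simplified sketch may help convey what inferring unobserved co-occurrence events from a distributional neighbourhood can look like. It assumes flat, untyped count vectors rather than APTs, and the neighbour selection by cosine similarity, the plain additive update, and the names `infer_cooccurrences` and `k` are all illustrative assumptions rather than the thesis's actual algorithm.

```python
from collections import Counter
import math


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def infer_cooccurrences(vectors, target, k=1):
    """Enrich the target's vector with the counts of its k nearest neighbours.

    Simplifying assumption: neighbour counts are added without weighting.
    """
    base = vectors[target]
    candidates = [w for w in vectors if w != target]
    neighbours = sorted(
        candidates, key=lambda w: cosine(base, vectors[w]), reverse=True
    )[:k]
    enriched = Counter(base)
    for w in neighbours:
        enriched.update(vectors[w])  # adds counts, including unseen contexts
    return enriched


# Hypothetical toy vectors; "cat" was never observed with "bark" or "walk".
toy_vectors = {
    "cat": Counter({"purr": 2, "pet": 1}),
    "dog": Counter({"bark": 2, "pet": 3, "walk": 1}),
    "car": Counter({"drive": 3, "fuel": 2}),
}
enriched_cat = infer_cooccurrences(toy_vectors, "cat", k=1)
```

Because the enriched vector still consists of named context counts, it remains inspectable, which matches the interpretability motivation stated above.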