
    Word Sense Embedded in Geometric Spaces - From Induction to Applications using Machine Learning

    Words are not detached individuals but part of a beautiful interconnected web of related concepts, and to capture the full complexity of this web they need to be represented in a way that encapsulates all the semantic and syntactic facets of the language. Further, to enable computational processing, they need to be expressed in a consistent manner so that similar properties are encoded in a similar way. In this thesis, dense real-valued vector representations, i.e. word embeddings, are extended and studied for their applicability to natural language processing (NLP). Word embeddings of two distinct flavors are presented as part of this thesis: sense-aware word representations, where different word senses are represented as distinct objects, and grounded word representations, learned using multi-agent deep reinforcement learning to explicitly express properties of the physical world while the agents learn to play Guess Who?. The empirical usefulness of word embeddings is evaluated by employing them in a series of NLP applications: word sense induction, word sense disambiguation, and automatic document summarisation. The results show great potential for word embeddings: they outperform previous state-of-the-art methods in two of the three applications, and achieve a statistically equivalent result in the third while using a much simpler model than previous work.

    Word Sense Disambiguation using a Bidirectional LSTM

    In this paper we present a clean, yet effective, model for word sense disambiguation. Our approach leverages a bidirectional long short-term memory (LSTM) network that is shared between all words, which enables the model to share statistical strength and to scale well with vocabulary size. The model is trained end-to-end, directly from raw text to sense labels, and makes effective use of word order. We evaluate our approach on two standard datasets using identical hyperparameter settings, which are in turn tuned on a third, held-out set. We employ no external resources (e.g. knowledge graphs, part-of-speech tagging), language-specific features, or hand-crafted rules, yet still achieve results statistically equivalent to the best state-of-the-art systems, which are not subject to such restrictions.
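    As a concrete illustration, below is a minimal PyTorch sketch of the kind of architecture the abstract describes: a single bidirectional LSTM shared across all words, mapping a sentence to a sense label for a chosen target position. It is a toy under stated assumptions, not the authors' implementation; all names and sizes are illustrative.

        # Minimal sketch (assumed PyTorch), not the authors' code: one BLSTM
        # shared by all words predicts a sense label for one target position,
        # so parameters do not grow with the number of target words.
        import torch
        import torch.nn as nn

        class BiLSTMSenseTagger(nn.Module):
            def __init__(self, vocab_size, num_senses, emb_dim=100, hidden=128):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.blstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                                     bidirectional=True)
                self.classify = nn.Linear(2 * hidden, num_senses)

            def forward(self, word_ids, target_pos):
                # word_ids: (batch, seq_len); target_pos: (batch,) index of
                # the ambiguous word whose sense is predicted.
                states, _ = self.blstm(self.embed(word_ids))
                rows = torch.arange(word_ids.size(0))
                return self.classify(states[rows, target_pos])

        model = BiLSTMSenseTagger(vocab_size=50_000, num_senses=30)
        logits = model(torch.randint(0, 50_000, (4, 12)),
                       torch.tensor([3, 5, 0, 7]))  # sense logits, shape (4, 30)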

    Clustering Word Usage Examples Using BERT (BERTを利用した単語用例のクラスタリング)

    Ibaraki University. Conference: 言語資源活用ワークショップ2019 (Language Resources Workshop 2019); Venue: National Institute for Japanese Language and Linguistics (NINJAL); Dates: 2-4 September 2019; Organizer: NINJAL Center for Corpus Development. The pre-trained model BERT outputs an embedding for each word in an input sentence, and that embedding depends on the word's context; the word embeddings obtained from BERT can therefore be regarded as representing word meaning. In this paper, to verify this point, we use the word embeddings obtained from BERT to cluster usage examples of a word. In our experiments, we used a pre-trained Japanese BERT model to cluster usage examples of the word 意味 ("meaning"). By comparing against clustering based on standard feature vectors for word sense disambiguation and on feature vectors built from distributed representations, we show that the word embeddings obtained from BERT represent word meaning more appropriately.
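    A minimal sketch of the experiment, assuming the Hugging Face transformers and scikit-learn libraries: extract contextual BERT embeddings for occurrences of one target word and cluster them, one cluster per sense. The multilingual model and English example sentences are stand-ins for the paper's Japanese setup.

        # Sketch only: cluster BERT's contextual embeddings of one word.
        import torch
        from transformers import AutoModel, AutoTokenizer
        from sklearn.cluster import KMeans

        tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
        bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

        def embed_occurrence(sentence, target):
            """Mean hidden state over the target word's subword tokens."""
            enc = tok(sentence, return_tensors="pt")
            with torch.no_grad():
                hidden = bert(**enc).last_hidden_state[0]        # (tokens, dim)
            piece_ids = tok(target, add_special_tokens=False)["input_ids"]
            ids = enc["input_ids"][0].tolist()
            for i in range(len(ids) - len(piece_ids) + 1):
                if ids[i:i + len(piece_ids)] == piece_ids:       # locate target
                    return hidden[i:i + len(piece_ids)].mean(0).numpy()
            raise ValueError(f"{target!r} not found in sentence")

        usages = ["The bank approved the loan.",
                  "They sat on the grassy bank of the river."]
        X = [embed_occurrence(s, "bank") for s in usages]
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)  # one label per usage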

    Learning to Embed Words in Context for Syntactic Tasks

    We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes. Comment: Accepted by ACL 2017 Repl4NLP workshop.
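    The interface is easy to picture with a toy version: below, a token's representation is its own pretrained vector concatenated with an average over a +/-k context window, giving a context-dependent feature vector per position. The paper learns this combination with neural networks; this unlearned average, and every name in it, is only an assumed illustration.

        # Toy token embedding: word vector + mean of its context window.
        import numpy as np

        def token_embedding(word_vecs, tokens, i, k=2):
            """Context-sensitive feature vector for tokens[i]."""
            dim = len(next(iter(word_vecs.values())))
            zero = np.zeros(dim)
            window = [word_vecs.get(tokens[j], zero)
                      for j in range(max(0, i - k), min(len(tokens), i + k + 1))
                      if j != i]
            ctx = np.mean(window, axis=0) if window else zero
            return np.concatenate([word_vecs.get(tokens[i], zero), ctx])

        vecs = {w: np.random.randn(50) for w in "the cat sat on mat".split()}
        sent = "the cat sat on the mat".split()
        feats = [token_embedding(vecs, sent, i) for i in range(len(sent))]
        # feats[i] can be fed to a POS tagger or parser as extra input features.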

    MUSE: Modularizing Unsupervised Sense Embeddings

    This paper proposes to address the word sense ambiguity issue in an unsupervised manner, where word sense representations are learned alongside a context-conditioned word sense selection mechanism. Prior work focused on designing a single model to deliver both mechanisms, and thus suffered from either coarse-grained representation learning or inefficient sense selection. The proposed modular approach, MUSE, implements flexible modules that optimize the two mechanisms separately, achieving the first purely sense-level representation learning system with linear-time sense selection. We leverage reinforcement learning to enable joint training of the proposed modules, and introduce various exploration techniques for sense selection to improve robustness. Experiments on benchmark data show that the proposed approach achieves state-of-the-art performance on synonym selection as well as on contextual word similarity in terms of MaxSimC.
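    For reference, MaxSimC scores a word pair in context by picking, for each word, the single sense vector most compatible with that word's context, and taking the cosine between the two chosen sense vectors. The sketch below uses a dot product against a context vector as the compatibility score; that scoring choice is an assumption, not MUSE's trained selection module.

        # Sketch of the MaxSimC measure for contextual word similarity.
        import numpy as np

        def cosine(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

        def max_sim_c(senses1, senses2, ctx1, ctx2):
            """senses*: (num_senses, dim) arrays; ctx*: (dim,) context vectors."""
            best1 = senses1[np.argmax(senses1 @ ctx1)]  # likeliest sense of word 1
            best2 = senses2[np.argmax(senses2 @ ctx2)]  # likeliest sense of word 2
            return cosine(best1, best2)

        s1, s2 = np.random.randn(3, 50), np.random.randn(2, 50)  # sense inventories
        print(max_sim_c(s1, s2, np.random.randn(50), np.random.randn(50)))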

    Word Representations for Emergent Communication and Natural Language Processing

    The task of listing all semantic properties of a single word might seem manageable at first, but as you unravel all the context-dependent subtle variations in meaning that a word can encompass, you soon realize that a precise mathematical definition of a word's semantics is extremely difficult. By analogy, humans have no problem identifying their favorite pet in an image, but the task of precisely defining how is still beyond our capabilities. A solution that has proved effective in the visual domain is to learn abstract representations using machine learning. Inspired by the success of learned representations in computer vision, the line of work presented in this thesis explores learned word representations in three different contexts. Starting in the domain of artificial languages, three computational frameworks for emergent communication between collaborating agents are developed in an attempt to study word representations that exhibit grounding of concepts. The first two are designed to emulate the natural development of discrete color words using deep reinforcement learning, and are used to simulate the emergence of color terms that partition the continuous color spectrum of visible light. The properties of the emerged color communication schema are compared to human languages to ensure its validity as a cognitive model, and the frameworks are subsequently used to explore central questions in cognitive science about universals in language within the semantic domain of color. Moving beyond the color domain, a third framework is developed for the less controlled environment of human faces and multi-step communication. As in the color domain, we carefully analyze the semantic properties of the words that emerge between the agents, in this case focusing on grounding. Turning to empirical usefulness, different types of learned word representations are evaluated in the context of automatic document summarisation, word sense disambiguation, and word sense induction, with results that show great potential for learned word representations in natural language processing: state-of-the-art performance is reached in all three applications, and previous methods are outperformed in two of them. Finally, although learned word representations seem to improve the performance of real-world systems, they lack interpretability compared to classical hand-engineered representations. Acknowledging this, an effort is made towards constructing learned representations that regain some of that interpretability, by designing and evaluating disentangled representations, which could be used to represent words in a more interpretable way in the future.
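    The shape of the color-word experiments can be conveyed with a toy referential game: a sender observes a color and utters a discrete word, a receiver must pick that color from a set that includes a distractor, and success is the shared reward a deep RL learner would maximize. The nearest-prototype agents and two-word lexicon below are illustrative stand-ins for the learned agents, not the thesis's frameworks.

        # Toy referential color game; fixed agents stand in for RL learners.
        import random

        LEXICON = {"dark": (40, 40, 40), "light": (215, 215, 215)}

        def dist2(c, p):
            return sum((a - b) ** 2 for a, b in zip(c, p))

        def play_round():
            target = tuple(random.randint(0, 255) for _ in range(3))
            distractor = tuple(random.randint(0, 255) for _ in range(3))
            word = min(LEXICON, key=lambda w: dist2(target, LEXICON[w]))  # sender speaks
            candidates = [target, distractor]
            random.shuffle(candidates)
            guess = min(candidates, key=lambda c: dist2(c, LEXICON[word]))  # receiver picks
            return guess == target  # shared reward signal

        wins = sum(play_round() for _ in range(10_000))
        print(f"communication success rate: {wins / 10_000:.2f}")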

    Learning with Geometric Embeddings of Graphs

    Graphs are natural representations of problems and data in many fields. For example, in computational biology, interaction networks model the functional relationships between genes in living organisms; in the social sciences, graphs are used to represent friendships and business relations among people; in chemoinformatics, graphs represent atoms and molecular bonds. Fields like these are often rich in data, to the extent that manual analysis is not feasible and machine learning algorithms are necessary to exploit the wealth of available information. Unfortunately, machine learning research is heavily biased in favor of algorithms that operate only on continuous, vector-valued data, algorithms that are not suited to the combinatorial structure of graphs. In this thesis, we show how to leverage both the expressive power of graphs and the strength of established machine learning tools by introducing methods that combine geometric embeddings of graphs with standard learning algorithms. We demonstrate the generality of this idea by developing embedding algorithms for both simple and weighted graphs and applying them to supervised and unsupervised learning problems such as classification and clustering. Our results provide both theoretical support for the usefulness of graph embeddings in machine learning and empirical evidence that this framework is often more flexible and better performing than competing machine learning algorithms for graphs.
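    The overall recipe, embed the graph geometrically and then apply an off-the-shelf learner, can be sketched in a few lines. Spectral coordinates from the graph Laplacian serve here as one concrete embedding choice; the thesis develops its own embedding algorithms, so treat this pairing as an assumed illustration.

        # Sketch: Laplacian eigenvector embedding + standard k-means clustering.
        import numpy as np
        from sklearn.cluster import KMeans

        def spectral_embedding(adj, dim=2):
            """Embed nodes via eigenvectors of the (unnormalized) Laplacian."""
            laplacian = np.diag(adj.sum(axis=1)) - adj
            _, eigvecs = np.linalg.eigh(laplacian)
            return eigvecs[:, 1:dim + 1]   # drop the trivial constant eigenvector

        # Two triangles joined by one edge: two natural clusters.
        adj = np.zeros((6, 6))
        for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
            adj[a, b] = adj[b, a] = 1.0

        labels = KMeans(n_clusters=2, n_init=10).fit_predict(spectral_embedding(adj))
        print(labels)   # nodes {0,1,2} and {3,4,5} fall into different clusters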

    Neural context embeddings for automatic discovery of word senses

    Word sense induction (WSI) is the problem of automatically building an inventory of senses for a set of target words using only a text corpus. We introduce a new method for embedding word instances and their context, for use in WSI. The method, instance-context embedding (ICE), leverages neural word embeddings, and the correlation statistics they capture, to compute high-quality embeddings of word contexts. In WSI, these context embeddings are clustered to find the word senses present in the text. ICE is based on a novel method for combining word embeddings using continuous Skip-gram, based on both semantic and temporal aspects of context words. ICE is evaluated both in a new system and in an extension to a previous system for WSI. In both cases, we surpass the previous state-of-the-art on the WSI task of SemEval-2013, which highlights the generality of ICE. Our proposed system achieves a 33% relative improvement.
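    The pipeline the abstract describes can be pictured as follows: each occurrence of a target word is mapped to a weighted combination of the Skip-gram vectors of its context words, and the occurrence embeddings are then clustered into induced senses. The inverse-distance weighting below is an illustrative stand-in for ICE's semantic and temporal weighting, not the published method.

        # Sketch of context-embedding-based WSI; the weighting is an assumption.
        import numpy as np
        from sklearn.cluster import KMeans

        def context_embedding(word_vecs, tokens, i, window=5):
            dim = len(next(iter(word_vecs.values())))
            vec, total = np.zeros(dim), 0.0
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j == i or tokens[j] not in word_vecs:
                    continue
                w = 1.0 / abs(j - i)            # nearer context words weigh more
                vec += w * word_vecs[tokens[j]]
                total += w
            return vec / total if total else vec

        vocab = "the bank approved loan river flowed past".split()
        word_vecs = {w: np.random.randn(50) for w in vocab}
        occurrences = [("the bank approved the loan".split(), 1),
                       ("the river flowed past the bank".split(), 5)]
        X = [context_embedding(word_vecs, toks, i) for toks, i in occurrences]
        senses = KMeans(n_clusters=2, n_init=10).fit_predict(X)  # induced senses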