
    Information retrieval with semantic memory model

    Psycholinguistic theories of semantic memory form the basis of our understanding of natural language concepts. These theories are used here as an inspiration for implementing a computational model of semantic memory in the form of a semantic network. Combining this network with a vector-based object-relation-feature value representation of concepts, which also includes weights for confidence and support, allows concepts to be recognized by referring to their features, enabling a semantic search algorithm. This algorithm has been used for word games, in particular the 20-question game, in which the program tries to guess a concept that a human player thinks about. The game facilitates lexical knowledge validation and acquisition through interaction with humans via supervised dialog templates. The elementary linguistic competencies of the proposed model have been evaluated by assessing how well it can represent the meaning of linguistic concepts. Experiments in limited domains have been performed to study the properties of information retrieval based on this type of semantic representation in contexts derived from ongoing dialogs. Several similarity measures have been used to compare the completeness of knowledge retrieved automatically, and corrected through active dialogs, against a "gold standard". Semantic search has also been compared with human performance in a series of 20-question games. On average, the results achieved by human players were better than those obtained by semantic search, but not by a wide margin.
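The semantic search over weighted feature vectors described above can be illustrated with a minimal sketch; the concepts, features, and weights below are invented for illustration, and the actual model uses a richer object-relation-feature representation:

```python
import math

# Hypothetical concept -> {feature: confidence weight} vectors
concepts = {
    "dog":    {"has_fur": 0.9, "barks": 1.0, "is_animal": 1.0},
    "cat":    {"has_fur": 0.9, "meows": 1.0, "is_animal": 1.0},
    "canary": {"has_feathers": 0.9, "sings": 0.8, "is_animal": 1.0},
}

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(observed, concepts):
    """Rank stored concepts by similarity to the observed features."""
    return sorted(concepts, key=lambda c: cosine(observed, concepts[c]),
                  reverse=True)

# Features gathered so far, e.g. from answers in a 20-question game
observed = {"has_fur": 1.0, "is_animal": 1.0}
ranking = semantic_search(observed, concepts)
```

Each answer in a dialog adds or strengthens a feature in `observed`, and the ranking is recomputed to narrow down candidate concepts.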

    Annotating Words Using WordNet Semantic Glosses

    An approach to word sense disambiguation (WSD) relying on WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with state-of-the-art WSD methods indicates that the use of WordNet relations and semantically tagged glosses should enhance the accuracy of word disambiguation methods.
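The spreading-activation idea can be sketched on a toy graph; a hand-built graph stands in for the relations extracted from semantically tagged WordNet glosses, and all node names and edge weights below are invented:

```python
# Nodes stand in for WordNet synsets; edges for relations derived
# from tagged glosses (hypothetical names and weights).
graph = {
    "bank.n.01": {"money.n.01": 0.8, "deposit.v.01": 0.6},  # financial bank
    "bank.n.02": {"river.n.01": 0.8, "slope.n.01": 0.5},    # river bank
    "money.n.01": {"bank.n.01": 0.8},
    "river.n.01": {"bank.n.02": 0.8},
    "deposit.v.01": {"bank.n.01": 0.6},
    "slope.n.01": {"bank.n.02": 0.5},
}

def spread(sources, graph, decay=0.5, steps=2):
    """Propagate activation from context senses through the graph."""
    act = {s: 1.0 for s in sources}
    for _ in range(steps):
        nxt = dict(act)
        for node, a in act.items():
            for nb, w in graph.get(node, {}).items():
                nxt[nb] = nxt.get(nb, 0.0) + a * w * decay
        act = nxt
    return act

# Disambiguate "bank" given the context sense "river.n.01":
act = spread(["river.n.01"], graph)
best = max(["bank.n.01", "bank.n.02"], key=lambda s: act.get(s, 0.0))
```

The candidate sense receiving the most activation from the context becomes the annotation; a ranking over all senses falls out of the final activation values.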

    Self Organizing Maps for Visualization of Categories

    Visualization of Wikipedia categories using Self Organizing Maps gives an overview of categories and their relations, helping to narrow down search domains. By selecting particular neurons, this approach enables retrieval of conceptually similar categories. Evaluation of neural activations indicates that they form coherent patterns that may be useful for building user interfaces for navigation over category structures.
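A minimal pure-Python Self Organizing Map illustrates how similar inputs land on nearby neurons; this is a sketch on 2-D toy vectors, whereas the actual system maps high-dimensional Wikipedia category descriptors:

```python
import math
import random

def bmu(grid, x):
    """Grid coordinates of the best matching unit for input x."""
    return min(((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
               key=lambda rc: sum((grid[rc[0]][rc[1]][i] - x[i]) ** 2
                                  for i in range(len(x))))

def train_som(data, rows, cols, iters=300, lr0=0.5, seed=0):
    """Train a tiny Self Organizing Map on the vectors in `data`."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1.0 - frac)              # decaying learning rate
        radius = radius0 * (1.0 - frac) + 0.5  # shrinking neighbourhood
        x = rng.choice(data)
        br, bc = bmu(grid, x)
        # Pull the BMU and its grid neighbours toward the sample
        for r in range(rows):
            for c in range(cols):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                if d2 <= radius ** 2:
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return grid

# Two well-separated toy "category" clusters
data = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0]]
grid = train_som(data, rows=4, cols=4)
```

After training, conceptually similar categories activate the same or neighbouring neurons, which is what makes neuron selection usable for retrieving related categories.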

    Context Search Algorithm for Lexical Knowledge Acquisition

    This work was supported by the Polish Committee for Scientific Research, grant N516 035 31/3499.

    A Context Search algorithm used for lexical knowledge acquisition is presented. Knowledge representation based on psycholinguistic theories of cognitive processes allows for the implementation of a computational model of semantic memory in the form of a semantic network. Knowledge acquisition using supervised dialog templates has been performed in a word game designed to guess the concept a human user is thinking about. The game, which has been implemented on a web server, demonstrates elementary linguistic competencies based on lexical knowledge stored in semantic memory, enabling at the same time acquisition and validation of knowledge. Possible applications of the algorithm in the domains of medical diagnosis and information retrieval are sketched.
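The question-selection step of such a guessing game can be sketched as choosing the feature whose yes/no answer splits the remaining candidates most evenly, i.e. maximizes entropy; the concepts and features below are hypothetical stand-ins for the semantic memory:

```python
import math

# Hypothetical semantic-memory fragment: concept -> feature truth values
memory = {
    "dog":    {"is_animal": True,  "barks": True,  "meows": False},
    "cat":    {"is_animal": True,  "barks": False, "meows": True},
    "hammer": {"is_animal": False, "barks": False, "meows": False},
    "saw":    {"is_animal": False, "barks": False, "meows": False},
}

def entropy(n_yes, n_no):
    """Binary entropy of a yes/no split."""
    total = n_yes + n_no
    h = 0.0
    for n in (n_yes, n_no):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def best_question(candidates, memory):
    """Pick the feature whose answer splits candidates most evenly."""
    features = {f for c in candidates for f in memory[c]}
    def split_entropy(f):
        yes = sum(memory[c][f] for c in candidates)
        return entropy(yes, len(candidates) - yes)
    return max(features, key=split_entropy)

candidates = list(memory)
q = best_question(candidates, memory)          # ask about this feature
remaining = [c for c in candidates if memory[c][q]]  # after a "yes" answer
```

Each answer halves (at best) the candidate set, which is why a well-chosen sequence of about twenty questions can separate thousands of concepts.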

    Wyszukiwanie kontekstowe w pamięci semantycznej (Context Search in Semantic Memory)


    Knowledge representation and acquisition for large-scale semantic memory

    Acquisition and representation of semantic concepts is a necessary requirement for the understanding of natural languages by cognitive systems. Word games provide an interesting opportunity for semantic knowledge acquisition that may be used to construct semantic memory. A task-dependent architecture of the knowledge base, inspired by psycholinguistic theories of the human cognition process, is introduced. The core of the system is an algorithm for semantic search using a simplified vector representation of concepts. Based on this algorithm, a 20-questions game has been implemented. This implementation provides an example of an application of the semantic memory, but also allows for testing the linguistic competence of the system. A web portal with a Haptek-based talking-head interface facilitates acquisition of new knowledge while playing the game and engaging in dialogs with users.

    Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network

    No full text
    To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches based on neural networks. We describe in detail nine different algorithms used for text representation and evaluate them on five diverse datasets: BBCSport, BBC, Ohsumed, 20Newsgroups, and Reuters. The selected statistical models include Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF) weighting, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). For the second group, based on deep neural networks, Partition-Smooth Inverse Frequency (P-SIF), Doc2Vec-Distributed Bag of Words Paragraph Vector (Doc2Vec-DBoW), Doc2Vec-Distributed Memory Model of Paragraph Vectors (Doc2Vec-DM), Hierarchical Attention Network (HAN) and Longformer were selected. The text representation methods were benchmarked in the document classification task, with the BoW and TF-IDF models used as a baseline. Based on the identified weaknesses of the HAN method, an improvement in the form of a Hierarchical Weighted Attention Network (HWAN) was proposed. The incorporation of statistical features into HAN latent representations improves or provides comparable results on four out of five datasets. The article presents how the length of the processed text affects the results of the HAN and HWAN model variants.
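The BoW/TF-IDF baseline mentioned above can be sketched in a few lines; this is a minimal illustration of the weighting scheme, not the paper's implementation:

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF vectors for a list of tokenised documents.

    tf = term count / document length; idf = log(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vectors

docs = [
    "the match ended in a draw".split(),
    "the team won the match".split(),
    "parliament passed the budget".split(),
]
vecs = tfidf(docs)
```

Terms occurring in every document (like "the") receive zero weight, while terms concentrated in few documents are emphasized, which is what makes TF-IDF a strong baseline for document classification.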