
    Information retrieval with semantic memory model

    Psycholinguistic theories of semantic memory form the basis of our understanding of natural language concepts. These theories are used here as an inspiration for implementing a computational model of semantic memory in the form of a semantic network. Combining this network with a vector-based object-relation-feature value representation of concepts, which also includes weights for confidence and support, allows concepts to be recognized by referring to their features, enabling a semantic search algorithm. This algorithm has been used for word games, in particular the 20-question game, in which the program tries to guess a concept that a human player thinks about. The game facilitates lexical knowledge validation and acquisition through interaction with humans via supervised dialog templates. The elementary linguistic competencies of the proposed model have been evaluated by assessing how well it can represent the meaning of linguistic concepts. Experiments in limited domains have been performed to study the properties of information retrieval based on this type of semantic representation in contexts derived from ongoing dialogs. Several similarity measures have been used to compare the completeness of knowledge retrieved automatically, and corrected through active dialogs, against a "gold standard". Semantic search has also been compared with human performance in a series of 20-question games. On average, the results achieved by human players were better than those obtained by semantic search, but not by a wide margin.
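The semantic search over weighted feature vectors described above can be illustrated with a minimal sketch; the concepts, features, and weights below are invented for illustration, and the actual model uses a richer object-relation-feature representation:

```python
import math

# Hypothetical concept -> {feature: confidence weight} vectors
concepts = {
    "dog":    {"has_fur": 0.9, "barks": 1.0, "is_animal": 1.0},
    "cat":    {"has_fur": 0.9, "meows": 1.0, "is_animal": 1.0},
    "canary": {"has_feathers": 0.9, "sings": 0.8, "is_animal": 1.0},
}

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(observed, concepts):
    """Rank stored concepts by similarity to the observed features."""
    return sorted(concepts, key=lambda c: cosine(observed, concepts[c]),
                  reverse=True)

# Features gathered so far, e.g. from answers in a 20-question game
observed = {"has_fur": 1.0, "is_animal": 1.0}
ranking = semantic_search(observed, concepts)
```

Each answer in a dialog adds or strengthens a feature in `observed`, and the ranking is recomputed to narrow down candidate concepts.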

    Annotating Words Using WordNet Semantic Glosses

    An approach to word sense disambiguation (WSD) relying on WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with state-of-the-art WSD methods indicates that the use of WordNet relations and semantically tagged glosses should enhance the accuracy of word disambiguation methods.
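The spreading-activation idea can be sketched on a toy graph; a hand-built graph stands in for the relations extracted from semantically tagged WordNet glosses, and all node names and edge weights below are invented:

```python
# Nodes stand in for WordNet synsets; edges for relations derived
# from tagged glosses (hypothetical names and weights).
graph = {
    "bank.n.01": {"money.n.01": 0.8, "deposit.v.01": 0.6},  # financial bank
    "bank.n.02": {"river.n.01": 0.8, "slope.n.01": 0.5},    # river bank
    "money.n.01": {"bank.n.01": 0.8},
    "river.n.01": {"bank.n.02": 0.8},
    "deposit.v.01": {"bank.n.01": 0.6},
    "slope.n.01": {"bank.n.02": 0.5},
}

def spread(sources, graph, decay=0.5, steps=2):
    """Propagate activation from context senses through the graph."""
    act = {s: 1.0 for s in sources}
    for _ in range(steps):
        nxt = dict(act)
        for node, a in act.items():
            for nb, w in graph.get(node, {}).items():
                nxt[nb] = nxt.get(nb, 0.0) + a * w * decay
        act = nxt
    return act

# Disambiguate "bank" given the context sense "river.n.01":
act = spread(["river.n.01"], graph)
best = max(["bank.n.01", "bank.n.02"], key=lambda s: act.get(s, 0.0))
```

The candidate sense receiving the most activation from the context becomes the annotation; a ranking over all senses falls out of the final activation values.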

    Self Organizing Maps for Visualization of Categories

    Visualization of Wikipedia categories using Self Organizing Maps gives an overview of categories and their relations, helping to narrow down search domains. By selecting particular neurons, this approach enables retrieval of conceptually similar categories. Evaluation of neural activations indicates that they form coherent patterns that may be useful for building user interfaces for navigation over category structures.
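A minimal pure-Python Self Organizing Map illustrates how similar inputs land on nearby neurons; this is a sketch on 2-D toy vectors, whereas the actual system maps high-dimensional Wikipedia category descriptors:

```python
import math
import random

def bmu(grid, x):
    """Grid coordinates of the best matching unit for input x."""
    return min(((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
               key=lambda rc: sum((grid[rc[0]][rc[1]][i] - x[i]) ** 2
                                  for i in range(len(x))))

def train_som(data, rows, cols, iters=300, lr0=0.5, seed=0):
    """Train a tiny Self Organizing Map on the vectors in `data`."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1.0 - frac)              # decaying learning rate
        radius = radius0 * (1.0 - frac) + 0.5  # shrinking neighbourhood
        x = rng.choice(data)
        br, bc = bmu(grid, x)
        # Pull the BMU and its grid neighbours toward the sample
        for r in range(rows):
            for c in range(cols):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                if d2 <= radius ** 2:
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return grid

# Two well-separated toy "category" clusters
data = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0]]
grid = train_som(data, rows=4, cols=4)
```

After training, conceptually similar categories activate the same or neighbouring neurons, which is what makes neuron selection usable for retrieving related categories.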

    Context Search Algorithm for Lexical Knowledge Acquisition

    This work was supported by the Polish Committee for Scientific Research, grant N516 035 31/3499.

    A Context Search algorithm used for lexical knowledge acquisition is presented. Knowledge representation based on psycholinguistic theories of cognitive processes allows for the implementation of a computational model of semantic memory in the form of a semantic network. Knowledge acquisition using supervised dialog templates has been performed in a word game designed to guess the concept a human user is thinking about. The game, which has been implemented on a web server, demonstrates elementary linguistic competencies based on lexical knowledge stored in semantic memory, enabling at the same time acquisition and validation of knowledge. Possible applications of the algorithm in the domains of medical diagnosis and information retrieval are sketched.
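The question-selection step of such a guessing game can be sketched as choosing the feature whose yes/no answer splits the remaining candidates most evenly, i.e. maximizes entropy; the concepts and features below are hypothetical stand-ins for the semantic memory:

```python
import math

# Hypothetical semantic-memory fragment: concept -> feature truth values
memory = {
    "dog":    {"is_animal": True,  "barks": True,  "meows": False},
    "cat":    {"is_animal": True,  "barks": False, "meows": True},
    "hammer": {"is_animal": False, "barks": False, "meows": False},
    "saw":    {"is_animal": False, "barks": False, "meows": False},
}

def entropy(n_yes, n_no):
    """Binary entropy of a yes/no split."""
    total = n_yes + n_no
    h = 0.0
    for n in (n_yes, n_no):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def best_question(candidates, memory):
    """Pick the feature whose answer splits candidates most evenly."""
    features = {f for c in candidates for f in memory[c]}
    def split_entropy(f):
        yes = sum(memory[c][f] for c in candidates)
        return entropy(yes, len(candidates) - yes)
    return max(features, key=split_entropy)

candidates = list(memory)
q = best_question(candidates, memory)          # ask about this feature
remaining = [c for c in candidates if memory[c][q]]  # after a "yes" answer
```

Each answer halves (at best) the candidate set, which is why a well-chosen sequence of about twenty questions can separate thousands of concepts.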

    Wyszukiwanie kontekstowe w pamięci semantycznej (Context Search in Semantic Memory)


    Knowledge representation and acquisition for large-scale semantic memory

    Acquisition and representation of semantic concepts is a necessary requirement for the understanding of natural languages by cognitive systems. Word games provide an interesting opportunity for semantic knowledge acquisition that may be used to construct semantic memory. A task-dependent architecture of the knowledge base, inspired by psycholinguistic theories of the human cognition process, is introduced. The core of the system is an algorithm for semantic search using a simplified vector representation of concepts. Based on this algorithm, a 20-questions game has been implemented. This implementation provides an example of an application of the semantic memory, but also allows for testing the linguistic competence of the system. A web portal with a Haptek-based talking-head interface facilitates acquisition of new knowledge while playing the game and engaging in dialogs with users.

    Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network

    No full text
    To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches based on neural networks. We describe in detail nine different algorithms used for text representation and evaluate them on five diverse datasets: BBCSport, BBC, Ohsumed, 20Newsgroups, and Reuters. The selected statistical models include Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF) weighting, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). For the second group, based on deep neural networks, Partition-Smooth Inverse Frequency (P-SIF), Doc2Vec-Distributed Bag of Words Paragraph Vector (Doc2Vec-DBoW), Doc2Vec-Distributed Memory Model of Paragraph Vectors (Doc2Vec-DM), Hierarchical Attention Network (HAN) and Longformer were selected. The text representation methods were benchmarked in the document classification task, with the BoW and TF-IDF models used as a baseline. Based on the identified weaknesses of the HAN method, an improvement in the form of a Hierarchical Weighted Attention Network (HWAN) was proposed. The incorporation of statistical features into HAN latent representations improves or provides comparable results on four out of five datasets. The article presents how the length of the processed text affects the results of the HAN and HWAN model variants.
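The BoW/TF-IDF baseline mentioned above can be sketched in a few lines; this is a minimal illustration of the weighting scheme, not the paper's implementation:

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF vectors for a list of tokenised documents.

    tf = term count / document length; idf = log(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vectors

docs = [
    "the match ended in a draw".split(),
    "the team won the match".split(),
    "parliament passed the budget".split(),
]
vecs = tfidf(docs)
```

Terms occurring in every document (like "the") receive zero weight, while terms concentrated in few documents are emphasized, which is what makes TF-IDF a strong baseline for document classification.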