14 research outputs found

    Emotion detection based on sentiment analysis : an example of a social robots on short and long texts conversation

    PURPOSE: The aim of this paper is to present a solution for detecting emotions in text obtained from a conversation with a social robot. Emotions are detected using sentiment analysis based on English and Polish lexicons. DESIGN/METHODOLOGY/APPROACH: Data from social robot conversation records are converted into text and then split into short and long utterances. The original-language utterances are analysed using the Polish lexicon, while the translated texts are analysed using the English emotion lexicon. FINDINGS: The results indicate the same or a similar distribution of emotions when sentiment analysis is performed with the plWordNet and NRC lexicons. PRACTICAL IMPLICATIONS: The results can be used in further research on the creation and development of lexicons for a selected language. They are also applicable to implementing solutions that let social robots detect and respond to emotions in conversation. ORIGINALITY/VALUE: Analyses to date have mostly addressed textual analysis in English. The present study analyses Polish text and compares the results with those obtained for English texts. Analysing differences in the emotional sentiment of utterances may lead to more effective models built for the chosen language. This publication was supported under the Initiative of Excellence – Research University program implemented at the Silesian University of Technology, 2020–2022. Peer-reviewed.
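The lexicon-based approach described in the abstract above can be illustrated with a minimal sketch. It assumes a tiny, invented word-to-emotion lexicon (`TOY_LEXICON`) and a hypothetical helper `emotion_distribution`; it is not the authors' pipeline and does not reproduce the contents or format of the NRC or plWordNet resources.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> set of emotion labels). The entries are
# invented for illustration; a real study would load a full emotion lexicon.
TOY_LEXICON = {
    "happy": {"joy"},
    "glad": {"joy"},
    "afraid": {"fear"},
    "angry": {"anger"},
    "lost": {"sadness", "fear"},
}


def emotion_distribution(utterance, lexicon):
    """Return the relative frequency of each emotion matched in the utterance."""
    tokens = re.findall(r"\w+", utterance.lower())
    counts = Counter()
    for token in tokens:
        # Words missing from the lexicon contribute nothing.
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}


short_text = "I am happy but afraid"
print(emotion_distribution(short_text, TOY_LEXICON))
```

Running the same function over short and long utterances, once with a Polish lexicon and once with an English lexicon on translated text, would give the kind of per-emotion distributions the abstract compares.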

    A Comparison of Online Slavic Valency Dictionaries

    The paper compares the representation of (verbal) valency in five online valency dictionaries of Slavic languages: the Russian FrameBank, the Czech VALLEX, the Polish Walenty, and two online valency dictionaries of Croatian, CROVALLEX and e-Glava. The comparison covers the linguistic and lexicographic descriptions of valency; the computational basis of the dictionaries is not considered. FrameBank, VALLEX, Walenty, and CROVALLEX are compared in detail with e-Glava, an online valency dictionary of Croatian being developed at the Institute of Croatian Language and Linguistics, and at the end of the paper the similarities and differences among all five resources are described. The dictionaries differ in their treatment of perfective, imperfective, and reflexive verbs, in their treatment of adjuncts, in their starting point for understanding valency (semantic or syntactic), and in whether they record valency in the narrow sense (obligatory complements, optional complements, and adjuncts) or in the broad sense (syntactic alternations such as passive, reciprocity, or impersonality). The five dictionaries are based on various linguistic traditions, but some parts of their descriptions are nevertheless similar because of features common to Slavic languages. Because of the rich case systems, morphological description is indispensable in these valency dictionaries.

    Attempt to understand public-health relevant social dimensions of COVID-19 outbreak in Poland

    Recently, the whole of Europe, including Poland, has been significantly affected by COVID-19 and its social and economic consequences, which are already causing monthly losses of tens of billions of euros in Poland alone. Social behaviour has a fundamental impact on the dynamics of the spread of infectious diseases such as COVID-19, challenging the existing health infrastructure and social organization. Modelling and understanding the mechanisms of social behaviour (e.g. panic and social distancing), and placing them in the Polish context, can contribute to a better response to the outbreak at the national and local level. In the presented study we aim to investigate the impact of COVID-19 on society by: (i) measuring the relevant activity in internet news and social media; (ii) analysing attitudes and demographic patterns in Poland. Ultimately, we intend to apply a computational social science and digital epidemiology research approach to provide urgently needed information on social dynamics during the outbreak. This study is an ad hoc reaction only; our goal is to signal the main areas of possible future research and to cover issues directly or indirectly related to public health.

    Lexical platform – the first step towards user-centred integration of lexical resources

    The paper describes the Lexical Platform, a means for lightweight integration of independent lexical resources. Lexical resources (LRs) are represented as web components that implement a minimal set of predefined programming interfaces. These interfaces provide querying functionality and generate a simple, common presentation format. A common data format is therefore not needed, and the identity of the component LRs is preserved. Users can search, browse and navigate across resources on the basis of a limited set of anchor elements such as base form, word form and synset id.

    Słowa klucze kultury jako nazwy pojęć wyrazistych o wysokim stopniu utrwalenia a zagadnienia synonimii leksykalnej

    The paper discusses a method for discovering important concepts of culture by scrutinizing the most numerous sets of synonyms (which are treated as names of entrenched and salient concepts of the given language’s culture), and subsequently combining them into a set of cultural concept networks. The author focuses on the evolution of cultural concepts and on the linguistic material from th and th c. dictionaries of synonyms and from the Polish version of WordNet, which served as the basis for the analysis. The paper shows the evolution of some of the concepts of culture established in the th c. and still vivid in the present-day discourse of collective identity.

    An open stylometric system based on multilevel text analysis

    Stylometric techniques are usually applied to a limited number of typical tasks, such as authorship attribution, genre analysis, or gender studies. However, they could be applied to several tasks beyond this canonical set, if only stylometric tools were more accessible to users from different areas of the humanities and social sciences. This paper presents a general idea, followed by a fully functional prototype, of an open stylometric system whose wide use is facilitated by two aspects: technical flexibility and research flexibility. The system relies on a server installation combined with a web-based user interface, which frees the user from the necessity of installing any additional software. At the same time, the system offers a variety of ways in which the input texts can be analysed: these include not only the usual lexical level, but also deep-level linguistic features. This enables a range of possible applications, from typical stylometric tasks to the semantic analysis of text documents. The internal architecture of the system relies on several well-known software packages: a collection of language tools (for text pre-processing), Stylo (for stylometric analysis) and Cluto (for text clustering). The paper presents: (1) the idea behind the system from the user’s perspective; (2) the architecture of the system, with a focus on data processing; (3) features for text description; (4) the use of analytical systems such as Stylo and Cluto. The presentation is illustrated with example applications.

    An ontology for human-like interaction systems

    This report proposes and describes the development of a Ph.D. thesis aimed at building an ontological knowledge model supporting Human-Like Interaction systems. The main function of such a knowledge model in a human-like interaction system is to unify the representation of each concept, relating it to the appropriate terms as well as to other concepts with which it shares semantic relations. When developing human-like interactive systems, the inclusion of an ontological module can be valuable both for supporting interaction between participants and for enabling accurate cooperation among the diverse components of such a system. On the one hand, during human communication, the relation between cognition and messages relies on the formalization of concepts, linked to terms (or words) in a language that enables their utterance (at the expressive layer). Moreover, each participant has a unique conceptualization (ontology), different from that of other individuals. In interaction, it is the intersection of both parties' conceptualizations that enables communication. Therefore, for human-like interaction it is crucial to have a strong conceptualization, backed by a vast net of terms linked to its concepts, and the ability to map it to any interlocutor’s ontology to support denotation. On the other hand, the diverse knowledge models comprising a human-like interaction system (situation model, user model, dialogue model, etc.) and its interface components (natural language processor, voice recognizer, gesture processor, etc.) continuously exchange information during operation. They are also required to share a solid base of references to concepts, which provides consistency, completeness and quality in their processing. Besides, humans usually handle a certain range of similar concepts they can use when building messages. The subject of similarity has been, and continues to be, widely studied in computer science, psychology and sociolinguistics. Good similarity measures are necessary for several techniques from these fields, such as information retrieval, clustering, data mining, sense disambiguation, ontology translation and automatic schema matching. Furthermore, the ontological component should also be able to perform certain inferential processes, such as the calculation of semantic similarity between concepts. The principal benefit of this procedure is the ability to substitute one concept for another based on a calculation of their similarity, given specific circumstances. From the human’s perspective, the procedure makes it possible to refer to a given concept in cases where the interlocutor either does not know the term(s) initially used to refer to that concept, or does not know the concept itself. In the first case, synonyms suffice, while in the second it is necessary to refer to the concept through other similar (semantically related) concepts...
    Official Doctoral Programme in Computer Science and Technology (Programa Oficial de Doctorado en Ciencia y Tecnología Informática). Secretary: Inés María Galván León. Secretary: José María Cavero Barca. Committee member: Yolanda García Rui
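The idea of computing semantic similarity between concepts, and of substituting one concept for a close neighbour, can be sketched with a simple path-based measure over an is-a hierarchy. This is one common textbook measure, chosen here only for illustration; the thesis abstract does not say which measure it uses. The hierarchy `IS_A` and the helper names are hypothetical.

```python
from collections import deque

# Hypothetical toy is-a hierarchy (child -> parent), invented for illustration.
IS_A = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def build_graph(is_a):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, parent in is_a.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def path_similarity(a, b, graph):
    """Return 1 / (1 + shortest path length), or 0.0 if no path exists."""
    if a not in graph or b not in graph:
        return 0.0
    queue = deque([(a, 0)])
    seen = {a}
    while queue:  # breadth-first search finds the shortest path first
        node, dist = queue.popleft()
        if node == b:
            return 1.0 / (1 + dist)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return 0.0


GRAPH = build_graph(IS_A)
# Pick the closest substitute for "dog" among the other leaf concepts.
candidates = ["cat", "sparrow"]
print(max(candidates, key=lambda c: path_similarity("dog", c, GRAPH)))
```

Under this measure "cat" is closer to "dog" than "sparrow" is, so it would be the preferred stand-in when the interlocutor does not know the original concept.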

    Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources

    Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet, a very large wordnet for Polish, as a central structure that serves as a basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in a traditional lexicon can be compared with text contexts using Lesk’s algorithm in order to find the best-matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on PageRank. In this method, text words are mapped onto their senses in the plWordNet graph and the PageRank algorithm is run to find the senses with the highest scores. The method produces results that are lower than, but comparable to, those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and the limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with a mapping onto SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on combined networks of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and then plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined into one large network that is then used for the disambiguation of senses. The additional knowledge sources used in WSD improved performance. The obtained results and potential further lines of development are discussed.
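The gloss-overlap idea behind Lesk’s algorithm, mentioned in the abstract above, can be shown with a minimal sketch of the simplified Lesk variant. The sense identifiers, glosses and stopword list below are invented English examples, not plWordNet data, and the function is a hypothetical helper rather than the method evaluated in the paper.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def simplified_lesk(context, sense_glosses, stopwords=frozenset()):
    """Pick the sense whose gloss shares the most words with the context.

    Ties are resolved in favour of the sense listed first.
    """
    context_words = tokenize(context) - stopwords
    best_sense, best_overlap = None, -1
    for sense_id, gloss in sense_glosses.items():
        overlap = len(context_words & (tokenize(gloss) - stopwords))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


# Hypothetical glosses for two senses of "bank".
BANK_SENSES = {
    "bank.1": "a financial institution that accepts deposits of money",
    "bank.2": "sloping land beside a river or lake",
}
STOPWORDS = frozenset({"a", "the", "of", "on", "or", "and", "that"})

sentence = "She sat on the bank of the river and watched the water"
print(simplified_lesk(sentence, BANK_SENSES, STOPWORDS))
```

A graph-based method such as the PageRank approach described in the abstract replaces this gloss overlap with scores propagated over wordnet relations, which is why fine-grained senses and sparse cross-part-of-speech links affect it.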