Component Analysis of Adjectives in Luxembourgish for Detecting Sentiments

Authors: Gierschek Daniela; Schommer Christoph; Sirajzade Joshgun
Publication date: 01/05/2020

The aim of this paper is to investigate the role of Luxembourgish adjectives in expressing sentiment in user comments posted on rtl.lu (RTL is the abbreviation for Radio Television Lëtzebuerg). Alongside many other textual features or representations, adjectives can be used to detect sentiment, even at the sentence or comment level. In fact, they are by themselves one of the best ways to describe a sentiment, although other word classes such as nouns, verbs, adverbs or conjunctions can also be used for this purpose. The empirical part of this study focuses on a list of adjectives extracted from an annotated corpus. The corpus contains the part-of-speech tags of individual words and sentiment annotations at the adjective, sentence, and comment level. Suffixes of Luxembourgish adjectives such as -esch, -eg, -lech, -al, -el, -iv, -ent, -los, -bar and the prefix on- were explicitly investigated, with particular attention to their role in building a model using classical machine learning techniques. We also considered the interaction of adjectives with other grammatical means, especially other parts of speech, e.g. negations, which can completely reverse the meaning, and thus the sentiment, of an utterance.

Open Repository and Bibliography - Luxembourg

The LuNa Open Toolbox for the Luxembourgish Language

Authors: Schommer Christoph; Sirajzade Joshgun
Publication date: 01/01/2019

Despite some recent work, research on the processing of Luxembourgish is still largely in its infancy. While a rich variety of linguistic processing tools exists, especially for English, these software tools offer little support for the Luxembourgish language. LuNa (a Tool for Luxembourgish National Corpus) is an open toolbox that allows researchers to annotate a text corpus written in Luxembourgish and to build and query an annotated corpus. The aim of the paper is to demonstrate the components of the system and its use in machine learning applications such as topic modelling and sentiment detection. LuNa is based on an XML database to store the data and to define the XML schema, and it offers a graphical user interface (GUI) for linguistic data preparation such as tokenization, part-of-speech tagging, and morphological analysis, to name a few.

Open Repository and Bibliography - Luxembourg

Review of: Romain Hilgert (ed.): Michel Rodange, Renert: De Fuuss am Frack an a Maansgréisst. Komplett Editioun mat historeschen a politeschen Explicatioune, Lëtzebuerg: Éditions Guy Binsfeld, 2020, in: Hémecht, 2021, 3, p. 377-378.

Author: Sirajzade Joshgun
Publication date: 01/01/2021

Open Repository and Bibliography - Luxembourg

An Annotation Framework for Luxembourgish Sentiment Analysis

Authors: Gierschek Daniela; Schommer Christoph; Sirajzade Joshgun
Publication date: 01/05/2020

The aim of this paper is to present a framework developed for crowdsourcing sentiment annotation for the low-resource language Luxembourgish. Our tool is easily accessible through a web interface and lets several annotators perform sentence-level annotation in parallel. At the heart of our framework is an XML database, which serves as the central part linking several components. The corpus in the database consists of news articles and user comments. One of the components is LuNa, a tool for linguistic preprocessing of the data set. It tokenizes the text, splits it into sentences and assigns POS tags to the tokens. The preprocessed text is then stored in XML format in the database. The Sentiment Annotation Tool, which is browser-based, then enables the annotation of the split sentences from the database. The Sentiment Engine, a separate module, is trained with this material in order to annotate the whole data set and to analyze the sentiment of the comments over time and in relation to the news articles. The knowledge gained can in turn be used both to improve the sentiment classification and to understand the sentiment phenomenon from a linguistic point of view.

Open Repository and Bibliography - Luxembourg

A Temporal Warehouse for Modern Luxembourgish Text Collections

Authors: Gierschek Daniela; Gilles Peter; Purschke Christoph; Schommer Christoph; Sirajzade Joshgun
Publication date: 01/01/2019

Open Repository and Bibliography - Luxembourg

Deep Mining Covid-19 Literature

Authors: Bouvry Pascal; Schommer Christoph; Sirajzade Joshgun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

In this paper we investigate how scientific and medical papers about Covid-19 can be mined effectively. For this purpose we use the CORD19 dataset, a huge collection of all papers published about and around the SARS-CoV2 virus and the pandemic it caused. We discuss how classical text mining algorithms such as Latent Semantic Analysis (LSA) or its modern version Latent Dirichlet Allocation (LDA) can be used for this purpose, touch on more modern variants of these algorithms such as word2vec, which came with the deep learning wave, and show the advantages and disadvantages of each. We finish the paper by showing some topic examples from the corpus and answering questions such as which topics are the most prominent in the corpus or what percentage of the corpus is dedicated to them. We also discuss how topics around RNA research in connection with Covid-19 can be examined.

Open Repository and Bibliography - Luxembourg

Corpus based investigation of word formation affixes in the Luxembourgish language. Technical challenges and linguistic analysis

Author: Sirajzade Joshgun
Publication date: 01/01/2018

This article reports on compiling a corpus of Luxembourgish for the investigation of word formation. First, it gives an example of the benefits of using an annotated corpus to investigate the productivity of selected word formation affixes of Luxembourgish. Then it describes how this can be achieved from a technical point of view.

Open Repository and Bibliography - Luxembourg

About the morphological means of expressing the concept of gender in Old Turkic

Author: Sirajzade Joshgun
Publication date: 01/01/2004

Open Repository and Bibliography - Luxembourg

Designing of linguistic and literary tools for the Historical Critical Michel Rodange Portal

Author: Sirajzade Joshgun
Publication date: 01/01/2011

Open Repository and Bibliography - Luxembourg

Compiling Tools and Resources for Studying of Luxemburgish Language and beyond

Author: Sirajzade Joshgun
Publication date: 01/06/2016

Open Repository and Bibliography - Luxembourg