
Searching strategies for the Bulgarian language

Author: Savoy Jacques
Publication date: 18/06/2018

This paper reports on the underlying IR problems encountered when indexing and searching Bulgarian-language documents. For this language we propose a general light stemmer and demonstrate that it can be quite effective, producing significantly better MAP (around +34%) than an approach without stemming. We implement the GL2 model derived from the Divergence from Randomness paradigm and find its retrieval effectiveness better than that of other probabilistic, vector-space and language models. The resulting MAP is about 50% better than that of the classical tf-idf approach. Moreover, increasing the query size enhances the MAP by around 10% (from T to TD). To compare the retrieval effectiveness of our suggested stopword list and the light stemmer developed for the Bulgarian language, we conduct a set of experiments with another stopword list and with a more complex and aggressive stemmer. Results tend to indicate that there is no statistically significant difference between these variants and our suggested approach. This paper also evaluates other indexing strategies, such as 4-gram indexing and indexing based on the automatic decompounding of compound words. Finally, we analyze certain queries to discover why we obtained poor results when indexing Bulgarian documents using the suggested word-based approach.
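The MAP comparisons quoted in this abstract (for example, stemmed versus unstemmed indexing) can be illustrated with a minimal sketch. The document ids, queries, and rankings below are invented toy data, not results from the paper, and the function names are hypothetical; the code only shows how mean average precision and a relative gain in percent are typically computed.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of the per-query average precision over all judged queries."""
    scores = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


def relative_gain(baseline: float, improved: float) -> float:
    """Relative change in percent between two MAP values."""
    return 100.0 * (improved - baseline) / baseline


# Toy relevance judgments and two hypothetical runs (assumed data).
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
run_unstemmed = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d2"]}
run_stemmed = {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1"]}

map_unstemmed = mean_average_precision(run_unstemmed, qrels)
map_stemmed = mean_average_precision(run_stemmed, qrels)
gain = relative_gain(map_unstemmed, map_stemmed)
```

On real test collections the per-query scores would also feed a significance test, which this sketch leaves out.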

RERO DOC Digital Library

Multi-cultural Switzerland – multicultural public service media?

Author: Ratajczak Magdalena
Publication venue: 'Adam Mickiewicz University Poznan'
Publication date: 01/01/2014

This article pays special attention to the role of the public service broadcaster in culturally diverse societies. The author's main aim was to answer the following questions: How does the Swiss public service broadcaster (PSB) implement the principle of cultural pluralism, particularly with regard to media access for national and language communities and migrant minorities? Are all groups recognized by the public broadcaster in the same way? The second goal was to establish how the public broadcaster has adapted to the new situation and how the new ethnic groups are recognized by SRG SSR. The considerations relate to cultural pluralism, which assumes that the media provide a guarantee of cultural diversity in a society. This article is a case study of the Swiss public service broadcaster, SRG SSR idée suisse.

Adam Mickiewicz University Repository

Biblioteka Nauki - repozytorium artykułów

Przegląd Politologiczny

Repozytorium Uniwersytetu im. Adama Mickiewicza (AMUR)

Creating and testing specialized dictionaries for text analysis

Author: Fu Zun Yang Winson
Gunturu Srivinasa Murthy
Marcy William M.
Nalabandian Taleen
Pittman Jessica
Taraban Roman
Publication date: 07/04/2020

Practitioners in many domains (e.g., clinical psychologists, college instructors, researchers) collect written responses from clients. A well-developed method that has been applied to texts from sources like these is the computer application Linguistic Inquiry and Word Count (LIWC). LIWC uses the words in texts as cues to a person's thought processes, emotional states, intentions, and motivations. In the present study, we adopt analytic principles from LIWC and develop and test an alternative method of text analysis using naïve Bayes methods. We further show how output from the naïve Bayes analysis can be used to mark up student work in order to provide immediate, constructive feedback to students and instructors.
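As a rough illustration of the kind of naïve Bayes text analysis this abstract mentions, the sketch below implements a generic multinomial classifier with add-one (Laplace) smoothing. It is not the authors' method: the class name, the two labels, and the training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class NaiveBayesText:
    """Multinomial naive Bayes over word counts with additive smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha
        self.class_docs: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def fit(self, samples: List[Tuple[str, str]]) -> "NaiveBayesText":
        for text, label in samples:
            self.class_docs[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text: str) -> Dict[str, float]:
        """Unnormalized log posterior for each class."""
        total_docs = sum(self.class_docs.values())
        scores: Dict[str, float] = {}
        for label, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text: str) -> str:
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical labeled responses (assumed data, not from the study).
training = [
    ("i feel happy and proud", "positive"),
    ("great work i am glad", "positive"),
    ("i feel sad and worried", "negative"),
    ("this is awful i am upset", "negative"),
]
model = NaiveBayesText().fit(training)
label_a = model.predict("happy and glad")
label_b = model.predict("sad and upset")
```

The per-word log terms in `log_scores` are what a mark-up tool could surface to show which words pushed a response toward a category.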

Electronic Eastern European National University Institutional Repository

Subjectivity in contemporary visualization of reality: re-visiting Ottoman miniatures

Author: Germen Murat
Publication venue: BCS (British Computer Society)
Publication date: 01/01/2012

Though Ottoman miniatures are 2D representations, they carry the potential of conveying an individual's perception in a more detailed manner than 3D perspective renderings. In a typical two-vanishing-point perspective, objects and subjects drawn in the foreground hide those located behind them; this phenomenon is called occlusion. In Ottoman miniatures there is no occlusion: all object and subject illustrations are holistic, and there is no partial description of figures. Consequently, you end up with a life form that is the synthesis of individual forms, a sui generis state... This unique visual narrative can be extended to cubist works, where multifaceted descriptions are observed. Another advantage of Ottoman miniatures is that the hierarchies of image and image maker are quite clear. Miniatures make use of distance, void, shape, scale relationships and their layout to give a sense of depth in space. Though objectivity is very much valued in visual representation, ideal objectivity is not possible, since representations are created by subjects, and subjects belong to cultures that have different criteria for forming and perceiving portrayals. Moreover, the tools used for visual representation usually prove to be narrower than the scope of human perception. Departing from the point of view explained above, Muta-morphosis is a photography project created as an almost surreal visualization stemming from the real. The lack of a single perspectival structure, due to the multiplicity of perspectives after compressed panoramic imaging, can be linked to Ottoman miniatures, which in turn connects the global contemporary representation to its local traditional counterpart. Keywords: Ottoman miniature painting, contemporary photography, child drawings, visualization, representation, reality, documentary, subjectivity, objectivity, visual narration

Crossref

Sabanci University Research Database

A case study in decompounding for Bengali information retrieval

Author: A. Chen
C. Monz
D. Ganguly
M. Braschler
P. Koehn
S. Dasgupta
Publication date: 01/01/2013

Decompounding has been found to improve information retrieval (IR) effectiveness for compounding languages such as Dutch, German, or Finnish. No previous studies, however, exist on the effect of decomposing compounds in IR for Indian languages. In this case study, we investigate the effect of decompounding for Bengali, a highly agglutinative Indian language. Some unique characteristics of Bengali compounding are: i) only one constituent may be a valid word, in contrast to the stricter requirement of both being so; and ii) the first character of the right constituent can be modified by the rules of sandhi, in contrast to simple concatenation. While the standard approach to decompounding, based on maximizing the total frequency of the constituents formed by candidate split positions, has proven beneficial for European languages, the experiments reported in this paper show that this standard approach does not work particularly well for Bengali IR. As a solution, we first propose a more relaxed decompounding, in which a compound word can be decomposed into only one constituent if the other constituent is not a valid word. Second, we perform selective decompounding by employing a co-occurrence threshold to ensure that the constituent often co-occurs with the compound word, which in this case represents how related the constituents are to the compound. We perform experiments on Bengali ad-hoc IR collections from FIRE 2008 to 2012. Our experiments show that both the relaxed decomposition and the co-occurrence-based constituent selection prove more effective than the standard frequency-based decomposition, improving MAP by up to 2.72% and recall by up to 1.8%
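The three strategies named in this abstract (standard frequency-based splitting, relaxed splitting, and co-occurrence-based selection) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses an English toy vocabulary, ignores sandhi, and defines the co-occurrence score as the fraction of documents containing the compound that also contain the constituent. All function names, counts, and the threshold value are hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_splits(word: str, min_len: int = 3) -> List[Tuple[str, str]]:
    """All two-way splits whose parts each have at least min_len characters."""
    return [(word[:i], word[i:]) for i in range(min_len, len(word) - min_len + 1)]


def decompound(word: str, freq: Dict[str, int],
               relaxed: bool = False, min_len: int = 3) -> List[str]:
    """Pick the split maximizing the total frequency of its valid constituents.

    Standard mode requires both constituents to be known words; relaxed mode
    also accepts splits in which only one constituent is a known word.
    Returns an empty list when no acceptable split exists.
    """
    best_score = 0
    best_parts: List[str] = []
    for left, right in candidate_splits(word, min_len):
        parts = [p for p in (left, right) if freq.get(p, 0) > 0]
        if not parts or (len(parts) < 2 and not relaxed):
            continue
        score = sum(freq[p] for p in parts)
        if score > best_score:
            best_score, best_parts = score, parts
    return best_parts


def select_constituents(word: str, parts: List[str],
                        cooc: Dict[Tuple[str, str], int],
                        doc_freq: Dict[str, int],
                        threshold: float = 0.5) -> List[str]:
    """Keep constituents that co-occur with the compound often enough.

    Assumed score: documents containing both compound and constituent,
    divided by documents containing the compound.
    """
    n_word = doc_freq.get(word, 0)
    if n_word == 0:
        return []
    return [p for p in parts if cooc.get((word, p), 0) / n_word >= threshold]


# Toy collection statistics (assumed data).
freq = {"rain": 50, "bow": 20, "coat": 30}
doc_freq = {"raincoat": 10}
cooc = {("raincoat", "rain"): 8, ("raincoat", "coat"): 2}

standard = decompound("raincoat", freq)                   # both parts valid
strict_fail = decompound("rainish", freq)                 # "ish" is unknown
relaxed_ok = decompound("rainish", freq, relaxed=True)    # keeps only "rain"
selected = select_constituents("raincoat", standard, cooc, doc_freq)
```

In an IR pipeline the selected constituents would typically be indexed alongside the original compound rather than replacing it, though the abstract does not say which choice the authors made.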

Crossref

Irish Universities

DCU Online Research Access Service