
    Pillar 3 and Modelling of Stakeholders’ Behaviour at the Commercial Bank Website during the Recent Financial Crisis

    Abstract: The paper analyses domestic and foreign market participants' interest in the mandatory Basel 2, Pillar 3 information disclosure of a commercial bank during the recent financial crisis. The authors try to ascertain whether the purposes of the Basel 2 regulations under Pillar 3 (Market discipline), namely publishing financial and risk-related information, have been fulfilled. The paper therefore focuses on modelling visitors' behaviour at the commercial bank website where the information required by Basel 2 is available. The authors present a detailed analysis of the user log data stored by web servers. The analysis can help to better understand the rate of use of the mandatory and optional Pillar 3 disclosure web pages at the commercial bank website during the recent financial crisis in Slovakia. The authors used association rule analysis to identify associations among the content categories of the website. The results show that stakeholders' interest in the bank's mandatory disclosure of financial information is generally small. Foreign website visitors were more concerned with information disclosure according to Pillar 3 of the Basel 2 regulation and had less interest in general information about the bank than domestic ones.
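    A minimal sketch of the association rule step described above, assuming sessions have already been reconstructed from the log file and encoded as one-hot content-category flags. The category names, the toy data, the thresholds and the choice of mlxtend are illustrative assumptions, not the paper's exact setup.

    ```python
    # Association rule analysis over content categories of visitor sessions.
    # Category names, data and thresholds are illustrative only.
    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules

    # Each row is one session; each column flags whether a content category
    # (e.g. a Pillar 3 disclosure page) was visited during that session.
    sessions = pd.DataFrame([
        {"pillar3_mandatory": True,  "pillar3_optional": False, "general_info": True},
        {"pillar3_mandatory": True,  "pillar3_optional": True,  "general_info": False},
        {"pillar3_mandatory": False, "pillar3_optional": False, "general_info": True},
        {"pillar3_mandatory": True,  "pillar3_optional": True,  "general_info": False},
    ])

    # Frequent category sets first, then rules of the form X -> Y.
    frequent = apriori(sessions, min_support=0.25, use_colnames=True)
    rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence"]])
    ```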

    Conceptual framework for programming skills development based on microlearning and automated source code evaluation in virtual learning environment

    Understanding how software works and writing a program are currently frequent requirements when hiring employees. The complexity of learning programming often results in educational failures, student frustration and lack of motivation, because different students prefer different learning paths. Although e-learning courses have led to many improvements in the methodology and the supporting technology for more effective programming learning, misunderstanding of programming principles is one of the main reasons for students leaving school early. Universities face a challenging task: how to harmonise the education of students focused on advanced knowledge in the development of software applications with the education of students for whom writing code is a new skill. The article proposes a conceptual framework focused on the comprehensive training of future programmers using microlearning and automatic evaluation of source code to give students immediate feedback. The framework is designed to involve students in the development of the virtual learning environment software that will provide their education, thus ensuring the sustainability of the environment in line with modern development trends. The final part of the paper is devoted to verifying the contribution of the presented elements through quantitative research on the introductory parts of the framework. It turned out that although the application of interactive features did not lead to significant measurable progress during the first semester of study, it significantly improved the results of students in subsequent courses focused on advanced programming.
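    The automated source code evaluation with immediate feedback can be illustrated by a small test-case harness. This is a hedged sketch of the general idea only; the task, function names and feedback format are invented for the example and are not the framework's actual interface.

    ```python
    # A toy harness for automated evaluation of a submitted source code
    # against predefined test cases; names and feedback format are invented.
    from dataclasses import dataclass

    @dataclass
    class TestCase:
        args: tuple       # positional arguments passed to the student's function
        expected: object  # the expected return value

    def evaluate_submission(source: str, func_name: str, tests: list[TestCase]) -> str:
        namespace: dict = {}
        try:
            exec(source, namespace)   # load the student's submission
            func = namespace[func_name]
        except Exception as exc:
            return f"Submission failed to load: {exc}"
        passed = 0
        for test in tests:
            try:
                if func(*test.args) == test.expected:
                    passed += 1
            except Exception:
                pass                  # a crashing test case counts as failed
        return f"{passed}/{len(tests)} tests passed"

    # Example microlearning task: "write a function that sums a list of numbers".
    student_code = "def total(xs):\n    return sum(xs)"
    print(evaluate_submission(student_code, "total",
                              [TestCase(([1, 2, 3],), 6), TestCase(([],), 0)]))
    ```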

    Coreference Resolution for Improving Performance Measures of Classification Tasks

    There are several possibilities to improve classification in natural language processing tasks. In this article, we focused on coreference resolution, applied to a manually annotated dataset of true and fake news. This dataset was used for the classification task of fake news detection. The research aimed to determine which is more effective: performing coreference resolution on the input data before classification, or classifying the data without coreference resolution. We also wanted to verify whether classifier performance metrics can be enhanced by incorporating coreference resolution into the data preparation process. A methodology was proposed in which we described the implementation in detail, starting from the identification of entity mentions in the text using the neuralcoref algorithm, then through text-representation models (TF–IDF, Doc2Vec), and finally to several machine learning methods. The result was a comparison of the implemented classifiers based on the performance metrics described in the theoretical part. The best accuracy was observed for the dataset with coreference resolution applied, with a median value of 0.8149, while the best F1 score had a median value of 0.8101. The more important finding, however, is that data processed with coreference resolution led to an improvement in the performance metrics of the classification tasks.
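    A minimal sketch of the described pipeline, assuming a list of labelled documents. neuralcoref is the library named in the abstract (it extends spaCy 2.x pipelines); the English model, the toy texts and the choice of logistic regression are illustrative assumptions, not the paper's exact configuration.

    ```python
    # Coreference resolution before vectorization and classification.
    # neuralcoref works with spaCy 2.x; texts and labels are illustrative.
    import spacy
    import neuralcoref
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    nlp = spacy.load("en_core_web_sm")
    neuralcoref.add_to_pipe(nlp)

    texts = [
        "The senator denied the claim. She called it fabricated.",  # true (toy)
        "Aliens built the pyramids. They hid the evidence later.",  # fake (toy)
    ]
    labels = [0, 1]  # 0 = true news, 1 = fake news

    # Step 1: rewrite each coreferring mention to its main mention,
    # e.g. "She" becomes "The senator".
    resolved = [nlp(text)._.coref_resolved for text in texts]

    # Step 2: vectorize the resolved texts and fit a classifier; the paper
    # compares several vectorizers (TF-IDF, Doc2Vec) and several classifiers.
    X = TfidfVectorizer().fit_transform(resolved)
    clf = LogisticRegression().fit(X, labels)
    ```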

    Improvement of Misleading and Fake News Classification for Flective Languages by Morphological Group Analysis

    Due to constantly evolving social media and different types of information sources, we face various kinds of fake news and misinformation. We are currently working on a project to identify applicable methods for detecting fake news in flective languages. In the presented research we explored different approaches to fake news detection based on morphological analysis, one of the basic components of natural language processing. The aim of the article is to find out whether the methods of dataset preparation can be improved using morphological analysis. We collected our own unique dataset, which consisted of articles from verified publishers and articles from news portals known as publishers of fake and misleading news. The articles were in Slovak, which belongs to the flective type of languages. We explored different approaches to dataset preparation based on morphological analysis; the prepared datasets were the input data for creating the classifier of fake and real news. We selected decision trees for classification. The two preparation methods were evaluated by the performance of the resulting classifier. We found a suitable dataset pre-processing technique based on morphological group analysis. This technique could be used for improving fake news classification.
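    A minimal sketch of the morphological-group idea under stated assumptions: real Slovak morphological analysis requires an external tagger, so morph_tag() below is a hypothetical stand-in, and the toy texts and labels are invented for the example.

    ```python
    # Dataset preparation by morphological groups before decision-tree
    # classification. morph_tag() is a hypothetical stand-in for a real
    # Slovak morphological analyser.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.tree import DecisionTreeClassifier

    def morph_tag(token: str) -> str:
        # Hypothetical placeholder: a real implementation would query a
        # morphological analyser and return its tag for the token.
        return "NOUN" if token.istitle() else "VERB"

    def to_morph_groups(text: str) -> str:
        # Replace every token with its morphological group, so the classifier
        # learns from grammatical structure rather than from vocabulary.
        return " ".join(morph_tag(token) for token in text.split())

    texts = ["Vláda schválila nový zákon", "Tajní agenti ovládajú svet"]  # toy
    labels = [0, 1]  # 0 = real news, 1 = fake news

    grouped = [to_morph_groups(text) for text in texts]
    X = CountVectorizer().fit_transform(grouped)
    clf = DecisionTreeClassifier(max_depth=5).fit(X, labels)
    ```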

    Framework for e-Learning Materials Optimization

    Creating educational materials (activities, e-books, etc.) in an e-learning course can be divided into two main parts. The first can be defined as a compilation of the ideas and information that we want to pass on to the student. This part of the process of building e-learning materials is very abstract, and the correct selection of what we want to teach the students is highly delicate and depends on the teacher's skills and didactic principles. The second phase is also important, but it can be formalized. The main aim of this paper is to define and confirm a set of formal rules compiled into a framework which can be used as a tool for building e-learning materials. We assume that the rules presented in this paper can be used for any e-learning platform. To confirm the validity of the defined rules, we integrated them into a module in LMS Moodle, and part of this paper proposes an experiment carried out on the same platform.
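    One possible reading of such formal rules is a set of automatic checks run over a material before publication. The concrete rules below (length limits, required sections) are purely hypothetical illustrations; the paper's framework defines its own rule set and integrates it as a module in LMS Moodle.

    ```python
    # Hypothetical automatic checks implementing a few illustrative formal
    # rules for an e-learning material; the real framework defines its own.
    def check_material(material: dict) -> list[str]:
        problems = []
        if len(material.get("title", "")) > 80:
            problems.append("Title exceeds 80 characters.")
        if "objectives" not in material:
            problems.append("Learning objectives section is missing.")
        if len(material.get("body", "").split()) > 600:
            problems.append("Body exceeds 600 words; consider splitting it.")
        return problems

    lesson = {"title": "Loops in Python", "body": "A loop repeats a block of code."}
    print(check_material(lesson))  # -> ['Learning objectives section is missing.']
    ```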

    Using n-grams of Morphological Tags for Fake News Classification

    Research into techniques for effective fake news detection has become very necessary and attractive. These techniques have a background in many research disciplines, including morphological analysis. Several researchers stated that simple content-related n-grams and POS tagging had been proven insufficient for fake news classification. However, in the last decade they did not present any empirical research results which could confirm these statements experimentally. Considering this contradiction, the main aim of the paper is to experimentally evaluate the potential of the combined use of n-grams and POS tags for the correct classification of fake and true news. The dataset of published fake and real news about the current Covid-19 pandemic was pre-processed using morphological analysis. As a result, n-grams of POS tags were prepared and further analysed. Three techniques based on POS tags were proposed and applied to different groups of n-grams in the pre-processing phase of fake news detection. The n-gram size was examined first. Subsequently, the most suitable depth of the decision trees for sufficient generalisation was determined. Finally, the performance measures of models based on the proposed techniques were compared with the standardised reference TF-IDF technique. The performance measures of the model, such as accuracy, precision, recall and F1 score, are considered, together with 10-fold cross-validation. Simultaneously, the question whether the TF-IDF technique can be improved using POS tags was researched in detail. The results showed that the newly proposed techniques are comparable with the traditional TF-IDF technique. At the same time, it can be stated that morphological analysis can improve the baseline TF-IDF technique. As a result, the performance measures of the model, precision for fake news and recall for real news, were statistically significantly improved.
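    A minimal sketch of the POS n-gram technique, assuming each document has already been converted by a morphological analyser into a space-separated sequence of POS tags; the tag sequences, labels and parameter values are illustrative.

    ```python
    # POS n-grams as classification features with a depth-limited tree.
    # The tag sequences and labels below are toy data.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    pos_sequences = [
        "NOUN VERB ADJ NOUN", "NOUN VERB NOUN ADV",  # real news (toy)
        "ADJ ADJ NOUN VERB", "VERB ADV ADJ NOUN",    # fake news (toy)
    ]
    labels = [0, 0, 1, 1]

    # Bigrams of POS tags instead of word n-grams; the paper examines
    # several n-gram sizes.
    X = CountVectorizer(ngram_range=(2, 2)).fit_transform(pos_sequences)

    # A limited tree depth for sufficient generalisation, as examined in
    # the paper. The paper uses 10-fold cross-validation on the full
    # dataset; cv=2 here only because the toy sample is tiny.
    clf = DecisionTreeClassifier(max_depth=4)
    print(cross_val_score(clf, X, labels, cv=2, scoring="f1"))
    ```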

    Methodology Design for Data Preparation in the Process of Discovering Patterns of Web Users' Behaviour

    Abstract: Discovering behaviour patterns of website visitors is one of the most common applications of web log mining. Based on the discovered users' behaviour patterns, it is possible to restructure the examined website, portal or other web-based system, or, in combination with other knowledge, to personalize it. Data preparation represents the first inevitable step in the process of discovering users' behavioural patterns. In this paper we summarize the results of our previous research, where we carefully examined the relevance of the individual steps of preparing data from a web server log file and a virtual learning environment for further analysis. The aim of our experiments was to find out to what extent it is necessary to carry out the time-consuming data preparation in the process of discovering patterns of behaviour of web users, and to determine the inevitable steps for obtaining reliable data from different types of log files. Considering the obtained results, we propose a methodology for data preparation in the process of discovering patterns of web user behaviour. The research results showed that in the case of systems providing sophisticated navigation options and a rigid content structure (which is characteristic of most virtual learning environments), path completion is not an inevitable step of data preparation in the process of discovering patterns of web users' behaviour.
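    A minimal sketch of typical early data-preparation steps on a web server log in Common Log Format; the regular expression and the filtering rules are illustrative assumptions, not the exact steps of the proposed methodology.

    ```python
    # Early data-preparation steps on a Common Log Format record: parse,
    # drop malformed lines, unsuccessful requests and embedded objects.
    import re

    LOG_LINE = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    )

    def clean_log(lines):
        for line in lines:
            match = LOG_LINE.match(line)
            if not match:
                continue  # malformed record
            record = match.groupdict()
            if record["status"] != "200":
                continue  # keep successful requests only
            if record["url"].endswith((".css", ".js", ".png", ".gif", ".ico")):
                continue  # drop embedded objects such as styles and images
            yield record

    sample = ['192.0.2.1 - - [10/Oct/2020:13:55:36 +0200] '
              '"GET /course/view.php?id=7 HTTP/1.1" 200 2326']
    print(list(clean_log(sample)))
    ```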

    Data pre-processing for web log mining: Case study of commercial bank website usage analysis

    We use data cleaning, integration, reduction and data conversion methods at the pre-processing level of data analysis. Data pre-processing techniques improve the overall quality of the patterns mined. The paper describes the use of standard pre-processing methods for preparing data of the commercial bank website in the form of a log file obtained from the web server. Data cleaning, although the simplest step of data pre-processing, is non-trivial here because the analysed content is highly specific. We had to deal with frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make the use of a sitemap impossible, so we present approaches to deal with this problem; we were able to create the sitemap dynamically, based only on the content of the log file. In this case study we also examined just one part of the website, rather than performing the standard analysis of an entire website, because for security reasons we did not have access to all log files. As a result, the traditional practices had to be adapted to this special case. Analysing just a small fraction of the website resulted in short session times of regular visitors, so we were not able to use the recommended methods to determine the optimal value of the session time. Therefore, in this paper we propose new methods based on outlier identification for raising the accuracy of the session length.
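    A minimal sketch of the outlier-based idea for choosing a session cut-off, assuming Unix-second timestamps of one visitor's successive requests. The IQR rule is one common outlier criterion and is used here only for illustration; the paper's own methods may differ.

    ```python
    # Deriving a session cut-off from outlier identification instead of a
    # fixed 30-minute rule; timestamps are Unix seconds of one visitor's
    # successive requests.
    import numpy as np

    def session_threshold(timestamps: np.ndarray) -> float:
        gaps = np.diff(np.sort(timestamps))
        q1, q3 = np.percentile(gaps, [25, 75])
        return q3 + 1.5 * (q3 - q1)  # gaps above this are session breaks

    def split_sessions(timestamps: np.ndarray) -> list:
        ts = np.sort(timestamps)
        breaks = np.where(np.diff(ts) > session_threshold(ts))[0] + 1
        return np.split(ts, breaks)

    visits = np.array([0, 40, 95, 130, 4000, 4050, 4090])
    print(split_sessions(visits))  # two sessions separated by the long gap
    ```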

    Data advance preparation factors affecting results of sequence rule analysis in web log mining

    One of the main tasks of web log mining is discovering patterns of behaviour of portal visitors. Based on the found patterns of users' behaviour, which are represented by sequence rules, it is possible to modify and improve the web page of an organisation. This article aims to find out by means of an experiment to what degree it is necessary to carry out data preparation for web log mining, and to specify the inevitable steps for obtaining valid data from the log file. The results of the experiment are very important for a portal which is regularly analysed and modified, since they can prove the correctness of the individual steps of the analysis, or, through the identification of useless steps, they can simplify the advance preparation of the data. These results show that cleaning the data of crawler accesses has a significant impact on the quantity of extracted rules only in the case when we use the method of path completion. On the contrary, neither an impact on the reduction of the portion of inexplicable rules nor an impact on the quality of extracted rules in terms of their basic characteristics was proved. Path completion proved crucial in data preparation for web log mining: it was proved that path completion has a significant impact on both the quantity and the quality of extracted rules. However, it was also proved that taking the used browser into account when identifying sessions has no significant impact on either the quantity or the quality of extracted rules. There exist a number of models for the identification of users' sessions, which is crucial in data preparation; there exists, however, also a method which identifies them directly. Our next goal is to programme this functionality into the existing system and to analyse various parameters of the individual methods of session identification compared with the reference direct identification. The article also mentions the necessity of analysing web logs in real time, of reducing the time needed for the advance preparation of these logs and, at the same time, of increasing the accuracy of these data depending on the time of their collection.
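    A minimal sketch of the crawler-cleaning step whose impact the experiment measured. The bot signatures are a small illustrative subset, and the robots.txt heuristic is one common convention, not necessarily the exact procedure used.

    ```python
    # Cleaning crawler accesses from a log, one of the examined
    # data-preparation steps. Signatures are a small illustrative subset.
    BOT_SIGNATURES = ("googlebot", "bingbot", "slurp", "crawler", "spider")

    def is_crawler(user_agent: str, requested_robots_txt: bool = False) -> bool:
        ua = user_agent.lower()
        # Heuristic 1: a known bot signature in the User-Agent header.
        if any(signature in ua for signature in BOT_SIGNATURES):
            return True
        # Heuristic 2: regular visitors almost never request /robots.txt.
        return requested_robots_txt

    records = [
        {"ua": "Mozilla/5.0 (compatible; Googlebot/2.1)", "robots": True},
        {"ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", "robots": False},
    ]
    human_traffic = [r for r in records if not is_crawler(r["ua"], r["robots"])]
    print(len(human_traffic))  # -> 1
    ```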