Pillar 3 and Modelling of Stakeholders’ Behaviour at the Commercial Bank Website during the Recent Financial Crisis
Abstract: The paper analyses domestic and foreign market participants' interest in the mandatory Basel 2, Pillar 3 information disclosure of a commercial bank during the recent financial crisis. The authors try to ascertain whether the purposes of the Basel 2 regulations under Pillar 3 (Market discipline), i.e. publishing financial and risk-related information, have been fulfilled. The paper therefore focuses on modelling visitors' behaviour at the commercial bank website where the information required by Basel 2 is available. The authors present a detailed analysis of the user log data stored by web servers. The analysis can help to better understand the rate of use of the mandatory and optional Pillar 3 information disclosure web pages at the commercial bank website during the recent financial crisis in Slovakia. The authors used association rule analysis to identify associations among the content categories of the website. The results show that stakeholders have, in general, little interest in the commercial bank's mandatory disclosure of financial information. Foreign website visitors were more concerned about information disclosure according to Pillar 3 of the Basel 2 regulation and showed less interest in general information about the bank than domestic ones.
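The association-rule step described above can be sketched in a few lines. This is a minimal, hypothetical illustration of support and confidence computed over sessions represented as sets of visited content categories; the category names, the pairwise-only rules and the threshold are assumptions for illustration, not the paper's actual implementation.

```python
from itertools import combinations
from collections import Counter

def pair_rules(sessions, min_support=0.1):
    """sessions: list of sets of content categories visited in one session.
    Returns (lhs, rhs, support, confidence) tuples for frequent pairs."""
    n = len(sessions)
    single = Counter()                    # how many sessions contain a category
    pair = Counter()                      # how many sessions contain a pair
    for s in sessions:
        cats = set(s)
        single.update(cats)
        pair.update(frozenset(p) for p in combinations(sorted(cats), 2))
    rules = []
    for p, c in pair.items():
        if c / n < min_support:
            continue                      # prune infrequent pairs
        a, b = sorted(p)
        rules.append((a, b, c / n, c / single[a]))   # rule a -> b
        rules.append((b, a, c / n, c / single[b]))   # rule b -> a
    return rules
```

With sessions such as `[{"pillar3", "general"}, {"pillar3"}, {"general"}]` this yields both directed rules between the two categories, each with support 1/3 and confidence 0.5.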
Conceptual framework for programming skills development based on microlearning and automated source code evaluation in virtual learning environment
Understanding how software works and writing a program are currently frequent requirements
when hiring employees. The complexity of learning programming often results in educational
failures, student frustration and lack of motivation, because different students prefer different learning
paths. Although e-learning courses have led to many improvements in the methodology and the
supporting technology for more effective programming learning, misunderstanding of programming
principles is one of the main reasons for students leaving school early. Universities face a challenging task: how to harmonise the education of students focusing on advanced knowledge in software application development with the education of students for whom writing code is a new skill. The article
proposes a conceptual framework focused on the comprehensive training of future programmers
using microlearning and automatic evaluation of source codes to achieve immediate feedback for
students. This framework is designed to involve students in the software development of virtual
learning environment software that will provide their education, thus ensuring the sustainability of
the environment in line with modern development trends. The paper’s final part is devoted to verifying
the contribution of the presented elements through quantitative research on the introductory
parts of the framework. The results showed that although the application of interactive features did not lead to significant measurable progress during the first semester of study, it significantly improved the students' results in subsequent courses focused on advanced programming.
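The automatic source-code evaluation with immediate feedback mentioned above can be sketched as follows. This is a hypothetical, minimal grader, not the framework's actual implementation: the function name, the test-case format and the feedback strings are illustrative, and a real grader would additionally sandbox and time-limit the student's code.

```python
def evaluate(submission_src, func_name, test_cases):
    """Run a student submission against instructor test cases.
    test_cases: list of (args_tuple, expected) pairs. Returns feedback strings."""
    ns = {}
    exec(submission_src, ns)              # execute the student's code in an isolated namespace
    func = ns[func_name]
    feedback = []
    for args, expected in test_cases:
        try:
            got = func(*args)
            feedback.append("OK" if got == expected
                            else f"expected {expected!r}, got {got!r}")
        except Exception as e:            # a crashing submission still gets feedback
            feedback.append(f"error: {e}")
    return feedback
```

For example, a submission defining `add` would receive "OK" for a passing case and a concrete mismatch message for a failing one, which is the kind of immediate feedback the framework aims at.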
Research and Design of a Routing Protocol in Large-Scale Wireless Sensor Networks
As one of the ten key technologies of the global future, wireless sensor networks integrate sensor technology, embedded computing, distributed information processing and ad hoc networking. They can sense, collect, process and transmit various kinds of information data within the network's deployment area in real time, and they have broad application prospects in military defence, biomedicine, environmental monitoring, disaster relief, counter-terrorism and remote control of hazardous areas. This thesis analyses the existing routing protocols for wireless sensor networks and designs a tree-based routing protocol for large-scale networks. The protocol forms routes according to node address information, which simplifies the lookup and maintenance of complex, redundant routing tables, saves unnecessary overhead, improves routing efficiency and achieves fast and effective data transmission. To support this routing protocol, the thesis proposes an adaptive dynamic address allocation algorithm, ADAR (AdaptiveDynamicAddre...). Degree: Master of Engineering. Department: School of Information Science and Technology, Department of Communication Engineering (Communication and Information Systems). Student ID: 2332007115216.
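The address-based forwarding idea behind such tree routing can be sketched as follows. This is a hypothetical illustration of routing over contiguous address blocks, not the ADAR algorithm itself: the block layout, the parent/child convention and all names are assumptions.

```python
def next_hop(node_addr, block, children, dest):
    """Decide the next hop for a packet in a tree-addressed network.
    block: (lo, hi) half-open address range owned by this node's subtree.
    children: list of (child_addr, (lo, hi)) address blocks."""
    lo, hi = block
    if dest == node_addr:
        return None                       # packet has arrived
    if lo <= dest < hi:                   # destination lies in our subtree
        for child_addr, (clo, chi) in children:
            if clo <= dest < chi:
                return child_addr         # descend into the matching branch
        raise ValueError("address not allocated in this subtree")
    return "parent"                       # otherwise route towards the root
```

Because each node only compares the destination against a handful of address ranges, no routing table has to be stored or maintained, which is the overhead saving the abstract refers to.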
Coreference Resolution for Improving Performance Measures of Classification Tasks
There are several possibilities to improve classification in natural language processing tasks. In this article, we focused on the issue of coreference resolution, applied to a manually annotated dataset of true and fake news. This dataset was used for the classification task of fake news detection. The research aimed to determine whether performing coreference resolution on the input data before classification, or classifying the data without coreference resolution, is more effective. We also wanted to verify whether it is possible to enhance classifier performance metrics by incorporating coreference resolution into the data preparation process. A methodology was proposed in which we described the implementation in detail, starting from the identification of entity mentions in the text using the neuralcoref algorithm, then through text-representation models (TF–IDF, Doc2Vec), and finally to several machine learning methods. The result was a comparison of the implemented classifiers based on the performance metrics described in the theoretical part. The best result for accuracy was observed for the dataset with coreference resolution applied, which had a median value of 0.8149, while for the F1 score the best result had a median value of 0.8101. The more important finding, however, is that processing the data with coreference resolution led to an improvement in performance metrics in the classification tasks.
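The TF–IDF representation used as one stage of the pipeline above can be sketched in pure Python. This is a minimal, smoothing-free sketch assuming the coreference-resolved text has already been tokenised; production pipelines would use a library vectoriser with smoothing and normalisation instead.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists (one list per document).
    Returns one sparse {term: weight} dict per document."""
    n = len(docs)
    df = Counter()                        # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        # raw term frequency times inverse document frequency, no smoothing
        vecs.append({t: (c / len(doc)) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs
```

Note that a term occurring in every document gets weight 0, which is exactly why resolving pronouns to their antecedents beforehand can change the resulting vectors: mentions that were scattered across "he", "she" and "it" collapse onto the entity's name.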
Improvement of Misleading and Fake News Classification for Flective Languages by Morphological Group Analysis
Due to constantly evolving social media and different types of information sources, we are facing various kinds of fake news and misinformation. We are currently working on a project to identify applicable methods of fake news detection for flective (inflectional) languages. In the presented research we explored different approaches to fake news detection based on morphological analysis, one of the basic components of natural language processing. The aim of the article is to find out whether it is possible to improve dataset preparation methods using morphological analysis. We collected our own unique dataset consisting of articles from verified publishers and articles from news portals known as publishers of fake and misleading news. The articles were in Slovak, which belongs to the flective type of languages. We explored different approaches to dataset preparation based on morphological analysis; the prepared datasets were the input data for creating a classifier of fake and real news. We selected decision trees for classification. The two different preparation methods were evaluated based on the success of the created classifier. We found a suitable dataset pre-processing technique based on morphological group analysis, which could be used to improve fake news classification.
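The "morphological group analysis" pre-processing can be sketched as collapsing full positional morphological tags into coarse groups before vectorisation. The sketch below is a hypothetical illustration: the mapping from leading tag letters to groups follows the Slovak National Corpus convention (S = noun, A = adjective, V = verb, D = adverb) as an assumption, and is not the paper's exact grouping.

```python
# Illustrative mapping from the first letter of a positional morphological
# tag to a coarse morphological group; unknown letters fall back to "other".
GROUPS = {"S": "noun", "A": "adjective", "V": "verb", "D": "adverb"}

def to_groups(tagged_tokens):
    """tagged_tokens: list of (word, positional_tag) pairs produced by a
    morphological analyser. Returns the coarse group sequence used as
    classifier input instead of the raw words."""
    return [GROUPS.get(tag[:1], "other") for word, tag in tagged_tokens]
```

Replacing inflected word forms with such groups sidesteps the huge surface-form vocabulary of an inflectional language, which is the motivation for this dataset preparation step.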
Framework for e-Learning Materials Optimization
Creating educational materials (activities, e-books etc.) in an e-learning course can be divided into two main parts. The first can be defined as the compilation of ideas and information that we want to pass on to the student. This part of the process of building e-learning materials is very abstract, and the correct selection of what we want to teach the students is highly delicate, depending on the teacher's skills and didactic principles. The second phase is also important, but it can be formalized. The main aim of this paper is to define and confirm a set of formal rules compiled into a framework which can be used as a tool for building e-learning materials. We assume that the rules presented in this paper can be used on any e-learning platform. To confirm the validity of the defined rules, we integrated them into a module in LMS Moodle, and part of this paper is a proposed experiment carried out on the same platform.
Using n-grams of morphological tags for fake news classification
Research of techniques for effective fake news detection has become much needed and attractive. These techniques have a background in many research disciplines, including morphological analysis. Several researchers have stated that simple content-related n-grams and POS tagging had been proven insufficient for fake news classification; however, no empirical research results confirming these statements experimentally have been presented in the last decade. Considering this contradiction, the main aim of the paper is to experimentally evaluate the potential of the combined use of n-grams and POS tags for the correct classification of fake and true news. A dataset of published fake and real news about the current Covid-19 pandemic was pre-processed using morphological analysis. As a result, n-grams of POS tags were prepared and further analysed. Three techniques based on POS tags were proposed and applied to different groups of n-grams in the pre-processing phase of fake news detection. The n-gram size was examined first. Subsequently, the most suitable depth of the decision trees for sufficient generalization was determined. Finally, the performance measures of models based on the proposed techniques were compared with the standardised reference TF-IDF technique. The performance measures of the models, such as accuracy, precision, recall and F1 score, are considered, together with 10-fold cross-validation. Simultaneously, the question of whether the TF-IDF technique can be improved using POS tags was researched in detail. The results showed that the newly proposed techniques are comparable with the traditional TF-IDF technique. At the same time, it can be stated that morphological analysis can improve the baseline TF-IDF technique.
As a result, the performance measures of the model, precision for fake news and recall for real news, were statistically significantly improved.
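The preparation of POS-tag n-grams described above can be sketched in one function. This is a minimal illustration assuming the POS tags have already been produced by a morphological analyser; joining each n-gram into a single token lets it be fed to a standard vectoriser such as TF-IDF.

```python
def pos_ngrams(tags, n):
    """Build n-grams over a sequence of POS tags, joined with '_'
    so each n-gram becomes a single vocabulary token."""
    return ["_".join(tags[i:i + n]) for i in range(len(tags) - n + 1)]
```

For instance, the tag sequence noun-verb-adjective-noun yields the bigram tokens `N_V`, `V_A` and `A_N`, and the examined "n-gram size" corresponds to the parameter `n`.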
Methodology Design for Data Preparation in the Process of Discovering Patterns of Web Users Behaviour (Natural Sciences Publishing Cor.)
Abstract: Discovering the behaviour patterns of website visitors is one of the most common applications of web log mining. Based on the discovered users' behaviour patterns, it is possible to restructure the examined website, portal or other web-based system, or, in combination with other knowledge, to personalize it. Data preparation represents the first, inevitable step in the process of discovering users' behavioural patterns. In this paper we summarize the results of our previous research, where we carefully examined the relevance of the individual steps of preparing data from a web server log file and a virtual learning environment for further analysis. The aim of our experiments was to find out to what extent it is necessary to realize the time-consuming data preparation in the process of discovering the behaviour patterns of web users, and to determine the inevitable steps for obtaining reliable data from different types of log files. Considering the obtained results, we propose a methodology for data preparation in the process of discovering web user behaviour patterns. The research results showed that in the case of systems providing sophisticated navigation options and a rigid content structure (which is characteristic of most virtual learning environments), path completion is not an inevitable step of data preparation in the process of discovering the behaviour patterns of web users.
Data pre-processing for web log mining: Case study of commercial bank website usage analysis
We use data cleaning, integration, reduction and data conversion methods at the pre-processing level of data analysis. Data pre-processing techniques improve the overall quality of the patterns mined. The paper describes the use of standard pre-processing methods for preparing the data of a commercial bank website in the form of a log file obtained from the web server. Data cleaning, as the simplest step of data pre-processing, is non-trivial here because the analysed content is highly specific. We had to deal with frequent changes of the content and even frequent changes of the structure; regular changes in the structure make the use of a sitemap impossible. We present approaches to deal with this problem, and we were able to create the sitemap dynamically, based only on the content of the log file. In this case study we also examined just one part of the website instead of performing the standard analysis of the entire website, as we did not have access to all log files for security reasons. As a result, the traditional practices had to be adapted to this special case. Analysing just a small fraction of the website resulted in short session times of regular visitors, so we were not able to use the recommended methods to determine the optimal value of the session time. Therefore, we propose new methods based on outlier identification to raise the accuracy of the session length.
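The outlier-based idea mentioned above can be sketched as follows: estimate a session timeout from the distribution of gaps between consecutive requests, treating unusually long gaps as session boundaries. The quartile method and the 1.5 multiplier below are common defaults for outlier detection, assumed for illustration, not the paper's exact procedure.

```python
def session_timeout(gaps):
    """gaps: list of inter-request gaps in seconds for one visitor.
    Returns a timeout above which a gap is treated as a session boundary."""
    s = sorted(gaps)
    q1 = s[len(s) // 4]                   # rough lower quartile
    q3 = s[(3 * len(s)) // 4]             # rough upper quartile
    return q3 + 1.5 * (q3 - q1)           # classic IQR outlier fence

def split_sessions(timestamps, timeout):
    """Split an ordered list of request timestamps into sessions."""
    sessions, current = [], [timestamps[0]]
    for prev, t in zip(timestamps, timestamps[1:]):
        if t - prev > timeout:            # gap is an outlier: new session starts
            sessions.append(current)
            current = []
        current.append(t)
    sessions.append(current)
    return sessions
```

Deriving the timeout from the visitor's own gap distribution, rather than using a fixed 30-minute cut-off, is what allows short sessions on a small website fraction to be segmented accurately.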
Data advance preparation factors affecting results of sequence rule analysis in web log mining
One of the main tasks of web log mining is discovering patterns in the behaviour of portal visitors. Based on the discovered patterns of user behaviour, which are represented by sequence rules, it is possible to modify and improve the web page of an organisation. This article aims to find out experimentally to what degree it is necessary to realize data preparation for web log mining, and to specify the inevitable steps for obtaining valid data from the log file. The results of the experiment are very important for the portal, which is regularly analysed and modified, since they can prove the correctness of the individual steps of the analysis or, through the identification of "useless" steps, simplify the advance preparation of data. The results show that cleaning the data of crawler accesses has a significant impact on the quantity of extracted rules only when the path completion method is used. On the contrary, neither an impact on the reduction of the portion of inexplicable rules nor an impact on the quality of extracted rules in terms of their basic characteristics was proved. Path completion proved crucial in data preparation for web log mining: it has a significant impact on both the quantity and the quality of extracted rules. However, taking the browser used into account when identifying sessions was shown to have no significant impact on either the quantity or the quality of extracted rules. A number of models exist for the identification of user sessions, which is crucial in data preparation; however, there is also a method that identifies them directly. Our next goal is to implement this functionality in the existing system and to analyse various parameters of the individual session identification methods against the reference direct identification. The article also notes the need to analyse web logs in real time, to reduce the time needed for the advance preparation of these logs and, at the same time, to increase the accuracy of these data depending on the time of their collection.
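The crawler-cleaning step examined above can be sketched as a user-agent filter over the log records. This is a hypothetical, minimal illustration: the signature list is a small sample of common bot markers, not an exhaustive registry, and the record layout is an assumption.

```python
import re

# A few common bot signatures found in user-agent strings; real pipelines
# match against maintained crawler lists, this sample is illustrative only.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

def clean_crawlers(records):
    """records: list of (ip, timestamp, url, user_agent) tuples parsed from
    the access log. Returns only the records made by human visitors."""
    return [r for r in records if not BOT_PATTERN.search(r[3])]
```

Dropping crawler accesses before path completion matters because a crawler walks pages in an order no human would, so its records would otherwise inflate the completed paths and the extracted sequence rules.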