
    Pillar 3 and Modelling of Stakeholders’ Behaviour at the Commercial Bank Website during the Recent Financial Crisis

    Abstract: The paper analyses domestic and foreign market participants' interest in the mandatory Basel 2, Pillar 3 information disclosure of a commercial bank during the recent financial crisis. The authors try to ascertain whether the purposes of the Basel 2 regulations under Pillar 3 (Market discipline), namely publishing financial and risk-related information, have been fulfilled. The paper therefore focuses on modelling visitors' behaviour at the commercial bank website where the information required by Basel 2 is available. The authors present a detailed analysis of the user log data stored by web servers. The analysis can help to better understand the rate of use of the mandatory and optional Pillar 3 disclosure web pages at the commercial bank website during the recent financial crisis in Slovakia. The authors used association rule analysis to identify associations among the content categories of the website. The results show that stakeholders' interest in the bank's mandatory disclosure of financial information is generally small. Foreign website visitors were more concerned with information disclosure according to Pillar 3 of the Basel 2 regulation and had less interest in general information about the bank than domestic ones.
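    A minimal sketch of the association rule step described above, assuming sessions have already been reconstructed from the log file and encoded as one-hot content-category flags. The category names, the toy data, the thresholds and the choice of mlxtend are illustrative assumptions, not the paper's exact setup.

    ```python
    # Association rule analysis over content categories of visitor sessions.
    # Category names, data and thresholds are illustrative only.
    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules

    # Each row is one session; each column flags whether a content category
    # (e.g. a Pillar 3 disclosure page) was visited during that session.
    sessions = pd.DataFrame([
        {"pillar3_mandatory": True,  "pillar3_optional": False, "general_info": True},
        {"pillar3_mandatory": True,  "pillar3_optional": True,  "general_info": False},
        {"pillar3_mandatory": False, "pillar3_optional": False, "general_info": True},
        {"pillar3_mandatory": True,  "pillar3_optional": True,  "general_info": False},
    ])

    # Frequent category sets first, then rules of the form X -> Y.
    frequent = apriori(sessions, min_support=0.25, use_colnames=True)
    rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence"]])
    ```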

    Conceptual framework for programming skills development based on microlearning and automated source code evaluation in virtual learning environment

    Understanding how software works and writing a program are currently frequent requirements when hiring employees. The complexity of learning programming often results in educational failures, student frustration and lack of motivation, because different students prefer different learning paths. Although e-learning courses have led to many improvements in the methodology and the supporting technology for more effective programming learning, misunderstanding of programming principles is one of the main reasons for students leaving school early. Universities face a challenging task: how to harmonise the education of students focused on advanced knowledge in the development of software applications with the education of students for whom writing code is a new skill. The article proposes a conceptual framework focused on the comprehensive training of future programmers using microlearning and automatic evaluation of source code to give students immediate feedback. The framework is designed to involve students in the development of the virtual learning environment software that will provide their education, thus ensuring the sustainability of the environment in line with modern development trends. The final part of the paper is devoted to verifying the contribution of the presented elements through quantitative research on the introductory parts of the framework. It turned out that although the application of interactive features did not lead to significant measurable progress during the first semester of study, it significantly improved the results of students in subsequent courses focused on advanced programming.
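    The automated source code evaluation with immediate feedback can be illustrated by a small test-case harness. This is a hedged sketch of the general idea only; the task, function names and feedback format are invented for the example and are not the framework's actual interface.

    ```python
    # A toy harness for automated evaluation of a submitted source code
    # against predefined test cases; names and feedback format are invented.
    from dataclasses import dataclass

    @dataclass
    class TestCase:
        args: tuple       # positional arguments passed to the student's function
        expected: object  # the expected return value

    def evaluate_submission(source: str, func_name: str, tests: list[TestCase]) -> str:
        namespace: dict = {}
        try:
            exec(source, namespace)   # load the student's submission
            func = namespace[func_name]
        except Exception as exc:
            return f"Submission failed to load: {exc}"
        passed = 0
        for test in tests:
            try:
                if func(*test.args) == test.expected:
                    passed += 1
            except Exception:
                pass                  # a crashing test case counts as failed
        return f"{passed}/{len(tests)} tests passed"

    # Example microlearning task: "write a function that sums a list of numbers".
    student_code = "def total(xs):\n    return sum(xs)"
    print(evaluate_submission(student_code, "total",
                              [TestCase(([1, 2, 3],), 6), TestCase(([],), 0)]))
    ```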

    Coreference Resolution for Improving Performance Measures of Classification Tasks

    There are several possibilities to improve classification in natural language processing tasks. In this article, we focused on coreference resolution, applied to a manually annotated dataset of true and fake news. This dataset was used for the classification task of fake news detection. The research aimed to determine which is more effective: performing coreference resolution on the input data before classification, or classifying the data without coreference resolution. We also wanted to verify whether classifier performance metrics can be enhanced by incorporating coreference resolution into the data preparation process. A methodology was proposed in which we described the implementation in detail, starting from the identification of entity mentions in the text using the neuralcoref algorithm, then through text-representation models (TF–IDF, Doc2Vec), and finally to several machine learning methods. The result was a comparison of the implemented classifiers based on the performance metrics described in the theoretical part. The best accuracy was observed for the dataset with coreference resolution applied, with a median value of 0.8149, while the best F1 score had a median value of 0.8101. The more important finding, however, is that data processed with coreference resolution led to an improvement in the performance metrics of the classification tasks.
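    A minimal sketch of the described pipeline, assuming a list of labelled documents. neuralcoref is the library named in the abstract (it extends spaCy 2.x pipelines); the English model, the toy texts and the choice of logistic regression are illustrative assumptions, not the paper's exact configuration.

    ```python
    # Coreference resolution before vectorization and classification.
    # neuralcoref works with spaCy 2.x; texts and labels are illustrative.
    import spacy
    import neuralcoref
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    nlp = spacy.load("en_core_web_sm")
    neuralcoref.add_to_pipe(nlp)

    texts = [
        "The senator denied the claim. She called it fabricated.",  # true (toy)
        "Aliens built the pyramids. They hid the evidence later.",  # fake (toy)
    ]
    labels = [0, 1]  # 0 = true news, 1 = fake news

    # Step 1: rewrite each coreferring mention to its main mention,
    # e.g. "She" becomes "The senator".
    resolved = [nlp(text)._.coref_resolved for text in texts]

    # Step 2: vectorize the resolved texts and fit a classifier; the paper
    # compares several vectorizers (TF-IDF, Doc2Vec) and several classifiers.
    X = TfidfVectorizer().fit_transform(resolved)
    clf = LogisticRegression().fit(X, labels)
    ```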

    Improvement of Misleading and Fake News Classification for Flective Languages by Morphological Group Analysis

    Due to constantly evolving social media and different types of information sources, we face various kinds of fake news and misinformation. We are currently working on a project to identify applicable methods for detecting fake news in flective languages. In the presented research we explored different approaches to fake news detection based on morphological analysis, one of the basic components of natural language processing. The aim of the article is to find out whether the methods of dataset preparation can be improved using morphological analysis. We collected our own unique dataset, which consisted of articles from verified publishers and articles from news portals known as publishers of fake and misleading news. The articles were in Slovak, which belongs to the flective type of languages. We explored different approaches to dataset preparation based on morphological analysis; the prepared datasets were the input data for creating the classifier of fake and real news. We selected decision trees for classification. The two preparation methods were evaluated by the performance of the resulting classifier. We found a suitable dataset pre-processing technique based on morphological group analysis. This technique could be used for improving fake news classification.
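    A minimal sketch of the morphological-group idea under stated assumptions: real Slovak morphological analysis requires an external tagger, so morph_tag() below is a hypothetical stand-in, and the toy texts and labels are invented for the example.

    ```python
    # Dataset preparation by morphological groups before decision-tree
    # classification. morph_tag() is a hypothetical stand-in for a real
    # Slovak morphological analyser.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.tree import DecisionTreeClassifier

    def morph_tag(token: str) -> str:
        # Hypothetical placeholder: a real implementation would query a
        # morphological analyser and return its tag for the token.
        return "NOUN" if token.istitle() else "VERB"

    def to_morph_groups(text: str) -> str:
        # Replace every token with its morphological group, so the classifier
        # learns from grammatical structure rather than from vocabulary.
        return " ".join(morph_tag(token) for token in text.split())

    texts = ["Vláda schválila nový zákon", "Tajní agenti ovládajú svet"]  # toy
    labels = [0, 1]  # 0 = real news, 1 = fake news

    grouped = [to_morph_groups(text) for text in texts]
    X = CountVectorizer().fit_transform(grouped)
    clf = DecisionTreeClassifier(max_depth=5).fit(X, labels)
    ```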

    Framework for e-Learning Materials Optimization

    Creating educational materials (activities, e-books, etc.) in an e-learning course can be divided into two main parts. The first can be defined as a compilation of the ideas and information that we want to pass on to the student. This part of the process of building e-learning materials is very abstract, and the correct selection of what we want to teach the students is highly delicate and depends on the teacher's skills and didactic principles. The second phase is also important, but it can be formalized. The main aim of this paper is to define and confirm a set of formal rules compiled into a framework which can be used as a tool for building e-learning materials. We assume that the rules presented in this paper can be used for any e-learning platform. To confirm the validity of the defined rules, we integrated them into a module in LMS Moodle, and part of this paper proposes an experiment carried out on the same platform.
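    One possible reading of such formal rules is a set of automatic checks run over a material before publication. The concrete rules below (length limits, required sections) are purely hypothetical illustrations; the paper's framework defines its own rule set and integrates it as a module in LMS Moodle.

    ```python
    # Hypothetical automatic checks implementing a few illustrative formal
    # rules for an e-learning material; the real framework defines its own.
    def check_material(material: dict) -> list[str]:
        problems = []
        if len(material.get("title", "")) > 80:
            problems.append("Title exceeds 80 characters.")
        if "objectives" not in material:
            problems.append("Learning objectives section is missing.")
        if len(material.get("body", "").split()) > 600:
            problems.append("Body exceeds 600 words; consider splitting it.")
        return problems

    lesson = {"title": "Loops in Python", "body": "A loop repeats a block of code."}
    print(check_material(lesson))  # -> ['Learning objectives section is missing.']
    ```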

    Using n-grams of Morphological Tags for Fake News Classification

    Research into techniques for effective fake news detection has become very necessary and attractive. These techniques have a background in many research disciplines, including morphological analysis. Several researchers stated that simple content-related n-grams and POS tagging had been proven insufficient for fake news classification. However, in the last decade they did not present any empirical research results which could confirm these statements experimentally. Considering this contradiction, the main aim of the paper is to experimentally evaluate the potential of the combined use of n-grams and POS tags for the correct classification of fake and true news. The dataset of published fake and real news about the current Covid-19 pandemic was pre-processed using morphological analysis. As a result, n-grams of POS tags were prepared and further analysed. Three techniques based on POS tags were proposed and applied to different groups of n-grams in the pre-processing phase of fake news detection. The n-gram size was examined first. Subsequently, the most suitable depth of the decision trees for sufficient generalisation was determined. Finally, the performance measures of models based on the proposed techniques were compared with the standardised reference TF-IDF technique. The performance measures of the model, such as accuracy, precision, recall and F1 score, are considered, together with 10-fold cross-validation. Simultaneously, the question whether the TF-IDF technique can be improved using POS tags was researched in detail. The results showed that the newly proposed techniques are comparable with the traditional TF-IDF technique. At the same time, it can be stated that morphological analysis can improve the baseline TF-IDF technique. As a result, the performance measures of the model, precision for fake news and recall for real news, were statistically significantly improved.
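    A minimal sketch of the POS n-gram technique, assuming each document has already been converted by a morphological analyser into a space-separated sequence of POS tags; the tag sequences, labels and parameter values are illustrative.

    ```python
    # POS n-grams as classification features with a depth-limited tree.
    # The tag sequences and labels below are toy data.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    pos_sequences = [
        "NOUN VERB ADJ NOUN", "NOUN VERB NOUN ADV",  # real news (toy)
        "ADJ ADJ NOUN VERB", "VERB ADV ADJ NOUN",    # fake news (toy)
    ]
    labels = [0, 0, 1, 1]

    # Bigrams of POS tags instead of word n-grams; the paper examines
    # several n-gram sizes.
    X = CountVectorizer(ngram_range=(2, 2)).fit_transform(pos_sequences)

    # A limited tree depth for sufficient generalisation, as examined in
    # the paper. The paper uses 10-fold cross-validation on the full
    # dataset; cv=2 here only because the toy sample is tiny.
    clf = DecisionTreeClassifier(max_depth=4)
    print(cross_val_score(clf, X, labels, cv=2, scoring="f1"))
    ```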

    Methodology Design for Data Preparation in the Process of Discovering Patterns of Web Users' Behaviour

    Abstract: Discovering behaviour patterns of website visitors is one of the most common applications of web log mining. Based on the discovered users' behaviour patterns, it is possible to restructure the examined website, portal or other web-based system, or, in combination with other knowledge, to personalize it. Data preparation represents the first inevitable step in the process of discovering users' behavioural patterns. In this paper we summarize the results of our previous research, where we carefully examined the relevance of the individual steps of preparing data from a web server log file and a virtual learning environment for further analysis. The aim of our experiments was to find out to what extent it is necessary to carry out the time-consuming data preparation in the process of discovering patterns of behaviour of web users, and to determine the inevitable steps for obtaining reliable data from different types of log files. Considering the obtained results, we propose a methodology for data preparation in the process of discovering patterns of web user behaviour. The research results showed that in the case of systems providing sophisticated navigation options and a rigid content structure (which is characteristic of most virtual learning environments), path completion is not an inevitable step of data preparation in the process of discovering patterns of web users' behaviour.
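    A minimal sketch of typical early data-preparation steps on a web server log in Common Log Format; the regular expression and the filtering rules are illustrative assumptions, not the exact steps of the proposed methodology.

    ```python
    # Early data-preparation steps on a Common Log Format record: parse,
    # drop malformed lines, unsuccessful requests and embedded objects.
    import re

    LOG_LINE = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    )

    def clean_log(lines):
        for line in lines:
            match = LOG_LINE.match(line)
            if not match:
                continue  # malformed record
            record = match.groupdict()
            if record["status"] != "200":
                continue  # keep successful requests only
            if record["url"].endswith((".css", ".js", ".png", ".gif", ".ico")):
                continue  # drop embedded objects such as styles and images
            yield record

    sample = ['192.0.2.1 - - [10/Oct/2020:13:55:36 +0200] '
              '"GET /course/view.php?id=7 HTTP/1.1" 200 2326']
    print(list(clean_log(sample)))
    ```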

    Data pre-processing for web log mining: Case study of commercial bank website usage analysis

    We use data cleaning, integration, reduction and data conversion methods at the pre-processing level of data analysis. Data pre-processing techniques improve the overall quality of the patterns mined. The paper describes the use of standard pre-processing methods for preparing data of the commercial bank website in the form of a log file obtained from the web server. Data cleaning, although the simplest step of data pre-processing, is non-trivial here because the analysed content is highly specific. We had to deal with frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make the use of a sitemap impossible, so we present approaches to deal with this problem; we were able to create the sitemap dynamically, based only on the content of the log file. In this case study we also examined just one part of the website, rather than performing the standard analysis of an entire website, because for security reasons we did not have access to all log files. As a result, the traditional practices had to be adapted to this special case. Analysing just a small fraction of the website resulted in short session times of regular visitors, so we were not able to use the recommended methods to determine the optimal value of the session time. Therefore, in this paper we propose new methods based on outlier identification for raising the accuracy of the session length.
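    A minimal sketch of the outlier-based idea for choosing a session cut-off, assuming Unix-second timestamps of one visitor's successive requests. The IQR rule is one common outlier criterion and is used here only for illustration; the paper's own methods may differ.

    ```python
    # Deriving a session cut-off from outlier identification instead of a
    # fixed 30-minute rule; timestamps are Unix seconds of one visitor's
    # successive requests.
    import numpy as np

    def session_threshold(timestamps: np.ndarray) -> float:
        gaps = np.diff(np.sort(timestamps))
        q1, q3 = np.percentile(gaps, [25, 75])
        return q3 + 1.5 * (q3 - q1)  # gaps above this are session breaks

    def split_sessions(timestamps: np.ndarray) -> list:
        ts = np.sort(timestamps)
        breaks = np.where(np.diff(ts) > session_threshold(ts))[0] + 1
        return np.split(ts, breaks)

    visits = np.array([0, 40, 95, 130, 4000, 4050, 4090])
    print(split_sessions(visits))  # two sessions separated by the long gap
    ```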

    Data advance preparation factors affecting results of sequence rule analysis in web log mining

    One of the main tasks of web log mining is discovering patterns of behaviour of portal visitors. Based on the found patterns of users' behaviour, which are represented by sequence rules, it is possible to modify and improve the web page of an organisation. This article aims to find out by means of an experiment to what degree it is necessary to carry out data preparation for web log mining, and to specify the inevitable steps for obtaining valid data from the log file. The results of the experiment are very important for a portal which is regularly analysed and modified, since they can prove the correctness of the individual steps of the analysis, or, through the identification of useless steps, they can simplify the advance preparation of the data. These results show that cleaning the data of crawler accesses has a significant impact on the quantity of extracted rules only in the case when we use the method of path completion. On the contrary, neither an impact on the reduction of the portion of inexplicable rules nor an impact on the quality of extracted rules in terms of their basic characteristics was proved. Path completion proved crucial in data preparation for web log mining: it was proved that path completion has a significant impact on both the quantity and the quality of extracted rules. However, it was also proved that taking the used browser into account when identifying sessions has no significant impact on either the quantity or the quality of extracted rules. There exist a number of models for the identification of users' sessions, which is crucial in data preparation; there exists, however, also a method which identifies them directly. Our next goal is to programme this functionality into the existing system and to analyse various parameters of the individual methods of session identification compared with the reference direct identification. The article also mentions the necessity of analysing web logs in real time, of reducing the time needed for the advance preparation of these logs and, at the same time, of increasing the accuracy of these data depending on the time of their collection.
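    A minimal sketch of the crawler-cleaning step whose impact the experiment measured. The bot signatures are a small illustrative subset, and the robots.txt heuristic is one common convention, not necessarily the exact procedure used.

    ```python
    # Cleaning crawler accesses from a log, one of the examined
    # data-preparation steps. Signatures are a small illustrative subset.
    BOT_SIGNATURES = ("googlebot", "bingbot", "slurp", "crawler", "spider")

    def is_crawler(user_agent: str, requested_robots_txt: bool = False) -> bool:
        ua = user_agent.lower()
        # Heuristic 1: a known bot signature in the User-Agent header.
        if any(signature in ua for signature in BOT_SIGNATURES):
            return True
        # Heuristic 2: regular visitors almost never request /robots.txt.
        return requested_robots_txt

    records = [
        {"ua": "Mozilla/5.0 (compatible; Googlebot/2.1)", "robots": True},
        {"ua": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", "robots": False},
    ]
    human_traffic = [r for r in records if not is_crawler(r["ua"], r["robots"])]
    print(len(human_traffic))  # -> 1
    ```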