48 research outputs found

    Data Mining Module

    This thesis addresses knowledge discovery in databases (KDD), in particular classification with Support Vector Machines (SVM). A KDD system with a modular structure is being developed at FIT BUT. The data mining process is described in the DMSL language. The goal of the thesis was to extend DMSL to meet the needs of an SVM classifier, and to design, implement, and test an SVM module for this system.

    Ouroboros: early identification of at-risk students without models based on legacy data

    This paper focuses on the problem of identifying students who are at risk of failing their course. The presented method proposes a solution in the absence of data from previous courses, which are usually used for training machine learning models. This situation typically occurs in new courses. We present the concept of a "self-learner" that builds the machine learning models from the data generated during the current course. The approach utilises information about already submitted assessments, which introduces the problem of imbalanced data for training and testing the classification models. There are three main contributions of this paper: (1) the concept of training the models for identifying at-risk students using data from the current course, (2) specifying the problem as a classification task, and (3) tackling the challenge of imbalanced data, which appears in both the training and the testing data. The results are compared with the traditional approach of learning the models from legacy course data, validating the proposed concept.

    Cluster Analysis Module of a Data Mining System

    This master's thesis deals with the design and implementation of a cluster analysis module for DataMiner, a data mining system under development at FIT BUT. Until now, the system lacked a cluster analysis module, so the main objective of the thesis was to extend the system with cluster analysis algorithms. I worked on the module together with Pavel Riedl. Together we created a common part shared by all the algorithms, so that the system can be easily extended with further clustering algorithms. On my own, I extended the module with three density-based clustering algorithms: DBSCAN, OPTICS, and DENCLUE. The algorithms were implemented and their functionality was verified on a suitable data sample.

    Investigating Influence of Demographic Factors on Study Recommenders

    Recommender systems in e-learning platforms can utilise various data about learners in order to provide them with the next best material to study. We build on our previous work, which defines the recommendations in terms of two measures (i.e. relevance and effort) calculated from data of successful students in the previous runs of the courses. In this paper we investigate the impact of students' socio-demographic factors and analyse how these factors improve the recommendations. Education and age were found to have a significant impact on engagement with materials.

    Measures for recommendations based on past students' activity

    This paper introduces two measures for the recommendation of study materials based on students' past study activity. We use records from the Virtual Learning Environment (VLE) and analyse the activity of previous students. We assume that the activity of past students represents patterns that can be used as a basis for recommendations to current students. The measures we define are Relevance, which describes the expected VLE activity derived from previous students of the course, and Effort, which represents the actual effort of individual current students. Based on these measures, we propose a composite measure, which we call Importance. We use data from the previous course presentations to evaluate the consistency of students' behaviour. We use the correlation of the defined measures Relevance and Average Effort to evaluate the behaviour of two different student cohorts, and the Root Mean Square Error to measure the deviation between Average Effort and individual student Effort.
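
    The two evaluation statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the formulas for Relevance or Effort, so the per-material values below are hypothetical placeholders, and the choice of Pearson correlation is an assumption (the abstract says only "correlation").

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def rmse(xs, ys):
    """Root Mean Square Error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical values, one entry per study material (not from the paper).
relevance = [0.1, 0.4, 0.3, 0.2]       # assumed Relevance from a past cohort
average_effort = [0.2, 0.8, 0.6, 0.4]  # assumed Average Effort of the current cohort
student_effort = [0.2, 0.8, 0.6, 0.0]  # assumed Effort of one current student

cohort_correlation = pearson(relevance, average_effort)
student_deviation = rmse(average_effort, student_effort)
```

    In this sketch, a correlation close to 1 would indicate that the two cohorts distribute their activity across materials in a similar way, and a larger RMSE would indicate a student whose effort departs further from the cohort average.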