10 research outputs found

    Evidence of Students’ Academic Performance at the Federal College of Education Asaba Nigeria: Mining Education Data

    One main objective of higher education is to provide quality education to its students. One way to achieve the highest level of quality in the higher education system is to discover knowledge that supports prediction of student enrolment in a particular course, alienation of the traditional classroom teaching model, detection of unfair means used in online examinations, detection of abnormal values in students' result sheets, and prediction of students’ performance. This knowledge is hidden in the educational data set and can be extracted through data mining techniques. The present paper is designed to justify the capabilities of data mining techniques in the context of higher education by offering a data mining model for the higher education system in the university. In this research, the classification task is used to evaluate students’ performance; of the many approaches available for data classification, the decision tree method is used here. In this way, we extract data that describes students’ summative performance at the end of the semester, helps to identify dropouts and students who need special attention, and allows the teacher to provide appropriate advising and counseling.

    An investigation into new kernels for categorical variables

    Kernel-based methods first appeared in the form of support vector machines. Since the first Support Vector Machine (SVM) formulation in 1995, the number of proposed kernel functions has grown quickly, and these kernels have been applied to a wide range of problems and domains. The most common and direct applications of these methods focus on continuous numeric data, given that SVMs ultimately involve the solution of an optimization problem. Additionally, some kernel functions have been oriented to more symbolic data, in problems such as text analysis or handwritten digit recognition. Surprisingly, however, there is a gap in the area of kernel functions devoted to handling datasets with qualitative variables. One of the most common practices to overcome this gap consists of recoding the source qualitative information to make it suitable for numeric kernel functions. This thesis presents the development of new kernel functions that can better model symbolic information presented as categorical variables, directly and without the need for data preprocessing methods. The proposal is based on the use of probabilistic information (probability mass distribution) to compare the different modalities of a variable. Additionally, the idea is formulated through a modular schema that combines a set of components to obtain the kernel functions, facilitating the modification and extension of single components. The experimental results suggest a slight improvement over traditional kernel functions in the accuracy obtained on classification problems. This improvement is clearer on datasets with known probabilistic structure.
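    The idea of comparing categorical values through their probability mass can be illustrated with a minimal sketch. It is not the thesis's kernel: the weighting `1 - p(value)`, the function names, and the toy data are all assumptions made for illustration. A plain overlap kernel counts matching categories, while the probabilistic variant gives a match on a rare category more weight than a match on a common one.

```python
from collections import Counter

def category_probabilities(rows):
    """Estimate the probability mass of each category, one dict per column."""
    n = len(rows)
    return [{v: c / n for v, c in Counter(col).items()} for col in zip(*rows)]

def overlap_kernel(x, y):
    """Baseline: fraction of variables on which x and y agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def probabilistic_kernel(x, y, probs):
    """Illustrative kernel: a match contributes 1 - p(category).

    The weight 1 - p is an assumption for this sketch, chosen so that
    agreeing on a rare category counts more than agreeing on a common one.
    Categories unseen when estimating probs get probability 0.
    """
    total = 0.0
    for a, b, p in zip(x, y, probs):
        if a == b:
            total += 1.0 - p.get(a, 0.0)
    return total / len(x)

# Hypothetical toy data set with two categorical variables.
rows = [("red", "small"), ("red", "large"), ("red", "small"), ("blue", "small")]
probs = category_probabilities(rows)

print(overlap_kernel(rows[0], rows[2]))
print(probabilistic_kernel(rows[0], rows[2], probs))
```

    Because each match adds a non-negative weight tied to the shared category, the resulting similarity matrix stays positive semidefinite, so it could in principle be passed to an SVM as a precomputed kernel.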

    The Role of FGF21 in Hepatic-Cerebral Communication (Die Rolle von FGF21 in der hepatisch-zerebralen Kommunikation)

    Obesity is associated with various comorbidities, such as an increased prevalence of metabolic syndrome, non-alcoholic fatty liver disease, and disorders of the central nervous system. A prominent candidate in hepatic-neuronal communication is fibroblast growth factor 21 (FGF21), which modulates metabolism in both healthy and obese individuals. The present work succeeded in modulating neurodegenerative and obesity-induced (neuro)inflammatory processes through various lifestyle factors and in associating them with FGF21.

    Heuristic ensembles of filters for accurate and reliable feature selection

    Feature selection has become increasingly important in data mining in recent years. However, the accuracy and stability of feature selection methods vary considerably when used individually, and no rule exists to indicate which one should be used for a particular dataset. Thus, an ensemble method that combines the outputs of several individual feature selection methods appears to be a promising approach to address the issue and is investigated in this research. This research aims to develop an effective ensemble that can improve the accuracy and stability of feature selection. We propose a novel heuristic ensemble of filters (HEF). It combines two types of filters, subset filters and ranking filters, with a heuristic consensus algorithm in order to utilise the strength of each type. The ensemble is tested on ten benchmark datasets, and its performance is evaluated by two stability measures and three classifiers. The experimental results demonstrate that HEF improves the stability and accuracy of the selected features and in most cases outperforms the other ensemble algorithms, individual filters and the full feature set. The research on the HEF algorithm is extended in several dimensions, including more filter members, three novel schemes of mean rank aggregation with partial lists, and three novel schemes for a weighted heuristic ensemble of filters. However, the experimental results demonstrate that adding weights to filters in HEF does not achieve the expected improvement in accuracy, but increases time and space complexity and clearly decreases stability. Therefore, the core ensemble algorithm (HEF) is shown to be not just simpler but also more reliable and consistent than the later, more complicated weighted ensembles. In addition, we investigated how to use data in feature selection, using ALL or PART of it. Systematic experiments with thirty-five synthetic and benchmark real-world datasets were carried out.
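    Mean rank aggregation, one of the building blocks mentioned in this abstract, can be sketched in a few lines. This is a simplified illustration and not the HEF algorithm itself: it assumes every filter returns a full ranking of the same features (no partial lists, no subset filters, no weights), and the function name and example rankings are hypothetical.

```python
def mean_rank_aggregate(rankings):
    """Combine several best-first feature rankings by their mean position.

    Assumes every ranking is a full list over the same set of features.
    Ties in mean rank are broken alphabetically to keep the result stable.
    """
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings)
        for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))

# Hypothetical output of three ranking filters over four features.
rankings = [
    ["a", "b", "c", "d"],
    ["b", "a", "d", "c"],
    ["a", "c", "b", "d"],
]
consensus = mean_rank_aggregate(rankings)
print(consensus)
```

    Averaging positions dampens the effect of any single filter that ranks a feature unusually high or low, which is the intuition behind using an ensemble for stability.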

    Automated detection of depression from brain structural magnetic resonance imaging (sMRI) scans

    An automated sMRI-based depression detection system is developed, with components for acquisition and preprocessing, feature extraction, feature selection, and classification. The core focus of the research is the establishment of a new feature selection algorithm that quantifies the most relevant brain volumetric features for depression detection at the individual level.

    Feature selection and hierarchical classifier design with applications to human motion recognition

    The performance of a classifier is affected by a number of factors including classifier type, the input features and the desired output. This thesis examines the impact of feature selection and classification problem division on classification accuracy and complexity. Proper feature selection can reduce classifier size and improve classifier performance by minimizing the impact of noisy, redundant and correlated features. Noisy features can cause false association between the features and the classifier output. Redundant and correlated features increase classifier complexity without adding additional information. Output selection or classification problem division describes the division of a large classification problem into a set of smaller problems. Problem division can improve accuracy by allocating more resources to more difficult class divisions and enabling the use of more specific feature sets for each sub-problem. The first part of this thesis presents two methods for creating feature-selected hierarchical classifiers. The feature-selected hierarchical classification method jointly optimizes the features and classification tree-design using genetic algorithms. The multi-modal binary tree (MBT) method performs the class division and feature selection sequentially and tolerates misclassifications in the higher nodes of the tree. This yields a piecewise separation for classes that cannot be fully separated with a single classifier. Experiments show that the accuracy of MBT is comparable to other multi-class extensions, but with lower test time. Furthermore, the accuracy of MBT is significantly higher on multi-modal data sets. The second part of this thesis focuses on input feature selection measures. A number of filter-based feature subset evaluation measures are evaluated with the goal of assessing their performance with respect to specific classifiers. 
Although many feature selection measures have been proposed in the literature, it is unclear which of them are appropriate for use with different classifiers. Sixteen common filter-based measures are tested on 20 real and 20 artificial data sets, which are designed to probe for specific feature selection challenges. The strengths and weaknesses of each measure are discussed with respect to the specific feature selection challenges in the artificial data sets, correlation with classifier accuracy and their ability to identify known informative features. The results indicate that the best filter measure is classifier-specific. K-nearest neighbours classifiers work well with subset-based RELIEF, correlation feature selection or conditional mutual information maximization, whereas Fisher's interclass separability criterion and conditional mutual information maximization work better for support vector machines. Based on the results of the feature selection experiments, two new filter-based measures are proposed based on conditional mutual information maximization, which performs well but cannot identify dependent features in a set and does not include a check for correlated features. Both new measures explicitly check for dependent features and the second measure also includes a term to discount correlated features. Both measures correctly identify known informative features in the artificial data sets and correlate well with classifier accuracy. The final part of this thesis examines the use of feature selection for time-series data by using feature selection to determine important individual time windows or key frames in the series. Time-series feature selection is used with the MBT algorithm to create classification trees for time-series data. The feature-selected MBT algorithm is tested on two human motion recognition tasks: full-body human motion recognition from joint angle data and hand gesture recognition from electromyography data. 
Results indicate that the feature-selected MBT is able to achieve high classification accuracy on the time-series data while maintaining a short test time.

    Exploiting physiological changes during the flow experience for assessing virtual-reality game design.

    Immersive experiences are considered the principal attraction of video games. Achieving a healthy balance between the game's demands and the user's skills is a particularly challenging goal. However, it is a coveted outcome, as it gives rise to the flow experience – a mental state of deep concentration and game engagement. When this balance fractures, the player may experience considerable disinclination to continue playing, which may be a product of anxiety or boredom. Thus, being able to predict manifestations of these psychological states in video game players is essential for understanding player motivation and designing better games. To this end, we build on earlier work to evaluate flow dynamics from a physiological perspective using a custom video game. Although advancements in this area are growing, there has been little consideration given to the interpersonal characteristics that may influence the expression of the flow experience. In this thesis, two angles are introduced that remain poorly understood. First, the investigation is contextualized in the virtual reality domain, a technology that putatively amplifies affective experiences, yet is still insufficiently addressed in the flow literature. Second, a novel analysis setup is proposed, whereby the recorded physiological responses and psychometric self-ratings are combined to assess the effectiveness of our game's design in a series of experiments. The analysis workflow employed heart rate and eye blink variability, and electroencephalography (EEG) as objective assessment measures of the game's impact, and self-reports as subjective assessment measures. These inputs were submitted to a clustering method, cross-referencing the membership of the observations with self-report ratings of the players they originated from. Next, this information was used to effectively inform specialized decoders of the flow state from the physiological responses. 
This approach successfully enabled classifiers to operate at high accuracy rates in all our studies. Furthermore, we addressed the compression of medium-resolution EEG sensors to a minimal set required to decode flow. Overall, our findings suggest that the approaches employed in this thesis have wide applicability and potential for improving game design practices.