Multi label ranking based on positive pairwise correlations among labels
Multi-Label Classification (MLC) is a general type of classification that has attracted many researchers in the last few years. Two common approaches are used to solve the MLC problem: Problem Transformation Methods (PTMs) and Algorithm Adaptation Methods (AAMs). This paper focuses on the first approach, since it is more general and applicable to any domain. Specifically, this paper has two objectives. The first is to propose a new multi-label ranking algorithm based on the positive pairwise correlations among labels. The second is to propose new, simple PTMs that are based on label correlations rather than on label frequency, as in conventional PTMs. Experiments showed that the proposed algorithm outperforms the existing methods and algorithms on all evaluation metrics used in the experiments. The proposed PTMs also show superior performance compared with the existing PTMs.
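To make the idea of positive pairwise correlations among labels concrete, here is a minimal sketch. It is not the algorithm proposed in the paper: the abstract does not say which correlation measure is used, so the phi coefficient is assumed here, and the label names, the toy label matrix, and the helpers `phi` and `positive_pairs` are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def phi(a, b):
    """Phi coefficient (Pearson correlation) of two binary label columns."""
    n = len(a)
    n11 = sum(1 for x, y in zip(a, b) if x and y)  # instances carrying both labels
    n_a = sum(a)                                   # instances carrying label a
    n_b = sum(b)                                   # instances carrying label b
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # A label that is always (or never) present has no defined correlation.
        return 0.0
    return (n * n11 - n_a * n_b) / denom


def positive_pairs(Y, names):
    """Return label pairs with positive correlation, strongest first."""
    cols = list(zip(*Y))  # one tuple per label column
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        r = phi(cols[i], cols[j])
        if r > 0:
            pairs.append((names[i], names[j], r))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy label matrix (assumption): rows are instances, columns are labels.
names = ["sports", "football", "politics"]
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]

pairs = positive_pairs(Y, names)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

On this toy data only the pair (sports, football) is positively correlated. A ranking method built on such pairs could, for instance, promote labels that co-occur with labels already predicted, but how the paper uses the pairs is not stated in the abstract.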
A Deep Learning Approach Towards Student Performance Prediction in Online Courses: Challenges Based on a Global Perspective
Analyzing and evaluating students' progress in any learning environment is
stressful and time-consuming if done using traditional analysis methods. This
is further exacerbated by the increasing number of students due to the shift of
focus toward integrating the Internet technologies in education and the focus
of academic institutions on moving toward e-Learning, blended, or online
learning models. As a result, the topic of student performance prediction has
become a vibrant research area in recent years. To address this, machine
learning and data mining techniques have emerged as a viable solution. To that
end, this work proposes the use of deep learning techniques (CNN and RNN-LSTM)
to predict the students' performance at the midpoint stage of the online course
delivery using three distinct datasets collected from three different regions
of the world. Experimental results show that deep learning models have
promising performance as they outperform other optimized traditional ML models
in two of the three considered datasets while also having comparable
performance for the third dataset. Comment: Accepted and presented at the 24th International Arab Conference on
Information Technology (ACIT'2023)
Website Phishing Detection Using Machine Learning Techniques
Phishing is a cybercrime that has been increasing constantly in recent years due to the increased use of the Internet and its applications. It is one of the most common types of social engineering and aims to disclose or steal users' sensitive or personal information. In this paper, two main objectives are considered. The first is to identify the best classifier for detecting phishing among twenty-four different classifiers that represent six learning strategies. The second is to identify the best feature selection method for website phishing datasets. Using two phishing-related datasets with different characteristics and considering eight evaluation metrics, the results revealed the superiority of the RandomForest, FilteredClassifier, and J-48 classifiers in detecting phishing websites. The InfoGainAttributeEval method also showed the best performance among the four feature selection methods considered.
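The name InfoGainAttributeEval matches Weka's attribute evaluator, which scores each feature by its information gain with respect to the class. The sketch below illustrates that criterion in plain Python. It is not Weka's implementation and does not use the paper's datasets: the feature names, the toy values, and the helpers `entropy` and `info_gain` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(feature, labels):
    """Information gain of a discrete feature with respect to the class."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the feature's values.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (assumption): six websites, three binary features.
labels = ["phish", "phish", "phish", "legit", "legit", "legit"]
features = {
    "has_ip_in_url": [1, 1, 1, 0, 0, 0],  # separates the classes perfectly
    "uses_https":    [0, 1, 0, 1, 1, 0],  # weakly informative
    "long_url":      [1, 0, 0, 1, 0, 0],  # uninformative
}

# Rank features from most to least informative.
ranking = sorted(features, key=lambda f: info_gain(features[f], labels), reverse=True)
for name in ranking:
    print(f"{name}: {info_gain(features[name], labels):.3f}")
```

A feature selection step would then keep the top-ranked features before training a classifier. Numeric features would need to be discretized first, which this sketch leaves out.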
Evaluating Conditional and Unconditional Correlations Capturing Strategies in Multi Label Classification
In the last few years, multi label classification has attracted many scholars and researchers, due to the increasing number of modern domains to which this general type of classification applies. Recently, many researchers have come to believe that the best way to handle the problem of multi label classification is to exploit the correlations among labels. Two main strategies have been used to capture these correlations: the conditional and the unconditional correlations capturing strategies. In this paper, an extensive evaluation of both strategies has been conducted to determine the best strategy for handling multi label classification with respect to the size of the data set and the optimized loss function. Results showed that the unconditional correlations capturing strategy outperforms the conditional correlations capturing strategy on all multi label data sets used in this experiment.
Towards Advancing Distributed Data Mining: Intelligent Agent Systems
Distributed data mining (DDM) deals with the analysis of distributed data and proposes algorithmic solutions to perform different data analysis and mining operations in a distributed manner by considering resource constraints. Here, patterns are discovered, and predictions are made based on multiple distributed data sources. However, DDM faces several problems in terms of performance and implementation. Specifically, the main problems include issues of autonomy and privacy, which this paper aims to solve. The distributed sites must be in the same environment or organization and under the control of the same administration, implying their agreement on the same classification algorithm. Consequently, the best solution is the use of intelligent agent systems: groups of autonomous agents, equipped with communication and coordination facilities and running different classification algorithms in different environments, that can collaborate with each other and decide whether and when to request information from other agents.