10,742 research outputs found
A rule-based machine learning model for financial fraud detection
Financial fraud is a growing problem that poses a significant threat to the banking industry, the government sector, and the public. In response, financial institutions must continuously improve their fraud detection systems. Although preventative and security precautions are implemented to reduce financial fraud, criminals are constantly adapting and devising new ways to evade fraud prevention systems. The classification of transactions as legitimate or fraudulent poses a significant challenge for existing classification models due to highly imbalanced datasets. This research aims to develop rules to detect fraud transactions that do not involve any resampling technique. The effectiveness of the rule-based model (RBM) is assessed using a variety of metrics such as accuracy, specificity, precision, recall, confusion matrix, Matthew’s correlation coefficient (MCC), and receiver operating characteristic (ROC) values. The proposed rule-based model is compared to several existing machine learning models such as random forest (RF), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbor (KNN), naive Bayes (NB), and logistic regression (LR) using two benchmark datasets. The results of the experiment show that the proposed rule-based model beat the other methods, reaching accuracy and precision of 0.99 and 0.99, respectively
A novel IoT intrusion detection framework using Decisive Red Fox optimization and descriptive back propagated radial basis function models.
The Internet of Things (IoT) is extensively used in modern-day life, such as in smart homes, intelligent transportation, etc. However, the present security measures cannot fully protect the IoT due to its vulnerability to malicious assaults. Intrusion detection can protect IoT devices from the most harmful attacks as a security tool. Nevertheless, the time and detection efficiencies of conventional intrusion detection methods need to be more accurate. The main contribution of this paper is to develop a simple as well as intelligent security framework for protecting IoT from cyber-attacks. For this purpose, a combination of Decisive Red Fox (DRF) Optimization and Descriptive Back Propagated Radial Basis Function (DBRF) classification are developed in the proposed work. The novelty of this work is, a recently developed DRF optimization methodology incorporated with the machine learning algorithm is utilized for maximizing the security level of IoT systems. First, the data preprocessing and normalization operations are performed to generate the balanced IoT dataset for improving the detection accuracy of classification. Then, the DRF optimization algorithm is applied to optimally tune the features required for accurate intrusion detection and classification. It also supports increasing the training speed and reducing the error rate of the classifier. Moreover, the DBRF classification model is deployed to categorize the normal and attacking data flows using optimized features. Here, the proposed DRF-DBRF security model's performance is validated and tested using five different and popular IoT benchmarking datasets. Finally, the results are compared with the previous anomaly detection approaches by using various evaluation parameters
An improved GBSO-TAENN-based EEG signal classification model for epileptic seizure detection.
Detection and classification of epileptic seizures from the EEG signals have gained significant attention in recent decades. Among other signals, EEG signals are extensively used by medical experts for diagnosing purposes. So, most of the existing research works developed automated mechanisms for designing an EEG-based epileptic seizure detection system. Machine learning techniques are highly used for reduced time consumption, high accuracy, and optimal performance. Still, it limits by the issues of high complexity in algorithm design, increased error value, and reduced detection efficacy. Thus, the proposed work intends to develop an automated epileptic seizure detection system with an improved performance rate. Here, the Finite Linear Haar wavelet-based Filtering (FLHF) technique is used to filter the input signals and the relevant set of features are extracted from the normalized output with the help of Fractal Dimension (FD) analysis. Then, the Grasshopper Bio-Inspired Swarm Optimization (GBSO) technique is employed to select the optimal features by computing the best fitness value and the Temporal Activation Expansive Neural Network (TAENN) mechanism is used for classifying the EEG signals to determine whether normal or seizure affected. Numerous intelligence algorithms, such as preprocessing, optimization, and classification, are used in the literature to identify epileptic seizures based on EEG signals. The primary issues facing the majority of optimization approaches are reduced convergence rates and higher computational complexity. Furthermore, the problems with machine learning approaches include a significant method complexity, intricate mathematical calculations, and a decreased training speed. Therefore, the goal of the proposed work is to put into practice efficient algorithms for the recognition and categorization of epileptic seizures based on EEG signals. The combined effect of the proposed FLHF, FD, GBSO, and TAENN models might dramatically improve disease detection accuracy while decreasing complexity of system along with time consumption as compared to the prior techniques. By using the proposed methodology, the overall average epileptic seizure detection performance is increased to 99.6% with f-measure of 99% and G-mean of 98.9% values
An HMM-based text classifier less sensitive to document management problems
Background: The performance of the text classification techniques is commonly affected by the characteristics and representation of the document corpora itself. Of all the problems arising from the corpus, there are three major difficulties which the classifiers must deal with: the feature selection issues, the class imbalance problem and the size of the training set. Objective: The objective of this paper is to present a novel based-content text classifier called T-LHMM that is less sensitive to the text representation and the size of the corpus, and more efficient in terms of running time than other classification techniques. Method: In order to demonstrate it, we present a set of experiments performed on well-known biomedical text corpora. We also compare our classifier with k-Nearest Neighbours and Support Vector Machine models. Results and Conclusion: The experimental and statistical results show that the proposed HMM-based text classifier is indeed less sensitive to the class imbalance, the size of the corpus and the vocabulary than the other classifiers. In addition, it is more efficient in terms of running time than k-NN and SVM techniques.European Union | Ref. [FP7/REGPOT-2012-2013.1], n. 316265, BIO- CAPSMinisterio de EconomÃa y Competitividad de España | Ref. TIN2013-47153-C3-3-
Fractal feature selection model for enhancing high-dimensional biological problems
The integration of biology, computer science, and statistics has given rise to the interdisciplinary field of bioinformatics, which aims to decode biological intricacies. It produces extensive and diverse features, presenting an enormous challenge in classifying bioinformatic problems. Therefore, an intelligent bioinformatics classification system must select the most relevant features to enhance machine learning performance. This paper proposes a feature selection model based on the fractal concept to improve the performance of intelligent systems in classifying high-dimensional biological problems. The proposed fractal feature selection (FFS) model divides features into blocks, measures the similarity between blocks using root mean square error (RMSE), and determines the importance of features based on low RMSE. The proposed FFS is tested and evaluated over ten high-dimensional bioinformatics datasets. The experiment results showed that the model significantly improved machine learning accuracy. The average accuracy rate was 79% with full features in machine learning algorithms, while FFS delivered promising results with an accuracy rate of 94%
An intelligent rule-oriented framework for extracting key factors for grants scholarships in higher education
Education is a fundamental sector in all countries, where in some countries students com-pete to get an educational grant due to its high cost. The incorporation of artificial intelli-gence in education holds great promise for the advancement of educational systems and pro-cesses. Educational data mining involves the analysis of data generated within educational environments to extract valuable insights into student performance and other factors that enhance teaching and learning. This paper aims to analyze the factors influencing students' performance and consequently, assist granting organizations in selecting suitable students in the Arab region (Jordan as a use case). The problem was addressed using a rule-based tech-nique to facilitate the utilization and implementation of a decision support system. To this end, three classical rule induction algorithms, namely PART, JRip, and RIDOR, were em-ployed. The data utilized in this study was collected from undergraduate students at the University of Jordan from 2010 to 2020. The constructed models were evaluated based on metrics such as accuracy, recall, precision, and f1-score. The findings indicate that the JRip algorithm outperformed PART and RIDOR in most of the datasets based on f1-score metric. The interpreted decision rules of the best models reveal that both features; the average study years and high school averages play vital roles in deciding which students should receive scholarships. The paper concludes with several suggested implications to support and en-hance the decision-making process of granting agencies in the realm of higher education
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
This recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users likely invalidate the suggested recourse after being nosily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code-base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in literature make it difficult to compare the per formance of different algorithms. To solve this problem, we provide an open source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new
experiments. In summary, our work contributes to a more reliable interaction of end users and machine learned models by covering fundamental aspects of the recourse process and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse
Efficient network management and security in 5G enabled internet of things using deep learning algorithms
The rise of fifth generation (5G) networks and the proliferation of internet-of-things (IoT) devices have created new opportunities for innovation and increased connectivity. However, this growth has also brought forth several challenges related to network management and security. Based on the review of literature it has been identified that majority of existing research work are limited to either addressing the network management issue or security concerns. In this paper, the proposed work has presented an integrated framework to address both network management and security concerns in 5G internet-of-things (IoT) network using a deep learning algorithm. Firstly, a joint approach of attention mechanism and long short-term memory (LSTM) model is proposed to forecast network traffic and optimization of network resources in a, service-based and user-oriented manner. The second contribution is development of reliable network attack detection system using autoencoder mechanism. Finally, a contextual model of 5G-IoT is discussed to demonstrate the scope of the proposed models quantifying the network behavior to drive predictive decision making in network resources and attack detection with performance guarantees. The experiments are conducted with respect to various statistical error analysis and other performance indicators to assess prediction capability of both traffic forecasting and attack detection model
Advances in machine learning algorithms for financial risk management
In this thesis, three novel machine learning techniques are introduced to address distinct
yet interrelated challenges involved in financial risk management tasks. These approaches
collectively offer a comprehensive strategy, beginning with the precise classification of credit
risks, advancing through the nuanced forecasting of financial asset volatility, and ending
with the strategic optimisation of financial asset portfolios.
Firstly, a Hybrid Dual-Resampling and Cost-Sensitive technique has been proposed to combat the prevalent issue of class imbalance in financial datasets, particularly in credit risk
assessment. The key process involves the creation of heuristically balanced datasets to effectively address the problem. It uses a resampling technique based on Gaussian mixture
modelling to generate a synthetic minority class from the minority class data and concurrently uses k-means clustering on the majority class. Feature selection is then performed
using the Extra Tree Ensemble technique. Subsequently, a cost-sensitive logistic regression
model is then applied to predict the probability of default using the heuristically balanced
datasets. The results underscore the effectiveness of our proposed technique, with superior
performance observed in comparison to other imbalanced preprocessing approaches. This
advancement in credit risk classification lays a solid foundation for understanding individual
financial behaviours, a crucial first step in the broader context of financial risk management.
Building on this foundation, the thesis then explores the forecasting of financial asset volatility, a critical aspect of understanding market dynamics. A novel model that combines a
Triple Discriminator Generative Adversarial Network with a continuous wavelet transform
is proposed. The proposed model has the ability to decompose volatility time series into
signal-like and noise-like frequency components, to allow the separate detection and monitoring of non-stationary volatility data. The network comprises of a wavelet transform
component consisting of continuous wavelet transforms and inverse wavelet transform components, an auto-encoder component made up of encoder and decoder networks, and a
Generative Adversarial Network consisting of triple Discriminator and Generator networks.
The proposed Generative Adversarial Network employs an ensemble of unsupervised loss derived from the Generative Adversarial Network component during training, supervised
loss and reconstruction loss as part of its framework. Data from nine financial assets are
employed to demonstrate the effectiveness of the proposed model. This approach not only
enhances our understanding of market fluctuations but also bridges the gap between individual credit risk assessment and macro-level market analysis.
Finally the thesis ends with a novel proposal of a novel technique or Portfolio optimisation. This involves the use of a model-free reinforcement learning strategy for portfolio
optimisation using historical Low, High, and Close prices of assets as input with weights of
assets as output. A deep Capsules Network is employed to simulate the investment strategy, which involves the reallocation of the different assets to maximise the expected return
on investment based on deep reinforcement learning. To provide more learning stability in
an online training process, a Markov Differential Sharpe Ratio reward function has been
proposed as the reinforcement learning objective function. Additionally, a Multi-Memory
Weight Reservoir has also been introduced to facilitate the learning process and optimisation of computed asset weights, helping to sequentially re-balance the portfolio throughout
a specified trading period. The use of the insights gained from volatility forecasting into
this strategy shows the interconnected nature of the financial markets. Comparative experiments with other models demonstrated that our proposed technique is capable of achieving
superior results based on risk-adjusted reward performance measures.
In a nut-shell, this thesis not only addresses individual challenges in financial risk management but it also incorporates them into a comprehensive framework; from enhancing the
accuracy of credit risk classification, through the improvement and understanding of market
volatility, to optimisation of investment strategies. These methodologies collectively show
the potential of the use of machine learning to improve financial risk management
- …