25 research outputs found

    Improved Instance Selection Methods for Support Vector Machine Speed Optimization

    Get PDF
    Support vector machine (SVM) is one of the top picks in pattern recognition and classification related tasks. It has been used successfully to classify linearly separable and nonlinearly separable data with high accuracy. However, in terms of classification speed, SVMs are outperformed by many machine learning algorithms, especially, when massive datasets are involved. SVM classification speed scales linearly with number of support vectors, and support vectors increase with increase in dataset size. Hence, SVM classification speed can be enormously reduced if it is trained on a reduced dataset. Instance selection techniques are one of the most effective techniques suitable for minimizing SVM training time. In this study, two instance selection techniques suitable for identifying relevant training instances are proposed. The techniques are evaluated on a dataset containing 4000 emails and results obtained compared to other existing techniques. Result reveals excellent improvement in SVM classification speed

    Improved techniques for phishing email detection based on random forest and firefly-based support vector machine learning algorithms.

    Get PDF
    Master of Science in Computer Science. University of KwaZulu-Natal, Durban, 2014.Electronic fraud is one of the major challenges faced by the vast majority of online internet users today. Curbing this menace is not an easy task, primarily because of the rapid rate at which fraudsters change their mode of attack. Many techniques have been proposed in the academic literature to handle e-fraud. Some of them include: blacklist, whitelist, and machine learning (ML) based techniques. Among all these techniques, ML-based techniques have proven to be the most efficient, because of their ability to detect new fraudulent attacks as they appear.There are three commonly perpetrated electronic frauds, namely: email spam, phishing and network intrusion. Among these three, more financial loss has been incurred owing to phishing attacks. This research investigates and reports the use of MLand Nature Inspired technique in the domain of phishing detection, with the foremost objective of developing a dynamic and robust phishing email classifier with improved classification accuracy and reduced processing time.Two approaches to phishing email detection are proposed, and two email classifiers are developed based on the proposed approaches. In the first approach, a random forest algorithm is used to construct decision trees,which are,in turn,used for email classification. The second approach introduced a novel MLmethod that hybridizes firefly algorithm (FFA) and support vector machine (SVM). The hybridized method consists of three major stages: feature extraction phase, hyper-parameter selection phase and email classification phase. In the feature extraction phase, the feature vectors of all the features described in Section 3.6 are extracted and saved in a file for easy access.In the second stage, a novel hyper-parameter search algorithm, developed in this research, is used to generate exponentially growing sequence of paired C and Gamma (γ) values. FFA is then used to optimize the generated SVM hyper-parameters and to also find the best hyper-parameter pair. Finally, in the third phase, SVM is used to carry out the classification. This new approach addresses the problem of hyper-parameter optimization in SVM, and in turn, improves the classification speed and accuracy of SVM. Using two publicly available email datasets, some experiments are performed to evaluate the performance of the two proposed phishing email detection techniques. During the evaluation of each approach, a set of features (well suited for phishing detection) are extracted from the training dataset and used to constructthe classifiers. Thereafter, the trained classifiers are evaluated on the test dataset. The evaluations produced very good results. The RF-based classifier yielded a classification accuracy of 99.70%, a FP rate of 0.06% and a FN rate of 2.50%. Also, the hybridized classifier (known as FFA_SVM) produced a classification accuracy of 99.99%, a FP rate of 0.01% and a FN rate of 0.00%

    Intelligent instance selection techniques for support vector machine speed optimization with application to e-fraud detection.

    Get PDF
    Doctor of Philosophy in Computer Science. University of KwaZulu-Natal, Durban 2017.Decision-making is a very important aspect of many businesses. There are grievous penalties involved in wrong decisions, including financial loss, damage of company reputation and reduction in company productivity. Hence, it is of dire importance that managers make the right decisions. Machine Learning (ML) simplifies the process of decision making: it helps to discover useful patterns from historical data, which can be used for meaningful decision-making. The ability to make strategic and meaningful decisions is dependent on the reliability of data. Currently, many organizations are overwhelmed with vast amounts of data, and unfortunately, ML algorithms cannot effectively handle large datasets. This thesis therefore proposes seven filter-based and five wrapper-based intelligent instance selection techniques for optimizing the speed and predictive accuracy of ML algorithms, with a particular focus on Support Vector Machine (SVM). Also, this thesis proposes a novel fitness function for instance selection. The primary difference between the filter-based and wrapper-based technique is in their method of selection. The filter-based techniques utilizes the proposed fitness function for selection, while the wrapper-based technique utilizes SVM algorithm for selection. The proposed techniques are obtained by fusing SVM algorithm with the following Nature Inspired algorithms: flower pollination algorithm, social spider algorithm, firefly algorithm, cuckoo search algorithm and bat algorithm. Also, two of the filter-based techniques are boundary detection algorithms, inspired by edge detection in image processing and edge selection in ant colony optimization. Two different sets of experiments were performed in order to evaluate the performance of the proposed techniques (wrapper-based and filter-based). All experiments were performed on four datasets containing three popular e-fraud types: credit card fraud, email spam and phishing email. In addition, experiments were performed on 20 datasets provided by the well-known UCI data repository. The results show that the proposed filter-based techniques excellently improved SVM training speed in 100% (24 out of 24) of the datasets used for evaluation, without significantly affecting SVM classification quality. Moreover, experimental results also show that the wrapper-based techniques consistently improved SVM predictive accuracy in 78% (18 out of 23) of the datasets used for evaluation and simultaneously improved SVM training speed in all cases. Furthermore, two different statistical tests were conducted to further validate the credibility of the results: Freidman’s test and Holm’s post-hoc test. The statistical test results reveal that the proposed filter-based and wrapper-based techniques are significantly faster, compared to standard SVM and some existing instance selection techniques, in all cases. Moreover, statistical test results also reveal that Cuckoo Search Instance Selection Algorithm outperform all the proposed techniques, in terms of speed. Overall, the proposed techniques have proven to be fast and accurate ML-based e-fraud detection techniques, with improved training speed, predictive accuracy and storage reduction. In real life application, such as video surveillance and intrusion detection systems, that require a classifier to be trained very quickly for speedy classification of new target concepts, the filter-based techniques provide the best solutions; while the wrapper-based techniques are better suited for applications, such as email filters, that are very sensitive to slight changes in predictive accuracy

    Methanol fraction of Calliandra portoricensis root bark activates caspases via alteration in mitochondrial viability in vivo

    Get PDF
    Introduction: Dysregulated apoptosis is associated with a number of disease conditions. Traditionally, Calliandra portoricensis is used in the management of prostate enlargement. This study investigates the in vivo effect of potent methanol fraction of C. portoricensis (MFCP) on mitochondrial permeability transition (mPT) pore, an important pharmacological target in treatment of various diseases, and examines the toxicities associated with its oral administration. Methods: Forty-two male Wistar strain rats (70-80 g) were divided into 6 groups of 7 animals each. Each group was orally administered 25, 50, 100, 200, 400 mg/kg MFCP and the control group received distilled water for 21 and 30 days, respectively. mPT, assay for serum enzymes and hematological parameters were assessed spectrophotometrically while activation of caspases 3 and 9 was done by ELISA technique. Histological assessment of vital organs (liver, kidney, prostate) was carried out according to standard procedures. Results: There were no significant effects on mPT pore at all doses administered after 21 days of oral administration. However, after 30 days of administration, MFCP induced mPT pore opening at doses 100 and 200 mg/kg with induction folds of 2.6 and 3.3, respectively while there was no induction of mPT pore opening at lower doses of 25 mg/kg and 50 mg/kg. Furthermore, significant (P < 0.05) increases in serum enzymes (ALT, AST) were observed at all doses administered when compared with control after 30 days of oral administration. Cell counts (Hb, PCV, RBC, WBC) were adversely affected at the highest dose (200 mg/kg) compared with control and other treated groups (25, 50 and 100 mg/kg) after 30 days of administration. Similarly, activation of caspases 9 and 3 were observed in rat liver homogenate at high doses of the fraction while histological evaluation showed degeneration and distortion of organs at the highest dose. Conclusion: MFCP contains phytochemicals that elicit the opening of the pore and induction of mitochondrial-mediated apoptosis. This would be relevant in treatment of degenerative diseases that results from down-regulation of apoptosis. However, caution should be exercised in using high doses of the plant

    COVID-19 Diagnosis in Computerized Tomography (CT) and X-ray Scans Using Capsule Neural Network

    No full text
    This study proposes a deep-learning-based solution (named CapsNetCovid) for COVID-19 diagnosis using a capsule neural network (CapsNet). CapsNets are robust for image rotations and affine transformations, which is advantageous when processing medical imaging datasets. This study presents a performance analysis of CapsNets on standard images and their augmented variants for binary and multi-class classification. CapsNetCovid was trained and evaluated on two COVID-19 datasets of CT images and X-ray images. It was also evaluated on eight augmented datasets. The results show that the proposed model achieved classification accuracy, precision, sensitivity, and F1-score of 99.929%, 99.887%, 100%, and 99.319%, respectively, for the CT images. It also achieved a classification accuracy, precision, sensitivity, and F1-score of 94.721%, 93.864%, 92.947%, and 93.386%, respectively, for the X-ray images. This study presents a comparative analysis between CapsNetCovid, CNN, DenseNet121, and ResNet50 in terms of their ability to correctly identify randomly transformed and rotated CT and X-ray images without the use of data augmentation techniques. The analysis shows that CapsNetCovid outperforms CNN, DenseNet121, and ResNet50 when trained and evaluated on CT and X-ray images without data augmentation. We hope that this research will aid in improving decision making and diagnostic accuracy of medical professionals when diagnosing COVID-19

    Classification of Phishing Email Using Random Forest Machine Learning Technique

    No full text
    Phishing is one of the major challenges faced by the world of e-commerce today. Thanks to phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attack at about $1.5 billion. This global impact of phishing attacks will continue to be on the increase and thus requires more efficient phishing detection techniques to curb the menace. This paper investigates and reports the use of random forest machine learning algorithm in classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer numbers of features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) were extracted and used by the machine learning algorithm with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates

    Brain Tumor Diagnosis Using Machine Learning, Convolutional Neural Networks, Capsule Neural Networks and Vision Transformers, Applied to MRI: A Survey

    No full text
    Management of brain tumors is based on clinical and radiological information with presumed grade dictating treatment. Hence, a non-invasive assessment of tumor grade is of paramount importance to choose the best treatment plan. Convolutional Neural Networks (CNNs) represent one of the effective Deep Learning (DL)-based techniques that have been used for brain tumor diagnosis. However, they are unable to handle input modifications effectively. Capsule neural networks (CapsNets) are a novel type of machine learning (ML) architecture that was recently developed to address the drawbacks of CNNs. CapsNets are resistant to rotations and affine translations, which is beneficial when processing medical imaging datasets. Moreover, Vision Transformers (ViT)-based solutions have been very recently proposed to address the issue of long-range dependency in CNNs. This survey provides a comprehensive overview of brain tumor classification and segmentation techniques, with a focus on ML-based, CNN-based, CapsNet-based, and ViT-based techniques. The survey highlights the fundamental contributions of recent studies and the performance of state-of-the-art techniques. Moreover, we present an in-depth discussion of crucial issues and open challenges. We also identify some key limitations and promising future research directions. We envisage that this survey shall serve as a good springboard for further study

    A Survey

    No full text
    Akinyelu, A. A., Zaccagna, F., Grist, J. T., Castelli, M., & Rundo, L. (2022). Brain Tumor Diagnosis Using Machine Learning, Convolutional Neural Networks, Capsule Neural Networks and Vision Transformers, Applied to MRI: A Survey. Journal of Imaging, 8(8), 1-40. [205]. https://doi.org/10.3390/jimaging8080205 -------------- Funding: We gratefully acknowledge financial support from FCT Fundação para a Ciência e a Tecnologia (Portugal), national funding through research grant Information Management Research Center—MagIC/NOVA IMS (UIDB/04152/2020).Management of brain tumors is based on clinical and radiological information with presumed grade dictating treatment. Hence, a non-invasive assessment of tumor grade is of paramount importance to choose the best treatment plan. Convolutional Neural Networks (CNNs) represent one of the effective Deep Learning (DL)-based techniques that have been used for brain tumor diagnosis. However, they are unable to handle input modifications effectively. Capsule neural networks (CapsNets) are a novel type of machine learning (ML) architecture that was recently developed to address the drawbacks of CNNs. CapsNets are resistant to rotations and affine translations, which is beneficial when processing medical imaging datasets. Moreover, Vision Transformers (ViT)-based solutions have been very recently proposed to address the issue of long-range dependency in CNNs. This survey provides a comprehensive overview of brain tumor classification and segmentation techniques, with a focus on ML-based, CNN-based, CapsNet-based, and ViT-based techniques. The survey highlights the fundamental contributions of recent studies and the performance of state-of-the-art techniques. Moreover, we present an in-depth discussion of crucial issues and open challenges. We also identify some key limitations and promising future research directions. We envisage that this survey shall serve as a good springboard for further study.publishersversionpublishe
    corecore