Comparing a Hybrid Multi-layered Machine Learning Intrusion Detection System to Single-layered and Deep Learning Models
Advancements in computing technology have created additional network attack surface, allowed the development of new attack types, and increased the impact caused by an attack. Researchers agree that current intrusion detection systems (IDSs) are not able to adapt to detect these new attack forms, so alternative IDS methods have been proposed. Among these methods are machine learning-based intrusion detection systems. This research explores the current relevant studies related to intrusion detection systems and machine learning models and proposes a new hybrid machine learning IDS model consisting of the Principal Component Analysis (PCA) and Support Vector Machine (SVM) learning algorithms. The NSL-KDD dataset, a benchmark dataset for IDSs, is used for comparing the models' performance. The performance accuracy and false-positive rate of the hybrid model are compared to the results of the model's individual algorithmic components to determine which components most impact attack prediction performance. The performance metrics of the hybrid model are also compared to two deep learning autoencoder neural network models; the results found that the complexity of the model does not add to the performance accuracy. The research showed that pre-processing and feature selection impact the predictive accuracy across models. Future research recommendations were to implement the proposed hybrid IDS model into a live network for testing and analysis, and to focus research on the pre-processing algorithms that improve performance accuracy and lower the false-positive rate. This research indicated that pre-processing and feature selection/feature extraction can increase model performance accuracy and decrease false-positive rate, helping businesses to improve network security.
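The PCA-plus-SVM hybrid the abstract describes can be sketched as a standard scikit-learn pipeline. This is a minimal illustration, not the authors' implementation: the NSL-KDD dataset is not loaded here, so a synthetic stand-in with NSL-KDD's 41-feature shape is generated instead, and all hyperparameters are illustrative assumptions.

```python
# Hedged sketch of a PCA -> SVM intrusion-detection pipeline.
# make_classification stands in for NSL-KDD (41 features, binary label).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for NSL-KDD traffic records
X, y = make_classification(n_samples=2000, n_features=41, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# Scale -> reduce dimensionality with PCA -> classify with an SVM
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# The two metrics the study compares: accuracy and false-positive rate
tn, fp, fn, tp = confusion_matrix(y_test, model.predict(X_test)).ravel()
fpr = fp / (fp + tn)
```

Swapping the `PCA` step out of the pipeline reproduces the kind of component-ablation comparison the study performs.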
Subsumption is a Novel Feature Reduction Strategy for High Dimensionality Datasets
High dataset dimensionality poses challenges for machine learning classifiers because of high computational costs and the adverse consequences of redundant features. Feature reduction is an attractive remedy to high dimensionality. Three different feature reduction strategies (subsumption, Relief F, and principal component analysis) were evaluated using four machine learning classifiers on a high-dimension dataset with 474 unique features, 20 diagnoses, and 364 instances. All three feature reduction strategies proved capable of significant feature reduction while maintaining classification accuracy. At high levels of feature reduction, the principal components strategy outperformed Relief F and subsumption. Subsumption is a novel strategy for feature reduction when features are organized in a hierarchical ontology.
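The core comparison above, classifier accuracy with and without aggressive feature reduction, can be sketched with PCA alone (subsumption and Relief F have no scikit-learn implementation, so only the PCA strategy is shown, on a bundled dataset rather than the 474-feature clinical data):

```python
# Sketch: does heavy PCA feature reduction preserve classification accuracy?
# load_digits (64 features) stands in for the high-dimensional dataset.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)  # 64 features, 10 classes

full = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
reduced = make_pipeline(StandardScaler(), PCA(n_components=16),  # 75% reduction
                        LogisticRegression(max_iter=1000))

acc_full = cross_val_score(full, X, y, cv=4).mean()
acc_reduced = cross_val_score(reduced, X, y, cv=4).mean()
```

A small gap between `acc_full` and `acc_reduced` at a 4x feature reduction mirrors the abstract's finding that accuracy can be maintained under significant reduction.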
Performance Comparison of Knowledge-Based Dose Prediction Techniques Based on Limited Patient Data.
Purpose: The accuracy of dose prediction is essential for knowledge-based planning and automated planning techniques. We compare the dose prediction accuracy of 3 prediction methods including statistical voxel dose learning, spectral regression, and support vector regression based on limited patient training data. Methods: Statistical voxel dose learning, spectral regression, and support vector regression were used to predict the dose of noncoplanar intensity-modulated radiation therapy (4π) and volumetric-modulated arc therapy head and neck, 4π lung, and volumetric-modulated arc therapy prostate plans. Twenty cases of each site were used for k-fold cross-validation, with k = 4. Statistical voxel dose learning bins voxels according to their Euclidean distance to the planning target volume and uses the median to predict the dose of new voxels. Distance to the planning target volume, polynomial combinations of the distance components, planning target volume, and organ at risk volume were used as features for spectral regression and support vector regression. A total of 28 features were included. Principal component analysis was performed on the input features to test the effect of dimension reduction. For the coplanar volumetric-modulated arc therapy plans, separate models were trained for voxels within the same axial slice as planning target volume voxels and voxels outside the primary beam. The effect of training separate models for each organ at risk compared to all voxels collectively was also tested. The root mean squared error was calculated to evaluate the voxel dose prediction accuracy. Results: Statistical voxel dose learning using separate models for each organ at risk had the lowest root mean squared error for all sites and modalities: 3.91 Gy (head and neck 4π), 3.21 Gy (head and neck volumetric-modulated arc therapy), 2.49 Gy (lung 4π), and 2.35 Gy (prostate volumetric-modulated arc therapy). Compared to using the original features, principal component analysis reduced the 4π prediction error for head and neck spectral regression (-43.9%) and support vector regression (-42.8%) and lung support vector regression (-24.4%) predictions. Principal component analysis was more effective when all or most of the possible principal components were used. Separate organ at risk models were more accurate than training on all organ at risk voxels in all cases. Conclusion: Compared with more sophisticated parametric machine learning methods with dimension reduction, statistical voxel dose learning is more robust to patient variability and provides the most accurate dose prediction method.
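The statistical voxel dose learning idea described in the Methods, bin voxels by distance to the planning target volume and predict new voxels with the bin median, can be sketched with synthetic numbers. The exponential dose fall-off, bin count, and noise level below are illustrative assumptions, not patient data or the study's parameters:

```python
# Hedged sketch of statistical voxel dose learning: distance-binned medians.
import numpy as np

rng = np.random.default_rng(0)
dist = rng.uniform(0, 10, size=5000)  # toy voxel distances to the PTV (cm)
dose = 60 * np.exp(-0.3 * dist) + rng.normal(0, 1, size=5000)  # toy fall-off

edges = np.linspace(0, 10, 21)                        # 20 distance bins
bin_idx = np.clip(np.digitize(dist, edges) - 1, 0, 19)
bin_median = np.array([np.median(dose[bin_idx == b]) for b in range(20)])

def predict(d):
    """Predict dose at distances d using the median of the matching bin."""
    b = np.clip(np.digitize(d, edges) - 1, 0, 19)
    return bin_median[b]

# Evaluate with RMSE, the metric used in the study
rmse = np.sqrt(np.mean((predict(dist) - dose) ** 2))
```

The median makes each bin's estimate robust to outlier voxels, which is consistent with the abstract's conclusion that this simple method is robust to patient variability.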
Robust automated detection of microstructural white matter degeneration in Alzheimer’s disease using machine learning classification of multicenter DTI data
Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches can compensate for the effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches: (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, with the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for either classifier. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was not part of the training sample.
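The two cross-validation schemes contrasted above, pooled random splits versus leave-one-scanner-out, map directly onto scikit-learn's `cross_val_score` and `LeaveOneGroupOut`. This sketch uses synthetic data in place of the DTI maps; the nine-scanner grouping mirrors the study's design, but all sample sizes are illustrative:

```python
# Sketch: pooled k-fold vs. leave-one-scanner-out cross-validation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for per-subject DTI features (AD vs. control)
X, y = make_classification(n_samples=270, n_features=50, random_state=0)
scanner = np.repeat(np.arange(9), 30)   # 9 scanners, 30 subjects each

# (i) pooled cross-validation: folds ignore which scanner a subject came from
pooled = cross_val_score(SVC(), X, y, cv=5).mean()

# (ii) scanner-specific cross-validation: each scanner is held out in turn
per_scanner = cross_val_score(SVC(), X, y, groups=scanner,
                              cv=LeaveOneGroupOut()).mean()
```

With real multicenter data, `per_scanner` is typically lower than `pooled` unless scanner effects are corrected, which is exactly the gap the mean correction in the study closes.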
Analysis of Nifty 50 index stock market trends using hybrid machine learning model in quantum finance
Predicting equities market trends is one of the most challenging tasks for market participants. This study aims to apply machine learning algorithms to aid in accurate Nifty 50 index trend predictions. The paper compares and contrasts four forecasting methods: artificial neural networks (ANN), support vector machines (SVM), naïve Bayes (NB), and random forest (RF). In this study, eight technical indicators are used, and a deterministic trend layer then translates the indicators into trend signals. The principal component analysis (PCA) method is then applied to this deterministic trend signal. The main contribution of this study is using the PCA technique to find the essential components from multiple technical indicators affecting stock prices, to reduce data dimensionality and improve model performance. As a result, a PCA-machine learning (ML) hybrid forecasting model was proposed. The experimental findings suggest that the technical factors are well represented as trend signals and that the PCA approach combined with ML models outperforms the comparative models in prediction performance. Utilizing the first three principal components (percentage of explained variance = 80%), experiments on the Nifty 50 index show that a support vector classifier (SVC) with radial basis function (RBF) kernel achieves an accuracy of 0.9968 and an F1-score of 0.9969, and the RF model achieves an accuracy of 0.9969 and an F1-score of 0.9968. In area under the curve (AUC) performance, SVC (RBF and linear kernels) and RF have AUC scores of 1.
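The PCA-ML hybrid described above, keep enough principal components for roughly 80% explained variance, then feed them to an RBF-kernel SVC, can be sketched with scikit-learn, where a float passed to `PCA(n_components=...)` selects components by explained-variance ratio. Synthetic data stands in for the eight technical indicators; no Nifty 50 data is used:

```python
# Sketch: PCA (~80% explained variance) followed by an RBF-kernel SVC.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 8 technical indicators with an up/down trend label
X, y = make_classification(n_samples=1500, n_features=8, n_informative=5,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

model = make_pipeline(StandardScaler(),
                      PCA(n_components=0.80),   # keep ~80% of the variance
                      SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
n_components = model.named_steps["pca"].n_components_  # chosen automatically
```

On the real indicator data the study reports that three components suffice for the 80% threshold; here the count depends on the synthetic covariance structure.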