
    A novel IoT intrusion detection framework using Decisive Red Fox optimization and descriptive back propagated radial basis function models.

    The Internet of Things (IoT) is extensively used in modern-day life, such as in smart homes, intelligent transportation, etc. However, present security measures cannot fully protect the IoT due to its vulnerability to malicious attacks. As a security tool, intrusion detection can protect IoT devices from the most harmful attacks. Nevertheless, the detection accuracy and time efficiency of conventional intrusion detection methods remain insufficient. The main contribution of this paper is to develop a simple yet intelligent security framework for protecting the IoT from cyber-attacks. For this purpose, a combination of Decisive Red Fox (DRF) Optimization and Descriptive Back Propagated Radial Basis Function (DBRF) classification is developed in the proposed work. The novelty of this work is that a recently developed DRF optimization methodology is incorporated with a machine learning algorithm to maximize the security level of IoT systems. First, data preprocessing and normalization operations are performed to generate a balanced IoT dataset and improve the detection accuracy of classification. Then, the DRF optimization algorithm is applied to optimally tune the features required for accurate intrusion detection and classification. It also helps increase the training speed and reduce the error rate of the classifier. Moreover, the DBRF classification model is deployed to categorize normal and attacking data flows using the optimized features. The proposed DRF-DBRF security model's performance is validated and tested on five different, widely used IoT benchmark datasets. Finally, the results are compared with previous anomaly detection approaches using various evaluation parameters.
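
    Neither the DRF optimizer nor the DBRF classifier is available as a public library, so the sketch below only illustrates the described pipeline shape (normalize, tune a feature subset with a search heuristic, classify with an RBF-based model), using a random-search wrapper and an RBF-kernel SVM as plainly labelled stand-ins; the data and all names are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def select_features(X, y, n_iter=50, seed=0):
    """Wrapper-style feature selection (illustrative stand-in for DRF optimization)."""
    rng = np.random.default_rng(seed)
    best_mask, best_score = np.ones(X.shape[1], dtype=bool), -np.inf
    for _ in range(n_iter):
        mask = rng.random(X.shape[1]) < 0.5          # candidate feature subset
        if not mask.any():
            continue
        score = cross_val_score(SVC(kernel="rbf"), X[:, mask], y, cv=3).mean()
        if score > best_score:
            best_mask, best_score = mask, score
    return best_mask

# Synthetic stand-in for a preprocessed, balanced IoT intrusion dataset.
rng = np.random.default_rng(1)
X = rng.random((200, 10))
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)            # 0 = normal, 1 = attack

X = MinMaxScaler().fit_transform(X)                  # normalization step
mask = select_features(X, y)                         # feature tuning ("DRF" step)
clf = SVC(kernel="rbf").fit(X[:, mask], y)           # RBF classifier ("DBRF" stand-in)
print("selected features:", np.flatnonzero(mask))
```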

    Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence

    Recent years have seen tremendous growth in Artificial Intelligence (AI)-based methodological development across a broad range of domains. In this rapidly evolving field, a large number of methods are being reported using machine learning (ML) and deep learning (DL) models. The majority of these models are inherently complex and lack explanations of their decision-making process, causing them to be termed 'black-box'. One of the major bottlenecks to adopting such models in mission-critical application domains, such as banking, e-commerce, healthcare, and public services and safety, is the difficulty in interpreting them. Due to the rapid proliferation of these AI models, explaining their learning and decision-making processes is getting harder, yet doing so requires transparency and easy predictability. Finding flaws in black-box models, in order to reduce their false negative and false positive outcomes, is still difficult and inefficient. Aiming to collate the current state of the art in interpreting black-box models, this study provides a comprehensive analysis of explainable AI (XAI) models. The development of XAI is reviewed meticulously through careful selection and analysis of the current state of the art of XAI research. The paper also provides a comprehensive and in-depth evaluation of XAI frameworks and their efficacy, serving as a starting point on XAI for applied and theoretical researchers. Towards the end, it highlights emerging and critical issues pertaining to XAI research, showcasing major model-specific trends for better explanation, enhanced transparency, and improved prediction accuracy.
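
    As one concrete illustration of the model-agnostic techniques such reviews survey, the sketch below applies permutation feature importance to a black-box classifier. This is a generic example of a post-hoc explanation method, not a technique proposed by the paper itself.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Shuffle each feature and measure the accuracy drop: a large drop means
# the black-box model relies heavily on that feature.
result = permutation_importance(black_box, X_te, y_te, n_repeats=10,
                                random_state=0)
for i in np.argsort(result.importances_mean)[::-1][:5]:
    print(f"feature {i}: importance {result.importances_mean[i]:.4f}")
```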

    An intelligent rule-oriented framework for extracting key factors for grants scholarships in higher education

    Education is a fundamental sector in all countries, and in some countries students compete for educational grants due to their high cost. The incorporation of artificial intelligence in education holds great promise for the advancement of educational systems and processes. Educational data mining involves the analysis of data generated within educational environments to extract valuable insights into student performance and other factors that enhance teaching and learning. This paper aims to analyze the factors influencing students' performance and, consequently, to assist granting organizations in selecting suitable students in the Arab region (with Jordan as a use case). The problem was addressed using a rule-based technique to facilitate the utilization and implementation of a decision support system. To this end, three classical rule induction algorithms, namely PART, JRip, and RIDOR, were employed. The data utilized in this study was collected from undergraduate students at the University of Jordan from 2010 to 2020. The constructed models were evaluated using metrics such as accuracy, recall, precision, and F1-score. The findings indicate that the JRip algorithm outperformed PART and RIDOR on most of the datasets based on the F1-score metric. The interpreted decision rules of the best models reveal that two features, average study years and high school average, play vital roles in deciding which students should receive scholarships. The paper concludes with several suggested implications to support and enhance the decision-making process of granting agencies in the realm of higher education.
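
    PART, JRip, and RIDOR are Weka rule learners with no direct scikit-learn equivalents, so the sketch below extracts human-readable IF-THEN rules from a shallow decision tree as a plainly labelled stand-in for rule induction. The feature names echo the abstract's two key factors, but the data and the decision rule generating it are synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
features = ["avg_study_years", "high_school_avg"]
X = rng.random((300, 2)) * [6, 100]                  # synthetic student records
y = ((X[:, 1] > 85) & (X[:, 0] < 5)).astype(int)     # 1 = grant awarded (toy rule)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=features))     # human-readable IF-THEN rules
```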

    Performance analysis of various machine learning algorithms for CO2 leak prediction and characterization in geo-sequestration injection wells

    The effective detection and prevention of CO2 leakage in active injection wells are paramount for safe carbon capture and storage (CCS) initiatives. This study assesses five fundamental machine learning algorithms, namely, Support Vector Regression (SVR), K-Nearest Neighbor Regression (KNNR), Decision Tree Regression (DTR), Random Forest Regression (RFR), and Artificial Neural Network (ANN), for use in developing a robust data-driven model to predict potential CO2 leakage incidents in injection wells. Leveraging wellhead and bottom-hole pressure and temperature data, the models aim to simultaneously predict the location and size of leaks. A representative dataset simulating various leak scenarios in a saline aquifer reservoir was utilized. The findings reveal crucial insights into the relationships between the variables considered and leakage characteristics. Wellhead pressure, with its positive linear correlation with leak depth, could be a pivotal indicator of leak location, while bottom-hole pressure showed a negative linear relationship with leak size, the strongest association observed. Among the predictive models examined, the highest prediction accuracy was achieved by the KNNR model for both leak localization and sizing. This model displayed exceptional sensitivity to leak size and was able to identify leak magnitudes representing as little as 0.0158% of the total main flow with relatively high accuracy. Nonetheless, the study underscored that accurate leak sizing posed a greater challenge for the models than leak localization. Overall, the findings can provide valuable insights into the development of efficient data-driven well-bore leak detection systems.
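
    A minimal sketch of the best-performing setup reported above: a k-nearest-neighbour regressor jointly predicting leak depth and leak size from wellhead and bottom-hole measurements. The data here is synthetic, constructed only to mirror the sign of the correlations the study reports; the paper's simulated saline-aquifer dataset is not reproduced.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
# Columns: wellhead pressure, wellhead temperature, bottom-hole pressure,
# bottom-hole temperature (all scaled to [0, 1] for this toy example).
X = rng.random((500, 4))
leak_depth = 2.0 * X[:, 0] + rng.normal(0, 0.05, 500)         # rises with wellhead P
leak_size = 1.5 * (1.0 - X[:, 2]) + rng.normal(0, 0.05, 500)  # falls with bottom-hole P
Y = np.column_stack([leak_depth, leak_size])

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
knnr = KNeighborsRegressor(n_neighbors=5).fit(X_tr, Y_tr)     # joint (multi-output) model
print("MAE (depth, size):",
      mean_absolute_error(Y_te, knnr.predict(X_te), multioutput="raw_values"))
```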

    Synthesis of the neuro-fuzzy regulator with genetic algorithm

    Real-world control objects are characterized by various types of random perturbations, which significantly reduce the quality of the control process. This motivates the use of modern intelligent-technology methods to synthesize control systems for structurally complex dynamic objects, making it possible to compensate for external factors that are random and partially uncertain. The article considers the synthesis of automatic control systems for dynamic objects using the theory of intelligent control. A neural network based on radial basis functions is used at each discrete interval for neuro-fuzzy approximation of the control system, allowing real-time adjustment of the regulator parameters. The radial basis function is designed to approximate functions defined implicitly by pattern sets. The neuro-fuzzy regulator's parameters are configured using a genetic algorithm, chosen for its fast convergence and its ability to locate global extrema, enabling more efficient computation of the regulator's settings. The parameters are represented as a vector, facilitating application to multidimensional objects. The effectiveness of the neuro-fuzzy regulator lies in its ability to provide quality control of a dynamic object under random perturbations and uncertain input data.
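
    A minimal sketch of the scheme just described, under stated assumptions: a genetic algorithm evolves the weight vector of a small radial-basis-function regulator to minimize tracking error on a toy first-order plant with random disturbances. The plant model, cost function, and GA settings are illustrative assumptions, not the article's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
CENTERS = np.linspace(-1.0, 1.0, 5)                  # RBF centres over the error range

def rbf_control(error, weights, width=0.5):
    """Control signal: a weighted sum of Gaussian basis functions of the error."""
    return weights @ np.exp(-(error - CENTERS) ** 2 / (2 * width ** 2))

def cost(weights, steps=100):
    """Accumulated squared tracking error on a disturbed first-order plant."""
    x, total, noise = 0.0, 0.0, np.random.default_rng(1)
    for _ in range(steps):
        e = 1.0 - x                                  # track the setpoint 1.0
        x = 0.9 * x + rbf_control(e, weights) + noise.normal(0, 0.01)
        total += e ** 2
    return total

pop = rng.normal(0, 0.5, (40, CENTERS.size))         # initial population of weight vectors
for _ in range(60):                                  # generations
    fitness = np.array([cost(w) for w in pop])
    parents = pop[np.argsort(fitness)[:20]]          # keep the 20 fittest (elitism)
    pairs = parents[rng.integers(0, 20, (20, 2))]    # random parent pairs
    children = pairs.mean(axis=1) + rng.normal(0, 0.1, (20, CENTERS.size))  # crossover + mutation
    pop = np.vstack([parents, children])

best = pop[np.argmin([cost(w) for w in pop])]
print("tuned regulator weights:", np.round(best, 3))
```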

    Online semi-supervised learning in non-stationary environments

    Existing Data Stream Mining (DSM) algorithms assume the availability of labelled and balanced data, immediately or after some delay, to extract worthwhile knowledge from continuous and rapid data streams. However, in many real-world applications such as robotics, weather monitoring, fraud detection systems, cyber security, and computer network traffic flow, an enormous amount of high-speed data is generated by Internet of Things sensors and real-time data on the Internet. Manual labelling of these data streams is not practical due to time consumption and the need for domain expertise. Another challenge is learning under Non-Stationary Environments (NSEs), which occur due to changes in the data distributions of a set of input variables and/or class labels. The problem of Extreme Verification Latency (EVL) under NSEs is referred to as an Initially Labelled Non-Stationary Environment (ILNSE). This is a challenging task because the learning algorithms have no direct access to the true class labels when the concept evolves. Several approaches exist that deal with NSE and EVL in isolation, but few algorithms address both issues simultaneously. This research directly responds to the ILNSE challenge by proposing two novel algorithms: the "Predictor for Streaming Data with Scarce Labels" (PSDSL) and the Heterogeneous Dynamic Weighted Majority (HDWM) classifier. PSDSL is an Online Semi-Supervised Learning (OSSL) method for real-time DSM and is closely related to label scarcity issues in online machine learning. The key capabilities of PSDSL include learning from a small amount of labelled data in an incremental or online manner and being available to predict at any time. To achieve this, PSDSL utilises both labelled and unlabelled data to train its prediction models, meaning it continuously learns from incoming data and updates the model as new labelled or unlabelled data becomes available over time. Furthermore, it can predict under NSE conditions despite the scarcity of class labels. PSDSL is built on top of the HDWM classifier, which preserves the diversity of the classifiers, and both can intelligently switch and adapt to the prevailing conditions. PSDSL switches between self-learning, micro-clustering, and CGC learning states, whichever is most beneficial given the characteristics of the data stream. HDWM makes use of "seed" learners of different types in an ensemble to maintain its diversity; an ensemble is simply a combination of predictive models grouped to improve on the predictive performance of a single classifier. PSDSL is empirically evaluated against COMPOSE, LEVELIW, SCARGC, and MClassification on benchmark NSE datasets as well as Massive Online Analysis (MOA) data streams and real-world datasets. The results showed that PSDSL performed significantly better than existing approaches on most real-time data streams, including randomised data instances. PSDSL also performed significantly better than a 'Static' baseline, i.e. a classifier that is not updated after being trained on the first examples in the data stream. When applied to MOA-generated data streams, PSDSL ranked highest (1.5) and thus performed significantly better than SCARGC, while SCARGC performed the same as the Static baseline. PSDSL achieved better average prediction accuracies in a shorter time than SCARGC. The HDWM algorithm is evaluated on artificial and real-world data streams against existing well-known approaches such as the heterogeneous Weighted Majority Algorithm (WMA) and the homogeneous Dynamic Weighted Majority (DWM) algorithm. The results showed that HDWM performed significantly better than WMA and DWM. Also, when recurring concept drifts were present, the predictive performance of HDWM showed an improvement over DWM. In both drift and real-world streams, significance tests and post hoc comparisons found significant differences between the algorithms; HDWM performed significantly better than DWM and WMA when applied to MOA data streams and four real-world datasets (Electric, Spam, Sensor, and Forest Cover). The seeding mechanism and dynamic inclusion of new base learners in HDWM benefit from both forgetting and retaining models. The algorithm also independently selects the optimal base classifier in its ensemble depending on the problem. A new approach, Envelope-Clustering, is introduced to resolve cluster overlap conflicts during the cluster labelling process. In this process, PSDSL transforms the centroid information of micro-clusters into micro-instances and generates new clusters called Envelopes. The nearest envelope clusters assist the conflicted micro-clusters and successfully guide the cluster labelling process after concept drifts in the absence of true class labels. PSDSL has also been evaluated on the real-world problem of keystroke dynamics: PSDSL achieved higher prediction accuracy (85.3%) than SCARGC (81.6%), while the Static baseline (49.0%) degraded significantly due to changes in the users' typing patterns. Furthermore, the predictive accuracy of SCARGC was found to fluctuate widely (between 41.1% and 81.6%) depending on the value of the parameter 'k' (the number of clusters), while PSDSL automatically determines the best value for this parameter.
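
    The core weighted-majority mechanics behind HDWM can be sketched as follows: heterogeneous incremental "seed" learners vote on each instance, wrong experts are down-weighted by a factor BETA, and every expert keeps learning. This is a simplified reading of the classic dynamic-weighted-majority scheme with mixed seed learners, not the thesis's full PSDSL/HDWM implementation (it omits, for example, dynamic expert insertion and the clustering components); the drifting toy stream and all constants are illustrative.

```python
import numpy as np
from sklearn.linear_model import Perceptron, SGDClassifier
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
classes = np.array([0, 1])
experts = [SGDClassifier(loss="log_loss"), Perceptron(), GaussianNB()]  # heterogeneous seeds
weights = np.ones(len(experts))
BETA = 0.5                                           # penalty factor for a wrong expert

# Warm-start every seed learner on a tiny labelled batch.
X0 = np.array([[0.1, 0.5], [0.9, 0.5]] * 5)
y0 = np.array([0, 1] * 5)
for e in experts:
    e.partial_fit(X0, y0, classes=classes)

correct = 0
for t in range(1000):                                # simulated stream with one drift
    x = rng.random((1, 2))
    y = int(x[0, 0] > (0.3 if t < 500 else 0.7))     # the concept changes at t = 500

    preds = [int(e.predict(x)[0]) for e in experts]
    votes = np.bincount(preds, weights=weights, minlength=2)
    correct += int(np.argmax(votes)) == y            # weighted-majority decision

    for i, p in enumerate(preds):                    # down-weight wrong experts
        if p != y:
            weights[i] *= BETA
    weights /= weights.sum()

    for e in experts:                                # incremental update of every expert
        e.partial_fit(x, [y], classes=classes)

print(f"stream accuracy: {correct / 1000:.3f}")
```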

    Research Report / Hochschule Mittweida


    Predictive Equations for Estimation of the Slump of Concrete Using GEP and MARS Methods

    This paper develops two robust data-driven models, namely gene expression programming (GEP) and multivariate adaptive regression splines (MARS), for estimating the slump of concrete (SL). The main feature of the proposed data-driven methods is that they provide explicit mathematical equations for estimating SL. The experimental data set contains five input variables: the water-cement ratio (W/C), water (W), cement (C), river sand (Sa), and Bida Natural Gravel (BNG). Three common statistical indices, namely the correlation coefficient (R), root mean square error (RMSE), and mean absolute error (MAE), were used to evaluate the accuracy of the derived equations. These indices revealed that the GEP formula (R = 0.976, RMSE = 19.143, MAE = 15.113) was more accurate than the MARS equation (R = 0.962, RMSE = 23.748, MAE = 16.795). However, MARS, owing to its simple regression equation for estimating SL, is more convenient for practical purposes than the complex formulation of GEP.
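
    The three indices used for the comparison can be computed directly; the snippet below defines them from scratch, with placeholder values standing in for measured and equation-predicted slump.

```python
import numpy as np

def evaluate(y_true, y_pred):
    r = np.corrcoef(y_true, y_pred)[0, 1]            # correlation coefficient R
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root mean square error
    mae = np.mean(np.abs(y_true - y_pred))           # mean absolute error
    return r, rmse, mae

y_true = np.array([150.0, 175.0, 200.0, 120.0, 90.0])  # measured slump (placeholder, mm)
y_pred = np.array([155.0, 170.0, 190.0, 130.0, 95.0])  # slump from a derived equation
print("R=%.3f  RMSE=%.3f  MAE=%.3f" % evaluate(y_true, y_pred))
```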

    Smart Gas Sensors: Materials, Technologies, Practical Applications, and Use of Machine Learning – A Review

    The electronic nose, popularly known as the E-nose, which combines gas sensor arrays (GSAs) with machine learning, has gained a strong foothold in gas-sensing technology. The E-nose, designed to mimic the human olfactory system, is used for the detection and identification of various volatile compounds. GSAs develop a unique signal fingerprint for each volatile compound, enabling pattern recognition using machine learning algorithms. The inexpensive, portable, and non-invasive characteristics of the E-nose system have rendered it indispensable within the gas-sensing arena. As a result, E-noses have been widely employed in applications across the food industry, health management, disease diagnosis, water and air quality control, and toxic gas leakage detection. This paper reviews the various sensor fabrication technologies of GSAs and highlights the main operational framework of the E-nose system. The paper details vital signal pre-processing techniques of feature extraction and feature selection, in addition to machine learning algorithms such as SVM, kNN, ANN, and Random Forests for determining the type of gas and estimating its concentration in a competitive environment. The paper further explores the potential applications of E-noses for diagnosing diseases, monitoring air quality, assessing the quality of food samples, and estimating concentrations of volatile organic compounds (VOCs) in air and in food samples. The review concludes with some challenges faced by E-noses, alternative ways to tackle them, and some recommendations as potential future work for the further development and design enhancement of E-noses.
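
    A minimal sketch of the pattern-recognition stage the review describes, under stated assumptions: each row is a synthetic sensor-array "fingerprint", from which one model identifies the gas and a second estimates its concentration. The sensor count, response model, and data are illustrative stand-ins, not taken from any dataset in the review.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_sensors = 300, 8
gas = rng.integers(0, 3, n_samples)                  # three gas classes
conc = rng.uniform(1, 100, n_samples)                # concentration (ppm)
# Toy response model: each gas shifts the array response; concentration scales it.
X = rng.normal(0, 0.1, (n_samples, n_sensors)) + gas[:, None] + 0.01 * conc[:, None]

X_tr, X_te, g_tr, g_te, c_tr, c_te = train_test_split(X, gas, conc, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, g_tr)   # gas identification
reg = RandomForestRegressor(random_state=0).fit(X_tr, c_tr)    # concentration estimation
print("gas-ID accuracy:", clf.score(X_te, g_te))
print("concentration R^2:", round(reg.score(X_te, c_te), 3))
```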
