4,323 research outputs found

    Intelligent FMI-Reduct Ensemble Frame Work for Network Intrusion Detection System (NIDS)

    Computer networks and information systems in finance, transport, medicine, and education carry large amounts of sensitive and confidential data. As the volume of such data moving through network applications grows, those applications become vulnerable to a variety of cyber threats. Manual monitoring of network connections and malicious activity is extremely difficult, which raises concern about attacks on network-related systems. Network intrusion is a growing problem on the internet and in computer networks, and it can harm the network structure in various ways, such as by altering system configurations and parameters. To address this issue, an efficient Network Intrusion Detection System (NIDS) that identifies malicious activity within a network has become necessary. The NIDS must regularly monitor network activity to detect malicious connections and help secure computer networks. Machine learning (ML) and data mining approaches have proven beneficial in such scenarios. In this article, a mutual data-driven Fuzzy-Rough feature selection technique is proposed to rank important features for the NIDS model and so strengthen defence against cyber security attacks. The primary goal of the research is to classify potential attacks in high-dimensional scenarios, handling redundant and irrelevant features with the proposed dimensionality reduction technique, which combines Fuzzy and Rough Set Theory. Classical anomaly intrusion detection approaches that use individual classifiers such as SVM, Decision Tree, Naive Bayes, k-Nearest Neighbor, and Multi-Layer Perceptron are not sufficient for detecting modern attacks effectively. Hence, the proposed anomaly-based Network Intrusion Detection System, named "FMI-Reduct based Ensemble Classifier", has been tested on highly imbalanced benchmark datasets, the NSL_KDD and UNSW_NB15 datasets of intrusion

    Data-driven Soft Sensors in the Process Industry

    In the last two decades, Soft Sensors have established themselves as a valuable alternative to the traditional means of acquiring critical process variables, monitoring processes, and performing other tasks related to process control. This paper discusses the characteristics of process industry data that are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, such as the chemical industry, bioprocess industry, and steel industry. The work focuses on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness, and huge, though not yet fully realised, potential. Its main contributions are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with their possible solutions.

    A systematic review of data quality issues in knowledge discovery tasks

    The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

    Omnivariate rule induction using a novel pairwise statistical test

    Rule learning algorithms, for example RIPPER, induce univariate rules; that is, a propositional condition in a rule uses only one feature. In this paper, we propose an omnivariate induction of rules where, under each condition, both a univariate and a multivariate condition are trained, and the best is chosen according to a novel statistical test. This paper has three main contributions. First, we propose a novel statistical test for comparing two classifiers, the combined 5 x 2 cv t test, which is a variant of the 5 x 2 cv t test, and we give its connections to other tests such as the 5 x 2 cv F test and the k-fold paired t test. Second, we propose a multivariate version of RIPPER, where a support vector machine with a linear kernel is used to find multivariate linear conditions. Third, we propose an omnivariate version of RIPPER, where model selection is done via the combined 5 x 2 cv t test. Our results indicate that 1) the combined 5 x 2 cv t test has higher power (lower type II error), lower type I error, and higher replicability compared to the 5 x 2 cv t test, and 2) omnivariate rules are better in that they choose whichever condition is more accurate, selecting the right model automatically and separately for each condition in a rule.
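
    For orientation, the two baseline tests named in the abstract above can be sketched as follows. This is a minimal sketch of the standard 5 x 2 cv t statistic and 5 x 2 cv F statistic as they are commonly defined, not the paper's combined test, whose formula the abstract does not give. The function name and the input layout (five pairs of error-rate differences, one pair per replication of 2-fold cross-validation) are assumptions made for illustration.

```python
import math


def five_by_two_cv_stats(diffs):
    """Return (t, f) for the standard 5 x 2 cv t and F tests.

    diffs is a hypothetical input layout: five (p1, p2) pairs, where p1 and p2
    are the differences in error rate between two classifiers on the two folds
    of one replication of 2-fold cross-validation.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications with 2 folds each")

    # Per-replication variance estimate: squared deviations of the two fold
    # differences from their mean.
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)

    total_var = sum(variances)
    if total_var == 0:
        raise ValueError("all variance estimates are zero; statistics undefined")

    # t statistic: first difference over the root of the averaged variance
    # (compared against a t distribution with 5 degrees of freedom).
    t_stat = diffs[0][0] / math.sqrt(total_var / 5.0)

    # F statistic: all ten squared differences over twice the summed variance
    # (compared against an F distribution with 10 and 5 degrees of freedom).
    f_stat = sum(p1 ** 2 + p2 ** 2 for p1, p2 in diffs) / (2.0 * total_var)
    return t_stat, f_stat


if __name__ == "__main__":
    # Made-up differences, for illustration only.
    example = [(0.02, 0.04), (0.01, 0.03), (0.03, 0.02), (0.02, 0.05), (0.00, 0.03)]
    print(five_by_two_cv_stats(example))
```

    The t statistic uses only the first of the ten differences in its numerator, while the F statistic uses all ten.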

    IMPROVED EVOLUTIONARY SUPPORT VECTOR MACHINE CLASSIFIER FOR CORONARY ARTERY HEART DISEASE PREDICTION AMONG DIABETIC PATIENTS

    Soft computing paves the way for many applications, including medical informatics. Decision support systems that help medical practitioners diagnose diseases have gained major attention. Diabetes mellitus is a hereditary disease that might result in major heart disease. This research work proposes a soft computing mechanism named the Improved Evolutionary Support Vector Machine (IESVM) classifier for coronary artery heart disease (CAHD) risk prediction among diabetic patients. An attempt is made to build an attribute selection mechanism into the classifier in order to reduce the misclassification error rate of the conventional support vector machine classifier. A radial basis kernel function is employed in IESVM. The IESVM classifier is evaluated using sensitivity, specificity, prediction accuracy, and the Matthews correlation coefficient (MCC), and is also compared with existing work and our earlier proposed works.
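
    The four evaluation metrics listed in the abstract above have standard definitions in terms of a binary confusion matrix. The sketch below shows those textbook formulas only. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy, and MCC from confusion-matrix counts.

    tp, fn, tn, fp are the counts of true positives, false negatives,
    true negatives, and false positives.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; defined as 0 here when any marginal
    # total is zero, which is a common convention.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(binary_metrics(tp=40, fn=10, tn=45, fp=5))
```

    MCC is often reported alongside accuracy because it stays informative when the two classes are very unequal in size.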

    Data mining in manufacturing: a review based on the kind of knowledge

    In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas, including product and process design, assembly, materials planning, quality control, scheduling, maintenance, and fault detection. Data mining has emerged as an important tool for knowledge acquisition from manufacturing databases. This paper reviews the literature on knowledge discovery and data mining applications in the broad domain of manufacturing, with special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis. The papers reviewed have therefore been categorized in these five categories. The review shows rapid growth in the application of data mining to manufacturing processes and enterprises in the last 3 years, and it reveals the progressive applications and existing gaps in data mining in manufacturing. A novel text mining approach has also been applied to the abstracts and keywords of 150 papers to identify research gaps and find the linkages between knowledge area, knowledge type, and the applied data mining tools and techniques.