984 research outputs found

    Efficient classification using parallel and scalable compressed model and Its application on intrusion detection

    Full text link
    In order to achieve high efficiency of classification in intrusion detection, a compressed model is proposed in this paper which combines horizontal compression with vertical compression. OneR is utilized as horizontal com-pression for attribute reduction, and affinity propagation is employed as vertical compression to select small representative exemplars from large training data. As to be able to computationally compress the larger volume of training data with scalability, MapReduce based parallelization approach is then implemented and evaluated for each step of the model compression process abovementioned, on which common but efficient classification methods can be directly used. Experimental application study on two publicly available datasets of intrusion detection, KDD99 and CMDC2012, demonstrates that the classification using the compressed model proposed can effectively speed up the detection procedure at up to 184 times, most importantly at the cost of a minimal accuracy difference with less than 1% on average

    Security in Data Mining- A Comprehensive Survey

    Get PDF
    Data mining techniques, while allowing the individuals to extract hidden knowledge on one hand, introduce a number of privacy threats on the other hand. In this paper, we study some of these issues along with a detailed discussion on the applications of various data mining techniques for providing security. An efficient classification technique when used properly, would allow an user to differentiate between a phishing website and a normal website, to classify the users as normal users and criminals based on their activities on Social networks (Crime Profiling) and to prevent users from executing malicious codes by labelling them as malicious. The most important applications of Data mining is the detection of intrusions, where different Data mining techniques can be applied to effectively detect an intrusion and report in real time so that necessary actions are taken to thwart the attempts of the intruder. Privacy Preservation, Outlier Detection, Anomaly Detection and PhishingWebsite Classification are discussed in this paper

    Challenges for Data Mining on Sensor Data of Interlinked Processes

    Get PDF
    In industries like steel production, interlinked production processes leave no time for assessing the physical quality of intermediate products. Failures during the process can lead to high internal costs when already defective products are passed through the entire value chain. However, process data like machine parameters and sensor data which are di- rectly linked to quality can be recorded. Based on a rolling mill case study, the paper discusses how decentralized data mining and intelligent machine-to-machine communication could be used to predict the physical quality of intermediate products online and in real-time for detecting quality issues as early as possible. The recording of huge data masses and the distributed but sequential nature of the problem lead to challenging research questions for the next generation of data mining

    Security in Data Mining-A Comprehensive Survey

    Get PDF
    Data mining techniques, while allowing the individuals to extract hidden knowledge on one hand, introduce a number of privacy threats on the other hand. In this paper, we study some of these issues along with a detailed discussion on the applications of various data mining techniques for providing security. An efficient classification technique when used properly, would allow an user to differentiate between a phishing website and a normal website, to classify the users as normal users and criminals based on their activities on Social networks (Crime Profiling) and to prevent users from executing malicious codes by labelling them as malicious. The most important applications of Data mining is the detection of intrusions, where different Data mining techniques can be applied to effectively detect an intrusion and report in real time so that necessary actions are taken to thwart the attempts of the intruder

    Identification and characterization of irregular consumptions of load data

    Get PDF
    The historical information of loadings on substation helps in evaluation of size of photovoltaic (PV) generation and energy storages for peak shaving and distribution system upgrade deferral. A method, based on consumption data, is proposed to separate the unusual consumption and to form the clusters of similar regular consumption. The method does optimal partition of the load pattern data into core points and border points, high and less dense regions, respectively. The local outlier factor, which does not require fixed probability distribution of data and statistical measures, ranks the unusual consumptions on only the border points, which are a few percent of the complete data. The suggested method finds the optimal or close to optimal number of clusters of similar shape of load patterns to detect regular peak and valley load demands on different days. Furthermore, identification and characterization of features pertaining to unusual consumptions in load pattern data have been done on border points only. The effectiveness of the proposed method and characterization is tested on two practical distribution systems

    Design and Analysis of a Reconfigurable Hierarchical Temporal Memory Architecture

    Get PDF
    Self-learning hardware systems, with high-degree of plasticity, are critical in performing spatio-temporal tasks in next-generation computing systems. To this end, hierarchical temporal memory (HTM) offers time-based online-learning algorithms that store and recall temporal and spatial patterns. In this work, a reconfigurable and scalable HTM architecture is designed with unique pooling realizations. Virtual synapse design is proposed to address the dynamic interconnections occurring in the learning process. The architecture is interweaved with parallel cells and columns that enable high processing speed for the cortical learning algorithm. HTM has two core operations, spatial and temporal pooling. These operations are verified for two different datasets: MNIST and European number plate font. The spatial pooling operation is independently verified for classification with and without the presence of noise. The temporal pooling is verified for simple prediction. The spatial pooler architecture is ported onto an Altera cyclone II fabric and the entire architecture is synthesized for Xilinx Virtex IV. The results show that 91% classification accuracy is achieved with MNIST database and 90% accuracy for the European number plate font numbers with the presence of Gaussian and Salt & Pepper noise. For the prediction, first and second order predictions are observed for a 5-number long sequence generated from European number plate font and ~95% accuracy is obtained. Moreover, the proposed hardware architecture offers 3902X speedup over the software realization. These results indicate that the proposed architecture can serve as a core to build the HTM in hardware and eventually as a standalone self-learning hardware system

    Applications in security and evasions in machine learning : a survey

    Get PDF
    In recent years, machine learning (ML) has become an important part to yield security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of the current scenario of security and privacy across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review the state of the art approaches where ML is applicable more effectively to fulfill current real-world requirements in security. We examine different security applications' perspectives where ML models play an essential role and compare, with different possible dimensions, their accuracy results. By analyzing ML algorithms in security application it provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade the ML models by committing adversarial attacks. Therefore, requirements rise to assess the vulnerability in the ML models to cope up with the adversarial attacks at the time of development. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on the ML models. To give proper visualization of security properties, we have represented the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate the adversarial attacks based on the attackers' knowledge about the model and addressed the point of the model at which possible attacks may be committed. Finally, we also investigate different types of properties of the adversarial attacks
    • …
    corecore