884 research outputs found

    Security in Data Mining- A Comprehensive Survey

    Data mining techniques allow individuals to extract hidden knowledge, but they also introduce a number of privacy threats. In this paper, we study some of these issues along with a detailed discussion of the applications of various data mining techniques for providing security. An efficient classification technique, when used properly, allows a user to differentiate between a phishing website and a normal website, to classify users as normal users or criminals based on their activities on social networks (crime profiling), and to prevent users from executing malicious code by labelling it as malicious. One of the most important applications of data mining is intrusion detection, where different data mining techniques can be applied to detect an intrusion effectively and report it in real time so that the necessary actions are taken to thwart the intruder's attempts. Privacy preservation, outlier detection, anomaly detection, and phishing website classification are discussed in this paper.

    Military and Security Applications: Cybersecurity (Encyclopedia of Optimization, Third Edition)

    The domain of cybersecurity is growing as part of broader military and security applications, and the capabilities and processes in this realm have qualities and characteristics that warrant using solution methods in mathematical optimization. Problems of interest may involve continuous or discrete variables, a convex or non-convex decision space, differing levels of uncertainty, and constrained or unconstrained frameworks. Cyberattacks, for example, can be modeled using hierarchical threat structures and may involve decision strategies from both an organization or individual and the adversary. Network traffic flow, intrusion detection and prevention systems, interconnected human-machine interfaces, and automated systems – these all require higher levels of complexity in mathematical optimization modeling and analysis. Attributes such as cyber resiliency, network adaptability, security capability, and information technology flexibility – these require the measurement of multiple characteristics, many of which may involve both quantitative and qualitative interpretations. And for nearly every organization that is invested in some cybersecurity practice, decisions must be made that involve the competing objectives of cost, risk, and performance. As such, mathematical optimization has been widely used and accepted to model important and complex decision problems, providing analytical evidence for helping drive decision outcomes in cybersecurity applications. In the paragraphs that follow, this chapter highlights some of the recent mathematical optimization research in the body of knowledge applied to the cybersecurity space. The subsequent literature discussed fits within a broader cybersecurity domain taxonomy considering the categories of analyze, collect and operate, investigate, operate and maintain, oversee and govern, protect and defend, and securely provision. 
    Further, the paragraphs are structured around generalized mathematical optimization categories to provide a lens to summarize the existing literature, including uncertainty (stochastic programming, robust optimization, etc.), discrete (integer programming, multiobjective, etc.), continuous-unconstrained (nonlinear least squares, etc.), continuous-constrained (global optimization, etc.), and continuous-constrained (nonlinear programming, network optimization, linear programming, etc.). At the conclusion of this chapter, research implications and extensions are offered to the reader who wishes to pursue further mathematical optimization research for cybersecurity within a broader military and security applications context.

    Phishing website detection using genetic algorithm-based feature selection and parameter hypertuning

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. False webpages are created by cyber attackers who seek to mislead users into revealing sensitive and personal information, from credit card details to passwords. Phishing is a class of cyber attacks that mislead users into clicking on false websites and logging into related accounts, after which the attackers steal funds. These attacks increase annually given the exponential growth in e-commerce customers, which makes it difficult to distinguish between harmless and false websites. Conventional methods for detecting phishing websites rely on databases of blacklisted and whitelisted websites. Such methods cannot detect new phishing websites. To solve this problem, researchers are developing machine learning (ML) and deep learning (DL) based methods. In this dissertation, a hybrid solution is proposed that uses genetic algorithms and ML algorithms for phishing detection based on the URL of the website. For evaluation, conventional ML and DL models are compared using various feature sets resulting from commonly used feature selection methods, such as mutual information and recursive feature elimination. This dissertation proposes a final model with an accuracy of 95.34% on the test set.
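
    As a rough illustration of genetic-algorithm feature selection in general (not the dissertation's actual method), the sketch below evolves binary feature masks. The relevance scores, the `fitness` function, and all parameter values are hypothetical stand-ins; in practice the fitness would more plausibly be the cross-validated accuracy of a classifier trained on the selected URL features.

```python
import random

def fitness(mask, relevance, penalty=0.05):
    """Toy stand-in for model accuracy (an assumption, not from the source):
    reward the relevance of selected features, penalize subset size."""
    return sum(r for r, m in zip(relevance, mask) if m) - penalty * sum(mask)

def ga_feature_selection(relevance, pop_size=20, generations=40,
                         mutation_rate=0.05, seed=0):
    """Evolve a 0/1 mask over features; requires at least two features."""
    rng = random.Random(seed)
    n = len(relevance)

    def score(mask):
        return fitness(mask, relevance)

    # Random initial population of feature masks.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=score)

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            # Binary tournament selection for each parent.
            parents = []
            for _ in range(2):
                a, b = rng.sample(pop, 2)
                parents.append(a if score(a) >= score(b) else b)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = parents[0][:cut] + parents[1][cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return best

# Hypothetical per-feature relevance scores for six URL-derived features.
relevance = [0.30, 0.00, 0.25, 0.01, 0.20, 0.00]
mask = ga_feature_selection(relevance)
print("selected feature indices:", [i for i, g in enumerate(mask) if g])
```

    Elitism keeps the best mask found so far in every generation, so the best fitness never decreases; the size penalty is one simple way to favor smaller feature subsets.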

    A Competent Approach for Type of Phishing Attack Detection Using Multi-Layer Neural Network

    With the growth of contemporary technologies and large-scale global computer networks, web attacks are escalating as people and legitimate institutions take a growing interest in the internet. Phishing is a web attack carried out by an attacker using both social and technical engineering. More attacks are launched on the web every month, seeking to convince web users that they are communicating with a legitimate entity in order to steal identity information, login records, and account details. Phishing attack detection and classification methods are used for prevention and in-depth analysis of the attacks. In this paper, the proposed model is designed with multi-directional feature analysis along with Back-Propagation Probabilistic Neural Network (BP-PNN) classification. The proposed model performed better in terms of accuracy in all of the domains, based upon attack detection and classification.

    Applications in security and evasions in machine learning : a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of current security and privacy scenarios across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review the state-of-the-art approaches where ML can be applied more effectively to fulfill current real-world requirements in security. We examine different security applications' perspectives where ML models play an essential role and compare their accuracy results along different possible dimensions. Analyzing ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by committing adversarial attacks. Therefore, the need arises to assess the vulnerability of ML models to adversarial attacks at development time. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on ML models. To give a proper visualization of security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate the adversarial attacks based on the attackers' knowledge of the model and address the points in the model at which possible attacks may be committed. Finally, we also investigate different types of properties of the adversarial attacks.

    A Review of Threat Vectors to DNA Sequencing Pipelines

    Bioinformatics is a steadily growing field that focuses on the intersection of biology with computer science. Tools and techniques developed within this field are quickly becoming fixtures in genomics, forensics, epidemiology, and bioengineering. The development and analysis of DNA sequencing and synthesis have enabled this significant rise in demand for bioinformatic tools. Notwithstanding, these bioinformatic tools have developed in a research context free of significant cybersecurity threats. With the significant growth of the field and the commercialization of genetic information, this is no longer the case. This paper examines the bioinformatic landscape through reviewing the biological and cybersecurity threats within the bioinformatics pipeline. It is found that there are significant security deficits within existing bioinformatic databases. Additionally, it is found that there is a theoretical trojan threat posed by unverified malicious DNA sequences

    Explainable Artificial Intelligence and Causal Inference based ATM Fraud Detection

    Gaining the trust of customers and providing them empathy are critical in the financial domain. Frequent occurrence of fraudulent activities affects these two factors. Hence, financial organizations and banks must take utmost care to mitigate them. Among them, fraudulent ATM transactions are a common problem faced by banks. The following are the critical challenges involved in fraud datasets: the dataset is highly imbalanced, the fraud pattern is changing, etc. Owing to the rarity of fraudulent activities, fraud detection can be formulated as either a binary classification problem or one-class classification (OCC). In this study, we applied these techniques to an ATM transactions dataset collected from India. In binary classification, we investigated the effectiveness of various oversampling techniques, such as the Synthetic Minority Oversampling Technique (SMOTE) and its variants, and Generative Adversarial Networks (GAN). Further, we employed various machine learning techniques, viz., Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Tree (GBT), and Multi-layer Perceptron (MLP). GBT outperformed the rest of the models by achieving 0.963 AUC, and DT came second with 0.958 AUC. DT is the winner if the complexity and interpretability aspects are considered. Among all the oversampling approaches, SMOTE and its variants were observed to perform better. In OCC, IForest attained 0.959 CR, and OCSVM secured second place with 0.947 CR. Further, we incorporated explainable artificial intelligence (XAI) and causal inference (CI) in the fraud detection framework and studied it through various analyses. Comment: 34 pages; 21 Figures; 8 Tables
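
    To make the oversampling idea concrete, here is a minimal pure-Python sketch of the core SMOTE step: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. It is a simplified illustration, not the study's implementation; the function name `smote`, its parameters, and the example points are assumptions made for this sketch.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from minority-class feature tuples.
    Simplified sketch: needs at least two distinct minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between x and the chosen neighbour.
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical minority-class (fraud) samples with two numeric features.
fraud = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (0.9, 1.9)]
for point in smote(fraud, n_new=5):
    print(point)
```

    Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies rather than duplicating existing rows.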

    Machine Learning

    Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.