139 research outputs found

    From Intrusion Detection to Attacker Attribution: A Comprehensive Survey of Unsupervised Methods

    Get PDF
    Over the last five years there has been an increase in the frequency and diversity of network attacks. This holds true, as more and more organisations admit compromises on a daily basis. Many misuse and anomaly based Intrusion Detection Systems (IDSs) that rely on either signatures, supervised or statistical methods have been proposed in the literature, but their trustworthiness is debatable. Moreover, as this work uncovers, the current IDSs are based on obsolete attack classes that do not reflect the current attack trends. For these reasons, this paper provides a comprehensive overview of unsupervised and hybrid methods for intrusion detection, discussing their potential in the domain. We also present and highlight the importance of feature engineering techniques that have been proposed for intrusion detection. Furthermore, we discuss that current IDSs should evolve from simple detection to correlation and attribution. We descant how IDS data could be used to reconstruct and correlate attacks to identify attackers, with the use of advanced data analytics techniques. Finally, we argue how the present IDS attack classes can be extended to match the modern attacks and propose three new classes regarding the outgoing network communicatio

    Enhancing Computer Network Security through Improved Outlier Detection for Data Streams

    Get PDF
    V několika posledních letech se metody strojového učení (zvláště ty zabývající se detekcí odlehlých hodnot - OD) v oblasti kyberbezpečnosti opíraly o zjišťování anomálií síťového provozu spočívajících v nových schématech útoků. Detekce anomálií v počítačových sítích reálného světa se ale stala stále obtížnější kvůli trvalému nárůstu vysoce objemných, rychlých a dimenzionálních průběžně přicházejících dat (SD), pro která nejsou k dispozici obecně uznané a pravdivé informace o anomalitě. Účinná detekční schémata pro vestavěná síťová zařízení musejí být rychlá a paměťově nenáročná a musejí být schopna se potýkat se změnami konceptu, když se vyskytnou. Cílem této disertace je zlepšit bezpečnost počítačových sítí zesílenou detekcí odlehlých hodnot v datových proudech, obzvláště SD, a dosáhnout kyberodolnosti, která zahrnuje jak detekci a analýzu, tak reakci na bezpečnostní incidenty jako jsou např. nové zlovolné aktivity. Za tímto účelem jsou v práci navrženy čtyři hlavní příspěvky, jež byly publikovány nebo se nacházejí v recenzním řízení časopisů. Zaprvé, mezera ve volbě vlastností (FS) bez učitele pro zlepšování již hotových metod OD v datových tocích byla zaplněna navržením volby vlastností bez učitele pro detekci odlehlých průběžně přicházejících dat označované jako UFSSOD. Následně odvozujeme generický koncept, který ukazuje dva aplikační scénáře UFSSOD ve spojení s online algoritmy OD. Rozsáhlé experimenty ukázaly, že UFSSOD coby algoritmus schopný online zpracování vykazuje srovnatelné výsledky jako konkurenční metoda upravená pro OD. Zadruhé představujeme nový aplikační rámec nazvaný izolovaný les založený na počítání výkonu (PCB-iForest), jenž je obecně schopen využít jakoukoliv online OD metodu založenou na množinách dat tak, aby fungovala na SD. Do tohoto algoritmu integrujeme dvě varianty založené na klasickém izolovaném lese. Rozsáhlé experimenty provedené na 23 multidisciplinárních datových sadách týkajících se bezpečnostní problematiky reálného světa ukázaly, že PCB-iForest jasně překonává už zavedené konkurenční metody v 61 % případů a dokonce dosahuje ještě slibnějších výsledků co do vyváženosti mezi výpočetními náklady na klasifikaci a její úspěšností. Zatřetí zavádíme nový pracovní rámec nazvaný detekce odlehlých hodnot a rozpoznávání schémat útoku proudovým způsobem (SOAAPR), jenž je na rozdíl od současných metod schopen zpracovat výstup z různých online OD metod bez učitele proudovým způsobem, aby získal informace o nových schématech útoku. Ze seshlukované množiny korelovaných poplachů jsou metodou SOAAPR vypočítány tři různé soukromí zachovávající podpisy podobné otiskům prstů, které charakterizují a reprezentují potenciální scénáře útoku s ohledem na jejich komunikační vztahy, projevy ve vlastnostech dat a chování v čase. Evaluace na dvou oblíbených datových sadách odhalila, že SOAAPR může soupeřit s konkurenční offline metodou ve schopnosti korelace poplachů a významně ji překonává z hlediska výpočetního času . Navíc se všechny tři typy podpisů ve většině případů zdají spolehlivě charakterizovat scénáře útoků tím, že podobné seskupují k sobě. Začtvrté představujeme algoritmus nepárového kódu autentizace zpráv (Uncoupled MAC), který propojuje oblasti kryptografického zabezpečení a detekce vniknutí (IDS) pro síťovou bezpečnost. Zabezpečuje síťovou komunikaci (autenticitu a integritu) kryptografickým schématem s podporou druhé vrstvy kódy autentizace zpráv, ale také jako vedlejší efekt poskytuje funkcionalitu IDS tak, že vyvolává poplach na základě porušení hodnot nepárového MACu. Díky novému samoregulačnímu rozšíření algoritmus adaptuje svoje vzorkovací parametry na základě zjištění škodlivých aktivit. Evaluace ve virtuálním prostředí jasně ukazuje, že schopnost detekce se za běhu zvyšuje pro různé scénáře útoku. Ty zahrnují dokonce i situace, kdy se inteligentní útočníci snaží využít slabá místa vzorkování.ObhájenoOver the past couple of years, machine learning methods - especially the Outlier Detection (OD) ones - have become anchored to the cyber security field to detect network-based anomalies rooted in novel attack patterns. Due to the steady increase of high-volume, high-speed and high-dimensional Streaming Data (SD), for which ground truth information is not available, detecting anomalies in real-world computer networks has become a more and more challenging task. Efficient detection schemes applied to networked, embedded devices need to be fast and memory-constrained, and must be capable of dealing with concept drifts when they occur. The aim of this thesis is to enhance computer network security through improved OD for data streams, in particular SD, to achieve cyber resilience, which ranges from the detection, over the analysis of security-relevant incidents, e.g., novel malicious activity, to the reaction to them. Therefore, four major contributions are proposed, which have been published or are submitted journal articles. First, a research gap in unsupervised Feature Selection (FS) for the improvement of off-the-shell OD methods in data streams is filled by proposing Unsupervised Feature Selection for Streaming Outlier Detection, denoted as UFSSOD. A generic concept is retrieved that shows two application scenarios of UFSSOD in conjunction with online OD algorithms. Extensive experiments have shown that UFSSOD, as an online-capable algorithm, achieves comparable results with a competitor trimmed for OD. Second, a novel unsupervised online OD framework called Performance Counter-Based iForest (PCB-iForest) is being introduced, which generalized, is able to incorporate any ensemble-based online OD method to function on SD. Two variants based on classic iForest are integrated. Extensive experiments, performed on 23 different multi-disciplinary and security-related real-world data sets, revealed that PCB-iForest clearly outperformed state-of-the-art competitors in 61 % of cases and even achieved more promising results in terms of the tradeoff between classification and computational costs. Third, a framework called Streaming Outlier Analysis and Attack Pattern Recognition, denoted as SOAAPR is being introduced that, in contrast to the state-of-the-art, is able to process the output of various online unsupervised OD methods in a streaming fashion to extract information about novel attack patterns. Three different privacy-preserving, fingerprint-like signatures are computed from the clustered set of correlated alerts by SOAAPR, which characterize and represent the potential attack scenarios with respect to their communication relations, their manifestation in the data's features and their temporal behavior. The evaluation on two popular data sets shows that SOAAPR can compete with an offline competitor in terms of alert correlation and outperforms it significantly in terms of processing time. Moreover, in most cases all three types of signatures seem to reliably characterize attack scenarios to the effect that similar ones are grouped together. Fourth, an Uncoupled Message Authentication Code algorithm - Uncoupled MAC - is presented which builds a bridge between cryptographic protection and Intrusion Detection Systems (IDSs) for network security. It secures network communication (authenticity and integrity) through a cryptographic scheme with layer-2 support via uncoupled message authentication codes but, as a side effect, also provides IDS-functionality producing alarms based on the violation of Uncoupled MAC values. Through a novel self-regulation extension, the algorithm adapts its sampling parameters based on the detection of malicious actions on SD. The evaluation in a virtualized environment clearly shows that the detection rate increases over runtime for different attack scenarios. Those even cover scenarios in which intelligent attackers try to exploit the downsides of sampling

    Cyber Security and Critical Infrastructures 2nd Volume

    Get PDF
    The second volume of the book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles, including an editorial that explains the current challenges, innovative solutions and real-world experiences that include critical infrastructure and 15 original papers that present state-of-the-art innovative solutions to attacks on critical systems

    Anomaly-based Correlation of IDS Alarms

    Get PDF
    An Intrusion Detection System (IDS) is one of the major techniques for securing information systems and keeping pace with current and potential threats and vulnerabilities in computing systems. It is an indisputable fact that the art of detecting intrusions is still far from perfect, and IDSs tend to generate a large number of false IDS alarms. Hence human has to inevitably validate those alarms before any action can be taken. As IT infrastructure become larger and more complicated, the number of alarms that need to be reviewed can escalate rapidly, making this task very difficult to manage. The need for an automated correlation and reduction system is therefore very much evident. In addition, alarm correlation is valuable in providing the operators with a more condensed view of potential security issues within the network infrastructure. The thesis embraces a comprehensive evaluation of the problem of false alarms and a proposal for an automated alarm correlation system. A critical analysis of existing alarm correlation systems is presented along with a description of the need for an enhanced correlation system. The study concludes that whilst a large number of works had been carried out in improving correlation techniques, none of them were perfect. They either required an extensive level of domain knowledge from the human experts to effectively run the system or were unable to provide high level information of the false alerts for future tuning. The overall objective of the research has therefore been to establish an alarm correlation framework and system which enables the administrator to effectively group alerts from the same attack instance and subsequently reduce the volume of false alarms without the need of domain knowledge. The achievement of this aim has comprised the proposal of an attribute-based approach, which is used as a foundation to systematically develop an unsupervised-based two-stage correlation technique. From this formation, a novel SOM K-Means Alarm Reduction Tool (SMART) architecture has been modelled as the framework from which time and attribute-based aggregation technique is offered. The thesis describes the design and features of the proposed architecture, focusing upon the key components forming the underlying architecture, the alert attributes and the way they are processed and applied to correlate alerts. The architecture is strengthened by the development of a statistical tool, which offers a mean to perform results or alert analysis and comparison. The main concepts of the novel architecture are validated through the implementation of a prototype system. A series of experiments were conducted to assess the effectiveness of SMART in reducing false alarms. This aimed to prove the viability of implementing the system in a practical environment and that the study has provided appropriate contribution to knowledge in this field

    Featured Anomaly Detection Methods and Applications

    Get PDF
    Anomaly detection is a fundamental research topic that has been widely investigated. From critical industrial systems, e.g., network intrusion detection systems, to people’s daily activities, e.g., mobile fraud detection, anomaly detection has become the very first vital resort to protect and secure public and personal properties. Although anomaly detection methods have been under consistent development over the years, the explosive growth of data volume and the continued dramatic variation of data patterns pose great challenges on the anomaly detection systems and are fuelling the great demand of introducing more intelligent anomaly detection methods with distinct characteristics to cope with various needs. To this end, this thesis starts with presenting a thorough review of existing anomaly detection strategies and methods. The advantageous and disadvantageous of the strategies and methods are elaborated. Afterward, four distinctive anomaly detection methods, especially for time series, are proposed in this work aiming at resolving specific needs of anomaly detection under different scenarios, e.g., enhanced accuracy, interpretable results, and self-evolving models. Experiments are presented and analysed to offer a better understanding of the performance of the methods and their distinct features. To be more specific, the abstracts of the key contents in this thesis are listed as follows: 1) Support Vector Data Description (SVDD) is investigated as a primary method to fulfill accurate anomaly detection. The applicability of SVDD over noisy time series datasets is carefully examined and it is demonstrated that relaxing the decision boundary of SVDD always results in better accuracy in network time series anomaly detection. Theoretical analysis of the parameter utilised in the model is also presented to ensure the validity of the relaxation of the decision boundary. 2) To support a clear explanation of the detected time series anomalies, i.e., anomaly interpretation, the periodic pattern of time series data is considered as the contextual information to be integrated into SVDD for anomaly detection. The formulation of SVDD with contextual information maintains multiple discriminants which help in distinguishing the root causes of the anomalies. 3) In an attempt to further analyse a dataset for anomaly detection and interpretation, Convex Hull Data Description (CHDD) is developed for realising one-class classification together with data clustering. CHDD approximates the convex hull of a given dataset with the extreme points which constitute a dictionary of data representatives. According to the dictionary, CHDD is capable of representing and clustering all the normal data instances so that anomaly detection is realised with certain interpretation. 4) Besides better anomaly detection accuracy and interpretability, better solutions for anomaly detection over streaming data with evolving patterns are also researched. Under the framework of Reinforcement Learning (RL), a time series anomaly detector that is consistently trained to cope with the evolving patterns is designed. Due to the fact that the anomaly detector is trained with labeled time series, it avoids the cumbersome work of threshold setting and the uncertain definitions of anomalies in time series anomaly detection tasks

    Development and Validation of a Proof-of-Concept Prototype for Analytics-based Malicious Cybersecurity Insider Threat in a Real-Time Identification System

    Get PDF
    Insider threat has continued to be one of the most difficult cybersecurity threat vectors detectable by contemporary technologies. Most organizations apply standard technology-based practices to detect unusual network activity. While there have been significant advances in intrusion detection systems (IDS) as well as security incident and event management solutions (SIEM), these technologies fail to take into consideration the human aspects of personality and emotion in computer use and network activity, since insider threats are human-initiated. External influencers impact how an end-user interacts with both colleagues and organizational resources. Taking into consideration external influencers, such as personality, changes in organizational polices and structure, along with unusual technical activity analysis, would be an improvement over contemporary detection tools used for identifying at-risk employees. This would allow upper management or other organizational units to intervene before a malicious cybersecurity insider threat event occurs, or mitigate it quickly, once initiated. The main goal of this research study was to design, develop, and validate a proof-of-concept prototype for a malicious cybersecurity insider threat alerting system that will assist in the rapid detection and prediction of human-centric precursors to malicious cybersecurity insider threat activity. Disgruntled employees or end-users wishing to cause harm to the organization may do so by abusing the trust given to them in their access to available network and organizational resources. Reports on malicious insider threat actions indicated that insider threat attacks make up roughly 23% of all cybercrime incidents, resulting in $2.9 trillion in employee fraud losses globally. The damage and negative impact that insider threats cause was reported to be higher than that of outsider or other types of cybercrime incidents. Consequently, this study utilized weighted indicators to measure and correlate simulated user activity to possible precursors to malicious cybersecurity insider threat attacks. This study consisted of a mixed method approach utilizing an expert panel, developmental research, and quantitative data analysis using the developed tool on simulated data set. To assure validity and reliability of the indicators, a panel of subject matter experts (SMEs) reviewed the indicators and indicator categorizations that were collected from prior literature following the Delphi technique. The SMEs’ responses were incorporated into the development of a proof-of-concept prototype. Once the proof-of-concept prototype was completed and fully tested, an empirical simulation research study was conducted utilizing simulated user activity within a 16-month time frame. The results of the empirical simulation study were analyzed and presented. Recommendations resulting from the study also be provided

    Prescription Fraud detection via data mining : a methodology proposal

    Get PDF
    Ankara : The Department of Industrial Engineering and the Institute of Engineering and Science of Bilkent University, 2009.Thesis (Master's) -- -Bilkent University, 2009.Includes bibliographical references leaves 61-69Fraud is the illegitimate act of violating regulations in order to gain personal profit. These kinds of violations are seen in many important areas including, healthcare, computer networks, credit card transactions and communications. Every year health care fraud causes considerable amount of losses to Social Security Agencies and Insurance Companies in many countries including Turkey and USA. This kind of crime is often seem victimless by the committers, nonetheless the fraudulent chain between pharmaceutical companies, health care providers, patients and pharmacies not only damage the health care system with the financial burden but also greatly hinders the health care system to provide legitimate patients with quality health care. One of the biggest issues related with health care fraud is the prescription fraud. This thesis aims to identify a data mining methodology in order to detect fraudulent prescriptions in a large prescription database, which is a task traditionally conducted by human experts. For this purpose, we have developed a customized data-mining model for the prescription fraud detection. We employ data mining methodologies for assigning a risk score to prescriptions regarding Prescribed Medicament- Diagnosis consistency, Prescribed Medicaments’ consistency within a prescription, Prescribed Medicament- Age and Sex consistency and Diagnosis- Cost consistency. Our proposed model has been tested on real world data. The results we obtained from our experimentations reveal that the proposed model works considerably well for the prescription fraud detection problem with a 77.4% true positive rate. We conclude that incorporating such a system in Social Security Agencies would radically decrease human-expert auditing costs and efficiency.Aral, Karca DuruM.S
    corecore