98 research outputs found

    NLP Methods in Host-based Intrusion Detection Systems: A Systematic Review and Future Directions

    Full text link
    Host based Intrusion Detection System (HIDS) is an effective last line of defense for defending against cyber security attacks after perimeter defenses (e.g., Network based Intrusion Detection System and Firewall) have failed or been bypassed. HIDS is widely adopted in the industry as HIDS is ranked among the top two most used security tools by Security Operation Centers (SOC) of organizations. Although effective and efficient HIDS is highly desirable for industrial organizations, the evolution of increasingly complex attack patterns causes several challenges resulting in performance degradation of HIDS (e.g., high false alert rate creating alert fatigue for SOC staff). Since Natural Language Processing (NLP) methods are better suited for identifying complex attack patterns, an increasing number of HIDS are leveraging the advances in NLP that have shown effective and efficient performance in precisely detecting low footprint, zero day attacks and predicting the next steps of attackers. This active research trend of using NLP in HIDS demands a synthesized and comprehensive body of knowledge of NLP based HIDS. Thus, we conducted a systematic review of the literature on the end to end pipeline of the use of NLP in HIDS development. For the end to end NLP based HIDS development pipeline, we identify, taxonomically categorize and systematically compare the state of the art of NLP methods usage in HIDS, attacks detected by these NLP methods, datasets and evaluation metrics which are used to evaluate the NLP based HIDS. We highlight the relevant prevalent practices, considerations, advantages and limitations to support the HIDS developers. We also outline the future research directions for the NLP based HIDS development

    SGNET: A Worldwide Deployable Framework to Support the Analysis of Malware Threat Models

    Full text link
    The dependability community has expressed a growing interest in the recent years for the effects of malicious, ex-ternal, operational faults in computing systems, ie. intru-sions. The term intrusion tolerance has been introduced to emphasize the need to go beyond what classical fault toler-ant systems were able to offer. Unfortunately, as opposed to well understood accidental faults, the domain is still lack-ing sound data sets and models to offer rationales in the design of intrusion tolerant solutions. In this paper, we de-scribe a framework similar in its spirit to so called honey-farms but built in a way that makes its large-scale deploy-ment easily feasible. Furthermore, it offers a very rich level of interaction with the attackers without suffering from the drawbacks of expensive high interaction systems. The sys-tem is described, a prototype is presented as well as some preliminary results that highlight the feasibility as well as the usefulness of the approach.

    Algorizmi: A Configurable Virtual Testbed to Generate Datasets for Offline Evaluation of Intrusion Detection Systems

    Get PDF
    Intrusion detection systems (IDSes) are an important security measure that network administrators adopt to defend computer networks against malicious attacks and intrusions. The field of IDS research includes many challenges. However, one open problem remains orthogonal to the others: IDS evaluation. In other words, researchers have not yet succeeded to agree on a general systematic methodology and/or a set of metrics to fairly evaluate different IDS algorithms. This leads to another problem: the lack of an appropriate IDS evaluation dataset that satisfies the common research needs. One major contribution in this area is the DARPA dataset offered by the Massachusetts Institute of Technology Lincoln Lab (MIT/LL), which has been extensively used to evaluate a number of IDS algorithms proposed in the literature. Despite this, the DARPA dataset received a lot of criticism concerning the way it was designed, especially concerning its obsoleteness and inability to incorporate new sorts of network attacks. In this thesis, we survey previous research projects that attempted to provide a system for IDS offline evaluation. From the survey, we identify a set of design requirements for such a system based on the research community needs. We, then, propose Algorizmi as an open-source configurable virtual testbed for generating datasets for offline IDS evaluation. We provide an architectural overview of Algorizmi and its software and hardware components. Algorizmi provides its users with tools that allow them to create their own experimental testbed using the concepts of virtualization and cloud computing. Algorizmi users can configure the virtual machine instances running in their experiments, select what background traffic those instances will generate and what attacks will be launched against them. At any point in time, an Algorizmi user can generate a dataset (network traffic trace) for any of her experiments so that she can use this dataset afterwards to evaluate an IDS the same way the DARPA dataset is used. Our analysis shows that Algorizmi satisfies more requirements than previous research projects that target the same research problem of generating datasets for IDS offline evaluation. Finally, we prove the utility of Algorizmi by building a sample network of machines, generate both background and attack traffic within that network. We then download a snapshot of the dataset for that experiment and run it against Snort IDS. Snort successfully detected the attacks we launched against the sample network. Additionally, we evaluate the performance of Algorizmi while processing some of the common usages of a typical user based on 5 metrics: CPU time, CPU usage, memory usage, network traffic sent/received and the execution time

    Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks

    Full text link
    Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code to the victim users, facilitating the use of social engineering techniques to infect their machines. Research showed that machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive taxonomy of the different approaches used to generate PDF malware, and of the corresponding learning-based detection systems. We then categorize threats specifically targeted against learning-based PDF malware detectors, using a well-established framework in the field of adversarial machine learning. This framework allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the impact of such threats. We conclude the paper by discussing how such findings highlight promising research directions towards tackling the more general challenge of designing robust malware detectors in adversarial settings

    Forensic Evidence Identification and Modeling for Attacks against a Simulated Online Business Information System

    Get PDF
    Forensic readiness of business information systems can support future forensics investigation or auditing on external/internal attacks, internal sabotage and espionage, and business fraud. To establish forensics readiness, it is essential for an organization to identify which fingerprints are relevant and where they can be located, to determine whether they are logged in a forensically sound way and whether all the needed fingerprints are available to reconstruct the events successfully. Also, a fingerprint identification and locating mechanism should be provided to guide potential forensics investigation in the future. Furthermore, mechanisms should be established to automate the security incident tracking and reconstruction processes. In this research, external and internal attacks are first modeled as augmented attack trees based on the vulnerabilities of business information systems. Then, modeled attacks are conducted against a honeynet that simulates an online business information system, and a forensic investigation follows each attack. Finally, an evidence tree, which is expected to provide the necessary contextual information to automate the attack tracking and reconstruction process in the future, is built for each attack based on fingerprints identified and located within the system

    Experimental Penetration Testing Teaching and Learning for High School Students Using Cloud Computing

    Get PDF
    The need for high school students trained in ICT to developing cybersecurity skills implies the understanding of threats on security. Considering that the aim of hacking is to circumvent restrictions, the goal of this experimental course is to train students in understanding hacking to improve security. Currently, the reality of hacking has alarmingly evolved and shaped an undeniable black market of information where talented teenagers are not exempt to partake. Despite the fact that the formal teaching and learning of hacking inside high schools can be seen as miseducation, that misunderstanding is faced in this work by addressing both the defensive and offensive security from the perspective of penetration testing. By developing progressive challenges over an adaptive cloud environment, the students can be taught hacking from a constructive perspective. A cloud-based attack surface is implemented which consists of a set of systems gradually prepared by means of scripts. The theoretical and practical lessons are directed by a set of scaffolded and constructivist challenges. The discussion about ethics is confronted and remains present throughout the teaching and learning process. Finally, the results and empirical findings of the students are analyzed and measured demonstrating that high school students can acquire skills to protect information for the community, and for themselves

    From Map to Dist: the Evolution of a Large-Scale Wlan Monitoring System

    Get PDF
    The edge of the Internet is increasingly becoming wireless. Therefore, monitoring the wireless edge is important to understanding the security and performance aspects of the Internet experience. We have designed and implemented a large-scale WLAN monitoring system, the Distributed Internet Security Testbed (DIST), at Dartmouth College. It is equipped with distributed arrays of “sniffers” that cover 210 diverse campus locations and more than 5,000 users. In this paper, we describe our approach, designs and solutions for addressing the technical challenges that have resulted from efficiency, scalability, security, and management perspectives. We also present extensive evaluation results on a production network, and summarize the lessons learned
    • …
    corecore