Behavioral Identification of Programs
This paper describes an algorithm for mining variable-length patterns from sequences of system calls. The patterns are used for process identification, that is, for establishing that a given sequence of system calls was produced by the same process from which the patterns were extracted. Existing algorithms are either computationally complex or show a high false-positive rate in experiments when compared with an automaton-building algorithm. Our algorithm is less complex and more precise than the classical N-gram algorithm. Performance tests show that our kernel-level system call monitor does not significantly slow down the operating system. After 20 minutes of learning, the algorithm can identify, within one minute, any thread of any process with 85% precision. Behavior-based program identification is used for anomaly-based detection of malicious activity in a system.
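For orientation, the classical N-gram baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of fixed-length N-gram matching, not the variable-length pattern algorithm proposed in the paper; the function names, the toy traces, and the choice of n = 3 are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(trace: List[str], n: int) -> Iterator[NGram]:
    """Yield every contiguous window of n system calls in the trace."""
    for i in range(len(trace) - n + 1):
        yield tuple(trace[i:i + n])


def learn_profile(traces: Iterable[List[str]], n: int) -> Set[NGram]:
    """Collect the set of n-grams seen in the training traces of one process."""
    profile: Set[NGram] = set()
    for trace in traces:
        profile.update(ngrams(trace, n))
    return profile


def match_ratio(trace: List[str], profile: Set[NGram], n: int) -> float:
    """Fraction of the trace's n-grams that also occur in the learned profile."""
    windows = list(ngrams(trace, n))
    if not windows:
        return 0.0  # trace shorter than n: nothing to compare
    hits = sum(1 for window in windows if window in profile)
    return hits / len(windows)


# Hypothetical toy traces; real traces would come from a system call monitor.
training = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = learn_profile(training, n=3)
same = match_ratio(["open", "read", "read", "close"], profile, n=3)
other = match_ratio(["socket", "connect", "send", "close"], profile, n=3)
```

A trace would be attributed to the profiled process when its match ratio exceeds some threshold; how that threshold is chosen is not stated in the abstract.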
Anagram: A Content Anomaly Detector Resistant to Mimicry Attack
In this paper, we present Anagram, a content anomaly detector that models a mixture of high-order n-grams (n > 1) designed to detect anomalous and suspicious network packet payloads. By using higher-order n-grams, Anagram can detect significant anomalous byte sequences and generate robust signatures of validated malicious packet content. The Anagram content models are implemented using highly efficient Bloom filters, reducing space requirements and enabling privacy-preserving cross-site correlation. The sensor models the distinct content flow of a network or host using a semi-supervised training regimen. Previously known exploits, extracted from the signatures of an IDS, are likewise modeled in a Bloom filter and are used during training as well as at detection time. We demonstrate that Anagram can identify anomalous traffic with high accuracy and low false positive rates. Anagram’s high-order n-gram analysis technique is also resilient against simple mimicry attacks that blend exploits with normal-appearing byte padding, such as the blended polymorphic attack recently demonstrated in. We discuss randomized n-gram models, which further raise the bar and make it more difficult for attackers to build precise packet structures to evade Anagram even if they know the distribution of the local site content flow. Finally, Anagram’s speed and high detection rate make it valuable not only as a standalone sensor, but also as a network anomaly flow classifier in an instrumented fault-tolerant host-based environment; this enables significant cost amortization and the possibility of a symbiotic feedback loop that can improve accuracy and reduce false positive rates over time.
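The idea of storing byte n-grams in a Bloom filter can be illustrated with a short sketch. It is not Anagram's implementation: the filter size, the SHA-256-based hashing, n = 5, and the scoring rule (fraction of a payload's n-grams not seen in training) are assumptions chosen for the example, and the semi-supervised step that uses known-exploit signatures is omitted.

```python
import hashlib
from typing import Iterable, Iterator


class BloomFilter:
    """A small Bloom filter over byte strings (no false negatives)."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def byte_ngrams(payload: bytes, n: int) -> Iterator[bytes]:
    """Yield every contiguous n-byte window of the payload."""
    return (payload[i:i + n] for i in range(len(payload) - n + 1))


def train(model: BloomFilter, payloads: Iterable[bytes], n: int) -> None:
    """Insert all n-grams of the (assumed normal) payloads into the model."""
    for payload in payloads:
        for gram in byte_ngrams(payload, n):
            model.add(gram)


def anomaly_score(model: BloomFilter, payload: bytes, n: int) -> float:
    """Fraction of the payload's n-grams that the model has not seen."""
    grams = list(byte_ngrams(payload, n))
    if not grams:
        return 0.0  # payload shorter than n
    unseen = sum(1 for gram in grams if gram not in model)
    return unseen / len(grams)


# Hypothetical payloads for illustration only.
model = BloomFilter()
normal = b"GET /index.html HTTP/1.1"
train(model, [normal], n=5)
normal_score = anomaly_score(model, normal, n=5)
suspect_score = anomaly_score(model, b"\x90" * 8, n=5)
```

Because a Bloom filter never reports a stored item as absent, a payload seen in training always scores 0.0. False positives on lookup can only lower the score of an unseen payload, which is one reason the filter size matters.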
Behavior-Based Modeling and Its Application to Email Analysis
The Email Mining Toolkit (EMT) is a data mining system that computes behavior profiles or models of user email accounts. These models may be used for a multitude of tasks, including forensic analyses and detection tasks of value to law enforcement and intelligence agencies, as well as for other typical tasks such as virus and spam detection. To demonstrate the power of the methods, we focus on the application of these models to detect the early onset of a viral propagation without the "content-based" (or signature-based) analysis in common use in virus scanners. We present several experiments using real email from 15 users with injected simulated viral emails and describe how the combination of different behavior models improves overall detection rates. The performance results vary depending upon parameter settings, approaching a 99% true positive (TP) rate (percentage of viral emails caught) in general cases, with a 0.38% false positive (FP) rate (percentage of emails with attachments that are mislabeled as viral). The models used for this study are based upon volume and velocity statistics of a user's email rate and an analysis of the user's (social) cliques revealed in the person's email behavior. We show by way of simulation that virus propagations are detectable, since viruses may emit emails at rates different from what human behavior suggests is normal, and email is directed to groups of recipients in ways that violate the users' typical communications with their social groups.
Behavior Detection in the Santa Catarina Telemedicine System
Master's dissertation, Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação. This work surveys methods for behavior detection and describes the development of a framework for detecting user behavior in web systems, including suspicious sequences of actions that may indicate misuse of the system or a possible security flaw. The aim was to improve the security of web systems and to evaluate the probabilistic Mahalanobis technique in this context. The proposed framework uses the Mahalanobis distance as its dissimilarity metric, in order to additionally validate the potential benefits of statistical metrics for comparing behavior signatures.
An extensive validation was performed using a prototype of the system described here, operating in conjunction with the Telemedicine portal of the State of Santa Catarina, which also served as the source of validation data. The result is a modular framework that can be easily ported to other web systems with characteristics similar to the Telemedicine Portal. Used for behavior detection, the proposed technique achieved a low rate of valid behavior marked as invalid (False Negatives) and a low rate of invalid behavior marked as valid (False Positives).
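For reference, the standard definition of the Mahalanobis distance is given below. The interpretation of the symbols in this setting is an assumption, since the abstract does not specify the feature representation: $x$ is taken to be a feature vector describing an observed behavior, and $\mu$ and $\Sigma$ the mean vector and covariance matrix of the stored reference signature.

```latex
D_M(x) = \sqrt{(x - \mu)^{\top} \, \Sigma^{-1} \, (x - \mu)}
```

When $\Sigma$ is the identity matrix, this reduces to the Euclidean distance between $x$ and $\mu$. Unlike the Euclidean distance, it accounts for the scale of each feature and for correlations between features.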
Ex-Ray: Detection of History-Leaking Browser Extensions
Web browsers have become the predominant means for developing and deploying applications, and thus they often handle sensitive data such as social interactions or financial credentials and information. As a consequence, defensive measures such as TLS, the Same-Origin Policy (SOP), and Content Security Policy (CSP) are critical for ensuring that sensitive data remains in trusted hands.
Browser extensions, while a useful mechanism for allowing third-party extensions to core browser functionality, pose a security risk in this regard since they have access to privileged browser APIs that are not necessarily restricted by the SOP or CSP. Because of this, they have become a major vector for introducing malicious code into the browser. Prior work has led to improved security models for isolating and sandboxing extensions, as well as techniques for identifying potentially malicious extensions. The area of privacy-violating browser extensions has so far been covered by manual analysis and by systems that search network traffic for specific text. However, comprehensive content-agnostic systems for identifying tracking behavior at the network level have not yet received significant attention.
In this paper, we present a dynamic technique for identifying privacy-violating extensions in Web browsers that relies solely on observations of the network traffic patterns generated by browser extensions. We then present Ex-Ray, a prototype implementation of this technique for the Chrome Web browser, and use it to evaluate all extensions from the Chrome store with more than 1,000 installations (10,691 in total). Our evaluation finds new types of tracking behavior not covered by state-of-the-art systems. Finally, we discuss potential browser improvements to prevent abuse by future user-tracking extensions.
Dos and Don'ts of Machine Learning in Computer Security
With the growing processing power of computing systems and the increasing
availability of massive datasets, machine learning algorithms have led to major
breakthroughs in many different areas. This development has influenced computer
security, spawning a series of work on learning-based security systems, such as
for malware detection, vulnerability discovery, and binary code analysis.
Despite great potential, machine learning in security is prone to subtle
pitfalls that undermine its performance and render learning-based systems
potentially unsuitable for security tasks and practical deployment. In this
paper, we look at this problem with critical eyes. First, we identify common
pitfalls in the design, implementation, and evaluation of learning-based
security systems. We conduct a study of 30 papers from top-tier security
conferences within the past 10 years, confirming that these pitfalls are
widespread in the current security literature. In an empirical analysis, we
further demonstrate how individual pitfalls can lead to unrealistic performance
and interpretations, obstructing the understanding of the security problem at
hand. As a remedy, we propose actionable recommendations to support researchers
in avoiding or mitigating the pitfalls where possible. Furthermore, we identify
open problems when applying machine learning in security and provide directions
for further research.
Comment: to appear at USENIX Security Symposium 202