604 research outputs found
Recommended from our members
Masquerade Detection Using a Taxonomy-Based Multinomial Modeling Approach in UNIX Systems
This paper presents one-class Hellinger distance-based and one-class SVM modeling techniques that use a set of features to reveal user intent. The specific objective is to model user command profiles and detect deviations indicating a masquerade attack. The approach aims to model user intent, rather than only modeling sequences of user-issued commands. We hypothesize that each individual user will search in a targeted and limited fashion in order to find information germane to their current task. Masqueraders, on the other hand, will likely not know the file system and layout of another user's desktop, and would likely search more extensively and broadly. Hence, modeling a user's search behavior to detect deviations may more accurately detect masqueraders. To that end, we extend prior research that uses UNIX command sequences issued by users as the audit source by relying upon an abstraction of commands. We devised a taxonomy of UNIX commands that is used to abstract command sequences. The experimental results show that the approach does not lose information and performs comparably to or slightly better than the modeling approach based on simple UNIX command frequencies.
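As a rough illustration of the taxonomy-based idea in this abstract, the sketch below abstracts commands into categories and compares two category distributions with the Hellinger distance. The `TAXONOMY` mapping, the add-one smoothing, and the sample command lists are assumptions made for illustration; they are not the taxonomy or parameters used in the paper.

```python
import math
from collections import Counter

# Hypothetical taxonomy: maps raw UNIX commands to abstract categories.
TAXONOMY = {
    "ls": "browse", "cd": "browse", "find": "search", "grep": "search",
    "vi": "edit", "emacs": "edit", "gcc": "develop", "make": "develop",
}
CATEGORIES = sorted(set(TAXONOMY.values()) | {"other"})


def category_distribution(commands):
    """Relative frequency of each category, with add-one smoothing (an assumption)."""
    counts = Counter(TAXONOMY.get(c, "other") for c in commands)
    total = len(commands) + len(CATEGORIES)
    return [(counts[cat] + 1) / total for cat in CATEGORIES]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions; lies in [0, 1]."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


# One-class use: the profile is built from the user's own data only.
profile = category_distribution(["ls", "cd", "vi", "gcc", "make", "vi", "ls"])
same_user = category_distribution(["cd", "vi", "make", "ls", "gcc"])
searcher = category_distribution(["find", "grep", "find", "grep", "find", "ls"])

d_self = hellinger(profile, same_user)   # small: similar category mix
d_other = hellinger(profile, searcher)   # larger: search-heavy session
```

A detector built this way would flag a session whose distance from the profile exceeds a per-user threshold; how that threshold is chosen is outside what the abstract states.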
Masquerader Detection Using OCLEP: One-Class Classification Using Length Statistics of Emerging Patterns
We introduce a new method for masquerader detection that only uses a user’s own data for training, called One-class Classification using Length statistics of Emerging Patterns (OCLEP). Emerging patterns (EPs) are patterns whose support increases from one dataset/class to another by a large ratio, and they have been very useful in earlier studies. OCLEP classifies a case T as self or masquerader by using the average length of EPs obtained by contrasting T against sets of samples of a user’s normal data. It is based on the observation that one needs long EPs to differentiate instances from a common class, but needs short EPs to differentiate instances from different classes.
OCLEP has two novel features: for training it uses EPs mined from just the self class; for classification it uses the length statistics instead of the EPs themselves. Experiments show that OCLEP can achieve very good accuracy while keeping the false positive rate low, that it achieves a slightly better area under the ROC curve than SVM, and that it can achieve good results when other approaches cannot. OCLEP requires little effort in choosing parameters, whereas the SVM requires significant tuning and it is hard to reach the theoretically optimal result. These features imply that OCLEP is a good complementary component for a robust masquerader detection system, even though its average false positive rate is not as good as the SVM’s.
Combining a Baiting and a User Search Profiling Techniques for Masquerade Detection
Masquerade attacks are characterized by an adversary stealing a legitimate user's credentials and using them to impersonate the victim and perform malicious activities, such as stealing information. Prior work on masquerade attack detection has focused on profiling legitimate user behavior and detecting abnormal behavior indicative of a masquerade attack. Like any anomaly-detection-based technique, detecting masquerade attacks by profiling user behavior suffers from a significant number of false positives. We extend prior work and provide a novel integrated detection approach in this paper. We combine a user behavior profiling technique with a baiting technique in order to more accurately detect masquerade activity. We show that using this integrated approach reduces the false positives by 36% when compared to user behavior profiling alone, while achieving almost perfect detection results. We also show how this combined detection approach serves as a mechanism for hardening the masquerade attack detector against mimicry attacks.
Probabilistic Vs Clustering Analysis of Modified Unix Command Lines for Masquerade Detection
A computer system masquerader is an intruder who takes over a genuine user session and misuses it. Masqueraders, also called insiders, work within the organization and either try to gain more system privileges through impersonation or misuse the privileges they already have, which then becomes abuse. Detecting such intrusions and raising an alarm is the primary goal of masquerade detection techniques. A survey of previous research shows that behavior analysis can be used to detect masqueraders. Masqueraders can be discovered automatically by finding significant departures of test command sessions from normal user profiles based on command histories. In this line of experiments, Schonlau et al. tested on a data set of truncated command lines. These experiments proved less efficient, with the best detection result reported for a Bayes one-step Markov model, which achieved a hit rate of 69.3% with a corresponding false-alarm rate of 6.7%. Roy A. Maxion and Tahlia N. Townsend reported a 61.5% hit rate and a false-alarm rate of 1.3% using a naïve Bayes classification technique. This thesis outlines some of these techniques and their difficulties. As an extension to these techniques, we propose a Bayesian network technique that uses a hybrid classifier combining Naïve Bayes and Deferred Naïve Bayes classifiers. This approach combines the advantages of both online (Naïve Bayes) and offline (Deferred Naïve Bayes) classifiers. Alongside the Bayesian network classifier, we also present a clustering approach adopted from the data mining literature for masquerade detection. Finally, a comparative study of the two proposed classifiers and the Naïve Bayes classifier was carried out using ROC curves showing the respective hit rates against false-alarm rates. Computer Science Department
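The hit rates and false-alarm rates quoted in this abstract are the two coordinates of a point on a ROC curve. The sketch below shows how such a pair can be computed from per-session anomaly scores at a given threshold; the scores, labels, and thresholds are made-up illustrative values, not data from Schonlau et al. or from the thesis.

```python
def hit_and_false_alarm_rates(scores, is_masquerade, threshold):
    """Return (hit rate, false-alarm rate) when sessions scoring >= threshold are flagged.

    Assumes at least one masquerade session and one legitimate session are present.
    """
    flagged = [s >= threshold for s in scores]
    hits = sum(f and m for f, m in zip(flagged, is_masquerade))
    false_alarms = sum(f and not m for f, m in zip(flagged, is_masquerade))
    n_masquerade = sum(is_masquerade)
    n_legitimate = len(is_masquerade) - n_masquerade
    return hits / n_masquerade, false_alarms / n_legitimate


# Hypothetical anomaly scores for six test sessions and their true labels.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [False, False, True, True, True, False]

# Sweeping the threshold traces out points of the ROC curve.
roc_points = [hit_and_false_alarm_rates(scores, labels, t) for t in (0.3, 0.5, 0.9)]
```

Lowering the threshold raises the hit rate at the cost of more false alarms, which is why results such as 69.3% / 6.7% and 61.5% / 1.3% are not directly comparable without the full curve.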
Computer Intrusion Detection Through Statistical Analysis and Prediction Modeling
Information security is very important in today’s society. Computer intrusion is one type of security infraction that poses a threat to all of us. Almost every person in modern parts of the world depends upon automated information. Information systems deliver paychecks on time, manage taxes, transfer funds, deliver important information that enables decisions, and maintain situational awareness in many different ways. Interrupting, corrupting, or destroying this information is a real threat. Computer attackers, often intruders masquerading as authentic users, are the nucleus of this threat. Preventive computer security measures are often not enough; digital firms need methods to detect attackers who have breached firewalls or other barriers. This thesis explores techniques to detect computer intruders based upon the UNIX command usage of authentic users compared against the command usage of attackers. The hypothesis is that the computing behavior of authentic users differs from the computing behavior of attackers. In order to explore this hypothesis, seven different variables that measure computing commands are created and used in predictive modeling to determine the presence or absence of an attacker. This is a classification problem that involves two known groups: intruders and non-intruders. Techniques explored include a proven algorithm published by Matthias Schonlau in [17] and several predictive model variations utilizing the aforementioned seven variables; predictive models include linear discriminant analysis, clustering, kernel partial least squares learning machines
Generating Threat Intelligence based on OSINT and a Cyber Threat Unified Taxonomy
Master's thesis in Information Security (Segurança Informática), Universidade de Lisboa, Faculdade de Ciências, 2020. Current cyber threats use multiple means of propagation, such as social engineering and email and application vulnerabilities, and often operate in different phases, such as the compromise of a single device, lateral movement in the network, and data exfiltration. These threats are complex and rely on highly advanced tactics in order to go unnoticed by traditional security defences such as firewalls. One type of threat that has had a significant impact on the rise of cybercrime is the advanced persistent threat (APT): APTs have clear objectives, are highly organized, have access to practically unlimited resources, and tend to carry out stealthy attacks over long periods and with multiple attempts. As organizations have become aware that cyberattacks are increasing in number and complexity, the use of cyber threat information is gaining popularity as a way to counter such attacks. This trend has followed the evolution of APTs, since they demand a different level of response that is more specific to each organization. Cyber threat information can be obtained from various sources and in different formats, open source intelligence (OSINT) being one of the most common. It can also be obtained through threat intelligence platforms (TIPs), which help to consume, produce, and share cyber threat information. TIPs have multiple advantages that allow organizations to easily explore the main processes of collecting, enriching, and sharing threat-related information.
However, because of the high volume of OSINT information received each day and the many existing taxonomies for classifying cyber threats coming from OSINT, current TIPs are limited in their ability to process this information into quality threat intelligence (TI) that is useful in combating cyberattacks, which prevents their mass adoption. In turn, security analysts waste considerable time analysing OSINT and classifying it with different taxonomies that sometimes correspond to threats of the same category. This dissertation proposes a solution, called the Automated Event Classification and Correlation Platform (AECCP), to some of the TIP limitations mentioned above, relating to threat knowledge management, threat triage, the high volume of shared information, data quality, advanced analytics capabilities, and task automation. The solution seeks to increase the quality of the TI produced by TIPs by classifying it according to a common classification system, removing irrelevant (that is, low-value) information, enriching it with important and relevant data from OSINT sources, and aggregating it into events with similar information. The common classification system, called the Unified Taxonomy, was defined within this dissertation and was based on an analysis of other well-known public taxonomies used in TI sharing. The AECCP is a platform made up of components that can work together or individually. It comprises a classifier (Classifier), a reducer of irrelevant information (Trimmer), an OSINT-based information enricher (Enricher), and an aggregator of events about the same threat, that is, events containing similar information (Clusterer).
The Classifier analyses events and, based on their information, classifies them in the Unified Taxonomy, in order to catalogue events not yet classified and to eliminate the duplication of taxonomies with the same meaning in previously classified events. The Trimmer removes the least pertinent information from an event based on the event's classification. The Enricher enriches events with external data from OSINT, which may contain important information related to, but not contained in, the information already present in the event. Finally, the Clusterer aggregates events that share the same context, as determined by each event's classification and the information it contains, producing clusters of events that are combined into a single event. This new information gives security analysts access to, and easy visibility of, information about events similar to those they are analysing. The design of the AECCP architecture was grounded in an analysis of three public information sources containing more than 1100 cybersecurity threat events shared by 24 external entities and collected between 2016 and 2019. The Unified Taxonomy used by the Classifier was produced from a detailed analysis of the taxonomies used by these events and of the taxonomies most used in the cyber threat TI sharing community. During this analysis, the most pertinent and relevant attributes for each category of the Unified Taxonomy were also identified, by aggregating the information into groups with similar context and by a thorough analysis of the information contained in each of the more than 1100 events. The dissertation also presents the algorithms used to implement each of the AECCP components, as well as the evaluation of the components and of the platform.
The evaluation used the same three OSINT sources as the initial analysis, but with 64 events created and shared more recently than those used in that analysis. The results showed a 72% increase in event classification and an average increase of 54 attributes per event, with a reduction in low-value attributes and a larger increase in higher-value attributes, after the events were processed by the AECCP. It was also possible to produce 24 aggregated events, enriched and classified by the other AECCP components. Finally, the AECCP processed 6 events with a large volume of information produced by an external platform called PURE, which showed that the AECCP can process large events originating from other platforms. In summary, the dissertation presents four contributions: a common classification system, the Unified Taxonomy; the most pertinent attributes for each category of the Unified Taxonomy; the design of the AECCP architecture, composed of 4 modules (Classifier, Trimmer, Enricher, and Clusterer), which seeks to address 5 of the limitations of current TIPs (threat knowledge management, threat triage, the high volume of shared information, data quality, and advanced analytics capabilities and task automation); and its implementation and evaluation.
Today’s threats use multiple means of propagation, such as social engineering, email, and application vulnerabilities, and often operate in different phases, such as single device compromise, network lateral movement and data exfiltration. These complex threats rely on advanced tactics to go undetected by traditional security defences.
One type that has had a major impact on the rise of cybercrime is the advanced persistent threat (APT): APTs have clear objectives, are highly organized and well-resourced, and tend to perform long-term stealthy campaigns with repeated attempts. As organizations realize that attacks are increasing in size and complexity, threat intelligence (TI) is growing in popularity and use amongst them. This trend followed the evolution of APTs, as they require a different level of response that is more specific to the organization. TI can be obtained in many formats, open source intelligence (OSINT) being one of the most common, and through threat intelligence platforms (TIPs) that help organizations consume, produce and share TI. TIPs have multiple advantages that enable organisations to easily bootstrap the core processes of collecting, normalising, enriching, correlating, analysing, disseminating and sharing threat-related information. However, current TIPs have some limitations that prevent their mass adoption. This dissertation proposes a solution to some of these limitations, related to threat knowledge management, limited technology enablement in threat triage, the high volume of shared threat information, data quality, and limited advanced analytics capabilities and task automation. Overall, our solution improves the quality of TI by classifying it according to a common taxonomy, removing information with low value, enriching it with valuable information from OSINT sources, and aggregating it into clusters of events with similar information. This dissertation offers a complete data analysis of three OSINT feeds and the results that led us to design our solution, a detailed description of the architecture of our solution, its implementation and its validation, including the processing of events from other academic solutions.
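To make the classify, trim, and cluster steps described in this abstract more concrete, here is a deliberately small sketch. Everything in it is hypothetical: the `UNIFIED` tag mapping, the `KEEP` attribute lists, the event layout, and the exact-match clustering rule are illustrative assumptions, not the Unified Taxonomy or the algorithms of the dissertation. The enrichment step is omitted because it depends on external OSINT lookups.

```python
from collections import defaultdict

# Hypothetical mapping from source-taxonomy tags to unified categories.
UNIFIED = {
    "phish": "phishing", "credential-phishing": "phishing",
    "c2": "command-and-control", "botnet-c&c": "command-and-control",
}
# Hypothetical per-category attributes considered worth keeping.
KEEP = {"phishing": {"url", "email-src"}, "command-and-control": {"ip-dst", "domain"}}


def classify(event):
    """Assign one unified category based on the event's source tags."""
    cats = {UNIFIED[t] for t in event["tags"] if t in UNIFIED}
    return {**event, "category": min(cats) if cats else "unclassified"}


def trim(event):
    """Drop attributes that are not listed as relevant for the event's category."""
    keep = KEEP.get(event["category"])
    if keep is None:
        return event
    return {**event, "attrs": {k: v for k, v in event["attrs"].items() if k in keep}}


def cluster(events):
    """Group event ids that share a category and identical trimmed attributes."""
    groups = defaultdict(list)
    for e in events:
        groups[(e["category"], frozenset(e["attrs"].items()))].append(e["id"])
    return sorted(groups.values())


events = [
    {"id": 1, "tags": ["phish"], "attrs": {"url": "http://example.test/a", "comment": "seen monday"}},
    {"id": 2, "tags": ["credential-phishing"], "attrs": {"url": "http://example.test/a"}},
    {"id": 3, "tags": ["c2"], "attrs": {"ip-dst": "192.0.2.1"}},
]
processed = [trim(classify(e)) for e in events]
clusters = cluster(processed)
```

In this toy run, events 1 and 2 carry different source tags but land in the same unified category and, once the low-value comment is trimmed, in the same cluster.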
Unknown Threat Detection With Honeypot Ensemble Analysis Using Big Data Security Architecture
The amount of data that is being generated continues to grow rapidly in size and complexity. Frameworks such as Apache Hadoop and Apache Spark are evolving at a rapid rate as organizations build data-driven applications to gain competitive advantages. Data analytics frameworks decompose our problems to build applications that go beyond inference and can help make predictions as well as prescriptions in real time instead of in batch processes.
Information Security is becoming more important to organizations as the Internet and cloud technologies become more integrated with their internal processes. The number of attacks and attack vectors has been increasing steadily over the years. Border defense measures (e.g. Intrusion Detection Systems) are no longer enough to identify and stop attackers.
Data driven information security is not a new approach to solving information security; however there is an increased emphasis on combining heterogeneous sources to gain a broader view of the problem instead of isolated systems. Stitching together multiple alerts into a cohesive system can increase the number of True Positives.
With the increased concern of unknown insider threats and zero-day attacks, identifying unknown attack vectors becomes more difficult. Previous research has shown that with as little as 10 commands it is possible to identify a masquerade attack against a user's profile.
This thesis looks at a data driven information security architecture that relies on both behavioral analysis of SSH profiles and bad actor data collected from an SSH honeypot to identify bad actor attack vectors. Honeypots should collect data only from bad actors and should therefore have a high True Positive rate. Using Apache Spark and Apache Hadoop we can create a real-time data driven architecture that can collect and analyze new bad actor behaviors from honeypot data and monitor legitimate user accounts to create predictive and prescriptive models. Previously unidentified attack vectors can be cataloged for review.
Towards Effective Masquerade Attack Detection
Data theft has been the main goal of the cybercrime community for many years, and more and more so as the cybercrime community gets more motivated by financial gain establishing a thriving underground economy. Masquerade attacks are a common security problem that is a consequence of identity theft and that is generally motivated by data theft. Such attacks are characterized by a system user illegitimately posing as another legitimate user. Prevention-focused solutions such as access control solutions and Data Loss Prevention tools have failed in preventing these attacks, making detection not a mere desideratum, but rather a necessity. Detecting masqueraders, however, is very hard. Prior work has focused on user command modeling to identify abnormal behavior indicative of impersonation. These approaches suffered from high miss and false positive rates. None of these approaches could be packaged into an easily-deployable, privacy-preserving, and effective masquerade attack detector. In this thesis, I present a machine learning-based technique using a set of novel features that aim to reveal user intent. I hypothesize that each individual user knows his or her own file system well enough to search in a limited, targeted, and unique fashion in order to find information germane to their current task. Masqueraders, on the other hand, are not likely to know the file system and layout of another user's desktop, and would likely search more extensively and broadly in a manner that is different from that of the victim user being impersonated. Based on this assumption, I model a user's search behavior and monitor deviations from it that could indicate fraudulent behavior. I identify user search events using a taxonomy of Windows applications, DLLs, and user commands. The taxonomy abstracts the user commands and actions and enriches them with contextual information. 
Experimental results show that modeling search behavior reliably detects all simulated masquerade activity with a very low false positive rate of 1.12%, far better than any previously published results. The limited set of features used for search behavior modeling also results in considerable performance gains over the same modeling techniques that use larger sets of features, both during sensor training and deployment. While an anomaly- or profiling-based detection approach, such as the one used in the user search profiling sensor, has the advantage of detecting unknown attacks and fraudulent masquerade behaviors, it suffers from a relatively high number of false positives and remains potentially vulnerable to mimicry attacks. To further improve the accuracy of the user search profiling approach, I supplement it with a trap-based detection approach. I monitor user actions directed at decoy documents embedded in the user's local file system. The decoy documents, which contain enticing information to the attacker, are known to the legitimate user of the system, and therefore should not be touched by him or her. Access to these decoy files, therefore, should highly suggest the presence of a masquerader. A decoy document access sensor detects any action that requires loading the decoy document into memory such as reading the document, copying it, or zipping it. I conducted human subject studies to investigate the deployment-related properties of decoy documents and to determine how decoys should be strategically deployed in a file system in order to maximize their masquerade detection ability. Our user study results show that effective deployment of decoys allows for the detection of all masquerade activity within ten minutes of its onset at most. I use the decoy access sensor as an oracle for the user search profiling sensor. 
If abnormal search behavior is detected, I hypothesize that suspicious activity is taking place and validate the hypothesis by checking for accesses to decoy documents. Combining the two sensors and detection techniques reduces the false positive rate to 0.77%, and hardens the sensor against mimicry attacks. The overall sensor has very limited resource requirements (40 KB) and does not introduce any noticeable delay to the user when performing its monitoring actions. Finally, I seek to expand the search behavior profiling technique to detect, not only malicious masqueraders, but any other system users. I propose a diversified and personalized user behavior profiling approach to improve the accuracy of user behavior models. The ultimate goal is to augment existing computer security features such as passwords with user behavior models, as behavior information is not readily available to be stolen and its use could substantially raise the bar for malefactors seeking to perpetrate masquerade attacks
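The two-sensor decision described in this abstract, where abnormal search behavior raises a hypothesis that is then validated against decoy accesses, can be sketched as follows. The `Window` record, its field names, and the `score_threshold` default are hypothetical; the thesis does not specify them here.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One monitoring window for a user session (hypothetical structure)."""
    search_anomaly_score: float  # output of the search-profiling sensor
    decoy_touched: bool          # whether the decoy access sensor fired


def is_masquerade(window, score_threshold=0.5):
    """Flag a window only if search behavior is abnormal AND a decoy was accessed."""
    if window.search_anomaly_score < score_threshold:
        return False             # search behavior looks normal: no hypothesis raised
    return window.decoy_touched  # decoy sensor acts as the oracle that validates it


alerts = [
    is_masquerade(Window(0.9, True)),   # abnormal search, decoy touched
    is_masquerade(Window(0.9, False)),  # abnormal search only: suppressed
    is_masquerade(Window(0.1, True)),   # normal search: not flagged by this rule
]
```

Requiring both signals is what lowers the false positive rate relative to the profiling sensor alone; the third case shows the trade-off of this particular rule, which ignores a decoy touch when search behavior looks normal.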