11 research outputs found

    Clustering Arabic Tweets for Sentiment Analysis

    Get PDF
    The focus of this study is to evaluate the impact of linguistic preprocessing and similarity functions for clustering Arabic Twitter tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets into positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764 while the second-best purity was 0.719. These results are of importance as it is contrary to normal-sized documents where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used

    Clustering Arabic Tweets for Sentiment Analysis

    Get PDF
    The focus of this study is to evaluate the impact of linguistic preprocessing and similarity functions for clustering Arabic Twitter tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets into positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764 while the second-best purity was 0.719. These results are of importance as it is contrary to normal-sized documents where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used

    A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities

    Get PDF
    The relentless and often haphazard process of tracking and remediating vulnerabilities is a top concern for cybersecurity professionals. The key challenge they face is trying to identify a remediation scheme specific to in-house, organizational objectives. Without a strategy, the result is a patchwork of fixes applied to a tide of vulnerabilities, any one of which could be the single point of failure in an otherwise formidable defense. This means one of the biggest challenges in vulnerability management relates to prioritization. Given that so few vulnerabilities are a focus of real-world attacks, a practical remediation strategy is to identify vulnerabilities likely to be exploited and focus efforts towards remediating those vulnerabilities first. The goal of this research is to demonstrate that aggregating and synthesizing readily accessible, public data sources to provide personalized, automated recommendations that an organization can use to prioritize its vulnerability management strategy will offer significant improvements over what is currently realized using the Common Vulnerability Scoring System (CVSS). We provide a framework for vulnerability management specifically focused on mitigating threats using adversary criteria derived from MITRE ATT&CK. We identify the data mining steps needed to acquire, standardize, and integrate publicly available cyber intelligence data sets into a robust knowledge graph from which stakeholders can infer business logic related to known threats. We tested our approach by identifying vulnerabilities in academic and common software associated with six universities and four government facilities. Ranking policy performance was measured using the Normalized Discounted Cumulative Gain (nDCG). Our results show an average 71.5% to 91.3% improvement towards the identification of vulnerabilities likely to be targeted and exploited by cyber threat actors. The ROI of patching using our policies resulted in a savings in the range of 23.3% to 25.5% in annualized unit costs. Our results demonstrate the efficiency of creating knowledge graphs to link large data sets to facilitate semantic queries and create data-driven, flexible ranking policies. Additionally, our framework uses only open standards, making implementation and improvement feasible for cyber practitioners and academia

    NLP-Based Techniques for Cyber Threat Intelligence

    Full text link
    In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence~(CTI) is related to all the solutions inherent to data collection, processing, and analysis useful to understand a threat actor's targets and attack behavior. Currently, CTI is assuming an always more crucial role in identifying and mitigating threats and enabling proactive defense strategies. In this context, NLP, an artificial intelligence branch, has emerged as a powerful tool for enhancing threat intelligence capabilities. This survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI. Finally, the challenges and limitations of NLP in threat intelligence are exhaustively examined, including data quality issues and ethical considerations. This survey draws a complete framework and serves as a valuable resource for security professionals and researchers seeking to understand the state-of-the-art NLP-based threat intelligence techniques and their potential impact on cybersecurity

    Automating Cyber Analytics

    Get PDF
    Model based security metrics are a growing area of cyber security research concerned with measuring the risk exposure of an information system. These metrics are typically studied in isolation, with the formulation of the test itself being the primary finding in publications. As a result, there is a flood of metric specifications available in the literature but a corresponding dearth of analyses verifying results for a given metric calculation under different conditions or comparing the efficacy of one measurement technique over another. The motivation of this thesis is to create a systematic methodology for model based security metric development, analysis, integration, and validation. In doing so we hope to fill a critical gap in the way we view and improve a system’s security. In order to understand the security posture of a system before it is rolled out and as it evolves, we present in this dissertation an end to end solution for the automated measurement of security metrics needed to identify risk early and accurately. To our knowledge this is a novel capability in design time security analysis which provides the foundation for ongoing research into predictive cyber security analytics. Modern development environments contain a wealth of information in infrastructure-as-code repositories, continuous build systems, and container descriptions that could inform security models, but risk evaluation based on these sources is ad-hoc at best, and often simply left until deployment. Our goal in this work is to lay the groundwork for security measurement to be a practical part of the system design, development, and integration lifecycle. In this thesis we provide a framework for the systematic validation of the existing security metrics body of knowledge. In doing so we endeavour not only to survey the current state of the art, but to create a common platform for future research in the area to be conducted. We then demonstrate the utility of our framework through the evaluation of leading security metrics against a reference set of system models we have created. We investigate how to calibrate security metrics for different use cases and establish a new methodology for security metric benchmarking. We further explore the research avenues unlocked by automation through our concept of an API driven S-MaaS (Security Metrics-as-a-Service) offering. We review our design considerations in packaging security metrics for programmatic access, and discuss how various client access-patterns are anticipated in our implementation strategy. Using existing metric processing pipelines as reference, we show how the simple, modular interfaces in S-MaaS support dynamic composition and orchestration. Next we review aspects of our framework which can benefit from optimization and further automation through machine learning. First we create a dataset of network models labeled with the corresponding security metrics. By training classifiers to predict security values based only on network inputs, we can avoid the computationally expensive attack graph generation steps. We use our findings from this simple experiment to motivate our current lines of research into supervised and unsupervised techniques such as network embeddings, interaction rule synthesis, and reinforcement learning environments. Finally, we examine the results of our case studies. We summarize our security analysis of a large scale network migration, and list the friction points along the way which are remediated by this work. We relate how our research for a large-scale performance benchmarking project has influenced our vision for the future of security metrics collection and analysis through dev-ops automation. We then describe how we applied our framework to measure the incremental security impact of running a distributed stream processing system inside a hardware trusted execution environment

    Discovery of Web Attacks by Inspecting HTTPS Network Traffic with Machine Learning and Similarity Search

    Get PDF
    Tese de mestrado, Segurança Informática, Universidade de Lisboa, Faculdade de Ciências, 2022Web applications are the building blocks of many services, from social networks to banks. Network security threats have remained a permanent concern since the advent of data communication. Not withstanding, security breaches are still a serious problem since web applications incorporate both company information and private client data. Traditional Intrusion Detection Systems (IDS) inspect the payload of the packets looking for known intrusion signatures or deviations from nor mal behavior. However, this Deep Packet Inspection (DPI) approach cannot inspect encrypted network traffic of Hypertext Transfer Protocol Secure (HTTPS), a protocol that has been widely adopted nowadays to protect data communication. We are interested in web application attacks, and to accurately detect them, we must access the payload. Network flows are able to aggregate flows of traffic with common properties, so they can be employed for inspecting large amounts of traffic. The main objective of this thesis is to develop a system to discover anomalous HTTPS traffic and confirm that the payloads included in it contains web applications attacks. We propose a new reliable method and system to identify traffic that may include web application attacks by analysing HTTPS network flows (netflows) and discovering payload content similarities. We resort to unsupervised machine learning algorithms to cluster netflows and identify anomalous traffic and to Locality Sensitive Hashing (LSH) algorithms to create a Similarity Search Engine (SSE) capable of correctly identifying the presence of known web applications attacks over this traffic. We involve the system in a continuous improvement process to keep a reliable detection as new web applications attacks are discovered. We evaluated the system, which showed that it could detect anomalous traffic, the SSE was able to confirm the presence of web attacks into that anomalous traffic, and the continuous improvement process was able to increase the accuracy of the SSE

    Security Risk Management for the Internet of Things

    Get PDF
    In recent years, the rising complexity of Internet of Things (IoT) systems has increased their potential vulnerabilities and introduced new cybersecurity challenges. In this context, state of the art methods and technologies for security risk assessment have prominent limitations when it comes to large scale, cyber-physical and interconnected IoT systems. Risk assessments for modern IoT systems must be frequent, dynamic and driven by knowledge about both cyber and physical assets. Furthermore, they should be more proactive, more automated, and able to leverage information shared across IoT value chains. This book introduces a set of novel risk assessment techniques and their role in the IoT Security risk management process. Specifically, it presents architectures and platforms for end-to-end security, including their implementation based on the edge/fog computing paradigm. It also highlights machine learning techniques that boost the automation and proactiveness of IoT security risk assessments. Furthermore, blockchain solutions for open and transparent sharing of IoT security information across the supply chain are introduced. Frameworks for privacy awareness, along with technical measures that enable privacy risk assessment and boost GDPR compliance are also presented. Likewise, the book illustrates novel solutions for security certification of IoT systems, along with techniques for IoT security interoperability. In the coming years, IoT security will be a challenging, yet very exciting journey for IoT stakeholders, including security experts, consultants, security research organizations and IoT solution providers. The book provides knowledge and insights about where we stand on this journey. It also attempts to develop a vision for the future and to help readers start their IoT Security efforts on the right foot

    Cyber-Physical Threat Intelligence for Critical Infrastructures Security

    Get PDF
    Modern critical infrastructures can be considered as large scale Cyber Physical Systems (CPS). Therefore, when designing, implementing, and operating systems for Critical Infrastructure Protection (CIP), the boundaries between physical security and cybersecurity are blurred. Emerging systems for Critical Infrastructures Security and Protection must therefore consider integrated approaches that emphasize the interplay between cybersecurity and physical security techniques. Hence, there is a need for a new type of integrated security intelligence i.e., Cyber-Physical Threat Intelligence (CPTI). This book presents novel solutions for integrated Cyber-Physical Threat Intelligence for infrastructures in various sectors, such as Industrial Sites and Plants, Air Transport, Gas, Healthcare, and Finance. The solutions rely on novel methods and technologies, such as integrated modelling for cyber-physical systems, novel reliance indicators, and data driven approaches including BigData analytics and Artificial Intelligence (AI). Some of the presented approaches are sector agnostic i.e., applicable to different sectors with a fair customization effort. Nevertheless, the book presents also peculiar challenges of specific sectors and how they can be addressed. The presented solutions consider the European policy context for Security, Cyber security, and Critical Infrastructure protection, as laid out by the European Commission (EC) to support its Member States to protect and ensure the resilience of their critical infrastructures. Most of the co-authors and contributors are from European Research and Technology Organizations, as well as from European Critical Infrastructure Operators. Hence, the presented solutions respect the European approach to CIP, as reflected in the pillars of the European policy framework. The latter includes for example the Directive on security of network and information systems (NIS Directive), the Directive on protecting European Critical Infrastructures, the General Data Protection Regulation (GDPR), and the Cybersecurity Act Regulation. The sector specific solutions that are described in the book have been developed and validated in the scope of several European Commission (EC) co-funded projects on Critical Infrastructure Protection (CIP), which focus on the listed sectors. Overall, the book illustrates a rich set of systems, technologies, and applications that critical infrastructure operators could consult to shape their future strategies. It also provides a catalogue of CPTI case studies in different sectors, which could be useful for security consultants and practitioners as well

    Cyber-Physical Threat Intelligence for Critical Infrastructures Security

    Get PDF
    Modern critical infrastructures can be considered as large scale Cyber Physical Systems (CPS). Therefore, when designing, implementing, and operating systems for Critical Infrastructure Protection (CIP), the boundaries between physical security and cybersecurity are blurred. Emerging systems for Critical Infrastructures Security and Protection must therefore consider integrated approaches that emphasize the interplay between cybersecurity and physical security techniques. Hence, there is a need for a new type of integrated security intelligence i.e., Cyber-Physical Threat Intelligence (CPTI). This book presents novel solutions for integrated Cyber-Physical Threat Intelligence for infrastructures in various sectors, such as Industrial Sites and Plants, Air Transport, Gas, Healthcare, and Finance. The solutions rely on novel methods and technologies, such as integrated modelling for cyber-physical systems, novel reliance indicators, and data driven approaches including BigData analytics and Artificial Intelligence (AI). Some of the presented approaches are sector agnostic i.e., applicable to different sectors with a fair customization effort. Nevertheless, the book presents also peculiar challenges of specific sectors and how they can be addressed. The presented solutions consider the European policy context for Security, Cyber security, and Critical Infrastructure protection, as laid out by the European Commission (EC) to support its Member States to protect and ensure the resilience of their critical infrastructures. Most of the co-authors and contributors are from European Research and Technology Organizations, as well as from European Critical Infrastructure Operators. Hence, the presented solutions respect the European approach to CIP, as reflected in the pillars of the European policy framework. The latter includes for example the Directive on security of network and information systems (NIS Directive), the Directive on protecting European Critical Infrastructures, the General Data Protection Regulation (GDPR), and the Cybersecurity Act Regulation. The sector specific solutions that are described in the book have been developed and validated in the scope of several European Commission (EC) co-funded projects on Critical Infrastructure Protection (CIP), which focus on the listed sectors. Overall, the book illustrates a rich set of systems, technologies, and applications that critical infrastructure operators could consult to shape their future strategies. It also provides a catalogue of CPTI case studies in different sectors, which could be useful for security consultants and practitioners as well

    Actas de las VI Jornadas Nacionales (JNIC2021 LIVE)

    Get PDF
    Estas jornadas se han convertido en un foro de encuentro de los actores más relevantes en el ámbito de la ciberseguridad en España. En ellas, no sólo se presentan algunos de los trabajos científicos punteros en las diversas áreas de ciberseguridad, sino que se presta especial atención a la formación e innovación educativa en materia de ciberseguridad, y también a la conexión con la industria, a través de propuestas de transferencia de tecnología. Tanto es así que, este año se presentan en el Programa de Transferencia algunas modificaciones sobre su funcionamiento y desarrollo que han sido diseñadas con la intención de mejorarlo y hacerlo más valioso para toda la comunidad investigadora en ciberseguridad
    corecore