141 research outputs found

    DDoS Attacks with Randomized Traffic Innovation: Botnet Identification Challenges and Strategies

    Distributed Denial-of-Service (DDoS) attacks are usually launched through a botnet, an "army" of compromised nodes hidden in the network. Inferential tools for DDoS mitigation should accordingly enable an early and reliable discrimination of the normal users from the compromised ones. Unfortunately, the recent emergence of attacks performed at the application layer has multiplied the number of possibilities that a botnet can exploit to conceal its malicious activities. New challenges arise, which cannot be addressed by simply borrowing the tools that have been successfully applied so far to earlier DDoS paradigms. In this work, we offer basically three contributions: i) we introduce an abstract model for the aforementioned class of attacks, where the botnet emulates normal traffic by continually learning admissible patterns from the environment; ii) we devise an inference algorithm that is shown to provide a consistent (i.e., converging to the true solution as time progresses) estimate of the botnet possibly hidden in the network; and iii) we verify the validity of the proposed inferential strategy over real network traces. (Comment: submitted for publication.)
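    The abstract does not spell out the estimator; as a rough illustration of the underlying intuition only — bots that learn admissible patterns from a shared environment end up emitting more mutually similar traffic than independent normal users — the following hypothetical Python sketch flags node pairs whose request-pattern overlap is suspiciously high. The `flag_botnet` helper, the Jaccard heuristic and the threshold are all illustrative assumptions, not the paper's algorithm.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two sets of observed request patterns."""
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_botnet(traffic: dict[str, set], threshold: float = 0.5) -> set[str]:
    """Flag nodes whose pairwise pattern overlap exceeds `threshold`.

    `traffic` maps a node id to the set of request patterns it emitted.
    Bots drawing patterns from a common learned dictionary tend to share
    many patterns, while independent normal users overlap little.
    """
    suspects: set[str] = set()
    for u, v in combinations(traffic, 2):
        if jaccard(traffic[u], traffic[v]) > threshold:
            suspects |= {u, v}
    return suspects

# Toy usage: two bots replay overlapping learned patterns.
traffic = {
    "bot1":  {"p1", "p2", "p3", "p4"},
    "bot2":  {"p1", "p2", "p3", "p5"},
    "user1": {"p6", "p7"},
    "user2": {"p8", "p1"},
}
print(flag_botnet(traffic))  # expected: {'bot1', 'bot2'}
```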

    Search strategies in unstructured overlays

    Master's project report in Informatics Engineering, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2008.

    Unstructured peer-to-peer networks have a low maintenance cost, high resilience and tolerance to the continuous arrival and departure of nodes. In these networks search is usually performed by flooding, which generates a high number of duplicate messages. To improve scalability, unstructured overlays evolved to a two-tiered architecture where regular nodes rely on special nodes, called supernodes or superpeers, to locate resources, thus reducing the scope of flooding-based searches. While this approach takes advantage of node heterogeneity, it makes the overlay less resilient to accidental and malicious faults, and less attractive to users concerned with the consumption of their resources, who may not wish to commit the additional resources required of nodes selected as superpeers. Another point of concern is churn, defined as the constant entry and departure of nodes. Churn affects both structured and unstructured overlay networks and, in order to build resilient search protocols, it must be taken into account. This dissertation proposes a novel search algorithm, called FASE, which combines a replication policy and a search-space division technique to achieve low hop counts using a small number of messages, on unstructured overlays with non-hierarchical topologies. The problem of churn is mitigated by a distributed monitoring algorithm designed with FASE in mind. Simulation results validate the efficiency of FASE when compared to other search algorithms for peer-to-peer networks. The evaluation of the distributed monitoring algorithm shows that it maintains FASE's performance when subjected to churn.

    Peer-to-peer systems, such as content sharing and distribution applications or voice-over-IP, are built on top of overlay networks. Overlay networks are virtual networks that exist on top of an underlying network, and the topology of the overlay need not correspond to the topology of the underlying network. Unlike their structured counterparts, unstructured overlay networks do not restrict the placement of their participants, that is, they do not limit a given node's choice of neighbours, which makes their maintenance simpler. The low maintenance cost of unstructured overlays makes them especially suitable for building peer-to-peer systems capable of tolerating the dynamic behaviour of their participants, since these networks are permanently affected by nodes entering and leaving, a phenomenon known as churn. The most common search algorithm in unstructured overlays consists of flooding the network, which generates a large number of duplicate messages per search. The scalability of these algorithms is limited because they consume too many network resources in systems with many participants. To reduce the number of messages, unstructured overlay networks can be organised into hierarchical topologies, in which some nodes, called supernodes, take on a more important role and become responsible for locating objects. The use of supernodes creates new problems, such as how to select them and the network's dependence on a small percentage of its nodes. This dissertation presents a new search algorithm, called FASE, designed to operate over unstructured overlays with non-hierarchical topologies. The algorithm combines a replication policy with a search-space division technique to resolve searches within a small number of hops at the lowest possible cost. Additionally, the algorithm seeks to level the participants' contributions, since all nodes contribute in a similar way to search performance. The strategy followed by the algorithm is to divide both the nodes of the network and the keys of their contents across different "frequencies" and to replicate keys on the corresponding frequencies, without, however, restricting a node's placement, imposing a structure on the network, or applying a rigid definition of key. To mitigate the churn problem, a distributed monitoring algorithm is presented for the replicas created by FASE. The proposed algorithms are evaluated through simulations, which validate the efficiency of FASE when compared with other search algorithms for unstructured overlay networks. It is also shown that FASE maintains its performance in networks under churn when combined with the monitoring algorithm.
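    As a loose illustration of the "frequencies" idea described above — dividing nodes and content keys across frequencies and replicating each key only on its own frequency, so a search needs to visit only a fraction of the overlay — here is a hypothetical Python sketch. The hashing scheme, the `Node` class and the parameter values are assumptions for illustration, not FASE's actual design.

```python
import hashlib

NUM_FREQUENCIES = 4  # illustrative parameter, not FASE's actual value

def frequency(identifier: str) -> int:
    """Hash a node id or content key onto one of the 'frequencies'."""
    return hashlib.sha1(identifier.encode()).digest()[0] % NUM_FREQUENCIES

class Node:
    """A peer that replicates only the keys falling on its own frequency."""
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.freq = frequency(node_id)
        self.replicas: dict[str, bytes] = {}

    def maybe_replicate(self, key: str, value: bytes) -> bool:
        # Keys are replicated on matching frequencies, so a search for
        # `key` only needs to reach nodes on frequency(key) rather than
        # flooding the whole overlay.
        if frequency(key) == self.freq:
            self.replicas[key] = value
            return True
        return False

nodes = [Node(f"node-{i}") for i in range(8)]
holders = [n.node_id for n in nodes if n.maybe_replicate("song.mp3", b"...")]
print(holders)  # roughly 1/NUM_FREQUENCIES of the nodes hold the key
```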

    Towards Defeating Mass Surveillance and SARS-CoV-2: The Pronto-C2 Fully Decentralized Automatic Contact Tracing System

    Mass surveillance can be more easily achieved by leveraging the fear and the desire of the population to feel protected while affected by devastating events. Indeed, in such scenarios governments can adopt exceptional measures that limit civil rights, usually receiving large support from citizens. The COVID-19 pandemic is currently affecting the daily life of many citizens in the world. People are forced to stay home for several weeks, unemployment rates quickly increase, and uncertainty and sadness generate an impelling desire to join any government effort to stop the spread of the virus as soon as possible. Following the recommendations of epidemiologists, governments are proposing the use of smartphone applications to allow automatic contact tracing of citizens. Such systems can be an effective way to defeat the spread of the SARS-CoV-2 virus, since they buy time in identifying potentially newly infected persons who should therefore be quarantined. This raises the natural question of whether this form of automatic contact tracing can be a subtle weapon for governments to violate privacy inside new and more sophisticated mass surveillance programs. In order to preserve privacy and at the same time contribute to the containment of the pandemic, several research partnerships are proposing privacy-preserving contact tracing systems where pseudonyms are updated periodically to avoid linkability attacks. A core component of such systems is Bluetooth Low Energy (BLE for short), a technology that allows two smartphones to detect that they are in close proximity. Among such systems there are proposals like DP-3T, MIT-PACT, UW-PACT and the Apple & Google exposure notification system that, through a decentralized approach, claim to guarantee better privacy properties compared to centralized approaches (e.g., PEPP-PT-NTK, PEPP-PT-ROBERT). On the other hand, advocates of centralized approaches claim that centralization gives epidemiologists more useful data, therefore allowing them to take more effective actions to defeat the virus. Motivated by Snowden's revelations about previous attempts of governments to realize mass surveillance programs, in this paper we first analyze mass surveillance attacks that leverage weaknesses of automatic contact tracing systems. We focus in particular on the DP-3T system (still, our analysis is significant also for the MIT-PACT and Apple & Google systems). Based on recent literature and new findings, we discuss how a government can exploit the use of the DP-3T system to successfully mount privacy attacks as part of a mass surveillance program. Interestingly, we show that the privacy issues in the DP-3T system are not inherent in BLE-based contact tracing systems. Indeed, we propose two systems, named Pronto-B2 and Pronto-C2, that, in our view, enjoy a much better resilience with respect to mass surveillance attacks while still relying on BLE. Both systems are based on a paradigm shift: instead of asking smartphones to send keys to the Big Brother (this corresponds to the approach of the DP-3T system), we construct a decentralized BLE-based ACT system where smartphones anonymously and confidentially talk to each other in the presence of the Big Brother. Unlike Pronto-B2, Pronto-C2 relies on Diffie-Hellman key exchange, providing better privacy but also requiring a bulletin board to translate a BLE beacon identifier into a group element.
    Both systems can optionally be implemented using blockchain technology, offering complete transparency and resilience through full decentralization, and are therefore more appealing to citizens. Only through large-scale participation of citizens can contact tracing systems be truly useful in defeating COVID-19, and our proposal goes straight in this direction.
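    As a minimal sketch of the Diffie-Hellman ingredient attributed to Pronto-C2 above — two phones that meet derive a shared secret that the server cannot compute — consider the following Python fragment using X25519. It deliberately omits key rotation per epoch, the bulletin board that translates BLE beacon identifiers into group elements, and the upload/matching protocol; the `Phone` class and the token derivation are illustrative assumptions, not the paper's protocol.

```python
# pip install cryptography
import hashlib
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

class Phone:
    """Each device broadcasts an ephemeral Diffie-Hellman public key."""
    def __init__(self):
        self._priv = X25519PrivateKey.generate()
        # 32-byte public value standing in for the BLE beacon identifier
        self.beacon = self._priv.public_key().public_bytes(
            Encoding.Raw, PublicFormat.Raw
        )

    def encounter_token(self, peer_beacon: bytes) -> bytes:
        """Derive the shared encounter secret; only the two devices that
        actually met can compute it, so a hash of it reveals nothing
        useful to the central server."""
        peer = X25519PublicKey.from_public_bytes(peer_beacon)
        return hashlib.sha256(self._priv.exchange(peer)).digest()

# Alice and Bob meet: both sides derive the same anonymous token.
alice, bob = Phone(), Phone()
assert alice.encounter_token(bob.beacon) == bob.encounter_token(alice.beacon)
```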

    Formalizing evasion attacks against machine learning security detectors

    Recent work has shown that adversarial examples can bypass machine learning-based threat detectors relying on static analysis by applying minimal perturbations. To preserve malicious functionality, previous attacks either apply trivial manipulations (e.g. padding), potentially limiting their effectiveness, or require running computationally demanding validation steps to discard adversarial variants that do not correctly execute in sandbox environments. While machine learning systems for detecting SQL injections have been proposed in the literature, no attacks have been tested against the proposed solutions to assess the effectiveness and robustness of these methods. In this thesis, we overcome these limitations by developing RAMEn, a unifying framework that (i) can express attacks for different domains, (ii) generalizes previous attacks against machine learning models, and (iii) uses functions that preserve the functionality of manipulated objects. We provide new attacks for both Windows malware and SQL injection detection scenarios by exploiting the format used for representing these objects. To show the efficacy of RAMEn, we provide experimental results of our strategies in both white-box and black-box settings. The white-box attacks against Windows malware detectors show that perturbing only 2% of the target's input size is enough to evade detection with ease. To further speed up the black-box attacks, we overcome the issues mentioned before by presenting a novel family of black-box attacks that are both query-efficient and functionality-preserving, as they rely on the injection of benign content, which will never be executed, either at the end of the malicious file or within some newly created sections; this is encoded in an algorithm called GAMMA. We also evaluate whether GAMMA transfers to other commercial antivirus solutions, and surprisingly find that it can evade many commercial antivirus engines. For evading SQLi detectors, we create WAF-A-MoLE, a mutational fuzzer that exploits random mutations of the input samples, keeping alive only the most promising ones. WAF-A-MoLE is capable of defeating detectors built with different architectures by using the novel practical manipulations we have proposed. To facilitate reproducibility and future work, we open-source our framework and the corresponding attack implementations. We conclude by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on naturally embedding domain knowledge from subject-matter experts into the learning process.
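    As a hedged illustration of the GAMMA idea described above — injecting benign, never-executed content and keeping only candidates that lower a black-box detector's score — the following Python sketch runs a greedy random search. The real GAMMA formulates the injection as a constrained optimisation (solved, e.g., with a genetic algorithm); the `score` callable, `budget`, `threshold` and the toy detector here are illustrative assumptions.

```python
import random

def gamma_style_attack(malware: bytes, benign_chunks: list[bytes],
                       score, budget: int = 50,
                       threshold: float = 0.5) -> bytes:
    """Greedy black-box evasion: append benign content (never executed)
    and keep any candidate that lowers the detector's maliciousness score."""
    best, best_score = malware, score(malware)
    for _ in range(budget):                      # each iteration = one query
        candidate = best + random.choice(benign_chunks)
        s = score(candidate)
        if s < best_score:
            best, best_score = candidate, s
        if best_score < threshold:               # detector now says "benign"
            break
    return best

# Toy detector (invented): flags files with few zero bytes as malicious.
benign_chunks = [b"\x00" * 64, b"MZbenign" * 8]
score = lambda data: max(0.0, 1.0 - data.count(0) / len(data))
adv = gamma_style_attack(b"\xde\xad\xbe\xef" * 16, benign_chunks, score)
print(len(adv), score(adv))  # larger file, lower maliciousness score
```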

    Collaborative Intrusion Detection in Federated Cloud Environments using Dempster-Shafer Theory of Evidence

    Moving services to the Cloud environment is a trend that has been increasing in recent years, with a constant increase in the sophistication and complexity of such services. Today, even critical infrastructure operators are considering moving their services and data to the Cloud. As Cloud computing grows in popularity, new models are deployed to further the associated benefits. Federated Clouds are one such concept, an alternative for companies reluctant to move their data out of house to a Cloud Service Provider (CSP) due to security and confidentiality concerns. A lack of collaboration among different components within a Cloud federation, or among CSPs, for the detection or prevention of attacks is an issue. Since Cloud environments and Cloud federations are large scale, it is essential that any potential solution for protecting these services and data should scale alongside the environment and adapt to the underlying infrastructure without any issues or performance implications. This thesis presents a novel architecture for collaborative intrusion detection specifically for CSPs within a Cloud federation. Our approach offers a proactive model for Cloud intrusion detection based on the distribution of responsibilities, whereby the responsibility for managing the elements of the Cloud is distributed among several monitoring nodes and brokers, utilising our service-based collaborative intrusion detection ("Security as a Service") methodology. For collaborative intrusion detection, the Dempster-Shafer (D-S) theory of evidence is applied, executing as a fusion node with the role of collecting and fusing the information provided by the monitoring entities and taking the final decision regarding a possible attack. This type of detection and prevention helps increase resilience to attacks in the Cloud. The main novel contribution of this project is that it provides the means by which DDoS attacks are detected within a Cloud federation, so as to enable an early propagated response to block the attack. This inter-domain cooperation will offer holistic security and add to the defence in depth. However, while the utilisation of D-S seems promising, there is an issue regarding conflicting evidence, which is addressed with an extended two-stage D-S fusion process. The evidence from the research strongly suggests that fusion algorithms can play a key role in autonomous decision-making schemes; however, our experimentation highlights areas in which improvements are needed before fully applying them to federated environments.
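    For readers unfamiliar with the fusion step, the following Python sketch implements plain Dempster's rule of combination over a two-hypothesis frame {attack, normal}. The thesis's extended two-stage process for handling conflicting evidence is not reproduced here, and the example masses are invented.

```python
from itertools import product

def dempster_combine(m1: dict[frozenset, float],
                     m2: dict[frozenset, float]) -> dict[frozenset, float]:
    """Combine two basic probability assignments with Dempster's rule.

    The mass of each focal set A is the sum of m1(B)*m2(C) over all
    pairs with B ∩ C == A, normalised by 1 - K, where K is the total
    mass assigned to conflicting (disjoint) pairs.
    """
    combined: dict[frozenset, float] = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {a: m / (1.0 - conflict) for a, m in combined.items()}

# Two monitoring nodes report beliefs over {attack, normal}.
A, N = frozenset({"attack"}), frozenset({"normal"})
theta = A | N                       # the full frame expresses ignorance
m1 = {A: 0.7, theta: 0.3}
m2 = {A: 0.6, N: 0.1, theta: 0.3}
print(dempster_combine(m1, m2))     # belief in 'attack' rises to ~0.87
```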

    Distributed D3: A web-based distributed data visualisation framework for Big Data

    The influx of Big Data has created an ever-growing need for analytic tools targeting the acquisition of insights and knowledge from large datasets. Visual perception, a fundamental tool used by humans to retrieve information from the world around us, has the unique ability to distinguish patterns pre-attentively. Visual analytics via data visualisations is therefore a very powerful tool, and has become ever more important in this era. Data-Driven Documents (D3.js) is a versatile and popular web-based data visualisation library that has tended to be the standard toolkit for visualising data in recent years. However, the library is inherently limited in capability by the single-thread model of a single browser window on a single machine, and is therefore not able to deal with large datasets. The main objective of this thesis is to overcome this limitation and address the associated challenges by developing the Distributed D3 framework, which employs a distributed mechanism to make web-based visualisations of large-scale data possible, and also allows the graphical computational resources of modern visualisation environments to be utilised effectively. As a result, the first contribution is that an integrated version of the Distributed D3 framework has been developed for the Data Observatory. This work proves that the concept of Distributed D3 is feasible in practice and also enables developers to collaborate on large-scale data visualisations by using it on the Data Observatory. The second contribution is that Distributed D3 has been optimised by investigating the potential bottlenecks of large-scale data visualisation applications. This work identifies the key performance bottlenecks of the framework and shows an improvement of the overall performance by 35.7% after optimisation, which improves the scalability and usability of Distributed D3 for large-scale data visualisation applications. The third contribution is that a generic version of the Distributed D3 framework has been developed for customised environments. This work improves the usability and flexibility of the framework and makes it ready to be published in the open-source community for further improvement and use.
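    The abstract gives few details of the distribution mechanism; as a purely hypothetical sketch of the general idea — partitioning a large dataset so that each browser-based D3.js instance receives and renders only its own slice — consider the following Python fragment. The renderer URLs, the round-robin sharding and the function name are assumptions, not Distributed D3's actual design.

```python
def shard_for_renderers(rows: list[dict],
                        renderers: list[str]) -> dict[str, list[dict]]:
    """Round-robin partition of a dataset so each browser instance draws
    only its own slice, sidestepping the single-thread limit of one
    D3.js window on one machine."""
    shards: dict[str, list[dict]] = {url: [] for url in renderers}
    for i, row in enumerate(rows):
        shards[renderers[i % len(renderers)]].append(row)
    return shards

renderers = ["http://screen-1/render", "http://screen-2/render"]  # hypothetical
rows = [{"x": i, "y": i * i} for i in range(10)]
for url, shard in shard_for_renderers(rows, renderers).items():
    print(url, "renders", len(shard), "rows")
```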

    A Framework for anonymous background data delivery and feedback

    The industry's current methods of collecting background data reflecting diagnostic and usage information are often opaque and require users to place a lot of trust in the entity receiving the data. For vendors, a centralized database of potentially sensitive data is a privacy-protection headache and a potential liability should a breach of that database occur. Unfortunately, high-profile privacy failures are not uncommon, so many individuals and companies are understandably skeptical and choose not to contribute any information. This is a shame, since the data could be used to improve reliability, strengthen security, or support valuable academic research into real-world usage patterns. We propose, implement and evaluate a framework for non-realtime anonymous data collection, aggregation for analysis, and feedback. Departing from the usual "trusted core" approach, we aim to maintain reporters' anonymity even if the centralized part of the system is compromised. We design a peer-to-peer mix network and its protocol, tuned to the properties of background diagnostic traffic. Our system delivers data to a centralized repository while maintaining (i) source anonymity, (ii) privacy in transit, and (iii) the ability to provide analysis feedback back to the source. By removing the core's ability to identify the source of data and to track users over time, we drastically reduce its attractiveness as a potential attack target and allow vendors to make concrete and verifiable privacy and anonymity claims.
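    As a simplified sketch of the layered encryption a mix network relies on — each hop can peel exactly one layer, so no single mix links the sender to the plaintext report — here is a hypothetical Python fragment using Fernet from the `cryptography` package. The real system's routing, batching and feedback channel are omitted, and the helper names are invented.

```python
# pip install cryptography
from cryptography.fernet import Fernet

def onion_wrap(report: bytes, hop_keys: list[bytes]) -> bytes:
    """Encrypt a diagnostic report in layers, one per mix node. The first
    node in `hop_keys` peels the outermost layer, so keys are applied in
    reverse path order."""
    blob = report
    for key in reversed(hop_keys):
        blob = Fernet(key).encrypt(blob)
    return blob

def peel(blob: bytes, key: bytes) -> bytes:
    """A single mix removes one layer; it learns neither the original
    sender nor the plaintext unless it is the final hop."""
    return Fernet(key).decrypt(blob)

keys = [Fernet.generate_key() for _ in range(3)]   # one key per mix node
blob = onion_wrap(b"crash-report: module X, build 1.2", keys)
for k in keys:                                     # mixes strip layers in path order
    blob = peel(blob, k)
assert blob == b"crash-report: module X, build 1.2"
```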

    Privacy-preserving systems around security, trust and identity

    Data has proved to be the most valuable asset in a modern world of rapidly advancing technologies. Companies are trying to maximise their profits by extracting valuable insights from collected data about people's trends and behaviour, data which can often be considered personal and sensitive. Additionally, sophisticated adversaries often target organisations aiming to exfiltrate sensitive data to sell it to third parties or to ask for ransom. Hence, privacy assurance is a matter of great importance for the individual data producers, who must rely on simply trusting that the services they use have taken all the necessary countermeasures to protect them.

    Distributed ledger technology and its variants can securely store data and preserve its privacy with novel characteristics. Additionally, the concept of self-sovereign identity, which gives control back to the data subjects, is an expected future step once these approaches mature further. Last but not least, big data analysis typically occurs through machine learning techniques. However, the security of these techniques is often questioned, since adversaries aim to exploit them for their own benefit.

    The aspects of security, privacy and trust are highlighted throughout this thesis, which investigates several emerging technologies that aim to protect and analyse sensitive data, comparing them with already existing systems, tools and approaches in terms of security guarantees and performance efficiency.

    The contributions of this thesis are: i) the presentation of a novel distributed ledger infrastructure tailored to the domain name system; ii) the adaptation of this infrastructure to a critical healthcare use case; iii) the development of a novel self-sovereign identity healthcare scenario in which a data scientist analyses sensitive data stored on the premises of three hospitals, through a privacy-preserving machine learning approach; and iv) a thorough investigation of adversarial attacks that aim to exploit machine learning intrusion detection systems by "tricking" them into misclassifying carefully crafted inputs, such as malware identified as benign.

    A significant finding is that the security and privacy of data are often neglected, since they do not directly impact people's lives. It is common for the protection and confidentiality of systems, even those of a critical nature, to be an afterthought, considered only after malicious activity occurs. Further, emerging sets of technologies, tools, and approaches built on fundamental security and privacy principles, such as distributed ledger technology, should be favoured by existing systems that can adopt them without significant changes and compromises. Additionally, it has been shown that the decentralisation of machine learning algorithms through self-sovereign identity technologies that provide novel end-to-end encrypted channels is possible without sacrificing the valuable utility of the original machine learning algorithms.

    However, a matter of great importance is that, alongside these technological advancements, adversaries are becoming more sophisticated and are trying to exploit such machine learning approaches for their own benefit through various tools and techniques. Adversarial attacks pose a real threat to any machine learning algorithm and artificial intelligence technique, and their detection is challenging and often problematic. Hence, any security professional operating in this domain should consider the impact of these attacks and the protective countermeasures to combat or minimise them.
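    As a toy illustration of the kind of evasion attack discussed above, the following Python sketch applies a fast-gradient-sign-style perturbation against a simple logistic "detector". Real intrusion detection features are constrained and cannot be perturbed arbitrarily; the model, step size and data here are invented for illustration.

```python
import numpy as np

def fgsm_evasion(x: np.ndarray, w: np.ndarray,
                 epsilon: float = 0.1) -> np.ndarray:
    """One fast-gradient-sign step against a logistic detector
    sigma(w.x): move every feature against the gradient of the
    malicious score to push the sample toward 'benign'."""
    return x - epsilon * np.sign(w)   # gradient of w.x w.r.t. x is w

rng = np.random.default_rng(0)
w = rng.normal(size=8)                # invented detector weights
x = rng.normal(size=8)                # invented feature vector
score = lambda v: 1.0 / (1.0 + np.exp(-(w @ v)))
print(f"before: {score(x):.2f}  after: {score(fgsm_evasion(x, w)):.2f}")
```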