78 research outputs found

    Rejection-oriented learning without complete class information

    Machine Learning is commonly used to support decision-making in numerous, diverse contexts. Its usefulness in this regard is unquestionable: there are complex systems built on top of machine learning techniques whose descriptive and predictive capabilities go far beyond those of human beings. However, these systems still have limitations, and analysing those limitations makes it possible to estimate their applicability and confidence in various cases. This matters because abstaining from giving a response is preferable to giving a mistaken one. In the context of classification-like tasks, the indication of such an inconclusive output is called rejection. The research which culminated in this thesis led to the conception, implementation and evaluation of rejection-oriented learning systems for two distinct tasks: open set recognition and data stream clustering. These systems were derived from the WiSARD artificial neural network, which had rejection modelling incorporated into its functioning. This text details and discusses these realizations. It also presents experimental results which allow the scientific and practical importance of the proposed state-of-the-art methodology to be assessed.
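
As an illustration of the rejection idea described in this abstract, the following is a minimal sketch of a generic confidence-threshold reject option. It is not the WiSARD-based method of the thesis: the function name `classify_with_rejection`, the `scores` mapping and the `threshold` value are all assumptions made for this example.

```python
from typing import Dict, Optional


def classify_with_rejection(scores: Dict[str, float],
                            threshold: float = 0.7) -> Optional[str]:
    """Return the best-scoring label, or None to signal rejection.

    `scores` maps each candidate class to a confidence value (hypothetical
    input; any classifier that produces per-class scores could supply it).
    """
    if not scores:
        # Nothing to choose from, so abstain.
        return None
    label, best = max(scores.items(), key=lambda item: item[1])
    # Abstain when even the best class is not confident enough.
    return label if best >= threshold else None
```

Returning `None` rather than the least-bad label is the point of the design: the caller can route a rejected input to a human or to a fallback system instead of acting on a likely mistake.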

    Denial of Service in Web-Domains: Building Defenses Against Next-Generation Attack Behavior

    Existing state-of-the-art application-layer Distributed Denial of Service (DDoS) protection is generally designed for, and thus effective only in, static web domains. To the best of our knowledge, our work is the first to study application-layer DDoS defense in web domains with dynamic content and organization, and against next-generation bot behaviour. In the first part of this thesis, we focus on the following research tasks: 1) we identify the main weaknesses of the existing application-layer anti-DDoS solutions proposed in the research literature and in industry, 2) we obtain a comprehensive picture of current-day as well as next-generation application-layer attack behaviour, and 3) we propose novel techniques, based on a multidisciplinary approach that combines offline machine learning algorithms and statistical analysis, for detecting suspicious web visitors in static web domains. Then, in the second part of the thesis, we propose and evaluate a novel anti-DDoS system that detects a broad range of application-layer DDoS attacks, in both static and dynamic web domains, using advanced data mining techniques. The key advantage of our system over other systems that rely on challenge-response tests (such as CAPTCHAs) to combat malicious bots is that it minimizes the number of such tests presented to valid human visitors while still preventing most malicious attackers from accessing the web site. The experimental evaluation of the proposed system demonstrates effective detection of current and future variants of application-layer DDoS attacks.

    Data-driven Disease Surveillance

    The recent and still ongoing SARS-CoV-2 pandemic has shown that an infectious disease outbreak can have serious consequences for public health and the economy. In this situation, public health officials constantly aim to control and reduce the number of infections in order to avoid overburdening the health care system. Besides minimizing personal contact through political measures, a fundamental approach to containing the spread of diseases is to isolate infected individuals. The effectiveness of the latter approach strongly depends on timely detection of the outbreak, as tracking individuals can quickly become infeasible when the number of cases increases. Hence, a key factor in the containment of an infectious disease is the early detection of a potentially larger outbreak, commonly known as outbreak detection. For this purpose, epidemiologists rely on a variety of statistical surveillance methods to maintain an overview of the current infection situation by monitoring either confirmed cases or cases with early symptoms. Mainly based on statistical hypothesis testing, these methods automatically raise an alarm if an unexpected increase in the number of infections is observed. The practical usefulness of such methods depends heavily on the trade-off between the ability to detect outbreaks and the chance of raising a false alarm. However, this hypothesis-based approach to disease surveillance has several limitations. On the one hand, it is a hand-crafted approach which requires domain knowledge to set up the statistical methods, especially if early symptoms are monitored. On the other hand, outbreaks of emerging infectious diseases with different symptom patterns are likely to be missed by such a surveillance system. In this thesis, we focus on data-driven disease surveillance and address these challenges in the following ways.
To support epidemiologists in defining reliable disease patterns for monitoring cases with early symptoms, we present a novel approach to discovering such patterns in historic data. With respect to supervised learning, we propose a fusion classifier which combines the output of multiple statistical methods using the univariate time series of infection counts as the only source of information. In addition, we develop algorithms based on unsupervised learning which frame outbreak detection as a general anomaly detection task. This even includes the surveillance of emerging infectious diseases. To this end, we contribute a novel framework and propose a new approach based on sum-product networks to monitor multiple disease patterns simultaneously. Our results show that data-driven approaches are well suited to assisting epidemiologists by processing large amounts of data that cannot be fully understood and analyzed by humans. Most significantly, incorporating additional information into the surveillance through machine learning techniques shows reliable and promising results.
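
The kind of statistical alarm this abstract describes, one that fires on an unexpected increase in case counts, can be sketched as a simple threshold rule. This is an illustrative assumption rather than any method from the thesis: the function name `outbreak_alarm`, the baseline window and the multiplier `k` are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def outbreak_alarm(baseline: Sequence[float],
                   current: float,
                   k: float = 2.0) -> bool:
    """Raise an alarm if `current` exceeds mean + k * stdev of `baseline`.

    `baseline` holds historic case counts for comparable periods
    (hypothetical data); `k` trades detection power against false alarms.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline counts")
    # Sample standard deviation of the historic counts.
    threshold = mean(baseline) + k * stdev(baseline)
    return current > threshold
```

Lowering `k` detects smaller outbreaks but raises more false alarms, which is the trade-off the abstract points to.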

    Interactive visualization of event logs for cybersecurity

    Hidden cyber threats revealed with new visualization software Eventpa