71 research outputs found

    Advanced analytical methods for fraud detection: a systematic literature review

    Get PDF
    The developments of the digital era demand new ways of producing goods and rendering services. This fast-paced evolution in the companies implies a new approach from the auditors, who must keep up with the constant transformation. With the dynamic dimensions of data, it is important to seize the opportunity to add value to the companies. The need to apply more robust methods to detect fraud is evident. In this thesis the use of advanced analytical methods for fraud detection will be investigated, through the analysis of the existent literature on this topic. Both a systematic review of the literature and a bibliometric approach will be applied to the most appropriate database to measure the scientific production and current trends. This study intends to contribute to the academic research that have been conducted, in order to centralize the existing information on this topic

    Robust Recommender System: A Survey and Future Directions

    Full text link
    With the rapid growth of information, recommender systems have become integral for providing personalized suggestions and overcoming information overload. However, their practical deployment often encounters "dirty" data, where noise or malicious information can lead to abnormal recommendations. Research on improving recommender systems' robustness against such dirty data has thus gained significant attention. This survey provides a comprehensive review of recent work on recommender systems' robustness. We first present a taxonomy to organize current techniques for withstanding malicious attacks and natural noise. We then explore state-of-the-art methods in each category, including fraudster detection, adversarial training, certifiable robust training against malicious attacks, and regularization, purification, self-supervised learning against natural noise. Additionally, we summarize evaluation metrics and common datasets used to assess robustness. We discuss robustness across varying recommendation scenarios and its interplay with other properties like accuracy, interpretability, privacy, and fairness. Finally, we delve into open issues and future research directions in this emerging field. Our goal is to equip readers with a holistic understanding of robust recommender systems and spotlight pathways for future research and development

    Evaluating Adversarial Robustness of Detection-based Defenses against Adversarial Examples

    Get PDF
    Machine Learning algorithms provide astonishing performance in a wide range of tasks, including sensitive and critical applications. On the other hand, it has been shown that they are vulnerable to adversarial attacks, a set of techniques that violate the integrity, confidentiality, or availability of such systems. In particular, one of the most studied phenomena concerns adversarial examples, i.e., input samples that are carefully manipulated to alter the model output. In the last decade, the research community put a strong effort into this field, proposing new evasion attacks and methods to defend against them. With this thesis, we propose different approaches that can be applied to Deep Neural Networks to detect and reject adversarial examples that present an anomalous distribution with respect to training data. The first leverages the domain knowledge of the relationships among the considered classes integrated through a framework in which first-order logic knowledge is converted into constraints and injected into a semi-supervised learning problem. Within this setting, the classifier is able to reject samples that violate the domain knowledge constraints. This approach can be applied in both single and multi-label classification settings. The second one is based on a Deep Neural Rejection (DNR) mechanism to detect adversarial examples, based on the idea of rejecting samples that exhibit anomalous feature representations at different network layers. To this end, we exploit RBF SVM classifiers, which provide decreasing confidence values as samples move away from the training data distribution. Despite technical differences, this approach shares a common backbone structure with other proposed methods that we formalize in a unifying framework. As all of them require comparing input samples against an oversized number of reference prototypes, possibly at different representation layers, they suffer from the same drawback, i.e., high computational overhead and memory usage, that makes these approaches unusable in real applications. To overcome this limitation, we introduce FADER (Fast Adversarial Example Rejection), a technique for speeding up detection-based methods by employing RBF networks as detectors: by fixing the number of required prototypes, their runtime complexity can be controlled. All proposed methods are evaluated in both black-box and white-box settings, i.e., against an attacker unaware of the defense mechanism, and against an attacker who knows the defense and adapts the attack algorithm to bypass it, respectively. Our experimental evaluation shows that the proposed methods increase the robustness of the defended models and help detect adversarial examples effectively, especially when the attacker does not know the underlying detection system

    Advances in computational frameworks in the fight against TB: The way forward

    Get PDF
    Around 1.6 million people lost their life to Tuberculosis in 2021 according to WHO estimates. Although an intensive treatment plan exists against the causal agent, Mycobacterium Tuberculosis, evolution of multi-drug resistant strains of the pathogen puts a large number of global populations at risk. Vaccine which can induce long-term protection is still in the making with many candidates currently in different phases of clinical trials. The COVID-19 pandemic has further aggravated the adversities by affecting early TB diagnosis and treatment. Yet, WHO remains adamant on its “End TB” strategy and aims to substantially reduce TB incidence and deaths by the year 2035. Such an ambitious goal would require a multi-sectoral approach which would greatly benefit from the latest computational advancements. To highlight the progress of these tools against TB, through this review, we summarize recent studies which have used advanced computational tools and algorithms for—early TB diagnosis, anti-mycobacterium drug discovery and in the designing of the next-generation of TB vaccines. At the end, we give an insight on other computational tools and Machine Learning approaches which have successfully been applied in biomedical research and discuss their prospects and applications against TB

    Single-board Device Individual Authentication based on Hardware Performance and Autoencoder Transformer Models

    Full text link
    The proliferation of the Internet of Things (IoT) has led to the emergence of crowdsensing applications, where a multitude of interconnected devices collaboratively collect and analyze data. Ensuring the authenticity and integrity of the data collected by these devices is crucial for reliable decision-making and maintaining trust in the system. Traditional authentication methods are often vulnerable to attacks or can be easily duplicated, posing challenges to securing crowdsensing applications. Besides, current solutions leveraging device behavior are mostly focused on device identification, which is a simpler task than authentication. To address these issues, an individual IoT device authentication framework based on hardware behavior fingerprinting and Transformer autoencoders is proposed in this work. This solution leverages the inherent imperfections and variations in IoT device hardware to differentiate between devices with identical specifications. By monitoring and analyzing the behavior of key hardware components, such as the CPU, GPU, RAM, and Storage on devices, unique fingerprints for each device are created. The performance samples are considered as time series data and used to train outlier detection transformer models, one per device and aiming to model its normal data distribution. Then, the framework is validated within a spectrum crowdsensing system leveraging Raspberry Pi devices. After a pool of experiments, the model from each device is able to individually authenticate it between the 45 devices employed for validation. An average True Positive Rate (TPR) of 0.74+-0.13 and an average maximum False Positive Rate (FPR) of 0.06+-0.09 demonstrate the effectiveness of this approach in enhancing authentication, security, and trust in crowdsensing applications

    Jornadas Nacionales de Investigación en Ciberseguridad: actas de las VIII Jornadas Nacionales de Investigación en ciberseguridad: Vigo, 21 a 23 de junio de 2023

    Get PDF
    Jornadas Nacionales de Investigación en Ciberseguridad (8ª. 2023. Vigo)atlanTTicAMTEGA: Axencia para a modernización tecnolóxica de GaliciaINCIBE: Instituto Nacional de Cibersegurida

    GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models

    Full text link
    Recent text-to-image diffusion models such as MidJourney and Stable Diffusion threaten to displace many in the professional artist community. In particular, models can learn to mimic the artistic style of specific artists after "fine-tuning" on samples of their art. In this paper, we describe the design, implementation and evaluation of Glaze, a tool that enables artists to apply "style cloaks" to their art before sharing online. These cloaks apply barely perceptible perturbations to images, and when used as training data, mislead generative models that try to mimic a specific artist. In coordination with the professional artist community, we deploy user studies to more than 1000 artists, assessing their views of AI art, as well as the efficacy of our tool, its usability and tolerability of perturbations, and robustness across different scenarios and against adaptive countermeasures. Both surveyed artists and empirical CLIP-based scores show that even at low perturbation levels (p=0.05), Glaze is highly successful at disrupting mimicry under normal conditions (>92%) and against adaptive countermeasures (>85%)

    Detection and Mitigation of Steganographic Malware

    Get PDF
    A new attack trend concerns the use of some form of steganography and information hiding to make malware stealthier and able to elude many standard security mechanisms. Therefore, this Thesis addresses the detection and the mitigation of this class of threats. In particular, it considers malware implementing covert communications within network traffic or cloaking malicious payloads within digital images. The first research contribution of this Thesis is in the detection of network covert channels. Unfortunately, the literature on the topic lacks of real traffic traces or attack samples to perform precise tests or security assessments. Thus, a propaedeutic research activity has been devoted to develop two ad-hoc tools. The first allows to create covert channels targeting the IPv6 protocol by eavesdropping flows, whereas the second allows to embed secret data within arbitrary traffic traces that can be replayed to perform investigations in realistic conditions. This Thesis then starts with a security assessment concerning the impact of hidden network communications in production-quality scenarios. Results have been obtained by considering channels cloaking data in the most popular protocols (e.g., TLS, IPv4/v6, and ICMPv4/v6) and showcased that de-facto standard intrusion detection systems and firewalls (i.e., Snort, Suricata, and Zeek) are unable to spot this class of hazards. Since malware can conceal information (e.g., commands and configuration files) in almost every protocol, traffic feature or network element, configuring or adapting pre-existent security solutions could be not straightforward. Moreover, inspecting multiple protocols, fields or conversations at the same time could lead to performance issues. Thus, a major effort has been devoted to develop a suite based on the extended Berkeley Packet Filter (eBPF) to gain visibility over different network protocols/components and to efficiently collect various performance indicators or statistics by using a unique technology. This part of research allowed to spot the presence of network covert channels targeting the header of the IPv6 protocol or the inter-packet time of generic network conversations. In addition, the approach based on eBPF turned out to be very flexible and also allowed to reveal hidden data transfers between two processes co-located within the same host. Another important contribution of this part of the Thesis concerns the deployment of the suite in realistic scenarios and its comparison with other similar tools. Specifically, a thorough performance evaluation demonstrated that eBPF can be used to inspect traffic and reveal the presence of covert communications also when in the presence of high loads, e.g., it can sustain rates up to 3 Gbit/s with commodity hardware. To further address the problem of revealing network covert channels in realistic environments, this Thesis also investigates malware targeting traffic generated by Internet of Things devices. In this case, an incremental ensemble of autoencoders has been considered to face the ''unknown'' location of the hidden data generated by a threat covertly exchanging commands towards a remote attacker. The second research contribution of this Thesis is in the detection of malicious payloads hidden within digital images. In fact, the majority of real-world malware exploits hiding methods based on Least Significant Bit steganography and some of its variants, such as the Invoke-PSImage mechanism. Therefore, a relevant amount of research has been done to detect the presence of hidden data and classify the payload (e.g., malicious PowerShell scripts or PHP fragments). To this aim, mechanisms leveraging Deep Neural Networks (DNNs) proved to be flexible and effective since they can learn by combining raw low-level data and can be updated or retrained to consider unseen payloads or images with different features. To take into account realistic threat models, this Thesis studies malware targeting different types of images (i.e., favicons and icons) and various payloads (e.g., URLs and Ethereum addresses, as well as webshells). Obtained results showcased that DNNs can be considered a valid tool for spotting the presence of hidden contents since their detection accuracy is always above 90% also when facing ''elusion'' mechanisms such as basic obfuscation techniques or alternative encoding schemes. Lastly, when detection or classification are not possible (e.g., due to resource constraints), approaches enforcing ''sanitization'' can be applied. Thus, this Thesis also considers autoencoders able to disrupt hidden malicious contents without degrading the quality of the image

    Toward Building an Intelligent and Secure Network: An Internet Traffic Forecasting Perspective

    Get PDF
    Internet traffic forecast is a crucial component for the proactive management of self-organizing networks (SON) to ensure better Quality of Service (QoS) and Quality of Experience (QoE). Given the volatile and random nature of traffic data, this forecasting influences strategic development and investment decisions in the Internet Service Provider (ISP) industry. Modern machine learning algorithms have shown potential in dealing with complex Internet traffic prediction tasks, yet challenges persist. This thesis systematically explores these issues over five empirical studies conducted in the past three years, focusing on four key research questions: How do outlier data samples impact prediction accuracy for both short-term and long-term forecasting? How can a denoising mechanism enhance prediction accuracy? How can robust machine learning models be built with limited data? How can out-of-distribution traffic data be used to improve the generalizability of prediction models? Based on extensive experiments, we propose a novel traffic forecast/prediction framework and associated models that integrate outlier management and noise reduction strategies, outperforming traditional machine learning models. Additionally, we suggest a transfer learning-based framework combined with a data augmentation technique to provide robust solutions with smaller datasets. Lastly, we propose a hybrid model with signal decomposition techniques to enhance model generalization for out-of-distribution data samples. We also brought the issue of cyber threats as part of our forecast research, acknowledging their substantial influence on traffic unpredictability and forecasting challenges. Our thesis presents a detailed exploration of cyber-attack detection, employing methods that have been validated using multiple benchmark datasets. Initially, we incorporated ensemble feature selection with ensemble classification to improve DDoS (Distributed Denial-of-Service) attack detection accuracy with minimal false alarms. Our research further introduces a stacking ensemble framework for classifying diverse forms of cyber-attacks. Proceeding further, we proposed a weighted voting mechanism for Android malware detection to secure Mobile Cyber-Physical Systems, which integrates the mobility of various smart devices to exchange information between physical and cyber systems. Lastly, we employed Generative Adversarial Networks for generating flow-based DDoS attacks in Internet of Things environments. By considering the impact of cyber-attacks on traffic volume and their challenges to traffic prediction, our research attempts to bridge the gap between traffic forecasting and cyber security, enhancing proactive management of networks and contributing to resilient and secure internet infrastructure
    corecore