10 research outputs found

    A Modular and Adaptive System for Business Email Compromise Detection

    Full text link
    The growing sophistication of Business Email Compromise (BEC) and spear phishing attacks poses significant challenges to organizations worldwide. The techniques featured in traditional spam and phishing detection are insufficient due to the tailored nature of modern BEC attacks as they often blend in with the regular benign traffic. Recent advances in machine learning, particularly in Natural Language Understanding (NLU), offer a promising avenue for combating such attacks but in a practical system, due to limitations such as data availability, operational costs, verdict explainability requirements or a need to robustly evolve the system, it is essential to combine multiple approaches together. We present CAPE, a comprehensive and efficient system for BEC detection that has been proven in a production environment for a period of over two years. Rather than being a single model, CAPE is a system that combines independent ML models and algorithms detecting BEC-related behaviors across various email modalities such as text, images, metadata and the email's communication context. This decomposition makes CAPE's verdicts naturally explainable. In the paper, we describe the design principles and constraints behind its architecture, as well as the challenges of model design, evaluation and adapting the system continuously through a Bayesian approach that combines limited data with domain knowledge. Furthermore, we elaborate on several specific behavioral detectors, such as those based on Transformer neural architectures

    Fully Invisible Protean Signatures Schemes

    Get PDF
    Protean Signatures (PS), recently introduced by Krenn et al. (CANS \u2718), allow a semi-trusted third party, named the sanitizer, to modify a signed message in a controlled way. The sanitizer can edit signer-chosen parts to arbitrary bitstrings, while the sanitizer can also redact admissible parts, which are also chosen by the signer. Thus, PSs generalize both redactable signature (RSS) and sanitizable signature (SSS) into a single notion. However, the current definition of invisibility does not prohibit that an outsider can decide which parts of a message are redactable - only which parts can be edited are hidden. This negatively impacts on the privacy guarantees provided by the state-of-the-art definition. We extend PSs to be fully invisible. This strengthened notion guarantees that an outsider can neither decide which parts of a message can be edited nor which parts can be redacted. To achieve our goal, we introduce the new notions of Invisible RSSs and Invisible Non-Accountable SSSs (SSS\u27), along with a consolidated framework for aggregate signatures. Using those building blocks, our resulting construction is significantly more efficient than the original scheme by Krenn et al., which we demonstrate in a prototypical implementation

    FIN-DM: finantsteenuste andmekaeve protsessi mudel

    Get PDF
    Andmekaeve hõlmab reeglite kogumit, protsesse ja algoritme, mis võimaldavad ettevõtetel iga päev kogutud andmetest rakendatavaid teadmisi ammutades suurendada tulusid, vähendada kulusid, optimeerida tooteid ja kliendisuhteid ning saavutada teisi eesmärke. Andmekaeves ja -analüütikas on vaja hästi määratletud metoodikat ja protsesse. Saadaval on mitu andmekaeve ja -analüütika standardset protsessimudelit. Kõige märkimisväärsem ja laialdaselt kasutusele võetud standardmudel on CRISP-DM. Tegu on tegevusalast sõltumatu protsessimudeliga, mida kohandatakse sageli sektorite erinõuetega. CRISP-DMi tegevusalast lähtuvaid kohandusi on pakutud mitmes valdkonnas, kaasa arvatud meditsiini-, haridus-, tööstus-, tarkvaraarendus- ja logistikavaldkonnas. Seni pole aga mudelit kohandatud finantsteenuste sektoris, millel on omad valdkonnapõhised erinõuded. Doktoritöös käsitletakse seda lünka finantsteenuste sektoripõhise andmekaeveprotsessi (FIN-DM) kavandamise, arendamise ja hindamise kaudu. Samuti uuritakse, kuidas kasutatakse andmekaeve standardprotsesse eri tegevussektorites ja finantsteenustes. Uurimise käigus tuvastati mitu tavapärase raamistiku kohandamise stsenaariumit. Lisaks ilmnes, et need meetodid ei keskendu piisavalt sellele, kuidas muuta andmekaevemudelid tarkvaratoodeteks, mida saab integreerida organisatsioonide IT-arhitektuuri ja äriprotsessi. Peamised finantsteenuste valdkonnas tuvastatud kohandamisstsenaariumid olid seotud andmekaeve tehnoloogiakesksete (skaleeritavus), ärikesksete (tegutsemisvõime) ja inimkesksete (diskrimineeriva mõju leevendus) aspektidega. Seejärel korraldati tegelikus finantsteenuste organisatsioonis juhtumiuuring, mis paljastas 18 tajutavat puudujääki CRISP- DMi protsessis. Uuringu andmete ja tulemuste abil esitatakse doktoritöös finantsvaldkonnale kohandatud CRISP-DM nimega FIN-DM ehk finantssektori andmekaeve protsess (Financial Industry Process for Data Mining). FIN-DM laiendab CRISP-DMi nii, et see toetab privaatsust säilitavat andmekaevet, ohjab tehisintellekti eetilisi ohte, täidab riskijuhtimisnõudeid ja hõlmab kvaliteedi tagamist kui osa andmekaeve elutsüklisData mining is a set of rules, processes, and algorithms that allow companies to increase revenues, reduce costs, optimize products and customer relationships, and achieve other business goals, by extracting actionable insights from the data they collect on a day-to-day basis. Data mining and analytics projects require well-defined methodology and processes. Several standard process models for conducting data mining and analytics projects are available. Among them, the most notable and widely adopted standard model is CRISP-DM. It is industry-agnostic and often is adapted to meet sector-specific requirements. Industry- specific adaptations of CRISP-DM have been proposed across several domains, including healthcare, education, industrial and software engineering, logistics, etc. However, until now, there is no existing adaptation of CRISP-DM for the financial services industry, which has its own set of domain-specific requirements. This PhD Thesis addresses this gap by designing, developing, and evaluating a sector-specific data mining process for financial services (FIN-DM). The PhD thesis investigates how standard data mining processes are used across various industry sectors and in financial services. The examination identified number of adaptations scenarios of traditional frameworks. It also suggested that these approaches do not pay sufficient attention to turning data mining models into software products integrated into the organizations' IT architectures and business processes. In the financial services domain, the main discovered adaptation scenarios concerned technology-centric aspects (scalability), business-centric aspects (actionability), and human-centric aspects (mitigating discriminatory effects) of data mining. Next, an examination by means of a case study in the actual financial services organization revealed 18 perceived gaps in the CRISP-DM process. Using the data and results from these studies, the PhD thesis outlines an adaptation of CRISP-DM for the financial sector, named the Financial Industry Process for Data Mining (FIN-DM). FIN-DM extends CRISP-DM to support privacy-compliant data mining, to tackle AI ethics risks, to fulfill risk management requirements, and to embed quality assurance as part of the data mining life-cyclehttps://www.ester.ee/record=b547227

    Side-Channel Analysis and Cryptography Engineering : Getting OpenSSL Closer to Constant-Time

    Get PDF
    As side-channel attacks reached general purpose PCs and started to be more practical for attackers to exploit, OpenSSL adopted in 2005 a flagging mechanism to protect against SCA. The opt-in mechanism allows to flag secret values, such as keys, with the BN_FLG_CONSTTIME flag. Whenever a flag is checked and detected, the library changes its execution flow to SCA-secure functions that are slower but safer, protecting these secret values from being leaked. This mechanism favors performance over security, it is error-prone, and is obscure for most library developers, increasing the potential for side-channel vulnerabilities. This dissertation presents an extensive side-channel analysis of OpenSSL and criticizes its fragile flagging mechanism. This analysis reveals several flaws affecting the library resulting in multiple side-channel attacks, improved cache-timing attack techniques, and a new side channel vector. The first part of this dissertation introduces the main topic and the necessary related work, including the microarchitecture, the cache hierarchy, and attack techniques; then it presents a brief troubled history of side-channel attacks and defenses in OpenSSL, setting the stage for the related publications. This dissertation includes seven original publications contributing to the area of side-channel analysis, microarchitecture timing attacks, and applied cryptography. From an SCA perspective, the results identify several vulnerabilities and flaws enabling protocol-level attacks on RSA, DSA, and ECDSA, in addition to full SCA of the SM2 cryptosystem. With respect to microarchitecture timing attacks, the dissertation presents a new side-channel vector due to port contention in the CPU execution units. And finally, on the applied cryptography front, OpenSSL now enjoys a revamped code base securing several cryptosystems against SCA, favoring a secure-by-default protection against side-channel attacks, instead of the insecure opt-in flagging mechanism provided by the fragile BN_FLG_CONSTTIME flag

    Exploring attributes, sequences, and time in Recommender Systems: From classical to Point-of-Interest recommendation

    Full text link
    Tesis Doctoral inédita leída en la Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingenieria Informática. Fecha de lectura: 08-07-2021Since the emergence of the Internet and the spread of digital communications throughout the world, the amount of data stored on the Web has been growing exponentially. In this new digital era, a large number of companies have emerged with the purpose of ltering the information available on the web and provide users with interesting items. The algorithms and models used to recommend these items are called Recommender Systems. These systems are applied to a large number of domains, from music, books, or movies to dating or Point-of-Interest (POI), which is an increasingly popular domain where users receive recommendations of di erent places when they arrive to a city. In this thesis, we focus on exploiting the use of contextual information, especially temporal and sequential data, and apply it in novel ways in both traditional and Point-of-Interest recommendation. We believe that this type of information can be used not only for creating new recommendation models but also for developing new metrics for analyzing the quality of these recommendations. In one of our rst contributions we propose di erent metrics, some of them derived from previously existing frameworks, using this contextual information. Besides, we also propose an intuitive algorithm that is able to provide recommendations to a target user by exploiting the last common interactions with other similar users of the system. At the same time, we conduct a comprehensive review of the algorithms that have been proposed in the area of POI recommendation between 2011 and 2019, identifying the common characteristics and methodologies used. Once this classi cation of the algorithms proposed to date is completed, we design a mechanism to recommend complete routes (not only independent POIs) to users, making use of reranking techniques. In addition, due to the great di culty of making recommendations in the POI domain, we propose the use of data aggregation techniques to use information from di erent cities to generate POI recommendations in a given target city. In the experimental work we present our approaches on di erent datasets belonging to both classical and POI recommendation. The results obtained in these experiments con rm the usefulness of our recommendation proposals, in terms of ranking accuracy and other dimensions like novelty, diversity, and coverage, and the appropriateness of our metrics for analyzing temporal information and biases in the recommendations producedDesde la aparici on de Internet y la difusi on de las redes de comunicaciones en todo el mundo, la cantidad de datos almacenados en la red ha crecido exponencialmente. En esta nueva era digital, han surgido un gran n umero de empresas con el objetivo de ltrar la informaci on disponible en la red y ofrecer a los usuarios art culos interesantes. Los algoritmos y modelos utilizados para recomendar estos art culos reciben el nombre de Sistemas de Recomendaci on. Estos sistemas se aplican a un gran n umero de dominios, desde m usica, libros o pel culas hasta las citas o los Puntos de Inter es (POIs, en ingl es), un dominio cada vez m as popular en el que los usuarios reciben recomendaciones de diferentes lugares cuando llegan a una ciudad. En esta tesis, nos centramos en explotar el uso de la informaci on contextual, especialmente los datos temporales y secuenciales, y aplicarla de forma novedosa tanto en la recomendaci on cl asica como en la recomendaci on de POIs. Creemos que este tipo de informaci on puede utilizarse no s olo para crear nuevos modelos de recomendaci on, sino tambi en para desarrollar nuevas m etricas para analizar la calidad de estas recomendaciones. En una de nuestras primeras contribuciones proponemos diferentes m etricas, algunas derivadas de formulaciones previamente existentes, utilizando esta informaci on contextual. Adem as, proponemos un algoritmo intuitivo que es capaz de proporcionar recomendaciones a un usuario objetivo explotando las ultimas interacciones comunes con otros usuarios similares del sistema. Al mismo tiempo, realizamos una revisi on exhaustiva de los algoritmos que se han propuesto en el a mbito de la recomendaci o n de POIs entre 2011 y 2019, identi cando las caracter sticas comunes y las metodolog as utilizadas. Una vez realizada esta clasi caci on de los algoritmos propuestos hasta la fecha, dise~namos un mecanismo para recomendar rutas completas (no s olo POIs independientes) a los usuarios, haciendo uso de t ecnicas de reranking. Adem as, debido a la gran di cultad de realizar recomendaciones en el ambito de los POIs, proponemos el uso de t ecnicas de agregaci on de datos para utilizar la informaci on de diferentes ciudades y generar recomendaciones de POIs en una determinada ciudad objetivo. En el trabajo experimental presentamos nuestros m etodos en diferentes conjuntos de datos tanto de recomendaci on cl asica como de POIs. Los resultados obtenidos en estos experimentos con rman la utilidad de nuestras propuestas de recomendaci on en t erminos de precisi on de ranking y de otras dimensiones como la novedad, la diversidad y la cobertura, y c omo de apropiadas son nuestras m etricas para analizar la informaci on temporal y los sesgos en las recomendaciones producida

    Smart Sensor Technologies for IoT

    Get PDF
    The recent development in wireless networks and devices has led to novel services that will utilize wireless communication on a new level. Much effort and resources have been dedicated to establishing new communication networks that will support machine-to-machine communication and the Internet of Things (IoT). In these systems, various smart and sensory devices are deployed and connected, enabling large amounts of data to be streamed. Smart services represent new trends in mobile services, i.e., a completely new spectrum of context-aware, personalized, and intelligent services and applications. A variety of existing services utilize information about the position of the user or mobile device. The position of mobile devices is often achieved using the Global Navigation Satellite System (GNSS) chips that are integrated into all modern mobile devices (smartphones). However, GNSS is not always a reliable source of position estimates due to multipath propagation and signal blockage. Moreover, integrating GNSS chips into all devices might have a negative impact on the battery life of future IoT applications. Therefore, alternative solutions to position estimation should be investigated and implemented in IoT applications. This Special Issue, “Smart Sensor Technologies for IoT” aims to report on some of the recent research efforts on this increasingly important topic. The twelve accepted papers in this issue cover various aspects of Smart Sensor Technologies for IoT

    Machine Learning Based Detection and Evasion Techniques for Advanced Web Bots.

    Get PDF
    Web bots are programs that can be used to browse the web and perform different types of automated actions, both benign and malicious. Such web bots vary in sophistication based on their purpose, ranging from simple automated scripts to advanced web bots that have a browser fingerprint and exhibit a humanlike behaviour. Advanced web bots are especially appealing to malicious web bot creators, due to their browserlike fingerprint and humanlike behaviour which reduce their detectability. Several effective behaviour-based web bot detection techniques have been pro- posed in literature. However, the performance of these detection techniques when target- ing malicious web bots that try to evade detection has not been examined in depth. Such evasive web bot behaviour is achieved by different techniques, including simple heuris- tics and statistical distributions, or more advanced machine learning based techniques. Motivated by the above, in this thesis we research novel web bot detection techniques and how effective these are against evasive web bots that try to evade detection using, among others, recent advances in machine learning. To this end, we initially evaluate state-of-the-art web bot detection techniques against web bots of different sophistication levels and show that, while the existing approaches achieve very high performance in general, such approaches are not very effective when faced with only advanced web bots that try to remain undetected. Thus, we propose a novel web bot detection framework that can be used to detect effectively bots of varying levels of sophistication, including advanced web bots. This framework comprises and combines two detection modules: (i) a detection module that extracts several features from web logs and uses them as input to several well-known machine learning algo- rithms, and (ii) a detection module that uses mouse trajectories as input to Convolutional Neural Networks (CNNs). Moreover, we examine the case where advanced web bots utilise themselves the re- cent advances in machine learning to evade detection. Specifically, we propose two novel evasive advanced web bot types: (i) the web bots that use Reinforcement Learning (RL) to update their browsing behaviour based on whether they have been detected or not, and (ii) the web bots that have in their possession several data from human behaviours and use them as input to Generative Adversarial Networks (GANs) to generate images of humanlike mouse trajectories. We show that both approaches increase the evasiveness of the web bots by reducing the performance of the detection framework utilised in each case. We conclude that malicious web bots can exhibit high sophistication levels and com- bine different techniques that increase their evasiveness. Even though web bot detection frameworks can combine different methods to effectively detect such bots, web bots can update their behaviours using, among other, recent advances in machine learning to in- crease their evasiveness. Thus, the detection techniques should be continuously updated to keep up with new techniques introduced by malicious web bots to evade detection

    Social informatics

    Get PDF
    5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings</p
    corecore