12 research outputs found

    A Big Data Architecture for Early Identification and Categorization of Dark Web Sites

    Full text link
    The dark web has become notorious for its association with illicit activities and there is a growing need for systems to automate the monitoring of this space. This paper proposes an end-to-end scalable architecture for the early identification of new Tor sites and the daily analysis of their content. The solution is built using an Open Source Big Data stack for data serving with Kubernetes, Kafka, Kubeflow, and MinIO, continuously discovering onion addresses in different sources (threat intelligence, code repositories, web-Tor gateways, and Tor repositories), downloading the HTML from Tor and deduplicating the content using MinHash LSH, and categorizing with the BERTopic modeling (SBERT embedding, UMAP dimensionality reduction, HDBSCAN document clustering and c-TF-IDF topic keywords). In 93 days, the system identified 80,049 onion services and characterized 90% of them, addressing the challenge of Tor volatility. A disproportionate amount of repeated content is found, with only 6.1% unique sites. From the HTML files of the dark sites, 31 different low-topics are extracted, manually labeled, and grouped into 11 high-level topics. The five most popular included sexual and violent content, repositories, search engines, carding, cryptocurrencies, and marketplaces. During the experiments, we identified 14 sites with 13,946 clones that shared a suspiciously similar mirroring rate per day, suggesting an extensive common phishing network. Among the related works, this study is the most representative characterization of onion services based on topics to date

    An Approach to Guide Users Towards Less Revealing Internet Browsers

    Get PDF
    When browsing the Internet, HTTP headers enable both clients and servers send extra data in their requests or responses such as the User-Agent string. This string contains information related to the sender’s device, browser, and operating system. Previous research has shown that there are numerous privacy and security risks result from exposing sensitive information in the User-Agent string. For example, it enables device and browser fingerprinting and user tracking and identification. Our large analysis of thousands of User-Agent strings shows that browsers differ tremendously in the amount of information they include in their User-Agent strings. As such, our work aims at guiding users towards using less exposing browsers. In doing so, we propose to assign an exposure score to browsers based on the information they expose and vulnerability records. Thus, our contribution in this work is as follows: first, provide a full implementation that is ready to be deployed and used by users. Second, conduct a user study to identify the effectiveness and limitations of our proposed approach. Our implementation is based on using more than 52 thousand unique browsers. Our performance and validation analysis show that our solution is accurate and efficient. The source code and data set are publicly available and the solution has been deployed

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in AJuly 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: ​data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in AJuly 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: ​data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification

    Let the weakest link fail, but gracefully:understanding tailored phishing and measures against it

    Get PDF

    Challenges in Cybersecurity and Privacy - the European Research Landscape

    Get PDF
    Cybersecurity and Privacy issues are becoming an important barrier for a trusted and dependable global digital society development. Cyber-criminals are continuously shifting their cyber-attacks specially against cyber-physical systems and IoT, since they present additional vulnerabilities due to their constrained capabilities, their unattended nature and the usage of potential untrustworthiness components. Likewise, identity-theft, fraud, personal data leakages, and other related cyber-crimes are continuously evolving, causing important damages and privacy problems for European citizens in both virtual and physical scenarios. In this context, new holistic approaches, methodologies, techniques and tools are needed to cope with those issues, and mitigate cyberattacks, by employing novel cyber-situational awareness frameworks, risk analysis and modeling, threat intelligent systems, cyber-threat information sharing methods, advanced big-data analysis techniques as well as exploiting the benefits from latest technologies such as SDN/NFV and Cloud systems. In addition, novel privacy-preserving techniques, and crypto-privacy mechanisms, identity and eID management systems, trust services, and recommendations are needed to protect citizens’ privacy while keeping usability levels. The European Commission is addressing the challenge through different means, including the Horizon 2020 Research and Innovation program, thereby financing innovative projects that can cope with the increasing cyberthreat landscape. This book introduces several cybersecurity and privacy research challenges and how they are being addressed in the scope of 15 European research projects. Each chapter is dedicated to a different funded European Research project, which aims to cope with digital security and privacy aspects, risks, threats and cybersecurity issues from a different perspective. Each chapter includes the project’s overviews and objectives, the particular challenges they are covering, research achievements on security and privacy, as well as the techniques, outcomes, and evaluations accomplished in the scope of the EU project. The book is the result of a collaborative effort among relative ongoing European Research projects in the field of privacy and security as well as related cybersecurity fields, and it is intended to explain how these projects meet the main cybersecurity and privacy challenges faced in Europe. Namely, the EU projects analyzed in the book are: ANASTACIA, SAINT, YAKSHA, FORTIKA, CYBECO, SISSDEN, CIPSEC, CS-AWARE. RED-Alert, Truessec.eu. ARIES, LIGHTest, CREDENTIAL, FutureTrust, LEPS. Challenges in Cybersecurity and Privacy - the European Research Landscape is ideal for personnel in computer/communication industries as well as academic staff and master/research students in computer science and communications networks interested in learning about cyber-security and privacy aspects

    Challenges in Cybersecurity and Privacy - the European Research Landscape

    Get PDF
    Cybersecurity and Privacy issues are becoming an important barrier for a trusted and dependable global digital society development. Cyber-criminals are continuously shifting their cyber-attacks specially against cyber-physical systems and IoT, since they present additional vulnerabilities due to their constrained capabilities, their unattended nature and the usage of potential untrustworthiness components. Likewise, identity-theft, fraud, personal data leakages, and other related cyber-crimes are continuously evolving, causing important damages and privacy problems for European citizens in both virtual and physical scenarios. In this context, new holistic approaches, methodologies, techniques and tools are needed to cope with those issues, and mitigate cyberattacks, by employing novel cyber-situational awareness frameworks, risk analysis and modeling, threat intelligent systems, cyber-threat information sharing methods, advanced big-data analysis techniques as well as exploiting the benefits from latest technologies such as SDN/NFV and Cloud systems. In addition, novel privacy-preserving techniques, and crypto-privacy mechanisms, identity and eID management systems, trust services, and recommendations are needed to protect citizens’ privacy while keeping usability levels. The European Commission is addressing the challenge through different means, including the Horizon 2020 Research and Innovation program, thereby financing innovative projects that can cope with the increasing cyberthreat landscape. This book introduces several cybersecurity and privacy research challenges and how they are being addressed in the scope of 15 European research projects. Each chapter is dedicated to a different funded European Research project, which aims to cope with digital security and privacy aspects, risks, threats and cybersecurity issues from a different perspective. Each chapter includes the project’s overviews and objectives, the particular challenges they are covering, research achievements on security and privacy, as well as the techniques, outcomes, and evaluations accomplished in the scope of the EU project. The book is the result of a collaborative effort among relative ongoing European Research projects in the field of privacy and security as well as related cybersecurity fields, and it is intended to explain how these projects meet the main cybersecurity and privacy challenges faced in Europe. Namely, the EU projects analyzed in the book are: ANASTACIA, SAINT, YAKSHA, FORTIKA, CYBECO, SISSDEN, CIPSEC, CS-AWARE. RED-Alert, Truessec.eu. ARIES, LIGHTest, CREDENTIAL, FutureTrust, LEPS. Challenges in Cybersecurity and Privacy - the European Research Landscape is ideal for personnel in computer/communication industries as well as academic staff and master/research students in computer science and communications networks interested in learning about cyber-security and privacy aspects

    Jornadas Nacionales de Investigación en Ciberseguridad: actas de las VIII Jornadas Nacionales de Investigación en ciberseguridad: Vigo, 21 a 23 de junio de 2023

    Get PDF
    Jornadas Nacionales de Investigación en Ciberseguridad (8ª. 2023. Vigo)atlanTTicAMTEGA: Axencia para a modernización tecnolóxica de GaliciaINCIBE: Instituto Nacional de Cibersegurida
    corecore