34 research outputs found

    RETVec: Resilient and Efficient Text Vectorizer

    Full text link
    This paper describes RETVec, an efficient, resilient, and multilingual text vectorizer designed for neural-based text processing. RETVec combines a novel character encoding with an optional small embedding model to embed words into a 256-dimensional vector space. The RETVec embedding model is pre-trained using pair-wise metric learning to be robust against typos and character-level adversarial attacks. In this paper, we evaluate and compare RETVec to state-of-the-art vectorizers and word embeddings on popular model architectures and datasets. These comparisons demonstrate that RETVec leads to competitive, multilingual models that are significantly more resilient to typos and adversarial text attacks. RETVec is available under the Apache 2 license at https://github.com/google-research/retvec.Comment: Accepted at NeurIPS 202

    RETSim: Resilient and Efficient Text Similarity

    Full text link
    This paper introduces RETSim (Resilient and Efficient Text Similarity), a lightweight, multilingual deep learning model trained to produce robust metric embeddings for near-duplicate text retrieval, clustering, and dataset deduplication tasks. We demonstrate that RETSim is significantly more robust and accurate than MinHash and neural text embeddings, achieving new state-of-the-art performance on dataset deduplication, adversarial text retrieval benchmarks, and spam clustering tasks. We also introduce the W4NT3D benchmark (Wiki-40B 4dversarial Near-T3xt Dataset) for evaluating multilingual, near-duplicate text retrieval capabilities under adversarial settings. RETSim and the W4NT3D benchmark are open-sourced under the MIT License at https://github.com/google/unisim

    Xcs: Cross channel scripting and its impact on web applications

    Get PDF
    ABSTRACT We study the security of embedded web servers used in consumer electronic devices, such as security cameras and photo frames, and for IT infrastructure, such as wireless access points and lights-out management systems. All the devices we examine turn out to be vulnerable to a variety of web attacks, including cross site scripting (XSS) and cross site request forgery (CSRF). In addition, we show that consumer electronics are particularly vulnerable to a nasty form of persistent XSS where a non-web channel such as NFS or SNMP is used to inject a malicious script. This script is later used to attack an unsuspecting user who connects to the device's web server. We refer to web attacks which are mounted through a non-web channel as cross channel scripting (XCS). We propose a client-side defense against certain XCS which we implement as a browser extension

    Generalized Power Attacks against Crypto Hardware using Long-Range Deep Learning

    Get PDF
    To make cryptographic processors more resilient against side-channel attacks, engineers have developed various countermeasures. However, the effectiveness of these countermeasures is often uncertain, as it depends on the complex interplay between software and hardware. Assessing a countermeasure’s effectiveness using profiling techniques or machine learning so far requires significant expertise and effort to be adapted to new targets which makes those assessments expensive. We argue that including cost-effective automated attacks will help chip design teams to quickly evaluate their countermeasures during the development phase, paving the way to more secure chips. In this paper, we lay the foundations toward such automated system by proposing GPAM, the first deep-learning system for power side-channel analysis that generalizes across multiple cryptographic algorithms, implementations, and side-channel countermeasures without the need for manual tuning or trace preprocessing. We demonstrate GPAM’s capability by successfully attacking four hardened hardware-accelerated elliptic-curve digital-signature implementations. We showcase GPAM’s ability to generalize across multiple algorithms by attacking a protected AES implementation and achieving comparable performance to state-of-the-art attacks, but without manual trace curation and within a limited budget. We release our data and models as an open-source contribution to allow the community to independently replicate our results and build on them

    Hybrid Post-Quantum Signatures in Hardware Security Keys

    Get PDF
    Recent advances in quantum computing are increasingly jeopardizing the security of cryptosystems currently in widespread use, such as RSA or elliptic-curve signatures. To address this threat, researchers and standardization institutes have accelerated the transition to quantum-resistant cryptosystems, collectively known as Post-Quantum Cryptography (PQC). These PQC schemes present new challenges due to their larger memory and computational footprints and their higher chance of latent vulnerabilities. In this work, we address these challenges by introducing a scheme to upgrade the digital signatures used by security keys to PQC. We introduce a hybrid digital signature scheme based on two building blocks: a classically-secure scheme, ECDSA, and a post-quantum secure one, Dilithium. Our hybrid scheme maintains the guarantees of each underlying building block even if the other one is broken, thus being resistant to classical and quantum attacks. We experimentally show that our hybrid signature scheme can successfully execute on current security keys, even though secure PQC schemes are known to require substantial resources. We publish an open-source implementation of our scheme at https://github.com/google/OpenSK/releases/tag/hybrid-pqc so that other researchers can reproduce our results on a nRF52840 development kit

    SoK: hate, harassment, and the changing landscape of online abuse

    Full text link
    We argue that existing security, privacy, and antiabuse protections fail to address the growing threat of online hate and harassment. In order for our community to understand and address this gap, we propose a taxonomy for reasoning about online hate and harassment. Our taxonomy draws on over 150 interdisciplinary research papers that cover disparate threats ranging from intimate partner violence to coordinated mobs. In the process, we identify seven classes of attacks—such as toxic content and surveillance—that each stem from different attacker capabilities and intents. We also provide longitudinal evidence from a three-year survey that hate and harassment is a pervasive, growing experience for online users, particularly for at-risk communities like young adults and people who identify as LGBTQ+. Responding to each class of hate and harassment requires a unique strategy and we highlight five such potential research directions that ultimately empower individuals, communities, and platforms to do so.Accepted manuscrip
    corecore