25 research outputs found

    How Bad is Training on Synthetic Data? A Statistical Analysis of Language Model Collapse

    The phenomenon of model collapse, introduced by Shumailov et al. (2023), refers to the deterioration in performance that occurs when new models are trained on synthetic data generated by previously trained models. This recursive training loop erodes the tails of the original distribution, causing future-generation models to forget the initial (real) distribution. To rigorously understand model collapse in language models, we consider in this paper a statistical model that allows us to characterize the impact of various recursive training scenarios. Specifically, we demonstrate that model collapse cannot be avoided when training solely on synthetic data. However, when mixing real and synthetic data, we provide an estimate of the maximal amount of synthetic data below which model collapse can eventually be avoided. Our theoretical conclusions are further supported by empirical validation.
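    The tail-loss mechanism described above can be illustrated with a toy simulation (an illustrative sketch, not the paper's statistical model): repeatedly fit a Gaussian to samples drawn from the previous generation's fit, and the estimated spread shrinks, so the tails of the real distribution disappear.

    ```python
    # Toy recursive-training loop: each "generation" is trained (fit) purely on
    # synthetic samples from the previous generation. The fitted std contracts
    # over generations, mimicking the disappearance of distribution tails.
    import random
    import statistics

    random.seed(0)

    def generation(mu, sigma, n=100):
        """Sample n synthetic points from N(mu, sigma) and refit by maximum likelihood."""
        samples = [random.gauss(mu, sigma) for _ in range(n)]
        return statistics.fmean(samples), statistics.pstdev(samples)

    mu, sigma = 0.0, 1.0            # generation 0: the "real" distribution
    history = [sigma]
    for _ in range(500):            # 500 generations trained solely on synthetic data
        mu, sigma = generation(mu, sigma)
        history.append(sigma)

    print(f"initial std: {history[0]:.3f}  final std: {history[-1]:.3f}")
    ```

    Mixing in a fraction of real samples at every generation, as the abstract suggests, counteracts this contraction; training on synthetic data alone does not.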

    Comparative study of postoperative peritonitis according to the emergency or scheduled-surgery context of the initial operation

    Abstract
    Introduction - The objective of this study is to highlight the different epidemiological, clinical-prognostic, and therapeutic profiles of postoperative peritonitis according to the context, emergency or scheduled surgery, of the initial operation.
    Materials and methods - This is a retrospective and prospective study that included 127 patients operated on for postoperative peritonitis in the surgical emergency department of Oran. The initial operation was performed in an emergency context in 83 patients (65.4%), while 44 patients (34.6%) were initially operated on in a scheduled-surgery setting.
    Results - The mean age of patients in the emergency-context group [GCU] was 46.6 years versus 49.9 years for patients operated on in a scheduled-surgery context [GCP]. A male predominance of 59.0% was observed in the [GCU] versus 52.2% in the [GCP]. The degree of septicity of the operations, according to the Altemeier classification, showed that initial operations of classes III (contaminated) and IV (dirty) represented 90.2% and 100%, respectively, in the [GCU], versus 9.7% and 0% in the [GCP]. Shock was found in 26 patients, of whom 17 had initially been operated on in an emergency context (65%). Mortality was 31.3% in the [GCU] versus 27.2% in the [GCP] (p = 0.635).
    Conclusion - This comparative study highlighted the different epidemiological, diagnostic, and therapeutic characteristics distinguishing the two groups of patients according to whether or not the initial operation was performed as an emergency.

    Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices

    The field of Natural Language Processing (NLP) is currently undergoing a revolutionary transformation driven by pre-trained Large Language Models (LLMs) built on the groundbreaking Transformer architecture. As the frequency and diversity of cybersecurity attacks continue to rise, incident detection has become increasingly important. IoT devices are expanding rapidly, creating a growing need for efficient techniques that autonomously identify network-based attacks in IoT networks with both high precision and minimal computational requirements. This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Encoder Representations from Transformers (BERT) model for cyber threat detection in IoT networks. During the training of SecurityBERT, we incorporated a novel privacy-preserving encoding technique called Privacy-Preserving Fixed-Length Encoding (PPFLE). By combining PPFLE with the Byte-level Byte-Pair Encoder (BBPE) tokenizer, we effectively represented network traffic data in a structured format. Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), in cyber threat detection. On the Edge-IIoTset cybersecurity dataset, our experimental analysis shows that SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions such as GAN-Transformer-based architectures and CNN-LSTM models. With an inference time of less than 0.15 seconds on an average CPU and a compact model size of just 16.7 MB, SecurityBERT is ideally suited for real-life traffic analysis and a suitable choice for deployment on resource-constrained IoT devices.
    Comment: This paper has been accepted for publication in IEEE Access: http://dx.doi.org/10.1109/ACCESS.2024.336346
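    The abstract does not specify PPFLE's internals, so the following is only a hedged sketch of the general idea it names: mapping network-flow features into a fixed-length, non-invertible textual form that a BERT-style byte-level tokenizer can consume. The field names, hash scheme, and token width are all assumptions for illustration.

    ```python
    # Generic fixed-length, privacy-preserving encoding sketch (NOT the paper's
    # PPFLE): each feature is replaced by a fixed-width hash token, and missing
    # features by a pad token, so every flow yields a string of identical shape.
    import hashlib

    def encode_flow(flow: dict, fields: list[str], width: int = 8) -> str:
        """Encode a flow as one fixed-width hash token per expected field."""
        tokens = []
        for name in fields:
            value = flow.get(name)
            if value is None:
                tokens.append("0" * width)       # constant pad token
            else:
                digest = hashlib.sha256(f"{name}={value}".encode()).hexdigest()
                tokens.append(digest[:width])    # fixed-width, hard to invert
        return " ".join(tokens)

    # Hypothetical feature schema for demonstration purposes.
    FIELDS = ["src_port", "dst_port", "proto", "pkt_len", "tcp_flags"]
    line = encode_flow({"src_port": 443, "proto": "tcp", "pkt_len": 1514}, FIELDS)
    print(line)  # five 8-hex-char tokens, same layout for every flow
    ```

    A byte-level BPE tokenizer can then treat such lines as ordinary text, which is consistent with the structured representation the abstract describes.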

    Adversarial Attacks and Defenses in 6G Network-Assisted IoT Systems

    The Internet of Things (IoT) and massive IoT systems are key to sixth-generation (6G) networks due to dense connectivity, ultra-reliability, low latency, and high throughput. Artificial intelligence, including deep learning and machine learning, offers solutions for optimizing and deploying cutting-edge technologies for future radio communications. However, these techniques are vulnerable to adversarial attacks, leading to degraded performance and erroneous predictions, outcomes unacceptable for ubiquitous networks. This survey extensively addresses adversarial attacks and defense methods in 6G network-assisted IoT systems. The theoretical background and up-to-date research on adversarial attacks and defenses are discussed. Furthermore, we provide Monte Carlo simulations to validate the effectiveness of adversarial attacks compared to jamming attacks. Additionally, we examine the vulnerability of 6G IoT systems by demonstrating attack strategies applicable to key technologies, including reconfigurable intelligent surfaces, massive multiple-input multiple-output (MIMO)/cell-free massive MIMO, satellites, the metaverse, and semantic communications. Finally, we outline the challenges and future developments associated with adversarial attacks and defenses in 6G IoT systems.
    Comment: 17 pages, 5 figures, and 4 tables. Submitted for publication
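    The adversarial-versus-jamming comparison the abstract mentions can be sketched on a toy model (an assumption-laden illustration, not the paper's simulation): a fast-gradient-sign (FGSM-style) perturbation steers a linear classifier's input in the worst-case direction, while jamming-like random noise of the same power is far less damaging.

    ```python
    # FGSM-style adversarial perturbation vs. random "jamming-like" noise of
    # equal per-coordinate power, on a toy fixed linear classifier.
    import numpy as np

    rng = np.random.default_rng(0)

    w, b = np.array([2.0, -1.0, 0.5]), 0.1      # a fixed "trained" linear model

    def loss(x, y):
        """Logistic loss of the linear model on one example (x, y in {-1, +1})."""
        z = w @ x + b
        return float(np.log1p(np.exp(-y * z)))

    def fgsm(x, y, eps):
        """Move x along the sign of the loss gradient: the worst-case direction."""
        z = w @ x + b
        grad_x = -y * w / (1.0 + np.exp(y * z))  # d(loss)/dx for logistic loss
        return x + eps * np.sign(grad_x)

    x, y, eps = np.array([0.3, -0.2, 1.0]), 1.0, 0.3
    x_adv = fgsm(x, y, eps)
    x_jam = x + eps * np.sign(rng.standard_normal(3))  # random equal-power noise

    print("clean:", loss(x, y), "jamming:", loss(x_jam, y), "adversarial:", loss(x_adv, y))
    ```

    For a linear model the sign-of-gradient direction is provably the worst sign perturbation of a given budget, which is why targeted adversarial noise outperforms undirected jamming here.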

    Edge Learning for 6G-enabled Internet of Things: A Comprehensive Survey of Vulnerabilities, Datasets, and Defenses

    The ongoing deployment of fifth-generation (5G) wireless networks constantly reveals limitations of their original concept as a key driver of Internet of Everything (IoE) applications. These 5G challenges are behind worldwide efforts to enable future networks, such as sixth-generation (6G) networks, to efficiently support sophisticated applications ranging from autonomous driving to the Metaverse. Edge learning is a new and powerful approach to training models across distributed clients while protecting the privacy of their data. This approach is expected to be embedded within future network infrastructures, including 6G, to solve challenging problems such as resource management and behavior prediction. This survey article provides a holistic review of the most recent research on edge learning vulnerabilities and defenses for 6G-enabled IoT. We summarize the existing surveys on machine learning for 6G IoT security and machine learning-associated threats in three different learning modes: centralized, federated, and distributed. Then, we provide an overview of enabling emerging technologies for 6G IoT intelligence. Moreover, we provide a holistic survey of existing research on attacks against machine learning and classify threat models into eight categories: backdoor attacks, adversarial examples, combined attacks, poisoning attacks, Sybil attacks, Byzantine attacks, inference attacks, and dropping attacks. In addition, we provide a comprehensive and detailed taxonomy and a side-by-side comparison of state-of-the-art defense methods against edge learning vulnerabilities. Finally, we discuss open research directions and future prospects for 6G-enabled IoT as new attacks and defense technologies emerge.
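    One threat model from the taxonomy above, the Byzantine attack on federated aggregation, can be sketched in a few lines (a toy illustration with assumed values, not an example from the survey): a single malicious client corrupts naive federated averaging, while a coordinate-wise median aggregator stays robust.

    ```python
    # Byzantine attack on federated averaging, plus a simple robust defense.
    # Nine honest clients send updates near the true value [1, 2]; one
    # Byzantine client sends an arbitrary extreme update.
    import numpy as np

    honest = [np.array([1.0, 2.0]) + 0.1 * np.random.default_rng(i).standard_normal(2)
              for i in range(9)]                  # nine honest client updates
    byzantine = [np.array([1000.0, -1000.0])]     # one malicious update

    updates = np.stack(honest + byzantine)

    mean_agg = updates.mean(axis=0)               # naive FedAvg-style aggregation
    median_agg = np.median(updates, axis=0)       # coordinate-wise median defense

    print("mean:  ", mean_agg)    # dragged far from [1, 2] by the attacker
    print("median:", median_agg)  # stays near the honest consensus
    ```

    Median-style aggregation tolerates a minority of arbitrary updates, which is the intuition behind many of the Byzantine-robust defenses such surveys compare.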

    Individual motile CD4+ T cells can participate in efficient multikilling through conjugation to multiple tumor cells

    T cells genetically modified to express a CD19-specific chimeric antigen receptor (CAR) for the investigational treatment of B-cell malignancies comprise a heterogeneous population, and their ability to persist and participate in serial killing of tumor cells is a predictor of therapeutic success. We implemented Timelapse Imaging Microscopy in Nanowell Grids (TIMING) to provide direct evidence that CD4+CAR+ T cells (CAR4 cells) can engage in multikilling via simultaneous conjugation to multiple tumor cells. Comparisons of the CAR4 cells and CD8+CAR+ T cells (CAR8 cells) demonstrate that, although CAR4 cells can participate in killing and multikilling, they do so at slower rates, likely due to the lower granzyme B content. Significantly, in both sets of T cells, a minor subpopulation of individual T cells identified by their high motility demonstrated efficient killing of single tumor cells. A comparison of the multikiller and single-killer CAR+ T cells revealed that the propensity and kinetics of T-cell apoptosis were modulated by the number of functional conjugations. T cells underwent rapid apoptosis, and at higher frequencies, when conjugated to single tumor cells in isolation, and this effect was more pronounced on CAR8 cells. Our results suggest that the ability of CAR+ T cells to participate in multikilling should be evaluated in the context of their ability to resist activation-induced cell death. We anticipate that TIMING may be used to rapidly determine the potency of T-cell populations and may facilitate the design and manufacture of next-generation CAR+ T cells with improved efficacy. Cancer Immunol Res; 3(5); 473–82. ©2015 AACR

    Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

    As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often simplistic, allowing models to perform uniformly well and making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs that the models might memorize or guess. To address these limitations, we introduce Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying dataset, DIA-Bench, contains a diverse collection of challenge templates with mutable parameters presented in various formats, including text, PDFs, compiled binaries, visual puzzles, and CTF-style cybersecurity challenges. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, API models like GPT-4o often overestimated their mathematical capabilities, while ChatGPT-4o demonstrated better performance due to effective tool usage. In self-assessment, OpenAI's o1-mini proved to have the best judgement on what tasks it should attempt to solve. We evaluated 25 state-of-the-art LLMs using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its limitations. The dataset is publicly available on the project's page: https://github.com/DIA-Bench
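    The core idea of dynamic question templates can be sketched as follows (a hedged illustration of the concept, not the DIA-Bench implementation; the template, oracle model, and reliability metric are all assumptions): every evaluation draws fresh parameters, so a model cannot score well by memorizing static question-answer pairs, and repeated attempts expose how reliable it really is.

    ```python
    # A mutable-parameter challenge template plus a reliability metric over
    # multiple freshly generated instances.
    import math
    import random

    def gcd_template(rng):
        """Instantiate one challenge: question text plus its ground-truth answer."""
        a, b = rng.randint(10**3, 10**6), rng.randint(10**3, 10**6)
        question = f"What is gcd({a}, {b})?"
        return question, math.gcd(a, b)

    def reliability(model, template, attempts=5, seed=0):
        """Fraction of freshly generated instances the model answers correctly."""
        rng = random.Random(seed)
        correct = 0
        for _ in range(attempts):
            question, truth = template(rng)
            correct += int(model(question) == truth)
        return correct / attempts

    # A "perfect" stand-in model that parses the question and computes the answer.
    def oracle(question):
        a, b = map(int, question[len("What is gcd("):-2].split(", "))
        return math.gcd(a, b)

    print(reliability(oracle, gcd_template))  # → 1.0
    ```

    A model that merely memorized one instance would fail on the next draw, which is exactly the gap in reliability the benchmark's metrics are designed to surface.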

    Design of Low-Profile Ultrawideband Antennas for Body-Centric Communications Using the FDTD Method
