How Bad is Training on Synthetic Data? A Statistical Analysis of Language Model Collapse
The phenomenon of model collapse, introduced in (Shumailov et al., 2023),
refers to the deterioration in performance that occurs when new models are
trained on synthetic data generated from previously trained models. This
recursive training loop makes the tails of the original distribution disappear,
thereby making future-generation models forget about the initial (real)
distribution. With the aim of rigorously understanding model collapse in
language models, we consider in this paper a statistical model that allows us
to characterize the impact of various recursive training scenarios.
Specifically, we demonstrate that model collapse cannot be avoided when
training solely on synthetic data. However, when mixing both real and synthetic
data, we provide an estimate of the maximal amount of synthetic data below which
model collapse can eventually be avoided. Our theoretical conclusions are further
supported by empirical validation.
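As a toy illustration of the recursion described above (a minimal sketch, not the paper's statistical model), the following simulation fits a Gaussian to data, resamples purely synthetic data from the fit, and repeats; the shrinking standard deviation shows the tails of the original distribution disappearing across generations.

```python
import numpy as np

# Toy illustration (not the paper's statistical model): fit a Gaussian to the
# current data, resample purely synthetic data from the fit, and repeat.
# The fitted standard deviation shrinks generation after generation, i.e. the
# tails of the original (real) distribution disappear.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=1_000)   # "real" data

for generation in range(10):
    mu, sigma = data.mean(), data.std()
    print(f"generation {generation}: fitted sigma = {sigma:.3f}")
    # The next-generation model is trained solely on synthetic samples.
    data = rng.normal(loc=mu, scale=sigma, size=1_000)

# Mixing a bounded fraction of synthetic data with fresh real data instead
# (e.g. 80% real / 20% synthetic per generation) keeps sigma close to 1.0,
# in line with the paper's claim that collapse can then be avoided.
```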
Comparative Study of Postoperative Peritonitis According to the Emergency or Elective Setting of the Initial Operation
Abstract
Introduction - The aim of this study is to highlight the different epidemiological, clinical-prognostic, and therapeutic profiles of postoperative peritonitis according to the initial setting, emergency or elective, of the first operation.
Materials and methods - This retrospective and prospective study collected 127 patients operated on for postoperative peritonitis at the surgical emergency department of Oran. The first operation had been performed in an emergency setting in 83 patients (65.4%), while 44 patients (34.6%) had initially been operated on in an elective-surgery setting.
Results - The mean age of patients in the emergency-setting group [GCU] was 46.6 years versus 49.9 years for patients operated on in an elective setting [GCP]. A male predominance of 59.0% was observed in the [GCU] versus 52.2% in the [GCP]. The septicity grade of the operations, according to the Altemeier classification, showed that initial operations of classes III (contaminated) and IV (dirty) accounted for 90.2% and 100% respectively in the [GCU], versus 9.7% and 0% in the [GCP]. Shock was found in 26 patients, 17 of whom (65%) had initially been operated on in an emergency setting. Mortality was 31.3% in the [GCU] versus 27.2% in the [GCP] (P = 0.635).
Conclusion - This comparative study highlighted the epidemiological, diagnostic, and therapeutic characteristics distinguishing the two groups of patients according to whether or not the first operation was performed in an emergency setting.
Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
The field of Natural Language Processing (NLP) is currently undergoing a
revolutionary transformation driven by the power of pre-trained Large Language
Models (LLMs) based on groundbreaking Transformer architectures. As the
frequency and diversity of cybersecurity attacks continue to rise, the
importance of incident detection has significantly increased. IoT devices are
expanding rapidly, resulting in a growing need for efficient techniques to
autonomously identify network-based attacks in IoT networks with both high
precision and minimal computational requirements. This paper presents
SecurityBERT, a novel architecture that leverages the Bidirectional Encoder
Representations from Transformers (BERT) model for cyber threat detection in
IoT networks. During the training of SecurityBERT, we incorporated a novel
privacy-preserving encoding technique called Privacy-Preserving Fixed-Length
Encoding (PPFLE). We effectively represented network traffic data in a
structured format by combining PPFLE with the Byte-level Byte-Pair Encoder
(BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms
traditional Machine Learning (ML) and Deep Learning (DL) methods, such as
Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in
cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our
experimental analysis shows that SecurityBERT achieved an impressive 98.2%
overall accuracy in identifying fourteen distinct attack types, surpassing
previous records set by hybrid solutions such as GAN-Transformer-based
architectures and CNN-LSTM models. With an inference time of less than 0.15
seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT
is well suited for real-life traffic analysis and for deployment on
resource-constrained IoT devices.
Comment: This paper has been accepted for publication in IEEE Access:
http://dx.doi.org/10.1109/ACCESS.2024.336346
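The abstract does not spell out PPFLE, so the sketch below is only a hypothetical stand-in: the encode_flow helper and its hashing/bucketing scheme are invented for illustration. It maps each network flow to a fixed-length token string and then trains a byte-level BPE tokenizer on the encoded flows, mirroring the PPFLE + BBPE pipeline at a high level.

```python
import zlib
from tokenizers import ByteLevelBPETokenizer  # pip install tokenizers

# Hypothetical stand-in for PPFLE (encode_flow and its bucketing are invented
# for illustration): map every network flow to a fixed-length, privacy-friendly
# token string by hashing feature names and coarsely bucketizing values.
def encode_flow(features: dict, width: int = 16) -> str:
    tokens = [f"{zlib.crc32(name.encode()) % 997:03d}:{int(value) % 256:03d}"
              for name, value in sorted(features.items())]
    tokens = (tokens + ["000:000"] * width)[:width]  # pad/truncate to fixed length
    return " ".join(tokens)

flows = [
    {"pkt_len": 1500, "dst_port": 80, "tcp_flags": 24},
    {"pkt_len": 60, "dst_port": 53, "tcp_flags": 2},
]
corpus = [encode_flow(f) for f in flows]

# Byte-level BPE tokenizer (the BBPE component) trained on the encoded flows;
# its token ids would then feed a compact BERT-style classifier.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(corpus, vocab_size=512, min_frequency=1)
print(tokenizer.encode(corpus[0]).ids[:10])
```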
Adversarial Attacks and Defenses in 6G Network-Assisted IoT Systems
The Internet of Things (IoT) and massive IoT systems are key to
sixth-generation (6G) networks due to dense connectivity, ultra-reliability,
low latency, and high throughput. Artificial intelligence, including deep
learning and machine learning, offers solutions for optimizing and deploying
cutting-edge technologies for future radio communications. However, these
techniques are vulnerable to adversarial attacks, leading to degraded
performance and erroneous predictions, outcomes unacceptable for ubiquitous
networks. This survey extensively addresses adversarial attacks and defense
methods in 6G network-assisted IoT systems. The theoretical background and
up-to-date research on adversarial attacks and defenses are discussed.
Furthermore, we provide Monte Carlo simulations to validate the effectiveness
of adversarial attacks compared to jamming attacks. Additionally, we examine
the vulnerability of 6G IoT systems by demonstrating attack strategies
applicable to key technologies, including reconfigurable intelligent surfaces,
massive multiple-input multiple-output (MIMO)/cell-free massive MIMO,
satellites, the metaverse, and semantic communications. Finally, we outline the
challenges and future developments associated with adversarial attacks and
defenses in 6G IoT systems.
Comment: 17 pages, 5 figures, and 4 tables. Submitted for publication.
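To give a rough sense of why adversarial perturbations can outperform jamming of equal power (a simplified Monte Carlo sketch, not the survey's simulation setup), the following compares a gradient-aligned, FGSM-style perturbation against random jamming on a toy linear classifier.

```python
import numpy as np

# Simplified Monte Carlo sketch (not the survey's exact setup): compare the
# accuracy loss caused by a gradient-aligned, FGSM-style adversarial
# perturbation with random jamming of equal per-feature power on a toy
# linear classifier.
rng = np.random.default_rng(1)
d, n, eps = 32, 5_000, 0.5
w = rng.normal(size=d)                      # "trained" linear model
X = rng.normal(size=(n, d))
y = np.sign(X @ w)                          # clean labels from the model itself

def accuracy(X_pert):
    return float(np.mean(np.sign(X_pert @ w) == y))

adv = X - eps * np.outer(y, np.sign(w))     # pushes each sample against its label
jam = X + eps * rng.choice([-1.0, 1.0], size=X.shape)

print(f"clean accuracy    : {accuracy(X):.3f}")
print(f"under jamming     : {accuracy(jam):.3f}")
print(f"under adversarial : {accuracy(adv):.3f}")  # typically far lower
```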
Edge Learning for 6G-enabled Internet of Things: A Comprehensive Survey of Vulnerabilities, Datasets, and Defenses
The ongoing deployment of the fifth generation (5G) wireless networks
constantly reveals limitations concerning its original concept as a key driver
of Internet of Everything (IoE) applications. These 5G challenges are behind
worldwide efforts to enable future networks, such as sixth generation (6G)
networks, to efficiently support sophisticated applications ranging from
autonomous driving capabilities to the Metaverse. Edge learning is a new and
powerful approach to training models across distributed clients while
protecting the privacy of their data. This approach is expected to be embedded
within future network infrastructures, including 6G, to solve challenging
problems such as resource management and behavior prediction. This survey
article provides a holistic review of the most recent research focused on edge
learning vulnerabilities and defenses for 6G-enabled IoT. We summarize the
existing surveys on machine learning for 6G IoT security and machine
learning-associated threats in three different learning modes: centralized,
federated, and distributed. Then, we provide an overview of enabling emerging
technologies for 6G IoT intelligence. Moreover, we provide a holistic survey of
existing research on attacks against machine learning and classify threat
models into eight categories, including backdoor attacks, adversarial examples,
combined attacks, poisoning attacks, Sybil attacks, byzantine attacks,
inference attacks, and dropping attacks. In addition, we provide a
comprehensive and detailed taxonomy and a side-by-side comparison of the
state-of-the-art defense methods against edge learning vulnerabilities.
Finally, as new attacks and defense technologies continue to emerge, we discuss open
research directions and the overall outlook for 6G-enabled IoT.
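To make one of the surveyed threat categories concrete (a toy sketch, not taken from the survey), the snippet below shows a poisoning/Byzantine update in federated averaging and the coordinate-wise median as a simple robust-aggregation defense.

```python
import numpy as np

# Toy sketch (not taken from the survey): federated averaging of client model
# updates with one Byzantine client sending a poisoned update, and the
# coordinate-wise median as a simple robust-aggregation defense.
rng = np.random.default_rng(2)
true_update = np.ones(10)

honest = [true_update + 0.1 * rng.normal(size=10) for _ in range(9)]
poisoned = -50.0 * true_update              # malicious client negates and scales
updates = honest + [poisoned]

fedavg = np.mean(updates, axis=0)           # plain averaging is dragged off course
median = np.median(updates, axis=0)         # median stays near the honest consensus

print("FedAvg aggregate:", np.round(fedavg, 2))
print("Median aggregate:", np.round(median, 2))
```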
Individual motile CD4+ T cells can participate in efficient multikilling through conjugation to multiple tumor cells
T cells genetically modified to express a CD19-specific chimeric antigen receptor (CAR) for the investigational treatment of B-cell malignancies comprise a heterogeneous population, and their ability to persist and participate in serial killing of tumor cells is a predictor of therapeutic success. We implemented Timelapse Imaging Microscopy in Nanowell Grids (TIMING) to provide direct evidence that CD4+CAR+ T cells (CAR4 cells) can engage in multikilling via simultaneous conjugation to multiple tumor cells. Comparisons of the CAR4 cells and CD8+CAR+ T cells (CAR8 cells) demonstrate that, although CAR4 cells can participate in killing and multikilling, they do so at slower rates, likely due to the lower granzyme B content. Significantly, in both sets of T cells, a minor subpopulation of individual T cells identified by their high motility demonstrated efficient killing of single tumor cells. A comparison of the multikiller and single-killer CAR+ T cells revealed that the propensity and kinetics of T-cell apoptosis were modulated by the number of functional conjugations. T cells underwent rapid apoptosis, and at higher frequencies, when conjugated to single tumor cells in isolation, and this effect was more pronounced on CAR8 cells. Our results suggest that the ability of CAR+ T cells to participate in multikilling should be evaluated in the context of their ability to resist activation-induced cell death. We anticipate that TIMING may be used to rapidly determine the potency of T-cell populations and may facilitate the design and manufacture of next-generation CAR+ T cells with improved efficacy. Cancer Immunol Res; 3(5); 473–82. ©2015 AACR.
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often simplistic, allowing models to perform uniformly well and making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs that the models might memorize or guess. To address these limitations, we introduce Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying dataset, DIA-Bench, contains a diverse collection of challenge templates with mutable parameters presented in various formats, including text, PDFs, compiled binaries, visual puzzles, and CTF-style cybersecurity challenges. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, API models like GPT-4o often overestimated their mathematical capabilities, while ChatGPT-4o demonstrated better performance due to effective tool usage. In self-assessment, OpenAI's o1-mini proved to have the best judgement on what tasks it should attempt to solve. We evaluated 25 state-of-the-art LLMs using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its limitations. The dataset is publicly available on the project's page: https://github.com/DIA-Bench
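The snippet below illustrates the idea of dynamic question templates and a per-template reliability score; gcd_template, reliability, and the metric's exact form are simplifications invented here, not DIA-Bench's own definitions.

```python
import math
import random

# Illustrative sketch (gcd_template, reliability, and the metric's exact form
# are simplifications, not DIA-Bench's own definitions): instantiate a question
# template with fresh parameters on every attempt and score a model by the
# fraction of re-parameterized instances it solves.
def gcd_template(rng: random.Random) -> tuple:
    a, b = rng.randint(100, 999), rng.randint(100, 999)
    return f"What is gcd({a}, {b})? Answer with one integer.", str(math.gcd(a, b))

def reliability(model, template, attempts: int = 5, seed: int = 0) -> float:
    rng = random.Random(seed)
    correct = sum(model(question).strip() == answer
                  for question, answer in (template(rng) for _ in range(attempts)))
    return correct / attempts

# A model that memorized a single static answer scores poorly once the
# parameters change between attempts.
print(reliability(lambda question: "42", gcd_template))
```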