Federated learning enables big data for rare cancer boundary detection
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even infeasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
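As a rough illustration of "only sharing numerical model updates", the sketch below averages per-site parameter vectors on a server, weighted by each site's sample count. The abstract does not state which aggregation rule the study used; sample-size-weighted averaging (in the style of FedAvg) is an assumption here, and the function name `federated_average` and the toy numbers are hypothetical.

```python
def federated_average(client_updates, client_sizes):
    """Average per-client parameter vectors, weighted by local sample count.

    Assumption: each client sends a flat list of floats of equal length.
    Raw data never leaves the client; only these numbers are shared.
    """
    total = sum(client_sizes)
    dim = len(client_updates[0])
    return [
        sum(update[i] * size for update, size in zip(client_updates, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites: one with 1 local sample, one with 3.
updates = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
global_update = federated_average(updates, sizes)
```

The site with more samples pulls the consensus parameters toward its own update, which is one common way to balance contributions from institutions of very different sizes.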
A Survey on Secure and Private Federated Learning Using Blockchain: Theory and Application in Resource-constrained Computing
Federated Learning (FL) has gained widespread popularity in recent years due
to the fast booming of advanced machine learning and artificial intelligence
along with emerging security and privacy threats. FL enables efficient model
generation from local data storage of the edge devices without revealing the
sensitive data to any entities. While this paradigm partly mitigates the
privacy issues of users' sensitive data, the performance of the FL process can
be threatened and reach a bottleneck due to growing cyber threats and
privacy violation techniques. To expedite the proliferation of the FL process, the
integration of blockchain into FL environments has drawn considerable attention from
academia and industry. Blockchain has the potential to prevent
security and privacy threats with its decentralization, immutability,
consensus, and transparency characteristics. However, if the blockchain
mechanism requires costly computational resources, then the
resource-constrained FL clients cannot be involved in the training. Considering
that, this survey focuses on reviewing the challenges, solutions, and future
directions for the successful deployment of blockchain in resource-constrained
FL environments. We comprehensively review various blockchain mechanisms that
are suitable for the FL process and discuss their trade-offs for a limited resource
budget. Further, we extensively analyze the cyber threats that could be
observed in a resource-constrained FL environment, and how blockchain can play
a key role in blocking those cyber attacks. To this end, we highlight some
potential solutions towards the coupling of blockchain and federated learning
that can offer high levels of reliability, data privacy, and distributed
computing performance.
Towards understanding privacy-aware artificial intelligence
National Technical University of Athens -- Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Program (D.P.M.S.) "Data Science and Machine Learning"
Quality of Information in Mobile Crowdsensing: Survey and Research Challenges
Smartphones have become the most pervasive devices in people's lives, and are
clearly transforming the way we live and perceive technology. Today's
smartphones benefit from almost ubiquitous Internet connectivity and come
equipped with a plethora of inexpensive yet powerful embedded sensors, such as
accelerometer, gyroscope, microphone, and camera. This unique combination has
enabled revolutionary applications based on the mobile crowdsensing paradigm,
such as real-time road traffic monitoring, air and noise pollution, crime
control, and wildlife monitoring, to name just a few. Unlike prior
sensing paradigms, humans are now the primary actors of the sensing process,
since they become fundamental in retrieving reliable and up-to-date information
about the event being monitored. As humans may behave unreliably or
maliciously, assessing and guaranteeing Quality of Information (QoI) becomes
more important than ever. In this paper, we provide a new framework for
defining and enforcing the QoI in mobile crowdsensing, and analyze in depth the
current state-of-the-art on the topic. We also outline novel research
challenges, along with possible directions of future work. Comment: To appear in ACM Transactions on Sensor Networks (TOSN)
BC4LLM: Trusted Artificial Intelligence When Blockchain Meets Large Language Models
In recent years, artificial intelligence (AI) and machine learning (ML) have been
reshaping society's production methods and productivity, and also changing the
paradigm of scientific research. Among them, AI language models, represented
by ChatGPT, have made great progress. Such large language models (LLMs) serve
people in the form of AI-generated content (AIGC) and are widely used in
consulting, healthcare, and education. However, it is difficult to guarantee
the authenticity and reliability of AIGC learning data. In addition, there are
also hidden dangers of privacy disclosure in distributed AI training. Moreover,
the content generated by LLMs is difficult to identify and trace, and
cross-platform mutual recognition is difficult to achieve. In the coming era of
AI powered by LLMs, these information security issues will be greatly amplified and
affect everyone's life. Therefore, we consider empowering LLMs using blockchain
technology with superior security features to propose a vision for trusted AI.
This paper mainly introduces the motivation and technical route of blockchain
for LLM (BC4LLM), including reliable learning corpus, secure training process,
and identifiable generated content. Meanwhile, this paper also reviews the
potential applications and future challenges, especially in the frontier
communication networks field, including network resource allocation, dynamic
spectrum sharing, and semantic communication. Combined with the prospects
of blockchain and LLMs, this work is expected to help the early
realization of trusted AI and provide guidance for the academic community.
Private Semi-supervised Knowledge Transfer for Deep Learning from Noisy Labels
Deep learning models trained on large-scale data have achieved encouraging
performance in many real-world tasks. Meanwhile, publishing those models
trained on sensitive datasets, such as medical records, could pose serious
privacy concerns. To counter these issues, one of the current state-of-the-art
approaches is the Private Aggregation of Teacher Ensembles, or PATE, which
achieved promising results in preserving the utility of the model while
providing a strong privacy guarantee. PATE combines an ensemble of "teacher
models" trained on sensitive data and transfers the knowledge to a "student"
model through the noisy aggregation of teachers' votes for labeling unlabeled
public data which the student model will be trained on. However, the knowledge
or voted labels learned by the student are noisy due to private aggregation.
Learning directly from noisy labels can significantly impact the accuracy of
the student model.
In this paper, we propose the PATE++ mechanism, which combines the current
advanced noisy label training mechanisms with the original PATE framework to
enhance its accuracy. A novel structure of Generative Adversarial Nets (GANs)
is developed in order to integrate them effectively. In addition, we develop a
novel noisy label detection mechanism for semi-supervised model training to
further improve student model performance when training with noisy labels. We
evaluate our method on Fashion-MNIST and SVHN to show the improvements over the
original PATE on all measures.
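The noisy aggregation of teachers' votes described in this abstract can be sketched as follows: count the teachers' predicted labels for one unlabeled public sample, add random noise to each count, and give the student the label with the highest noisy count. Laplace noise is assumed here (the abstract does not name the noise distribution), and `noisy_aggregate`, `laplace_noise`, and the example votes are hypothetical names and data for illustration only. This covers only the original PATE labeling step, not the PATE++ additions.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_aggregate(teacher_votes, num_classes, noise_scale, rng):
    """Return the class whose noise-perturbed vote count is largest."""
    counts = Counter(teacher_votes)
    noisy_counts = [
        counts.get(label, 0) + laplace_noise(noise_scale, rng)
        for label in range(num_classes)
    ]
    return max(range(num_classes), key=lambda label: noisy_counts[label])


rng = random.Random(0)
# Hypothetical predictions from ten teachers for one public sample.
votes = [2, 2, 2, 1, 2, 0, 2, 2, 1, 2]
student_label = noisy_aggregate(votes, num_classes=3, noise_scale=1.0, rng=rng)
```

A larger `noise_scale` gives stronger privacy but makes it more likely that the student receives a wrong label, which is the noisy-label problem the abstract sets out to address.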