Towards Sybil Resilience in Decentralized Learning
Federated learning is a privacy-preserving machine learning technology, but it
suffers from limited scalability. This limitation mostly originates from the
network bandwidth and memory capacity of the central parameter server, and from
the complexity of the model aggregation function. Decentralized learning has
recently emerged as a promising alternative to federated learning. This novel
technology eliminates the need for a central parameter server by decentralizing
the model aggregation across all participating nodes. Numerous studies have
been conducted on improving the resilience of federated learning against
poisoning and Sybil attacks, whereas the resilience of decentralized learning
remains largely unstudied. This research gap is the main motivation for this
study, whose objective is to improve the Sybil poisoning resilience of
decentralized learning.
We present SybilWall, an innovative algorithm focused on increasing the
resilience of decentralized learning against targeted Sybil poisoning attacks.
By combining a Sybil-resistant aggregation function based on similarity between
Sybils with a novel probabilistic gossiping mechanism, we establish a new
benchmark for scalable, Sybil-resilient decentralized learning.
A comprehensive empirical evaluation demonstrated that SybilWall outperforms
existing state-of-the-art solutions designed for federated learning scenarios
and is the only algorithm to obtain consistent accuracy over a range of
adversarial attack scenarios. We also found that SybilWall diminishes the
utility of creating many Sybils, as our evaluations show a higher success rate
among adversaries employing fewer Sybils. Finally, we suggest a number of
possible improvements to SybilWall and highlight promising future research
directions.
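The abstract does not reproduce SybilWall's actual aggregation rule. As a purely illustrative sketch of the similarity-based idea it summarizes (down-weighting updates that look suspiciously alike, in the spirit of FoolsGold-style defenses), consider the following Python fragment; the function name, suspicion score, and weighting scheme are assumptions, not the published algorithm.

    import numpy as np

    def similarity_aware_aggregate(updates):
        # `updates`: list of flattened model updates received from neighbors.
        # Sybil clones pushing the same poisoned objective tend to submit
        # near-identical updates, so high pairwise similarity is suspicious.
        U = np.stack(updates)                           # (n_nodes, dim)
        unit = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
        cos = unit @ unit.T                             # pairwise cosine similarity
        np.fill_diagonal(cos, 0.0)
        suspicion = np.clip(cos.max(axis=1), 0.0, 1.0)  # max similarity to any peer
        weights = 1.0 - suspicion                       # trust dissimilar updates more
        weights = weights / (weights.sum() + 1e-12)
        return (weights[:, None] * U).sum(axis=0)

The probabilistic gossiping component would then decide which neighbors receive the aggregated model in each round; the abstract gives no detail on that mechanism, so it is not sketched here.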
Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, through sample treatment, to the measurements performed. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks.
Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains applying multimedia
processing and machine learning has marked a new era for edge and cloud
computing. These applications involve massive data and compute-intensive tasks,
and thus, typical computing paradigms in embedded systems and data centers are
stressed to meet the worldwide demand for high performance. Concurrently, over
the last 15 years the semiconductor landscape has established power as a
first-class design concern. As a result, the computing systems community has
been forced to find alternative design approaches that facilitate
high-performance and/or power-efficient computing. Among the examined
solutions, Approximate Computing has attracted an ever-increasing interest,
with research works applying approximations across the entire traditional
computing stack, i.e., at the software, hardware, and architectural levels.
Over the last decade, a plethora of approximation techniques has emerged in
software (programs, frameworks, compilers, runtimes, languages), hardware
(circuits, accelerators), and architectures (processors, memories). The current
article is Part I of our comprehensive survey on Approximate Computing: it
reviews the field's motivation, terminology, and principles, and it classifies
and presents the technical details of state-of-the-art software and hardware
approximation techniques.
Comment: Under review at ACM Computing Surveys.
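To make the software-level category concrete, a classic technique from this literature is loop perforation: executing only a subset of loop iterations to trade a bounded accuracy loss for speed. A minimal Python illustration, not drawn from any specific surveyed work:

    def mean_perforated(xs, skip=2):
        # Loop perforation: visit only every `skip`-th element, trading a
        # bounded accuracy loss for roughly `skip`-fold less work.
        sampled = xs[::skip]
        return sum(sampled) / len(sampled)

    data = list(range(1_000_000))
    exact = sum(data) / len(data)            # 499999.5
    approx = mean_perforated(data, skip=2)   # 499999.0, at half the iterations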
Evaluation of different segmentation-based approaches for skin disorders from dermoscopic images
Bachelor's thesis (Treball Final de Grau) in Biomedical Engineering, Faculty of Medicine and Health Sciences, Universitat de Barcelona. Academic year 2022-2023. Tutor/Director: Sala Llonch, Roser; Mata Miquel, Christian; Munuera, Josep.
Skin cancer is the most common cancer in the world, and its incidence has been increasing over the past decades. Even with the most complex and advanced technologies, current image acquisition systems do not permit reliable identification of skin lesions by visual examination alone, owing to the challenging structure of the malignancy. This motivates the implementation of automatic skin lesion segmentation methods, both to assist physicians in delineating the lesion region and to serve as a preliminary step for classifying the lesion. Accurate and precise segmentation is crucial for rigorous screening and for monitoring the disease's progression.
To address this concern, the present project conducts a state-of-the-art review of the most prevalent conventional segmentation models for skin lesions, together with a market analysis. With the rise of automatic segmentation tools, a wide range of algorithms is currently in use, but many have drawbacks when applied to dermatological disorders because of the high prevalence of artefacts in the acquired images.
In light of the above, three segmentation techniques were selected for this work: the level set method, an algorithm combining the GrabCut and k-means methods, and an intensity-based automatic algorithm developed by the research group of Hospital Sant Joan de Déu de Barcelona. In addition, their performance is validated with a view to their future implementation in clinical practice. The proposals and the results obtained are based on a publicly available skin lesion image database.
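The thesis implementation is not reproduced here, but the GrabCut-plus-k-means combination it evaluates can be sketched with OpenCV. The pipeline below is a hypothetical reconstruction; the cluster count, darkest-cluster seeding heuristic, and iteration counts are assumptions rather than the thesis's actual settings.

    import cv2
    import numpy as np

    def segment_lesion(image_bgr, k=2):
        # Step 1: k-means on pixel colors gives a coarse foreground guess
        # (lesions are typically darker than the surrounding skin).
        h, w = image_bgr.shape[:2]
        pixels = image_bgr.reshape(-1, 3).astype(np.float32)
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
        _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 3,
                                        cv2.KMEANS_PP_CENTERS)
        darkest = int(np.argmin(centers.sum(axis=1)))
        coarse = labels.reshape(h, w) == darkest

        # Step 2: GrabCut refines the coarse mask into the final segmentation.
        mask = np.where(coarse, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
        mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = cv2.GC_BGD
        bgd = np.zeros((1, 65), np.float64)
        fgd = np.zeros((1, 65), np.float64)
        cv2.grabCut(image_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
        return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)

The resulting binary mask can then be scored against the database's ground-truth annotations with overlap metrics such as the Dice coefficient.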
A conceptual framework for developing dashboards for big mobility data
Dashboards are an increasingly popular form of data visualization. Large, complex, and dynamic mobility data present a number of challenges in dashboard design. The overall aim of dashboard design is to improve information communication and decision making, though big mobility data in particular require that privacy be considered alongside size and complexity. Taking these issues into account, a gap remains between wrangling mobility data and developing meaningful dashboard output. Therefore, there is a need for a framework that bridges this gap to support the mobility dashboard development and design process. In this paper we outline a conceptual framework for mobility data dashboards that provides guidance for the development process while considering mobility data structure, volume, complexity, varied application contexts, and privacy constraints. We illustrate the proposed framework's components and process using example mobility dashboards with varied inputs, end-users, and objectives. Overall, the framework offers a basis for developers to understand how informational displays of big mobility data are shaped by end-user needs as well as by the types of data selection, transformation, and display available for particular mobility datasets.
Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives
Deep learning has demonstrated remarkable performance across various tasks in
medical imaging. However, these approaches primarily focus on supervised
learning, assuming that the training and testing data are drawn from the same
distribution. Unfortunately, this assumption may not always hold true in
practice. To address these issues, unsupervised domain adaptation (UDA)
techniques have been developed to transfer knowledge from a labeled domain to a
related but unlabeled domain. In recent years, significant advancements have
been made in UDA, resulting in a wide range of methodologies, including feature
alignment, image translation, self-supervision, and disentangled representation
methods, among others. In this paper, we provide a comprehensive literature
review of recent deep UDA approaches in medical imaging from a technical
perspective. Specifically, we categorize current UDA research in medical
imaging into six groups and further divide them into finer subcategories based
on the different tasks they perform. We also discuss the respective datasets
used in the studies to assess the divergence between the different domains.
Finally, we discuss emerging areas and provide insights and discussion on
future research directions to conclude this survey.
Comment: Under review.
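As a concrete instance of the feature-alignment family mentioned above, adversarial alignment with a gradient reversal layer (popularized by DANN) trains the feature extractor to produce representations a domain classifier cannot tell apart. A minimal PyTorch sketch; the layer sizes and batch are placeholders, not drawn from any reviewed paper:

    import torch
    from torch import nn
    from torch.autograd import Function

    class GradReverse(Function):
        # Identity on the forward pass; negates (and scales) the gradient on
        # the backward pass, so training the domain head to succeed pushes
        # the feature extractor toward domain-invariant features.
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lam * grad_output, None

    features = nn.Sequential(nn.Linear(256, 64), nn.ReLU())  # placeholder extractor
    domain_head = nn.Linear(64, 2)                           # source vs. target

    x = torch.randn(8, 256)              # mixed batch from both domains
    domains = torch.randint(0, 2, (8,))  # 0 = source, 1 = target
    logits = domain_head(GradReverse.apply(features(x), 1.0))
    nn.functional.cross_entropy(logits, domains).backward()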
Privacy-preserving patient clustering for personalized federated learning
Federated Learning (FL) is a machine learning framework that enables multiple
organizations to train a model without sharing their data with a central
server. However, it suffers significant performance degradation when the data
are not independently and identically distributed (non-IID). This is a problem
in medical settings, where variations in the patient population contribute
significantly to distribution differences across hospitals. Personalized FL
addresses this issue by accounting for site-specific distribution differences.
Clustered FL, a Personalized FL variant, addresses this problem by clustering
patients into groups across hospitals and training separate models on each
group. However, privacy concerns remained a challenge, as the clustering
process requires the exchange of patient-level information. This was
previously solved by forming clusters using aggregated data, which led to
inaccurate groups and performance degradation. In this study, we propose
Privacy-preserving Community-Based Federated machine Learning (PCBFL), a novel
Clustered FL framework that can cluster patients using patient-level data while
protecting privacy. PCBFL uses Secure Multiparty Computation, a cryptographic
technique, to securely calculate patient-level similarity scores across
hospitals. We then evaluate PCBFL by training a federated mortality prediction
model using 20 sites from the eICU dataset. We compare the performance gain
from PCBFL against traditional and existing Clustered FL frameworks. Our
results show that PCBFL successfully forms clinically meaningful cohorts of
low, medium, and high-risk patients. PCBFL outperforms traditional and existing
Clustered FL frameworks with an average AUC improvement of 4.3% and AUPRC
improvement of 7.8%.
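The abstract does not detail PCBFL's protocol, but the secure-multiparty-computation primitive it builds on can be illustrated with a toy two-party dot product (a common building block of similarity scores) using additive secret sharing and a trusted-dealer Beaver triple. The modulus and vectors are arbitrary toy choices, and both parties' local computations are collapsed into one function for brevity; this is a sketch of the general technique, not PCBFL itself.

    import numpy as np

    P = 1_000_003                        # small prime modulus (toy choice)
    rng = np.random.default_rng(0)

    def share(v):
        # Additive secret sharing: v = s0 + s1 (mod P); each share alone
        # looks uniformly random and reveals nothing about v.
        s0 = rng.integers(0, P, size=v.shape)
        return s0, (v - s0) % P

    def secure_dot(x, y):
        # Hospital A shares x, hospital B shares y; a dealer supplies a
        # Beaver triple (a, b, c = a*b) that lets shares be multiplied.
        x0, x1 = share(x)
        y0, y1 = share(y)
        a = rng.integers(0, P, size=x.shape)
        b = rng.integers(0, P, size=x.shape)
        a0, a1 = share(a)
        b0, b1 = share(b)
        c0, c1 = share(a * b % P)
        d = (x0 - a0 + x1 - a1) % P      # opened masked value: reveals nothing
        e = (y0 - b0 + y1 - b1) % P
        z0 = (c0 + d * b0 + e * a0 + d * e) % P   # party 0's product share
        z1 = (c1 + d * b1 + e * a1) % P           # party 1's product share
        return int((z0.sum() + z1.sum()) % P)     # open only the final score

    x = np.array([3, 1, 4])
    y = np.array([2, 7, 1])
    assert secure_dot(x, y) == int(np.dot(x, y))  # 17, without exposing x or y

Only the final similarity value is opened; the patient-level vectors never leave their hospitals in the clear, which is the property the abstract attributes to PCBFL's clustering step.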
The Economics of Information and Communication Technologies in our Society
Information and Communication Technologies (ICTs) play a fundamental role in today's society. As ICTs become more mature and widely adopted, societies become more dependent on their use to operationalize daily activities. However, there are multiple societal impacts of ICTs that are not yet well understood. In this dissertation, I explore three different aspects of ICTs that have been widely discussed by media and industry in recent years. I analyze these topics from an economic perspective, contributing to the debate with rigorous modeling and the ensuing discussion of its implications. First, I study the impact that the COVID-19 pandemic had on the usage of remote meeting technologies. Second, I empirically tackle the long-debated question of whether internet users perceive internet providers' Network Neutrality practices. Finally, I analyze the most recent and ambitious public policy in the U.S. to improve households' broadband internet connectivity: the so-called policy of bridging the digital divide.
Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks
Over the past few years, deep learning (DL) has been achieving state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations.
Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with billions of parameters, which have a large model size and slow inference speed. This restricts the application of DNNs in resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting: when incrementally learning new tasks, the model's performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which a model operates are always evolving, a robust neural network needs this continual learning ability to adapt to new changes.
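The dissertation's specific methods are not detailed in this abstract, but the classic remedy for the first limitation, knowledge distillation, compresses a large teacher network into a small student by matching temperature-softened outputs. A minimal PyTorch sketch of the standard Hinton-style objective follows; the temperature and mixing weight are illustrative choices, not the dissertation's settings.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          T=4.0, alpha=0.7):
        # Soft term: KL divergence between temperature-softened teacher and
        # student distributions; T**2 keeps its gradient scale comparable
        # to the hard term.
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)
        # Hard term: ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    student_logits = torch.randn(8, 10, requires_grad=True)  # small student
    teacher_logits = torch.randn(8, 10)                      # frozen teacher
    labels = torch.randint(0, 10, (8,))
    distillation_loss(student_logits, teacher_logits, labels).backward()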