Towards Sybil Resilience in Decentralized Learning
Federated learning is a privacy-enforcing machine learning technology but
suffers from limited scalability. This limitation mostly originates from the
internet connection and memory capacity of the central parameter server, and
the complexity of the model aggregation function. Decentralized learning has
recently been emerging as a promising alternative to federated learning. This
novel technology eliminates the need for a central parameter server by
decentralizing the model aggregation across all participating nodes. Numerous
studies have been conducted on improving the resilience of federated learning
against poisoning and Sybil attacks, whereas the resilience of decentralized
learning remains largely unstudied. This research gap serves as the main
motivator for this study, in which our objective is to improve the Sybil
poisoning resilience of decentralized learning.
We present SybilWall, an innovative algorithm focused on increasing the
resilience of decentralized learning against targeted Sybil poisoning attacks.
By combining a Sybil-resistant aggregation function based on similarity between
Sybils with a novel probabilistic gossiping mechanism, we establish a new
benchmark for scalable, Sybil-resilient decentralized learning.
A comprehensive empirical evaluation demonstrated that SybilWall outperforms
existing state-of-the-art solutions designed for federated learning scenarios
and is the only algorithm to obtain consistent accuracy over a range of
adversarial attack scenarios. We also found SybilWall to diminish the utility
of creating many Sybils, as our evaluations demonstrate a higher success rate
among adversaries employing fewer Sybils. Finally, we suggest a number of
possible improvements to SybilWall and highlight promising future research
directions.
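The idea of a similarity-based aggregation function can be illustrated with a minimal sketch. This is not SybilWall's actual aggregation rule, which the abstract does not specify; it is a simplified assumption in which each update is down-weighted by its highest cosine similarity to any other update, so that near-identical Sybil updates cancel themselves out. All names (`cosine`, `similarity_weights`, `aggregate`) and the toy update vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def similarity_weights(updates):
    """Assumed rule: weight = 1 - (max similarity to any other update), clipped to [0, 1]."""
    weights = {}
    for name, vec in updates.items():
        sims = [cosine(vec, other) for key, other in updates.items() if key != name]
        max_sim = max(sims, default=0.0)
        weights[name] = min(1.0, max(0.0, 1.0 - max_sim))
    return weights


def aggregate(updates):
    """Weighted average of the updates using the similarity-based weights."""
    weights = similarity_weights(updates)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all updates were rejected as mutually similar")
    dim = len(next(iter(updates.values())))
    return [
        sum(weights[name] * updates[name][i] for name in updates) / total
        for i in range(dim)
    ]


# Toy example: two dissimilar honest updates and two identical Sybil updates.
updates = {
    "honest_a": [1.0, 0.0],
    "honest_b": [0.0, 1.0],
    "sybil_1": [-1.0, -1.0],
    "sybil_2": [-1.0, -1.0],
}
weights = similarity_weights(updates)
model_update = aggregate(updates)
```

In this toy setting the two Sybils are perfectly similar to each other and receive a weight of essentially zero, so the aggregate is driven by the honest updates. A rule of this kind also penalises honest nodes that happen to have similar data, which is one reason real schemes are more elaborate.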
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
Less is More: Focus Attention for Efficient DETR
DETR-like models have significantly boosted the performance of detectors and
even outperformed classical convolutional models. However, treating all tokens
equally without discrimination brings a redundant computational burden
in the traditional encoder structure. Recent sparsification strategies
exploit a subset of informative tokens to reduce attention complexity while
maintaining performance through the sparse encoder. But these methods tend to
rely on unreliable model statistics. Moreover, simply reducing the token
population hinders the detection performance to a large extent, limiting the
application of these sparse models. We propose Focus-DETR, which focuses
attention on more informative tokens for a better trade-off between computation
efficiency and model accuracy. Specifically, we reconstruct the encoder with
dual attention, which includes a token scoring mechanism that considers both
localization and category semantic information of the objects from multi-scale
feature maps. We efficiently abandon the background queries and enhance the
semantic interaction of the fine-grained object queries based on the scores.
Compared with the state-of-the-art sparse DETR-like detectors under the same
setting, our Focus-DETR gets comparable complexity while achieving 50.4 AP
(+2.2) on COCO. The code is available at
https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR and
https://gitee.com/mindspore/models/tree/master/research/cv/Focus-DETR.
Comment: 8 pages, 6 figures, accepted to ICCV202
Identifying Appropriate Intellectual Property Protection Mechanisms for Machine Learning Models: A Systematization of Watermarking, Fingerprinting, Model Access, and Attacks
The commercial use of Machine Learning (ML) is spreading; at the same time,
ML models are becoming more complex and more expensive to train, which makes
Intellectual Property Protection (IPP) of trained models a pressing issue.
Unlike other domains that can build on a solid understanding of the threats,
attacks and defenses available to protect their IP, the ML-related research in
this regard is still very fragmented. This is also due to a missing unified
view as well as a common taxonomy of these aspects.
In this paper, we systematize our findings on IPP in ML, while focusing on
threats and attacks identified and defenses proposed at the time of writing. We
develop a comprehensive threat model for IP in ML, categorizing attacks and
defenses within a unified and consolidated taxonomy, thus bridging research
from both the ML and security communities.
Graph Neural Networks for Link Prediction with Subgraph Sketching
Many Graph Neural Networks (GNNs) perform poorly compared to simple
heuristics on Link Prediction (LP) tasks. This is due to limitations in
expressive power such as the inability to count triangles (the backbone of most
LP heuristics) and because they cannot distinguish automorphic nodes (those
having identical structural roles). Both expressiveness issues can be
alleviated by learning link (rather than node) representations and
incorporating structural features such as triangle counts. Since explicit link
representations are often prohibitively expensive, recent works resorted to
subgraph-based methods, which have achieved state-of-the-art performance for
LP, but suffer from poor efficiency due to high levels of redundancy between
subgraphs. We analyze the components of subgraph GNN (SGNN) methods for link
prediction. Based on our analysis, we propose a novel full-graph GNN called
ELPH (Efficient Link Prediction with Hashing) that passes subgraph sketches as
messages to approximate the key components of SGNNs without explicit subgraph
construction. ELPH is provably more expressive than Message Passing GNNs
(MPNNs). It outperforms existing SGNN models on many standard LP benchmarks
while being orders of magnitude faster. However, it shares the common GNN
limitation that it is only efficient when the dataset fits in GPU memory.
Accordingly, we develop a highly scalable model, called BUDDY, which uses
feature precomputation to circumvent this limitation without sacrificing
predictive performance. Our experiments show that BUDDY also outperforms SGNNs
on standard LP benchmarks while being highly scalable and faster than ELPH.
Comment: 29 pages, 19 figures, 6 appendices
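To make the notion of passing sketches as messages more concrete, here is a minimal illustration that assumes MinHash as the sketch type; the abstract itself only says "subgraph sketches" and "hashing", so this choice and every name below (`minhash`, `merge`, `estimate_jaccard`, `NUM_HASHES`) are assumptions rather than the ELPH implementation. The sketch of a union of node sets is the element-wise minimum of the individual sketches, which is what allows neighbourhood sketches to be combined along edges without building the subgraph explicitly.

```python
import hashlib

NUM_HASHES = 256  # assumed sketch size; more hashes give a lower-variance estimate


def _hash(seed, node_id):
    """Deterministic 64-bit hash of a node id under a given seed."""
    payload = f"{seed}:{node_id}".encode()
    return int.from_bytes(hashlib.blake2b(payload, digest_size=8).digest(), "big")


def minhash(nodes):
    """MinHash sketch of a non-empty set of node ids."""
    nodes = list(nodes)
    if not nodes:
        raise ValueError("cannot sketch an empty set")
    return [min(_hash(seed, n) for n in nodes) for seed in range(NUM_HASHES)]


def merge(sketch_a, sketch_b):
    """Sketch of the union of two sets: element-wise minimum of their sketches."""
    return [min(a, b) for a, b in zip(sketch_a, sketch_b)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of matching positions estimates |A ∩ B| / |A ∪ B|."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


# Toy example: neighbourhoods of two candidate link endpoints u and v.
neighbours_u = {1, 2, 3, 4, 5, 6}
neighbours_v = {4, 5, 6, 7, 8, 9}

sketch_u = minhash(neighbours_u)
sketch_v = minhash(neighbours_v)

estimated = estimate_jaccard(sketch_u, sketch_v)
exact = len(neighbours_u & neighbours_v) / len(neighbours_u | neighbours_v)
union_sketch = merge(sketch_u, sketch_v)
```

The estimated overlap approximates the exact Jaccard similarity of the two neighbourhoods at a fixed cost per node pair, independent of neighbourhood size. Common neighbours are the basis of triangle-count heuristics, which is why such overlap estimates are useful structural features for link prediction.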
Multimodal spatio-temporal deep learning framework for 3D object detection in instrumented vehicles
This thesis presents the utilization of multiple modalities, such as image and lidar, to incorporate spatio-temporal information from sequence data into deep learning architectures for 3D object detection in instrumented vehicles. The race to autonomy in instrumented vehicles or self-driving cars has stimulated significant research in developing autonomous driver assistance systems (ADAS) technologies related explicitly to perception systems. Object detection plays a crucial role in perception systems by providing spatial information to its subsequent modules; hence, accurate detection is a significant task supporting autonomous driving. The advent of deep learning in computer vision applications and the availability of multiple sensing modalities such as 360° imaging, lidar, and radar have led to state-of-the-art 2D and 3D object detection architectures. Most current state-of-the-art 3D object detection frameworks consider single-frame reference. However, these methods do not utilize temporal information associated with the objects or scenes from the sequence data. Thus, the present research hypothesizes that multimodal temporal information can contribute to bridging the gap between 2D and 3D metric space by improving the accuracy of deep learning frameworks for 3D object estimations. First, the thesis builds an understanding of multimodal data representations and selects hyper-parameters using public datasets such as KITTI and nuScenes, with Frustum-ConvNet as a baseline architecture. Secondly, an attention mechanism was employed along with convolutional-LSTM to extract spatial-temporal information from sequence data to improve 3D estimations and to aid the architecture in focusing on salient lidar point cloud features. Finally, various fusion strategies are applied to fuse the modalities and temporal information into the architecture to assess its efficacy on performance and computational complexity.
Overall, this thesis has established the importance and utility of multimodal systems for refined 3D object detection and proposed a complex pipeline incorporating spatial, temporal and attention mechanisms to improve specific and general class accuracy, as demonstrated on key autonomous driving data sets.
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service whereby patterns in users’ speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018 (6 November having been the date of the midterm, i.e. non-Presidential, elections in the United States). The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent finds that there are regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using the virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive and allows users,
scholars, and entrepreneurs to get an in-depth understanding of the Metaverse
ecosystem to find their opportunities and potentials for contribution.
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules
We target the problem of automatically synthesizing proofs of semantic
equivalence between two programs made of sequences of statements. We represent
programs using abstract syntax trees (AST), where a given set of
semantics-preserving rewrite rules can be applied on a specific AST pattern to
generate a transformed and semantically equivalent program. In our system, two
programs are equivalent if there exists a sequence of application of these
rewrite rules that leads to rewriting one program into the other. We propose a
neural network architecture based on a transformer model to generate proofs of
equivalence between program pairs. The system outputs a sequence of rewrites,
and the validity of the sequence is simply checked by verifying it can be
applied. If no valid sequence is produced by the neural network, the system
reports the programs as non-equivalent, ensuring by design no programs may be
incorrectly reported as equivalent. Our system is fully implemented for a given
grammar which can represent straight-line programs with function calls and
multiple types. To efficiently train the system to generate such sequences, we
develop an original incremental training technique, named self-supervised
sample selection. We extensively study the effectiveness of this novel training
approach on proofs of increasing complexity and length. Our system, S4Eq,
achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent
programs.
Comment: 30 pages including appendix
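The verification step described in this abstract, checking a proposed rewrite sequence simply by applying it, can be sketched as follows. This is an illustrative toy, not the S4Eq grammar or rule set: the tuple-based AST encoding, the two rules (`commute`, `add_zero`), and the path convention are all assumptions made for the example. An invalid or incomplete sequence is rejected, so a wrong proof can never certify two programs as equivalent.

```python
def rule_commute(node):
    """Swap the operands of a commutative binary operator: (op, a, b) -> (op, b, a)."""
    if isinstance(node, tuple) and len(node) == 3 and node[0] in ("+", "*"):
        op, left, right = node
        return (op, right, left)
    return None  # the rule does not match this node


def rule_add_zero(node):
    """Remove an addition of zero on the right: (+, a, 0) -> a."""
    if isinstance(node, tuple) and len(node) == 3 and node[0] == "+" and node[2] == 0:
        return node[1]
    return None


RULES = {"commute": rule_commute, "add_zero": rule_add_zero}


def apply_at(tree, path, rule):
    """Apply `rule` at the node reached by `path` (child indices 1 or 2); None if it fails."""
    if not path:
        return rule(tree)
    index = path[0]
    if not isinstance(tree, tuple) or len(tree) != 3 or index not in (1, 2):
        return None
    child = apply_at(tree[index], path[1:], rule)
    if child is None:
        return None
    return tree[:index] + (child,) + tree[index + 1:]


def check_proof(source, target, steps):
    """A proof is valid only if every step applies and the final tree equals the target."""
    current = source
    for rule_name, path in steps:
        rule = RULES.get(rule_name)
        if rule is None:
            return False
        current = apply_at(current, path, rule)
        if current is None:
            return False
    return current == target


# Toy example: (0 + a) * b is rewritten into b * a in three steps.
source = ("*", ("+", 0, "a"), "b")
target = ("*", "b", "a")
proof = [("commute", (1,)), ("add_zero", (1,)), ("commute", ())]
bad_proof = [("add_zero", (1,))]  # does not match: the zero is on the left

proof_ok = check_proof(source, target, proof)
bad_proof_ok = check_proof(source, target, bad_proof)
```

Because the checker only trusts rule applications that succeed, whatever produces the candidate sequence (a neural network in the abstract) needs no correctness guarantee of its own; the cost of a poor generator is a missed proof, not a false one.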