18,897 research outputs found
Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures
Federated Recommender Systems (FedRecs) are considered privacy-preserving
techniques to collaboratively learn a recommendation model without sharing user
data. Since all participants can directly influence the systems by uploading
gradients, FedRecs are vulnerable to poisoning attacks from malicious clients.
However, most existing poisoning attacks on FedRecs either rely on prior
knowledge or achieve limited effectiveness. To reveal the real vulnerability of
FedRecs, in this paper, we present a new poisoning attack method to manipulate
target items' ranks and exposure rates effectively in top-K recommendation
without relying on any prior knowledge. Specifically, our attack
manipulates target items' exposure rate by a group of synthetic malicious users
who upload poisoned gradients considering target items' alternative products.
We conduct extensive experiments with two widely used FedRecs (Fed-NCF and
Fed-LightGCN) on two real-world recommendation datasets. The experimental
results show that our attack can significantly improve the exposure rate of
unpopular target items with far fewer malicious users and fewer global
epochs than state-of-the-art attacks. In addition to disclosing the security
hole, we design a novel countermeasure for poisoning attacks on FedRecs.
Specifically, we propose a hierarchical gradient clipping with sparsified
updating to defend against existing poisoning attacks. The empirical results
demonstrate that the proposed defending mechanism improves the robustness of
FedRecs.
Comment: This paper has been accepted by SIGIR202
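The countermeasure described above combines per-layer (hierarchical) gradient clipping with sparsified updates. A minimal stdlib sketch of that general idea, with illustrative function and parameter names (the paper's actual mechanism and settings are not specified in the abstract):

```python
import math

def clip_and_sparsify(layer_grads, clip_norms, keep_ratio=0.1):
    """Clip each layer's gradient to its own norm bound (hierarchical
    clipping), then keep only the largest-magnitude entries (sparsified
    update). All names and defaults here are hypothetical."""
    out = []
    for grad, bound in zip(layer_grads, clip_norms):
        norm = math.sqrt(sum(g * g for g in grad))
        if norm > bound:  # rescale so the layer norm equals its bound
            grad = [g * bound / norm for g in grad]
        k = max(1, int(len(grad) * keep_ratio))
        # keep the k largest-magnitude coordinates, zero the rest
        top = sorted(range(len(grad)), key=lambda i: abs(grad[i]),
                     reverse=True)[:k]
        sparse = [0.0] * len(grad)
        for i in top:
            sparse[i] = grad[i]
        out.append(sparse)
    return out
```

Per-layer bounds limit how much any single client update can move one part of the model, and sparsification further shrinks the attack surface of each upload.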
UniverSeg: Universal Medical Image Segmentation
While deep learning models have become the predominant method for medical
image segmentation, they are typically not capable of generalizing to unseen
segmentation tasks involving new anatomies, image modalities, or labels. Given
a new segmentation task, researchers generally have to train or fine-tune
models, which is time-consuming and poses a substantial barrier for clinical
researchers, who often lack the resources and expertise to train neural
networks. We present UniverSeg, a method for solving unseen medical
segmentation tasks without additional training. Given a query image and example
set of image-label pairs that define a new segmentation task, UniverSeg employs
a new Cross-Block mechanism to produce accurate segmentation maps without the
need for additional training. To achieve generalization to new tasks, we have
gathered and standardized a collection of 53 open-access medical segmentation
datasets with over 22,000 scans, which we refer to as MegaMedical. We used this
collection to train UniverSeg on a diverse set of anatomies and imaging
modalities. We demonstrate that UniverSeg substantially outperforms several
related methods on unseen tasks, and thoroughly analyze and draw insights about
important aspects of the proposed system. The UniverSeg source code and model
weights are freely available at https://universeg.csail.mit.edu
Comment: Victor and Jose Javier contributed equally to this work. Project
Website: https://universeg.csail.mit.ed
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules
We target the problem of automatically synthesizing proofs of semantic
equivalence between two programs made of sequences of statements. We represent
programs using abstract syntax trees (AST), where a given set of
semantics-preserving rewrite rules can be applied on a specific AST pattern to
generate a transformed and semantically equivalent program. In our system, two
programs are equivalent if there exists a sequence of applications of these
rewrite rules that leads to rewriting one program into the other. We propose a
neural network architecture based on a transformer model to generate proofs of
equivalence between program pairs. The system outputs a sequence of rewrites,
and the validity of the sequence is simply checked by verifying it can be
applied. If no valid sequence is produced by the neural network, the system
reports the programs as non-equivalent, ensuring by design that no programs are
incorrectly reported as equivalent. Our system is fully implemented for a given
grammar which can represent straight-line programs with function calls and
multiple types. To efficiently train the system to generate such sequences, we
develop an original incremental training technique, named self-supervised
sample selection. We extensively study the effectiveness of this novel training
approach on proofs of increasing complexity and length. Our system, S4Eq,
achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent
programs.
Comment: 30 pages including appendix
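The soundness argument above rests on a simple check: a proposed proof is valid only if every rewrite in the sequence actually applies and the result equals the target program. A toy sketch of such a checker, using a list-of-strings program encoding and hypothetical rule names rather than S4Eq's grammar:

```python
def check_proof(source, target, rules, proof):
    """Apply each named rewrite in order; the proof is valid only if
    every step applies and the final program equals the target."""
    prog = list(source)
    for name in proof:
        prog = rules[name](prog)
        if prog is None:  # rewrite not applicable: reject the proof
            return False
    return prog == list(target)

# Two toy semantics-preserving rewrites over straight-line programs
def swap_independent(prog):
    # swap the first two statements (assumed independent here)
    return prog[1:2] + prog[:1] + prog[2:] if len(prog) >= 2 else None

def double_to_shift(prog):
    # strength reduction: multiply-by-two becomes a left shift
    return [s.replace("x * 2", "x << 1") for s in prog]
```

Because the checker only accepts sequences that verifiably apply, a wrong guess from the neural network can never certify two inequivalent programs as equivalent.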
Advancing Model Pruning via Bi-level Optimization
The deployment constraints in practical applications necessitate the pruning
of large-scale deep learning models, i.e., promoting their weight sparsity. As
illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the
potential of improving their generalization ability. At the core of LTH,
iterative magnitude pruning (IMP) is the predominant pruning method to
successfully find 'winning tickets'. Yet, the computation cost of IMP grows
prohibitively as the targeted pruning ratio increases. To reduce the
computation overhead, various efficient 'one-shot' pruning methods have been
developed, but these schemes are usually unable to find winning tickets as good
as IMP. This raises the question: how can we close the gap between pruning
accuracy and pruning efficiency? To tackle it, we pursue the algorithmic
advancement of model pruning. Specifically, we formulate the pruning problem
from a fresh viewpoint: bi-level optimization (BLO). We show that the
BLO interpretation provides a technically-grounded optimization base for an
efficient implementation of the pruning-retraining learning paradigm used in
IMP. We also show that the proposed bi-level optimization-oriented pruning
method (termed BiP) is a special class of BLO problems with a bi-linear problem
structure. By leveraging such bi-linearity, we theoretically show that BiP can
be solved as easily as first-order optimization, thus inheriting the
computation efficiency. Through extensive experiments on both structured and
unstructured pruning with 5 model architectures and 4 data sets, we demonstrate
that BiP can find better winning tickets than IMP in most cases, and is
computationally as efficient as the one-shot pruning schemes, demonstrating 2-7
times speedup over IMP for the same level of model accuracy and sparsity.
Comment: Thirty-sixth Conference on Neural Information Processing Systems
(NeurIPS 2022)
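The pruning-retraining paradigm that the bi-level view formalizes can be caricatured as alternating a lower-level weight update with an upper-level mask re-selection. A toy stdlib sketch on a one-sample linear model; this is an illustration of the alternating structure only, not the BiP algorithm:

```python
def bilevel_prune(w, x, y, sparsity=0.5, steps=100, lr=0.01):
    """Toy alternating mask/weight optimization for pruning.
    Lower level: a gradient step on the masked weights.
    Upper level: re-pick the binary mask from the largest |w|."""
    n = len(w)
    k = max(1, int(n * (1 - sparsity)))  # number of weights kept
    mask = [1.0] * n
    for _ in range(steps):
        # lower level: one gradient step on the masked linear model
        pred = sum(mask[i] * w[i] * x[i] for i in range(n))
        err = pred - y
        for i in range(n):
            w[i] -= lr * 2 * err * mask[i] * x[i]
        # upper level: keep the k largest-magnitude weights
        order = sorted(range(n), key=lambda i: abs(w[i]), reverse=True)
        mask = [0.0] * n
        for i in order[:k]:
            mask[i] = 1.0
    return w, mask
```

The bi-linear structure mentioned in the abstract shows up even here: the prediction is linear in the mask for fixed weights and linear in the weights for a fixed mask, which is what makes first-order updates sufficient.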
Neural Architecture Search: Insights from 1000 Papers
In the past decade, advances in deep learning have resulted in breakthroughs
in a variety of areas, including computer vision, natural language
understanding, speech recognition, and reinforcement learning. Specialized,
high-performing neural architectures are crucial to the success of deep
learning in these areas. Neural architecture search (NAS), the process of
automating the design of neural architectures for a given task, is an
inevitable next step in automating machine learning and has already outpaced
the best human-designed architectures on many tasks. In the past few years,
research in NAS has been progressing rapidly, with over 1000 papers released
since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized
and comprehensive guide to neural architecture search. We give a taxonomy of
search spaces, algorithms, and speedup techniques, and we discuss resources
such as benchmarks, best practices, other surveys, and open-source libraries.
Qluster: An easy-to-implement generic workflow for robust clustering of health data
Exploring health data with clustering algorithms makes it possible to better
describe the populations of interest by identifying the sub-profiles that
compose them. This reinforces medical knowledge, whether about a disease or a
target population in real life. Nevertheless, contrary to so-called
conventional biostatistical methods, for which numerous guidelines exist, the
standardization of data science approaches in clinical research remains a
little-discussed subject. The result is significant variability in the
execution of data science projects, whether in terms of the algorithms used or
the reliability and credibility of the designed approach. By making
parsimonious and judicious choices of both algorithms and implementations at
each stage, this article proposes Qluster, a practical workflow for performing
clustering tasks. The workflow strikes a compromise among (1) genericity of
application (e.g., usable on small or large datasets, with continuous,
categorical, or mixed variables, and with or without high-dimensional data),
(2) ease of implementation (few packages, algorithms, and parameters needed),
and (3) robustness (e.g., use of proven algorithms and robust packages,
evaluation of cluster stability, and management of noise and
multicollinearity). The workflow can easily be automated and/or routinely
applied across a wide range of clustering projects. It can be useful both to
data scientists with little experience in the field, making data clustering
easier and more robust, and to more experienced data scientists looking for a
straightforward and reliable way to routinely perform preliminary data mining.
A synthesis of the literature on data clustering and the scientific rationale
supporting the proposed workflow are also provided. Finally, a detailed
application of the workflow to a concrete use case is presented, along with a
practical discussion for data scientists.
An implementation on the Dataiku platform is available upon request to the authors.
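One ingredient of the robustness criterion above is evaluating the stability of clusters. A stdlib-only sketch of that idea: rerun a simple 1-D k-means under different initializations and score how consistently pairs of points are co-assigned. The workflow itself prescribes specific algorithms and packages that are not reproduced here:

```python
import random

def kmeans_1d(xs, k, rng, iters=20):
    """Plain 1-D k-means; stands in for whichever clustering
    algorithm the workflow selects."""
    centers = rng.sample(xs, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(x - centers[c]))
                  for x in xs]
        for c in range(k):
            pts = [x for x, l in zip(xs, labels) if l == c]
            if pts:
                centers[c] = sum(pts) / len(pts)
    return labels

def stability(xs, k, runs=10, seed=0):
    """Rerun clustering with different initializations and measure
    pairwise co-assignment agreement (1.0 = perfectly stable)."""
    rng = random.Random(seed)
    all_labels = [kmeans_1d(xs, k, rng) for _ in range(runs)]
    agree = total = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            together = [L[i] == L[j] for L in all_labels]
            # fraction of runs agreeing with the majority decision
            agree += max(sum(together), len(together) - sum(together))
            total += len(together)
    return agree / total
```

Scores well below 1.0 suggest the partition depends on initialization, which is exactly the kind of fragility the workflow's stability check is meant to flag.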
Generalized Relation Modeling for Transformer Tracking
Compared with previous two-stream trackers, the recent one-stream tracking
pipeline, which allows earlier interaction between the template and search
region, has achieved a remarkable performance gain. However, existing
one-stream trackers always let the template interact with all parts inside the
search region throughout all the encoder layers. This could potentially lead to
target-background confusion when the extracted feature representations are not
sufficiently discriminative. To alleviate this issue, we propose a generalized
relation modeling method based on adaptive token division. The proposed method
is a generalized formulation of attention-based relation modeling for
Transformer tracking, which inherits the merits of both previous two-stream and
one-stream pipelines whilst enabling more flexible relation modeling by
selecting appropriate search tokens to interact with template tokens. An
attention masking strategy and the Gumbel-Softmax technique are introduced to
facilitate the parallel computation and end-to-end learning of the token
division module. Extensive experiments show that our method is superior to the
two-stream and one-stream pipelines and achieves state-of-the-art performance
on six challenging benchmarks with a real-time running speed.Comment: Accepted by CVPR 2023. Code and models are publicly available at
https://github.com/Little-Podi/GR
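The Gumbel-Softmax technique mentioned above is what makes a discrete token-division decision trainable end to end: it replaces hard sampling with a temperature-controlled soft approximation. A minimal stdlib sketch of Gumbel-Softmax sampling (the tracker's actual token-division module is more involved):

```python
import math, random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample from a categorical distribution by adding Gumbel noise
    and applying a temperature-scaled softmax. As tau -> 0 the output
    approaches a one-hot decision while staying differentiable in the
    logits (the property the token-division module relies on)."""
    # Gumbel(0, 1) noise: -log(-log(U)), with small guards against log(0)
    noisy = [l - math.log(-math.log(rng.random() + 1e-12) + 1e-12)
             for l in logits]
    m = max(n / tau for n in noisy)  # subtract max for numerical stability
    exps = [math.exp(n / tau - m) for n in noisy]
    z = sum(exps)
    return [e / z for e in exps]
```

In a tracker, each search token's logits would decide whether it interacts with the template tokens; the soft sample keeps that choice inside the gradient flow during training.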
Bayesian Reconstruction of Magnetic Resonance Images using Gaussian Processes
A central goal of modern magnetic resonance imaging (MRI) is to reduce the
time required to produce high-quality images. Efforts have included hardware
and software innovations such as parallel imaging, compressed sensing, and deep
learning-based reconstruction. Here, we propose and demonstrate a Bayesian
method to build statistical libraries of magnetic resonance (MR) images in
k-space and use these libraries to identify optimal subsampling paths and
reconstruction processes. Specifically, we compute a multivariate normal
distribution based upon Gaussian processes using a publicly available library
of T1-weighted images of healthy brains. We combine this library with
physics-informed envelope functions to only retain meaningful correlations in
k-space. This covariance function is then used to select a series of
ring-shaped subsampling paths using Bayesian optimization such that they
optimally explore k-space while remaining practically realizable in commercial
MRI systems. Combining optimized subsampling paths found for a range of images,
we compute a generalized sampling path that, when used for novel images,
produces superlative structural similarity and low error in comparison to
previously reported reconstruction processes (i.e. 96.3% structural similarity
and <0.003 normalized mean squared error from sampling only 12.5% of the
k-space data). Finally, we use this reconstruction process on pathological data
without retraining to show that reconstructed images are clinically useful for
stroke identification.
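The reconstruction step above amounts to conditioning a multivariate normal on the observed k-space samples: the unobserved entries are filled in with the conditional mean mu_m + C_mo C_oo^{-1} (x_o - mu_o). A toy real-valued sketch of that computation on small dense matrices (actual k-space covariances are far larger and complex-valued):

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_reconstruct(mu, cov, obs_idx, obs_val):
    """Conditional mean of a multivariate normal given observed
    entries: the Bayesian fill-in of unobserved samples (toy sizes)."""
    miss_idx = [i for i in range(len(mu)) if i not in obs_idx]
    resid = [obs_val[j] - mu[i] for j, i in enumerate(obs_idx)]
    C_oo = [[cov[a][b] for b in obs_idx] for a in obs_idx]
    alpha = solve(C_oo, resid)  # C_oo^{-1} (x_o - mu_o)
    rec = list(mu)
    for j, i in enumerate(obs_idx):
        rec[i] = obs_val[j]
    for i in miss_idx:
        rec[i] = mu[i] + sum(cov[i][o] * alpha[k]
                             for k, o in enumerate(obs_idx))
    return rec
```

The stronger the learned correlations between sampled and unsampled locations, the more of k-space can be skipped while the conditional mean still recovers it, which is what drives the subsampling-path optimization.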
Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process
We develop a novel credit assignment algorithm for information processing
with spiking neurons without requiring feedback synapses. Specifically, we
propose an event-driven generalization of the forward-forward and the
predictive forward-forward learning processes for a spiking neural system that
iteratively processes sensory input over a stimulus window. As a result, the
recurrent circuit computes the membrane potential of each neuron in each layer
as a function of local bottom-up, top-down, and lateral signals, facilitating a
dynamic, layer-wise parallel form of neural computation. Unlike spiking neural
coding, which relies on feedback synapses to adjust neural electrical activity,
our model operates purely online and forward in time, offering a promising way
to learn distributed representations of sensory data patterns with temporal
spike signals. Notably, our experimental results on several pattern datasets
demonstrate that the event-driven forward-forward (ED-FF) framework works well
for training a dynamic recurrent spiking system capable of both classification
and reconstruction.
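The forward-forward family of methods generalized above trains each layer with a purely local objective built on a "goodness" score, with no backward pass through other layers. A toy non-spiking sketch of one such local update for a linear+ReLU layer (the event-driven spiking formulation in the paper is more elaborate):

```python
import math

def goodness(h):
    """Forward-forward 'goodness': sum of squared activities."""
    return sum(v * v for v in h)

def ff_layer_update(W, x, positive, lr=0.05, theta=2.0):
    """One local forward-forward step: push goodness above the
    threshold theta for positive samples and below it for negative
    ones, using only this layer's activities. Returns the goodness
    measured before the update."""
    h = [max(0.0, sum(W[i][j] * x[j] for j in range(len(x))))
         for i in range(len(W))]
    g = goodness(h)
    p = 1.0 / (1.0 + math.exp(-(g - theta)))  # P(sample is positive)
    sign = (1.0 - p) if positive else -p      # d(log-likelihood)/dg
    for i in range(len(W)):
        if h[i] > 0.0:                        # ReLU gate
            for j in range(len(x)):
                W[i][j] += lr * sign * 2.0 * h[i] * x[j]
    return g
```

Because the learning signal is local in both space and time, this style of update is a natural fit for the online, forward-in-time operation the abstract describes.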