Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering
Singular subspace perturbation theory is of fundamental importance in
probability and statistics and has applications across many fields. We
consider two arbitrary matrices where one is a leave-one-column-out
submatrix of the other one and establish a novel perturbation upper bound for
the distance between the two corresponding singular subspaces. It is
well-suited for mixture models and results in a sharper and finer statistical
analysis than classical perturbation bounds such as Wedin's Theorem. Empowered
by this leave-one-out perturbation theory, we provide a deterministic entrywise
analysis for the performance of spectral clustering under mixture models. Our
analysis leads to an explicit exponential error rate for spectral clustering of
sub-Gaussian mixture models. For the mixture of isotropic Gaussians, the rate
is optimal under a weaker signal-to-noise condition than that of Löffler et
al. (2021).
Nonasymptotic analysis of adaptive and annealed Feynman-Kac particle models
Sequential and quantum Monte Carlo methods, as well as genetic type search
algorithms, can be interpreted as mean field and interacting particle
approximations of Feynman-Kac models in distribution spaces. The performance of
these population Monte Carlo algorithms is strongly related to the stability
properties of nonlinear Feynman-Kac semigroups. In this paper, we analyze these
models in terms of Dobrushin ergodic coefficients of the reference Markov
transitions and the oscillations of the potential functions. Sufficient
conditions for uniform concentration inequalities w.r.t. time are expressed
explicitly in terms of these two quantities. We provide an original
perturbation analysis that applies to annealed and adaptive Feynman-Kac models,
yielding what seem to be the first results of this kind for these types of
models. Special attention is devoted to the particular case of sampling from
Boltzmann-Gibbs measures. In this context, we design an explicit way of tuning
the number of Markov chain Monte Carlo iterations with the temperature schedule. We
also design an alternative interacting particle method based on an adaptive
strategy to define the temperature increments. The theoretical analysis of the
performance of this adaptive model is much more involved as both the potential
functions and the reference Markov transitions now depend on the random
evolution of the particle model. The nonasymptotic analysis of these complex
adaptive models is an open research problem. We initiate this study with the
concentration analysis of simplified adaptive models based on reference
Markov transitions that coincide with the limiting quantities, as the number of
particles tends to infinity.
Comment: Published at http://dx.doi.org/10.3150/14-BEJ680 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
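As a rough illustration of the annealed interacting particle scheme this abstract describes for Boltzmann-Gibbs sampling, the sketch below alternates a selection step (reweighting by the temperature increment and resampling) with a mutation step (Metropolis moves at the new temperature). It is a minimal sketch under stated assumptions, not the paper's algorithm: the one-dimensional state, the standard normal reference measure, the fixed number of MCMC steps per temperature, and all names (`annealed_particles`, `potential`, `betas`, `step`) are hypothetical choices made for illustration.

```python
import math
import random


def annealed_particles(potential, betas, n_particles, mcmc_steps, rng, step=0.5):
    """Approximate pi_beta(x) ~ exp(-beta * V(x)) * N(x; 0, 1) with particles.

    Assumptions (not from the abstract): scalar state, standard normal
    reference measure, fixed number of Metropolis steps per temperature.
    """

    def log_target(x, beta):
        # Log density of the Boltzmann-Gibbs target, up to a constant.
        return -beta * potential(x) - 0.5 * x * x

    # betas[0] is assumed to be 0, so the reference draws are exact samples.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    for beta_prev, beta in zip(betas, betas[1:]):
        # Selection: weight by the Boltzmann-Gibbs increment, then resample.
        weights = [math.exp(-(beta - beta_prev) * potential(x)) for x in xs]
        xs = rng.choices(xs, weights=weights, k=n_particles)
        # Mutation: random-walk Metropolis moves that leave pi_beta invariant.
        for i in range(n_particles):
            x = xs[i]
            for _ in range(mcmc_steps):
                prop = x + rng.gauss(0.0, step)
                log_ratio = log_target(prop, beta) - log_target(x, beta)
                if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                    x = prop
            xs[i] = x
    return xs


if __name__ == "__main__":
    rng = random.Random(0)
    out = annealed_particles(lambda x: (x - 2.0) ** 2, [0.0, 0.25, 0.5, 1.0], 500, 5, rng)
    print(sum(out) / len(out))
```

With the quadratic potential used in the usage lines, the final target is Gaussian with mean 4/3, which gives a simple sanity check on the particle average. The abstract's tuning of the number of MCMC iterations and its adaptive choice of temperature increments are not reproduced here.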
DRUG-NEM: Optimizing drug combinations using single-cell perturbation response to account for intratumoral heterogeneity.
An individual malignant tumor is composed of a heterogeneous collection of single cells with distinct molecular and phenotypic features, a phenomenon termed intratumoral heterogeneity. Intratumoral heterogeneity poses challenges for cancer treatment, motivating the need for combination therapies. Single-cell technologies are now available to guide effective drug combinations by accounting for intratumoral heterogeneity through the analysis of the signaling perturbations of an individual tumor sample screened by a drug panel. In particular, Mass Cytometry Time-of-Flight (CyTOF) is a high-throughput single-cell technology that enables the simultaneous measurements of multiple (>40) intracellular and surface markers at the level of single cells for hundreds of thousands of cells in a sample. We developed a computational framework, entitled Drug Nested Effects Models (DRUG-NEM), to analyze CyTOF single-drug perturbation data for the purpose of individualizing drug combinations. DRUG-NEM optimizes drug combinations by choosing the minimum number of drugs that produce the maximal desired intracellular effects based on nested effects modeling. We demonstrate the performance of DRUG-NEM using single-cell drug perturbation data from tumor cell lines and primary leukemia samples.
On optimality of kernels for approximate Bayesian computation using sequential Monte Carlo
Approximate Bayesian computation (ABC) has gained popularity over the past few years for the analysis of complex models arising in population genetics, epidemiology and systems biology. Sequential Monte Carlo (SMC) approaches have become work-horses in ABC. Here we discuss how to construct the perturbation kernels that are required in ABC SMC approaches, in order to construct a sequence of distributions that start out from a suitably defined prior and converge towards the unknown posterior. We derive optimality criteria for different kernels, which are based on the Kullback-Leibler divergence between a distribution and the distribution of the perturbed particles. We show that for many complicated posterior distributions, locally adapted kernels tend to show the best performance. We find that the added moderate cost of adapting kernel functions is easily regained in terms of the higher acceptance rate. We demonstrate the computational efficiency gains in a range of toy examples which illustrate some of the challenges faced in real-world applications of ABC, before turning to two demanding parameter inference problems in molecular biology, which highlight the huge increases in efficiency that can be gained from choice of optimal kernels. We conclude with a general discussion of the rational choice of perturbation kernels in ABC SMC settings.
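To make the role of a perturbation kernel concrete, the sketch below shows one resample-and-perturb step for a scalar parameter, using a single Gaussian kernel whose variance is a multiple of the weighted variance of the previous population. This is an assumption chosen for illustration: it is a simple global kernel, not the locally adapted or optimal kernels the abstract derives, and the function names and the `scale` parameter are hypothetical.

```python
import math
import random


def weighted_variance(particles, weights):
    """Variance of a weighted particle population (weights need not sum to 1)."""
    total = sum(weights)
    mean = sum(w * p for p, w in zip(particles, weights)) / total
    return sum(w * (p - mean) ** 2 for p, w in zip(particles, weights)) / total


def perturb_population(particles, weights, rng, scale=2.0):
    """Resample particles by weight, then move each draw with a Gaussian kernel.

    The kernel variance is `scale` times the weighted population variance;
    `scale` is an illustrative assumption, not a value taken from the abstract.
    """
    sigma = math.sqrt(scale * weighted_variance(particles, weights))
    draws = rng.choices(particles, weights=weights, k=len(particles))
    proposals = [theta + rng.gauss(0.0, sigma) for theta in draws]
    return proposals, sigma


if __name__ == "__main__":
    rng = random.Random(0)
    proposals, sigma = perturb_population([0.0, 1.0, 2.0], [0.2, 0.5, 0.3], rng)
    print(sigma, proposals)
```

In a full ABC SMC run, each proposal would then be accepted only if data simulated from it fall within the current tolerance, and the accepted particles would be reweighted; those steps are omitted here.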
Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy
To defend against inference attacks and mitigate sensitive information
leakage in Federated Learning (FL), client-level Differentially Private FL
(DPFL) is the de facto standard for privacy protection by clipping local
updates and adding random noise. However, existing DPFL methods tend to produce a
sharp loss landscape and have poor weight perturbation robustness, resulting in
severe performance degradation. To alleviate these issues, we propose a novel
DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to
mitigate the negative impact of DP. Specifically, DP-FedSAM integrates the
Sharpness Aware Minimization (SAM) optimizer to generate locally flat models
with improved stability and weight perturbation robustness, which results in
a small norm of local updates and robustness to DP noise, thereby improving
the performance. To further reduce the magnitude of random noise while
achieving better performance, we propose DP-FedSAM- by adopting the
local update sparsification technique. From the theoretical perspective, we
present the convergence analysis to investigate how our algorithms mitigate the
performance degradation induced by DP. Meanwhile, we give rigorous privacy
guarantees with Rényi DP, the sensitivity analysis of local updates, and
generalization analysis. Finally, we empirically confirm that our algorithms
achieve state-of-the-art (SOTA) performance compared with existing SOTA
baselines in DPFL.
Comment: 20 pages. arXiv admin note: substantial text overlap with
arXiv:2303.1124
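The clip-and-noise step that this abstract names as the basis of client-level DPFL can be sketched as follows. This is a minimal illustration, not DP-FedSAM itself: the SAM optimizer and the sparsification variant are not shown, and the noise calibration (Gaussian noise with standard deviation `noise_multiplier * clip_norm / m` on the averaged update) as well as every function and parameter name are assumptions made for the example.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def clip_update(update, clip_norm):
    """Scale a local update so that its L2 norm is at most clip_norm."""
    norm = l2_norm(update)
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize_round(client_updates, clip_norm, noise_multiplier, rng):
    """Clip each client update, average them, and add Gaussian noise.

    The noise scale below is an assumed calibration for illustration only.
    """
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    m = len(clipped)
    dim = len(clipped[0])
    sigma = noise_multiplier * clip_norm / m
    return [
        sum(u[j] for u in clipped) / m + rng.gauss(0.0, sigma)
        for j in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    updates = [[3.0, 4.0], [0.1, -0.2], [1.0, 1.0]]
    print(privatize_round(updates, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Because the noise scale is tied to `clip_norm`, updates with a smaller norm lose less signal to clipping, which is consistent with the abstract's argument that flatter local models with small update norms are more robust to DP noise.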