A Consensus Approach to Distributed Convex Optimization in Multi-Agent Systems
In this thesis we address the problem of distributed unconstrained convex optimization under separability assumptions, i.e., the framework where a network of agents, each endowed with a local private convex cost and subject to communication constraints, wants to collaborate to compute the minimizer of the sum of the local costs. We propose a design methodology that combines average consensus algorithms and separation of time-scales ideas. This strategy is proven, under suitable hypotheses, to be globally convergent to the true minimizer. Intuitively, the procedure lets the agents distributedly compute and sequentially update an approximated Newton-Raphson direction by means of suitable average consensus ratios. We consider both the scalar and the multidimensional scenario of the Synchronous Newton-Raphson Consensus, proposing some alternative strategies which trade off communication and computational requirements against convergence speed. We provide analytical proofs of convergence and show with numerical simulations that the speed of convergence of this strategy is comparable with that of alternative optimization strategies such as the Alternating Direction Method of Multipliers, the Distributed Subgradient Method and the Distributed Control Method.
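As a toy illustration of the ratio-consensus idea described above (a minimal sketch with scalar quadratic local costs and illustrative parameter values, not the thesis' exact scheme): each agent runs dynamic average consensus on f_i''(x)·x − f_i'(x) and on f_i''(x), and steers its local estimate toward the ratio of the two averages, which plays the role of a Newton-Raphson step.

```python
import numpy as np

# Toy synchronous Newton-Raphson Consensus sketch with scalar quadratic
# local costs f_i(x) = a_i/2 * (x - b_i)^2; all names and parameter
# values are illustrative, not the thesis' exact scheme.
np.random.seed(0)
n = 5
a = np.random.uniform(1.0, 2.0, n)    # local curvatures f_i''
b = np.random.uniform(-1.0, 1.0, n)   # local minimizers
x_star = np.dot(a, b) / a.sum()       # minimizer of sum_i f_i

# Doubly stochastic averaging matrix for a ring of n agents.
P = 0.5 * np.eye(n)
for i in range(n):
    P[i, (i - 1) % n] += 0.25
    P[i, (i + 1) % n] += 0.25

g = lambda x: a * x - a * (x - b)     # f_i''(x)*x - f_i'(x)
h = lambda x: a + 0.0 * x             # f_i''(x) (constant for quadratics)

x = np.zeros(n)                       # local estimates
y, z = g(x), h(x)                     # consensus states tracking averages of g, h
g_old, h_old = g(x), h(x)
eps = 0.05                            # slow time-scale step size

for _ in range(2000):
    x = (1 - eps) * x + eps * (y / z)   # steer toward the ratio of averages
    # Dynamic average consensus: mix with neighbors, then correct for the
    # change in the local signals (zero here, since g and h are constant
    # for quadratic costs, but kept to show the general structure).
    y = P @ y + g(x) - g_old
    z = P @ z + h(x) - h_old
    g_old, h_old = g(x), h(x)

# every agent's estimate approaches the global minimizer
assert np.allclose(x, x_star, atol=1e-3)
```

The two time scales appear explicitly: the consensus states y, z mix at every iteration (fast), while the small step eps makes the estimates x track the ratio y/z slowly.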
Moreover, we consider the convergence rates of the Synchronous Newton-Raphson Consensus and the Gradient Descent Consensus under the simplifying assumption of quadratic local cost functions. We derive sufficient conditions that guarantee the convergence of the algorithms, and from these conditions we obtain closed-form expressions that can be used to tune the parameters so as to maximize the rate of convergence. Although these formulas are derived under the assumption of quadratic local cost functions, they can be used as rules of thumb for tuning the parameters of the algorithms.
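A generic illustration of why quadratic costs admit closed-form tuning (this is standard gradient-descent analysis, not the thesis' formulas): for scalar quadratics the gradient iteration contracts by |1 − αA|, where A is the total curvature, so the rate-optimal step has a closed form.

```python
import numpy as np

# Generic illustration of closed-form step tuning on quadratics
# (standard gradient-descent analysis, not the thesis' formulas).
np.random.seed(1)
a = np.random.uniform(1.0, 3.0, 4)   # curvatures of f_i(x) = a_i/2 (x - b_i)^2
A = a.sum()                          # curvature of the sum F = sum_i f_i

def rate(alpha):
    # contraction factor of the iteration x <- x - alpha * F'(x)
    return abs(1.0 - alpha * A)

alpha_opt = 1.0 / A                  # closed-form rate-optimal step
assert rate(alpha_opt) < 1e-12
assert rate(alpha_opt) < rate(0.5 / A) < rate(0.25 / A)
```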
Finally, we propose an asynchronous version of the Newton-Raphson Consensus. Besides having low computational complexity and low communication requirements, and being interpretable as a distributed Newton-Raphson algorithm, the technique also has the beneficial properties of requiring very little coordination and of naturally supporting time-varying topologies. Again, we analytically prove that, under suitable assumptions, it exhibits either local or global convergence. Through numerical simulations we corroborate these results and compare the performance of the Asynchronous Newton-Raphson Consensus with that of other distributed optimization methods.
Newton-Raphson Consensus for Distributed Convex Optimization
We address the problem of distributed unconstrained convex optimization under separability assumptions, i.e., the framework where each agent of a network is endowed with a local private multidimensional convex cost, is subject to communication constraints, and wants to collaborate to compute the minimizer of the sum of the local costs. We propose a design methodology that combines average consensus algorithms and separation of time-scales ideas. This strategy is proved, under suitable hypotheses, to be globally convergent to the true minimizer. Intuitively, the procedure lets the agents distributedly compute and sequentially update an approximated Newton-Raphson direction by means of suitable average consensus ratios. We show with numerical simulations that the speed of convergence of this strategy is comparable with alternative optimization strategies such as the Alternating Direction Method of Multipliers. Finally, we propose some alternative strategies which trade off communication and computational requirements against convergence speed.
Comment: 18 pages, preprint with proofs
Multi-Agent Distributed Optimization and Estimation over Lossy Networks
Nowadays, optimization is a pervasive tool employed in many different fields. Due to its flexibility, it can be used to solve many diverse problems, some of which do not seem to require an optimization framework, so research on this topic is always active and copious. Another very interesting and current field of investigation involves multi-agent systems, that is, systems composed of many (possibly different) agents. Research on cyber-physical systems, believed to be one of the challenges of the 21st century, is very extensive, and comprises very complex systems like smart cities and smart power grids, but also much simpler ones, like wireless sensor networks or camera networks. In a multi-agent context, the optimization framework is extensively used. As a consequence, optimization in multi-agent systems is an attractive topic to investigate.
The contents of this thesis focus on distributed optimization within a multi-agent scenario, i.e., optimization performed by a set of peers, among which there is no leader. Accordingly, when these agents have to perform a task, formulated as an optimization problem, they have to collaborate to solve it, all using the same kind of update rule.
Collaboration clearly implies the need for message exchange among the agents, and the focus of the thesis is on the criticalities related to the communication step. In particular, this step is not assumed to be reliable, meaning that the packets exchanged between two agents can sometimes be lost. Also, the sought-for solution must not employ an acknowledgment protocol: when an agent has to send a packet, it just sends it and goes on with its computation, without waiting for confirmation that the receiver has actually received it. Almost all works in the existing literature deal with packet losses by employing an acknowledgment (ACK) system; the effort in this thesis is to avoid the use of an ACK system, since it can slow down the communication step. However, this choice of averting the use of ACKs makes the development of the optimization algorithms, and especially their convergence proofs, more involved. Apart from robustness to packet losses, the algorithms developed in this dissertation are also asynchronous, that is, the agents do not need to be synchronized to perform the update and communication steps.
Three types of optimization problems are analyzed in the thesis. The first one is the patrolling problem for camera networks. The algorithm developed to solve this problem has restricted applicability, since it is very task-dependent. The other two problems are more general, because both concern the minimization of a sum of cost functions, one for each agent in the system. In the first case, the form of the local cost functions is particular: they are locally coupled, in the sense that the cost function of an agent depends on the variables of the agent itself and on those of its direct neighbors. The sought-for algorithm has to satisfy two properties (apart from asynchronicity and robustness to packet losses): it must require only a single communication exchange per iteration (which also reduces the need for synchronicity), and communication must take place only between direct neighbors. In the second case, the local functions all depend on the same variables. The analysis first focuses on the special case of local quadratic cost functions and their strong relationship with the consensus problem. Besides the development of a robust and asynchronous algorithm for the average consensus problem, a comparison among algorithms for minimizing a sum of quadratic cost functions is carried out. Finally, the distributed minimization of a sum of more general local cost functions is tackled, leading to the development of a robust version of the Newton-Raphson consensus.
The theoretical tools employed in the thesis to prove convergence of the algorithms mainly rely on Lyapunov theory and the theory of separation of time scales.
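One standard building block behind robust average-consensus schemes of this kind is ratio (push-sum) consensus: each agent keeps a mass/weight pair and reads out their ratio. The sketch below is the minimal synchronous, lossless version (robust variants add bookkeeping, e.g. mass counters, to tolerate dropped packets).

```python
import numpy as np

# Ratio (push-sum) consensus sketch: each agent keeps a mass s_i and a
# weight w_i, pushes half of each to its successor on a directed ring,
# and reads out s_i / w_i. Synchronous and lossless here; robust
# variants add bookkeeping to tolerate packet drops.
values = np.array([1.0, 3.0, 5.0, 7.0])
avg = values.mean()

s = values.copy()      # numerator ("mass"); its network total is preserved
w = np.ones_like(s)    # denominator ("weight"); its total is preserved
for _ in range(200):
    s = 0.5 * s + 0.5 * np.roll(s, 1)   # keep half, receive half from predecessor
    w = 0.5 * w + 0.5 * np.roll(w, 1)

# every ratio converges to the network-wide average
assert np.allclose(s / w, avg, atol=1e-9)
```

The key property is that the updates are column-stochastic (mass is conserved rather than averaged), which is exactly what makes the ratio readout forgiving of asymmetric communication.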
Adaptive Robust Distributed Learning in Diffusion Sensor Networks
In this paper, the problem of adaptive distributed learning in diffusion networks is considered. The algorithms are developed within the convex set theoretic framework. More specifically, they are based on computationally simple geometric projections onto closed convex sets. The paper suggests a novel combine-project-adapt protocol for cooperation among the nodes of the network; such a protocol fits naturally with the philosophy that underlies the projection-based rationale. Moreover, the possibility that some of the nodes may fail is also considered and it is addressed by employing robust statistics loss functions. Such loss functions can easily be accommodated in the adopted algorithmic framework; all that is required from a loss function is convexity. Under some mild assumptions, the proposed algorithms enjoy monotonicity, asymptotic optimality, asymptotic consensus, strong convergence and linear complexity with respect to the number of unknown parameters. Finally, experiments in the context of the system-identification task verify the validity of the proposed algorithmic schemes, which are compared to other recent algorithms that have been developed for adaptive distributed learning.
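A minimal sketch of the projection primitive that such set-theoretic schemes build on (function name and numbers are illustrative): the hyperslab {v : |x·v − y| ≤ ε} collects all parameter vectors consistent with one noisy measurement (x, y) up to tolerance ε, and its metric projection is a simple closed-form step.

```python
import numpy as np

# Sketch of the projection primitive used by set-theoretic adaptive
# schemes (hypothetical function name, illustrative numbers).
def project_hyperslab(w, x, y, eps):
    """Metric projection of w onto the hyperslab {v : |x @ v - y| <= eps}."""
    r = x @ w - y
    if r > eps:
        return w - (r - eps) * x / (x @ x)
    if r < -eps:
        return w - (r + eps) * x / (x @ x)
    return w                              # already inside the slab

x = np.array([1.0, 1.0])
w = project_hyperslab(np.zeros(2), x, y=4.0, eps=0.0)
assert abs(x @ w - 4.0) < 1e-12           # projected point satisfies the constraint
assert np.allclose(project_hyperslab(w, x, 4.0, 0.5), w)  # interior points are fixed
```

In a combine-project-adapt protocol, a node would first average its neighbors' estimates ("combine") and then apply such projections for its most recent measurements ("project-adapt").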
Active Contours and Image Segmentation: The Current State Of the Art
Image segmentation is a fundamental task in image analysis responsible for partitioning an image into multiple sub-regions based on a desired feature. Active contours have been widely used as attractive image segmentation methods because they always produce sub-regions with continuous boundaries, whereas kernel-based edge detection methods, e.g. Sobel edge detectors, often produce discontinuous boundaries. The use of level set theory has provided more flexibility and convenience in the implementation of active contours. However, traditional edge-based active contour models have been applicable only to relatively simple images whose sub-regions are uniform and without internal edges. In this paper we survey the taxonomy and current state of the art in image segmentation and the usage of active contours.
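The Sobel detectors mentioned above can be sketched in a few lines (a minimal valid-convolution implementation, illustrative only): they respond to local intensity jumps, which is why their output can form broken curves rather than the closed boundaries an active contour guarantees.

```python
import numpy as np

# Minimal Sobel gradient-magnitude sketch (illustrative implementation).
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
Ky = Kx.T

def sobel_magnitude(img):
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):                # valid (no-padding) convolution
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx = (Kx * patch).sum()
            gy = (Ky * patch).sum()
            out[i, j] = np.hypot(gx, gy)
    return out

img = np.zeros((6, 6))
img[:, 3:] = 1.0                          # vertical step edge
mag = sobel_magnitude(img)
assert mag[:, 2].max() > 0.0              # strong response at the step
assert mag[:, 0].max() == 0.0             # no response in the flat region
```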
Group-Lasso on Splines for Spectrum Cartography
The unceasing demand for continuous situational awareness calls for innovative and large-scale signal processing algorithms, complemented by collaborative and adaptive sensing platforms to accomplish the objectives of layered sensing and control. Towards this goal, the present paper develops a spline-based approach to field estimation, which relies on a basis expansion model of the field of interest. The model entails known bases, weighted by generic functions estimated from the field's noisy samples. A novel field estimator is developed based on a regularized variational least-squares (LS) criterion that yields finitely-parameterized (function) estimates spanned by thin-plate splines. Robustness considerations motivate well the adoption of an overcomplete set of (possibly overlapping) basis functions, while a sparsifying regularizer augmenting the LS cost endows the estimator with the ability to select a few of these bases that "better" explain the data. This parsimonious field representation becomes possible because the sparsity-aware spline-based method of this paper induces a group-Lasso estimator for the coefficients of the thin-plate spline expansions per basis. A distributed algorithm is also developed to obtain the group-Lasso estimator using a network of wireless sensors, or using multiple processors to balance the load of a single computational unit. The novel spline-based approach is motivated by a spectrum cartography application, in which a set of sensing cognitive radios collaborate to estimate the distribution of RF power in space and frequency. Simulated tests corroborate that the estimated power spectrum density atlas yields the desired RF state awareness, since the maps reveal spatial locations where idle frequency bands can be reused for transmission, even when fading and shadowing effects are pronounced.
Comment: Submitted to IEEE Transactions on Signal Processing
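The basis-selection behavior described above comes from the group-Lasso penalty, which acts through block soft-thresholding: an entire coefficient group is either shrunk uniformly or zeroed out together. A minimal sketch of the single-group proximal operator (illustrative, not the paper's full solver):

```python
import numpy as np

# Single-group proximal operator of the group-Lasso penalty
# lam * ||beta_g||_2 (illustrative sketch, not the paper's full solver).
def group_soft_threshold(beta, lam):
    norm = np.linalg.norm(beta)
    if norm <= lam:
        return np.zeros_like(beta)        # the group is dropped entirely
    return (1.0 - lam / norm) * beta      # the group survives, uniformly shrunk

g = np.array([3.0, 4.0])                  # ||g||_2 = 5
assert np.allclose(group_soft_threshold(g, 10.0), [0.0, 0.0])
assert np.allclose(group_soft_threshold(g, 2.5), [1.5, 2.0])
```

Because the threshold applies to the group's ℓ2-norm rather than to individual coefficients, whole basis expansions are discarded at once, which is what yields the parsimonious field representation.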
Improving statistical inference for gene expression profiling data by borrowing information
Gene expression profiling experiments, in particular, microarray experiments, are popular in genomics research. However, in addition to the great opportunities provided by such experiments, statistical challenges also arise in the analysis of expression profiling data. The current thesis discusses statistical issues associated with gene expression profiling experiments and develops new statistical methods to tackle some of these problems.
In Chapter 2, we consider the insufficient sample size problem in detecting differential gene expression. We address the problem by developing and evaluating methods for variance model selection. The idea is that information about error variances might be learned from related datasets to improve the estimation of error variances. We develop a modified multiresponse permutation procedure (MRPP), modified cross-validation procedures, and the right AICc (corrected Akaike’s information criterion) for choosing a variance model. Through realistic simulations based on three real microarray studies, we evaluate the proposed methods and suggest practical recommendations for data analysis.
In Chapter 3, we address the multiple testing problem by improving the estimation of the distribution of noncentrality parameters given a large number of two-sample t-tests. We provide parametric, nonparametric and semiparametric estimators for the distribution of noncentrality parameters, as well as false discovery rates (FDR) and local FDR. Simulations show that our density estimates are closer to the underlying truth and that our estimates of FDR are also improved relative to competing methods under a variety of situations.
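For background on the FDR quantity mentioned above (this is the classical Benjamini-Hochberg step-up procedure, not the chapter's estimator): at level q one rejects all p-values up to the largest sorted p_(k) satisfying p_(k) ≤ qk/m.

```python
import numpy as np

# Classical Benjamini-Hochberg step-up (background only): find the
# largest sorted p_(k) with p_(k) <= q * k / m and reject p-values
# at or below it.
def bh_threshold(pvals, q):
    p = np.sort(pvals)
    m = len(p)
    ok = np.nonzero(p <= q * np.arange(1, m + 1) / m)[0]
    return p[ok[-1]] if ok.size else 0.0

p = np.array([0.001, 0.008, 0.039, 0.041, 0.30, 0.74])
assert bh_threshold(p, q=0.05) == 0.008   # first two tests are rejected
```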
In Chapter 4, we develop a novel combination of two statistical techniques with the aim of bypassing the curse of dimensionality in detecting differential expression of genes. We accept the fact that, in "small N, large p" situations, the data are not sufficient to provide enough information about dependency across genes. Hence, we suggest using a priori biological knowledge to assist statistical inference. We first use multidimensional scaling (MDS) methods to summarize prior knowledge about inter-gene relationships into a set of pseudo-covariates. Then, we develop a hierarchical additive logistic regression model conditional upon the generated pseudo-covariates. Simulations and analysis of real microarray data suggest that our strategy is more powerful than methods that do not use a priori information.
Future research directions are discussed at the end of the thesis.
Using Regularization to Evaluate Differential Item Functioning Among Multiple Covariates: A Penalized Expectation-Maximization Algorithm via Coordinate Descent and Soft-Thresholding
Testing for differential item functioning (DIF) has undergone rapid statistical developments in recent years. Namely, the moderated nonlinear factor analysis (MNLFA) model allows for simultaneous testing of DIF in multiple categorical and continuous covariates (e.g., age, gender, ethnicity, etc.). Recent work has also implemented a LASSO regularization approach to identify DIF and select anchor items for model identification. Although regularized MNLFA provides greater flexibility to evaluate DIF, less development has been made in efficiently estimating model parameters. Most previous implementations of MNLFA have directly maximized the observed marginal likelihood function, which limits the method to only a few items and covariates. Additionally, penalization in the MNLFA model has only been performed outside of the optimization routine, which results in a non-standard method for setting estimates to zero. To overcome these difficulties, I introduce a penalized expectation-maximization (EM) algorithm that efficiently estimates many more item parameters than previous implementations and performs regularization during optimization. I extend the regularized MNLFA model to include not just soft-thresholding for LASSO penalization, but also firm-thresholding for the MCP approach. A Monte Carlo simulation study and an empirical data analysis evaluate this new algorithm, comparing the LASSO and MCP approaches against previous work. Finally, a discussion of future research directions concludes the dissertation.
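The two penalties named above differ only in their thresholding rules: LASSO's soft-thresholding shrinks every estimate by λ, while MCP's firm-thresholding tapers the shrinkage and leaves large estimates untouched. A minimal sketch of the standard operators (the value γ = 3.0 is illustrative):

```python
import numpy as np

# Standard soft- (LASSO) and firm- (MCP) thresholding operators;
# gamma = 3.0 is an illustrative choice.
def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def firm(z, lam, gamma=3.0):
    z = np.asarray(z, float)
    inner = soft(z, lam) * gamma / (gamma - 1.0)   # tapered shrinkage
    return np.where(np.abs(z) <= gamma * lam, inner, z)

assert soft(2.0, 0.5) == 1.5               # soft shrinks everything by lam
assert firm(5.0, 0.5) == 5.0               # firm leaves large values untouched
assert np.isclose(firm(1.0, 0.5), 0.75)    # and tapers in between
```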
High dimensional information processing
Part I: Consider the n-dimensional vector y = Xβ + ε, where β ∈ R^p has only k nonzero entries and ε ∈ R^n is Gaussian noise. This can be viewed as a linear system with sparsity constraints corrupted by noise, where the objective is to estimate the sparsity pattern of β given the observation vector y and the measurement matrix X. First, we derive a non-asymptotic upper bound on the probability that a specific wrong sparsity pattern is identified by the maximum-likelihood estimator. We find that this probability depends (inversely) exponentially on the difference between ‖Xβ‖₂ and the ℓ₂-norm of Xβ projected onto the range of the columns of X indexed by the wrong sparsity pattern. Second, when X is randomly drawn from a Gaussian ensemble, we calculate a non-asymptotic upper bound on the probability of the maximum-likelihood decoder not declaring (partially) the true sparsity pattern. Consequently, we obtain sufficient conditions on the sample size n that guarantee almost surely the recovery of the true sparsity pattern. We find that the required growth rate of the sample size n matches the growth rate of previously established necessary conditions. Part II: Estimating two-dimensional firing rate maps is a common problem, arising in a number of contexts: the estimation of place fields in hippocampus, the analysis of temporally nonstationary tuning curves in sensory and motor areas, the estimation of firing rates following spike-triggered covariance analyses, etc. Here we introduce methods based on Gaussian process nonparametric Bayesian techniques for estimating these two-dimensional rate maps. These techniques offer a number of advantages: the estimates may be computed efficiently, come equipped with natural error bars, adapt their smoothness automatically to the local density and informativeness of the observed data, and permit direct fitting of the model hyperparameters (e.g., the prior smoothness of the rate map) via maximum marginal likelihood.
We illustrate the flexibility and performance of the new techniques on a variety of simulated and real data. Part III: Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design. Part IV: A model of distributed parameter estimation in networks is introduced, where agents have access to partially informative measurements over time. Each agent faces a local identification problem, in the sense that it cannot consistently estimate the parameter in isolation. 
We prove that, despite local identification problems, if agents update their estimates recursively as a function of their neighbors' beliefs, they can consistently estimate the true parameter provided that the communication network is strongly connected; that is, there exists an information path between any two agents in the network. We also show that the estimates of all agents are asymptotically normally distributed. Finally, we compute the asymptotic variance of the agents' estimates in terms of their observation models and the network topology, and provide conditions under which the distributed estimators are as efficient as any centralized estimator.
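Part I's error exponent can be illustrated numerically (illustrative dimensions and seed): project Xβ onto the column span selected by a candidate support and compare with the full norm; the true support captures all of the signal energy, while a wrong support typically captures strictly less, and that gap drives the error probability.

```python
import numpy as np

# Illustration of the Part I quantity: norm of X beta projected onto
# the span of a candidate support's columns (illustrative dimensions).
np.random.seed(3)
n, p = 20, 8
X = np.random.randn(n, p)
beta = np.zeros(p)
beta[[0, 1]] = [1.0, -1.0]                # true support {0, 1}, k = 2
y0 = X @ beta                             # noiseless signal

def proj_norm(support):
    Q, _ = np.linalg.qr(X[:, support])    # orthonormal basis of the column span
    return np.linalg.norm(Q.T @ y0)       # norm of the projection of X beta

full = np.linalg.norm(y0)
assert np.isclose(proj_norm([0, 1]), full)   # true support: no energy lost
assert proj_norm([2, 3]) < full              # wrong support: strictly less
```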
Self-controlled methods for postmarketing drug safety surveillance in large-scale longitudinal data
A primary objective in postmarketing drug safety surveillance is to ascertain the relationship between time-varying drug exposures and adverse events (AEs) related to health outcomes. Surveillance can be based on longitudinal observational databases (LODs), which contain time-stamped patient-level medical information including periods of drug exposure and dates of diagnoses. Due to its desirable properties, we focus on the self-controlled case series (SCCS) method for analysis in this context. SCCS implicitly controls for fixed multiplicative baseline covariates since each individual acts as their own control. In addition, only exposed cases are required for the analysis, which is computationally advantageous. In the first part of this work we present how the simple SCCS model can be applied to the surveillance problem, and compare the results of simple SCCS to those of existing methods. Many current surveillance methods are based on marginal associations between drug exposures and AEs. Such analyses ignore confounding drugs and interactions and have the potential to give misleading results. In order to avoid these difficulties, it is desirable for an analysis strategy to incorporate large numbers of time-varying potential confounders such as other drugs. In the second part of this work we propose the Bayesian multiple SCCS approach, which deals with high dimensionality and can provide a sparse solution via a Laplacian prior. We present details of the model and optimization procedure, as well as results of empirical investigations. SCCS is based on a conditional Poisson regression model, which assumes that events at different time points are conditionally independent given the covariate process. This requirement is problematic when the occurrence of an event can alter the future event risk. In a clinical setting, for example, patients who have a first myocardial infarction (MI) may be at higher subsequent risk for a second. 
In the third part of this work we propose the positive dependence self-controlled case series (PD-SCCS) method: a generalization of SCCS that allows the occurrence of an event to increase the future event risk, yet maintains the advantages of the original by controlling for fixed baseline covariates and relying solely on data from cases. We develop the model and compare the results of PD-SCCS and SCCS on example drug-AE pairs.
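The SCCS conditioning idea described above can be sketched in a few lines (hypothetical function name, toy numbers): given an individual's total event count, the events are multinomially distributed across that individual's risk intervals with probabilities proportional to interval length times the exposure rate ratio, so the fixed baseline rate cancels out of the likelihood.

```python
import numpy as np

# SCCS conditioning sketch (hypothetical function name, toy numbers):
# conditional on the total event count, events fall in risk interval j
# with probability proportional to length_j * exp(beta * exposed_j),
# so the individual's fixed baseline rate cancels.
def sccs_cond_loglik(beta, lengths, exposed, events):
    rates = lengths * np.exp(beta * exposed)
    probs = rates / rates.sum()           # multinomial cell probabilities
    return float(events @ np.log(probs))

lengths = np.array([100.0, 20.0, 100.0])  # days: before / exposed / after
exposed = np.array([0.0, 1.0, 0.0])
events = np.array([1.0, 3.0, 1.0])        # 3 of 5 events in the short exposed window

# the data favor an elevated event rate during exposure (beta > 0)
assert sccs_cond_loglik(2.0, lengths, exposed, events) > \
       sccs_cond_loglik(0.0, lengths, exposed, events)
```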