Team-optimal distributed MMSE estimation in general and tree networks
We construct team-optimal estimation algorithms over distributed networks for state estimation in the finite-horizon mean-square error (MSE) sense. Here, we have a distributed collection of agents with processing and cooperation capabilities. These agents observe noisy samples of a desired state through a linear model and seek to learn this state by interacting with each other. Although this problem has attracted significant attention and been studied extensively in fields including machine learning and signal processing, none of the well-known strategies achieves team-optimal learning performance in the finite-horizon MSE sense. To this end, we formulate the finite-horizon distributed minimum MSE (MMSE) estimation problem when there is no restriction on the size of the disclosed information, i.e., the oracle performance, over an arbitrary network topology. Subsequently, we show that the exchange of local estimates is sufficient to achieve the oracle performance only over certain network topologies. By inspecting these network structures, we propose recursive algorithms that achieve the oracle performance through the disclosure of local estimates. For practical implementations, we also provide approaches that reduce the complexity of the algorithms through time-windowing of the observations. Finally, in the numerical examples, we demonstrate the superior performance of the introduced algorithms in the finite-horizon MSE sense due to optimal estimation. © 2017 Elsevier Inc.
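For reference, a minimal sketch of the centralized oracle benchmark discussed above: stacking the agents' noisy linear observations and computing the linear MMSE estimate of the state. The dimensions, noise statistics, and variable names are illustrative assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's setup)
n_agents, state_dim, obs_dim = 5, 3, 2

x = rng.standard_normal(state_dim)                 # unknown state, prior N(0, I)
H = [rng.standard_normal((obs_dim, state_dim)) for _ in range(n_agents)]
sigma_v = 0.1                                      # observation noise standard deviation

# Each agent k observes y_k = H_k x + v_k
y = [Hk @ x + sigma_v * rng.standard_normal(obs_dim) for Hk in H]

# Oracle (centralized) linear MMSE estimate: stack all observations
H_all = np.vstack(H)                               # (n_agents*obs_dim, state_dim)
y_all = np.concatenate(y)
P_x = np.eye(state_dim)                            # prior covariance of the state
R = sigma_v**2 * np.eye(len(y_all))                # stacked observation-noise covariance

# x_hat = P_x H^T (H P_x H^T + R)^{-1} y
gain = P_x @ H_all.T @ np.linalg.inv(H_all @ P_x @ H_all.T + R)
x_hat = gain @ y_all
print("MSE of oracle estimate:", np.mean((x_hat - x) ** 2))
```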
Logarithmic regret bound over diffusion based distributed estimation
We provide a logarithmic upper bound on the regret function of the diffusion implementation for distributed estimation. For certain learning rates, the bound guarantees convergence of the performance of the distributed least mean square (DLMS) algorithms to that of the best estimator generated with hindsight of the spatial and temporal data. We use a new cost definition for distributed estimation based on widely used statistical performance measures and the corresponding global regret function. Then, for certain learning rates, we provide an upper bound on the global regret function without any statistical assumptions. © 2014 IEEE
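A minimal sketch of a diffusion (adapt-then-combine) LMS recursion of the kind analyzed above; the ring network, step size, and combination weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, T, mu = 4, 5, 2000, 0.01          # nodes, filter length, iterations, step size
w_true = rng.standard_normal(p)

# Doubly stochastic combination matrix for a ring of N nodes (an assumption)
A = np.zeros((N, N))
for k in range(N):
    A[k, k] = 0.5
    A[k, (k - 1) % N] = 0.25
    A[k, (k + 1) % N] = 0.25

W = np.zeros((N, p))                    # local estimates, one row per node
for t in range(T):
    # Adapt: each node runs an LMS update on its own streaming data
    psi = np.zeros_like(W)
    for k in range(N):
        x = rng.standard_normal(p)
        d = w_true @ x + 0.1 * rng.standard_normal()
        e = d - W[k] @ x
        psi[k] = W[k] + mu * e * x
    # Combine: average the intermediate estimates of the neighbors
    W = A @ psi

print("per-node MSD:", np.mean((W - w_true) ** 2, axis=1))
```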
Improved convergence performance of adaptive algorithms through logarithmic cost
We present a novel family of adaptive filtering algorithms based on a relative logarithmic cost. The new family intrinsically combines higher- and lower-order measures of the error into a single continuous update based on the error amount. We introduce the least mean logarithmic square (LMLS) algorithm, which achieves convergence performance comparable to the least mean fourth (LMF) algorithm while overcoming its stability issues. In addition, we introduce the least logarithmic absolute difference (LLAD) algorithm. The LLAD and least mean square (LMS) algorithms demonstrate similar convergence performance in impulse-free noise environments, while the LLAD algorithm is robust against impulsive interference and outperforms the sign algorithm (SA). © 2014 IEEE
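A minimal sketch of the two updates, assuming a relative logarithmic cost of the form J(e) = F(e) − (1/α) ln(1 + αF(e)) with F(e) = e² for LMLS and F(e) = |e| for LLAD; the resulting updates reproduce the stated behavior (LMF-like for small errors and LMS-like for large errors in LMLS, LMS-like for small errors and sign-algorithm-like for large errors in LLAD), but the exact normalization, α, step sizes, and data model below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
p, T, mu, alpha = 5, 5000, 0.01, 1.0
w_true = rng.standard_normal(p)

w_lmls = np.zeros(p)
w_llad = np.zeros(p)
for t in range(T):
    x = rng.standard_normal(p)
    # Impulsive noise: occasional large outliers (an assumption for illustration)
    noise = 0.1 * rng.standard_normal() + (rng.random() < 0.01) * 10 * rng.standard_normal()
    d = w_true @ x + noise

    # LMLS: behaves like LMF (error cubed) for small errors, like LMS for large errors
    e = d - w_lmls @ x
    w_lmls = w_lmls + mu * x * (alpha * e**3) / (1 + alpha * e**2)

    # LLAD: behaves like LMS for small errors, like the sign algorithm for large errors
    e = d - w_llad @ x
    w_llad = w_llad + mu * x * (alpha * e) / (1 + alpha * abs(e))

print("LMLS MSD:", np.mean((w_lmls - w_true) ** 2))
print("LLAD MSD:", np.mean((w_llad - w_true) ** 2))
```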
Stochastic subgradient algorithms for strongly convex optimization over distributed networks
We study diffusion- and consensus-based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is available only at a different node, and a limited number of gradient oracle calls is allowed at each node. In this framework, we introduce a convex optimization algorithm based on stochastic subgradient descent (SSD) updates. We use a carefully designed time-dependent weighted averaging of the SSD iterates, which yields a convergence rate of O(N√N / ((1−σ)T)) after T gradient updates for each node on a network of N nodes, where 0 ≤ σ < 1 denotes the second largest singular value of the communication matrix. This rate of convergence matches the performance lower bound up to constant terms. Similar to the SSD algorithm, the computational complexity of the proposed algorithm scales linearly with the dimensionality of the data. Furthermore, the communication load of the proposed method is the same as that of the SSD algorithm, so the proposed algorithm is highly efficient in terms of both complexity and communication load. We illustrate the merits of the algorithm with respect to state-of-the-art methods on benchmark real-life data sets. © 2017 IEEE
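A minimal sketch of a consensus-type stochastic subgradient recursion with time-dependent weighted averaging of the iterates, in the spirit of the algorithm described above; the ring communication matrix, the O(1/t) step-size schedule, and the averaging weights (proportional to the iteration index) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, p, T = 4, 5, 5000
w_true = rng.standard_normal(p)

# Doubly stochastic communication matrix over a ring (an assumption)
A = np.zeros((N, N))
for k in range(N):
    A[k, k], A[k, (k - 1) % N], A[k, (k + 1) % N] = 0.5, 0.25, 0.25

W = np.zeros((N, p))            # current SSD iterates, one row per node
W_avg = np.zeros((N, p))        # time-weighted averages of the iterates
weight_sum = 0.0
for t in range(1, T + 1):
    # Consensus step: each node mixes its iterate with its neighbors' iterates
    W = A @ W
    # Local stochastic (sub)gradient step on the node's own square loss
    for k in range(N):
        x = rng.standard_normal(p)
        d = w_true @ x + 0.1 * rng.standard_normal()
        g = -(d - W[k] @ x) * x
        W[k] = W[k] - (1.0 / (t + 10.0)) * g   # O(1/t) step; the offset is for stability
    # Time-dependent weighted averaging of the iterates (weights ~ t, an illustrative choice)
    W_avg = (weight_sum * W_avg + t * W) / (weight_sum + t)
    weight_sum += t

print("per-node MSD of averaged iterates:", np.mean((W_avg - w_true) ** 2, axis=1))
```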
Twice-universal piecewise linear regression via infinite depth context trees
We investigate the problem of sequential piecewise linear regression from a competitive framework. For an arbitrary and unknown data length n, we first introduce a method to partition the regressor space. In particular, we present a recursive method that divides the regressor space into O(n) disjoint regions, which can yield approximately 1.5^n different piecewise linear models on the regressor space. For each region, we introduce a universal linear regressor that performs nearly as well as the best linear regressor whose parameters are set non-causally. We then use an infinite-depth context tree to represent all piecewise linear models and introduce a universal algorithm that achieves the performance of the best piecewise linear model that can be selected in hindsight. In this sense, the introduced algorithm is twice universal: it sequentially achieves the performance of the best model that uses the optimal regression parameters. Our algorithm achieves this performance with a computational complexity upper bounded by O(n) in the worst case and by O(log(n)) under certain regularity conditions. We provide an explicit description of the algorithm as well as upper bounds on the regret with respect to the best nonlinear and piecewise linear models, and demonstrate the performance of the algorithm through simulations. © 2015 IEEE
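The infinite-depth context tree itself is beyond a short sketch, but the twice-universal idea can be illustrated in miniature: per-region online linear regressors (universal within each region) combined by an exponentially weighted mixture over candidate partitions (universal over models). The depth-1 split at zero, the mixture rate, and the toy data model below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
T, lr, a = 3000, 0.05, 1.0 / 8.0          # horizon, regressor step size, mixture rate

# Toy piecewise-linear target on a scalar regressor (an assumption for illustration)
def target(x):
    return 2.0 * x + 1.0 if x < 0 else -1.0 * x + 1.0

# Three online affine regressors: one for the whole space, one per half-space
w_full, w_left, w_right = np.zeros(2), np.zeros(2), np.zeros(2)
L_single, L_piecewise = 0.0, 0.0           # accumulated losses of the two partitions

mse = 0.0
for t in range(1, T + 1):
    x = rng.uniform(-1.0, 1.0)
    phi = np.array([x, 1.0])
    d = target(x) + 0.1 * rng.standard_normal()

    # Predictions of the two competing models (single model vs. depth-1 split at zero)
    y_single = w_full @ phi
    y_piece = (w_left if x < 0 else w_right) @ phi

    # Exponentially weighted mixture over the two partitions ("twice universal" in miniature)
    losses = np.array([L_single, L_piecewise])
    u = np.exp(-a * (losses - losses.min()))   # subtract the minimum for numerical stability
    u /= u.sum()
    y_hat = u[0] * y_single + u[1] * y_piece
    mse += (d - y_hat) ** 2

    # Update accumulated losses and the regressors with online gradient steps
    L_single += (d - y_single) ** 2
    L_piecewise += (d - y_piece) ** 2
    w_full += lr * (d - y_single) * phi
    if x < 0:
        w_left += lr * (d - y_piece) * phi
    else:
        w_right += lr * (d - y_piece) * phi

print("average squared error of the mixture:", mse / T)
```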
Computationally highly efficient mixture of adaptive filters
We introduce a new combination approach for the mixture of adaptive filters based on the set-membership filtering (SMF) framework. We perform SMF to combine the outputs of several parallel-running adaptive algorithms and propose unconstrained, affinely constrained, and convexly constrained combination weight configurations. Here, we achieve a better trade-off between the transient and steady-state convergence performance while providing a significant computational reduction. Hence, through the introduced approaches, we can greatly enhance the convergence performance of the constituent filters with only a slight increase in computational load. In this sense, our approaches are suitable for big data applications where the data should be processed in streams with highly efficient algorithms. In the numerical examples, we demonstrate the superior performance of the proposed approaches over the state of the art using well-known datasets from the machine learning literature. © 2016, Springer-Verlag London
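A minimal sketch of a set-membership-style convex combination of two constituent LMS filters: the combination weight is adapted only when the combined error exceeds a prescribed bound, which is where the computational savings come from. The sigmoid parameterization of the weight, the bound γ, and the constituent filters are illustrative assumptions rather than the exact configurations proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
p, T = 5, 5000
w_true = rng.standard_normal(p)
mu_fast, mu_slow, mu_a, gamma = 0.05, 0.005, 1.0, 0.2   # illustrative choices

w1, w2 = np.zeros(p), np.zeros(p)   # two constituent LMS filters (fast and slow)
a = 0.0                             # combination parameter, lambda = sigmoid(a)
updates = 0
for t in range(T):
    x = rng.standard_normal(p)
    d = w_true @ x + 0.1 * rng.standard_normal()

    y1, y2 = w1 @ x, w2 @ x
    lam = 1.0 / (1.0 + np.exp(-a))
    e = d - (lam * y1 + (1.0 - lam) * y2)

    # Set-membership check: adapt the combination only if the error exceeds the bound
    if abs(e) > gamma:
        a += mu_a * e * (y1 - y2) * lam * (1.0 - lam)
        a = np.clip(a, -4.0, 4.0)
        updates += 1

    # The constituent filters run their own LMS updates on their own errors
    w1 += mu_fast * (d - y1) * x
    w2 += mu_slow * (d - y2) * x

print("combination updates used:", updates, "out of", T)
print("MSD of combined filter:", np.mean((lam * w1 + (1 - lam) * w2 - w_true) ** 2))
```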
Piecewise nonlinear regression via decision adaptive trees
We investigate the problem of adaptive nonlinear regression and introduce tree-based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We partition the regressor space using hyperplanes in a nested structure according to the notion of a tree. In this manner, we introduce an adaptive nonlinear regression algorithm that not only adapts the regressor of each partition but also learns the complete tree structure with a computational complexity only polynomial in the number of nodes of the tree. Our algorithm is constructed to directly minimize the final regression error without introducing any ad hoc parameters. Moreover, our method can be readily incorporated into any tree construction method, as demonstrated in the paper. © 2014 EURASIP
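A minimal sketch of jointly adapting the region regressors and the partitioning itself, here as a depth-1 soft tree: a sigmoid-gated separating hyperplane and two affine leaf regressors, all trained by gradient steps on the final squared error. The tree depth, the gating function, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
p, T, lr = 2, 5000, 0.05

# Toy nonlinear target (an assumption for illustration)
def target(x):
    return 1.0 if x[0] + x[1] > 0 else -1.0

# Depth-1 soft tree: a sigmoid-gated hyperplane split and two affine leaf regressors
v = 0.1 * rng.standard_normal(p + 1)     # separating hyperplane (with bias term)
w_l = np.zeros(p + 1)                    # left-leaf regressor
w_r = np.zeros(p + 1)                    # right-leaf regressor

mse = 0.0
for t in range(1, T + 1):
    x = rng.uniform(-1, 1, p)
    phi = np.append(x, 1.0)
    d = target(x) + 0.1 * rng.standard_normal()

    s = 1.0 / (1.0 + np.exp(-v @ phi))   # soft decision: weight of the right child
    y = s * (w_r @ phi) + (1.0 - s) * (w_l @ phi)
    e = d - y
    mse += e * e

    # Gradient steps on the final squared error for both the leaves and the split
    leaf_diff = (w_r - w_l) @ phi
    w_r += lr * e * s * phi
    w_l += lr * e * (1.0 - s) * phi
    v += lr * e * leaf_diff * s * (1.0 - s) * phi

print("running MSE:", mse / T)
```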
Sequential Nonlinear Learning for Distributed Multiagent Systems via Extreme Learning Machines
We study online nonlinear learning over distributed multiagent systems, where each agent employs a single hidden layer feedforward neural network (SLFN) structure to sequentially minimize arbitrary loss functions. In particular, each agent trains its own SLFN using only the data that is revealed to itself. On the other hand, the aim of the multiagent system is to train the SLFN at each agent to perform as well as the optimal centralized batch SLFN that has access to all the data, by exchanging information between neighboring agents. We address this problem by introducing a distributed subgradient-based extreme learning machine algorithm. The proposed algorithm provides guaranteed upper bounds on the performance of the SLFN at each agent and shows that each of these individual SLFNs asymptotically achieves the performance of the optimal centralized batch SLFN. Our performance guarantees explicitly distinguish the effects of data- and network-dependent parameters on the convergence rate of the proposed algorithm. The experimental results illustrate that the proposed algorithm achieves the oracle performance significantly faster than the state-of-the-art methods in the machine learning and signal processing literature. Hence, the proposed method is highly appealing for applications involving big data. © 2016 IEEE
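A minimal sketch of a distributed ELM-style learner: a randomly drawn hidden layer shared by all agents, local subgradient updates of the output weights on each agent's own stream, and a combination step with neighbors. The ring network, the tanh hidden layer, and the toy target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
N, p, h, T, lr = 4, 3, 20, 3000, 0.05   # agents, input dim, hidden units, steps, step size

# ELM-style SLFN: the input weights are drawn randomly once and kept fixed for all agents
W_in = rng.standard_normal((h, p))
b_in = rng.standard_normal(h)
hidden = lambda x: np.tanh(W_in @ x + b_in)

# Doubly stochastic combination matrix over a ring (an assumption)
A = np.zeros((N, N))
for k in range(N):
    A[k, k], A[k, (k - 1) % N], A[k, (k + 1) % N] = 0.5, 0.25, 0.25

def target(x):                          # toy nonlinear function to learn (an assumption)
    return np.sin(x[0]) + x[1] * x[2]

beta = np.zeros((N, h))                 # output-layer weights, one row per agent
for t in range(T):
    # Each agent takes a subgradient step on its own streaming sample
    for k in range(N):
        x = rng.uniform(-1, 1, p)
        d = target(x) + 0.05 * rng.standard_normal()
        z = hidden(x)
        e = d - beta[k] @ z
        beta[k] = beta[k] + lr * e * z
    # Exchange and combine the output weights with neighbors
    beta = A @ beta

# Evaluate each agent's SLFN on fresh data
X_test = rng.uniform(-1, 1, (500, p))
errs = [np.mean([(target(x) - beta[k] @ hidden(x)) ** 2 for x in X_test]) for k in range(N)]
print("per-agent test MSE:", np.round(errs, 4))
```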
Communication efficient channel estimation over distributed networks
We study diffusion-based channel estimation in distributed architectures suitable for various communication applications such as cognitive radios. Although the demand for distributed processing is steadily growing, these architectures require a substantial amount of communication among their nodes (or processing elements), causing significant energy consumption and an increased carbon footprint. Due to the growing awareness of the telecommunication industry's impact on the environment, the need to mitigate this problem is indisputable. To this end, we introduce algorithms that significantly reduce the communication load between distributed nodes, which is the main source of energy consumption, while providing outstanding performance. In this framework, after each node produces its local estimate of the communication channel, a single bit or a few bits of information are generated using certain random projections. This newly generated data is diffused and then used at neighboring nodes to recover the original full information, i.e., the estimate of the desired communication channel. We provide the complete state-space description of these algorithms and demonstrate the substantial gains through our experiments. © 2014 IEEE
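A minimal sketch of the single-bit idea for one transmitting node and one neighbor: the node runs ordinary LMS locally, transmits only the sign of a random projection, and the neighbor updates its copy of the estimate along the projection direction. The specific reconstruction rule below is an illustrative assumption, not necessarily the paper's exact state-space recursion.

```python
import numpy as np

rng = np.random.default_rng(8)
p, T, mu, eta = 5, 5000, 0.05, 0.01
w_true = rng.standard_normal(p)      # unknown communication channel

w_local = np.zeros(p)    # full-precision estimate at the transmitting node
w_copy = np.zeros(p)     # neighbor's reconstruction, driven only by single bits
for t in range(T):
    # Local LMS channel estimation at the transmitting node
    x = rng.standard_normal(p)
    d = w_true @ x + 0.1 * rng.standard_normal()
    w_local += mu * (d - w_local @ x) * x

    # Communication: a single bit -- the sign of a random projection of the difference
    # between the local estimate and the neighbor's current copy. The projection
    # direction q can be generated from a shared seed, so it need not be transmitted.
    # (This reconstruction rule is an illustrative assumption.)
    q = rng.standard_normal(p)
    bit = np.sign(q @ (w_local - w_copy))

    # The neighbor recovers/updates its copy of the estimate from the received bit
    w_copy += eta * bit * q

print("MSD of local estimate:       ", np.mean((w_local - w_true) ** 2))
print("MSD of one-bit neighbor copy:", np.mean((w_copy - w_true) ** 2))
```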
Comprehensive lower bounds on sequential prediction
We study the problem of sequential prediction of real-valued sequences under the squared error loss function. While refraining from any statistical or structural assumptions on the underlying sequence, we introduce a competitive approach to this problem and compare the performance of a sequential algorithm against a large and continuous class of parametric predictors. We define the performance difference between a sequential algorithm and the best parametric predictor as regret, and introduce guaranteed worst-case lower bounds on this relative performance measure. In particular, we prove that for any sequential algorithm there always exists a sequence for which this regret is lower bounded by zero. We then extend this result by showing that the prediction problem can be transformed into a parameter estimation problem if the class of parametric predictors satisfies a certain property, and we provide a comprehensive lower bound for this case. © 2014 EURASIP
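For concreteness, the regret against the parametric class can be written as follows; the notation and the form of the parametric predictor f_w are illustrative assumptions, not the paper's exact parameterization.

```latex
% Regret of a sequential predictor \hat{x}_t over n rounds against the best
% parametric predictor, with the parameter w chosen in hindsight
R_n \;=\; \sum_{t=1}^{n} \big(x_t - \hat{x}_t\big)^2
      \;-\; \min_{w} \sum_{t=1}^{n} \big(x_t - f_w(x_1,\dots,x_{t-1})\big)^2
```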
