An Efficient Interior-Point Decomposition Algorithm for Parallel Solution of Large-Scale Nonlinear Problems with Significant Variable Coupling

Author: Kang Jia
Publication date: 06/04/2016

In this dissertation we develop multiple algorithms for efficient parallel solution of structured nonlinear programming problems by decomposition of the linear augmented system solved at each iteration of a nonlinear interior-point approach. In particular, we address large-scale, block-structured problems with a significant number of complicating, or coupling, variables. This structure arises in many important problem classes including multi-scenario optimization, parameter estimation, two-stage stochastic programming, optimal control and power network problems. The structure of these problems induces a block-angular structure in the augmented system, and parallel solution is possible using a Schur-complement decomposition. Three major variants are implemented: a serial, full-space interior-point method, serial and parallel versions of an explicit Schur-complement decomposition, and serial and parallel versions of an implicit PCG-based Schur-complement decomposition. All of these algorithms have been implemented in C++ in an extensible software framework for nonlinear optimization. The explicit Schur-complement decomposition is typically effective for problems with a few hundred coupling variables. We demonstrate the performance of our implementation on an important problem in optimal power grid operation, the contingency-constrained AC optimal power flow (ACOPF) problem. We present a rectangular IV formulation for the contingency-constrained ACOPF problem and demonstrate that the explicit Schur-complement decomposition can dramatically reduce solution times for a problem with a large number of contingency scenarios. Moreover, we compare the explicit Schur-complement decomposition implementation with the Progressive Hedging approach provided by Pyomo, showing that the internal decomposition approach is computationally favorable to the external approach.
However, the explicit Schur-complement decomposition approach is not appropriate for problems with a large number of coupling variables because of the high computational cost associated with forming and solving the dense Schur complement. We show that this bottleneck can be overcome by solving the Schur-complement equations implicitly using a quasi-Newton preconditioned conjugate gradient (PCG) method. This new algorithm avoids explicit formation and factorization of the Schur complement. The computational efficiency of the serial and parallel versions of this algorithm is compared with the serial full-space approach and with the serial and parallel explicit Schur-complement approaches on a set of quadratic parameter estimation problems and nonlinear optimization problems. These results show that the PCG implicit Schur-complement approach dramatically reduces the computational expense for problems with many coupling variables.
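The two Schur-complement variants described in this abstract can be illustrated on a tiny block-bordered system. The following is only a minimal sketch under stated assumptions, not the dissertation's C++ implementation: the function names, the dense pure-Python solver, and the example data are all hypothetical, the blocks are assumed nonsingular, the Schur complement is assumed symmetric positive definite so that plain conjugate gradients applies, and the quasi-Newton preconditioner is omitted. Each block row reads K_i x_i + B_i y = r_i and the coupling row reads sum_i B_i^T x_i + K_0 y = r_0, so S = K_0 - sum_i B_i^T K_i^{-1} B_i.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def column(B, j):
    return [row[j] for row in B]

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - dot(M[i][i + 1:n], x[i + 1:])) / M[i][i]
    return x

def reduced_rhs(K_blocks, B_blocks, r_blocks, r0):
    """r0 - sum_i B_i^T K_i^{-1} r_i; each block's term is independent."""
    rhs = list(r0)
    for K, B, r in zip(K_blocks, B_blocks, r_blocks):
        w = solve(K, r)
        for j in range(len(rhs)):
            rhs[j] -= dot(column(B, j), w)
    return rhs

def schur_matvec(K_blocks, B_blocks, K0, v):
    """S v = K0 v - sum_i B_i^T K_i^{-1} (B_i v), without ever forming S."""
    out = [dot(row, v) for row in K0]
    for K, B in zip(K_blocks, B_blocks):  # independent per block
        w = solve(K, [dot(row, v) for row in B])
        for j in range(len(v)):
            out[j] -= dot(column(B, j), w)
    return out

def form_schur(K_blocks, B_blocks, K0):
    """Explicit variant: build dense S one column at a time (S e_j)."""
    m = len(K0)
    cols = [schur_matvec(K_blocks, B_blocks, K0,
                         [1.0 if i == j else 0.0 for i in range(m)])
            for j in range(m)]
    return [[cols[j][i] for j in range(m)] for i in range(m)]

def cg(matvec, b, tol=1e-10, max_iter=100):
    """Unpreconditioned conjugate gradients (assumes an SPD operator)."""
    x, r = [0.0] * len(b), list(b)
    p, rs = list(r), dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def recover_blocks(K_blocks, B_blocks, r_blocks, y):
    """Back-substitute: x_i = K_i^{-1} (r_i - B_i y)."""
    return [solve(K, [ri - dot(row, y) for ri, row in zip(r, B)])
            for K, B, r in zip(K_blocks, B_blocks, r_blocks)]

# Hypothetical data: two 2x2 blocks coupled through two shared variables.
K_blocks = [[[4.0, 1.0], [1.0, 3.0]], [[2.0, 0.0], [0.0, 5.0]]]
B_blocks = [[[1.0, 0.0], [2.0, 1.0]], [[1.0, 1.0], [0.0, 1.0]]]
K0 = [[10.0, 1.0], [1.0, 8.0]]
r_blocks = [[1.0, 2.0], [3.0, 4.0]]
r0 = [5.0, 6.0]

rhs = reduced_rhs(K_blocks, B_blocks, r_blocks, r0)
y_explicit = solve(form_schur(K_blocks, B_blocks, K0), rhs)
y_implicit = cg(lambda v: schur_matvec(K_blocks, B_blocks, K0, v), rhs)
x_blocks = recover_blocks(K_blocks, B_blocks, r_blocks, y_explicit)
```

In this sketch the explicit variant needs one block solve per coupling variable per block to form S, which is the cost that grows with the number of coupling variables; the implicit variant needs one block solve per block per CG iteration.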

Texas A&M Repository

Graphics Processing Unit–Enhanced Genetic Algorithms for Solving the Temporal Dynamics of Gene Regulatory Networks

Author: Córdoba Zurita Antonio
Díaz del Río Fernando
García Calvo Agustín
Guisado Lízar José Luís
Jiménez-Morales Francisco de Paula
Publication venue: 'SAGE Publications'
Publication date: 01/01/2018

Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge number of rule combinations and the inherently nonlinear nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that needs long runtimes on conventional architectures for realistic network sizes, it is fundamental to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes (master-slave, island, cellular, and hybrid models) and various individual selection methods (roulette, elitist) is carried out for this problem. Several procedures that optimize the use of the GPU’s resources are presented. We conclude that the implementation that produces the best results (from both the performance and the genetic algorithm fitness perspectives) simulates a few thousand individuals grouped in a few islands using elitist selection. This model combines two powerful factors for discovering the best solutions: finding good individuals in a small number of generations, and introducing genetic diversity via relatively frequent and numerous migrations. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out.
In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on a medium-class GPU over an equivalent sequential single-core implementation running on a recent Intel i7 CPU. This work can provide useful guidance to researchers in biology, medicine, or bioinformatics on how to take advantage of parallelization on massively parallel devices and GPUs to apply novel nature-inspired metaheuristic algorithms to real-world applications (such as the method to solve the temporal dynamics of GRNs).
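The island scheme with elitist selection and periodic migration that this abstract favors can be sketched as follows. This is a sequential toy illustration under stated assumptions, not the article's CUDA implementation: the fitness function is a stand-in (count of 1 bits) rather than the GRN rule-matching objective, and every name and parameter value (`run_islands`, `n_elite`, `migrate_every`, ring migration, and so on) is hypothetical.

```python
import random

def fitness(ind):
    # Stand-in objective (OneMax); the article's GRN fitness is not reproduced.
    return sum(ind)

def evolve_island(pop, rng, n_elite, p_mut):
    """One generation with elitist selection: the best n_elite individuals
    survive unchanged and are the only parents of the remaining children."""
    elite = sorted(pop, key=fitness, reverse=True)[:n_elite]
    children = [list(e) for e in elite]
    while len(children) < len(pop):
        a, b = rng.sample(elite, 2)
        cut = rng.randrange(1, len(a))              # one-point crossover
        child = a[:cut] + b[cut:]
        child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
        children.append(child)
    return children

def migrate(islands, n_migrants):
    """Ring migration: copies of each island's best individuals replace
    the worst individuals of the next island."""
    bests = [sorted(pop, key=fitness, reverse=True)[:n_migrants]
             for pop in islands]
    for i, pop in enumerate(islands):
        pop.sort(key=fitness)                       # worst first
        incoming = bests[(i - 1) % len(islands)]
        pop[:n_migrants] = [list(m) for m in incoming]

def run_islands(n_islands=4, island_size=50, n_bits=32, generations=60,
                migrate_every=5, n_elite=10, n_migrants=5, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(island_size)]
               for _ in range(n_islands)]
    for g in range(1, generations + 1):
        # Islands evolve independently; this loop is what a parallel
        # implementation would distribute across workers.
        islands = [evolve_island(pop, rng, n_elite, p_mut) for pop in islands]
        if g % migrate_every == 0:
            migrate(islands, n_migrants)
    return max((ind for pop in islands for ind in pop), key=fitness)

best = run_islands()
```

Because elites are copied unchanged and migration only overwrites the worst individuals, the best fitness in this sketch never decreases from one generation to the next.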

idUS. Depósito de Investigación Universidad de Sevilla

Adaptation and learning over networks for nonlinear system modeling

Author: Argyriou
Balakrishnan
Bouboulis
Cattivelli
Cevher
Chen
Chouvardas
Di Lorenzo
Evgeniou
Forero
Gao
Honeine
Huang
Igelnik
Jin
Lazarevic
Li
Lopes
Mateos
Matta
Nassif
Navia-Vazquez
Parreira
Predd
Rahimi
Richard
Rusu
Sandryhaila
Sayed
Scardapane
Scarpiniti
Shin
Singh
Tsitsiklis
Yuan
Zhao
Publication date: 28/04/2017

In this chapter, we analyze nonlinear filtering problems in distributed environments, e.g., sensor networks or peer-to-peer protocols. In these scenarios, the agents in the environment receive measurements in a streaming fashion, and they are required to estimate a common (nonlinear) model by alternating local computations and communications with their neighbors. We focus on the important distinction between single-task problems, where the underlying model is common to all agents, and multitask problems, where each agent might converge to a different model due to, e.g., spatial dependencies or other factors. Currently, most of the literature on distributed learning in the nonlinear case has focused on the single-task case, which may be a strong limitation in real-world scenarios. After introducing the problem and reviewing the existing approaches, we describe a simple kernel-based algorithm tailored for the multitask case. We evaluate the proposal on a simulated benchmark task, and we conclude by detailing currently open problems and lines of research. Comment: To be published as a chapter in 'Adaptive Learning Methods for Nonlinear System Modeling', Elsevier Publishing, Eds. D. Comminiello and J.C. Principe (2018).
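The alternation of local computation and neighbor communication mentioned in this abstract can be illustrated with a small single-task sketch. This is not the chapter's multitask kernel algorithm: it assumes an adapt-then-combine LMS update over a fixed random Fourier feature map (a common finite-dimensional approximation of a Gaussian kernel), a ring network with uniform averaging, and a synthetic target; all names and parameter values are hypothetical.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def make_features(n_features, bandwidth, rng):
    """Random Fourier features approximating a Gaussian kernel (assumption)."""
    ws = [rng.gauss(0.0, bandwidth) for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)
    return lambda x: [scale * math.cos(w * x + b) for w, b in zip(ws, bs)]

def target(x):
    # Synthetic common nonlinear model shared by all agents (single-task case).
    return math.sin(3.0 * x)

def mse(weights, phi, grid):
    return sum((target(x) - dot(weights, phi(x))) ** 2 for x in grid) / len(grid)

def diffusion_lms(n_agents=5, n_features=50, steps=2000, mu=0.2,
                  noise=0.05, seed=0):
    rng = random.Random(seed)
    phi = make_features(n_features, 3.0, rng)
    grid = [-1.0 + 0.05 * i for i in range(41)]
    W = [[0.0] * n_features for _ in range(n_agents)]
    initial = sum(mse(w, phi, grid) for w in W) / n_agents
    for _ in range(steps):
        # Adapt: each agent takes one LMS step on its own streaming sample.
        psi = []
        for w in W:
            x = rng.uniform(-1.0, 1.0)
            z = phi(x)
            err = target(x) + rng.gauss(0.0, noise) - dot(w, z)
            psi.append([wi + mu * err * zi for wi, zi in zip(w, z)])
        # Combine: each agent averages with its two ring neighbors.
        W = []
        for k in range(n_agents):
            nbrs = [psi[(k - 1) % n_agents], psi[k], psi[(k + 1) % n_agents]]
            W.append([sum(vals) / 3.0 for vals in zip(*nbrs)])
    final = sum(mse(w, phi, grid) for w in W) / n_agents
    return initial, final

initial_mse, final_mse = diffusion_lms()
```

In a multitask setting, uniform averaging of this kind would pull agents with different underlying models toward a single compromise estimate, which is one way to see why the single-task assumption can be limiting.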

arXiv.org e-Print Archive

Crossref

HAL-INSU