3,403 research outputs found

    The H2 Control Problem for Quadratically Invariant Systems with Delays

    This paper gives a new solution to the output feedback H2 problem for quadratically invariant communication delay patterns. A characterization of all stabilizing controllers satisfying the delay constraints is given and the decentralized H2 problem is cast as a convex model matching problem. The main result shows that the model matching problem can be reduced to a finite-dimensional quadratic program. A recursive state-space method for computing the optimal controller based on vectorization is given. Comment: Draft submitted to IEEE Transactions on Automatic Control

    Memory-constrained Vectorization and Scheduling of Dataflow Graphs for Hybrid CPU-GPU Platforms

    The increasing use of heterogeneous embedded systems with multi-core CPUs and Graphics Processing Units (GPUs) presents important challenges in effectively exploiting pipeline, task and data-level parallelism to meet throughput requirements of digital signal processing (DSP) applications. Moreover, in the presence of system-level memory constraints, hand optimization of code to satisfy these requirements is inefficient and error-prone, and can therefore greatly slow down development or result in highly underutilized processing resources. In this paper, we present vectorization and scheduling methods to effectively exploit multiple forms of parallelism for throughput optimization on hybrid CPU-GPU platforms, while conforming to system-level memory constraints. The methods operate on synchronous dataflow representations, which are widely used in the design of embedded systems for signal and information processing. We show that our novel methods can significantly improve system throughput compared to previous vectorization and scheduling approaches under the same memory constraints. In addition, we present a practical case study of applying our methods to significantly improve the throughput of an orthogonal frequency division multiplexing (OFDM) receiver system for wireless communications. Comment: 25 pages
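
For readers unfamiliar with synchronous dataflow, the sketch below solves the standard balance equations of a small graph to get its repetitions vector, that is, how many times each actor fires in one schedule period. It is a minimal illustration, not the paper's vectorization or scheduling method. The function name `repetitions_vector`, the edge tuple layout and the three-actor example graph are all assumptions made for this example.

```python
from fractions import Fraction
from math import gcd


def repetitions_vector(actors, edges):
    """Solve q[src] * produced == q[dst] * consumed for every edge.

    `edges` holds (src, dst, produced, consumed) tuples (hypothetical layout).
    Returns the smallest positive integer solution, assuming a connected graph.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, produced, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
            elif src in rate and dst in rate:
                # Both ends known: the balance equation must hold exactly.
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: no periodic schedule")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")

    # Scale the rational solution to the smallest integer one.
    scale = 1
    for r in rate.values():
        scale = scale * r.denominator // gcd(scale, r.denominator)
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = 0
    for c in counts.values():
        common = gcd(common, c)
    return {a: c // common for a, c in counts.items()}


# Hypothetical graph: A produces 2 tokens per firing, B consumes 3;
# B produces 1 token per firing, C consumes 2.
q = repetitions_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
print(q)
```

Firing an actor in larger batches (vectorizing it) generally increases the tokens buffered on its edges, which is why throughput and memory constraints pull in opposite directions.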

    Convex Global 3D Registration with Lagrangian Duality

    The registration of 3D models by a Euclidean transformation is a fundamental task at the core of many applications in computer vision. This problem is non-convex due to the presence of rotational constraints, making traditional local optimization methods prone to getting stuck in local minima. This paper addresses finding the globally optimal transformation in various 3D registration problems by a unified formulation that integrates common geometric registration modalities (namely point-to-point, point-to-line and point-to-plane). This formulation renders the optimization problem independent of both the number and nature of the correspondences. The main novelty of our proposal is the introduction of a strengthened Lagrangian dual relaxation for this problem, which surpasses previous similar approaches [32] in effectiveness. In fact, although there are no theoretical guarantees, exhaustive empirical evaluation in both synthetic and real experiments always resulted in a tight relaxation, which allowed us to recover a guaranteed globally optimal solution by exploiting duality theory. Thus, our approach allows for effectively solving the 3D registration problem with global optimality guarantees while running in a fraction of the time required by the state-of-the-art alternative [34], which is based on a more computationally intensive Branch and Bound method. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Higher-Order Low-Rank Regression

    This paper proposes an efficient algorithm (HOLRR) to handle regression tasks where the outputs have a tensor structure. We formulate the regression problem as the minimization of a least squares criterion under a multilinear rank constraint, a difficult non-convex problem. HOLRR efficiently computes an approximate solution of this problem, with solid theoretical guarantees. A kernel extension is also presented. Experiments on synthetic and real data show that HOLRR outperforms multivariate and multilinear regression methods and is considerably faster than existing tensor methods. Comment: submitted to ICML 201

    Polly's Polyhedral Scheduling in the Presence of Reductions

    The polyhedral model provides a powerful mathematical abstraction to enable effective optimization of loop nests with respect to a given optimization goal, e.g., exploiting parallelism. Unexploited reduction properties are a frequent reason for polyhedral optimizers to assume parallelism-prohibiting dependences. To our knowledge, no polyhedral loop optimizer available in any production compiler provides support for reductions. In this paper, we show that leveraging the parallelism of reductions can lead to a significant performance increase. We give a precise, dependence-based definition of reductions and discuss ways to extend polyhedral optimization to exploit the associativity and commutativity of reduction computations. We have implemented a reduction-enabled scheduling approach in the Polly polyhedral optimizer and evaluate it on the standard Polybench 3.2 benchmark suite. We were able to detect and model all 52 arithmetic reductions and achieve speedups of up to 2.21× on a quad-core machine by exploiting the multidimensional reduction in the BiCG benchmark. Comment: Presented at the IMPACT15 workshop
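
As a rough illustration of why associativity removes the serializing dependence in a reduction, the sketch below splits a sum into independent chunks and then combines the partial results. It is not Polly's implementation; `chunked_reduce` and its parameters are hypothetical names chosen for this example, and the chunks are processed sequentially here, although each could run on its own core.

```python
from functools import reduce
from operator import add


def chunked_reduce(values, op, identity, num_chunks):
    """Reduce `values` with an associative `op` via per-chunk partial results.

    Each chunk depends only on its own slice, so the chunks carry no
    dependences on one another; only the final combine step is sequential.
    """
    n = len(values)
    size = max(1, -(-n // num_chunks))  # ceiling division, at least 1
    partials = [
        reduce(op, values[start:start + size], identity)
        for start in range(0, n, size)
    ]
    return reduce(op, partials, identity)


data = list(range(1, 101))
sequential = reduce(add, data, 0)
chunked = chunked_reduce(data, add, 0, num_chunks=4)
print(sequential, chunked)
```

Because the chunk order is preserved in the final combine, this version relies only on associativity; commutativity would additionally allow the partial results to be combined in any order.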

    Particle-in-Cell Laser-Plasma Simulation on Xeon Phi Coprocessors

    This paper concerns the development of a high-performance implementation of the Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors. We discuss the suitability of the method for the Xeon Phi architecture and present our experience of porting and optimizing the existing parallel Particle-in-Cell code PICADOR. Direct porting with no code modification gives performance on Xeon Phi close to that of an 8-core CPU on a benchmark problem with 50 particles per cell. We demonstrate step-by-step application of optimization techniques such as improving data locality, enhancing parallelization efficiency and vectorization, which leads to a 3.75x speedup on CPU and 7.5x on Xeon Phi. The optimized version achieves 18.8 ns per particle update on Intel Xeon E5-2660 CPU and 9.3 ns per particle update on Intel Xeon Phi 5110P. On a real problem of laser ion acceleration in targets with surface grating, which requires a large number of macroparticles per cell, the speedup of Xeon Phi compared to CPU is 1.6x. Comment: 16 pages, 3 figures

    An algorithm for the optimization of finite element integration loops

    We present an algorithm for the optimization of a class of finite element integration loop nests. This algorithm, which exploits fundamental mathematical properties of finite element operators, is proven to achieve a locally optimal operation count. In specified circumstances the optimum achieved is global. Extensive numerical experiments demonstrate significant performance improvements over the state of the art in finite element code generation in almost all cases. This validates the effectiveness of the algorithm presented here, and illustrates its limitations.

    Vectorization of quantum operations and its use

    We give a detailed exposition of the "vectorized" notation for dealing with quantum operations. This notation is used to highlight the relationships between representations of completely-positive dynamics. Vectorization considerably simplifies the analysis of different methods of quantum process tomography, and enables us to derive a compact representation of the investigated quantum operations in terms of the resulting data. Comment: 12 pages, long-overdue modifications + corrections following the feedback
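
The idea behind the vectorized notation can be checked numerically with the standard identity vec(A X B) = (Bᵀ ⊗ A) vec(X). The sketch below assumes the column-stacking convention for vec, which may differ from the paper's, and uses a made-up single-qubit channel with Kraus operators √(1−p)·I and √p·Y. All helper names are hypothetical. It compares the direct Kraus sum with the action of the superoperator matrix Σ conj(K) ⊗ K on vec(ρ).

```python
from math import sqrt


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def conj(a):
    return [[x.conjugate() for x in row] for row in a]


def dagger(a):
    # Conjugate transpose.
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def kron(a, b):
    # Kronecker product: block (i, j) of the result is a[i][j] * b.
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def vec(x):
    # Column-stacking vectorization (an assumed convention).
    return [x[i][j] for j in range(len(x[0])) for i in range(len(x))]


def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matvec(m, v):
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


p = 0.2  # hypothetical error probability
kraus = [
    [[sqrt(1 - p), 0], [0, sqrt(1 - p)]],         # sqrt(1-p) * I
    [[0, -1j * sqrt(p)], [1j * sqrt(p), 0]],      # sqrt(p) * Y
]
rho = [[0.75, 0.25j], [-0.25j, 0.25]]             # a valid density matrix

# Direct application: sum_k K rho K^dagger.
direct = [[0, 0], [0, 0]]
for k in kraus:
    direct = madd(direct, matmul(matmul(k, rho), dagger(k)))

# Vectorized form: (sum_k conj(K) kron K) acting on vec(rho).
superop = [[0] * 4 for _ in range(4)]
for k in kraus:
    superop = madd(superop, kron(conj(k), k))
via_superop = matvec(superop, vec(rho))

max_error = max(abs(x - y) for x, y in zip(vec(direct), via_superop))
print(max_error)
```

With a row-stacking convention the Kronecker factors swap, giving Σ K ⊗ conj(K) instead.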

    What's In A Patch, I: Tensors, Differential Geometry and Statistical Shading Analysis

    We develop a linear algebraic framework for the shape-from-shading problem, because tensors arise when scalar (e.g. image) and vector (e.g. surface normal) fields are differentiated multiple times. The work is in two parts. In this first part we investigate when image derivatives exhibit invariance to changing illumination by calculating the statistics of image derivatives under general distributions on the light source. We computationally validate the hypothesis that image orientations (derivatives) provide increased invariance to illumination by showing (for a Lambertian model) that a shape-from-shading algorithm matching gradients instead of intensities provides more accurate reconstructions when illumination is incorrectly estimated under a flatness prior.

    Continuous-Time Inverse Quadratic Optimal Control Problem

    In this paper, the problem of finite horizon inverse optimal control (IOC) is investigated, where the quadratic cost function of a dynamic process must be recovered from observations of optimal control sequences. We give the first complete result on the necessary and sufficient condition for the existence of corresponding LQ cost functions. In feasible cases, the analytic expression of the whole solution space is derived and the equivalence of weighting matrices in LQ problems is discussed. For infeasible problems, an infinite dimensional convex problem is formulated to obtain a best-fit approximate solution with minimal control residual. The optimality condition is then solved under a static quadratic programming framework to facilitate the computation. Finally, numerical simulations are used to demonstrate the effectiveness and feasibility of the proposed methods. Comment: 16 pages, 2 figures