
    Analysis and Design of Tuned Turbo Codes

    It has been widely observed that there exists a fundamental trade-off between the minimum (Hamming) distance properties and the iterative decoding convergence behavior of turbo-like codes. While capacity-achieving code ensembles typically are asymptotically bad, in the sense that their minimum distance does not grow linearly with block length, and therefore exhibit an error floor at moderate-to-high signal-to-noise ratios, asymptotically good codes usually converge further away from channel capacity. In this paper, we introduce the concept of tuned turbo codes, a family of asymptotically good hybrid concatenated code ensembles, where asymptotic minimum distance growth rates, convergence thresholds, and code rates can be traded off using two tuning parameters, λ and μ. By decreasing λ, the asymptotic minimum distance growth rate is reduced in exchange for improved iterative decoding convergence behavior, while increasing λ raises the asymptotic minimum distance growth rate at the expense of worse convergence behavior; the code performance can thus be tuned to fit the desired application. By decreasing μ, a similar tuning behavior can be achieved for higher-rate code ensembles.
    Comment: Accepted for publication in IEEE Transactions on Information Theory
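    For reference, the growth-rate notion used above can be stated as follows (a standard definition, sketched here under the usual conventions; the paper's precise ensemble definitions may differ):

        % Minimum distance growth rate of a code ensemble with block length N
        % (the ensemble is asymptotically good when delta > 0)
        \delta = \liminf_{N \to \infty} \frac{d_{\min}(N)}{N}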

    Parameter-Efficient Finetuning of Transformers for Source Code

    Pretrained Transformers achieve state-of-the-art performance in various code-processing tasks but may be too large to be deployed. Since software development tools often incorporate modules for several purposes, which could all share a single instance of the pretrained model, it is natural to apply parameter-efficient fine-tuning to pretrained models of code. In this work, we test two widely used approaches, adapters and LoRA, which were initially proposed for NLP tasks, on four code-processing tasks. We find that, although the efficient fine-tuning approaches can achieve comparable or higher performance than standard full fine-tuning on code understanding tasks, they underperform full fine-tuning on code-generation tasks. These results underline the importance of testing efficient fine-tuning approaches in domains other than NLP and motivate future research in efficient fine-tuning for source code.
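    To make the LoRA approach mentioned above concrete, here is a minimal PyTorch sketch (generic, not the paper's actual experimental setup; layer sizes and rank are illustrative): a frozen pretrained linear layer is augmented with a trainable low-rank update, so only a small fraction of the parameters is fine-tuned.

        import torch
        import torch.nn as nn

        class LoRALinear(nn.Module):
            """Minimal LoRA wrapper: y = base(x) + (alpha/r) * B(A(x)).

            Only the low-rank matrices A and B are trained; the
            pretrained weight stays frozen.
            """
            def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
                super().__init__()
                self.base = base
                for p in self.base.parameters():    # freeze pretrained weights
                    p.requires_grad = False
                self.lora_a = nn.Linear(base.in_features, r, bias=False)
                self.lora_b = nn.Linear(r, base.out_features, bias=False)
                nn.init.zeros_(self.lora_b.weight)  # update starts at zero
                self.scale = alpha / r

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

        # Usage on a stand-in for one pretrained projection (hypothetical sizes):
        layer = nn.Linear(768, 768)
        lora_layer = LoRALinear(layer, r=8)
        out = lora_layer(torch.randn(2, 768))  # ~12k trainable vs ~590k frozen params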

    Non regression testing for the JOREK code

    Non-regression testing (NRT) aims to check whether software modifications result in undesired behaviour. Assuming the previous behaviour of the application is known, this kind of test makes it possible to identify a possible regression, i.e. a bug. Improving and tuning a parallel code can be a time-consuming and difficult task, especially when people from different scientific fields interact closely. The JOREK code aims at investigating magnetohydrodynamic (MHD) instabilities in a tokamak plasma. This paper describes the NRT procedure that has been tuned for this simulation code. Automation of the NRT is one key point to keeping the code healthy in a source code repository.
    Comment: No. RR-8134 (2012)
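    The core of such a procedure can be sketched as follows (a hedged illustration under an assumed file layout; the command, file names and test cases are hypothetical placeholders, not JOREK's actual NRT scripts): run each test case and compare its key scalar outputs against stored reference values within a tolerance.

        import json
        import subprocess
        import sys

        TOL = 1e-8  # relative tolerance; a real NRT tunes this per quantity

        def run_case(case: str) -> dict:
            """Run the simulation for one case and load its scalar outputs."""
            subprocess.run(["./simulate", "--case", case], check=True)
            with open(f"{case}_output.json") as f:
                return json.load(f)

        def check(case: str) -> bool:
            """Compare a fresh run against the stored reference values."""
            with open(f"reference/{case}.json") as f:
                reference = json.load(f)
            result = run_case(case)
            ok = True
            for name, ref in reference.items():
                got = result[name]
                if abs(got - ref) > TOL * max(abs(ref), 1.0):
                    print(f"REGRESSION in {case}: {name} = {got} (expected {ref})")
                    ok = False
            return ok

        if __name__ == "__main__":
            results = [check(c) for c in ["case_a", "case_b"]]  # hypothetical cases
            sys.exit(0 if all(results) else 1)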

    Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library

    We present an analysis of optimizing the performance of a single C++11 source code using the Alpaka hardware abstraction library. For this we use the general matrix multiplication (GEMM) algorithm, in order to show that compilers can optimize Alpaka code effectively when key parameters of the algorithm are tuned. We do not intend to rival existing, highly optimized DGEMM versions, but merely choose this example to prove that Alpaka allows for platform-specific tuning with a single source code. In addition, we analyze the optimization potential available with vendor-specific compilers when confronted with the heavily templated abstractions of Alpaka. We specifically test the code on bleeding-edge architectures such as Nvidia's Tesla P100, Intel's Knights Landing (KNL) and Haswell architectures, as well as IBM's Power8 system. On some of these we are able to reach almost 50% of the peak floating-point performance using the aforementioned means. When adding compiler-specific #pragmas we are able to reach 5 TFLOP/s on a P100 and over 1 TFLOP/s on a KNL system.
    Comment: Accepted paper for the P^3MA workshop at ISC 2017 in Frankfurt
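    The kind of tunable parameter the paper searches over can be illustrated with a blocked GEMM (a hedged Python sketch, not the paper's C++/Alpaka code; the tile size plays the role of the platform-specific kernel parameters being tuned):

        import numpy as np

        def gemm_blocked(a: np.ndarray, b: np.ndarray, tile: int = 64) -> np.ndarray:
            """Blocked matrix multiply; `tile` is the tunable blocking parameter.

            On real hardware the best tile size depends on cache or
            shared-memory size, which is what per-platform tuning searches for.
            """
            n, k = a.shape
            k2, m = b.shape
            assert k == k2
            c = np.zeros((n, m), dtype=a.dtype)
            for i0 in range(0, n, tile):
                for j0 in range(0, m, tile):
                    for p0 in range(0, k, tile):
                        c[i0:i0+tile, j0:j0+tile] += (
                            a[i0:i0+tile, p0:p0+tile] @ b[p0:p0+tile, j0:j0+tile]
                        )
            return c

        a, b = np.random.rand(256, 256), np.random.rand(256, 256)
        assert np.allclose(gemm_blocked(a, b, tile=32), a @ b)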

    Multimode synthesis procedure for microwave filters based on thick inductive windows

    For several types of microwave filters for space applications it is important to manufacture hardware without tuning elements. For this to be possible, one needs a systematic procedure to convert ideal elements, such as resonators and impedance inverters, into actual waveguide lengths and discontinuities. The situation is further complicated by the fact that waveguide discontinuities excite higher-order modes that, interacting with each other, can have very strong effects. In this paper we first outline the theory behind a very efficient computer code for the simulation of microwave filters based on thick inductive windows. Then we describe in detail a step-by-step procedure that, based on the code developed, allows for the rapid design of this class of microwave filters without any tuning elements. Two actual design examples are also discussed, and comparisons are presented between measurements and simulations.
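    One textbook relation behind such a conversion (the standard shunt-reactance realization of an impedance inverter, as in Matthaei, Young and Jones; stated here for orientation and not necessarily the exact multimode formulation this paper develops): an ideal inverter K becomes an inductive window of normalized reactance X/Z0 flanked by negative line lengths φ/2 that are absorbed into the adjacent resonators.

        % Shunt-reactance realization of an ideal impedance inverter K:
        % window reactance X plus electrical lengths phi/2 on each side
        \frac{X}{Z_0} = \frac{K/Z_0}{1 - (K/Z_0)^2},
        \qquad
        \phi = -\arctan\!\left(\frac{2X}{Z_0}\right),
        \qquad
        \frac{K}{Z_0} = \tan\!\left(\frac{|\phi|}{2}\right)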

    Practical Implementation of Lattice QCD Simulation on Intel Xeon Phi Knights Landing

    We investigate the implementation of lattice Quantum Chromodynamics (QCD) code on the Intel Xeon Phi Knights Landing (KNL). The most time-consuming part of numerical simulations of lattice QCD is the solver of linear equations for a large sparse matrix that represents the strong interaction among quarks. To establish widely applicable prescriptions, we apply rather general methods suited to the SIMD architecture of KNL, such as intrinsics and manual prefetching, to the matrix multiplication and iterative solver algorithms. Based on the performance measured on the Oakforest-PACS system, we discuss performance tuning on KNL as well as code design for facilitating such tuning on SIMD architectures and massively parallel machines.
    Comment: 8 pages, 12 figures. Talk given at LHAM'17 "5th International Workshop on Legacy HPC Application Migration" in CANDAR'17 "The Fifth International Symposium on Computing and Networking", to appear in the proceedings
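    The iterative solver referred to above is typically a Krylov method; here is a minimal conjugate-gradient sketch (generic CG for a symmetric positive-definite system, not the paper's optimized KNL kernel):

        import numpy as np

        def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
            """Solve A x = b for symmetric positive-definite A.

            `apply_a` is the matrix-vector product; in lattice QCD this is
            the sparse fermion-matrix multiplication that dominates the run
            time and is the main target of SIMD tuning.
            """
            x = np.zeros_like(b)
            r = b - apply_a(x)
            p = r.copy()
            rr = r @ r
            for _ in range(max_iter):
                ap = apply_a(p)
                alpha = rr / (p @ ap)
                x += alpha * p
                r -= alpha * ap
                rr_new = r @ r
                if np.sqrt(rr_new) < tol * np.linalg.norm(b):
                    break
                p = r + (rr_new / rr) * p
                rr = rr_new
            return x

        # Usage on a small SPD test matrix
        rng = np.random.default_rng(0)
        m = rng.standard_normal((64, 64))
        a = m @ m.T + 64 * np.eye(64)   # SPD by construction
        b = rng.standard_normal(64)
        x = conjugate_gradient(lambda v: a @ v, b)
        assert np.allclose(a @ x, b, atol=1e-6)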

    Rapidly reconfigurable optical phase encoder-decoders based on fiber Bragg gratings

    We demonstrate the capacity for fast dynamic reconfiguration of optical code-division multiple access (OCDMA) phase encoders/decoders based on fiber Bragg gratings and a thermal phase-tuning technique. The tuning time between two different phase codes is measured to be less than 2 s. An OCDMA system using tunable-phase decoders is compared with a system using fixed-phase decoders; although the system using fixed-phase decoders exhibits a shorter output autocorrelation pulsewidth and lower sidelobes, the system using tunable-phase decoders has the advantages of flexibility and a more relaxed requirement on the input pulsewidth.
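    The autocorrelation figures of merit mentioned above can be illustrated numerically (a hedged sketch using a generic bipolar code, not the actual phase codes written into the gratings):

        import numpy as np

        def periodic_autocorrelation(code: np.ndarray) -> np.ndarray:
            """Periodic autocorrelation of a +/-1 phase code.

            The peak-to-maximum-sidelobe ratio is a standard quality metric
            for the decoded pulse in phase-encoded OCDMA.
            """
            n = len(code)
            return np.array([np.sum(code * np.roll(code, k)) for k in range(n)])

        # Example: a length-7 m-sequence mapped to +/-1 (two-valued correlation)
        code = np.array([1, 1, 1, -1, 1, -1, -1])
        acf = periodic_autocorrelation(code)
        peak = acf[0]                        # = 7 at zero shift
        sidelobe = np.max(np.abs(acf[1:]))   # = 1 for an m-sequence
        print(f"peak {peak}, max sidelobe {sidelobe}")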

    Paraiso: An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations

    We propose Paraiso, a domain-specific language embedded in the functional programming language Haskell, for the automated tuning of explicit solvers of partial differential equations (PDEs) on GPUs as well as multicore CPUs. In Paraiso, one can describe PDE-solving algorithms succinctly using tensor equation notation. Hydrodynamic properties, interpolation methods and other building blocks are described in abstract, modular, reusable and combinable forms, which lets us generate versatile solvers from a small set of Paraiso source codes. We demonstrate Paraiso by implementing a compressible hydrodynamics solver. A single source code of less than 500 lines can be used to generate solvers of arbitrary dimension, for both multicore CPUs and GPUs. We demonstrate both manual annotation-based tuning and evolutionary-computing-based automated tuning of the program.
    Comment: 52 pages, 14 figures, accepted for publication in Computational Science and Discovery
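    The automated-tuning loop such a framework performs can be sketched as follows (a hedged Python illustration of a plain parameter sweep; Paraiso itself is a Haskell DSL and also uses evolutionary search, which is not reproduced here): time candidate configurations of a kernel and keep the fastest.

        import time
        import numpy as np

        def stencil_step(u: np.ndarray, chunk: int) -> np.ndarray:
            """One explicit 1-D diffusion update, processed in chunks.

            `chunk` stands in for a tunable code-generation parameter
            (block size, unrolling factor, etc.).
            """
            out = u.copy()
            for i0 in range(1, len(u) - 1, chunk):
                i1 = min(i0 + chunk, len(u) - 1)
                out[i0:i1] = u[i0:i1] + 0.25 * (
                    u[i0-1:i1-1] - 2 * u[i0:i1] + u[i0+1:i1+1]
                )
            return out

        def autotune(candidates, n=1 << 20, reps=5):
            """Pick the fastest configuration by direct measurement."""
            u = np.random.rand(n)
            timings = {}
            for chunk in candidates:
                t0 = time.perf_counter()
                for _ in range(reps):
                    stencil_step(u, chunk)
                timings[chunk] = (time.perf_counter() - t0) / reps
            return min(timings, key=timings.get), timings

        best, timings = autotune([1 << 10, 1 << 14, 1 << 18])
        print("best chunk size:", best)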

    Benchmarking and tuning the MILC code on clusters and supercomputers

    Recently, we have benchmarked and tuned the MILC code on a number of architectures including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can result in a very dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems, such as PIV, IBM SP and Alpha.
    Comment: Lattice 2001 (algorithms), 4 pages, includes hep-lat references not in the published version