31,800 research outputs found

    Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks

    Full text link
    Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers. Building an analytical model to do so is especially complicated in modern x86-64 Complex Instruction Set Computer (CISC) machines with sophisticated processor microarchitectures, because doing so is tedious, error-prone, and must be performed from scratch for each processor generation. In this paper we present Ithemal, the first tool that learns to predict the throughput of a set of instructions. Ithemal uses a hierarchical LSTM-based approach to predict throughput based on the opcodes and operands of instructions in a basic block. We show that Ithemal is more accurate than state-of-the-art hand-written tools currently used in compiler backends and static machine code analyzers. In particular, our model has less than half the error of state-of-the-art analytical models (LLVM's llvm-mca and Intel's IACA). Ithemal also predicts these throughput values just as fast as the aforementioned tools, and is easily ported across a variety of processor microarchitectures with minimal developer effort.
    Comment: Published at the 36th International Conference on Machine Learning (ICML), 2019
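
    As a concrete illustration of the hierarchical approach, the sketch below builds a two-level LSTM in PyTorch: one LSTM summarizes the opcode/operand tokens of each instruction, a second runs over those per-instruction summaries, and a linear head regresses a throughput value. The vocabulary, layer sizes, and toy basic block are illustrative assumptions, not Ithemal's actual implementation.

```python
# Minimal sketch of a hierarchical LSTM throughput predictor (illustrative,
# not Ithemal's actual implementation; vocabulary and sizes are assumptions).
import torch
import torch.nn as nn

class HierarchicalThroughputModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # First LSTM summarizes the tokens (opcode + operands) of one instruction.
        self.instr_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Second LSTM runs over the per-instruction summaries of the basic block.
        self.block_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # regression to a throughput value

    def forward(self, block):
        # block: list of LongTensors, one per instruction, shape (num_tokens,)
        instr_vecs = []
        for tokens in block:
            emb = self.embed(tokens).unsqueeze(0)   # (1, num_tokens, embed_dim)
            _, (h, _) = self.instr_lstm(emb)        # h: (1, 1, hidden_dim)
            instr_vecs.append(h[-1])                # (1, hidden_dim)
        seq = torch.stack(instr_vecs, dim=1)        # (1, num_instrs, hidden_dim)
        _, (h, _) = self.block_lstm(seq)
        return self.head(h[-1]).squeeze()           # predicted throughput

# Toy usage: a 3-instruction block; token ids are arbitrary, vocab of 100.
model = HierarchicalThroughputModel(vocab_size=100)
block = [torch.tensor([5, 17, 3]), torch.tensor([8, 2]), torch.tensor([5, 9, 9, 1])]
print(model(block))
```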

    User-Friendly Parallel Computations with Econometric Examples

    Get PDF
    This paper shows how a high-level matrix programming language may be used to perform Monte Carlo simulation, bootstrapping, estimation by maximum likelihood and GMM, and kernel regression in parallel on symmetric multiprocessor computers or clusters of workstations. Parallelization is implemented in a way that lets an investigator use the programs without any knowledge of parallel programming. A bootable CD that allows rapid creation of a cluster for parallel computing is introduced. Examples show that parallelization can lead to substantial reductions in computational time. A detailed discussion of how the Monte Carlo problem was parallelized is included as an example for learning to write parallel programs for Octave.
    Keywords: parallel computing, Monte Carlo, bootstrapping, maximum likelihood, GMM, kernel regression
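
    The paper's examples are written for Octave; as a language-agnostic illustration of the same embarrassingly-parallel pattern, the sketch below farms Monte Carlo replications out to worker processes with Python's multiprocessing. The data-generating process (an OLS slope estimate) and the replication count are assumptions for the example, not the paper's code.

```python
# Illustrative sketch of the embarrassingly-parallel Monte Carlo pattern the
# paper describes (the paper itself uses Octave; this is not its code).
import numpy as np
from multiprocessing import Pool

def one_replication(seed):
    """One Monte Carlo replication: the OLS slope estimate from a freshly
    simulated data set (the data-generating process is an assumption)."""
    rng = np.random.default_rng(seed)
    n = 100
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)        # true slope is 2.0
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

if __name__ == "__main__":
    n_reps = 10_000
    with Pool() as pool:                          # one worker per available core
        slopes = pool.map(one_replication, range(n_reps))
    print("mean slope:", np.mean(slopes), "std:", np.std(slopes))
```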

    API design for machine learning software: experiences from the scikit-learn project

    Get PDF
    Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
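
    The shared interface the paper describes consists of objects whose constructors take only hyperparameters, a fit method that learns state (stored in trailing-underscore attributes) and returns self, and predict/transform methods for applying the learned model. Below is a minimal custom estimator following these conventions; the toy regressor itself is invented for illustration.

```python
# A minimal estimator following scikit-learn's shared fit/predict interface.
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin

class MeanShiftRegressor(BaseEstimator, RegressorMixin):
    """Toy regressor: predicts the training mean of y plus a fixed offset."""

    def __init__(self, offset=0.0):      # hyperparameters only, no computation
        self.offset = offset

    def fit(self, X, y):
        self.mean_ = np.mean(y)          # learned state uses trailing underscore
        return self                      # fit returns self, enabling chaining

    def predict(self, X):
        return np.full(len(X), self.mean_ + self.offset)

# Because it follows the API, it composes with the rest of the library:
from sklearn.model_selection import cross_val_score
X, y = np.random.rand(50, 3), np.random.rand(50)
print(cross_val_score(MeanShiftRegressor(), X, y, cv=5))
```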

    Likelihood Inference In Parallel Systems Regression Models With Censored Data

    Get PDF
    The work in this thesis is concerned with the investigation of the finite sample performance of asymptotic inference procedures based on the likelihood function when applied to the regression model based on parallel systems with censored data. The study includes investigating the adequacy of these inferential procedures as well as investigating the relative performances of asymptotically equivalent likelihood-based statistics in small samples. The maximum likelihood estimator of the parameters of this model is not available in closed form, so its actual sampling distribution is intractable. A simulation study is conducted to investigate the bias, the finite sample variance, the asymptotic variance obtained from the inverse of the observed Fisher information matrix, the adequacy of this approximate asymptotic variance, and the mean squared error of the estimator.
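
    The thesis's parallel-systems regression model is not reproduced here, but the simulation quantities it studies can be illustrated on a simpler model. The sketch below runs a Monte Carlo study of the maximum likelihood estimator for an exponential rate under fixed right-censoring, where the MLE and observed Fisher information have closed forms, reporting bias, empirical variance, the mean inverse-observed-information variance, and mean squared error; all parameter values are arbitrary assumptions.

```python
# Hedged sketch: a censored exponential model (not the thesis's model) used
# to illustrate the same simulation quantities the study investigates.
import numpy as np

rng = np.random.default_rng(0)
lam_true, n, c, n_reps = 0.5, 50, 3.0, 5000   # rate, sample size, censoring time
est, avar = [], []
for _ in range(n_reps):
    t = rng.exponential(1 / lam_true, size=n)
    obs = np.minimum(t, c)                    # censored observation times
    d = np.sum(t <= c)                        # number of uncensored failures
    lam_hat = d / np.sum(obs)                 # closed-form MLE under censoring
    est.append(lam_hat)
    avar.append(lam_hat**2 / d)               # inverse observed information: d / lam^2
est = np.array(est)
print("bias:              ", est.mean() - lam_true)
print("empirical variance:", est.var())
print("mean asympt. var.: ", np.mean(avar))
print("MSE:               ", np.mean((est - lam_true) ** 2))
```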

    Efficient transfer entropy analysis of non-stationary neural time series

    Full text link
    Information theory allows us to investigate information processing in neural systems in terms of information transfer, storage and modification. Especially the measure of information transfer, transfer entropy, has seen a dramatic surge of interest in neuroscience. Estimating transfer entropy from two processes requires the observation of multiple realizations of these processes to estimate associated probability density functions. To obtain these observations, available estimators assume stationarity of processes to allow pooling of observations over time. This assumption, however, is a major obstacle to the application of these estimators in neuroscience, as observed processes are often non-stationary. As a solution, Gomez-Herrero and colleagues theoretically showed that the stationarity assumption may be avoided by estimating transfer entropy from an ensemble of realizations. Such an ensemble is often readily available in neuroscience experiments in the form of experimental trials. Thus, in this work we combine the ensemble method with a recently proposed transfer entropy estimator to make transfer entropy estimation applicable to non-stationary time series. We present an efficient implementation of the approach that deals with the increased computational demand of the ensemble method's practical application. In particular, we use a massively parallel implementation for a graphics processing unit to handle the most computationally demanding aspects of the ensemble method. We test the performance and robustness of our implementation on data from simulated stochastic processes and demonstrate the method's applicability to magnetoencephalographic data. While we mainly evaluate the proposed method for neuroscientific data, we expect it to be applicable in a variety of fields that are concerned with the analysis of information transfer in complex biological, social, and artificial systems.
    Comment: 27 pages, 7 figures, submitted to PLOS ONE
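
    The paper pairs the ensemble method with a nearest-neighbour transfer entropy estimator on a GPU; the sketch below instead uses a simple binned plug-in estimator on discretized data, purely to show the core ensemble trick of estimating the probabilities across trials at a fixed time index rather than by pooling over time.

```python
# Hedged sketch of ensemble-based transfer entropy: probabilities are
# estimated across trials at a fixed time t, so no stationarity over time
# is assumed. (The paper uses a nearest-neighbour estimator; this binned
# plug-in version only illustrates the pooling-over-trials idea.)
import numpy as np
from collections import Counter

def transfer_entropy_ensemble(x, y, t):
    """Estimate TE(X -> Y) at time t from an ensemble of trials.

    x, y: int arrays of shape (n_trials, n_samples), already discretized.
    """
    n = x.shape[0]
    triples = Counter(zip(y[:, t + 1], y[:, t], x[:, t]))
    pairs_past = Counter(zip(y[:, t], x[:, t]))
    pairs_y = Counter(zip(y[:, t + 1], y[:, t]))
    singles_y = Counter(y[:, t])
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pairs_past[(y0, x0)]          # p(y1 | y0, x0)
        p_cond_y = pairs_y[(y1, y0)] / singles_y[y0]    # p(y1 | y0)
        te += p_joint * np.log2(p_cond_full / p_cond_y)
    return te

# Toy ensemble: y copies x with one sample of delay, so TE(X->Y) is high.
rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=(2000, 10))
y = np.roll(x, 1, axis=1)
print(transfer_entropy_ensemble(x, y, t=4))   # close to 1 bit
```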

    Transputer control of a flexible robot link

    Get PDF
    The applicability of transputers in control systems is investigated. This is done by implementing a controller for a flexible robot arm with one degree of freedom on a system consisting of an IBM-AT and four transputers. It is found that a control system with transputers offers a great improvement over conventional digital control systems. Transputers can address a common problem in control practice: having theoretically sophisticated controllers but being unable to implement them because they require too much computing time. However, transputers are not an optimal solution for more sophisticated control systems because of shortcomings in the scheduling mechanism.
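
    For context on the computing-time constraint the abstract refers to, the sketch below shows a generic fixed-rate digital control loop (not the thesis's controller, and the gains and plant model are invented): each sample's computation must finish within the sampling period, which is exactly the budget that sophisticated controllers exceed on conventional uniprocessor hardware.

```python
# Hedged illustration (not the thesis's controller): a fixed-rate digital
# control loop whose per-sample computation must finish within the sampling
# period -- the timing constraint the abstract says transputers relax.
import time

def pd_control(error, prev_error, dt, kp=8.0, kd=0.5):
    """Simple PD law; the gains are arbitrary illustrative values."""
    return kp * error + kd * (error - prev_error) / dt

dt = 0.01                        # 100 Hz sampling period
setpoint, angle, prev_err = 1.0, 0.0, 0.0
for _ in range(5):
    start = time.perf_counter()
    err = setpoint - angle
    u = pd_control(err, prev_err, dt)
    angle += 0.05 * u * dt       # crude stand-in for the arm's response
    prev_err = err
    elapsed = time.perf_counter() - start
    # A sophisticated controller may exceed dt on a uniprocessor; parallel
    # hardware (e.g. transputers) is one way to stay within the deadline.
    assert elapsed < dt, "controller missed its sampling deadline"
    time.sleep(max(0.0, dt - elapsed))
```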

    Feedback Controlled Software Systems

    Get PDF
    Software systems generally suffer from a certain fragility in the face of disturbances such as bugs, unforeseen user input, unmodeled interactions with other software components, and so on. A single such disturbance can make the machine on which the software is executing hang or crash. We postulate that addressing this fragility requires a general means of using feedback to stabilize these systems. In this paper we develop a preliminary dynamical systems model of an arbitrary iterative software process, along with the conceptual framework for stabilizing it in the presence of disturbances. To keep the computational requirements of the controllers low, randomization and approximation are used. We describe our initial attempts to apply the model to a faulty list sorter, using feedback to improve its performance. We also examine methods by which software robustness can be enhanced by distributing a task among nodes, each of which is capable of selecting the best input to process, and study the particular case of a sorting system consisting of a network of partial sorters, some of which may be buggy or even malicious.
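
    The sketch below illustrates the idea on the faulty list sorter: a cheap, randomized "sensor" estimates how disordered the output is, and the feedback loop re-invokes the sorter until the estimate reads zero. The fault model, probe count, and iteration cap are assumptions for the example, not the paper's implementation.

```python
# Hedged sketch of a faulty sorter stabilized by feedback. The disorder
# estimate uses randomization to keep the controller cheap, matching the
# paper's emphasis; the fault model and thresholds are assumptions.
import random

def faulty_sort(xs, p_fault=0.2):
    """Sorts, then injects a fault: with probability p_fault, swaps two items."""
    ys = sorted(xs)
    if random.random() < p_fault and len(ys) > 1:
        i, j = random.sample(range(len(ys)), 2)
        ys[i], ys[j] = ys[j], ys[i]
    return ys

def estimated_disorder(xs, n_probes=50):
    """Randomized sensor: fraction of sampled adjacent pairs out of order."""
    if len(xs) < 2:
        return 0.0
    bad = sum(xs[i] > xs[i + 1]
              for i in (random.randrange(len(xs) - 1) for _ in range(n_probes)))
    return bad / n_probes

def feedback_sort(xs, max_iters=10):
    """Re-invoke the faulty sorter until the disorder estimate reads zero."""
    ys = faulty_sort(xs)
    for _ in range(max_iters):
        if estimated_disorder(ys) == 0.0:
            break
        ys = faulty_sort(ys)
    return ys

data = random.sample(range(1000), 200)
print(feedback_sort(data) == sorted(data))   # True with high probability
```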