Fuzzy memoization for floating-point multimedia applications
Instruction memoization is a promising technique to reduce the power consumption and increase the performance of future low-end/mobile multimedia systems. Power and performance efficiency can be improved by reusing instances of an already executed operation. Unfortunately, this technique may not always be worth the effort because of the power consumption and area impact of the tables required to achieve an adequate level of reuse. In this paper, we introduce and evaluate a novel way of understanding multimedia floating-point operations based on the fuzzy computation paradigm: performance and power consumption can be improved at the cost of small precision losses in computation. By exploiting this implicit characteristic of multimedia applications, we propose a new technique called tolerant memoization, which expands the capabilities of classic memoization by associating entries with similar inputs to the same output. We evaluate this technique by measuring the effect of tolerant memoization on floating-point operations in a low-power multimedia processor and discuss the trade-offs between performance and the quality of the media outputs. We report energy improvements of 12 percent for a set of key multimedia applications with a small 6-Kbyte LUT, compared to 3 percent obtained using previously proposed techniques.
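To make the idea concrete, here is a hypothetical software sketch of tolerant memoization for multiplication: two inputs that agree in their most significant mantissa bits hit the same table entry, so a small precision loss buys extra reuse. The hash map stands in for the paper's small hardware LUT, and keptBits is an invented knob, not a parameter from the paper.

```cpp
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Quantize a float by zeroing its low mantissa bits, so that nearby
// inputs collapse onto the same table key (the "tolerant" part).
static uint32_t quantize(float x, int keptMantissaBits) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);      // reinterpret IEEE-754 bits
    uint32_t drop = 23 - keptMantissaBits;    // single precision: 23-bit mantissa
    return bits & ~((1u << drop) - 1u);       // clear the low mantissa bits
}

float tolerantMul(float a, float b, int keptBits = 12) {
    // Unbounded map for clarity; real hardware would use a small fixed table.
    static std::unordered_map<uint64_t, float> lut;
    uint64_t key = (uint64_t(quantize(a, keptBits)) << 32) | quantize(b, keptBits);
    auto it = lut.find(key);
    if (it != lut.end()) return it->second;   // reuse: skip the FP multiply
    float r = a * b;
    lut.emplace(key, r);
    return r;
}
```

Lowering keptBits widens each entry's tolerance band, trading output quality for hit rate, which mirrors the performance/quality trade-off the paper evaluates.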
GeNN: a code generation framework for accelerated brain simulations
Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain function and testing their consistency and plausibility. An ongoing challenge for simulating realistic models, however, is computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open-source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that a 200-fold speedup compared to a single CPU core can be achieved for a network of one million conductance-based Hodgkin-Huxley neurons, but that the speedup can differ for other models.
GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/
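As a toy illustration of the code-generation idea, the sketch below splices a user-supplied neuron-update snippet into a CUDA kernel template and writes it out for compilation. This is not GeNN's actual interface; the template, update rule, and file name are invented for the sketch.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Generate CUDA source for a per-neuron update kernel from a model snippet.
std::string generateKernel(const std::string &updateCode) {
    std::ostringstream src;
    src << "__global__ void updateNeurons(float *V, int n) {\n"
        << "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        << "    if (i < n) {\n"
        << "        " << updateCode << "\n"    // model-specific dynamics
        << "    }\n"
        << "}\n";
    return src.str();
}

int main() {
    // A leaky-integrator update stands in for the user's neuron model.
    std::ofstream("neuronUpdate.cu") << generateKernel("V[i] += 0.1f * (-V[i]);");
}
```

Generating the kernel at build time, rather than interpreting a model description at run time, is what lets this style of framework specialize memory layout and loop structure to each model.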
Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations
Although double-precision floating-point arithmetic currently dominates high-performance computing, there is increasing interest in smaller and simpler arithmetic types. The main reasons are potential improvements in energy efficiency and in memory footprint and bandwidth. However, simply switching to lower-precision types typically results in increased numerical errors. We investigate approaches to improving the accuracy of reduced-precision fixed-point arithmetic types, using examples in an important domain for numerical computation in neuroscience: the solution of ordinary differential equations (ODEs). The Izhikevich neuron model is used to demonstrate that rounding plays an important role in producing accurate spike timings from explicit ODE solution algorithms. In particular, fixed-point arithmetic with stochastic rounding consistently results in smaller errors than single-precision floating-point and fixed-point arithmetic with round-to-nearest across a range of neuron behaviours and ODE solvers. A computationally much cheaper alternative is also investigated, inspired by the concept of dither, a widely understood mechanism for providing resolution below the least significant bit (LSB) in digital signal processing. These results will have implications for the solution of ODEs in other subject areas, and should also be directly relevant to the huge range of practical problems that are represented by partial differential equations (PDEs).
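A minimal sketch of stochastic rounding in signed fixed-point arithmetic, assuming an s16.15-style format: the discarded fraction determines the probability of rounding up, so the rounding is unbiased in expectation. The format choice, names, and use of std::mt19937 are illustrative, not taken from the paper's implementation.

```cpp
#include <cstdint>
#include <random>

constexpr int FRAC = 15;  // assumed fractional bits of the fixed-point type

// Round a wide intermediate (FRAC + extraFracBits fractional bits) back to
// FRAC fractional bits, rounding up with probability residue / 2^extraFracBits.
int32_t stochasticRound(int64_t wide, int extraFracBits, std::mt19937 &rng) {
    int64_t one = int64_t(1) << extraFracBits;
    int64_t floorPart = wide >> extraFracBits;              // floor (arithmetic shift)
    int64_t residue = wide - (floorPart << extraFracBits);  // in [0, one)
    std::uniform_int_distribution<int64_t> draw(0, one - 1);
    return int32_t(floorPart + (draw(rng) < residue ? 1 : 0));
}

// Fixed-point multiply: the 64-bit product carries 2*FRAC fractional bits,
// so FRAC of them must be rounded away (overflow handling omitted for brevity).
int32_t mulSR(int32_t a, int32_t b, std::mt19937 &rng) {
    return stochasticRound(int64_t(a) * int64_t(b), FRAC, rng);
}
```

Because round-up events average out rather than accumulate, per-step rounding errors in an explicit ODE step tend to cancel over many iterations instead of drifting in one direction, which is the behaviour the paper exploits for spike timing accuracy.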
Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program
We describe the parallel implementation of our generalized stellar atmosphere and NLTE radiative transfer computer program PHOENIX. We discuss the parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. Our implementation uses a MIMD design based on a relatively small number of MPI library calls. We report the results of test calculations on a number of different parallel computers and discuss the results of scalability tests.
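As a generic illustration of this style of MPI parallelism (not PHOENIX's actual code), the sketch below distributes wavelength points across ranks and gathers the per-rank results at the root; the opacity computation is a stand-in expression.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nWavelengths = 1 << 20;     // illustrative problem size
    int chunk = nWavelengths / size;      // assume divisibility for brevity

    // Each rank computes its contiguous block of wavelength points.
    std::vector<double> local(chunk);
    for (int i = 0; i < chunk; ++i) {
        int global = rank * chunk + i;
        local[i] = 1.0 / (1.0 + global);  // stand-in for an opacity kernel
    }

    // Root collects all blocks into one array.
    std::vector<double> all;
    if (rank == 0) all.resize(nWavelengths);
    MPI_Gather(local.data(), chunk, MPI_DOUBLE,
               all.data(), chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```

Keeping the communication pattern to a handful of collective calls, as here, is consistent with the paper's remark that the MIMD design rests on a relatively small number of MPI library calls.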
Classical and all-floating FETI methods for the simulation of arterial tissues
High-resolution and anatomically realistic computer models of biological soft tissues play a significant role in understanding the function of cardiovascular components in health and disease. However, the computational effort required to handle the fine grids that resolve the geometries, together with sophisticated tissue models, is very challenging. One way to derive a strongly scalable parallel solution algorithm is to consider finite element tearing and interconnecting (FETI) methods. In this study we propose and investigate the application of FETI methods to simulate the elastic behavior of biological soft tissues. As a particular example we choose the artery, which, like most other biological tissues, is characterized by anisotropic and nonlinear material properties. We compare two specific approaches of FETI methods, classical and all-floating, and investigate the numerical behavior of different preconditioning techniques. Compared to classical FETI, the all-floating approach has advantages not only in implementation but, in many cases, also in the convergence of the global iterative solution method. This behavior is illustrated with numerical examples. We present results of linear elastic simulations to show convergence rates, as expected from the theory, and results from the more sophisticated nonlinear case where we apply a well-known anisotropic model to the realistic geometry of an artery. Although FETI methods are well suited to artery simulations, we also discuss some limitations concerning their dependence on material parameters.
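For orientation, a minimal sketch of the algebraic structure behind FETI in its standard textbook form (not the specific formulation of this paper): the domain is torn into subdomains with local stiffness matrices $K_s$, and continuity across interfaces is enforced by Lagrange multipliers $\lambda$ through jump operators $B_s$, giving a saddle-point system whose dual reads

$$
\begin{pmatrix} K & B^{\top} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ \lambda \end{pmatrix}
=
\begin{pmatrix} f \\ 0 \end{pmatrix},
\qquad
F\lambda = d, \quad
F = \sum_{s} B_s K_s^{+} B_s^{\top}, \quad
d = \sum_{s} B_s K_s^{+} f_s,
$$

where $K_s^{+}$ is a pseudoinverse of the possibly singular local stiffness matrix and the dual system is solved by a projected preconditioned CG iteration that accounts for the local rigid-body kernels. In the all-floating variant every subdomain is treated as floating, so Dirichlet boundary conditions are also enforced via multipliers and all $K_s$ become singular in a uniform way, which is what simplifies the implementation.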
Analog VLSI-Based Modeling of the Primate Oculomotor System
One way to understand a neurobiological system is by building a simulacrum that replicates its behavior in real time under similar constraints. Analog very large-scale integrated (VLSI) electronic circuit technology provides such an enabling technology. Here we describe a neuromorphic system that is part of a long-term effort to understand the primate oculomotor system, which requires both fast sensory processing and fast motor control to interact with the world. A one-dimensional hardware model of the primate eye has been built that simulates the physical dynamics of the biological system. It is driven by two different analog VLSI chips, one mimicking cortical visual processing for target selection and tracking and another modeling brain stem circuits that drive the eye muscles. Our oculomotor plant demonstrates both smooth pursuit movements, driven by a retinal velocity error signal, and saccadic eye movements, controlled by retinal position error, and can reproduce several behavioral, stimulation, lesion, and adaptation experiments performed on primates.
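A hypothetical discrete-time sketch of the two control modes described above: small position errors are handled by velocity-error-driven pursuit, while large ones trigger a position-error-driven saccade. The gains and threshold are invented for illustration and do not come from the chips.

```cpp
#include <cmath>

// Toy controller combining the two oculomotor modes from the abstract.
struct EyeController {
    double pursuitGain   = 0.9;  // scales retinal-slip (velocity error) feedback
    double saccadeThresh = 2.0;  // position error (deg) that triggers a saccade

    // Map retinal errors to an eye-velocity command (deg/s).
    double step(double positionErr, double velocityErr) const {
        if (std::fabs(positionErr) > saccadeThresh) {
            // Saccade: brief high-velocity move proportional to position error.
            return 50.0 * positionErr;   // illustrative burst-generator gain
        }
        // Smooth pursuit: velocity command tracks the retinal slip signal.
        return pursuitGain * velocityErr;
    }
};
```

The split mirrors the biology the system models: pursuit continuously nulls retinal slip, while saccades are discrete corrections fired when the target has drifted too far from the fovea.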
Correlated Resource Models of Internet End Hosts
Understanding and modelling resources of Internet end hosts is essential for the design of desktop software and Internet-distributed applications. In this paper we develop a correlated resource model of Internet end hosts based on real trace data taken from the SETI@home project. This data covers a 5-year period with statistics for 2.7 million hosts. The resource model is based on statistical analysis of host computational power, memory, and storage as well as how these resources change over time and the correlations between them. We find that resources with few discrete values (core count, memory) are well modeled by exponential laws governing the change of relative resource quantities over time. Resources with a continuous range of values are well modeled with either correlated normal distributions (processor speed for integer operations and floating point operations) or log-normal distributions (available disk space). We validate and show the utility of the models by applying them to a resource allocation problem for Internet-distributed applications, and demonstrate their value over other models. We also make our trace data and tool for automatically generating realistic Internet end hosts publicly available.
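A sketch of how such a correlated model could be sampled, assuming a bivariate normal for integer/floating-point speed and a log-normal for available disk space, per the distribution families named above. All numeric parameters here are invented; the real fits come from the SETI@home traces.

```cpp
#include <cmath>
#include <random>

struct Host { double intSpeed, fpSpeed, diskGB; };

// Draw one synthetic host from the assumed distribution families.
Host sampleHost(std::mt19937 &rng) {
    std::normal_distribution<double> z(0.0, 1.0);
    double z1 = z(rng), z2 = z(rng);

    // Correlated bivariate normal via a 2x2 Cholesky factor.
    double rho  = 0.8;   // assumed int/fp speed correlation
    double intZ = z1;
    double fpZ  = rho * z1 + std::sqrt(1.0 - rho * rho) * z2;

    Host h;
    h.intSpeed = 2000.0 + 500.0 * intZ;          // integer benchmark score
    h.fpSpeed  = 1800.0 + 450.0 * fpZ;           // floating-point benchmark score
    h.diskGB   = std::exp(3.5 + 1.0 * z(rng));   // log-normal available disk
    return h;
}
```

Sampling the two speed variables jointly, rather than independently, is the point of the correlated model: a host that is fast at integer work is very likely fast at floating-point work too, and ignoring that correlation would misrepresent the host population.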