    Parallel Computing in Economics - An Overview of the Software Frameworks

    This paper discusses problems related to parallel computing applied in economics. It introduces the paradigms of parallel computing and emphasizes the new trends in this field - a combination of GPU computing, multicore computing, and distributed computing. It also gives examples of problems arising in economics where these parallel methods can be used to speed up computation.
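    As an illustration of the GPU strand of the trends this overview covers, here is a minimal CUDA sketch of a Monte Carlo pricing kernel, a typical economics/finance workload that parallelizes one simulated path per thread. The model (geometric Brownian motion for a European call) and all parameters are placeholder assumptions of mine, not examples taken from the paper.

    #include <cstdio>
    #include <curand_kernel.h>

    // One simulated asset path per thread: terminal price under geometric
    // Brownian motion, then the discounted payoff of a European call.
    __global__ void mc_paths(float S0, float K, float r, float sigma, float T,
                             unsigned long long seed, float *payoffs, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        curandState st;
        curand_init(seed, i, 0, &st);      // independent RNG stream per thread
        float z = curand_normal(&st);      // one standard normal draw
        float ST = S0 * expf((r - 0.5f * sigma * sigma) * T
                             + sigma * sqrtf(T) * z);
        payoffs[i] = expf(-r * T) * fmaxf(ST - K, 0.0f);
    }

    int main() {
        const int n = 1 << 20;             // number of Monte Carlo paths
        float *d_pay;
        float *h_pay = new float[n];
        cudaMalloc(&d_pay, n * sizeof(float));
        mc_paths<<<(n + 255) / 256, 256>>>(100.0f, 100.0f, 0.05f, 0.2f, 1.0f,
                                           42ULL, d_pay, n);
        cudaMemcpy(h_pay, d_pay, n * sizeof(float), cudaMemcpyDeviceToHost);
        double sum = 0.0;
        for (int i = 0; i < n; ++i) sum += h_pay[i];
        printf("Monte Carlo price estimate: %f\n", sum / n);
        cudaFree(d_pay);
        delete[] h_pay;
        return 0;
    }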

    Bit Based Approximation for Approx-NoC: A Data Approximation Framework for Network-On-Chip Architectures

    The dawn of the big data era has led to the inception of new and creative compute paradigms that utilize heterogeneity, specialization, processor-in-memory, and approximation due to the high demand for memory bandwidth and power. Relaxing the accuracy constraints of applications has led to approximate computing being put forth as a feasible solution for high-performance computation. Emerging workloads such as machine learning, video/image processing, data analytics, neural networks, and other data-intensive applications have heightened the appeal of approximate computing, as these applications tolerate imprecise output within a specific error range. This work presents Bit-Based Approx-NoC, a hardware data approximation framework with a lightweight bit-based approximation technique for high-performance NoCs. Bit-Based Approx-NoC facilitates approximate matching of data patterns, within a controllable error range, to compress them, thereby reducing data movement across the chip. The proposed work exploits the entropy between data words in order to increase their inherent compressibility. Evaluations in this work show on average 5% latency reduction and 14% throughput improvement compared to state-of-the-art NoC compression mechanisms.
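    One plausible reading of "approximate matching within a controllable error range" is that two data words match when they agree on all but their low-order bits. The following host-side sketch is my assumption about the flavor of the technique, not the paper's actual hardware design; a dictionary-based compressor could then encode a matching word as a reference to the stored entry.

    #include <cstdint>
    #include <cstdio>

    // Two 32-bit words "approximately match" when their high (32 - k) bits
    // agree, i.e. they differ by less than 2^k in value (requires k < 32).
    bool approx_match(uint32_t a, uint32_t b, unsigned k) {
        uint32_t mask = ~((1u << k) - 1u);   // zero out the k low-order bits
        return (a & mask) == (b & mask);
    }

    int main() {
        uint32_t dict_entry = 0x12345678u;   // word already in the dictionary
        uint32_t incoming   = 0x1234567Bu;   // new word to compress
        // With k = 3 the difference falls inside the tolerated error range,
        // so 'incoming' could be encoded as a reference to 'dict_entry'.
        printf("match with k=3: %d\n", approx_match(dict_entry, incoming, 3));
        printf("match with k=0: %d\n", approx_match(dict_entry, incoming, 0));
        return 0;
    }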

    ABC of SV: limited information likelihood inference in stochastic volatility jump-diffusion models

    We develop novel methods for estimation and filtering of continuous-time models with stochastic volatility and jumps using so-called Approximate Bayesian Computation, which builds likelihoods based on limited information. The proposed estimators and filters are computationally attractive relative to standard likelihood-based versions since they rely on low-dimensional auxiliary statistics and so avoid the computation of high-dimensional integrals. Despite their computational simplicity, we find that the estimators and filters perform well in practice and lead to precise estimates of model parameters and latent variables. We show how the methods can incorporate intra-daily information to improve estimation and filtering. In particular, the availability of realized volatility measures helps us learn about parameters and latent states. The method is employed in the estimation of a flexible stochastic volatility model for the dynamics of the S&P 500 equity index. We find evidence of the presence of a dynamic jump rate and in favor of a structural break in parameters at the time of the recent financial crisis. We find evidence that possible measurement error in the log price is small and has little effect on parameter estimates. Smoothing shows that, recently, volatility and the jump rate have returned to the low levels of 2004-2006.
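    To make the general ABC idea concrete (a low-dimensional auxiliary statistic standing in for an intractable likelihood), here is a toy rejection sampler. The i.i.d. Gaussian return model, uniform prior, and tolerance are illustrative assumptions of mine and are far simpler than the paper's stochastic volatility jump-diffusion setting.

    #include <cstdio>
    #include <cmath>
    #include <random>
    #include <vector>

    // Unbiased sample variance: our one-dimensional auxiliary statistic.
    double sample_var(const std::vector<double> &x) {
        double m = 0.0;
        for (double v : x) m += v;
        m /= x.size();
        double s = 0.0;
        for (double v : x) s += (v - m) * (v - m);
        return s / (x.size() - 1);
    }

    int main() {
        std::mt19937 rng(7);
        const int T = 500;

        // "Observed" returns, generated with true sigma = 0.8 for the demo.
        std::normal_distribution<double> true_model(0.0, 0.8);
        std::vector<double> obs(T);
        for (double &r : obs) r = true_model(rng);
        double s_obs = sample_var(obs);

        std::uniform_real_distribution<double> prior(0.1, 2.0); // prior on sigma
        const double tol = 0.05;
        double sum = 0.0;
        int accepted = 0;
        for (int it = 0; it < 100000; ++it) {
            double sigma = prior(rng);                 // 1. draw from the prior
            std::normal_distribution<double> model(0.0, sigma);
            std::vector<double> sim(T);                // 2. simulate a dataset
            for (double &r : sim) r = model(rng);
            double s_sim = sample_var(sim);            // 3. auxiliary statistic
            if (std::fabs(s_sim - s_obs) < tol) {      // 4. accept if close
                sum += sigma;
                ++accepted;
            }
        }
        if (accepted > 0)
            printf("ABC posterior mean of sigma: %f (%d accepted draws)\n",
                   sum / accepted, accepted);
        return 0;
    }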

    Application Centric Networks-On-Chip Design Solutions for Future Multicore Systems

    With advances in technology, future multicore systems scaled to 100s and 1000s of cores/accelerators are being touted as an effective solution for extracting huge performance gains using parallel programming paradigms. However, with the failure of Dennard scaling, all the components on the chip cannot be run simultaneously without breaking the power and thermal constraints, leading to strict chip power envelopes. The scaling up of the number of on-chip components has also brought about Networks-on-Chip (NoC) based interconnect designs such as the 2D mesh. The contribution of the NoC to the total on-chip power and overall performance has been increasing steadily, and hence high-performance, power-efficient NoC designs are becoming crucial. Future multicore paradigms can be broadly classified, based on the applications they are tailored to, into traditional Chip Multiprocessor (CMP) based systems, characterized by low core and NoC utilization, and emerging big data application based systems, characterized by large amounts of data movement necessitating high throughput. To this end, we propose NoC design solutions for power savings in future CMPs tailored to traditional applications and for higher effective throughput in multicore systems tailored to bandwidth-intensive applications. First, we propose Fly-Over, a lightweight distributed mechanism for power-gating routers attached to switched-off cores to reduce NoC power consumption in low-load CMP environments. Second, we utilize a promising next-generation memory technology, Spin-Transfer Torque Magnetic RAM (STT-MRAM), to achieve enhanced NoC performance that satisfies the high throughput demands of emerging bandwidth-intensive applications while simultaneously reducing power consumption. Third, we present a hardware data approximation framework for NoCs, APPROX-NoC, with an online data error control mechanism, which leverages the approximate computing paradigm in emerging data-intensive big data applications to attain higher performance per watt.
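    To convey the intuition behind the first contribution (traffic "flying over" power-gated routers via a bypass instead of waking the full pipeline), here is a toy software model. The structure and per-hop latencies are invented for illustration and do not reflect the thesis's actual design.

    #include <cstdio>

    // Toy model: a flit crossing a power-gated router takes a short bypass
    // path; an awake router charges its full pipeline latency.
    struct Router {
        bool core_on;  // is the attached core switched on?
        bool gated;    // is the router pipeline power-gated?
    };

    const int FULL_ROUTER_CYCLES = 3;  // assumed pipeline latency per hop
    const int BYPASS_CYCLES = 1;       // assumed bypass latency per hop

    int traverse(const Router *path, int hops) {
        int cycles = 0;
        for (int h = 0; h < hops; ++h)
            cycles += path[h].gated ? BYPASS_CYCLES : FULL_ROUTER_CYCLES;
        return cycles;
    }

    int main() {
        // A 5-hop path whose three middle routers serve switched-off cores.
        Router path[5] = {{true, false}, {false, true}, {false, true},
                          {false, true}, {true, false}};
        printf("latency with fly-over bypass: %d cycles\n", traverse(path, 5));
        for (int h = 0; h < 5; ++h) path[h].gated = false;
        printf("latency, all routers awake:  %d cycles\n", traverse(path, 5));
        return 0;
    }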

    High Performance Implementation of an Econometrics and Financial Application on GPUs

    In this paper, we describe a GPU-based implementation of an estimator based on an indirect likelihood inference method. This method relies on simulations from a model and on nonparametric density or regression function computations. The estimation application arises in various domains such as econometrics and finance, when the model is fully specified but too complex for estimation by maximum likelihood. We implemented the estimator on a machine with two 2.67 GHz Intel Xeon X5650 processors and four NVIDIA M2090 GPU devices. We optimized the GPU code by efficient use of the shared memory and registers available on the GPU devices. We compared the optimized GPU code performance with a C-based sequential version of the code executed on the host machine. We observed a speedup factor of up to 242 with four GPU devices.
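    The authors' code is not reproduced here; the following CUDA sketch shows the general pattern the abstract describes, a nonparametric density computation that stages tiles of the sample in shared memory for reuse by every thread in a block. The Gaussian kernel density estimator, tile size, and toy data are my stand-ins, not the paper's actual kernels.

    #include <cstdio>

    #define TILE 256  // threads per block; also the shared-memory tile size

    // Gaussian kernel density estimate at m evaluation points over an
    // n-point simulated sample, one evaluation point per thread. Each block
    // cooperatively loads TILE sample values into shared memory so all of
    // its threads reuse them from fast on-chip storage.
    __global__ void kde(const float *sample, int n, const float *eval,
                        float *out, int m, float h) {
        __shared__ float tile[TILE];
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        float x = (i < m) ? eval[i] : 0.0f;
        float acc = 0.0f;
        for (int base = 0; base < n; base += TILE) {
            int j = base + threadIdx.x;
            tile[threadIdx.x] = (j < n) ? sample[j] : 0.0f; // cooperative load
            __syncthreads();
            int limit = min(TILE, n - base);
            for (int t = 0; t < limit; ++t) {
                float u = (x - tile[t]) / h;
                acc += expf(-0.5f * u * u);
            }
            __syncthreads();
        }
        if (i < m)
            out[i] = acc * 0.3989422804f / (n * h);  // 1/sqrt(2*pi) factor
    }

    int main() {
        const int n = 4096, m = 256;
        float *h_s = new float[n], *h_e = new float[m], *h_o = new float[m];
        for (int j = 0; j < n; ++j) h_s[j] = (j % 100) / 50.0f - 1.0f; // toy sample
        for (int i = 0; i < m; ++i) h_e[i] = -2.0f + 4.0f * i / m;     // eval grid
        float *d_s, *d_e, *d_o;
        cudaMalloc(&d_s, n * sizeof(float));
        cudaMalloc(&d_e, m * sizeof(float));
        cudaMalloc(&d_o, m * sizeof(float));
        cudaMemcpy(d_s, h_s, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_e, h_e, m * sizeof(float), cudaMemcpyHostToDevice);
        kde<<<(m + TILE - 1) / TILE, TILE>>>(d_s, n, d_e, d_o, m, 0.2f);
        cudaMemcpy(h_o, d_o, m * sizeof(float), cudaMemcpyDeviceToHost);
        printf("estimated density at x=%.2f: %f\n", h_e[m / 2], h_o[m / 2]);
        cudaFree(d_s); cudaFree(d_e); cudaFree(d_o);
        delete[] h_s; delete[] h_e; delete[] h_o;
        return 0;
    }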