Exploiting partial reconfiguration through PCIe for a microphone array network emulator

Author: Braeken An
da Silva Gomes Bruno
Domínguez Federico
Touhafi Abdellah
Publication venue: Hindawi Limited
Publication date: 01/01/2018

Current Microelectromechanical Systems (MEMS) technology enables the deployment of relatively low-cost wireless sensor networks composed of MEMS microphone arrays for accurate sound source localization. However, evaluating and selecting the most accurate and power-efficient network topology is not trivial when considering dynamic MEMS microphone arrays. Although software simulators are usually considered, they involve computationally intensive tasks that take hours to days to complete. In this paper, we present an FPGA-based platform to emulate a network of microphone arrays. Our platform provides a controlled simulated acoustic environment in which to evaluate the impact of different network configurations, such as the number of microphones per array, the network’s topology, or the detection method used. Data fusion techniques, combining the data collected by each node, are used in this platform. The platform is designed to exploit the FPGA’s partial reconfiguration feature to increase the flexibility of the network emulator as well as to increase performance thanks to the use of the PCI-express high-bandwidth interface. On the one hand, the network emulator gains flexibility by partially reconfiguring the nodes’ architecture at runtime. On the other hand, a set of strategies and heuristics for using partial reconfiguration properly accelerates the emulation by exploiting execution parallelism. Several experiments are presented to demonstrate some of the capabilities of our platform and the benefits of using partial reconfiguration.

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

TTC: A Tensor Transposition Compiler for Multiple Architectures

Author: Abadi M.
Knijnenburg P. M.
Springer P.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

We consider the problem of transposing tensors of arbitrary dimension and describe TTC, an open source domain-specific parallel compiler. TTC generates optimized parallel C++/CUDA C code that achieves a significant fraction of the system's peak memory bandwidth. TTC exhibits high performance across multiple architectures, including modern AVX-based systems (e.g., Intel Haswell, AMD Steamroller), Intel's Knights Corner as well as different CUDA-based GPUs such as NVIDIA's Kepler and Maxwell architectures. We report speedups of TTC over a meaningful baseline implementation generated by external C++ compilers; the results suggest that a domain-specific compiler can outperform its general purpose counterpart significantly: for instance, comparing with Intel's latest C++ compiler on the Haswell and Knights Corner architecture, TTC yields speedups of up to $8\times$ and $32\times$, respectively. We also showcase TTC's support for multiple leading dimensions, making it a suitable candidate for the generation of performance-critical packing functions that are at the core of the ubiquitous BLAS 3 routines.

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

Ravel-XL: a hardware accelerator for assigned-delay compiled-code logic gate simulation

Author: Brown R. B.
Marques-Silva J. P.
Riepe M. A.
Sakallah K. A.
Publication date: 01/03/1996

Southampton (e-Prints Soton)

Integration of dynamic, aerodynamic, and structural optimization of helicopter rotor blades

Author: Peters David A.

Summarized here are the first six years of research into the integration of structural, dynamic, and aerodynamic considerations in the design-optimization process for rotor blades. Specifically discussed here is the application of design optimization techniques for helicopter rotor blades. The reduction of vibratory shears and moments at the blade root, aeroelastic stability of the rotor, optimum airframe design, and an efficient procedure for calculating system sensitivities with respect to the design variables used are discussed.

NASA Technical Reports Server

Power quality and electromagnetic compatibility: special report, session 2

Author: Desmet Jan
Heimbach Britta
Renner Herwig
Publication venue: CIRED
Publication date: 01/01/2015

The scope of Session 2 (S2) has been defined as follows by the Session Advisory Group and the Technical Committee: Power Quality (PQ), with the more general concept of electromagnetic compatibility (EMC) and with some related safety problems in electricity distribution systems. Special focus is put on voltage continuity (supply reliability, problem of outages) and voltage quality (voltage level, flicker, unbalance, harmonics). This session will also look at electromagnetic compatibility (mains frequency to 150 kHz), electromagnetic interference, and electric and magnetic field issues. Also addressed in this session are electrical safety and immunity concerns (lightning issues, step, touch and transferred voltages). The aim of this special report is to present a synthesis of the present concerns in PQ&EMC, based on all selected papers of session 2 and related papers from other sessions (152 papers in total). The report is divided into the following 4 blocks: Block 1: Electric and Magnetic Fields, EMC, Earthing systems; Block 2: Harmonics; Block 3: Voltage Variation; Block 4: Power Quality Monitoring. Two Round Tables will be organised: Power quality and EMC in the Future Grid (CIGRE/CIRED WG C4.24, RT 13); and Reliability Benchmarking - why we should do it? What should be done in future? (RT 15).

Ghent University Academic Bibliography

A review of interventions to support young workers : findings of the youth employment inventory

Author: Betcherman Gordon
Godfrey Martin
Puerto Susana
Rother Friederike
Stavreska Antoneta

This Youth Employment Inventory (YEI) is based on available documentation of current and past programs and includes evidence from 289 studies of interventions from 84 countries in all regions of the world. The interventions included in the YEI have been analyzed in order to (i) document the types of programs that have been implemented to help young workers find work; and (ii) identify what appears to work in terms of improving employment outcomes for youth. This report synthesizes the information from this inventory and a set of background reports to document the global experience with youth employment programs. As background, Section B provides a brief summary of the situation of young people in labor markets world-wide, and also reviews the existing literature on policies to address youth employment problems. Following this, we turn to the underlying framework and methodology used to assemble the youth employment inventory in Section C. In Section D, we consider the coverage of the YEI, which represents the sample of youth programs identified in our global search of the available documentation. In addressing the question of "what works", it is critical to pay close attention to the quality of the evaluation evidence. This is discussed in Section E. The study then turns to the analysis of the effectiveness of the interventions included in the inventory. The descriptive evidence is presented in Section F. In addition, the study undertakes an econometric meta-analysis to more systematically identify the determinants of program success, and the results of this analysis are presented in Section G. Finally, conclusions and implications are drawn in Section H. Keywords: Labor Markets, Labor Policies, Youth and Governance, Adolescent Health

Research Papers in Economics

Improving the Performance and Energy Efficiency of GPGPU Computing through Adaptive Cache and Memory Management Techniques

Author: Kim Kyu Yeun
Publication venue: Graduate School of UNIST
Publication date: 01/02/2020

Department of Computer Science and Engineering. As the performance and energy efficiency requirements of GPGPUs have risen, memory management techniques of GPGPUs have improved to meet the requirements by employing hardware caches and utilizing heterogeneous memory. These techniques can improve GPGPUs by providing lower latency and higher bandwidth of the memory. However, these methods do not always guarantee improved performance and energy efficiency due to the small cache size and heterogeneity of the memory nodes. While prior works have proposed various techniques to address this issue, relatively little work has been done to investigate holistic support for memory management techniques. In this dissertation, we analyze performance pathologies and propose various techniques to improve memory management techniques. First, we investigate the effectiveness of advanced cache indexing (ACI) for high-performance and energy-efficient GPGPU computing. Specifically, we discuss the designs of various static and adaptive cache indexing schemes and present an implementation for GPGPUs. We then quantify and analyze the effectiveness of the ACI schemes based on a cycle-accurate GPGPU simulator. Our quantitative evaluation shows that ACI schemes achieve significant performance and energy-efficiency gains over the baseline conventional indexing scheme. We also analyze the performance sensitivity of ACI to key architectural parameters (i.e., capacity, associativity, and ICN bandwidth) and the cache indexing latency. We also demonstrate that ACI continues to achieve high performance in various settings. Second, we propose IACM, integrated adaptive cache management for high-performance and energy-efficient GPGPU computing. Based on the performance pathology analysis of GPGPUs, we integrate state-of-the-art adaptive cache management techniques (i.e., cache indexing, bypassing, and warp limiting) in a unified architectural framework to eliminate performance pathologies. Our quantitative evaluation demonstrates that IACM significantly improves the performance and energy efficiency of various GPGPU workloads over the baseline architecture (i.e., 98.1% and 61.9% on average, respectively) and achieves considerably higher performance than the state-of-the-art technique (i.e., 361.4% at maximum and 7.7% on average). Furthermore, IACM delivers significant performance and energy efficiency gains over the baseline GPGPU architecture even when enhanced with advanced architectural technologies (e.g., higher capacity, associativity). Third, we propose bandwidth- and latency-aware page placement (BLPP) for GPGPUs with heterogeneous memory. BLPP analyzes the characteristics of an application and determines the optimal page allocation ratio between the GPU and CPU memory. Based on the optimal page allocation ratio, BLPP dynamically allocates pages across the heterogeneous memory nodes. Our experimental results show that BLPP considerably outperforms the baseline and state-of-the-art technique (i.e., 13.4% and 16.7%) and performs similarly to the static-best version (i.e., 1.2% difference), which requires extensive offline profiling.

ScholarWorks@UNIST

Earth orbital teleoperator system man-machine interface evaluation

Author: Brye R. G.
Kirkpatrick M.
Malone T. B.
Shields N. L.

The teleoperator system man-machine interface evaluation develops and implements a program to determine human performance requirements in teleoperator systems.

NASA Technical Reports Server