Search CORE

54,960 research outputs found

When should I use network emulation ?

Author: Dairaine Laurent
Lochin Emmanuel
Pérennou Tanguy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/06/2011
Field of study

The design and development of a complex system requires an adequate methodology and efficient instrumental support in order to early detect and correct anomalies in the functional and non-functional properties of the tested protocols. Among the various tools used to provide experimental support for such developments, network emulation relies on real-time production of impairments on real traffic according to a communication model, either realistically or not. This paper aims at simply presenting to newcomers in network emulation (students, engineers, ...) basic principles and practices illustrated with a few commonly used tools. The motivation behind is to fill a gap in terms of introductory and pragmatic papers in this domain. The study particularly considers centralized approaches, allowing cheap and easy implementation in the context of research labs or industrial developments. In addition, an architectural model for emulation systems is proposed, defining three complementary levels, namely hardware, impairment and model levels. With the help of this architectural framework, various existing tools are situated and described. Various approaches for modeling the emulation actions are studied, such as impairment-based scenarios and virtual architectures, real-time discrete simulation and trace-based systems. Those modeling approaches are described and compared in terms of services and we study their ability to respond to various designer needs to assess when emulation is needed

arXiv.org e-Print Archive

Crossref

Open Archive Toulouse Archive Ouverte

When Should I Use Network Emulation?

Author: Lochin Emmanuel
Perennou Tanguy
Dairaine Laurent
Publication venue
Publication date: 11/07/2001
Field of study

arXiv.org e-Print Archive

Open Archive Toulouse Archive Ouverte

Joint Institute for Nuclear Research (JINR)

CERN Document Server

Improving the Performance and Energy Efficiency of GPGPU Computing through Adaptive Cache and Memory Management Techniques

Author: Kim Kyu Yeun
Publication venue: Graduate School of UNIST
Publication date: 01/02/2020
Field of study

Department of Computer Science and EngineeringAs the performance and energy efficiency requirement of GPGPUs have risen, memory management techniques of GPGPUs have improved to meet the requirements by employing hardware caches and utilizing heterogeneous memory. These techniques can improve GPGPUs by providing lower latency and higher bandwidth of the memory. However, these methods do not always guarantee improved performance and energy efficiency due to the small cache size and heterogeneity of the memory nodes. While prior works have proposed various techniques to address this issue, relatively little work has been done to investigate holistic support for memory management techniques. In this dissertation, we analyze performance pathologies and propose various techniques to improve memory management techniques. First, we investigate the effectiveness of advanced cache indexing (ACI) for high-performance and energy-efficient GPGPU computing. Specifically, we discuss the designs of various static and adaptive cache indexing schemes and present implementation for GPGPUs. We then quantify and analyze the effectiveness of the ACI schemes based on a cycle-accurate GPGPU simulator. Our quantitative evaluation shows that ACI schemes achieve significant performance and energy-efficiency gains over baseline conventional indexing scheme. We also analyze the performance sensitivity of ACI to key architectural parameters (i.e., capacity, associativity, and ICN bandwidth) and the cache indexing latency. We also demonstrate that ACI continues to achieve high performance in various settings. Second, we propose IACM, integrated adaptive cache management for high-performance and energy-efficient GPGPU computing. Based on the performance pathology analysis of GPGPUs, we integrate state-of-the-art adaptive cache management techniques (i.e., cache indexing, bypassing, and warp limiting) in a unified architectural framework to eliminate performance pathologies. Our quantitative evaluation demonstrates that IACM significantly improves the performance and energy efficiency of various GPGPU workloads over the baseline architecture (i.e., 98.1% and 61.9% on average, respectively) and achieves considerably higher performance than the state-of-the-art technique (i.e., 361.4% at maximum and 7.7% on average). Furthermore, IACM delivers significant performance and energy efficiency gains over the baseline GPGPU architecture even when enhanced with advanced architectural technologies (e.g., higher capacity, associativity). Third, we propose bandwidth- and latency-aware page placement (BLPP) for GPGPUs with heterogeneous memory. BLPP analyzes the characteristics of a application and determines the optimal page allocation ratio between the GPU and CPU memory. Based on the optimal page allocation ratio, BLPP dynamically allocate pages across the heterogeneous memory nodes. Our experimental results show that BLPP considerably outperforms the baseline and state-of-the-art technique (i.e., 13.4% and 16.7%) and performs similar to the static-best version (i.e., 1.2% difference), which requires extensive offline profiling.clos

ScholarWorks@UNIST

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas
Kouris Alexandros
Venieris Stylianos I.
Publication venue
Publication date: 19/02/2018
Field of study

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

A system engineering perspective to knowledge transfer: A case study approach of BIM adoption

Author: Arayici Y
Coates SP
Publication venue: 'IntechOpen'
Publication date: 05/09/2012
Field of study

IntechOpen

University of Salford Institutional Repository

Management and Service-aware Networking Architectures (MANA) for Future Internet Position Paper: System Functions, Capabilities and Requirements

Author: Abramowicz H.
Brunner M.
Chemouil P.
Galis A.
Pras Aiko
Raz D.
Publication venue: IEEE Communications Society
Publication date: 01/01/2009
Field of study

Future Internet (FI) research and development threads have recently been gaining momentum all over the world and as such the international race to create a new generation Internet is in full swing: GENI, Asia Future Internet, Future Internet Forum Korea, European Union Future Internet Assembly (FIA). This is a position paper identifying the research orientation with a time horizon of 10 years, together with the key challenges for the capabilities in the Management and Service-aware Networking Architectures (MANA) part of the Future Internet (FI) allowing for parallel and federated Internet(s)

UCL Discovery

University of Twente Research Information

Recommended from our members

VIPER : a 25-MHz, 100-MIPS peak VLIW micro-processor

Author: Abnous A.
Bagherzadeh N.
Gray J.
Naylor A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

This paper describes the design and implementation of a very long instruction word (VLIW) microprocessor. The VIPER (VLIW integer processor) contains four pipelined functional units, and can achieve 100 MIPS peak performance at 25 MHz. The procesor is capable of performing multiway branch operations, two load/store operations and up to four ALU operations in each clock cycle, with full register file access to each functional unit. VIPER is the first VLIW microprocessor known that can achieve this level of performance. Designed in twelve months, the processor is integrated with an instruction cache controller and a data cache, requiring 450,000 transistors and a die size of 12.9 by 9.1 mm in a 1.2 µm technology

eScholarship - University of California