152 research outputs found

    The Tag Filter Architecture: An energy-efficient cache and directory design

    [EN] Power consumption in current high-performance chip multiprocessors (CMPs) has become a major design concern, and it worsens as core counts keep increasing. A significant fraction of the total power budget is consumed by on-chip caches, which are usually deployed with a high degree of associativity (even L1 caches are being implemented with eight ways) to enhance system performance. On a cache access, every way in the corresponding set is accessed in parallel, which is costly in terms of energy. Coherence protocols, in turn, must implement efficient directory caches that scale in terms of power consumption. Most state-of-the-art techniques that reduce the energy consumption of directories do so at the cost of performance, which may become unacceptable for high-performance CMPs. In this paper, we propose an energy-efficient architectural design that can be effectively applied to any kind of cache memory. The proposed approach, called the Tag Filter (TF) Architecture, filters the ways accessed in the target cache set, so that only a few ways are searched in the tag and data arrays. This allows the approach to reduce the dynamic energy consumption of caches without hurting their access time. For this purpose, the proposed architecture holds the X least significant bits of each tag in a small auxiliary X-bit-wide array. These bits are used to filter out the ways where the least significant bits of the tag do not match the bits in the X-bit array. Experimental results show that, on average, the TF Architecture reduces the dynamic power consumption across the studied applications by up to 74.9%, 85.9%, and 84.5% when applied to L1 caches, L2 caches, and directory caches, respectively.
    This work has been jointly supported by MINECO and European Commission (FEDER funds) under the project TIN2015-66972-C5-1-R/3-R and by Fundación Séneca, Agencia de Ciencia y Tecnología de la Región de Murcia under the project Jóvenes Líderes en Investigación 18956/JLI/13.
    Valls, J.; Ros Bardisa, A.; Gómez Requena, ME.; Sahuquillo Borrás, J. (2017). The Tag Filter Architecture: An energy-efficient cache and directory design. Journal of Parallel and Distributed Computing. 100:193-202. https://doi.org/10.1016/j.jpdc.2016.04.016
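
    The filtering step described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name `TagFilteredSet`, the eight-way set, the choice X = 2, and the example tags are all hypothetical. It models one cache set, keeps the X least significant bits of each tag in a separate filter list, and reports how many ways would still need a full tag comparison.

```python
from typing import List, Optional, Tuple


class TagFilteredSet:
    """Toy model of one cache set with an auxiliary X-bit filter per way.

    Hypothetical illustration only; sizes and names are assumptions.
    """

    def __init__(self, ways: int = 8, x_bits: int = 2) -> None:
        self.mask = (1 << x_bits) - 1                        # selects the X low-order tag bits
        self.tags: List[Optional[int]] = [None] * ways        # full tags (the "tag array")
        self.filter_bits: List[Optional[int]] = [None] * ways  # X LSBs per way (the "filter array")

    def fill(self, way: int, tag: int) -> None:
        """Install a tag in a way and record its X least significant bits."""
        self.tags[way] = tag
        self.filter_bits[way] = tag & self.mask

    def lookup(self, tag: int) -> Tuple[Optional[int], int]:
        """Return (hit way or None, number of ways that passed the filter)."""
        wanted = tag & self.mask
        # Only ways whose filter bits match are searched in the full tag array.
        candidates = [w for w, bits in enumerate(self.filter_bits) if bits == wanted]
        hit = next((w for w in candidates if self.tags[w] == tag), None)
        return hit, len(candidates)


cache_set = TagFilteredSet(ways=8, x_bits=2)
example_tags = [0b10100, 0b10101, 0b11110, 0b00011,
                0b01000, 0b01101, 0b10010, 0b00111]
for way, tag in enumerate(example_tags):
    cache_set.fill(way, tag)

print(cache_set.lookup(0b01101))  # (5, 2): hit in way 5, only 2 of 8 ways searched
```

    In this toy model the filter never causes a false miss, because a way holding the requested tag always has matching low-order bits; the saving comes from skipping the ways whose low-order bits differ.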

    Analysis of opportunities for cache coherence in heterogeneous embedded systems

    [EN] New challenges arise in the context of heterogeneous embedded systems. This work focuses on coherence in those systems in order to analyze the possibility of applying techniques that better cope with such challenges. Prior to that, we explain what the coherence problem is and which solutions to it are currently proposed.
    Esteve García, A. (2012). Analysis of opportunities for cache coherence in heterogeneous embedded systems. http://hdl.handle.net/10251/29846

    A Study on Highly Efficient Memory Order Violation Detection Mechanisms (高効率なメモリ順序違反検出機構に関する研究)

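    Degree type: course-based doctorate. Examination committee: (chair) Professor 浅見 徹, University of Tokyo; Professor 坂井 修一, University of Tokyo; Associate Professor 田浦 健次朗, University of Tokyo; Associate Professor 豊田 正史, University of Tokyo; Professor 五島 正裕, National Institute of Informatics. University of Tokyo (東京大学)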

    Processing of an iceberg query on distributed and centralized databases

    Master's thesis (Master of Science)

    Discrete Mathematics and Symmetry

    Some of the most beautiful studies in Mathematics are related to Symmetry and Geometry. For this reason, we select here some contributions about such aspects and Discrete Geometry. As we know, symmetry in a system means invariance of its elements under transformations. When we consider network structures, symmetry means invariance of the adjacency of nodes under permutations of the node set. Graph isomorphism is an equivalence relation on the set of graphs. Therefore, it partitions the class of all graphs into equivalence classes. The underlying idea of isomorphism is that some objects have the same structure if we omit the individual character of their components. A set of graphs isomorphic to each other is called an isomorphism class of graphs. An automorphism of a graph G is an isomorphism from G onto itself. The family of all automorphisms of a graph G is a permutation group.
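
    The definition of an automorphism given above can be checked by brute force on a small graph. The sketch below is an illustration only, not taken from the book: the function name `automorphisms` and the example graph (a 4-cycle) are assumptions, and the graph is assumed to be simple and undirected. It enumerates every permutation of the node set and keeps those that map the edge set onto itself.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def automorphisms(n: int, edges: Sequence[Tuple[int, int]]) -> List[Tuple[int, ...]]:
    """Return all node permutations of a simple undirected graph that preserve adjacency.

    Nodes are 0..n-1. Brute force over n! permutations, so only for tiny graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    result = []
    for perm in permutations(range(n)):
        # Image of every edge under the permutation.
        mapped = {frozenset((perm[u], perm[v])) for u, v in edges}
        # A bijection that maps the edge set onto itself preserves adjacency.
        if mapped == edge_set:
            result.append(perm)
    return result


# Hypothetical example: the 4-cycle 0-1-2-3-0.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
group = automorphisms(4, square)
print(len(group))  # 8: four rotations and four reflections of the square
```

    Composing any two of the returned permutations yields another one in the list, which is the group property the abstract refers to.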

    Hardware Architecture for Semantic Comparison

    Semantic Routed Networks provide a superior infrastructure for complex search engines. In a Semantic Routed Network (SRN), the routers are the critical component, and they perform semantic comparison as their key computation. As the amount of information available on the Internet grows, the speed and efficiency with which information can be retrieved for the user becomes important. Most current search engines scale to meet the growing demand by deploying large data centers with general-purpose computers that consume many megawatts of power. Reducing the power consumption of these data centers while providing better performance will help reduce operating costs significantly. Performing operations in parallel is a key optimization step for better performance on general-purpose CPUs. Current techniques for parallelization include architectures that are multi-core and have multiple thread handling capabilities. These coarse-grained approaches have considerable resource management overhead and provide only sub-linear speedup. This dissertation proposes techniques towards a highly parallel, power-efficient architecture that performs semantic comparisons as its core activity. Hardware-centric parallel algorithms have been developed to populate the required data structures, followed by computation of semantic similarity. The performance of the proposed design is further enhanced using a pipelined architecture. The proposed algorithms were also implemented on two contemporary platforms, Nvidia CUDA and an FPGA, for performance comparison. In order to validate the designs, a semantic benchmark was also created. It has been shown that a dedicated semantic comparator delivers significantly better performance than the other platforms. Results show that the proposed hardware semantic comparison architecture delivers a speedup of up to 10^5 while reducing power consumption by 80% compared to traditional computing platforms. Future research directions, including better power optimization, architecting the complete semantic router, and using the semantic benchmark for SRN research, are also discussed.

    Computational pan-genomics: status, promises and challenges

    Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies, and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.