6,797 research outputs found

    Software prefetching for software pipelined loops

    The paper investigates the interaction between software pipelining and different software prefetching techniques for VLIW machines. It is shown that processor stalls due to memory dependencies have a great impact on execution time. A novel heuristic is proposed and is shown to outperform previous proposals. Peer reviewed. Postprint (published version).

    Teaching Innovations in Manipulative Physiotherapy in the Physiotherapy Degree (Innovaciones Docentes en Fisioterapia Manipulativa en el Grado de Fisioterapia)

    The aim of this paper is to present the teaching improvement cycle (ciclo de mejora docente, CMD) carried out in the Manipulative Physiotherapy course of the Degree in Physiotherapy. After analyzing the structure of the course content, we sought to integrate the theoretical and practical topics, incorporating several teaching innovations so that knowledge is built in line with new teaching methodologies. We aimed for students to acquire the competencies defined in the course syllabus by using a problem-based learning model and introducing gamification systems that allowed us to implement the improvement cycle and reach an adequate quality of teaching.

    The effectiveness of loop unrolling for modulo scheduling in clustered VLIW architectures

    Clustered organizations are becoming a common trend in the design of VLIW architectures. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs the cluster assignment and the instruction scheduling in a single pass, which is shown to be more effective than performing the assignment first and the scheduling afterwards. We also show that loop unrolling significantly enhances the performance of the proposed scheduler, especially when the communication channel among clusters is the main performance bottleneck. By selectively unrolling some loops, we can obtain the best performance with the minimum increase in code size. Performance evaluation for SPECfp95 shows that the clustered architecture achieves about the same IPC (Instructions Per Cycle) as a unified architecture with the same resources. Moreover, when the cycle time is taken into account, a 4-cluster configuration is 3.6 times faster than the unified architecture. Peer reviewed. Postprint (published version).

    Modulo scheduling for a fully-distributed clustered VLIW architecture

    Clustering is an approach that many microprocessors are adopting in recent times in order to mitigate the increasing penalties of wire delays. We propose a novel clustered VLIW architecture which has all its resources partitioned among clusters, including the cache memory. A modulo scheduling scheme for this architecture is also proposed. This algorithm takes into account both register and memory inter-cluster communications so that the final schedule results in a cluster assignment that favors cluster locality in cache references and register accesses. It has been evaluated for both 2- and 4-cluster configurations and for differing numbers and latencies of inter-cluster buses. The proposed algorithm produces schedules with very low communication requirements and outperforms previous cluster-oriented schedulers. Peer reviewed. Postprint (published version).

    Fast, accurate and flexible data locality analysis

    This paper presents a tool based on a new approach for analyzing the locality exhibited by data memory references. The tool is very fast because it is based on a static locality analysis enhanced with very simple profiling information, which results in a negligible slowdown. This feature allows the tool to be used for highly time-consuming applications and to be included as a step in a typical iterative analysis-optimization process. The tool can provide a detailed evaluation of the reuse exhibited by a program, quantifying and qualifying the different types of misses either globally or broken down by program sections, data structures, memory instructions, etc. The accuracy of the tool is validated by comparing its results with those provided by a simulator. Peer reviewed. Postprint (published version).

    Distributed data cache designs for clustered VLIW processors

    Wire delays are a major concern for current and forthcoming processors. One approach to deal with this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset of the functional units, while the L1 data cache typically remains centralized in what we call partially distributed architectures. However, as technology evolves, the relative latency of such a centralized cache will increase, leading to an important impact on performance. In this paper, we propose partitioning the L1 data cache among clusters for clustered VLIW processors. We refer to this kind of design as fully distributed processors. In particular, we propose and evaluate three different configurations: a snoop-based cache coherence scheme, a word-interleaved cache, and flexible L0-buffers managed by the compiler. For each alternative, instruction scheduling techniques targeted to cyclic code are developed. Results for the Mediabench suite show that the performance of such fully distributed architectures is always better than the performance of a partially distributed one with the same amount of resources. In addition, the key aspects of each fully distributed configuration are explored. Peer reviewed. Postprint (published version).

    The Original "Privilegios Rodados" from the Ducal Archive of Medinaceli: I. Alfonso VIII, King of Castile (1158-1214)

    This is the first partial study in a series we are preparing on the magnificent documentary collection of "Privilegios Rodados" held in the Ducal Archive of Medinaceli, a good number of them unpublished. Of the nearly one hundred original parchments that make up the collection, the first six privilegios were issued by the chancery of Alfonso VIII, King of Castile, between 1175 and 1212. This monarch was the first in Castile to issue privilegios bearing the signo rodado and a pendant lead seal, undoubtedly the documentary type most admired since the Middle Ages for its solemnity, beauty, elegance, distinction, hierarchy, splendor, reliability and durability. These examples already show those characteristics, along with the formalities that surrounded these medieval documents in order to guarantee their authenticity. The study is preceded by an introduction devoted to the origins of the privilegio rodado in the Hispanic chanceries.