The effectiveness of loop unrolling for modulo scheduling in clustered VLIW architectures
Clustered organizations are becoming a common trend in the design of VLIW architectures. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs the cluster assignment and the instruction scheduling in a single pass, which is shown to be more effective than performing the assignment first and the scheduling afterwards. We also show that loop unrolling significantly enhances the performance of the proposed scheduler, especially when the communication channel among clusters is the main performance bottleneck. By selectively unrolling some loops, we can obtain the best performance with the minimum increase in code size. Performance evaluation for SPECfp95 shows that the clustered architecture achieves about the same IPC (Instructions Per Cycle) as a unified architecture with the same resources. Moreover, when the cycle time is taken into account, a 4-cluster configuration is 3.6 times faster than the unified architecture.
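One general reason unrolling can help a modulo scheduler is that the initiation interval (II) must be a whole number of cycles, so a fractional resource bound is rounded up. The sketch below illustrates only that textbook lower bound; it is not the scheduler from the abstract, and the function names, resource names, and counts are hypothetical.

```python
from math import ceil

def res_mii(ops_per_iter, units):
    """Resource-constrained lower bound on the initiation interval.

    ops_per_iter: operations needed per loop iteration, per resource kind.
    units: number of available units, per resource kind.
    (Both dictionaries are illustrative assumptions.)
    """
    return max(ceil(ops_per_iter[r] / units[r]) for r in ops_per_iter)

def per_iteration_ii(ops_per_iter, units, unroll):
    """Lower bound on cycles per ORIGINAL iteration after unrolling."""
    unrolled = {r: n * unroll for r, n in ops_per_iter.items()}
    return res_mii(unrolled, units) / unroll

# Hypothetical loop body: 3 ALU operations and 1 inter-cluster bus transfer.
ops = {"alu": 3, "bus": 1}
hw = {"alu": 2, "bus": 1}

print(per_iteration_ii(ops, hw, 1))  # 2.0: ceil(3/2) wastes half a cycle
print(per_iteration_ii(ops, hw, 2))  # 1.5: unrolling by 2 recovers it
```

Unrolling doubles the loop body here, which is the code-size cost that the abstract's selective unrolling aims to limit.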
Modulo scheduling for a fully-distributed clustered VLIW architecture
Clustering is an approach that many microprocessors have recently adopted in order to mitigate the increasing penalties of wire delays. We propose a novel clustered VLIW architecture which has all its resources partitioned among clusters, including the cache memory. A modulo scheduling scheme for this architecture is also proposed. This algorithm takes into account both register and memory inter-cluster communications so that the final schedule results in a cluster assignment that favors cluster locality in cache references and register accesses. It has been evaluated for both 2- and 4-cluster configurations and for differing numbers and latencies of inter-cluster buses. The proposed algorithm produces schedules with very low communication requirements and outperforms previous cluster-oriented schedulers.
Fast, accurate and flexible data locality analysis
This paper presents a tool based on a new approach for analyzing the locality exhibited by data memory references. The tool is very fast because it is based on a static locality analysis enhanced with very simple profiling information, which results in a negligible slowdown. This feature allows the tool to be used for highly time-consuming applications and to be included as a step in a typical iterative analysis-optimization process. The tool can provide a detailed evaluation of the reuse exhibited by a program, quantifying and qualifying the different types of misses either globally or broken down by program sections, data structures, memory instructions, etc. The accuracy of the tool is validated by comparing its results with those provided by a simulator.
Actions to eliminate gender conditioning in the choice of engineering degrees: development of the “Ingeniera por qué no” campaign at the Universidad Miguel Hernández (UMH)
During 2012, a team of lecturers at the Universidad Miguel Hernández (UMH) has been carrying out a project comprising a set of initiatives aimed at breaking the social prejudice that professions and fields of study have a gender. This prejudice produces a significant gender bias in some professions and degrees, as with the masculinization that engineering has traditionally and culturally undergone and continues to undergo. The UMH has two technical schools: the long-established Escuela Politécnica Superior de Orihuela (EPSO), located on the Orihuela campus (Alicante), and the Escuela Politécnica Superior de Elche (EPSE). For these reasons, and knowing of the awareness campaign that the Fundación Isonomía has recently been running successfully under the slogan "Ingeniera por qué no", we considered it worthwhile to promote and extend that campaign on our campus and so join efforts to break the idea that degrees have a gender, through actions such as: training workshops in the master's programmes for secondary and baccalaureate teachers offered by the UMH, training workshops for baccalaureate students in secondary schools, the preparation of reports that make this reality visible within our university, and the preparation of training material that supports the actions for change across all the proposed initiatives
Triazole-Directed Pd-Catalyzed C(sp2)–H Oxygenation of Arenes and Alkenes
Selective Pd-catalyzed C(sp2)–H oxygenation of 4-substituted 1,2,3-triazoles is described. Unlike previous metal-catalyzed C–H functionalization events, which preferentially occur at the activated heterocyclic C–H bond, the regioselective oxygenation of the arene/alkene moiety is now achieved featuring the unconventional role of a simple triazole scaffold as a modular and selective directing group. Acknowledgement: MINECO for a Ramon y Cajal research contract (RYC-2012-09873)
Flexible compiler-managed L0 buffers for clustered VLIW processors
Wire delays are a major concern for current and forthcoming processors. One approach to this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset of the functional units, while the data cache remains centralized. However, as technology evolves, the latency of such a centralized cache increases, with a significant impact on performance. In this paper, we propose to include flexible low-latency buffers in each cluster in order to reduce the performance impact of higher cache latencies. The reduced number of entries in each buffer permits the design of flexible ways to map data from L1 to these buffers. The proposed L0 buffers are managed by the compiler, which is responsible for deciding which memory instructions make use of them. Effective instruction scheduling techniques are proposed to generate code that exploits these buffers. Results for the Mediabench benchmark suite show that the performance of a clustered VLIW processor with a unified L1 data cache is improved by 16% when such buffers are used. In addition, the proposed architecture also shows significant advantages over both MultiVLIW processors and clustered processors with a word-interleaved cache, two state-of-the-art designs with a distributed L1 data cache.
Virtual cluster scheduling through the scheduling graph
This paper presents an instruction scheduling and cluster assignment approach for clustered processors. The proposed technique makes use of a novel representation named the scheduling graph, which describes all possible schedules. A powerful deduction process is applied to this graph, reducing at each step the set of possible schedules. In contrast to traditional list scheduling techniques, the proposed scheme tries to establish relations among instructions rather than assigning each instruction to a particular cycle. The main advantage is that wrong or poor schedules can be anticipated and discarded earlier. In addition, cluster assignment of instructions is performed using another novel concept called virtual clusters, which define sets of instructions that must execute in the same cluster. These clusters are managed during the deduction process to identify incompatibilities among instructions. The mapping of virtual to physical clusters is postponed until the scheduling of the instructions has finished. The advantages of this novel approach include: (1) accurate scheduling information when assigning clusters, and (2) accurate information about the cluster assignment constraints imposed by scheduling decisions. We have implemented and evaluated the proposed scheme with superblocks extracted from SPECint95 and MediaBench. The results show that this approach produces better schedules than the previous state-of-the-art. Speed-ups are up to 15%, with average speed-ups ranging from 2.5% (2 clusters) to 9.5% (4 clusters).
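The idea of virtual clusters as sets of instructions that must share a cluster can be illustrated with a disjoint-set (union-find) structure. This is only a sketch under assumptions: the abstract does not say how virtual clusters are represented, and the class name, method names, and the "must differ" constraint used to model incompatibilities are all hypothetical.

```python
class VirtualClusters:
    """Groups instructions that must run in the same (not yet chosen) cluster."""

    def __init__(self, instructions):
        # Each instruction starts in its own virtual cluster.
        self.parent = {i: i for i in instructions}
        # Pairs of instructions assumed to need different clusters.
        self.incompatible = []

    def find(self, i):
        # Follow parent links to the representative, halving paths as we go.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def _conflict(self, ra, rb):
        # True if some recorded pair spans exactly these two virtual clusters.
        return any({self.find(x), self.find(y)} == {ra, rb}
                   for x, y in self.incompatible)

    def must_share(self, a, b):
        """Merge the virtual clusters of a and b; False if they are incompatible."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        if self._conflict(ra, rb):
            return False
        self.parent[rb] = ra
        return True

    def must_differ(self, a, b):
        """Record that a and b need different clusters; False if already merged."""
        if self.find(a) == self.find(b):
            return False
        self.incompatible.append((a, b))
        return True

vc = VirtualClusters(["a", "b", "c", "d"])
vc.must_share("a", "b")         # a and b now form one virtual cluster
vc.must_differ("b", "c")        # c may not join that virtual cluster
print(vc.must_share("a", "c"))  # False: the merge is rejected as incompatible
```

Because no physical cluster is named anywhere in the structure, the mapping from virtual to physical clusters can be left until scheduling is complete, as the abstract describes.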
A unified modulo scheduling and register allocation technique for clustered processors
This work presents a modulo scheduling framework for clustered ILP processors that integrates the cluster assignment, instruction scheduling and register allocation steps in a single phase. This unified approach is more effective than traditional approaches based on sequentially performing some (or all) of the three steps, since it allows optimizing the global code generation problem instead of searching for optimal solutions to each individual step. It also avoids the iterative nature of traditional approaches, which require repeated applications of the three steps until a valid solution is found. The proposed framework includes a mechanism to insert spill code on-the-fly and heuristics to evaluate the quality of partial schedules considering simultaneously inter-cluster communications, memory pressure and register pressure. Transformations that allow trading pressure on one type of resource for pressure on another are also included. We show that the proposed technique outperforms previously proposed techniques. For instance, the average speed-up for SPECfp95 is 36% for a 4-cluster configuration.
Three low environmental terraced houses: a case of good practices in a central region of Spain
Topic III: Best practices and failures in realized SB projects in the Mediterranean section. This paper presents a good practice as an A0 poster with plans, photographs and text. The good practice consists of three terraced houses of low environmental impact built in Valladolid, a medium-sized Spanish city. The research on the reduction of CO2 emissions in this project received an award from the Obra Social of Caja España de Inversiones, Caja de Ahorros y Monte de Piedad, in its Research Prize call for renewable energies in 2003. The three houses were completed in 2004. Each house has a floor area of 190 m2, with a conventional family programme, and was built by a private developer in a residential area. The technical construction level is an average between light or high techno
Explanatory factors of university student participation in flamenco
The present work offers a study exploring University of Seville students’ cultural participation and how often they attend live flamenco shows. Based on the statistical yearbook of this university, a sample of 452 students from different fields was selected and, by applying a questionnaire, a binomial logit model and an ordered finance model were constructed. Our empirical findings offer descriptive, explanatory and predictive statistical results regarding participation and frequency. For example, the results show that 43% of University of Seville students have never attended a live flamenco show and that one of the main factors influencing attendance is human and cultural capital