Efficient resources assignment schemes for clustered multithreaded processors
New feature sizes provide a larger number of transistors per chip that architects could use to further exploit instruction-level parallelism. However, these technologies also bring new challenges that complicate conventional monolithic processor designs. On the one hand, exploiting instruction-level parallelism is yielding diminishing returns, so other sources of parallelism, such as thread-level parallelism, must be exploited to keep raising performance with reasonable hardware complexity. On the other hand, clustered architectures have been widely studied as a way to reduce the inherent complexity of current monolithic processors. This paper studies the synergies and trade-offs between two concepts, clustering and simultaneous multithreading (SMT), in order to understand why conventional SMT resource assignment schemes are not so effective in clustered processors. These trade-offs are used to propose a novel resource assignment scheme that achieves an average speedup of 17.6% versus Icount while improving fairness by 24%.
Frontend frequency-voltage adaptation for optimal energy-delay²
In this paper, we present a clustered, multiple-clock domain (CMCD) microarchitecture that combines the benefits of both clustering and globally asynchronous locally synchronous (GALS) designs. We also present a mechanism for dynamically adapting the frequency and voltage of the frontend of the CMCD with the goal of optimizing the energy-delay² product (ED2P). Our mechanism has minimal hardware cost, is entirely self-adjustable, does not depend on any thresholds, and achieves results close to optimal. We evaluate it on 16 SPEC 2000 applications and report a 17.5% ED2P reduction on average (80% of the upper bound).
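For readers unfamiliar with the metric, ED2P is conventionally defined as energy multiplied by the square of the delay. The symbols below ($E$ for energy consumed, $D$ for execution time) are notation assumed here for illustration; the abstract itself gives only the name of the metric and the reported percentage.

```latex
\[
  \mathrm{ED^2P} = E \cdot D^{2}
\]
% A 17.5% average reduction, as reported in the abstract, corresponds to
\[
  \frac{\mathrm{ED^2P}_{\text{adapted}}}{\mathrm{ED^2P}_{\text{baseline}}} = 1 - 0.175 = 0.825 .
\]
```

Because delay enters squared, the metric penalizes slowdowns more heavily than the plain energy-delay product does, which is why it is often preferred for performance-oriented designs.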
The change towards a teaching methodology based on competences: a case study in a Spanish university
The European Higher Education Area (EHEA) has promoted the implementation of a teaching methodology based on competences. Drawing on New Institutional Sociology, the present work aims to identify and improve knowledge of the factors that are hindering that change in the Spanish university system. This is investigated using a case study of a Spanish university that is a pioneer in the implementation of a competence-based curriculum. The results show that the factors identified and analysed are conditioning the change and causing a ceremonial adoption of the competence-based system by the teaching staff. The results of the study may be of use as a reference and orientation for educators and administrators as well as for the regulators of the countries integrated within the EHEA. The study may prove useful in the analysis of the inertia factors present when designing and implementing the policies and measures necessary to achieve a comprehensive teaching methodology, one which genuinely incorporates the learning and assessment of competences.
Control speculation for energy-efficient next-generation superscalar processors
Conventional front-end designs attempt to maximize the number of "in-flight" instructions in the pipeline. However, branch mispredictions cause the processor to fetch useless instructions that are eventually squashed, increasing front-end energy and issue queue utilization and, thus, wasting around 30 percent of the power dissipated by a processor. Furthermore, processor design trends lead to increasing clock frequencies by lengthening the pipeline, which puts more pressure on the branch prediction engine since branches take longer to be resolved. As next-generation high-performance processors become deeply pipelined, the amount of wasted energy due to misspeculated instructions will go up. The aim of this work is to reduce the energy consumption of misspeculated instructions. We propose selective throttling, which triggers different power-aware techniques (fetch throttling, decode throttling, or disabling the selection logic) depending on the branch prediction confidence level. Results show that combining fetch-bandwidth reduction along with select-logic disabling provides the best performance in terms of overall energy reduction and energy-delay product improvement (14 percent and 10 percent, respectively, for a processor with a 22-stage pipeline and 16 percent and 13 percent, respectively, for a processor with a 42-stage pipeline).
Negotiation of meaning in outside of the classroom group assignments: accounting for the how to understand the what of future mathematics teachers' learning
In this paper we illustrate how Wenger’s theory of social learning can be used to account for phenomena of future teachers’ change in settings that are not usually studied, namely the group work that future teachers do on class assignments outside of class. We describe how we adapted Wenger’s theory to the exploration of future mathematics teachers’ learning, and we illustrate how the analysis of the audiotaped interaction of a group of future teachers working outside the classroom generated conjectures that help to explain the development of their didactic knowledge.
Virtual-physical registers
A novel dynamic register renaming approach is proposed in this work. The key idea of the scheme is to delay the allocation of physical registers until a late stage in the pipeline, instead of doing it in the decode stage as conventional schemes do. In this way, register pressure is reduced and the processor can exploit more instruction-level parallelism. Delaying the allocation of physical registers requires some additional artifact to keep track of dependences. This is achieved by introducing the concept of virtual-physical registers, which do not require any storage location and are used to identify dependences among instructions that have not yet allocated a register to their destination operand. Two alternative allocation strategies have been investigated that differ in the stage where physical registers are allocated: issue or write-back. The experimental evaluation has confirmed the higher performance of the latter alternative. We have evaluated the novel scheme through a detailed simulation of a dynamically scheduled processor. The results show a significant improvement (e.g., a 19% increase in IPC for a machine with 64 physical registers in each file) when compared with the traditional register renaming approach.
Using MCD-DVS for dynamic thermal management performance improvement
With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal emergency mechanisms are crucial in order to sustain performance growth. Many past techniques for power reduction, and more recently some for thermal management, have helped to alleviate this problem. Probably the most important thermal control technique is dynamic voltage and frequency scaling (DVS), which allows for an almost cubic reduction in power with only a linear worst-case performance penalty. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows for independent voltage and frequency control of each of the clock domains that are part of the chip. There are several studies on DVS for GALS that aim to improve energy and power efficiency but not temperature. This paper proposes and analyses the use of DVS at the domain level to control temperature in a clustered MCD microarchitecture, with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers.
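The "almost cubic" power reduction can be seen from the standard CMOS dynamic-power model, which the abstract does not spell out. The following is a sketch under stated assumptions: supply voltage $V$ is taken to scale linearly with frequency $f$, static (leakage) power is ignored, and the symbols $\alpha$ (activity factor), $C$ (switched capacitance), $s$ (scaling factor) and $T$ (execution time) are notation introduced here for illustration.

```latex
\[
  P_{\mathrm{dyn}} = \alpha\, C\, V^{2} f
\]
% Scale voltage and frequency together by a factor s, with 0 < s <= 1:
\[
  V \to sV,\quad f \to sf
  \;\;\Longrightarrow\;\;
  P'_{\mathrm{dyn}} = \alpha\, C\, (sV)^{2} (sf) = s^{3}\, P_{\mathrm{dyn}}
\]
% Worst-case slowdown (fully compute-bound code) is only linear in 1/s:
\[
  T' \le \frac{T}{s}
\]
```

In practice the reduction is somewhat less than cubic, since voltage cannot be scaled in exact proportion to frequency and leakage power does not follow this model, which is consistent with the abstract's "almost".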
Anomalies in the cognitive-executive functions in patients with Chiari malformation type I
Abstract taken from the publication. Background: over the last decade, there has been growing evidence that neuropsychological deficits, mainly in executive functions, may be involved in the pathogenesis of Chiari type I disease. The aim of the study is to evaluate the influence of structural abnormalities on neuropsychological functions, chiefly executive ones, in patients with Chiari type I. Method: the neuropsychological profile of these patients was compared with that of healthy controls. Both the Chiari type I patients and the healthy controls were given neuropsychological tests assessing the frontal executive functions of vigilance or sustained attention, mental flexibility, and planning and concept formation (Stroop, CPT, WCST). Results: the results suggest that Chiari type I patients are impaired in the processes of inhibition and self-control (Stroop), in attentional capacity, and in maintaining the course of thought and action (WCST). Conclusions: these results provide evidence of possible deficits or anomalies in cognitive executive functions that would make it possible to differentiate patients with Chiari type I. Universidad de Oviedo. Biblioteca de Psicología; Plaza Feijoo, s/n.; 33003 Oviedo; Tel. +34985104146; Fax +34985104126.