
    A software-hardware hybrid steering mechanism for clustered microarchitectures

    Clustered microarchitectures provide a promising paradigm to solve or alleviate the problems of increasing microprocessor complexity and wire delays. High-performance out-of-order processors rely on hardware-only steering mechanisms to achieve balanced workload distribution among clusters. However, the additional steering logic results in a significant increase in complexity, which reduces the benefits of the clustered design. In this paper, we address this complexity issue and present a novel software-hardware hybrid steering mechanism for out-of-order processors. The proposed software-hardware cooperative scheme makes use of the concept of virtual clusters. Instructions are distributed to virtual clusters at compile time using static properties of the program such as data dependences. Then, at runtime, virtual clusters are mapped onto physical clusters by considering workload information. Experiments using SPEC CPU2000 benchmarks show that our hybrid approach can achieve almost the same performance as a state-of-the-art hardware-only steering scheme, while requiring low hardware complexity. In addition, the proposed mechanism outperforms state-of-the-art software-only steering mechanisms by 5% and 10% on average for 2-cluster and 4-cluster machines, respectively. Peer reviewed. Postprint (published version).

    Reducing branch delay to zero in pipelined processors

    A mechanism to reduce the cost of branches in pipelined processors is described and evaluated. It is based on the use of multiple prefetch, early computation of the target address, delayed branch, and parallel execution of branches. The implementation of this mechanism using a branch target instruction memory is described. An analytical model of the performance of this implementation makes it possible to measure the efficiency of the mechanism with a very low computational cost. The model is used to determine the size of cache lines that maximizes the processor performance, to compare the performance of the mechanism with that of other schemes, and to analyze the performance of the mechanism with two alternative cache organizations. Peer reviewed. Postprint (published version).

    Virtual cluster scheduling through the scheduling graph

    This paper presents an instruction scheduling and cluster assignment approach for clustered processors. The proposed technique makes use of a novel representation named the scheduling graph, which describes all possible schedules. A powerful deduction process is applied to this graph, reducing at each step the set of possible schedules. In contrast to traditional list scheduling techniques, the proposed scheme tries to establish relations among instructions rather than assigning each instruction to a particular cycle. The main advantage is that wrong or poor schedules can be anticipated and discarded earlier. In addition, cluster assignment of instructions is performed using another novel concept called virtual clusters, which define sets of instructions that must execute in the same cluster. These clusters are managed during the deduction process to identify incompatibilities among instructions. The mapping of virtual to physical clusters is postponed until the scheduling of the instructions has finished. The advantages of this approach include (1) accurate scheduling information when assigning clusters and (2) accurate information on the cluster assignment constraints imposed by scheduling decisions. We have implemented and evaluated the proposed scheme with superblocks extracted from SPECint95 and MediaBench. The results show that this approach produces better schedules than the previous state of the art. Speed-ups are up to 15%, with average speed-ups ranging from 2.5% (2 clusters) to 9.5% (4 clusters). Peer reviewed. Postprint (published version).

    A unified modulo scheduling and register allocation technique for clustered processors

    This work presents a modulo scheduling framework for clustered ILP processors that integrates the cluster assignment, instruction scheduling and register allocation steps in a single phase. This unified approach is more effective than traditional approaches based on sequentially performing some (or all) of the three steps, since it allows optimizing the global code generation problem instead of searching for optimal solutions to each individual step. It also avoids the iterative nature of traditional approaches, which require repeated applications of the three steps until a valid solution is found. The proposed framework includes a mechanism to insert spill code on the fly and heuristics to evaluate the quality of partial schedules, simultaneously considering inter-cluster communications, memory pressure and register pressure. Transformations that allow trading pressure on one type of resource for pressure on another are also included. We show that the proposed technique outperforms previously proposed techniques. For instance, the average speed-up for SPECfp95 is 36% for a 4-cluster configuration. Peer reviewed. Postprint (published version).

    First order transitions by conduction calorimetry: Application to deuterated potassium dihydrogen phosphate ferroelastic crystal under uniaxial pressure

    The specific heat c and the heat power W exchanged by a deuterated potassium dihydrogen phosphate ferroelectric-ferroelastic crystal have been measured simultaneously for both decreasing and increasing temperature at a low constant rate (0.06 K/h) between 175 and 240 K. The measurements were carried out under controlled uniaxial stresses of 0.3 and 4.5±0.1 bar applied to face (110). At Tt=207.9 K, a first order transition is produced with anomalous specific heat behavior in the interval where the transition heat appears. This anomalous behavior is explained in terms of the temperature variation of the heat power during the transition. During cooling, the transition occurs with coexistence of phases, while during heating it seems that metastable states are reached. Excluding data affected by the transition heat, the specific heat behavior agrees with the predictions of a 2-4-6 Landau potential in the range of 4–15 K below Tt while logarithmic behavior is obtained in the range from Tt to 1 K below Tt. Data obtained under 0.3 and 4.5 bar uniaxial stresses exhibit the same behavior. Funding: Dirección General de Investigación Científica y Técnica, Gobierno de España, PB91-60

    Length and Age of First Maturation of Flemish Cap Cod in 1993 with an Histologic Study

    6 pages, 2 figures, 4 tables. Scientific Council Meeting. A total of 221 ovaries of cod caught on Flemish Cap in July 1993 were analyzed histologically. Sampled cod ranged from 31 to 88 cm in length and from 2 to 8 years in age. The percentage of spawning females by size and age class was studied, as was the percentage of females ripening for the first time. The maturation ogive based on cortical alveoli (taken as an indicator of the next spawners) gave a 50% maturation length and age of 50 cm and 4 years, and the maturation ogive based on postovulatory follicles (taken as an indicator of females that had spawned at least once) gave a 50% maturation length and age of 64 cm and 5 years. Peer reviewed.

    Instruction replication for clustered microarchitectures

    This work presents a new compilation technique that uses instruction replication in order to reduce the number of communications executed on a clustered microarchitecture. For such architectures, the need to communicate values between clusters can result in a significant performance loss. Inter-cluster communications can be reduced by selectively replicating an appropriate set of instructions. However, instruction replication must be done carefully since it may also degrade performance due to the increased contention it can place on processor resources. The proposed scheme is built on top of a previously proposed state-of-the-art modulo scheduling algorithm that effectively reduces communications. Results show that the number of communications can decrease using replication, which results in significant speed-ups. IPC is increased by 25% on average for a 4-cluster microarchitecture and by as much as 70% for selected programs. Peer reviewed. Postprint (published version).

    Juan Ramón Jiménez y su entorno social y cultural: de la correspondencia con León Sánchez Cuesta (1927-1956)

    This article analyzes the social and cultural context of Juan Ramón Jiménez through his relationship with León Sánchez Cuesta, known as "the bookseller of the Generation of '27". For this purpose we review various bibliographic sources, especially the unpublished correspondence that the bookseller addressed to the poet, two letters of which are reproduced in the article. Sánchez Cuesta was a personal friend of Juan Ramón Jiménez, the sole depositary of his work between approximately 1925 and 1933, a promoter and distributor of his work, and one of his main suppliers of books. The bookseller also played an important role as a mediator in the disputes between Juan Ramón Jiménez and some poets of the Generation of '27. Sánchez Cuesta is considered a key figure in the circle of Juan Ramón Jiménez, and his correspondence sheds light on the social and cultural context and the reading habits of the poet from Moguer.

    Overexpression of mitochondrial IF1 prevents metastatic disease of colorectal cancer by enhancing anoikis and tumor infiltration of NK cells

    Increasing evidence shows that the ATPase Inhibitory Factor 1 (IF1), the physiological inhibitor of the ATP synthase, is overexpressed in a large number of carcinomas, contributing to metabolic reprogramming and cancer progression. Herein, we show that, in contrast to the findings in other carcinomas, the overexpression of IF1 in a cohort of colorectal carcinomas (CRC) predicts a lower chance of disease recurrence, IF1 being an independent predictor of survival. Bioinformatic and gene expression analyses of the transcriptome of colon cancer cells with differential expression of IF1 indicate that cells overexpressing IF1 display a less aggressive behavior than IF1-silenced (shIF1) cells. Proteomic and functional in vitro migration and invasion assays confirmed the higher tumorigenic potential of shIF1 cells. Moreover, shIF1 cells have increased in vivo metastatic potential. The higher metastatic potential of shIF1 cells relies on increased cFLIP-mediated resistance to anoikis after cell detachment. Furthermore, tumor spheroids of shIF1 cells have an increased ability to escape from immune surveillance by NK cells. Altogether, the results reveal that the overexpression of IF1 acts as a tumor suppressor in CRC with an important anti-metastatic role, thus supporting IF1 as a potential therapeutic target in CRC. This research was funded by grants from Ministerio de Ciencia, Innovación y Universidades (SAF2013-41945-R, SAF2016-75916-R and SAF2016-75452-R), CIBERER-ISCIII (CB06/07/0017) and Fundación Ramón Areces, Spain.