25 research outputs found

    Concert recording 2013-03-28

    Get PDF
    [Track 01]. Fanfares liturgiques. Procession du Vendredi-Saint / Henri Tomasi -- [Track 02]. The good soldier Schweik suite. Overture / Robert Kurka -- [Track 03]. The good soldier Schweik suite. Lament / Robert Kurka -- [Track 04]. The good soldier Schweik suite. March / Robert Kurka -- [Track 05]. The good soldier Schweik suite. War Dance / Robert Kurka -- [Track 06]. The good soldier Schweik suite. Pastoral / Robert Kurka -- [Track 07]. The good soldier Schweik suite. Finale / Robert Kurka -- [Track 08]. Serenade no. 11 in E flat major, KV 375. Allegro maestoso / Wolfgang Amadeus Mozart -- [Track 09]. Prelude, fugue and riffs / Leonard Bernstein

    Latency Tolerant Branch Predictors

    Get PDF
The access latency of branch predictors is a well-known problem in fetch engine design. Prediction overriding techniques are commonly accepted as a way to overcome this problem. However, prediction overriding requires a complex recovery mechanism to discard the wrong speculative work based on overridden predictions. In this paper, we show that stream and trace predictors, which use long basic prediction units, can tolerate access latency without overriding, thus reducing fetch engine complexity. We show that both the stream fetch engine and the trace cache architecture, without using overriding, outperform other efficient fetch engines, such as an EV8-like fetch architecture or the FTB fetch engine, even when those engines do use overriding.
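The overriding mechanism the abstract argues against can be illustrated with a minimal sketch. All names, the trace format, and the 3-cycle latency are illustrative assumptions, not details from the paper: a fast first-level prediction steers fetch immediately, and when the slower, more accurate prediction disagrees a few cycles later, the speculative work based on the fast prediction must be squashed.

```python
# Sketch of prediction overriding: a 1-cycle "fast" predictor steers fetch
# immediately; a slower, more accurate predictor delivers its result a few
# cycles later and overrides (squashes) fetch when the two disagree.
# All names and numbers here are illustrative, not from any concrete design.

def simulate_overriding(branches, fast_pred, slow_pred, slow_latency=3):
    """Count cycles lost to overriding squashes for a trace of branches.

    branches : list of (pc, taken) pairs in dynamic order
    fast_pred, slow_pred : functions pc -> predicted direction (bool)
    """
    squash_cycles = 0
    for pc, taken in branches:
        fast = fast_pred(pc)
        slow = slow_pred(pc)
        if fast != slow:
            # The override discards slow_latency cycles of speculative fetch.
            squash_cycles += slow_latency
    return squash_cycles

# Example: the fast predictor always says "taken"; the slow one is perfect.
trace = [(0x40, True), (0x48, False), (0x50, True)]
lost = simulate_overriding(trace,
                           fast_pred=lambda pc: True,
                           slow_pred=lambda pc: dict(trace)[pc])
# One disagreement (at 0x48) -> 3 squashed cycles
```

The paper's point is that long prediction units (streams, traces) make this whole recovery machinery unnecessary, because one prediction covers enough instructions to hide the predictor's latency.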

    Enlarging Instruction Streams

    No full text

    Reducing Fetch Architecture Complexity Using

    No full text
    Fetch engine performance is seriously limited by the branch prediction table access latency. This fact has led to the development of hardware mechanisms, like prediction overriding, aimed at tolerating this latency. However, prediction overriding requires additional support and recovery mechanisms, which increase the fetch architecture complexity. In this paper, we show that this increase in complexity can be avoided if the interaction between the fetch architecture and software code optimizations is taken into account. We use aggressive procedure inlining to generate long streams of instructions that are used by the fetch engine as the basic prediction unit. We call a sequence of instructions from the target of a taken branch to the next taken branch an instruction stream. These instruction streams are long enough to feed the execution engine with instructions during multiple cycles while a new stream prediction is being generated, thus hiding the prediction table access latency. Our results show that the length of instruction streams compensates for the increase in the instruction cache miss rate caused by inlining. We show that, using procedure inlining, the need for a prediction overriding mechanism is avoided, reducing the fetch engine complexity.
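The definition in this abstract (an instruction stream runs from the target of a taken branch to the next taken branch) can be sketched directly. The trace format below is a hypothetical simplification, not the paper's actual tooling:

```python
# Split a dynamic instruction trace into "instruction streams": each stream
# runs from the target of a taken branch up to and including the next taken
# branch. Illustrative sketch; the (pc, is_taken_branch) format is assumed.

def split_into_streams(trace):
    """trace: list of (pc, is_taken_branch) in dynamic execution order."""
    streams, current = [], []
    for pc, is_taken_branch in trace:
        current.append(pc)
        if is_taken_branch:          # a taken branch ends the current stream
            streams.append(current)
            current = []             # next instruction is the branch target
    if current:                      # trailing partial stream, if any
        streams.append(current)
    return streams

trace = [(0, False), (1, False), (2, True),   # stream 1: pcs 0..2
         (10, False), (11, True),             # stream 2: pcs 10..11
         (20, False), (21, False)]            # trailing partial stream
streams = split_into_streams(trace)
# streams == [[0, 1, 2], [10, 11], [20, 21]]
```

The longer each stream is, the more cycles a single stream prediction can feed the pipeline, which is why inlining (which removes call/return taken branches) lengthens streams and hides predictor latency.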

    Tolerating branch predictor latency on SMT

    No full text
    Abstract. Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, its resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT from achieving all its potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with the branch predictor delay on SMT. Our contribution is two-fold: we describe a decoupled implementation of the SMT fetch unit, and we propose an inter-thread pipelined branch predictor implementation. These techniques prove to be effective for tolerating the branch predictor access latency.

    A Latency-Conscious SMT Branch Prediction Architecture (International Journal of High Performance Computing and Networking, IJHPCN)

    No full text
    Abstract — Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floating-point calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions caused by branch predictor delay are not hidden by current SMT fetch designs, causing a performance drop due to the absence of instructions to execute. In this paper, we propose several solutions to reduce the effect of branch predictor delay on the performance of Simultaneous Multithreading (SMT) processors. First, we analyze the impact of varying the number of access ports. Then, we describe a decoupled implementation of an SMT fetch unit that helps to tolerate the predictor delay. Finally, we present an inter-thread pipelined branch predictor, based on creating a pipeline of interleaved predictions from different threads. Our results show that, combining all the proposed techniques, the performance obtained is similar to that obtained using an ideal, 1-cycle access branch predictor. Index Terms — SMT, fetch engine, branch predictor delay, decoupled predictor, predictor pipelining.
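The inter-thread pipelining idea can be sketched as follows. The stage count and request format are illustrative assumptions: each prediction still takes several cycles, but because requests from different threads enter the predictor pipeline on consecutive cycles, one prediction completes every cycle in aggregate.

```python
# Sketch of an inter-thread pipelined branch predictor: predictions for
# different SMT threads are interleaved in the predictor pipeline, so a new
# prediction starts every cycle even though each one takes `stages` cycles
# to complete. Names and latencies are illustrative.

def pipelined_predictions(thread_requests, stages=3):
    """thread_requests: list of thread ids, one request issued per cycle.

    Returns (thread_id, completion_cycle) pairs: per-thread latency stays
    at `stages` cycles, but aggregate throughput is one prediction/cycle.
    """
    completed = []
    for cycle, tid in enumerate(thread_requests):
        # A request occupies the pipeline for `stages` cycles, but a request
        # from another thread can enter one cycle behind it.
        completed.append((tid, cycle + stages))
    return completed

# Round-robin requests from 4 threads over 6 cycles:
out = pipelined_predictions([0, 1, 2, 3, 0, 1], stages=3)
# out == [(0, 3), (1, 4), (2, 5), (3, 6), (0, 7), (1, 8)]
```

Each individual thread still waits the full predictor latency between its own predictions, which is acceptable because the other threads keep the fetch unit busy in the meantime.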

    A Complexity-Effective Decoding Architecture Based on Instruction Streams

    No full text
    Complex decoding logic is a performance bottleneck for high-frequency microprocessors that implement variable-length instruction set architectures. The need to remove this complexity from the critical execution path has led to the design of alternative techniques, like the trace cache fetch architecture. The trace cache stores decoded instructions, and thus the instructions fetched from it do not need to be decoded again. However, this is achieved at the cost of increasing the complexity of the fetch engine. This paper presents a first glance at a complexity-effective decoding architecture. Our proposal does not use a special-purpose storage like the trace cache. Instead, our architecture stores frequently executed instructions, already decoded, in the memory hierarchy. The decoded instructions can then be fetched from memory, removing the complex instruction decoder from the critical path. Our final objective is to provide all the benefits of fetching already decoded instructions, but without increasing the implementation cost and complexity of the fetch architecture.
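The core idea, keeping decoded forms of frequently executed instructions so the decoder is off the critical path on reuse, can be sketched at a high level. The `decode` stand-in and the hotness threshold are hypothetical; the paper's actual mechanism stores decoded instructions in the memory hierarchy rather than in a software dictionary:

```python
# High-level sketch: cache the already-decoded form of frequently executed
# instructions so repeated fetches skip the complex decoder. The decode()
# function and hot_threshold are illustrative stand-ins.

def make_cached_decoder(decode, hot_threshold=2):
    exec_count = {}     # how often each raw instruction was fetched
    decoded_cache = {}  # raw instruction -> decoded form
    def fetch_decoded(raw):
        exec_count[raw] = exec_count.get(raw, 0) + 1
        if raw in decoded_cache:
            return decoded_cache[raw], True     # hit: decoder bypassed
        uops = decode(raw)                      # slow path: full decode
        if exec_count[raw] >= hot_threshold:    # hot: keep decoded form
            decoded_cache[raw] = uops
        return uops, False
    return fetch_decoded

fetch = make_cached_decoder(decode=lambda raw: ("uop", raw))
fetch("add")            # cold: decoded, not yet cached
fetch("add")            # reaches threshold: decoded, then cached
_, hit = fetch("add")   # served from the decoded cache
# hit == True
```

Unlike a trace cache, there is no special-purpose fetch-side storage here; the trade-off the paper explores is getting the same decode-bypass benefit from ordinary memory.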

    Fetching Instruction Streams

    Get PDF
    Fetch performance is a very important factor because it effectively limits overall processor performance. However, there is little performance advantage in increasing front-end performance beyond what the back-end can consume. For each processor design, the target is to build the best possible fetch engine for the required performance level. A fetch engine is better if it provides better performance, but also if it takes fewer resources, requires less chip area, or consumes less power.

    Procedures manual for the design and software-based mechanization of the calculation of materials and labor for telephone projects

    Get PDF
    The work developed below provides a preliminary overview of the outside-plant area: its definition, a description of the elements that constitute it, and the reasons for its importance to telephony and to any company dedicated to the vast world of telephony. It also covers the works that physically establish the interconnection between the subscribers of a telephone exchange, enabling communication between subscribers in the same zone, between exchanges, and in turn between their subscribers. Knowledge of outside plant necessarily requires knowing the materials with which its networks are built. These materials are described and illustrated in a glossary of the accessories considered the most widely used in telephone projects in our country. The use of these materials depends to a great extent on their acceptance by ANTEL, given that they are subject to international standards (ASTM), which govern their construction through the tests that must be performed on them to maintain quality control.