13 research outputs found

    Hybrid Designs for Caches and Cores.

    Processor power constraints have come to the forefront over the last decade, heralded by the stagnation of clock frequency scaling. High-performance core and cache designs often utilize power-hungry techniques to increase parallelism. Conversely, the most energy-efficient designs opt for a serial execution to avoid unnecessary overheads. While both of these extremes constitute one-size-fits-all approaches, a judicious mix of parallel and serial execution has the potential to achieve the best of both high-performing and energy-efficient designs. This dissertation examines such hybrid designs for cores and caches. Firstly, we introduce a novel, hybrid out-of-order/in-order core microarchitecture. Instructions that are steered towards in-order execution skip register allocation, reordering and dynamic scheduling. At the same time, these instructions can interleave on an instruction-by-instruction basis with instructions that continue to benefit from these conventional out-of-order mechanisms. Secondly, this dissertation revisits a hybrid technique introduced for L1 caches, way-prediction, in the context of last-level caches that are larger, have higher associativity, and experience less locality.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113484/1/sleimanf_1.pd
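The way-prediction idea mentioned in the abstract above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the dissertation's design: it assumes a most-recently-used (MRU) predictor per set and LRU replacement, and the class name `WayPredictedCache` and the outcome labels are hypothetical. The point it shows is that a correct prediction needs only one way to be probed, while a misprediction falls back to checking the remaining ways.

```python
from typing import List, Optional


class WayPredictedCache:
    """Set-associative cache model with an assumed MRU way predictor."""

    def __init__(self, num_sets: int, num_ways: int) -> None:
        self.num_sets = num_sets
        self.num_ways = num_ways
        # tags[s][w] holds the tag stored in way w of set s (None = empty).
        self.tags: List[List[Optional[int]]] = [
            [None] * num_ways for _ in range(num_sets)
        ]
        # Assumed predictor: remember the most recently used way per set.
        self.predicted_way: List[int] = [0] * num_sets
        # LRU order per set; the front of each list is least recently used.
        self.lru: List[List[int]] = [list(range(num_ways)) for _ in range(num_sets)]

    def access(self, block_addr: int) -> str:
        s = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        guess = self.predicted_way[s]
        if self.tags[s][guess] == tag:
            # Only the predicted way had to be probed.
            outcome, way = "predicted_hit", guess
        elif tag in self.tags[s]:
            # The block is present, but in a way other than the predicted one.
            outcome, way = "mispredicted_hit", self.tags[s].index(tag)
        else:
            # Miss: fill the least recently used way.
            outcome, way = "miss", self.lru[s][0]
            self.tags[s][way] = tag
        # Update LRU order and the MRU prediction for this set.
        self.lru[s].remove(way)
        self.lru[s].append(way)
        self.predicted_way[s] = way
        return outcome
```

With two sets and two ways, the access sequence 0, 0, 2, 0 yields a miss, a predicted hit, a second miss, and then a mispredicted hit, because block 2 displaced the prediction for the set that block 0 lives in.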

    Intel Galileo and Intel Galileo Gen 2

    Computer scienc

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MP-SoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics among themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.

    Engineering Photon Sources for Practical Quantum Information Processing: If you liked it then you should have put a ring on it

    Integrated quantum photonics offers a promising route to the realisation of universal fault-tolerant quantum computers. Much progress has been made on the theoretical aspects of a future quantum information processor, reducing both error thresholds and circuit complexity. Currently, engineering efforts are focused on integrating the most valuable technologies for a photonic quantum computer: pure single-photon sources, low-loss phase shifters and passive circuit components, as well as efficient single-photon detectors and corresponding electronics. Here, we present efforts to target the former under the constraints imposed by the latter. We engineer the spectral correlations of photons produced by a heralded single-photon source, such that they produce photons in pure quantum states (99.1 ± 0.1 % purity), and enable additional optimisation using temporal shaping of the pump field. Our source also has a high intrinsic heralding efficiency (94.0 ± 2.9 %) and produces photon pairs at a rate (4.4 ± 0.1 MHz mW⁻²) which is an order of magnitude better than previously predicted by the literature for a resonant source of this purity. Additionally, we present tomographic methodologies that fully describe the photonic quantum states that we produce, without the use of analytical models, and as a means of verifying the quantum states we create, entitled "Quantum-referenced Spontaneous Emission Tomography" (Q-SpET). We also design reconfigurable photonic circuits that can be operated at cryogenic temperatures, with zero static power consumption, entitled "Cladding Layer Manipulation" (CLM). These devices function as on-chip phase shifters, enabling the local reconfiguration of circuit elements using established technologies but removing the need for active power consumption to maintain the reconfigured circuit. These devices are capable of an Lπ = 12.3 ± 0.3 µm, a ∼7x reduction in length when compared to the thermo-optic phase shifters used throughout this thesis.
Finally, we investigate how pure photon sources operate as part of larger circuits within the typical design rules of photonic quantum circuits. We use this information to accurately model all of the spurious contributions to the final photonic quantum state, which we call a form of nonlinear noise. This noise can decrease source purity to below 40 %, significantly affecting the fidelity of Hong-Ou-Mandel interference, and subsequently, our ability to reliably create fundamental resources for photonic quantum computers. All of this contributes to our design of a fundamental building block for integrated quantum photonic processors, the functionality of which can be predicted at scale, under the conditions imposed by the rest of the processor.

    Synthesis of multi-cycle circuits from guarded atomic actions

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 143-147).
    One solution to the timing closure problem is to perform infrequent operations in more than one clock cycle. Despite the apparent simplicity of the solution statement, it is not easily considered because it requires changes in RTL, which in turn exacerbate the verification problem. Another approach to the problem is to avoid it altogether, by using a high-level design methodology and allowing the synthesis tool to generate a design that matches the design requirements. This approach hinges on the ability of the tool to generate satisfactory RTL from the high-level description, an ability which often cannot be tested until late in the project. Failure to meet the requirements can result in costly delays as an alternative way of expressing the design intent is sought and experimented with. We offer a timing closure solution that does not suffer from these problems. We have selected atomic actions as the high-level design methodology. We exploit the fact that the semantics of atomic actions are untimed, that is, the time to execute an action does not change its outcome. The current hardware synthesis technique from atomic actions assumes that each action takes one clock cycle to complete its computation. Consequently, the action with the longest combinational path determines the clock cycle of the entire design, often leading to needlessly slow circuits. By augmenting the description of the actions with desired timing information, we allow the designer to split long paths over multiple clock cycles without giving up the semantics of atomicity. We also introduce loops with dynamic bounds into the atomic action description. These loops are not unrolled for synthesis, but the guards are evaluated for each iteration.
Our synthesis results show that the clock speed and performance of circuits can be improved substantially with our technique, without having to substantially change the design.
    by Michal Karczmarek. Ph.D.
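The claim that atomic-action semantics are untimed can be illustrated with a toy software model. This is a minimal sketch, not the thesis's synthesis flow: the rule set `GCD_RULES`, the `run` function, and the per-rule `latency` table are all hypothetical names introduced here, and the scheduler simply fires the first enabled rule. Each rule is a guard plus an action that updates the state atomically; assigning a rule more cycles changes the cycle count but not the final state.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int]
Rule = Tuple[str, Callable[[State], bool], Callable[[State], State]]

# Hypothetical example: GCD as two guarded atomic actions on state (x, y).
# Intended for positive inputs; the rules stop firing once y reaches 0.
GCD_RULES: List[Rule] = [
    ("swap", lambda s: s[0] > s[1] and s[1] != 0, lambda s: (s[1], s[0])),
    ("subtract", lambda s: s[0] <= s[1] and s[1] != 0, lambda s: (s[0], s[1] - s[0])),
]


def run(rules: List[Rule], state: State, latency: Dict[str, int]) -> Tuple[State, int]:
    """Fire enabled rules one at a time until no guard holds.

    latency maps a rule name to the number of cycles it is assumed to take
    (default 1). The state changes only when the whole action completes,
    so latency affects the cycle count and never the outcome.
    """
    cycles = 0
    while True:
        enabled = [rule for rule in rules if rule[1](state)]
        if not enabled:
            return state, cycles
        name, _guard, action = enabled[0]
        state = action(state)  # atomic update
        cycles += latency.get(name, 1)
```

Running the rules from the state (15, 6) ends in (3, 0) whether `subtract` is modelled as one cycle or two; only the total cycle count differs.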

    Fast one- and two-pick fixed-priority selection and muxing circuits

    Due to copyright restrictions, access to the full text of this article is only available via subscription.
    Priority encoders and arbiters usually drive multiplexers (muxes). Latency optimization of priority encoders and multiplexer trees has usually been handled separately in the literature. However, in some applications with circular data dependencies, the combined latency of the arbiter and muxing needs to be optimized. Moreover, there is an ever-growing need for throughput. This requires switches that pick and multiplex more than one request per cycle. In this paper, we propose a family of circuit topologies where priority encoding picks one or two requests and takes place in parallel with muxing. We first present a scalable logic circuit for the 1-pick fixed-priority muxing problem and then extend it to the 2-pick problem. We compare the proposed architecture to its counterpart that does only priority encoding using Synopsys Design Compiler with the ARM-Artisan TSMC 180 nm worst-case standard library. The results show that most of the priority encoding latency is hidden in the proposed circuit topology.
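The 1-pick and 2-pick fixed-priority muxing problems described above can be stated as a short behavioral model. This is a sketch of the problem's input/output behavior only, not the circuit topology proposed in the paper. It assumes that index 0 has the highest priority, and the function names are hypothetical. `pick_one_tree` additionally shows one generic way to resolve priority and select data in the same reduction tree, by passing (valid, data) pairs upward and letting the left (higher-priority) child win.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pick_one(requests: Sequence[bool], data: Sequence[T]) -> Optional[T]:
    """Reference model: data of the lowest-index active request, or None."""
    for req, value in zip(requests, data):
        if req:
            return value
    return None


def pick_one_tree(requests: Sequence[bool], data: Sequence[T]) -> Optional[T]:
    """Same result via a pairwise reduction over (valid, data) nodes."""
    nodes: List[Tuple[bool, T]] = list(zip(requests, data))
    if not nodes:
        return None
    while len(nodes) > 1:
        merged = []
        for i in range(0, len(nodes) - 1, 2):
            (lv, ld), (rv, rd) = nodes[i], nodes[i + 1]
            # The left child has higher priority: it wins whenever it is valid.
            merged.append((lv or rv, ld if lv else rd))
        if len(nodes) % 2:
            merged.append(nodes[-1])  # odd node carries over unchanged
        nodes = merged
    valid, value = nodes[0]
    return value if valid else None


def pick_two(
    requests: Sequence[bool], data: Sequence[T]
) -> Tuple[Optional[T], Optional[T]]:
    """Reference model: data of the two highest-priority active requests."""
    picked: List[Optional[T]] = [v for r, v in zip(requests, data) if r][:2]
    picked += [None] * (2 - len(picked))
    return picked[0], picked[1]
```

For requests `[False, True, True]` over data `["a", "b", "c"]`, both 1-pick models select `"b"`, and the 2-pick model selects `("b", "c")`.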

    Satellite Networks: Architectures, Applications, and Technologies

    Since global satellite networks are moving to the forefront in enhancing the national and global information infrastructures due to communication satellites' unique networking characteristics, a workshop was organized to assess the progress made to date and chart the future. This workshop provided the forum to assess the current state-of-the-art, identify key issues, and highlight the emerging trends in the next-generation architectures, data protocol development, communication interoperability, and applications. Presentations on overview, state-of-the-art in research, development, deployment and applications and future trends on satellite networks are assembled

    The 1992 4th NASA SERC Symposium on VLSI Design

    Papers from the fourth annual NASA Symposium on VLSI Design, co-sponsored by the IEEE, are presented. Each year this symposium is organized by the NASA Space Engineering Research Center (SERC) at the University of Idaho and is held in conjunction with a quarterly meeting of the NASA Data System Technology Working Group (DSTWG). One task of the DSTWG is to develop new electronic technologies that will meet next generation electronic data system needs. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The NASA SERC is proud to offer, at its fourth symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories, the electronics industry, and universities. These speakers share insights into next generation advances that will serve as a basis for future VLSI design

    Images on the Move: Materiality - Networks - Formats

    In contemporary society, digital images have become increasingly mobile. They are networked, shared on social media, and circulated across small and portable screens. Accordingly, the discourses of spreadability and circulation have come to supersede the focus on production, indexicality, and manipulability, which had dominated early conceptions of digital photography and film. However, the mobility of images is neither technologically nor conceptually limited to the realm of the digital. The edited volume re-examines the historical, aesthetical, and theoretical relevance of image mobility. The contributors provide a materialist account of images on the move - ranging from wired photography to postcards to streaming media