1,234 research outputs found

    Lattice QCD Production on Commodity Clusters at Fermilab

    Full text link
    We describe the construction and results to date of Fermilab's three Myrinet-networked lattice QCD production clusters (an 80-node dual Pentium III cluster, a 48-node dual Xeon cluster, and a 128-node dual Xeon cluster). We examine a number of aspects of performance of the MILC lattice QCD code running on these clusters.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 6 pages, LaTeX, 8 eps figures. PSN TUIT00

    Fast Recompilation of Object Oriented Modules

    Full text link
    Once a program file is modified, the recompilation time should be minimized, without sacrificing execution speed or high level object oriented features. The recompilation time is often a problem for the large graphical interactive distributed applications tackled by modern OO languages. A compilation server and fast code generator were developed and integrated with the SRC Modula-3 compiler and Linux ELF dynamic linker. The resulting compilation and recompilation speedups are impressive. The impact of different language features, processor speed, and application size are discussed

    CA-BIST for asynchronous circuits: a case study on the RAPPID asynchronous instruction length decoder

    Get PDF
    Journal ArticleThis paper presents a case study in low-cost noninvasive Built-In Self Test (BIST) for RAPPID, a largescale 120,000-transistor asynchronous version of the Pentium® Pro Instruction Length Decoder, which runs at 3.6 GHz. RAPPID uses a synchronous 0.25 micron CMOS library for static and domino logic, and has no Design-for-Test hooks other than some debug features. We explore the use of Cellular Automata (CA) for on-chip test pattern generation and response evaluation. More specifically, we look for fast ways to tune the CA-BIST to the RAPPID design, rather than using pseudo-random testing. The metric for tuning the CA-BIST pattern generation is based on an abstract hardware description model of the instruction length decoder, which is independent of implementation details, and hence also independent of the asynchronous circuit style. Our CA-BI ST solution uses a novel bootstrap procedure for generating the test patterns, which give complete coverage for this metric, and cover 94% of the testable stuck-at faults for the actual design at switch level. Analysis of the undetected and untestable faults shows that the same fault effects can be expected for a similar clocked circuit. This is encouraging evidence that testability is no excuse to avoid asynchronous design techniques in addition to high-performance synchronous solutions

    CA-BIST for asynchronous circuits: a case study on the RAPPID asynchronous instruction length decoder

    Get PDF
    Journal ArticleThis paper presents a case study in low-cost noninvasive Built-In Self Test (BIST) for RAPPID, a largescale 120,000-transistor asynchronous version of the Pentium® Pro Instruction Length Decoder, which runs at 3.6 GHz. RAPPID uses a synchronous 0.25 micron CMOS library for static and domino logic, and has no Design-for-Test hooks other than some debug features. We explore the use of Cellular Automata (CA) for on-chip test pattern generation and response evaluation. More specifically, we look for fast ways to tune the CA-BIST to the RAPPID design, rather than using pseudo-random testing. The metric for tuning the CA-BIST pattern generation is based on an abstract hardware description model of the instruction length decoder, which is independent of implementation details, and hence also independent of the asynchronous circuit style. Our CA-BI ST solution uses a novel bootstrap procedure for generating the test patterns, which give complete coverage for this metric, and cover 94% of the testable stuck-at faults for the actual design at switch level. Analysis of the undetected and untestable faults shows that the same fault effects can be expected for a similar clocked circuit. This is encouraging evidence that testability is no excuse to avoid asynchronous design techniques in addition to high-performance synchronous solutions

    Keeping Secrets in Hardware: the Microsoft Xbox(TM) Case Study

    Get PDF
    This paper discusses the hardware foundations of the cryptosystem employed by the Xbox(TM) video game console from Microsoft. A secret boot block overlay is buried within a system ASIC. This secret boot block decrypts and verifies portions of an external FLASH-type ROM. The presence of the secret boot block is camouflaged by a decoy boot block in the external ROM. The code contained within the secret boot block is transferred to the CPU in the clear over a set of high-speed busses where it can be extracted using simple custom hardware. The paper concludes with recommendations for improving the Xbox security system. One lesson of this study is that the use of a high-performance bus alone is not a sufficient security measure, given the advent of inexpensive, fast rapid prototyping services and high-performance FPGAs

    Issues in using commodity operating systems for time-dependent tasks: experiences from a study of windows NT

    Get PDF
    ManuscriptThis paper presents a snapshot of early results from a study of Windows NT aimed at understanding and improving its limitations when used for time-dependent tasks, such as those that arise for audio and video processing. Clearly there are time scales for which it can achieve effectively perfect reliability, such as the onesecond deadlines present in the Tiger Video Filesystem. Other time scales, such as reliable sub-millisecond scheduling of periodic tasks in user space, are clearly out of reach. Yet, there is an interesting middle ground between these time scales in which deadlines may be met, but will not always be. This study focuses on system and application behaviors in this region with the short-term goals of understanding and improving the real-time responsiveness of applications using Windows NT 5.0 and a longer-term goal of prototyping and recommending possible scheduling and resource management enhancements to future Microsoft systems products. Finally, while this paper primarily contains examples and results from Windows NT, we believe that the kinds of limitations and artifacts identified may also apply to other commodity systems such as the many UNIX variants. Indeed, this paper is primarily intended to provide a starting point for fruitful discussions along these lines at the workshop and not as a record of completed work
    • …
    corecore