18 research outputs found

    Fine Grain Concurrent Computations

    This thesis develops a computational model, a programming notation, and a set of programming principles to further and to demonstrate the practicality of programming fine grain concurrent computers. Programs are expressed in the computational model as a collection of definitions of autonomous computing agents called objects. In the execution of a program, the objects communicate data and synchronize their actions exclusively by message-passing. An object executes its definition only in response to receiving a message, and its actions may include sending messages, creating new objects, and modifying its own internal state. The number of actions that occur in response to a message is finite; Turing computability is achieved not within a single object, but through the interaction of objects. A new concurrent programming notation, Cantor, is used to demonstrate the cognitive process of writing programs using the object model. Programs for numerical sieves, sorting, the eight queens problem, and Gaussian elimination are fully described. Each of these programs involves up to thousands of objects in its execution. The general programming strategy is to first partition objects by their overall behavior and then to program the behaviors to be self-organizing. The semantics of Cantor are made precise through the definition of a formal semantics for Cantor and the object model. Objects are modelled as finite automata. The formal semantics is useful for proving program properties and for building frameworks to capture specific properties of object programs. The mathematical frameworks are constructed for building object graphs independently of program execution and for systematically removing objects that are irrelevant to program execution (garbage collection). The formal semantics is complemented by experiments that allow one to study the dynamics of the execution of Cantor programs on fine grain concurrent computers. The clean semantics of Cantor suggests simple metrics for evaluating the execution of concurrent programs for an ideal, abstract implementation. Program performance is also evaluated for environments where computing resources are limited. From the results of these experiments, hardware and software architectures for organizing fine grain message-passing computations are proposed, including support for fault tolerance and for garbage collection.
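
    The object model described in this abstract can be illustrated with a small sketch. The code below is Python, not Cantor, and it is not the thesis's sieve program; the names `Scheduler`, `SieveObject`, and `primes_up_to` are hypothetical. It only assumes what the abstract states: an object acts solely in response to a message, performs a finite number of actions (here at most one send or one object creation), and the overall result emerges from the interaction of many objects.

```python
from collections import deque


class Scheduler:
    """Hypothetical stand-in for the message network: delivers messages in FIFO order."""

    def __init__(self):
        self.queue = deque()

    def send(self, target, value):
        self.queue.append((target, value))

    def run(self):
        while self.queue:
            target, value = self.queue.popleft()
            target.receive(value)


class SieveObject:
    """An object holding one prime. It drops multiples and passes everything else on."""

    def __init__(self, scheduler, prime, found):
        self.scheduler = scheduler
        self.prime = prime
        self.next = None      # internal state: the next object in the chain
        self.found = found
        found.append(prime)   # record the prime this object represents

    def receive(self, n):
        # Each message triggers a bounded number of actions.
        if n % self.prime == 0:
            return                                   # multiple: discard
        if self.next is None:
            # Action: create a new object (the chain organizes itself).
            self.next = SieveObject(self.scheduler, n, self.found)
        else:
            # Action: send a message to another object.
            self.scheduler.send(self.next, n)


def primes_up_to(limit):
    scheduler = Scheduler()
    found = []
    if limit < 2:
        return found
    head = SieveObject(scheduler, 2, found)
    for n in range(3, limit + 1):
        scheduler.send(head, n)
    scheduler.run()
    return found
```

    In this sketch the FIFO scheduler keeps messages between any two objects in order, which is what lets a number reaching the end of the chain be treated as prime; a real fine grain machine would deliver messages concurrently and might need a different ordering argument.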

    A VLSI Combinator Reduction Engine

    No Abstract

    Multicomputers

    This report outlines the history, current status, current developments, and plans for the message-passing concurrent computers, or multicomputers, developed in the Submicron Systems Architecture Project at Caltech. These systems include the Cosmic Cube and its commercial descendants, two second-generation cosmic cubes currently in development, and the Mosaic C, a fine grain multicomputer whose nodes are single VLSI chips. Section 1 introduces the physical architectures, with particular attention to the characteristics of the message-passing networks. Section 2 describes the programming environments for the first and second generation medium grain size "cubes." Section 3 describes a fine grain concurrent object-oriented programming notation called Cantor, which currently runs on cubes and sequential systems, and which will be used for application programming of the Mosaic C.

    Cantor User Report: Version 2.0

    No abstract available

    Surgical perspectives from a prospective, nonrandomized, multicenter study of breast conserving surgery and adjuvant electronic brachytherapy for the treatment of breast cancer

    Background: Accelerated partial breast irradiation (APBI) may be used to deliver radiation to the tumor bed post-lumpectomy in eligible patients with breast cancer. Patient and tumor characteristics as well as the lumpectomy technique can influence patient eligibility for APBI. This report describes a lumpectomy procedure and examines patient, tumor, and surgical characteristics from a prospective, multicenter study of electronic brachytherapy. Methods: The study enrolled 65 patients aged 45-84 years with ductal carcinoma or ductal carcinoma in situ, and 44 patients, who met the inclusion and exclusion criteria, were treated with APBI using the Axxent® electronic brachytherapy system following lumpectomy. The prescription dose was 34 Gy in 10 fractions over 5 days. Results: The lumpectomy technique as described herein varied by site and patient characteristics. The balloon applicator was implanted by the surgeon (91%) or a radiation oncologist (9%) during or up to 61 days post-lumpectomy (mean 22 days). A lateral approach was most commonly used (59%) for insertion of the applicator, followed by an incision site approach in 27% of cases, a medial approach in 5%, and an inferior approach in 7%. A trocar was used during applicator insertion in 27% of cases. Local anesthetic, sedation, both, or neither were administered in 45%, 2%, 41%, and 11% of cases, respectively, during applicator placement. The prescription dose was delivered in 42 of 44 treated patients. Conclusions: Early stage breast cancer can be treated with breast conserving surgery and APBI using electronic brachytherapy. Treatment was well tolerated, and these early outcomes were similar to the early outcomes with iridium-based balloon brachytherapy.

    The M-Cache: A Message-Retrieving Mechanism for Multicomputer Systems

    This paper presents the design and evaluation of the M-cache, a small, fast and intelligent memory for handling messages at the processing nodes of multicomputer systems. The M-cache provides hardware support for the message search operation often performed in message-directed programming. It also provides a mechanism for bandwidth matching between the interconnection network and local memory of a node. Through simulation experiments, we have studied the execution of concurrent algorithms on systems with and without M-caches to obtain relative speedup measures. The results show that a modest investment in silicon is sufficient to effect over an order of magnitude reduction in message-retrieval time. Such hardware support is needed to make the cost-effective implementation of fine-grain concurrent programs a reality. 1 Introduction This paper describes the M-cache, a hardware mechanism for efficiently supporting message-based programming abstractions in multicomputers. At present ther..
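
    The message search operation mentioned in this abstract can be pictured with a short software sketch. The abstract does not specify how messages are matched, so the tag-based matching, the `MessageQueue` class, and its `deliver` and `retrieve` methods are all assumptions made for illustration; this is not the M-cache design. The sketch shows only why retrieval can be costly in software: the node may have to scan past many queued messages before finding the one it wants.

```python
class MessageQueue:
    """Hypothetical software receive queue for one node (illustration only)."""

    def __init__(self):
        self._messages = []   # (tag, payload) pairs in arrival order

    def deliver(self, tag, payload):
        """Called when a message arrives from the network."""
        self._messages.append((tag, payload))

    def retrieve(self, wanted_tag):
        """Linear search for the oldest message carrying wanted_tag.

        Returns (payload, comparisons). payload is None if nothing matches;
        comparisons counts how many queued messages were examined.
        """
        for index, (tag, payload) in enumerate(self._messages):
            if tag == wanted_tag:
                self._messages.pop(index)
                return payload, index + 1
        return None, len(self._messages)
```

    The comparison count returned by `retrieve` grows with the number of unrelated messages ahead of the match, which is the kind of cost a dedicated message-retrieval mechanism would aim to reduce.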

    High-performance clock-powered logic

    High performance clock-powered logic runs at below supply levels and reduces the need for faster digital logic circuitry. In a preferred embodiment, a clocked buffer (101) is used to drive the signal line. The receiving end of the line is connected to a jam latch (123), preferably followed by an n-latch (125), followed by the digital logic (109), and followed by a second n-latch (127). The first n-latch is eliminated in an alternative embodiment, preferably one that uses complementary data signals

    Low-Power Sequential Access Memory Design

    This paper presents the design and evaluation of a sequential access memory (SAM) that provides low power and high performance by replacing address decoders with special locally-communicating sequencers. A test chip containing one 16x16-b SAM and one 64x16-b SAM (consisting of four 16x16-b banks) has been designed, fabricated, and evaluated using a 0.25-µm CMOS process. With a clock frequency of 40 MHz at 1.2 V, the measured worst-case read power dissipations for the 16x16-b SAM and the 64x16-b SAM are 344 µW and 358 µW respectively, demonstrating power dissipation that is largely independent of SAM size.
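
    The claim that power is largely independent of SAM size can be checked with simple arithmetic on the figures quoted in this abstract. The helper `relative_increase` is a hypothetical name introduced here; the inputs are the two measured worst-case read power values and the two array sizes as stated, and only their ratios are used.

```python
def relative_increase(small, large):
    """Fractional growth from small to large, e.g. 0.5 means 50% larger."""
    return (large - small) / small


# Worst-case read power figures quoted in the abstract (same unit for both).
power_growth = relative_increase(344, 358)

# Capacity in bits: 16 words x 16 bits versus 64 words x 16 bits.
capacity_growth = relative_increase(16 * 16, 64 * 16)

print(f"capacity grows by {capacity_growth:.0%}, power by {power_growth:.1%}")
```

    Under these figures, a fourfold increase in capacity corresponds to a read power increase of roughly 4%.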