48 research outputs found

    Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms

    We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies (the IBM SP and the Cray T3D). We investigate how the various networks connecting the cluster of workstations affect the performance of the application, as well as the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

    Parallel Navier-Stokes computations on shared and distributed memory architectures

    We study a high order finite difference scheme to solve the time accurate flow field of a jet using the compressible Navier-Stokes equations. As part of our ongoing efforts, we have implemented our numerical model on three parallel computing platforms to study the computational, communication, and scalability characteristics. The platforms chosen for this study are a cluster of workstations connected through fast networks (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and a distributed memory multiprocessor (the IBM SP1). Our focus in this study is on the LACE testbed. We present some results for the Cray YMP and the IBM SP1 mainly for comparison purposes. On the LACE testbed, we study: (1) the communication characteristics of the Ethernet, FDDI, and ALLNODE networks and (2) the overheads induced by the PVM message passing library used for parallelizing the application. We demonstrate that clustering of workstations is effective and has the potential to be computationally competitive with supercomputers at a fraction of the cost.

    The six minute walk test accurately estimates mean peak oxygen uptake

    Background: Both Peak Oxygen Uptake (peak VO2), from cardiopulmonary exercise testing (CPET), and the distance walked during a Six-Minute Walk Test (6 MWD) are used for following the natural history of various diseases, timing of procedures such as transplantation, and for assessing the response to therapeutic interventions. However, their relationship has not been clearly defined. Methods: We determined the ability of 6 MWD to predict peak VO2 using data points from 1,083 patients with diverse cardiopulmonary disorders. The patient data came from a study we performed and 10 separate studies where we were able to electronically convert published scattergrams to bivariate points. Using Linear Mixed Model analysis (LMM), we determined how much factors such as disease entity and different inter-site testing protocols contributed to the magnitude of the standard error of estimate (SEE). Results: The LMM analysis found that only 0.16 ml/kg/min, or about 4% of the SEE, was due to all of the inter-site testing differences. The major source of error is the inherent variability related to the two tests. Therefore, we were able to create a generalized equation that can be used to predict peak VO2 among patients with different diseases, who have undergone various exercise protocols, with minimal loss of accuracy. Although 6 MWD and peak VO2 are significantly correlated, the SEE is unacceptably large for clinical usefulness in an individual patient. For the data as a whole it is 3.82 ml/kg/min or 26.7% of mean peak VO2. Conversely, the SEE for predicting the mean peak VO2 from mean 6 MWD for the 11 study groups is only 1.1 ml/kg/min. Conclusions: A generalized equation can be used to predict peak VO2 from 6 MWD. Unfortunately, like other prediction equations, it is of limited usefulness for individual patients. However, the generalized equation can be used to accurately estimate mean peak VO2 from mean 6 MWD, among groups of patients with diverse diseases, without the need for cardiopulmonary exercise testing. The equation is given as a display formula in the original article (graphic 1471-2466-10-31-i1.gif).
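
    The error figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and the implied mean peak VO2 is a back-calculation from the quoted percentage, not a value reported in the listing.

```python
# Figures quoted in the abstract (all in ml/kg/min unless noted otherwise).
see_pooled = 3.82            # SEE for predicting an individual's peak VO2
see_fraction_of_mean = 0.267 # the same SEE expressed as a fraction of mean peak VO2
see_intersite = 0.16         # part of the SEE attributed to inter-site differences
see_group_means = 1.1        # SEE for predicting group-mean peak VO2

# Back-calculated mean peak VO2 implied by "3.82 ml/kg/min or 26.7%".
implied_mean_peak_vo2 = see_pooled / see_fraction_of_mean

# Share of the pooled SEE attributed to inter-site testing differences.
intersite_share = see_intersite / see_pooled

# How much smaller the group-mean error is than the individual error.
group_to_individual = see_group_means / see_pooled

print(f"implied mean peak VO2: {implied_mean_peak_vo2:.1f} ml/kg/min")
print(f"inter-site share of SEE: {intersite_share:.1%}")
print(f"group-mean SEE / individual SEE: {group_to_individual:.2f}")
```

    The inter-site share comes out close to the "about 4%" stated in the abstract, and the group-mean error is under a third of the individual error, which is the basis for the claim that the equation suits groups better than individual patients.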

    Magnetization Dynamics, Bennett Clocking and Associated Energy Dissipation in Multiferroic Logic

    It has been recently shown that multiferroic logic - where logic bits are encoded in the magnetization orientation of a nanoscale magnetostrictive layer elastically coupled to a piezoelectric layer - can be Bennett clocked with small electrostatic potentials of a few tens of mV applied to the piezoelectric layer. The potential generates stress in the magnetostrictive layer and rotates its magnetization by a large angle to carry out Bennett clocking. This method of clocking is far more energy-efficient than using spin transfer torque. To assess whether such a clocking scheme can also be reasonably fast, we have studied the magnetization dynamics of a multiferroic logic array with nearest neighbor dipole coupling using the Landau-Lifshitz-Gilbert (LLG) equation. We find that switching delays of ~3 ns (clock rates of 0.33 GHz) can be achieved with proper design, provided we clock non-adiabatically and dissipate ~48,000 kT (at room temperature) of energy per clock cycle per bit flip in the clocking circuit. This dissipation far exceeds the energy barrier separating the two logic states, which we assumed to be 32 kT to yield a bit error probability of e^{-32}. Had we used spin transfer torque to switch with the same ~3 ns delay, the energy dissipation would have been much larger (~6 × 10^6 kT). This shows that spin transfer torque, widely used in magnetic random access memory, is an inefficient way to switch a magnet, and multiferroic logic clocked with voltage-induced stress is a superior nanomagnetic logic scheme.
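
    The quantities in this abstract are easier to compare when put side by side. The sketch below is a back-of-the-envelope check and not the authors' calculation. It assumes room temperature means 300 K, assumes the bit error probability follows the Boltzmann factor exp(-E_b/kT), and takes the garbled spin transfer torque figure as 6 × 10^6 kT. All variable names are illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)
T_ROOM = 300.0       # assumed room temperature in kelvin

switching_delay_s = 3e-9   # ~3 ns switching delay quoted in the abstract
stress_clock_kT = 48_000   # dissipation with stress-based clocking, in kT
stt_kT = 6e6               # dissipation with spin transfer torque, in kT
barrier_kT = 32            # energy barrier between the two logic states, in kT

# Clock rate if one switching event takes one clock period.
clock_rate_ghz = 1.0 / switching_delay_s / 1e9

# Stress-based dissipation in joules under the 300 K assumption.
stress_clock_joules = stress_clock_kT * K_B * T_ROOM

# Energy ratios between the schemes and relative to the barrier.
stt_over_stress = stt_kT / stress_clock_kT
stress_over_barrier = stress_clock_kT / barrier_kT

# Boltzmann-factor estimate of the error probability (an assumption).
error_probability = math.exp(-barrier_kT)

print(f"clock rate: {clock_rate_ghz:.2f} GHz")
print(f"stress clocking dissipation: {stress_clock_joules:.2e} J")
print(f"STT / stress dissipation: {stt_over_stress:.0f}x")
print(f"dissipation / barrier: {stress_over_barrier:.0f}x")
print(f"Boltzmann error estimate: {error_probability:.2e}")
```

    Under these assumptions spin transfer torque costs roughly two orders of magnitude more energy per bit flip than stress-based clocking at the same delay, while the stress-based dissipation itself is about 1,500 times the barrier height.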

    Alterations in Adenosine Metabolism and Signaling in Patients with Chronic Obstructive Pulmonary Disease and Idiopathic Pulmonary Fibrosis

    Background: Adenosine is generated in response to cellular stress and damage and is elevated in the lungs of patients with chronic lung disease. Adenosine signaling through its cell surface receptors serves as an amplifier of chronic lung disorders, suggesting adenosine-based therapeutics may be beneficial in the treatment of lung diseases such as chronic obstructive pulmonary disease (COPD) and idiopathic pulmonary fibrosis (IPF). Previous studies in mouse models of chronic lung disease demonstrate that the key components of adenosine metabolism and signaling are altered. Changes include an up-regulation of CD73, the major enzyme of adenosine production, and a down-regulation of adenosine deaminase (ADA), the major enzyme for adenosine metabolism. In addition, adenosine receptors are elevated. Methodology/Principal Findings: The focus of this study was to utilize tissues from patients with COPD or IPF to examine whether changes in purinergic metabolism and signaling occur in human disease. Results demonstrate that the levels of CD73 and A2BR are elevated in surgical lung biopsies from severe COPD and IPF patients. Immunolocalization assays revealed abundant expression of CD73 and the A2BR in alternatively activated macrophages in both COPD and IPF samples. In addition, mediators that are regulated by the A2BR, such as IL-6, IL-8 and osteopontin, were elevated in these samples, and activation of the A2BR on cells isolated from the airways of COPD and IPF patients was shown to directly induce the production of these mediators. Conclusions/Significance: These findings suggest that components of adenosine metabolism and signaling are altered in

    Optimal Fully Adaptive Wormhole Routing for Meshes

    A deadlock-free fully adaptive routing algorithm for 2D meshes which is optimal in the number of virtual channels required and in the number of restrictions placed on the use of these virtual channels is presented. The routing algorithm imposes fewer than half as many routing restrictions as any previous fully adaptive routing algorithm. It is also proved that, ignoring symmetry, this routing algorithm is the only fully adaptive routing algorithm that achieves both of these goals. The algorithm exploits the fact that for some adaptive routing algorithms, deadlock freedom is possible even when cycles are present in the channel dependency graph. The implementation of the routing algorithm requires relatively simple router control logic. The routing algorithm requires only the minimum number of virtual channels even when extended to arbitrary dimension meshes, yielding a dramatic reduction in the number of virtual channels needed to support fully adaptive routing. Whereas all previous algorithms required a number of virtual channels that grows exponentially with the dimension of the mesh, the new algorithm requires only 4n - 2 virtual channels for an n-dimensional mesh.
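
    The 4n - 2 count quoted above grows linearly with the mesh dimension, which a short tabulation makes concrete. This is a minimal sketch: the function name is hypothetical, and the restriction to n >= 2 is an assumption made here because the abstract starts from 2D meshes.

```python
def virtual_channels_needed(n: int) -> int:
    """Virtual channel count 4n - 2 quoted in the abstract for an n-dimensional mesh."""
    if n < 2:
        # Assumption: the abstract discusses 2D meshes and their extensions.
        raise ValueError("expected a mesh dimension of at least 2")
    return 4 * n - 2


# Tabulate the count for a few mesh dimensions.
channel_table = {n: virtual_channels_needed(n) for n in range(2, 7)}

for dimension, channels in channel_table.items():
    print(f"{dimension}-dimensional mesh: {channels} virtual channels")
```

    Each added dimension costs four more virtual channels, in contrast to the exponential growth the abstract attributes to earlier algorithms.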

    A Necessary and Sufficient Condition for Deadlock-Free Wormhole Routing

    An important open problem in wormhole routing has been to find a necessary and sufficient condition for deadlock-free adaptive routing. Recently, Duato has solved this problem for a restricted class of adaptive routing algorithms. In this paper, a necessary and sufficient condition is proposed that can be used for any adaptive or nonadaptive routing algorithm for wormhole routing, as long as only local information is required for routing. The underlying proof technique introduces a new type of dependency graph, the channel waiting graph, which omits most channel dependencies that cannot be used to create a deadlock configuration. The necessary and sufficient condition can be applied in a straightforward manner to most routing algorithms. This is illustrated by proving deadlock freedom for a partially adaptive nonminimal mesh routing algorithm that does not require virtual channels and a fully adaptive minimal hypercube routing algorithm with two virtual channels per physical channel. B..

    A Universal Proof Technique for Deadlock-Free Routing in Interconnection Networks

    An important open problem in interconnection network routing has been to characterize the conditions under which routing algorithms are deadlock-free. Although this problem has been resolved for restricted classes of routing algorithms, no general solution has been found. In this paper, we solve this problem by proving a necessary and sufficient condition that can be used for any interconnection network routing algorithm, as long as only local information is required for routing. Our proof technique is universal: it can be used with any switching technique that is not inherently deadlock-free. This includes switching techniques such as wormhole routing, store-and-forward routing, and virtual cut-through. The proof technique for the necessary and sufficient condition introduces a new type of dependency graph, the buffer waiting graph, which omits most dependencies that cannot be used to create a deadlock configuration. Our methodology is illustrated by proving deadlock freedom for a sto..