51 research outputs found

    A cost-effective heuristic to schedule local and remote memory in cluster computers

    Full text link
    Cluster computers represent a cost-effective alternative solution to supercomputers. In these systems, it is common to constrain the memory address space of a given processor to the local motherboard. Constraining the system in this way is much cheaper than using a full-fledged shared memory implementation among motherboards. However, memory usage among motherboards can be unfairly balanced. On the other hand, remote memory access (RMA) hardware provides fast interconnects among the motherboards of a cluster. RMA devices can be used to access remote RAM memory from a local motherboard. This work focuses on this capability in order to achieve a better global use of the total RAM memory in the system. More precisely, the address space of local applications is extended to remote motherboards and is used to access remote RAM memory. This paper presents an ideal memory scheduling algorithm and proposes a cost-effective heuristic to allocate local and remote memory among local applications. Compared to the devised ideal algorithm, the heuristic obtains the same (or closely resembling) results while largely reducing the computational cost. In addition, we analyze the impact on the performance of stand alone applications varying the memory distribution among regions (local, local to board, and remote). Then, this study is extended to any number of concurrent applications. Experimental results show that a QoS parameter is needed in order to avoid unacceptable performance degradation. © 2011 Springer Science+Business Media, LLC.This work was supported by Spanish CICYT under Grant TIN2009-14475-C04-01 and by Consolider-Ingenio under Grant CSD2006-00046.Serrano Gómez, M.; Sahuquillo Borrás, J.; Petit Martí, SV.; Hassan Mohamed, H.; Duato Marín, JF. (2012). A cost-effective heuristic to schedule local and remote memory in cluster computers. Journal of Supercomputing. 59(3):1533-1551. https://doi.org/10.1007/s11227-011-0566-8S15331551593IBM journal of Research and Development staff (2008) Overview of the IBM blue gene/P project. IBM J Res Dev 52(1/2):199–220Blocksome M, Archer C, Inglett T, McCarthy P, Mundy M, Ratterman J, Sidelnik A, Smith B, Almási G, Castaños J, Lieber D, Moreira J, Krishnamoorthy S, Tipparaju V, Nieplocha J (2006) Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene® supercomputer. In: Proceedings of the 2006 ACM/IEEE conference on supercomputing, SC ’06, Tampa, FL, USA, November 2006, pp 54–54Kumar S, Dózsa G, Almasi G, Heidelberger P, Chen D, Giampapa M, Blocksome M, Faraj A, Parker J, Ratterman J, Smith BE, Archer C (2008) The deep computing messaging framework: generalized scalable message passing on the blue gene/P supercomputer. In: Proceedings of the 22nd annual international conference on supercomputing, Island of Kos, Greece, June 2008, pp 94–103Tipparaju V, Kot A, Nieplocha J, Bruggencate MT, Chrisochoides N (2007) Evaluation of remote memory access communication on the cray XT3. In: Proceedings of the 21th international parallel and distributed processing symposium, Long Beach, California, USA, March 2007, pp 1–7Nussle M, Scherer M, Bruning U (2009) A resource optimized remote-memory-access architecture for low-latency communication. In: International conference on parallel processing, Sept 2009, pp 220–227http://www.hypertransport.org/Serrano M, Sahuquillo J, Hassan H, Petit S, Duato J (2010) A scheduling heuristic to handle local and remote memory in cluster computers. In: Proceedings of the 12th IEEE international conference on high performance computing, Melbourne, Australia, Sept 2010, pp 35–42Keltcher CN, McGrath KJ, Ahmed A, Conway P (2003) The AMD opteron processor for multiprocessor servers. IEEE MICRO 23(2):66–76Duato J, Silla F, Yalamanchili S (2009) Extending hypertransport protocol for improved scalability. In: First international workshop on hypertransport research and applications.Litz H, Fröening H, Nuessle M, Brüening U (2007) A hypertransport network interface controller for ultra-low latency message transfers. HyperTransport Consortium White Paperhttps://www.simics.net/http://www.cs.wisc.edu/gems/http://www.cs.virginia.edu/stream/Woo SC, Ohara M, Torrie E, Singh JP, Gupta A (1995) The SPLASH-2 programs: Characterization and methodological considerations. In: Proceedings of the 22nd annual international symposium on computer architecture, New York, NY, USA, 1995, pp 24–36Levitin A (2003) Introduction to the design and analysis of algorithms. Addison Wesley, ReadingOleszkiewicz J, Xiao L, Liu Y (2004) Parallel network RAM: Effectively utilizing global cluster memory for large data-intensive parallel programs. In: Proceedings of 33rd international conference on parallel processing, Montreal, Quebec, Canada, pp 353–360Liang S, Noronha R, Panda DK (2005) Swapping to remote memory over infiniband: An approach using a high performance network block device. In: Proceedings of the 2005 IEEE international conference on cluster computing, Boston, Massachusetts, USA, pp 1–10Oguchi M, Kitsuregawa M (2000) Using available remote memory dynamically for parallel data mining application on ATM-connected PC cluster. In: Proceedings of the 14th international parallel & distributed processing symposium, Cancun, Mexico, pp 411–420Werstein P, Jia X, Huang Z (2007) A remote memory swapping system for cluster computers. In: Proceedings of the eighth international conference on parallel and distributed computing, applications and technologies, Adelaide, Australia, pp 75–81Midorikawa H, Kurokawa M, Himeno R, Sato M (2008) DLM: A distributed large memory system using remote memory swapping over cluster nodes. In: Proceedings of the 2008 IEEE international conference on cluster computing, Tsukuba, Japan, October 2008, pp 268–27

    Active memory controller

    Full text link
    Inability to hide main memory latency has been increasingly limiting the performance of modern processors. The problem is worse in large-scale shared memory systems, where remote memory latencies are hundreds, and soon thousands, of processor cycles. To mitigate this problem, we propose an intelligent memory and cache coherence controller (AMC) that can execute Active Memory Operations (AMOs). AMOs are select operations sent to and executed on the home memory controller of data. AMOs can eliminate a significant number of coherence messages, minimize intranode and internode memory traffic, and create opportunities for parallelism. Our implementation of AMOs is cache-coherent and requires no changes to the processor core or DRAM chips. In this paper, we present the microarchitecture design of AMC, and the programming model of AMOs. We compare AMOs\u27 performance to that of several other memory architectures on a variety of scientific and commercial benchmarks. Through simulation, we show that AMOs offer dramatic performance improvements for an important set of data-intensive operations, e.g., up to 50x faster barriers, 12x faster spinlocks, 8.5x-15x faster stream/array operations, and 3x faster database queries. We also present an analytical model that can predict the performance benefits of using AMOs with decent accuracy. The silicon cost required to support AMOs is less than 1% of the die area of a typical high performance processor, based on a standard cell implementation

    Structural analogs of huperzine A improve survival in guinea pigs exposed to soman

    No full text
    Chemical warfare nerve agents such as soman exert their toxic effects through an irreversible inhibition of acetylcholinesterase (AChE) and subsequently glutamatergic function, leading to uncontrolled seizures. The natural alkaloid (-)-huperzine A is a potent inhibitor of AChE and has been demonstrated to exert neuroprotection at an appropriate dose. It is hypothesized that analogs of both (+)- and (-)-huperzine A with an improved ability to interact with NMDA receptors together with reduced AChE inhibition will exhibit more effective neuroprotection against nerve agents. In this manuscript, the tested huperzine A analogs 2 and 3 were demonstrated to improve survival of guinea pigs exposed to soman at either 1.2 or 2 × LD50

    Liquid Water: Obtaining the Right Answer for the Right Reasons

    No full text
    Water is ubiquitous on our planet and plays an essential role in several key chemical and biological processes. Accurate models for water are crucial in understanding, controlling and predicting the physical and chemical properties of complex aqueous systems. Over the last few years we have been developing a molecular-level based approach for a macroscopic model for water that is based on the explicit description of the underlying intermolecular interactions between molecules in water clusters. In the absence of detailed experimental data for small water clusters, highly-accurate theoretical results are required to validate and parameterize model potentials. As an example of the benchmarks needed for the development of accurate models for the interaction between water molecules, for the most stable structure of (H2O)20 we ran a coupled-cluster calculation on the ORNL's Jaguar petaflop computer that used over 100 TB of memory for a sustained performance of 487 TFLOP/s (double precision) on 96,000 processors, lasting for 2 hours. By this summer we will have studied multiple structures of both (H2O)20 and (H2O)24 and completed basis set and other convergence studies and anticipate the sustained performance rising close to 1 PFLOP/s

    Preliminary Structure-Activity Relationships and Biological Evaluation of Novel Antitubercular Indolecarboxamide Derivatives Against Drug-Susceptible and Drug-Resistant Mycobacterium tuberculosis Strains

    No full text
    Tuberculosis (TB) remains one of the leading causes of mortality and morbidity worldwide, with approximately one-third of the world’s population infected with latent TB. This is further aggravated by HIV coinfection and the emergence of multidrug- and extensively drug-resistant (MDR and XDR, respectively) TB; hence the quest for highly effective antitubercular drugs with novel modes of action is imperative. We report herein the discovery of an indole-2-carboxamide analogue, 3, as a highly potent antitubercular agent, and the subsequent chemical modifications aimed at establishing a preliminary body of structure–activity relationships (SARs). These efforts led to the identification of three molecules (12–14) possessing an exceptional activity in the low nanomolar range against actively replicating Mycobacterium tuberculosis, with minimum inhibitory concentration (MIC) values lower than those of the most prominent antitubercular agents currently in use. These compounds were also devoid of apparent toxicity to Vero cells. Importantly, compound 12 was found to be active against the tested XDR-TB strains and orally active in the serum inhibition titration assay
    • …
    corecore