
    Grid-enabled SIMAP utility: Motivation, integration technology and performance results

    A biological system comprises large numbers of functionally diverse and frequently multifunctional sets of elements that interact selectively and nonlinearly to produce coherent behaviours. Such a system can be anything from an intracellular biological process (such as a biochemical reaction cycle, gene regulatory network or signal transduction pathway) to a cell, tissue, entire organism, or even an ecological web. Biochemical systems are responsible for processing environmental signals, inducing the appropriate cellular responses and sequence of internal events. However, such systems are often only partially understood, and in many cases poorly understood. Systems biology is a scientific field concerned with the systematic study of biological and biochemical systems in terms of their complex interactions rather than their individual molecular components. At the core of systems biology is computational modelling (also called mathematical modelling), the process of constructing and simulating an abstract model of a biological system for subsequent analysis. This methodology can be used to test hypotheses via in silico experiments, providing predictions that can be tested by in vitro and in vivo studies. For example, the ErbB1-4 receptor tyrosine kinases (RTKs) and the signalling pathways they activate govern most core cellular processes, such as cell division, motility and survival (Citri and Yarden, 2006), and are strongly linked to cancer when they malfunction, for example due to mutations. An ODE (ordinary differential equation)-based mass-action ErbB model has been constructed and analysed by Chen et al. (2009) in order to depict what role each protein plays and to ascertain how sets of proteins coordinate with each other to perform distinct physiological functions. The model comprises 499 species (molecules), 201 parameters and 828 reactions. Such in silico experiments can be computationally very expensive, e.g. when multiple biochemical factors are being considered or a variety of complex networks are being simulated simultaneously. Due to the size and complexity of the models and the requirement to perform comprehensive experiments, it is often necessary to use high-performance computing (HPC) to keep the experimental time within tractable bounds. On this basis, as part of an EC-funded cancer research project, we have developed the SIMAP Utility, which supports SImulation modelling of the MAP kinase pathway (http://www.simap-project.org). In this paper we present our experiences with Grid-enabling SIMAP using Condor.
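    The abstract's core technique, ODE-based mass-action modelling, can be made concrete with a toy example. The sketch below simulates a minimal two-reaction activation/deactivation cycle in Python; the species, rate constants and initial concentrations are illustrative assumptions, not part of the actual 499-species Chen et al. (2009) model.

    ```python
    # Minimal sketch of ODE-based mass-action simulation (illustrative only;
    # the real ErbB model of Chen et al. (2009) has 499 species, 201 parameters
    # and 828 reactions).
    # Toy scheme:  K + S -> K + S*   (activation, rate constant k1)
    #              S* -> S           (deactivation, rate constant k2)
    from scipy.integrate import solve_ivp

    k1, k2 = 0.5, 0.1                # hypothetical rate constants

    def mass_action(t, y):
        K, S, Sp = y                 # kinase, substrate, phospho-substrate
        v1 = k1 * K * S              # mass-action rate of activation
        v2 = k2 * Sp                 # mass-action rate of deactivation
        return [0.0, v2 - v1, v1 - v2]   # kinase acts catalytically

    sol = solve_ivp(mass_action, (0.0, 60.0), [1.0, 10.0, 0.0])
    print(sol.y[:, -1])              # concentrations at t = 60
    ```

    Scaling such a simulation from three species to hundreds, across many parameter sets, is precisely the kind of workload that motivates moving the model onto HPC resources.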

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables the easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several follow-up works after its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both the research and industrial communities. We also cover a set of systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.
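    To make the programming model concrete, here is a minimal single-process sketch of the two MapReduce phases for the canonical word-count example; the in-memory shuffle step stands in for what a real framework does across a cluster, where it also handles data distribution, scheduling and fault tolerance on the application's behalf.

    ```python
    # Minimal in-memory sketch of the MapReduce model (word count).
    from collections import defaultdict

    def map_phase(doc):
        for word in doc.split():
            yield word.lower(), 1        # emit one (key, value) pair per token

    def shuffle(pairs):
        groups = defaultdict(list)       # group values by key; a real framework
        for key, value in pairs:         # does this across the network
            groups[key].append(value)
        return groups.items()

    def reduce_phase(key, values):
        return key, sum(values)          # aggregate all values for one key

    docs = ["the cat sat", "the cat ran"]
    pairs = (p for doc in docs for p in map_phase(doc))
    print(dict(reduce_phase(k, v) for k, v in shuffle(pairs)))
    # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
    ```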

    Real Time in Plan 9

    We describe our experience with the implementation and use of a hard-real-time scheduler for Plan 9 as an embedded operating system.
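    The abstract does not detail the scheduler's design, but as background, the sketch below shows earliest-deadline-first (EDF) task selection with a utilisation-based admission test, a common foundation for hard-real-time schedulers; treat it as a generic illustration rather than the paper's actual algorithm.

    ```python
    # Generic sketch of EDF scheduling with an admission test (illustrative;
    # not necessarily the design used by the Plan 9 real-time scheduler).
    from dataclasses import dataclass

    @dataclass
    class Task:
        name: str
        period: float      # seconds between releases
        cost: float        # worst-case execution time per release
        deadline: float    # absolute deadline of the current release

    def admit(tasks, new):
        # Classic EDF schedulability bound for periodic tasks: total
        # utilisation (cost/period summed over all tasks) must not exceed 1.
        u = sum(t.cost / t.period for t in tasks) + new.cost / new.period
        return u <= 1.0

    def pick_next(ready):
        # EDF: always run the ready task with the earliest absolute deadline.
        return min(ready, key=lambda t: t.deadline) if ready else None

    tasks = [Task("audio", 0.01, 0.002, 0.01), Task("video", 0.04, 0.01, 0.04)]
    print(admit(tasks, Task("net", 0.1, 0.05, 0.1)))  # True: U = 0.95 <= 1
    print(pick_next(tasks).name)                      # "audio"
    ```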

    Pregelix: Big(ger) Graph Analytics on A Dataflow Engine

    There is a growing need for distributed graph processing systems that are capable of gracefully scaling to very large graph datasets. Unfortunately, this challenge has not been easily met, due to the intense memory pressure imposed by the process-centric, message-passing designs that many graph processing systems follow. Pregelix is a new open-source distributed graph processing system based on an iterative dataflow design that is better tuned to handle both in-memory and out-of-core workloads. As such, Pregelix offers improved performance characteristics and scaling properties over current open-source systems (e.g., we have seen up to a 15x speedup compared to Apache Giraph and up to a 35x speedup compared to distributed GraphLab), and makes more effective use of available machine resources to support Big(ger) Graph Analytics.
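    Pregelix executes programs written in the Pregel vertex-centric model as iterative dataflows. To show what that model looks like to a user, here is a minimal single-machine sketch of PageRank-style supersteps in Python; the function shape and data layout are illustrative assumptions, not Pregelix's actual API (which is Java, running on the Hyracks dataflow engine).

    ```python
    # Minimal sketch of the Pregel vertex-centric model (illustrative only).
    # Each superstep, every vertex consumes its incoming messages, updates
    # its value, and sends messages along its out-edges.

    def superstep(graph, values, inbox):
        new_values, outbox = {}, {v: [] for v in graph}
        for v, out in graph.items():
            # PageRank-style update from the messages received this round.
            new_values[v] = 0.15 + 0.85 * sum(inbox.get(v, []))
            for w in out:                        # send rank share downstream
                outbox[w].append(new_values[v] / len(out))
        return new_values, outbox

    graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
    values = {v: 1.0 for v in graph}
    inbox = {v: [1.0] for v in graph}            # bootstrap messages
    for _ in range(10):                          # fixed number of supersteps
        values, inbox = superstep(graph, values, inbox)
    print(values)
    ```

    A system like Pregelix evaluates each such superstep as joins and group-bys over partitioned vertex and message relations, which is what allows it to spill to disk gracefully when the graph exceeds available memory.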