44 research outputs found

    A highly efficient multi-core algorithm for clustering extremely large datasets

    Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most parallelization approaches rely on network communication protocols and therefore require multiple connected computers. One answer to this problem is to use the intrinsic capabilities of current multi-core hardware to distribute the tasks among the cores of a single computer.
    Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms, based on the design principles of transactional memory, for clustering gene expression microarray data and categorical SNP data. Our new shared-memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis, which employ repeated runs with slightly changed parameters. For large data sets, our Java-based algorithm ran faster by a factor of 10 than single-core implementations and a recently published network-based parallelization, while preserving computational accuracy.
    Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer.
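The abstract above describes distributing a k-means run across the cores of one machine. As a rough illustration only, the following sketch splits the assignment step over worker threads and merges per-chunk sums into new centroids. It does not reproduce the authors' Java implementation or their transactional-memory design; the function names (`nearest`, `partial_sums`, `parallel_kmeans`) and the partition-and-merge scheme are assumptions made for this example. In CPython, threads mainly show the data partitioning; a real speedup would need processes or a runtime with native threads.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def partial_sums(chunk, centroids):
    """Assign each point of one chunk and accumulate per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        j = nearest(point, centroids)
        counts[j] += 1
        for d, value in enumerate(point):
            sums[j][d] += value
    return sums, counts


def parallel_kmeans(points, centroids, workers=4, iterations=10):
    """Lloyd-style k-means with the assignment step split across workers.

    Hypothetical helper for illustration; not the algorithm from the paper.
    """
    # Each worker gets a disjoint slice of the data, so no shared state is written.
    chunks = [points[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            results = list(pool.map(lambda ch: partial_sums(ch, centroids), chunks))
            new_centroids = []
            for j, old in enumerate(centroids):
                count = sum(counts[j] for _, counts in results)
                if count == 0:
                    new_centroids.append(list(old))  # keep an empty cluster's centroid
                    continue
                new_centroids.append(
                    [sum(sums[j][d] for sums, _ in results) / count
                     for d in range(len(old))]
                )
            centroids = new_centroids
    return centroids
```

Because every worker only reads the current centroids and writes to its own accumulators, the merge step is the only point where results are combined, which keeps the sketch free of locks.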

    Carbon Monoxide Poisoning and Improved Method of its Spot Detection

    The paper reviews some investigations on carbon monoxide poisoning and describes a detailed method for detection of carbon monoxide. A comparative study indicating the scope, limitations, and range of the various other methods of spot detection is also given.

    Using Speculative Push for Unnecessary Checkpoint Creation Avoidance

    Abstract. This paper discusses a way of incorporating speculation techniques into Distributed Shared Memory (DSM) systems with a checkpointing mechanism without creating unnecessary checkpoints. Speculation is a general technique involving prediction of the future of a computation, namely accesses to shared objects unavailable on the accessing node (read faults). Thanks to such predictions, objects can be pushed to requesting nodes before the actual access operation is performed, resulting, at least potentially, in a considerable performance improvement. This mechanism is the foundation for the proposed SpecCkpt protocol, which is based on independent checkpointing integrated with a coherence protocol for a given consistency model and introduces little overhead. It ensures the consistency of checkpoints while allowing fast recovery from failures.

    Atomic Transactional Execution in Hardware: A New High-Performance Abstraction for Databases?

    This paper discusses one such proposal. It is based on a hardware mechanism called Transactional Lock Removal [2] (TLR), which was originally designed to support the atomic execution of critical sections by a lock-based multithreaded program in a lock-free manner. In this paper, we explain the mechanism and suggest how it could be used to control the atomic execution of transactions in a database system. The TLR hardware identifies, at runtime, lock-protected critical sections in the program and executes these sections without acquiring the lock. TLR maintains correct semantics of the program in the absence of locks by executing and committing all operations in the now lock-free critical section "atomically". Any updates performed during the critical section execution are locally buffered in processor caches. They are made visible to other threads instantaneously at the end of the critical section. By not acquiring locks, the hardware can extract inherent parallelism in the program independent of locking granularity
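The buffer-then-commit behavior described above can be loosely mimicked in software. The sketch below is only an analogy, assuming a hypothetical `OptimisticCell` class: the critical section runs on a private copy without holding a lock, and its updates are published in one step if no other commit intervened, otherwise it is re-executed. TLR itself is a hardware mechanism built on processor caches; nothing here reflects its actual implementation, and the short commit lock is an artifact of the software emulation.

```python
import threading


class OptimisticCell:
    """Hypothetical shared dict whose updates are buffered and committed at once."""

    def __init__(self, initial):
        self._data = dict(initial)
        self._version = 0
        self._commit_lock = threading.Lock()  # held only for snapshot and commit

    def atomically(self, critical_section):
        """Run critical_section(buffer) speculatively; retry if a conflict occurred."""
        while True:
            with self._commit_lock:
                start_version = self._version
                buffer = dict(self._data)  # private copy; writes stay local
            critical_section(buffer)  # executes without holding the lock
            with self._commit_lock:
                if self._version == start_version:  # no other commit intervened
                    self._data = buffer  # publish all updates in one step
                    self._version += 1
                    return
            # Conflict detected: discard the buffer and re-execute the section.

    def snapshot(self):
        """Return a copy of the currently committed state."""
        with self._commit_lock:
            return dict(self._data)
```

A typical use would pass a function such as `lambda buf: buf.update(n=buf["n"] + 1)` to `atomically`; concurrent callers never observe a half-applied update, at the cost of occasional re-execution.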

    Modification and testing of a gaussian dispersion model for particulate matter in the respirable size range

    Modified expressions for the horizontal and vertical dispersion coefficients σy and σz are proposed. The determined values of σy and σz, for a specified time span, are used in a Gaussian profile to predict pollution load. This model, being easy to use, serves as a convenient method to predict dust concentration (in the respirable size range).
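For orientation, the Gaussian profile into which σy and σz enter is usually written in the standard textbook plume form below. This is not the authors' modified expression, which the abstract does not give; the symbols Q (emission rate), u (mean wind speed), and H (effective release height) are assumptions introduced here, and the second exponential in brackets is the usual ground-reflection term.

```latex
C(x, y, z) = \frac{Q}{2\pi\, u\, \sigma_y \sigma_z}
  \exp\!\left(-\frac{y^2}{2\sigma_y^2}\right)
  \left[
    \exp\!\left(-\frac{(z - H)^2}{2\sigma_z^2}\right)
    + \exp\!\left(-\frac{(z + H)^2}{2\sigma_z^2}\right)
  \right]
```

Both σy and σz grow with downwind distance x, so any modification of their expressions changes the predicted concentration C at every receptor point.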