52 research outputs found

    Agile parallel bioinformatics workflow management using Pwrake

    Background. In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be too heavyweight for actual bioinformatics practice. We observe that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment, are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method, which proceeds through iterative development phases after trial and error. Here, we show the application of a scientific workflow system, Pwrake, to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings. We implemented Pwrake workflows to process next generation sequencing data using the Genome Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that, in practice, scientific workflow development iterates over two phases: the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate the modularity of the GATK and Dindel workflows. Conclusions. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain-specific language design built on Ruby gives rakefiles the flexibility needed for writing scientific workflows. Furthermore, the readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.

    Skew-Aware Collective Communication for MapReduce Shuffling

    This paper proposes and examines three in-memory shuffling methods designed to address problems in MapReduce shuffling caused by skewed data. Coupled Shuffle Architecture (CSA) employs a single pairwise all-to-all exchange to shuffle both blocks, which are the units of shuffle transfer, and meta-blocks, which contain the metadata of the corresponding blocks. Decoupled Shuffle Architecture (DSA) separates the shuffling of meta-blocks and blocks, and applies different all-to-all exchange algorithms to each shuffling process, attempting to mitigate the impact of stragglers in strongly skewed distributions. Decoupled Shuffle Architecture with Skew-Aware Meta-Shuffle (DSA w/ SMS) autonomously determines the proper placement of blocks based on the memory consumption of each worker process. This approach targets extremely skewed situations where some worker processes could exceed their node memory limit. This study evaluates implementations of the three shuffling methods in our prototype in-memory MapReduce engine, which employs high-performance interconnects such as InfiniBand and Intel Omni-Path. Our results suggest that DSA w/ SMS is the only viable solution for extremely skewed data distributions. We also present a detailed investigation of the performance of CSA and DSA in various skew situations.
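
As a rough illustration of the coupled scheme described in the abstract above, the following sketch simulates a pairwise all-to-all exchange in plain Python: at step s, worker r sends to worker (r + s) mod P, so every pair is served after P - 1 steps. The message layout (a meta-block dictionary with an `nbytes` field paired with its block) and all function names are assumptions made for illustration; this is not the paper's implementation, and no real network transfer takes place.

```python
def make_message(block):
    """Pair a block with a hypothetical meta-block that records its size."""
    return ({"nbytes": len(block)}, block)


def pairwise_all_to_all(outboxes):
    """Simulate a pairwise all-to-all exchange among P workers.

    outboxes[src][dst] is the message that worker src holds for worker dst.
    Returns inboxes, where inboxes[dst][src] is what dst received from src.
    """
    p = len(outboxes)
    inboxes = [[None] * p for _ in range(p)]
    for rank in range(p):
        # Data destined for the local worker needs no transfer.
        inboxes[rank][rank] = outboxes[rank][rank]
    for step in range(1, p):
        # In step `step`, every worker sends to the worker `step` ranks ahead
        # and (implicitly) receives from the worker `step` ranks behind.
        for rank in range(p):
            dst = (rank + step) % p
            inboxes[dst][rank] = outboxes[rank][dst]
    return inboxes


def received_bytes(inboxes):
    """Total payload size each worker holds after the exchange."""
    return [sum(meta["nbytes"] for meta, _ in row) for row in inboxes]
```

Calling `received_bytes` on the result of a skewed exchange makes the imbalance visible: a worker that is the destination of most large blocks ends up holding far more data than its peers, which is the situation the skew-aware variant is meant to handle.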

    The PI3K-Akt Pathway in SN-38-Induced Apoptosis in Human Gastric Cancer Cell Lines

    SN-38, an active metabolite of the topoisomerase I inhibitor CPT-11, exerts its cytotoxic effect by inducing apoptosis in cancer cells. Phosphatidylinositol-3-OH kinase (PI3K)-Akt signaling is known to protect a variety of cells from apoptosis. The relationship between resistance to SN-38-induced apoptosis and the PI3K-Akt pathway in human gastric cancer cells is unknown. Here, we investigated this relationship using two gastric cancer cell lines, MKN1 and MKN45. Cell viability was determined by the sodium 3'-[1-(phenylaminocarbonyl)-3,4-tetrazolium]-bis(4-methoxy-6-nitro) benzene sulfonic acid hydrate (XTT) assay. Apoptosis was confirmed by fluorescence microscopy using Hoechst 33342 staining. Expression levels of phospho-Akt (pAkt) were determined by Western blotting. After treatment with SN-38, flow cytometry showed that the sub-G1 cell population was larger in MKN45 cells (36.8%) than in MKN1 cells (13.5%). SN-38 inhibited the expression of pAkt dose-dependently in MKN45 cells, but not in MKN1 cells. In MKN1 cells, additional pretreatment with the PI3K inhibitor LY294002 inhibited pAkt expression and induced apoptosis. These results suggest that SN-38 induces apoptosis by decreasing anti-apoptotic PI3K-Akt survival signaling in human gastric cancer cells. An Akt inhibitor might be a useful anti-tumor agent in combination with CPT-11.

    Multifaceted Assessment of Chronic Gastritis: A Study of Correlations between Serological, Endoscopic, and Histological Diagnostics

    Aim. Chronic gastritis was assessed serologically, endoscopically and histologically to identify correlations between these methods. Methods. Subjects comprised 319 patients who had provided informed consent. Serological assessment of chronic gastritis was based on the pepsinogen test method. Endoscopic gastritis and histological gastritis were assessed and scored according to the Kimura-Takemoto classification system and the updated Sydney classification system, respectively, and correlations between these three methods were studied. Results. The pepsinogen I/II ratio showed a significant correlation with the extent of mononuclear cell infiltration of the gastric corpus. When histological gastritis was divided, on the basis of the distribution of mononuclear cell infiltration, into gastritis limited to the antrum and corpus gastritis, these types were distinguished with high accuracy using a pepsinogen I/II ratio of 3 as the cutoff. A good correlation was also seen between the pepsinogen I/II ratio and the development of atrophy in endoscopic gastritis, where groups with and without advanced atrophy were also distinguished with high accuracy using a cutoff value of 3. Conclusion. Significant correlations exist between serum pepsinogen levels, endoscopic gastritis, and histological gastritis. The pepsinogen I/II ratio allows prediction of the existence of endoscopic gastritis and histological gastritis, or the extent of their development, with high accuracy.

    The Multigrid Preconditioned Conjugate Gradient Method

    This paper considers an efficient preconditioner and proposes a multigrid preconditioned conjugate gradient method (MGCG method), which is the conjugate gradient method with the multigrid method as a preconditioner. The combination of the multigrid method and the conjugate gradient method has been considered before. Kettler and Meijerink [7] and Kettler [8] treated the multigrid method as a preconditioner of the conjugate gradient method. However, this paper formulates the MGCG method more generally than those works, and the requirements of the multigrid preconditioner are studied. On the other hand, Bank and Douglas [2] treated the conjugate gradient method as a relaxation method of the multigrid method. Braess [3] considered these two combinations and reported that the conjugate gradient method with multigrid preconditioning is effective for elasticity problems. We study the requirements of a valid multigrid preconditioner and evaluate this preconditioner by numerical experiments and eigenvalue analysis. In particular, eigenvalue analysis is a more direct and more reasonable criterion than convergence rate, since the number of iterations of the conjugate gradient method until convergence depends on the eigenvalue distribution of the preconditioned matrix. In Sections 2 and 3, the preconditioned conjugate gradient method and the multigrid method, which are the basis of this paper, are briefly explained. Section 4 discusses the requirements of a valid two-grid preconditioner for the conjugate gradient method. Then in Section 5, it is extended to the requirements of the multigrid preconditioner. In Section 7, numerical experiments show that the MGCG method converges with very few iterations even for ill-conditioned problems. In Section 8, eigenvalue analysis is performed, and it is realize..
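
To make the idea of using multigrid as a CG preconditioner concrete, here is a minimal pure-Python sketch. It is not the paper's formulation: the 1D Poisson model problem, the damped Jacobi smoother, the full-weighting and linear-interpolation transfers, and every function name are assumptions chosen for brevity. CG in general needs a symmetric positive definite preconditioner, so the sketch uses the same number of pre- and post-smoothing sweeps and a restriction that is a multiple of the transposed interpolation. The grid size must be of the form 2^k - 1.

```python
def apply_a(x, scale=1.0):
    """Multiply x by scale * tridiag(-1, 2, -1) (1D Poisson, Dirichlet ends)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(scale * (2.0 * x[i] - left - right))
    return y


def jacobi(z, r, scale, sweeps=2, omega=2.0 / 3.0):
    """Damped Jacobi sweeps for A z = r; the diagonal of A is 2 * scale."""
    for _ in range(sweeps):
        az = apply_a(z, scale)
        z = [zi + omega * (ri - ai) / (2.0 * scale)
             for zi, ri, ai in zip(z, r, az)]
    return z


def v_cycle(r, scale=1.0):
    """Approximate A^{-1} r by one symmetric V-cycle started from zero."""
    n = len(r)
    if n == 1:
        return [r[0] / (2.0 * scale)]  # exact solve on the coarsest grid
    z = jacobi([0.0] * n, r, scale)  # pre-smoothing
    res = [ri - ai for ri, ai in zip(r, apply_a(z, scale))]
    nc = (n - 1) // 2
    # Full-weighting restriction: half the transpose of linear interpolation.
    rc = [0.25 * res[2 * j] + 0.5 * res[2 * j + 1] + 0.25 * res[2 * j + 2]
          for j in range(nc)]
    # With these transfers the Galerkin coarse operator is (scale / 4) * tridiag.
    ec = v_cycle(rc, scale / 4.0)
    for j in range(nc):  # linear interpolation of the coarse correction
        z[2 * j] += 0.5 * ec[j]
        z[2 * j + 1] += ec[j]
        z[2 * j + 2] += 0.5 * ec[j]
    return jacobi(z, r, scale)  # post-smoothing


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mgcg(b, tol=1e-10, max_iter=50):
    """Preconditioned CG for tridiag(-1, 2, -1) x = b with a V-cycle preconditioner.

    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    r = list(b)
    z = v_cycle(r)
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = v_cycle(r)  # preconditioning step: z approximates A^{-1} r
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

For example, `x, iters = mgcg([1.0] * 63)` solves the model problem on 63 interior points; replacing `v_cycle` with the identity (`z = list(r)`) turns the routine into plain CG, which makes it easy to compare iteration counts.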

    Efficient Implementation of the Multigrid Preconditioned Conjugate Gradient Method on Distributed Memory Machines

    A multigrid preconditioned conjugate gradient (MGCG) method [15], which uses the multigrid method as a preconditioner for the CG method, has a good convergence rate even for problems on which the standard multigrid method does not converge efficiently. This paper considers a parallelization of the MGCG method and proposes an efficient parallel MGCG method for distributed memory machines. Owing to the good convergence rate of the MGCG method, several difficulties in parallelizing the multigrid method are successfully resolved. In addition, the parallel MGCG method achieves high performance on the Fujitsu AP1000 multicomputer [8] and is more than 10 times faster than the Scaled CG (SCG) method [6]. 1 Introduction. Parallelization of the multigrid method has been studied and several parallel multigrid methods have been implemented. One natural parallelization approach is governed by the grid partitioning principles. Sbosny [13] analyzed and implemented a parallel multigrid method using the domain decomposit..