610 research outputs found

    PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification

    Performance problems in managed languages are extremely difficult to find. Despite many efforts to find those problems, most existing work focuses on how to debug a user-provided test execution in which performance problems already manifest. It remains largely unknown how to effectively find performance bugs before software release. As a result, performance bugs often escape to production runs, hurting software reliability and user experience. This paper describes PerfBlower, a general performance testing framework that allows developers to quickly test Java programs to find memory-related performance problems. PerfBlower provides (1) a novel specification language ISL to describe a general class of performance problems that have observable symptoms; (2) an automated test oracle via virtual amplification; and (3) precise reference-path-based diagnostic information via object mirroring. Using this framework, we have amplified three different types of problems. Our experimental results demonstrate that (1) ISL is expressive enough to describe various memory-related performance problems; (2) PerfBlower successfully distinguishes executions with and without problems, and 8 unknown problems are quickly discovered under small workloads; and (3) PerfBlower outperforms existing detectors and does not miss any bugs studied before in the literature.

    Optimal Data Partitioning and a Test Case for Ray-Finned Fishes (Actinopterygii) Based on Ten Nuclear Loci

    Data partitioning, the combined phylogenetic analysis of homogeneous blocks of data, is a common strategy used to accommodate heterogeneities in complex multilocus data sets. Variation in evolutionary rates and substitution patterns among sites is typically addressed by partitioning data by gene, codon position, or both. Excessive partitioning of the data, however, could lead to overparameterization; therefore, it seems critical to define the minimum number of partitions necessary to improve the overall fit of the model. We propose a new method, based on cluster analysis, to find an optimal partitioning strategy for multilocus protein-coding data sets. A heuristic exploration of alternative partitioning schemes, based on Bayesian and maximum likelihood (ML) criteria, is shown here to produce an optimal number of partitions. We tested this method using sequence data of 10 nuclear genes collected from 52 ray-finned fish (Actinopterygii) and four tetrapods. The concatenated sequences included 7995 nucleotide sites maximally split into 30 partitions defined a priori based on gene and codon position. Our results show that a model based on only 10 partitions defined by cluster analysis performed better than partitioning by both gene and codon position. Alternative data partitioning schemes are also shown to affect the topologies resulting from phylogenetic analysis, especially when Bayesian methods are used, suggesting that overpartitioning may be of major concern. The phylogenetic relationships among the major clades of ray-finned fish were assessed using the best data-partitioning schemes under ML and Bayesian methods. Some significant results include the monophyly of “Holostei” (Amia and Lepisosteus), the sister-group relationships between (1) esociforms and salmoniforms and (2) osmeriforms and stomiiforms, the polyphyly of Perciformes, and a close relationship of cichlids and atherinomorphs.

    Making sense of microarray data: development of an integrated bioinformatics tool

    Microarray technology promises to monitor interactions among tens of thousands of genes simultaneously. Two types of microarrays, oligonucleotide (oligo) and cDNA arrays, are in common use. Oligo arrays have the advantage of providing a platform that can be more readily compared between laboratories. With the rapid evolution of hardware and lab protocols, the challenge becomes the analysis of a vast amount of data rather than the manufacture or the use of microarrays. Most software applications were developed to deal with cDNA arrays. There remains a lack of tools that can be used for oligo array analysis. The goal of this research project is to develop a bioinformatics tool dedicated to analyzing oligo array data. Our tool, AffyMiner, consists of three functional components: GeneFinding, which finds significant genes in the experiment; GOTree, which constructs a Gene Ontology (GO) tree; and interfaces, which link to third-party applications. AffyMiner effectively deals with multiple replicates in the experiment, gives users the flexibility to choose different data metrics for finding significant genes, and is capable of incorporating various gene annotations. In addition, AffyMiner maps genes of interest onto the GO spaces, providing assistance in the interpretation of findings in the context of biology. Furthermore, AffyMiner provides a portal to use Cluster and GenMAPP, two popular programs for microarray analysis. AffyMiner has been used by multiple users and was found to be an effective tool that has substantially reduced the time and effort needed for data analysis.

    A Practical Approach to Phylogenomics: The Phylogeny of Ray-Finned Fish (Actinopterygii) as a Case Study

    Background: Molecular systematics occupies one of the central stages in biology in the genomic era, ushered in by unprecedented progress in DNA technology. The inference of organismal phylogeny is now based on many independent genetic loci, a widely accepted approach to assemble the tree of life. Surprisingly, this approach is hindered by a lack of appropriate nuclear gene markers for many taxonomic groups, especially at high taxonomic levels, partially due to the lack of tools for efficiently developing new phylogenetic markers. We report here a genome-comparison strategy for identifying nuclear gene markers for phylogenetic inference and apply it to the ray-finned fishes – the largest vertebrate clade in need of phylogenetic resolution. Results: A total of 154 candidate molecular markers – relatively well conserved, putatively single-copy gene fragments with long, uninterrupted exons – were obtained by comparing whole genome sequences of two model organisms, Danio rerio and Takifugu rubripes. Experimental tests of 15 of these (randomly picked) markers on 36 taxa (representing two-thirds of the ray-finned fish orders) demonstrate the feasibility of amplifying by PCR and directly sequencing most of these candidates from whole genomic DNA in a vast diversity of fish species. Preliminary phylogenetic analyses of sequence data obtained for 14 taxa and 10 markers (total of 7,872 bp for each species) are encouraging, suggesting that the markers obtained will make significant contributions to future fish phylogenetic studies. Conclusion: We present a practical approach that systematically compares whole genome sequences to identify single-copy nuclear gene markers for inferring phylogeny. Our method is an improvement over traditional approaches (e.g., manually picking genes for testing) because it uses genomic information and automates the process to identify large numbers of candidate markers. This approach is shown here to be successful for fishes, but could also be applied to other groups of organisms for which two or more complete genome sequences exist, which has important implications for assembling the tree of life.

    7TMRmine: a Web server for hierarchical mining of 7TMR proteins

    Background: Seven-transmembrane region-containing receptors (7TMRs) play central roles in eukaryotic signal transduction. Due to their biomedical importance, thorough mining of 7TMRs from diverse genomes has been an active target of bioinformatics and pharmacogenomics research. The need for new and accurate 7TMR/GPCR prediction tools is paramount with the accelerated rate of acquisition of diverse sequence information. Currently available and often used protein classification methods (e.g., profile hidden Markov Models) are highly accurate for identifying membership among already known 7TMR subfamilies. However, these alignment-based methods are less effective for identifying remote similarities, e.g., identifying proteins from highly divergent or possibly new 7TMR families. In this regard, more sensitive (e.g., alignment-free) methods are needed to complement the existing protein classification methods. A better strategy would be to combine different classifiers, from more specific to more sensitive methods, to identify a broader spectrum of 7TMR protein candidates. Description: We developed a Web server, 7TMRmine, by integrating alignment-free and alignment-based classifiers specifically trained to identify candidate 7TMR proteins as well as transmembrane (TM) prediction methods. This new tool enables researchers to easily assess the distribution of GPCR functionality in diverse genomes or individual newly discovered proteins. 7TMRmine is easily customized and facilitates exploratory analysis of diverse genomes. Users can integrate various alignment-based, alignment-free, and TM-prediction methods in any combination and in any hierarchical order. Sixteen classifiers (including two TM-prediction methods) are available on the 7TMRmine Web server. The 7TMRmine tool can be used not only for 7TMR mining but also for general TM-protein analysis. Users can submit protein sequences for analysis, or explore pre-analyzed results for multiple genomes. The server currently includes prediction results and the summary statistics for 68 genomes. Conclusion: 7TMRmine facilitates the discovery of 7TMR proteins. By combining prediction results from different classifiers in a multi-level filtering process, prioritized sets of 7TMR candidates can be obtained for further investigation. 7TMRmine can also be used as a general TM-protein classifier. Comparisons of TM and 7TMR protein distributions among 68 genomes revealed interesting differences in the evolution of these protein families among major eukaryotic phyla.
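
The multi-level filtering idea in this abstract can be illustrated with a minimal sketch. It is not the 7TMRmine implementation: the two classifiers below (`toy_length_filter`, `toy_tm_predictor`) and the function `hierarchical_filter` are hypothetical stand-ins chosen only to show how classifiers applied in a user-chosen order narrow a candidate set level by level.

```python
from typing import Callable, Dict, List

# A classifier is any function that accepts a protein sequence and says
# whether it passes. Real classifiers (profile HMMs, alignment-free models,
# TM predictors) would be plugged in here; these are toy assumptions.
Classifier = Callable[[str], bool]

HYDROPHOBIC = set("AILMFVWC")


def toy_length_filter(seq: str) -> bool:
    """Hypothetical filter: keep sequences of at least 10 residues."""
    return len(seq) >= 10


def toy_tm_predictor(seq: str) -> bool:
    """Hypothetical TM stand-in: at least half of the residues are hydrophobic."""
    return sum(r in HYDROPHOBIC for r in seq) * 2 >= len(seq)


def hierarchical_filter(
    proteins: Dict[str, str], levels: List[Classifier]
) -> List[List[str]]:
    """Apply classifiers in order; return the surviving IDs after each level."""
    survivors = list(proteins)
    per_level: List[List[str]] = []
    for classify in levels:
        survivors = [pid for pid in survivors if classify(proteins[pid])]
        per_level.append(survivors)
    return per_level


proteins = {
    "p1": "AILMFVWCAILM",
    "p2": "AILM",
    "p3": "DDDDDDDDDDDDKK",
}
result = hierarchical_filter(proteins, [toy_length_filter, toy_tm_predictor])
print(result)
```

Reordering the `levels` list changes which candidates are discarded first, which is the kind of exploratory control over hierarchical order that the abstract describes.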

    Complete Genome Sequences of Pseudomonas fluorescens Bacteriophages Isolated from Freshwater Samples in Omaha, Nebraska

    The complete genome sequences of four Pseudomonas fluorescens bacteriophages, UNO-SLW1 to UNO-SLW4, isolated from freshwater samples, are 39,092 to 39,215 bp long. The genomes are highly similar (identity, >0.995) but dissimilar from that of Pseudomonas phage Pf-10 (the closest relative, 0.685 to 0.686 identity), with 48 to 49 protein-coding genes and 66 regulatory sites predicted.
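
The abstract does not say how the identity values were computed. As a rough illustration only, one common definition is the fraction of matching positions among the aligned, ungapped columns of a pairwise alignment. The function name `pairwise_identity` and the choice to skip gapped columns are assumptions, and other definitions (for example, counting gaps as mismatches) give different numbers.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap.

    Assumes both inputs come from the same pairwise alignment, so they have
    equal length and use '-' for gaps. This is one of several definitions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns under this definition
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


identity = pairwise_identity("ACGT-A", "ACGA-A")
print(identity)
```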

    Transcriptomic variation of hepatopancreas reveals the energy metabolism and biological processes associated with molting in Chinese mitten crab, Eriocheir sinensis

    Molting is a critical developmental process for crustaceans, yet the underlying molecular mechanism is unknown. In this study, we used RNA-Seq to investigate transcriptomic profiles of the hepatopancreas and identified differentially expressed genes at four molting stages of Chinese mitten crab (Eriocheir sinensis). A total of 97,398 transcripts were assembled, with 31,900 transcripts annotated. Transcriptomic comparison revealed 1,189 genes differentially expressed among different molting stages. We observed a pattern associated with energy metabolism and physiological responses during a molting cycle. Specifically, differentially expressed genes enriched in postmolt were linked to energy consumption, whereas genes enriched in intermolt were related to carbohydrate and lipid metabolic and biosynthetic processes. In premolt, a preparation stage for upcoming molting and energy consumption, highly expressed genes were enriched in response to steroid hormone stimulus and immune system development. The expression profiles of twelve functional genes detected via RNA-Seq were corroborated through real-time RT-PCR assay. Together, our results, including assembled transcriptomes, annotated functional elements and enriched differentially expressed genes among different molting stages, provide novel insights into the functions of the hepatopancreas in energy metabolism and biological processes pertaining to molting in crustaceans.