
    Increasing the Reliability of Adaptive Quadrature Using Explicit Interpolants

    We present two new adaptive quadrature routines. Both routines differ from previously published algorithms in many aspects, most significantly in how they represent the integrand, how they treat non-numerical values of the integrand, how they deal with improper divergent integrals and how they estimate the integration error. The main focus of these improvements is to increase the reliability of the algorithms without significantly impacting their efficiency. Both algorithms are implemented in Matlab and tested using both the "families" suggested by Lyness and Kaganove and the battery test used by Gander and Gautschi and by Kahaner. They are shown to be more reliable, albeit in some cases less efficient, than other commonly used adaptive integrators. Comment: 32 pages, submitted to ACM Transactions on Mathematical Software.

    Shaping Biological Knowledge: Applications in Proteomics

    The central dogma of molecular biology has provided a meaningful principle for data integration in the field of genomics. In this context, integration reflects the known transitions from a chromosome to a protein sequence: transcription, intron splicing, exon assembly and translation. There is no such clear principle for integrating proteomics data, since the laws governing protein folding and interactivity are not fully understood. In our effort to bring together independent pieces of information relating to proteins in a biologically meaningful way, we assess the bias of bioinformatics resources and the consequent approximations in the framework of small-scale studies. We analyse proteomics data while following both a data-driven (focus on proteins smaller than 10 kDa) and a hypothesis-driven (focus on whole bacterial proteomes) approach. These applications are potentially the source of specialized complements to classical biological ontologies.

    GenoQuery: a new querying module for functional annotation in a genomic warehouse

    Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improve the functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these data.

    Relationship between the Composition of Flavonoids and Flower Colors Variation in Tropical Water Lily (Nymphaea) Cultivars

    Water lily, a member of the Nymphaeaceae family, is a symbol of Buddhism and Brahmanism in India. Although research on its flower color variation and color formation mechanism is limited, water lily has a background of blue flowers and displays an exceptionally wide diversity of flower colors in nature, from purple, red and blue to yellow. In this study, 34 flavonoids were identified among 35 tropical cultivars by high-performance liquid chromatography (HPLC) with photodiode array detection (DAD) and electrospray ionization mass spectrometry (ESI-MS). Among them, four anthocyanins: delphinidin 3-O-rhamnosyl-5-O-galactoside (Dp3Rh5Ga), delphinidin 3-O-(2″-O-galloyl-6″-O-oxalyl-rhamnoside) (Dp3galloyl-oxalylRh), delphinidin 3-O-(6″-O-acetyl-β-glucopyranoside) (Dp3acetylG) and cyanidin 3-O-(2″-O-galloyl-galactopyranoside)-5-O-rhamnoside (Cy3galloylGa5Rh), one chalcone: chalcononaringenin 2′-O-galactoside (Chal2′Ga) and twelve flavonols: myricetin 7-O-rhamnosyl-(1→2)-rhamnoside (My7RhRh), quercetin 7-O-galactosyl-(1→2)-rhamnoside (Qu7GaRh), quercetin 7-O-galactoside (Qu7Ga), kaempferol 7-O-galactosyl-(1→2)-rhamnoside (Km7GaRh), myricetin 3-O-galactoside (My3Ga), kaempferol 7-O-galloylgalactosyl-(1→2)-rhamnoside (Km7galloylGaRh), myricetin 3-O-galloylrhamnoside (My3galloylRh), kaempferol 3-O-galactoside (Km3Ga), isorhamnetin 7-O-galactoside (Is7Ga), isorhamnetin 7-O-xyloside (Is7Xy), kaempferol 3-O-(3″-acetylrhamnoside) (Km3-3″acetylRh) and quercetin 3-O-acetylgalactoside (Qu3acetylGa) were identified in the petals of tropical water lily for the first time. A multivariate analysis was also used to explore the relationship between pigments and flower color. By comparison, cultivars in which delphinidin 3-galactoside (Dp3Ga) was detected were amaranth, and those in which delphinidin 3′-galactoside (Dp3′Ga) was detected were blue. However, the derivatives of delphinidin and cyanidin were more complicated in the red group. No anthocyanins were detected in the white and yellow groups. In addition, a putative flavonoid biosynthesis pathway for tropical water lily is proposed. These studies will help to elucidate the evolutionary mechanism of flower color formation and provide a theoretical basis for outcross breeding and for developing health care products from this plant.

    Gene fusions and gene duplications: relevance to genomic annotation and functional analysis

    BACKGROUND: Escherichia coli, a model organism, provides information for the annotation of other genomes. Our analysis of its genome has shown that proteins encoded by fused genes need special attention. Such composite (multimodular) proteins consist of two or more components (modules) encoding distinct functions. Multimodular proteins have been found to complicate both annotation and the generation of sequence-similar groups. Previous work overstated the number of multimodular proteins in E. coli. This work corrects the identification of modules by including sequence information from proteins in 50 sequenced microbial genomes. RESULTS: Multimodular E. coli K-12 proteins were identified from sequence similarities between their component modules and non-fused proteins in 50 genomes and from the literature. We found 109 multimodular proteins in E. coli containing either two or three modules. Most modules had standalone sequence relatives in other genomes. The separated modules together with all the single (un-fused) proteins constitute the sum of all unimodular proteins of E. coli. Pairwise sequence relationships among all E. coli unimodular proteins generated 490 sequence-similar, paralogous groups. Groups ranged in size from 92 to 2 members and had varying degrees of relatedness among their members. Some E. coli enzyme groups were compared to homologs in other bacterial genomes. CONCLUSION: The deleterious effects of multimodular proteins on annotation and on the formation of groups of paralogs are emphasized. To improve annotation results, all multimodular proteins in an organism should be detected and, when known, each function should be connected with its location in the sequence of the protein. When transferring functions by sequence similarity, alignment locations must be noted, particularly when alignments cover only part of the sequences, in order to enable transfer of the correct function. Separating multimodular proteins into module units makes it possible to generate protein groups related by both sequence and function, avoiding the mixing of unrelated sequences. Organisms differ in the sizes of their groups of sequence-related proteins. A sample comparison of orthologs to selected E. coli paralogous groups correlates with known physiological and taxonomic relationships between the organisms.

    MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts

    BACKGROUND: Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It is used in almost all bioinformatics tasks, such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. RESULTS: We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method differs from multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences, since the structural information in our method is fully predicted from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact maps to multiple sequence alignment is novel. Rigorous benchmarking of our method on the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) clearly demonstrated that incorporating predicted protein structural information improves multiple sequence alignment accuracy over the leading multiple protein sequence alignment tools that do not use this information, such as MSAProbs, ProbCons, Probalign, T-coffee, MAFFT and MUSCLE. The method's performance is comparable to, though slightly below, that of the state-of-the-art method PROMALS, which uses structural features and additional homologous sequences. CONCLUSION: MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.