242 research outputs found

    Testing statistical significance scores of sequence comparison methods with structure similarity

    BACKGROUND: In recent years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search are determined not only by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested in whether this could be validated when applied to existing, evolutionarily related protein sequences. RESULTS: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. CONCLUSION: The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
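
    The abstract above does not spell out how a Monte-Carlo Z-score is obtained. A minimal sketch, assuming the common approach of shuffling one sequence to build a null distribution of alignment scores, might look as follows. The function names, the scoring parameters (match, mismatch, linear gap) and the number of shuffles are illustrative assumptions, not the settings used by CluSTr, Protein World or the study itself.

```python
import math
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are toy assumptions; real searches use a
    substitution matrix and affine gap penalties.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_score(query, subject, n_shuffles=100, seed=0):
    """Z-score of the observed score against scores of shuffled subjects."""
    rng = random.Random(seed)
    observed = sw_score(query, subject)
    letters = list(subject)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        null_scores.append(sw_score(query, "".join(letters)))
    mean = statistics.mean(null_scores)
    sd = statistics.stdev(null_scores)
    if sd == 0:
        # Degenerate null distribution: no spread to normalise by.
        return math.inf if observed > mean else 0.0
    return (observed - mean) / sd


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
    print(z_score(seq, seq))
```

    Each Z-score requires n_shuffles additional alignments, which is why the abstract calls the method compute-intensive compared with an analytically derived e-value.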

    A fresh look at the evolution and diversification of photochemical reaction centers

    In this review, I reexamine the origin and diversification of photochemical reaction centers based on the known phylogenetic relations of the core subunits, and with the aid of sequence and structural alignments. I show, for example, that the protein folds at the C-terminus of the D1 and D2 subunits of Photosystem II, which are essential for the coordination of the water-oxidizing complex, were already in place in the most ancestral Type II reaction center subunit. I then evaluate the evolution of reaction centers in the context of the rise and expansion of the different groups of bacteria based on recent large-scale phylogenetic analyses. I find that the Heliobacteriaceae family of Firmicutes appears to be the earliest branching of the known groups of phototrophic bacteria; however, the origin of photochemical reaction centers and chlorophyll synthesis cannot be placed in this group. Moreover, it becomes evident that the Acidobacteria and the Proteobacteria shared a more recent common phototrophic ancestor, and this is also likely for the Chloroflexi and the Cyanobacteria. Finally, I argue that the discrepancies among the phylogenies of the reaction center proteins, chlorophyll synthesis enzymes, and the species tree of bacteria are best explained if both types of photochemical reaction centers evolved before the diversification of the known phyla of phototrophic bacteria. The primordial phototrophic ancestor must have had both Type I and Type II reaction centers.

    The allometry of the smallest: superlinear scaling of microbial metabolic rates in the Atlantic Ocean

    Prokaryotic planktonic organisms are small in size but highly relevant in marine biogeochemical cycles. Due to their narrow size range (0.2 to 1 μm in diameter), the effects of cell size on their metabolism have hardly been considered and are usually not examined in field studies. Here, we show the results of size-fractionated experiments of marine microbial respiration rate along a latitudinal transect in the Atlantic Ocean. The scaling exponents obtained from the power relationship between respiration rate and size were significantly higher than one. This superlinearity was ubiquitous across the latitudinal transect, but its value was not universal, revealing a strong albeit heterogeneous effect of cell size on microbial metabolism. Our results suggest that the latitudinal differences observed are the combined result of changes in cell size and composition between functional groups within prokaryotes. Communities where the largest size fraction was dominated by prokaryotic cyanobacteria, especially Prochlorococcus, have lower allometric exponents. We hypothesize that these larger, more complex prokaryotes fall close to the evolutionary transition between prokaryotes and protists, in a range where surface area starts to constrain metabolism and, hence, are expected to follow a scaling closer to linearity.
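
    The "power relationship" in this abstract has the general form R = a * S^b, where b is the scaling exponent and b > 1 means superlinear scaling. A minimal sketch of how such an exponent can be estimated, assuming ordinary least squares on log-transformed values, is given below. The function name and the numbers are synthetic illustrations; they are not data from the study, and the abstract does not state which regression method the authors used.

```python
import math


def fit_power_law(sizes, rates):
    """Fit rate = a * size**b by least squares in log-log space.

    Returns (a, b). Taking logs turns the power law into a straight
    line: log(rate) = log(a) + b * log(size).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope = scaling exponent
    a = math.exp(mean_y - b * mean_x)    # intercept back-transformed
    return a, b


if __name__ == "__main__":
    # Synthetic example: sizes in micrometres, rates in arbitrary units,
    # generated from an exact power law with exponent 1.3.
    sizes = [0.2, 0.4, 0.6, 0.8, 1.0]
    rates = [2.0 * s ** 1.3 for s in sizes]
    a, b = fit_power_law(sizes, rates)
    print(f"a = {a:.3f}, b = {b:.3f}, superlinear: {b > 1}")
```

    With real measurements the exponent would carry an uncertainty, so a claim such as "significantly higher than one" would additionally need a confidence interval or a test on the slope.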

    Discovery: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria

    Background: Up to half a billion human clinical cases of malaria are reported each year, resulting in about 2.7 million deaths, most of which occur in sub-Saharan Africa. Due to the over- and misuse of anti-malarials, widespread resistance to all the known drugs is increasing at an alarming rate. Rational methods to select new drug target proteins and lead compounds are urgently needed. The Discovery system provides data mining functionality on extensive annotations of five malaria species together with the human and mosquito hosts, enabling the selection of new targets based on multiple protein and ligand properties. Methods: A web-based system was developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein features used include: domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, protein-ligand interactions and host-pathogen interactions among others. Searching by chemical structure is also available. Results: An in silico system for the selection of putative drug targets and lead compounds is presented, together with an example study on the bifunctional DHFR-TS from Plasmodium falciparum. Conclusion: The Discovery system allows for the identification of putative drug targets and lead compounds in Plasmodium species based on the filtering of protein and chemical properties.

    Learning to live together: mutualism between self-splicing introns and their hosts

    Group I and II introns can be considered as molecular parasites that interrupt protein-coding and structural RNA genes in all domains of life. They function as self-splicing ribozymes and thereby limit the phenotypic costs associated with disruption of a host gene while they act as mobile DNA elements to promote their spread within and between genomes. Once considered purely selfish DNA elements, they now seem, in the light of recent work on the molecular mechanisms regulating bacterial and phage group I and II intron dynamics, to show evidence of co-evolution with their hosts. These previously underappreciated relationships serve the co-evolving entities particularly well in times of environmental stress.

    RISCI - Repeat Induced Sequence Changes Identifier: a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify repeat induced sequence changes in closely related genomes

    Background: The availability of multiple whole genome sequences has facilitated in silico identification of fixed and polymorphic transposable elements (TE). Whereas polymorphic loci serve as markers for phylogenetic and forensic analysis, fixed species-specific transposon insertions, when compared to orthologous loci in other closely related species, may give insights into their evolutionary significance. Besides, TE insertions are not isolated events and are frequently associated with subtle sequence changes concurrent with insertion or post insertion. These include duplication of target site, 3' and 5' flank transduction, deletion of the target locus, 5' truncation or partial deletion and inversion of the transposon, and post insertion changes like inter or intra element recombination, disruption etc. Although such changes have been studied independently, no automated platform to identify differential transposon insertions and the associated array of sequence changes in genomes of the same or closely related species is available to date. To this end, we have designed RISCI - 'Repeat Induced Sequence Changes Identifier' - a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify differential transposon insertions and associated sequence changes using specific alignment signatures, which may then be examined for their downstream effects. Results: We showcase the utility of RISCI by comparing full length and truncated L1HS and AluYa5 retrotransposons in the reference human genome with the chimpanzee genome and the alternate human assemblies (Celera and HuRef). Comparison of the reference human genome with alternate human assemblies using RISCI predicts 14 novel polymorphisms in full length L1HS, 24 in truncated L1HS and 140 novel polymorphisms in AluYa5 insertions, besides several insertion and post insertion changes. We present comparison with two previous studies to show that RISCI predictions are broadly in agreement with earlier reports. We also demonstrate its versatility by comparing various strains of Mycobacterium tuberculosis for IS 6100 insertion polymorphism. Conclusions: RISCI combines comparative genomics with subtractive hybridization, inferring changes only when exclusive to one of the two genomes being compared. The pipeline is generic and may be applied to most transposons and to any two or more genomes sharing high sequence similarity. Such comparisons, when performed on a larger scale, may pull out a few critical events, which may have seeded the divergence between the two species under comparison.

    A systematic review of economic analyses of telehealth services using real time video communication

    Background: Telehealth is the delivery of health care at a distance, using information and communication technology. The major rationales for its introduction have been to decrease costs, improve efficiency and increase access in health care delivery. This systematic review assesses the economic value of one type of telehealth delivery - synchronous or real time video communication - rather than examining a heterogeneous range of delivery modes as has been the case with previous reviews in this area. Methods: A systematic search was undertaken for economic analyses of the clinical use of telehealth, ending in June 2009. Studies with patient outcome data and a non-telehealth comparator were included. Cost analyses, non-comparative studies and those where patient satisfaction was the only health outcome were excluded. Results: 36 articles met the inclusion criteria. 22 (61%) of the studies found telehealth to be less costly than the non-telehealth alternative, 11 (31%) found greater costs and 3 (9%) gave the same or mixed results. 23 of the studies took the perspective of the health services, 12 were societal, and one was from the patient perspective. In three studies of telehealth to rural areas, the health services paid more for telehealth, but due to savings in patient travel, the societal perspective demonstrated cost savings. In regard to health outcomes, 12 (33%) of studies found improved health outcomes, 21 (58%) found outcomes were not significantly different, 2 (6%) found that telehealth was less effective, and 1 (3%) found outcomes differed according to patient group. The organisational model of care was more important in determining the value of the service than the clinical discipline, the type of technology, or the date of the study. Conclusion: Delivery of health services by real time video communication was cost-effective for home care and access to on-call hospital specialists, showed mixed results for rural service delivery, and was not cost-effective for local delivery of services between hospitals and primary care.

    More Than 1,001 Problems with Protein Domain Databases: Transmembrane Regions, Signal Peptides and the Issue of Sequence Homology

    Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions over the packing in the hydrophobic core requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended to them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users.