96 research outputs found

    Subfamily specific conservation profiles for proteins based on n-gram patterns

    Background: A new algorithm has been developed for generating conservation profiles that reflect the evolutionary history of the subfamily associated with a query sequence. It is based on n-gram patterns (NP{n,m}), which are sets of n residues and m wildcards in windows of size n+m. The generation of conservation profiles is treated as a signal-to-noise problem in which the signal is the count of n-gram patterns in target sequences that are similar to the query sequence and the noise is the count over all target sequences. The signal is differentiated from the noise by applying singular value decomposition to sets of target sequences rank-ordered by similarity to the query. Results: The new algorithm was used to construct 4,248 profiles from 120 randomly selected Pfam-A families. These were compared to profiles generated from multiple alignments using the consensus approach. The two profiles were similar whenever the subfamily associated with the query sequence was well represented in the multiple alignment. It was possible to construct subfamily-specific conservation profiles using the new algorithm for subfamilies with as few as five members. The speed of the new algorithm was comparable to that of the multiple alignment approach. Conclusion: Subfamily-specific conservation profiles can be generated by the new algorithm without a priori knowledge of family relationships or domain architecture. This is useful when the subfamily contains multiple domains with different levels of representation in protein databases. It may also be applicable when the subfamily sample size is too small for the multiple alignment approach.
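The NP{n,m} definition in the abstract above can be illustrated with a minimal sketch. It assumes that any m of the n+m window positions may be wildcards and that a pattern is counted at most once per sequence; neither detail is stated in the abstract, and the function names are hypothetical. The singular value decomposition step is not shown.

```python
from collections import Counter
from itertools import combinations


def ngram_patterns(seq, n, m, wildcard="."):
    """Yield every pattern with n residues and m wildcards from windows of size n+m.

    Assumption: any m positions in the window may be masked.
    """
    width = n + m
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Choose which m positions of the window become wildcards.
        for masked in combinations(range(width), m):
            yield "".join(
                wildcard if i in masked else residue
                for i, residue in enumerate(window)
            )


def pattern_counts(sequences, n, m):
    """Count in how many sequences each pattern occurs (once per sequence)."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(ngram_patterns(seq, n, m)))
    return counts


if __name__ == "__main__":
    # Toy sequences, not real protein data.
    for pattern, count in sorted(pattern_counts(["ACDE", "ACDF"], 2, 1).items()):
        print(pattern, count)
```

In the signal-to-noise framing of the abstract, counts over query-similar targets would play the role of the signal and counts over all targets the role of the noise.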

    A novel substitution matrix fitted to the compositional bias in Mollicutes improves the prediction of homologous relationships

    Background: Substitution matrices are key parameters for the alignment of two protein sequences, and consequently for most comparative genomics studies. The composition of biological sequences can vary substantially between species and groups of species, and classical matrices such as those in the BLOSUM series fail to accurately estimate alignment scores and statistical significance for sequences sharing marked compositional biases. Results: We present a general and simple methodology to build matrices that are specifically fitted to the compositional bias of proteins. Our approach is inspired by the one used to build the BLOSUM matrices and is based on learning substitution and amino acid frequencies from real sequences with the corresponding compositional bias. We applied it to the large-scale comparison of AT-rich Mollicute genomes. The new matrix, MOLLI60, was used to predict pairwise orthology relationships, as well as homolog families, among 24 Mollicute genomes. We show that this new matrix discriminates better between true and false orthologs and improves the clustering of homologous proteins compared with the classical matrix BLOSUM62. Conclusions: We show in this paper that well-fitted matrices can improve the predictions of orthologous and homologous relationships among proteins with a similar compositional bias. With the ever-increasing number of sequenced genomes, our approach could prove valuable in numerous comparative studies focusing on atypical genomes.
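As a rough illustration of the BLOSUM-style idea the abstract refers to, the sketch below derives half-bit log-odds scores from observed pair frequencies in aligned columns. It is a simplified sketch, not the MOLLI60 procedure: it omits the sequence clustering step used for BLOSUM, and the function name and toy columns are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_matrix(columns):
    """Return half-bit log-odds scores for residue pairs seen in aligned columns."""
    # Count unordered residue pairs within each alignment column.
    pair_counts = Counter()
    for column in columns:
        for a, b in combinations(column, 2):
            pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    observed = {pair: count / total for pair, count in pair_counts.items()}

    # Background frequency of each residue implied by the observed pairs.
    background = Counter()
    for (a, b), freq in observed.items():
        if a == b:
            background[a] += freq
        else:
            background[a] += freq / 2
            background[b] += freq / 2

    scores = {}
    for (a, b), freq in observed.items():
        # Expected frequency of the unordered pair under independence.
        expected = background[a] * background[b] * (1 if a == b else 2)
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


if __name__ == "__main__":
    # Toy columns, not real alignment data.
    print(log_odds_matrix(["AAAA", "AAAK", "KKKK"]))
```

Because the background frequencies come from the training alignments themselves, columns drawn from compositionally biased proteins yield scores adapted to that bias, which is the intuition behind fitting a matrix to a specific group of genomes.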

    The effects of breastfeeding on retinoblastoma development: Results from an international multicenter retinoblastoma survey

    The protective effects of breastfeeding on various childhood malignancies have been established but an association has not yet been determined for retinoblastoma (RB). We aimed to further investigate the role of breastfeeding in the severity of nonhereditary RB development, assessing relationship to (1) age at diagnosis, (2) ocular prognosis, measured by International Intraocular RB Classification (IIRC) or Intraocular Classification of RB (ICRB) group and success of eye salvage, and (3) extraocular involvement. Analyses were performed on a global dataset subgroup of 344 RB patients whose legal guardian(s) consented to answer a neonatal questionnaire. Patients with undetermined or mixed feeding history, family history of RB, or sporadic bilateral RB were excluded. There was no statistically significant difference between breastfed and formula-fed groups in (1) age at diagnosis (p = 0.20), (2) ocular prognosis measures of IIRC/ICRB group (p = 0.62) and success of eye salvage (p = 0.16), or (3) extraocular involvement shown by International Retinoblastoma Staging System (IRSS) at presentation (p = 0.74), lymph node involvement (p = 0.20), and distant metastases (p = 0.37). This study suggests that breastfeeding neither impacts the sporadic development nor is associated with a decrease in the severity of nonhereditary RB as measured by age at diagnosis, stage of disease, ocular prognosis, and extraocular spread. A further exploration into the impact of diet on children who develop RB is warranted

    A novel series of compositionally biased substitution matrices for comparing Plasmodium proteins

    Background: The most common substitution matrices currently used (BLOSUM and PAM) are based on protein sequences with average amino acid distributions; thus, they do not represent a fully accurate substitution model for proteins characterized by a biased amino acid composition. This problem has been addressed recently by adjusting existing matrices; however, to date, no empirical approach has been taken to build matrices that offer a substitution model for comparing proteins sharing an amino acid compositional bias. Here, we present a novel procedure to construct series of symmetrical substitution matrices to align proteins from similarly biased Plasmodium proteomes. Results: We generated substitution matrices by selecting from the BLOCKS database those multiple alignments with a compositional bias similar to that of P. falciparum and P. yoelii proteins. A novel 'fuzzy' clustering method was adopted to group sequences within these alignments, showing that this method retains more complete information on the amino acid substitutions than hierarchical clustering. We assessed the performance against the BLOSUM62 series and showed that the use of our matrices improves the performance of BLAST database searches, greatly reducing the number of false positive hits. We then demonstrated applications of the novel matrices to improve the annotation of homologs between the two Plasmodium species and to classify members of the P. falciparum RIFIN/STEVOR family. Conclusion: We confirmed that, in the case of compositionally biased proteins, standard BLOSUM matrices are not suited for optimal alignments, and specific substitution matrices are required. In addition, we showed that the use of these matrices leads to a reduction in false positive hits, facilitating the automatic annotation process.

    The expression of selenium-binding protein 1 is decreased in uterine leiomyoma

    Background: Selenium has been shown to inhibit cancer development and growth through the mediation of selenium-binding proteins. Decreased expression of selenium-binding protein 1 has been reported in cancers of the prostate, stomach, colon, and lungs. No information, however, is available concerning the role of selenium-binding protein 1 in uterine leiomyoma. Methods: Using Western blot analysis and immunohistochemistry, we examined the expression of selenium-binding protein 1 in uterine leiomyoma and normal myometrium in 20 patients who had undergone hysterectomy for uterine leiomyoma. Results and Discussion: Patient age ranged from 34 to 58 years with a mean of 44.3 years. Proliferative endometrium was seen in 8 patients, secretory endometrium in 7 patients, and atrophic endometrium in 5 patients. Two patients had a solitary leiomyoma, and eighteen patients had 2 to 5 tumors. Tumor size ranged from 1 to 15.5 cm with a mean of 4.3 cm. Both Western blot analysis and immunohistochemistry showed a significantly lower level of selenium-binding protein 1 in leiomyoma than in normal myometrium. Larger tumors tended to show a lower level of selenium-binding protein 1 than smaller ones, but the difference did not reach statistical significance. The expression of selenium-binding protein 1 was the same among patients with proliferative, secretory, and atrophic endometrium in both leiomyoma and normal myometrium. We also found no difference in selenium-binding protein 1 level between patients younger than 45 years and older patients in either leiomyoma or normal myometrium. Conclusions: Decreased expression of selenium-binding protein 1 in uterine leiomyoma may indicate a role of the protein in tumorigenesis. Our findings may provide a basis for future studies concerning the molecular mechanisms of selenium-binding protein 1 in tumorigenesis as well as the possible use of selenium in the prevention and treatment of uterine leiomyoma.

    Modern classification of neoplasms: reconciling differences between morphologic and molecular approaches

    BACKGROUND: For over 150 years, pathologists have relied on histomorphology to classify and diagnose neoplasms. Their success has been stunning, permitting the accurate diagnosis of thousands of different types of neoplasms using only a microscope and a trained eye. In the past two decades, cancer genomics has challenged the supremacy of histomorphology by identifying genetic alterations shared by morphologically diverse tumors and by finding genetic features that distinguish subgroups of morphologically homogeneous tumors. DISCUSSION: The Developmental Lineage Classification and Taxonomy of Neoplasms groups neoplasms by their embryologic origin. The putative value of this classification is based on the expectation that tumors of a common developmental lineage will share common metabolic pathways and common responses to drugs that target these pathways. The purpose of this manuscript is to show that grouping tumors according to their developmental lineage can reconcile certain fundamental discrepancies resulting from morphologic and molecular approaches to neoplasm classification. In this study, six issues in tumor classification are described that exemplify the growing rift between morphologic and molecular approaches to tumor classification: 1) the morphologic separation between epithelial and non-epithelial tumors; 2) the grouping of tumors based on shared cellular functions; 3) the distinction between germ cell tumors and pluripotent tumors of non-germ cell origin; 4) the distinction between tumors that have lost their differentiation and tumors that arise from uncommitted stem cells; 5) the molecular properties shared by morphologically disparate tumors that have a common developmental lineage, and 6) the problem of re-classifying morphologically identical but clinically distinct subsets of tumors. 
The discussion of these issues in the context of describing different methods of tumor classification is intended to underscore the clinical value of a robust tumor classification. SUMMARY: A classification of neoplasms should guide the rational design and selection of a new generation of cancer medications targeted to metabolic pathways. Without a scientifically sound neoplasm classification, biological measurements on individual tumor samples cannot be generalized to class-related tumors, and constitutive properties common to a class of tumors cannot be distinguished from uninformative data in complex and chaotic biological systems. This paper discusses the importance of biological classification and examines several different approaches to the specific problem of tumor classification

    Introducing global peat-specific temperature and pH calibrations based on brGDGT bacterial lipids

    Glycerol dialkyl glycerol tetraethers (GDGTs) are membrane-spanning lipids from Bacteria and Archaea that are ubiquitous in a range of natural archives and especially abundant in peat. Previous work demonstrated that the distribution of bacterial branched GDGTs (brGDGTs) in mineral soils is correlated to environmental factors such as mean annual air temperature (MAAT) and soil pH. However, the influence of these parameters on brGDGT distributions in peat is largely unknown. Here we investigate the distribution of brGDGTs in 470 samples from 96 peatlands around the world with a broad mean annual air temperature (−8 to 27 °C) and pH (3–8) range and present the first peat-specific brGDGT-based temperature and pH calibrations. Our results demonstrate that the degree of cyclisation of brGDGTs in peat is positively correlated with pH, pH = 2.49 × CBTpeat + 8.07 (n = 51, R² = 0.58, RMSE = 0.8), and the degree of methylation of brGDGTs is positively correlated with MAAT, MAATpeat (°C) = 52.18 × MBT'5me − 23.05 (n = 96, R² = 0.76, RMSE = 4.7 °C). These peat-specific calibrations are distinct from the available mineral soil calibrations. In light of the error in the temperature calibration (~4.7 °C), we urge caution in any application to reconstruct late Holocene climate variability, where the climatic signals are relatively small, and the duration of excursions could be brief. Instead, these proxies are well-suited to reconstruct large amplitude, longer-term shifts in climate such as deglacial transitions. Indeed, when applied to a peat deposit spanning the late glacial period (~15.2 kyr), we demonstrate that MAATpeat yields absolute temperatures and relative temperature changes that are consistent with those from other proxies. In addition, the application of MAATpeat to fossil peat (i.e. lignites) has the potential to reconstruct terrestrial climate during the Cenozoic. 
We conclude that there is clear potential to use brGDGTs in peats and lignites to reconstruct past terrestrial climate. (C) 2017 The Authors. Published by Elsevier Ltd
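The two linear calibrations quoted in the abstract above can be applied as in the following minimal sketch. The coefficients and RMSE values are taken from the abstract; the function names and the example index values are hypothetical, and computing the CBTpeat and MBT'5me indices from lipid abundances is not shown.

```python
# Calibration errors reported in the abstract.
PH_RMSE = 0.8          # pH units
MAAT_RMSE_DEG_C = 4.7  # degrees Celsius


def ph_from_cbt_peat(cbt_peat):
    """Estimate peat pH from the CBTpeat cyclisation index."""
    return 2.49 * cbt_peat + 8.07


def maat_from_mbt_5me(mbt_5me):
    """Estimate mean annual air temperature (deg C) from the MBT'5me index."""
    return 52.18 * mbt_5me - 23.05


if __name__ == "__main__":
    # Hypothetical index values, for illustration only.
    cbt_peat, mbt_5me = -1.0, 0.5
    print(f"pH   = {ph_from_cbt_peat(cbt_peat):.2f} +/- {PH_RMSE}")
    print(f"MAAT = {maat_from_mbt_5me(mbt_5me):.2f} +/- {MAAT_RMSE_DEG_C} deg C")
```

Reporting the RMSE alongside each estimate reflects the caution in the abstract: an uncertainty of about 4.7 °C is large relative to late Holocene variability but small relative to deglacial shifts.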

    A Synthesis of Tagging Studies Examining the Behaviour and Survival of Anadromous Salmonids in Marine Environments

    This paper synthesizes tagging studies to highlight the current state of knowledge concerning the behaviour and survival of anadromous salmonids in the marine environment. Scientific literature was reviewed to quantify the number and type of studies that have investigated behaviour and survival of anadromous forms of Pacific salmon (Oncorhynchus spp.), Atlantic salmon (Salmo salar), brown trout (Salmo trutta), steelhead (Oncorhynchus mykiss), and cutthroat trout (Oncorhynchus clarkii). We examined three categories of tags: electronic (e.g. acoustic, radio, archival), passive (e.g. external marks, Carlin, coded wire, passive integrated transponder [PIT]), and biological (e.g. otolith, genetic, scale, parasites). Based on 207 papers, survival rates and behaviour in marine environments were found to be extremely variable spatially and temporally, with some of the most influential factors being temperature, population, physiological state, and fish size. Salmonids at all life stages were consistently found to swim at an average speed of approximately one body length per second, which likely corresponds with the speed at which transport costs are minimal. We found that relatively little research has been conducted on open-ocean migrating salmonids, and some species (e.g. masu [O. masou] and amago [O. rhodurus]) are underrepresented in the literature. The most common forms of tagging used across life stages were various forms of external tags, coded wire tags, and acoustic tags; however, the majority of studies did not measure tagging/handling effects on the fish, tag loss/failure, or tag detection probabilities when estimating survival. Through the interdisciplinary application of existing and novel technologies, future research examining the behaviour and survival of anadromous salmonids could incorporate important drivers such as oceanography, tagging/handling effects, predation, and physiology

    The global abundance of tree palms

    Aim: Palms are an iconic, diverse and often abundant component of tropical ecosystems that provide many ecosystem services. Being monocots, tree palms are evolutionarily, morphologically and physiologically distinct from other trees, and these differences have important consequences for ecosystem services (e.g., carbon sequestration and storage) and in terms of responses to climate change. We quantified global patterns of tree palm relative abundance to help improve understanding of tropical forests and reduce uncertainty about these ecosystems under climate change. Location: Tropical and subtropical moist forests. Time period: Current. Major taxa studied: Palms (Arecaceae). Methods: We assembled a pantropical dataset of 2,548 forest plots (covering 1,191 ha) and quantified tree palm (i.e., ≥10 cm diameter at breast height) abundance relative to co‐occurring non‐palm trees. We compared the relative abundance of tree palms across biogeographical realms and tested for associations with palaeoclimate stability, current climate, edaphic conditions and metrics of forest structure. Results: On average, the relative abundance of tree palms was more than five times larger in Neotropical locations than in other biogeographical realms. Tree palms were absent in most locations outside the Neotropics but present in >80% of Neotropical locations. The relative abundance of tree palms was more strongly associated with local conditions (e.g., higher mean annual precipitation, lower soil fertility, shallower water table and lower plot mean wood density) than with metrics of long‐term climate stability. Life‐form diversity also influenced the patterns; palm assemblages outside the Neotropics comprise many non‐tree (e.g., climbing) palms. Finally, we show that tree palms can influence estimates of above‐ground biomass, but the magnitude and direction of the effect require additional work. 
Conclusions: Tree palms are not only quintessentially tropical, but they are also overwhelmingly Neotropical. Future work to understand the contributions of tree palms to biomass estimates and carbon cycling will be particularly crucial in Neotropical forests