15 research outputs found

    ccSOL omics: a webserver for solubility prediction of endogenous and heterologous expression in Escherichia coli

    SUMMARY: Here we introduce ccSOL omics, a webserver for large-scale calculations of protein solubility. Our method allows (i) proteome-wide predictions; (ii) identification of soluble fragments within each sequence; (iii) exhaustive single-point mutation analysis. RESULTS: Using coil/disorder, hydrophobicity, hydrophilicity, β-sheet and α-helix propensities, we built a predictor of protein solubility. Our approach shows an accuracy of 79% on the training set (36 990 Target Track entries). Validation on three independent sets indicates that ccSOL omics discriminates soluble and insoluble proteins with an accuracy of 74% on 31 760 proteins sharing <30% sequence similarity. AVAILABILITY AND IMPLEMENTATION: ccSOL omics can be freely accessed on the web at http://s.tartaglialab.com/page/ccsol_group. Documentation and tutorial are available at http://s.tartaglialab.com/static_files/shared/tutorial_ccsol_omics.html. CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Mosaic human preimplantation embryos and their developmental potential in a prospective, non-selection clinical trial

    Chromosome imbalance (aneuploidy) is the major cause of pregnancy loss and congenital disorders in humans. Analyses of small biopsies from human embryos suggest that aneuploidy commonly originates during early divisions, resulting in mosaicism. However, the developmental potential of mosaic embryos remains unclear. We followed the distribution of aneuploid chromosomes across 73 unselected preimplantation embryos and 365 biopsies, sampled from four multifocal trophectoderm (TE) samples and the inner cell mass (ICM). When mosaicism impacted fewer than 50% of cells in one TE biopsy (low-medium mosaicism), only 1% of aneuploidies affected other portions of the embryo. A double-blinded prospective non-selection trial (NCT03673592) showed equivalent live-birth rates and miscarriage rates across 484 euploid, 282 low-grade mosaic, and 131 medium-grade mosaic embryos. No instances of mosaicism or uniparental disomy were detected in the ensuing pregnancies or newborns, and obstetrical and neonatal outcomes were similar between the study groups. Thus, low-medium mosaicism in the trophectoderm mostly arises after TE and ICM differentiation, and such embryos have developmental potential equivalent to that of fully euploid ones.

    Protein-specific prediction of mRNA binding using RNA sequences, binding motifs and predicted secondary structures

    Background: RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes. It is therefore necessary to identify the RNA interaction partners in order to understand the precise functions of such proteins. Protein-RNA interactions are typically characterized using in vivo and in vitro experiments but these may not detect all binding partners. Therefore, computational methods that capture the protein-dependent nature of such binding interactions could help to predict potential binding partners in silico. Results: We have developed three methods to predict whether an RNA can interact with a particular RNA-binding protein using support vector machines and different features based on the sequence (the Oli method), the motif score (the OliMo method) and the secondary structure (the OliMoSS method). We applied these approaches to different experimentally-derived datasets and compared the predictions with RNAcontext and RPISeq. Oli outperformed OliMoSS and RPISeq, confirming our protein-specific predictions and suggesting that tetranucleotide frequencies are appropriate discriminative features. Oli and RNAcontext were the most competitive methods in terms of the area under curve. A precision-recall curve analysis achieved higher precision values for Oli. On a second experimental dataset including real negative binding information, Oli outperformed RNAcontext with a precision of 0.73 vs. 0.59. Conclusions: Our experiments showed that features based on primary sequence information are sufficiently discriminating to predict specific RNA-protein interactions. Sequence motifs and secondary structure information were not necessary to improve these predictions. Finally we confirmed that protein-specific experimental data concerning RNA-protein interactions are valuable sources of information that can be used for the efficient training of models for in silico predictions. 
    The scripts are available upon request from the corresponding author. This work has been funded by research grants from the University of Trento, Italy.
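
The abstract above names tetranucleotide frequencies as the discriminative sequence feature behind the Oli method. The following is a minimal sketch of how such a feature vector could be computed from an RNA sequence. It is not the authors' script: the function name `tetranucleotide_frequencies`, the handling of `T` as `U`, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration, and the support vector machine training step is not shown.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the RNA alphabet, in a fixed order
# so that every sequence maps to a feature vector with the same layout.
ALPHABET = "ACGU"
KMERS = ["".join(p) for p in product(ALPHABET, repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of the 256 tetranucleotides in `seq`.

    Hypothetical helper for illustration only. DNA-style input is accepted
    by mapping T to U; windows containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made entirely of A/C/G/U contribute to the total.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


features = tetranucleotide_frequencies("ACGUACGU")
print(len(features), features[KMERS.index("ACGU")])
```

A vector like this, computed for each candidate RNA, is the kind of input a per-protein classifier could be trained on.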

    Protein aggregation, structural disorder and RNA-binding ability: a new approach for physico-chemical and gene ontology classification of multiple datasets

    BACKGROUND: Comparison between multiple protein datasets requires the choice of an appropriate reference system and a number of variables to describe their differences. Here we introduce an innovative approach to discriminate multiple protein datasets (multiCM) and to measure enrichments in gene ontology terms (cleverGO) using semantic similarities. RESULTS: We illustrate the power of our approach by investigating the links between RNA-binding ability and other protein features, such as structural disorder and aggregation, in S. cerevisiae, C. elegans, M. musculus and H. sapiens. Our results are in striking agreement with available experimental evidence and unravel features that are key to understanding the mechanisms regulating cellular homeostasis. CONCLUSIONS: In an intuitive way, multiCM and cleverGO provide accurate classifications of physico-chemical features and annotations of biological processes, molecular functions and cellular components, which is extremely useful for the discovery and characterization of new trends in protein datasets. multiCM and cleverGO can be freely accessed on the Web at http://www.tartaglialab.com/cs_multi/submission and http://www.tartaglialab.com/GO_analyser/universal . Each of the pages contains links to the corresponding documentation and tutorial. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), through the European Research Council, under grant agreement RIBOMYLOME_309545 (Gian Gaetano Tartaglia), and from the Spanish Ministry of Economy and Competitiveness (BFU2014-55054-P). We also acknowledge support from AGAUR (2014 SGR 00685), the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013–2017’ (SEV-2012-0208). PK and RDP are recipients of “La Caixa” and “Severo Ochoa” studentships, respectively.

    Comparison of computational methods for Hi-C data analysis

    Hi-C is a genome-wide sequencing technique used to investigate 3D chromatin conformation inside the nucleus. Computational methods are required to analyze Hi-C data and identify chromatin interactions and topologically associating domains (TADs) from genome-wide contact probability maps. We quantitatively compared the performance of 13 algorithms in their analyses of Hi-C data from six landmark studies and simulations. This comparison revealed differences in the performance of methods for chromatin interaction identification, but more comparable results for TAD detection between algorithms.

    Discovering the 3′ UTR-mediated regulation of alpha-synuclein

    Recent evidence indicates a link between Parkinson's Disease (PD) and the expression of α-synuclein (SNCA) isoforms with different 3' untranslated regions (3' UTRs). Yet, the post-transcriptional mechanisms regulating SNCA expression are unknown. Using a large-scale in vitro/in silico screening we identified RNA-binding proteins (RBPs) that interact with SNCA 3' UTRs. We identified two RBPs, ELAVL1 and TIAR, that bind with high affinity to the most abundant and translationally active 3' UTR isoform (575 nt). Knockdown and overexpression experiments indicate that both ELAVL1 and TIAR positively regulate endogenous SNCA in vivo. The mechanism of regulation implies mRNA stabilization as well as enhancement of translation in the case of TIAR. We observed significant alteration of both TIAR and ELAVL1 expression in the motor cortex of post-mortem brain donors and in primary cultured fibroblasts from patients affected by PD and Multiple System Atrophy (MSA). Moreover, trans expression quantitative trait loci (trans-eQTLs) analysis revealed that a group of single nucleotide polymorphisms (SNPs) in the TIAR genomic locus influences SNCA expression in two different brain areas, nucleus accumbens and hippocampus. Our study sheds light on the 3' UTR-mediated regulation of SNCA and its link with PD pathogenesis, thus opening up new avenues for investigation of post-transcriptional mechanisms in neurodegeneration.

    Hydroxymethylation profile of cell-free DNA is a biomarker for early colorectal cancer.

    Early detection of cancer will improve survival rates. The blood biomarker 5-hydroxymethylcytosine has been shown to discriminate cancer. In a large covariate-controlled study of over two thousand individual blood samples, we created, tested and explored the properties of a 5-hydroxymethylcytosine-based classifier to detect colorectal cancer (CRC). In an independent validation sample set, the classifier discriminated CRC samples from controls with an area under the receiver operating characteristic curve (AUC) of 90% (95% CI [87, 93]). Sensitivity was 55% at 95% specificity. Performance was similar for early stage 1 (AUC 89%; 95% CI [83, 94]) and late stage 4 CRC (AUC 94%; 95% CI [89, 98]). The classifier could detect CRC even when the proportion of tumor DNA in blood was undetectable by other methods. Expanding the classifier to include information about cell-free DNA fragment size and abundance across the genome led to gains in sensitivity (63% at 95% specificity), with similar overall performance (AUC 91%; 95% CI [89, 94]). We confirm that 5-hydroxymethylcytosine can be used to detect CRC, even in early-stage disease. Therefore, the inclusion of 5-hydroxymethylcytosine in multianalyte testing could improve sensitivity for the detection of early-stage cancer
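
The abstract above reports sensitivity at a fixed 95% specificity. As a rough illustration of how that kind of figure is derived from classifier scores, the sketch below picks the lowest threshold at which at least the requested fraction of control samples is called negative, then measures the fraction of cases scoring above it. This is not the study's classifier or evaluation code: the function name `sensitivity_at_specificity`, the tie-handling convention (a sample is called positive only if its score is strictly above the threshold), and the toy scores are all assumptions.

```python
def sensitivity_at_specificity(case_scores, control_scores, specificity=0.95):
    """Fraction of cases called positive at a threshold fixed on controls.

    Hypothetical helper for illustration only. Higher scores are assumed
    to indicate disease; a sample is positive if score > threshold.
    """
    if not case_scores or not control_scores:
        raise ValueError("both score lists must be non-empty")

    n_controls = len(control_scores)
    # Try thresholds from low to high; stop at the first one that leaves
    # at least the requested fraction of controls at or below it.
    # The largest control score always satisfies this, so the loop breaks.
    for threshold in sorted(set(control_scores)):
        true_negatives = sum(s <= threshold for s in control_scores)
        if true_negatives / n_controls >= specificity:
            break

    true_positives = sum(s > threshold for s in case_scores)
    return true_positives / len(case_scores)


# Toy scores, invented for this example (not data from the study).
controls = list(range(20))        # 20 control samples scoring 0..19
cases = [10, 18.5, 19, 25]        # 4 case samples
print(sensitivity_at_specificity(cases, controls, specificity=0.95))
```

Fixing the threshold on controls first is what makes sensitivities comparable between classifiers: each is evaluated at the same false-positive rate.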