148 research outputs found

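    De l'identification à la caractérisation des complexes protéiques : développement d'une plateforme bioinformatique d'analyse [From identification to characterization of protein complexes: development of a bioinformatics analysis platform]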

    One of the challenges of the post-genomic era is to determine the function of proteins and, more precisely, to establish a proteomic map of the cell. The challenge of functional genomics, and of proteomics in particular, is thus to understand the events that take place during protein maturation. Several approaches have been described for understanding protein function, including the study of protein interactions. Traditionally, studies of protein interactions were based on targeted approaches or on hypothesized interactions. Recently, the development of high-throughput analyses has generated an impressive quantity of information. Given this accumulation of data, a purely experimental approach no longer appears sufficient. Consequently, bioinformatics methods that combine data-mining procedures with experimental approaches will make it possible to predict interactors in silico. It is with this in mind that the laboratory developed its research project on the poly(ADP-ribose) polymerase (PARP) family. Poly(ADP-ribosyl)ation is a post-translational modification that consists of adding a chain of ADP-ribose to target proteins. The main objective of our study is to characterize the dynamic role of poly(ADP-ribosyl)ation through immunoprecipitation experiments. PARP interactors will be identified by mass spectrometry. This technique will generate large quantities of data and will require an analysis platform and substantial computing capacity.
In this general context, the objective of this thesis work was to develop the bioinformatics analysis platform, implement the protein identification tools, establish quality control for the identification methods (specificity/sensitivity) and, finally, explore the content of knowledge bases. Using the system set up within the proteomics platform, we identified new interactors of the PARP family, for example RFC1, 2, 3, 4 and 5.
An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteome data. One strategy to determine protein function is to identify protein–protein interactions. The rapid advances made in mass spectrometry, in combination with other methods used in proteomics, have led to a growing number of proteomics projects. The increasing use of high-throughput and large-scale bioinformatics-based studies has generated a massive amount of data stored in a number of different databases. A challenge for bioinformatics is to explore this array of information to uncover biologically relevant interactions and pathways. Thus, for protein interaction studies, there is clearly a need to develop a systematic and stepwise in silico approach that can predict potential interactors and is most likely to improve our understanding of how complex biological systems work. The focus of our laboratory is the study of the activity of poly(ADP-ribose) polymerases (PARPs) and their role in the cell. Poly(ADP-ribosyl)ation is a post-translational protein modification consisting of long chains of poly(ADP-ribose) (pADPr) synthesized by PARPs at the expense of NAD+. The overall objective of this research is to extensively characterize the dynamic roles of poly(ADP-ribosyl)ation in response to cellular stresses that cause DNA damage.
Our approach uses immunoprecipitation and affinity purification followed by mass spectrometry identification of associated proteins. One part of this thesis project is to develop the architecture and major features of a web-based utility tool designed to rationally organize protein and peptide data generated by tandem mass spectrometry. Next, we performed benchmarking to optimize protein identification. The system will be expanded as needed in order to make the analysis more efficient. We have also explored public database information for protein identification data mining. Using the described pipeline, we have successfully identified several interactions of biological significance between PARP and other proteins such as RFC1, 2, 3, 4 and 5.
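
The abstract mentions benchmarking protein identification in terms of specificity and sensitivity but does not say how these were computed. The following is only a minimal sketch of the standard definitions, assuming each candidate protein has a boolean "identified" call and a boolean ground-truth label; the function name and input layout are hypothetical and not taken from the thesis.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for paired boolean calls.

    predicted[i] is True when the search tool reported protein i;
    truth[i] is True when protein i is really present in the sample.
    Both lists are hypothetical inputs used for illustration only.
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # true positives
    tn = sum(1 for p, t in pairs if not p and not t)  # true negatives
    fp = sum(1 for p, t in pairs if p and not t)      # false positives
    fn = sum(1 for p, t in pairs if not p and t)      # false negatives
    # Guard against empty classes rather than dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Sensitivity is the fraction of truly present proteins that were reported, and specificity is the fraction of absent proteins that were correctly left out; a benchmark would typically compare these two numbers across search engines or score thresholds.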

    PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    Abstract. Background: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description: We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion: Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.

    Date 2014-03-31

    Title Interfaces the tandem protein identification algorithm in

    Quantitative profiling of the UGT transcriptome in human drug metabolizing tissues

    Alternative splicing, as a means to control gene expression and diversify function, is suspected to considerably influence drug response and clearance. We report the quantitative expression profiles of the human UGT genes, including alternatively spliced variants not previously annotated, established by deep RNA sequencing in tissues of pharmacological importance. We reveal a comprehensive quantification of the alternative UGT transcriptome, which differs across tissues and among individuals. Alternative transcripts that comprise novel in-frame sequences, associated or not with truncations of the 5’ and/or 3’ termini, contribute significantly to the total expression levels of each UGT1 and UGT2 gene, averaging 21% in normal tissues, with expression of UGT2 variants surpassing that of UGT1. Quantitative data expose preferential tissue expression patterns and remodelling in favour of alternative variants upon tumorigenesis. These complex alternative splicing programs have strong potential to contribute to interindividual variability in drug metabolism, in addition to diversifying the UGT proteome.

    Generalised Mutual Information: a Framework for Discriminative Clustering

    Over the last decade, successes in deep clustering have largely relied on the Mutual Information (MI) as an unsupervised objective for training neural networks with increasing regularisations. While the quality of the regularisations has been widely discussed as a source of improvement, little attention has been paid to the relevance of MI as a clustering objective. In this paper, we first highlight how maximising MI does not lead to satisfying clusters. We identify the Kullback-Leibler divergence as the main reason for this behaviour. Hence, we generalise the mutual information by changing its core distance, introducing the Generalised Mutual Information (GEMINI): a set of metrics for unsupervised neural network training. Unlike MI, some GEMINIs do not require regularisations when training, as they are geometry-aware thanks to distances or kernels in the data space. Finally, we highlight that GEMINIs can automatically select a relevant number of clusters, a property that has been little studied in the deep discriminative clustering context, where the number of clusters is a priori unknown. Comment: Submitted for review at the IEEE Transactions on Pattern Analysis and Machine Intelligence. This article is an extension of an original NeurIPS 2022 article [arXiv:2210.06300]
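
The abstract does not spell out how MI is estimated. As an illustration only, the sketch below uses the standard identity I(X;Y) = H(Y) - H(Y|X), estimated from the soft cluster assignments p(y|x_i) that a discriminative model would output. The function names and the list-of-rows input format are assumptions made for this example, not the paper's code, and no GEMINI variant is implemented here.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def clustering_mutual_information(assignments):
    """Estimate I(X;Y) from rows of soft assignments p(y | x_i).

    assignments is a hypothetical input: one row per sample, each row a
    probability vector over K clusters.
    """
    n = len(assignments)
    k = len(assignments[0])
    # Marginal cluster proportions p(y), averaged over the samples.
    marginal = [sum(row[j] for row in assignments) / n for j in range(k)]
    # Average per-sample entropy approximates H(Y | X).
    conditional = sum(entropy(row) for row in assignments) / n
    return entropy(marginal) - conditional
```

With two samples assigned confidently to two different clusters, the estimate reaches its maximum of log 2; with uniform assignments it is 0. Note that this quantity depends only on the predicted assignments and not on where the samples lie in data space, which is consistent with the abstract's remark that some GEMINIs gain geometry-awareness from distances or kernels in the data space.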

    Unravelling the transcriptomic landscape of the major phase II UDP-glucuronosyltransferase drug metabolizing pathway using targeted RNA sequencing

    A comprehensive view of the human UDP-glucuronosyltransferase (UGT) transcriptome is a prerequisite to the establishment of an individual’s UGT metabolic glucuronidation signature. Here, we uncover the transcriptome landscape of the ten human UGT gene loci in normal and tumoral metabolic tissues by targeted RNA next-generation sequencing. Alignment on the human hg19 reference genome identifies 234 novel exon-exon junctions. We recover all previously known UGT1 and UGT2 enzyme-coding transcripts and identify over 130 structurally and functionally diverse novel UGT variants. We further expose a revised genomic structure of UGT loci and provide a comprehensive repertoire of transcripts for each UGT gene. Data also uncover a remodelling of the UGT transcriptome occurring in a tissue- and tumor-specific manner. The complex alternative splicing program regulating UGT expression and protein functions is likely critical in determining the detoxification capacity of an organ and stress-related responses, with significant impact on drug responses and diseases. Keywords: Alternative splicing, transcriptome, glucuronidation, RNA sequencing, drug metabolism, glucuronosyltransferase (UGT)

    Comparative proteome analysis of human epithelial ovarian cancer

    Abstract. Background: Epithelial ovarian cancer is a devastating disease associated with a poor survival prognosis, mainly because of the lack of early detection markers and the asymptomatic nature of the cancer until late stage. Using two complementary proteomics approaches, differential protein expression profiling was carried out between low and highly transformed epithelial ovarian cancer cell lines, which realistically mimic the phenotypic changes observed during evolution of a tumour metastasis. This investigation was aimed at a better understanding of the molecular mechanisms underlying differentiation, proliferation and neoplastic progression of ovarian cancer. Results: The quantitative profiling of epithelial ovarian cancer model cell lines TOV-81D and TOV-112D, generated using iTRAQ analysis and two-dimensional electrophoresis coupled to liquid chromatography tandem mass spectrometry, revealed some proteins with altered expression levels. Several of these proteins have been the object of interest in cancer research, but others had not been recognized as differentially expressed in the context of ovarian cancer. Among these, a series of proteins involved in transcriptional activity, cellular metabolism, cell adhesion or motility and cytoskeleton organization were identified, suggesting their possible role in the emergence of oncogenic pathways leading to aggressive cellular behavior. Conclusion: The differential protein expression profile generated by the two proteomics approaches, combined with complementary characterization studies, will open the way to a more exhaustive and systematic representation of the disease and will provide valuable information that may help to uncover the molecular mechanisms related to epithelial ovarian cancer.

    Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data

    The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML

    Portrait of blood-derived extracellular vesicles in patients with Parkinson's disease.

    The production of extracellular vesicles (EV) is a ubiquitous feature of eukaryotic cells, but pathological events can affect their formation and constituents. We sought to characterize the nature, profile and protein signature of EV in the plasma of Parkinson's disease (PD) patients and how they correlate with clinical measures of the disease. EV were initially collected from cohorts of PD (n = 60; Controls, n = 37) and Huntington's disease (HD) patients (Pre-manifest, n = 11; manifest, n = 52; Controls, n = 55), the latter for comparative purposes in individuals with another chronic neurodegenerative condition, and exhaustively analyzed using flow cytometry, electron microscopy and proteomics. We then collected 42 samples from an additional independent cohort of PD patients to confirm our initial results. Through a series of iterative steps, we optimized an approach for defining the EV signature in PD. We found that the number of EV derived specifically from erythrocytes segregated with UPDRS scores corresponding to different disease stages. Proteomic analysis further revealed a specific signature of proteins that could reliably differentiate control subjects from mild and moderate PD patients. Taken together, we have developed/identified an EV blood-based assay that has the potential to be used as a biomarker for PD.