148 research outputs found
From identification to characterization of protein complexes: development of a bioinformatics analysis platform
One of the challenges of the post-genomic era is to determine protein function and, more precisely, to establish a proteomic map of the cell. The challenge of functional genomics, and of proteomics in particular, is thus to understand the events that take place during protein maturation. Several approaches have been described for understanding protein function, including the study of protein interactions. Traditionally, studies of protein interactions relied on targeted approaches or on hypothesized interactions. Recently, the development of high-throughput analyses has generated an impressive quantity of information. Given this accumulation of data, a purely experimental approach no longer appears sufficient. Consequently, bioinformatics methods that combine data-mining procedures with experimental approaches will make it possible to predict interactors in silico. It is with this in mind that the laboratory developed its research project on the poly(ADP-ribose) polymerase (PARP) family. Poly(ADP-ribosyl)ation is a post-translational modification that consists of adding a chain of ADP-ribose to target proteins. The main objective of our study is to characterize the dynamic role of poly(ADP-ribosyl)ation through immunoprecipitation experiments. PARP interactors will be identified by mass spectrometry. This technique will generate large quantities of data and will require an analysis platform and substantial computing capacity.
In this general context, the objective of this thesis work was to develop the bioinformatics analysis platform, implement the protein identification tools, establish quality control of the identification methods (specificity/sensitivity), and finally explore the content of knowledge bases. Using the system put in place within the proteomics platform, we identified new interactors of the PARP family, such as RFC1, 2, 3, 4 and 5.
An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteome data. One strategy to determine protein function is to identify protein-protein interactions. Rapid advances in mass spectrometry, in combination with other proteomics methods, have led to a growing number of proteomics projects. The increasing use of high-throughput and large-scale bioinformatics-based studies has generated a massive amount of data stored in a number of different databases. A challenge for bioinformatics is to explore this array of information to uncover biologically relevant interactions and pathways. Thus, for protein interaction studies, there is clearly a need to develop a systematic and stepwise in silico approach that can predict potential interactors and improve our understanding of how complex biological systems work. The focus of our laboratory is the study of the activity of poly(ADP-ribose) polymerases (PARPs) and their role in the cell. Poly(ADP-ribosyl)ation is a post-synthetic protein modification consisting of long chains of poly(ADP-ribose) (pADPr) synthesized by PARPs at the expense of NAD+. The overall objective of this research is to extensively characterize the dynamic roles of poly(ADP-ribosyl)ation in response to cellular stresses that cause DNA damage.
Our approach utilizes immunoprecipitation and affinity purification followed by mass spectrometry identification of associated proteins. One part of this thesis project was to develop the architecture and major features of a web-based utility tool designed to rationally organize protein and peptide data generated by tandem mass spectrometry. Next, we performed benchmarking to optimize protein identification. The system will be expanded as needed to make the analysis more efficient. We also explored public database information for protein identification data mining. Using the described pipeline, we have successfully identified several interactions of biological significance between PARP and other proteins such as RFC1, 2, 3, 4 and 5.
PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system
Background: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins, and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools.
Description: We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified.
Conclusion: Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.
Quantitative profiling of the UGT transcriptome in human drug metabolizing tissues
Alternative splicing, as a means to control gene expression and diversify function, is
suspected to considerably influence drug response and clearance. We report the
quantitative expression profiles of the human UGT genes, including alternatively spliced
variants not previously annotated, established by deep RNA-sequencing in tissues of
pharmacological importance. We reveal a comprehensive quantification of the
alternative UGT transcriptome that differs across tissues and among individuals.
Alternative transcripts that comprise novel in-frame sequences, associated or not with
truncations of the 5′ and/or 3′ termini, contribute significantly to the total expression
levels of each UGT1 and UGT2 gene averaging 21% in normal tissues, with expression
of UGT2 variants surpassing those of UGT1. Quantitative data expose preferential
tissue expression patterns and remodelling in favour of alternative variants upon
tumorigenesis. These complex alternative splicing programs have the strong potential to
contribute to interindividual variability in drug metabolism, in addition to diversifying the UGT
proteome.
Generalised Mutual Information: a Framework for Discriminative Clustering
In the last decade, successes in deep clustering have mostly involved
Mutual Information (MI) as an unsupervised objective for training neural
networks with increasing regularisations. While the quality of the
regularisations has been widely discussed for improvements, little attention
has been dedicated to the relevance of MI as a clustering objective. In this
paper, we first highlight how the maximisation of MI does not lead to
satisfying clusters. We identify the Kullback-Leibler divergence as the main
reason for this behaviour. Hence, we generalise the mutual information by
changing its core distance, introducing the Generalised Mutual Information
(GEMINI): a set of metrics for unsupervised neural network training. Unlike MI,
some GEMINIs do not require regularisations when training as they are
geometry-aware thanks to distances or kernels in the data space. Finally, we
highlight that GEMINIs can automatically select a relevant number of clusters,
a property that has been little studied in the deep discriminative clustering
context, where the number of clusters is a priori unknown.
Comment: Submitted for review at the IEEE Transactions on Pattern Analysis and
Machine Intelligence. This article is an extension of an original NeurIPS
2022 article [arXiv:2210.06300]
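To illustrate why MI alone can be a weak clustering objective, here is a minimal sketch. It is not taken from the article, and the function name and toy assignments are assumptions made for illustration. It estimates the mutual information between samples and cluster labels from soft assignments p(y | x_i), using the standard identity I(x; y) = E_x[KL(p(y | x) || p(y))]. The estimate depends only on the assignments, never on where the samples lie in the data space, so two different balanced hard partitions of the same four points receive the same score.

```python
import math


def mutual_information(assignments):
    """Estimate I(x; y) in nats from soft cluster assignments.

    `assignments[i][k]` is p(y = k | x_i); each row must sum to 1.
    The cluster marginal p(y = k) is the average of column k.
    """
    n = len(assignments)
    k = len(assignments[0])
    marginal = [sum(row[j] for row in assignments) / n for j in range(k)]
    total = 0.0
    for row in assignments:
        for j, p in enumerate(row):
            if p > 0.0:  # 0 * log(0) is taken as 0
                total += p * math.log(p / marginal[j])
    return total / n


# Hypothetical toy case: four samples, two clusters.
partition_a = [[1, 0], [1, 0], [0, 1], [0, 1]]  # e.g. groups neighbours together
partition_b = [[1, 0], [0, 1], [1, 0], [0, 1]]  # e.g. splits neighbours apart
uniform = [[0.5, 0.5]] * 4                      # no information about the sample

mi_a = mutual_information(partition_a)
mi_b = mutual_information(partition_b)
mi_uniform = mutual_information(uniform)
```

Both hard partitions reach the maximum of log 2 nats, whereas the uniform assignment scores 0. Nothing in the objective distinguishes a geometrically sensible partition from an arbitrary one, which is consistent with the abstract's motivation for objectives that take distances or kernels in the data space into account.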
Unravelling the transcriptomic landscape of the major phase II UDP-glucuronosyltransferase drug metabolizing pathway using targeted RNA sequencing
A comprehensive view of the human UDP-glucuronosyltransferase (UGT) transcriptome is a
prerequisite to the establishment of an individual's UGT metabolic glucuronidation signature. Here,
we uncover the transcriptome landscape of the ten human UGT loci genes in normal and tumoral
metabolic tissues by targeted RNA next generation sequencing. Alignment on the human hg19
reference genome identifies 234 novel exon-exon junctions. We recover all previously known
UGT1 and UGT2 enzyme-coding transcripts and identify over 130 structurally and functionally
diverse novel UGT variants. We further expose a revised genomic structure of UGT loci and
provide a comprehensive repertoire of transcripts for each UGT gene. Data also uncover a
remodelling of the UGT transcriptome occurring in a tissue- and tumor-specific manner. The
complex alternative splicing program regulating UGT expression and protein functions is likely
critical in determining detoxification capacity of an organ and stress-related responses, with
significant impact on drug responses and diseases.
Keywords: Alternative splicing, transcriptome, glucuronidation, RNA sequencing, drug
metabolism, glucuronosyltransferase (UGT)
Comparative proteome analysis of human epithelial ovarian cancer
Background: Epithelial ovarian cancer is a devastating disease associated with low survival prognosis, mainly because of the lack of early detection markers and the asymptomatic nature of the cancer until late stage. Using two complementary proteomics approaches, differential protein expression profiling was carried out between low and highly transformed epithelial ovarian cancer cell lines which realistically mimic the phenotypic changes observed during evolution of a tumour metastasis. This investigation was aimed at a better understanding of the molecular mechanisms underlying differentiation, proliferation and neoplastic progression of ovarian cancer.
Results: The quantitative profiling of epithelial ovarian cancer model cell lines TOV-81D and TOV-112D, generated using iTRAQ analysis and two-dimensional electrophoresis coupled to liquid chromatography tandem mass spectrometry, revealed some proteins with altered expression levels. Several of these proteins have been the object of interest in cancer research, but others were unrecognized as differentially expressed in the context of ovarian cancer. Among these, a series of proteins involved in transcriptional activity, cellular metabolism, cell adhesion or motility and cytoskeleton organization were identified, suggesting their possible role in the emergence of oncogenic pathways leading to aggressive cellular behavior.
Conclusion: The differential protein expression profile generated by the two proteomics approaches, combined with complementary characterization studies, will open the way to a more exhaustive and systematic representation of the disease and will provide valuable information that may help to uncover the molecular mechanisms related to epithelial ovarian cancer.
Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data
The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML
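The general idea of selecting a biomarker signature by feature selection coupled to a predictive model can be sketched as follows. This is not BioDiscML's code or its algorithms. It is a generic, hypothetical illustration of greedy forward selection scored by leave-one-out accuracy of a nearest-centroid classifier, and all function names and the toy data are assumptions.

```python
def nearest_centroid_predict(train_X, train_y, x, features):
    """Predict the label whose class centroid (on `features`) is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]

    def squared_distance(label):
        return sum((x[f] - c) ** 2 for f, c in zip(features, centroids[label]))

    # sorted() makes tie-breaking between labels deterministic
    return min(sorted(centroids), key=squared_distance)


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of the nearest-centroid rule on `features`."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i], features) == y[i]:
            correct += 1
    return correct / len(X)


def forward_selection(X, y, max_features):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        # Ties on score are broken in favour of the lowest feature index.
        score, neg_f = max((loo_accuracy(X, y, selected + [f]), -f) for f in remaining)
        if score <= best_score:
            break  # no candidate improves the current signature
        selected.append(-neg_f)
        remaining.remove(-neg_f)
        best_score = score
    return selected, best_score


# Hypothetical toy data: only feature 1 separates the two classes.
X = [[0.3, 1.0, 5.1], [0.9, 1.2, 4.8], [0.5, 0.9, 5.0],
     [0.4, 3.1, 4.9], [0.8, 2.9, 5.2], [0.6, 3.0, 5.0]]
y = ["ctrl", "ctrl", "ctrl", "case", "case", "case"]

selected, score = forward_selection(X, y, max_features=2)
```

On this toy data the procedure keeps only feature 1 and stops, because adding a second feature cannot improve an already perfect score. Note that an accuracy computed on the same samples used to choose the features is optimistic; an unbiased estimate would need a held-out set or nested cross-validation.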
Portrait of blood-derived extracellular vesicles in patients with Parkinson's disease.
The production of extracellular vesicles (EV) is a ubiquitous feature of eukaryotic cells, but pathological events can affect their formation and constituents. We sought to characterize the nature, profile and protein signature of EV in the plasma of Parkinson's disease (PD) patients and how they correlate with clinical measures of the disease. EV were initially collected from cohorts of PD patients (n = 60; controls, n = 37) and, for comparison with another chronic neurodegenerative condition, Huntington's disease (HD) patients (pre-manifest, n = 11; manifest, n = 52; controls, n = 55), and were exhaustively analyzed using flow cytometry, electron microscopy and proteomics. We then collected 42 samples from an additional independent cohort of PD patients to confirm our initial results. Through a series of iterative steps, we optimized an approach for defining the EV signature in PD. We found that the number of EV derived specifically from erythrocytes segregated with UPDRS scores corresponding to different disease stages. Proteomic analysis further revealed a specific signature of proteins that could reliably differentiate control subjects from mild and moderate PD patients. Taken together, we have developed/identified an EV blood-based assay that has the potential to be used as a biomarker for PD.