Computational analysis of a plant receptor interaction network
Master's thesis in Bioinformatics and Computational Biology
In all organisms, complex protein-protein interaction (PPI) networks control major
biological functions, yet studying their structural features presents a major analytical
challenge. In plants, leucine-rich-repeat receptor kinases (LRR-RKs) are key in sensing
and transmitting non-self as well as self-signals from the cell surface. As such, LRR-RKs
have both developmental and immune functions that allow plants to make the most of their
environments. In Arabidopsis thaliana, the model organism of plant molecular biology,
most LRR-RKs remain biochemically and genetically uncharacterized.
To address this, an LRR-based Cell Surface Interaction (CSI LRR) network was
obtained in 2018: a protein-protein interaction network of the extracellular domains of 170
LRR-RKs that contains 567 bidirectional interactions. Several network analyses have been
performed with CSI LRR. However, these analyses have so far not considered the spatial and
temporal expression of its proteins, nor has the role of extracellular domain (ECD) size
in the network structure been characterized in detail. The objective of the present work
is therefore to carry out more in-depth analyses of the CSI LRR network, which would
provide insights that facilitate the functional characterization of LRR-RKs.
The first aim of this work is to test the fit of the CSI LRR network to a scale-free
topology. To that end, the degree distribution of the CSI LRR network was compared
with those of two reference network models, scale-free and random.
Additionally, three network attack algorithms were implemented and applied to these two
network models and the CSI LRR network to compare their behavior. However, since the
CSI LRR interaction data come from an in vitro screening, there is no direct evidence
that its protein-protein interactions occur inside plant cells. To gain insight into how
the network composition changes depending on transcriptional regulation, the
interaction data of the CSI LRR were integrated with four RNA-Seq datasets related
to the biological functions of the network. A Python script was written to automate this task.
Furthermore, the role of the LRR-RKs in the network structure was evaluated depending
on the size of their extracellular domain (large or small). For that, centrality parameters
were measured and size-targeted attacks performed. Finally, gene regulatory information
was integrated into the CSI LRR to classify the network proteins according to the
function of the transcription factors that regulate their expression.
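As an illustration of the network attack comparison described above, the following minimal sketch removes nodes either in random order or in hub-first order and tracks the size of the largest connected component. It is not the thesis's implementation: the toy graph, the function names, and the choice of a static degree ranking are all assumptions made for this example, and only two attack strategies are shown.

```python
import random
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def attack(adj, order):
    """Remove nodes in the given order; record the largest component after each removal."""
    removed = set()
    sizes = []
    for node in order:
        removed.add(node)
        sizes.append(largest_component_size(adj, removed))
    return sizes


def hub_order(adj):
    """Hub-directed order: highest initial degree first (ties broken by name)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))


def random_order(adj, seed=0):
    """Random attack order, seeded so that runs are reproducible."""
    order = sorted(adj)
    random.Random(seed).shuffle(order)
    return order


# Hypothetical toy graph: one hub "H" linked to five nodes, plus one extra edge.
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("H", "E"), ("A", "B")]
toy_adj = {}
for u, v in toy_edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)

hub_sizes = attack(toy_adj, hub_order(toy_adj)[:1])        # remove the top hub
random_sizes = attack(toy_adj, random_order(toy_adj)[:1])  # remove one random node
```

In this toy graph, removing the single hub fragments the network, whereas removing a peripheral node leaves it almost intact. This is the qualitative contrast expected between hub-directed and random attacks on a scale-free-like network.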
The results show that CSI LRR fits a power-law degree distribution and approximates a
scale-free topology. Moreover, CSI LRR displays high resistance to random attacks and reduced
resistance to hub/bottleneck-directed attacks, similarly to the scale-free network model. Also,
the integration of CSI LRR interaction data and RNA-Seq data suggests that the
transcriptional regulation of the network is more relevant for developmental programs than
for defense responses. In addition, LRR-RKs with a small ECD have a
major role in maintaining the integrity of the CSI LRR. Lastly, it is hypothesized that
integrating CSI LRR interaction data with predicted gene regulatory networks could shed
light on the functioning of growth-immunity signaling crosstalk.
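The integration of interaction data with RNA-Seq data mentioned in this abstract could be sketched as keeping only those interactions whose two partners are both expressed in a given condition. The snippet below is a hedged illustration, not the thesis's script: the gene names, the expression values, and the threshold of 1.0 are hypothetical.

```python
def expressed_subnetwork(edges, expression, threshold=1.0):
    """Keep interactions whose two partners are both expressed at or above the threshold."""
    return [
        (a, b)
        for a, b in edges
        if expression.get(a, 0.0) >= threshold and expression.get(b, 0.0) >= threshold
    ]


# Hypothetical receptor names and expression values for one condition.
edges = [("RK1", "RK2"), ("RK2", "RK3"), ("RK1", "RK3")]
root_expression = {"RK1": 12.4, "RK2": 3.1, "RK3": 0.2}

root_edges = expressed_subnetwork(edges, root_expression)
```

Running the same filter with one expression table per tissue or treatment gives condition-specific subnetworks whose size and composition can then be compared.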
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
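To make one of the listed challenges concrete, class imbalance is often countered by weighting each class inversely to its frequency. The sketch below is only an illustration and is not taken from the review: the labels are hypothetical, and the formula n_samples / (n_classes * class_count) is the common "balanced" heuristic.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical cohort: 90 controls and 10 cases.
labels = ["control"] * 90 + ["case"] * 10
weights = inverse_frequency_weights(labels)
```

The minority class receives the larger weight, so errors on rare samples contribute more to a weighted loss.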
Epigenetic characterization of human hepatocyte subpopulations in context of complex metabolic diseases and during in vitro differentiation of hepatocyte-like cells
The comprehensive transcriptional and epigenetic characterization of human hepatocyte subpopulations is necessary to achieve a better understanding of regulatory processes in health and complex metabolic diseases as well as during in vitro differentiation. Based on integrative analysis of genome-wide sequencing data, this thesis aims to unravel hepatocyte heterogeneity in different biological contexts. A deeper understanding of the spatial organization of cells in human tissues is an important challenge. Using a unique experimental set-up based on laser capture microdissection coupled to next-generation sequencing, which preserves spatial orientation and still provides genome-wide data of well-defined subpopulations, the first combined spatial analysis of transcriptomes and methylomes across three micro-dissected zones of human liver provides a wealth of new positional insights, both in health and in the context of fatty liver disease. In addition, these spatial maps serve as a reference for the projection of single-cell data into hepatic pseudospace, which is still a major challenge. Hence, a novel pseudospace inference approach, which considerably improves spatial reconstruction of single cells into tissue context, is demonstrated for human liver. Finally, the identification of underlying regulatory networks by integrative epigenomic analysis of in vitro differentiated hepatocyte-like cells contributes to the development of reasonable cell culture interventions to improve differentiation.
Regulation of mitochondrial biogenesis in erythropoiesis by mTORC1-mediated protein translation.
Advances in genomic profiling present new challenges of explaining how changes in DNA and RNA are translated into proteins linking genotype to phenotype. Here we compare the genome-scale proteomic and transcriptomic changes in human primary haematopoietic stem/progenitor cells and erythroid progenitors, and uncover pathways related to mitochondrial biogenesis enhanced through post-transcriptional regulation. Mitochondrial factors including TFAM and PHB2 are selectively regulated through protein translation during erythroid specification. Depletion of TFAM in erythroid cells alters intracellular metabolism, leading to elevated histone acetylation, deregulated gene expression, and defective mitochondria and erythropoiesis. Mechanistically, mTORC1 signalling is enhanced to promote translation of mitochondria-associated transcripts through TOP-like motifs. Genetic and pharmacological perturbation of mitochondria or mTORC1 specifically impairs erythropoiesis in vitro and in vivo. Our studies support a mechanism for post-transcriptional control of erythroid mitochondria and may have direct relevance to haematologic defects associated with mitochondrial diseases and ageing
Transcriptome profiling of grapevine seedless segregants during berry development reveals candidate genes associated with berry weight
Indexed in: Web of Science; PubMed
Background
Berry size is considered one of the main selection criteria in table grape breeding programs. However, it is a quantitative and polygenic trait, and its genetic determination is still poorly understood. Considering its economic importance, it is relevant to determine its genetic architecture and elucidate the mechanisms involved in its expression. To approach this issue, an RNA-Seq experiment based on the Illumina platform was performed (14 libraries), including seedless segregants with contrasting phenotypes for berry weight at the fruit setting (FST) and 6–8 mm berries (B68) phenological stages.
Results
A group of 526 differentially expressed (DE) genes were identified, by comparing seedless segregants with contrasting phenotypes for berry weight: 101 genes from the FST stage and 463 from the B68 stage. Also, we integrated differential expression, principal components analysis (PCA), correlations and network co-expression analyses to characterize the transcriptome profiling observed in segregants with contrasting phenotypes for berry weight. After this, 68 DE genes were selected as candidate genes, and seven candidate genes were validated by real time-PCR, confirming their expression profiles.
Conclusions
We have carried out the first transcriptome analysis focused on table grape seedless segregants with contrasting phenotypes for berry weight. Our findings contributed to the understanding of the mechanisms involved in berry weight determination. Also, this comparative transcriptome profiling revealed candidate genes for berry weight which could be evaluated as selection tools in table grape breeding programs.
http://bmcplantbiol.biomedcentral.com/articles/10.1186/s12870-016-0789-
Overexpression of a Prefoldin β subunit gene reduces biomass recalcitrance in the bioenergy crop Populus.
Prefoldin (PFD) is a group II chaperonin that is ubiquitously present in the eukaryotic kingdom. Six subunits (PFD1-6) form a jellyfish-like heterohexameric PFD complex and function in protein folding and cytoskeleton organization. However, little is known about its function in plant cell wall-related processes. Here, we report the functional characterization of a PFD gene from Populus deltoides, designated as PdPFD2.2. There are two copies of PFD2 in Populus, and PdPFD2.2 was ubiquitously expressed with high transcript abundance in the cambial region. PdPFD2.2 can physically interact with DELLA protein RGA1_8g, and its subcellular localization is affected by the interaction. In P. deltoides transgenic plants overexpressing PdPFD2.2, the lignin syringyl/guaiacyl ratio was increased, but cellulose content and crystallinity index were unchanged. In addition, the total released sugar (glucose and xylose) amounts were increased by 7.6% and 6.1%, respectively, in two transgenic lines. Transcriptomic and metabolomic analyses revealed that secondary metabolic pathways, including lignin and flavonoid biosynthesis, were affected by overexpressing PdPFD2.2. A total of eight hub transcription factors (TFs) were identified based on TF binding sites of differentially expressed genes in Populus transgenic plants overexpressing PdPFD2.2. In addition, several known cell wall-related TFs, such as MYB3, MYB4, MYB7, TT8 and XND1, were affected by overexpression of PdPFD2.2. These results suggest that overexpression of PdPFD2.2 can reduce biomass recalcitrance and PdPFD2.2 is a promising target for genetic engineering to improve feedstock characteristics to enhance biofuel conversion and reduce the cost of lignocellulosic biofuel production
A model validation pipeline for healthy tissue genome-scale metabolic models
Master's dissertation in Bioinformatics
In the past few years, high-throughput experimental methods have made omics data available for several
layers of biological organization, enabling the integration of knowledge from individual components into
complex models such as genome-scale metabolic models (GSMMs). These can be analysed by constraint-based modelling (CBM) methods, which facilitate in silico predictive approaches.
Human metabolic models have been used to study healthy human tissues and their associated
metabolic diseases, such as obesity, diabetes, and cancer. Generic human models can be integrated with
contextual data through reconstruction algorithms to produce context-specific models (CSMs), which are
typically better at capturing the variation between different tissues and cell types. As the human body
contains a multitude of tissues and cell types, CSMs are frequently adopted as a means to obtain accurate
metabolic models of healthy human tissues.
However, unlike models of microorganisms or cancer, which allow several methods of validation
such as the comparison of in silico fluxes or gene essentiality predictions to experimental data, the
validation methods easily applicable to CSMs of healthy human tissue are more limited. Consequently,
despite continued efforts to update generic human models and reconstruction algorithms to extract
high-quality CSMs, their validation remains a concern.
This work presents a pipeline for the extraction and basic validation of CSMs of normal human
tissues derived from the integration of transcriptomics data with a generic human model. All CSMs were
extracted from the Human-GEM generic model recently published by Robinson et al. (2020), using
the open-source Troppo Python package and the fastCORE and tINIT reconstruction algorithms
implemented therein. CSMs were extracted for 11 healthy tissues available in the GTEx v8 dataset.
Prior to extraction, machine learning methods were applied to select the threshold used for
conversion into gene scores. The highest-quality models were obtained with a global threshold applied
directly to the omics data. The CSM validation strategy focused on the total number of metabolic tasks
passed as a performance indicator. Lastly, this work is accompanied by Jupyter Notebooks, which include a
beginner-friendly model extraction guide.
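The two ideas in the last paragraph, converting expression values into gene scores with a single global threshold and ranking models by the number of metabolic tasks passed, can be sketched as follows. This does not use the Troppo API and is not the dissertation's pipeline: the function names, the scoring rule (expression minus threshold), the gene names, and the task outcomes are all hypothetical.

```python
def gene_scores(expression, global_threshold):
    """Score each gene as expression minus one global threshold (positive = above threshold)."""
    return {gene: value - global_threshold for gene, value in expression.items()}


def tasks_passed(task_results):
    """Count the metabolic tasks that a model passed."""
    return sum(1 for passed in task_results.values() if passed)


def best_model(results_by_model):
    """Return the model name with the most passed tasks (ties go to the first name alphabetically)."""
    return max(sorted(results_by_model), key=lambda name: tasks_passed(results_by_model[name]))


# Hypothetical expression values for one tissue and a hypothetical threshold.
liver_expression = {"G1": 7.5, "G2": 1.0, "G3": 5.0}
scores = gene_scores(liver_expression, global_threshold=5.0)

# Hypothetical task outcomes for two candidate models of the same tissue.
results = {
    "fastcore_model": {"task_a": True, "task_b": False, "task_c": True},
    "tinit_model": {"task_a": True, "task_b": True, "task_c": True},
}
chosen = best_model(results)
```

Using the count of passed tasks as a single indicator makes models comparable across algorithms and thresholds, at the cost of treating every task as equally important.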