24 research outputs found
Evolution of spatiotemporal organization of biological systems : origins and phenotypic impact of duplicated genes
A flood of genome sequences, together with further large-scale studies characterizing molecular functions, has allowed researchers to carry out comparative studies of functional components and their interactions across a large number of species. The insights gained can be examined further and, by means of orthology (homology derived from speciation), transferred to newly sequenced species to learn about the evolution of molecular functions and their organization. Robust orthology is a prerequisite for accurate phylogenomic and comparative analyses. Although the field of orthology has made progress, orthology prediction is still contradictory and uncertain. For this reason, quality-control tests should be introduced. In this work, a phylogeny-based dataset was developed with which orthology prediction in the Animalia can be assessed. This dataset was used to evaluate the orthology predictions of five publicly accessible repositories and to examine the effects of a number of technical and biological factors. At the same time, the large number of completely sequenced genomes has led to the formulation of interesting hypotheses about the mechanisms by which molecular functions evolve. For example, paralogs (homologs that arose through gene or genome duplication) have been associated with the expansion and partitioning of molecular functions. Numerous studies have been carried out to find out how duplicated genes associated with morphological changes alter their gene expression levels in different tissues. However, it has not yet been investigated on a large scale whether the regulatory divergence of paralogs favors particular patterns and how these patterns arose.
To investigate this, expression data from 31 human tissues were used and preferred tissue combinations of sub(neo)functionalized paralogs were identified. Interestingly, paralogs that are associated with the chordate-to-vertebrate transition and were already present before the ancestral vertebrate frequently diverge between brain and non-brain tissues. In contrast to the extensive literature on the evolution of tissues and paralogy, the role of gene duplication in the temporal regulation of biological systems is less well studied. To investigate this, orthology and gene expression data were combined. We found that the cell cycle and other periodic processes (such as circadian and ultradian rhythms) are regulated by paralogs. The functional repertoire of these paralogs differs among three eukaryotic species (Arabidopsis, human and yeast), which implies that the temporal regulation of cells by paralogs evolved independently in the three organisms. In summary, the greatest challenge of the post-genomic era is the effective integration of functionally relevant genomic data to find out how complex traits evolved. To reach this goal, the dynamic changes in gene inventories should be studied with regard to the relationships of orthologs (common origin) and paralogs (potential for divergence).
Cell population structure prior to bifurcation predicts efficiency of directed differentiation in human induced pluripotent cells.
Steering the differentiation of induced pluripotent stem cells (iPSCs) toward specific cell types is crucial for patient-specific disease modeling and drug testing. This effort requires the capacity to predict and control when and how multipotent progenitor cells commit to the desired cell fate. Cell fate commitment represents a critical state transition or tipping point at which complex systems undergo a sudden qualitative shift. To characterize such transitions during iPSC to cardiomyocyte differentiation, we analyzed the gene expression patterns of 96 developmental genes at single-cell resolution. We identified a bifurcation event early in the trajectory when a primitive streak-like cell population segregated into the mesodermal and endodermal lineages. Before this branching point, we could detect the signature of an imminent critical transition: increase in cell heterogeneity and coordination of gene expression. Correlation analysis of gene expression profiles at the tipping point indicates transcription factors that drive the state transition toward each alternative cell fate and their relationships with specific phenotypic readouts. The latter helps us to facilitate small molecule screening for differentiation efficiency. To this end, we set up an analysis of cell population structure at the tipping point after systematic variation of the protocol to bias the differentiation toward mesodermal or endodermal cell lineage. We were able to predict the proportion of cardiomyocytes many days before cells manifest the differentiated phenotype. The analysis of cell populations undergoing a critical state transition thus affords a tool to forecast cell fate outcomes and can be used to optimize differentiation protocols to obtain desired cell populations
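The two warning signals named in this abstract, rising cell-to-cell heterogeneity and rising coordination of gene expression, can be illustrated with a minimal Python sketch. The abstract does not state which statistics were used, so the choices here are assumptions: heterogeneity is taken as the mean per-gene variance across cells, and coordination as the mean absolute Pearson correlation between gene pairs. The function names and the toy expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def transition_signals(cells):
    """Return (heterogeneity, coordination) for a list of per-cell expression vectors.

    Assumed definitions (not taken from the abstract):
      heterogeneity = mean variance of each gene across cells
      coordination  = mean |Pearson r| over all gene pairs
    """
    genes = list(zip(*cells))  # transpose: one tuple of values per gene
    heterogeneity = mean(pvariance(g) for g in genes)
    coordination = mean(abs(pearson(a, b)) for a, b in combinations(genes, 2))
    return heterogeneity, coordination


# Hypothetical toy data: rows are cells, columns are three genes.
early = [[1.0, 2.0, 3.0], [1.1, 2.0, 2.9], [1.0, 2.1, 3.0], [0.9, 1.9, 3.1]]
near_tipping = [[0.2, 0.5, 4.0], [2.0, 3.9, 1.0], [1.0, 2.1, 3.0], [0.5, 1.0, 3.6]]

het_early, coord_early = transition_signals(early)
het_near, coord_near = transition_signals(near_tipping)
print(het_early, coord_early)
print(het_near, coord_near)
```

In this toy example both quantities are larger for the population sampled near the tipping point, which is the pattern the abstract describes before the bifurcation.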
eggNOG v4.0: nested orthology inference across 3686 organisms
With the increasing availability of various 'omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download
eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges
Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses. The third version of the eggNOG database (http://eggnog.embl.de) contains non-supervised orthologous groups constructed from 1133 organisms, doubling the number of genes with orthology assignment compared to eggNOG v2. The new release is the result of a number of improvements and expansions: (i) the underlying homology searches are now based on the SIMAP database; (ii) the orthologous groups have been extended to 41 levels of selected taxonomic ranges enabling much more fine-grained orthology assignments; and (iii) the newly designed web page is considerably faster with more functionality. In total, eggNOG v3 contains 721 801 orthologous groups, encompassing a total of 4 396 591 genes. Additionally, we updated 4873 and 4850 original COGs and KOGs, respectively, to include all 1133 organisms. At the universal level, covering all three domains of life, 101 208 orthologous groups are available, while the others are applicable at 40 more limited taxonomic ranges. Each group is amended by multiple sequence alignments and maximum-likelihood trees and broad functional descriptions are provided for 450 904 orthologous groups (62.5%)
Evolution and regulation of cellular periodic processes: a role for paralogues
Several cyclic processes take place within a single organism. For example, the cell cycle is coordinated with the 24 h diurnal rhythm in animals and plants, and with the 40 min ultradian rhythm in budding yeast. To examine the evolution of periodic gene expression during these processes, we performed the first systematic comparison in three organisms (Homo sapiens, Arabidopsis thaliana and Saccharomyces cerevisiae) by using public microarray data. We observed that although diurnal-regulated and ultradian-regulated genes are not generally cell-cycle-regulated, they tend to have cell-cycle-regulated paralogues. Thus, diverged temporal expression of paralogues seems to facilitate cellular orchestration under different periodic stimuli. Lineage-specific functional repertoires of periodic-associated paralogues imply that this mode of regulation might have evolved independently in several organisms
Gene socialization: gene order, GC content and gene silencing in Salmonella
BACKGROUND: Genes of conserved order in bacterial genomes tend to evolve slower than genes whose order is not conserved. In addition, genes with a GC content lower than the GC content of the resident genome are known to be selectively silenced by the histone-like nucleoid structuring protein (H-NS) in Salmonella. RESULTS: In this study, we use a comparative genomics approach to demonstrate that in Salmonella, genes whose order is not conserved (or genes without homologs) in closely related bacteria possess a significantly lower average GC content in comparison to genes that preserve their relative position in the genome. Moreover, these genes are more frequently targeted by H-NS than genes that have conserved their genomic neighborhood. We also observed that duplicated genes that do not preserve their genomic neighborhood are, on average, under less selective pressure. CONCLUSIONS: We establish a strong association between gene order, GC content and gene silencing in a model bacterial species. This analysis suggests that genes that are not under strong selective pressure (evolve faster than others) in Salmonella tend to accumulate more AT-rich mutations and are eventually silenced by H-NS. Our findings may establish new approaches for a better understanding of bacterial genome evolution and function, using information from functional and comparative genomics
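The comparison at the heart of this abstract, average GC content of genes with conserved versus non-conserved order, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences below are made-up toy strings, not Salmonella genes, and the function name `gc_content` and the two group labels are hypothetical. The abstract does not describe the authors' actual pipeline or statistical test.

```python
from statistics import mean


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


# Hypothetical toy sequences standing in for the two gene classes.
conserved_order = ["ATGGCGCCGTAA", "ATGCCGGGCTGA"]
non_conserved_order = ["ATGAAATTTTAA", "ATGATTAAATAG"]

mean_gc_conserved = mean(gc_content(s) for s in conserved_order)
mean_gc_non_conserved = mean(gc_content(s) for s in non_conserved_order)

print(f"conserved order:     {mean_gc_conserved:.3f}")
print(f"non-conserved order: {mean_gc_non_conserved:.3f}")
```

A real analysis would run this over every gene in the genome and apply a significance test to the difference between the two groups; the sketch shows only the per-gene measurement and the group averages.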
Taking Systems Medicine to Heart.
Systems medicine is a holistic approach to deciphering the complexity of human physiology in health and disease. In essence, a living body is constituted of networks of dynamically interacting units (molecules, cells, organs, etc) that underlie its collective functions. Declining resilience because of aging and other chronic environmental exposures drives the system to transition from a health state to a disease state; these transitions, triggered by acute perturbations or chronic disturbance, manifest as qualitative shifts in the interactions and dynamics of the disease-perturbed networks. Understanding health-to-disease transitions poses a high-dimensional nonlinear reconstruction problem that requires deep understanding of biology and innovation in study design, technology, and data analysis. With a focus on the principles of systems medicine, this Review discusses approaches for deciphering this biological complexity from a novel perspective, namely, understanding how disease-perturbed networks function; their study provides insights into fundamental disease mechanisms. The immediate goals for systems medicine are to identify early transitions to cardiovascular (and other chronic) diseases and to accelerate the translation of new preventive, diagnostic, or therapeutic targets into clinical practice, a critical step in the development of personalized, predictive, preventive, and participatory (P4) medicine