18 research outputs found

    Evolution of gene regulation of pluripotency - the case for wiki tracks at genome browsers

    Background: Experimentally validated data on gene regulation are hard to obtain. In particular, information about transcription factor binding sites in regulatory regions is scattered throughout the literature. This impedes their systematic in-context analysis, e.g. the inference of their conservation in evolutionary history. Results: We demonstrate the power of integrative bioinformatics by including curated transcription factor binding site information into the UCSC genome browser, using wiki and custom tracks, which enable easy publication of annotation data. Data integration allows us to investigate the evolution of gene regulation of the pluripotency-associated genes Oct4, Sox2 and Nanog. For the first time, experimentally validated transcription factor binding sites in the regulatory regions of all three genes were assembled together based on manual curation of data from 39 publications. Using the UCSC genome browser, these data were then visualized in the context of multi-species conservation based on genomic alignment. We confirm previous hypotheses regarding the evolutionary age of specific regulatory patterns, establishing their "deep homology". We also confirm some other principles of Carroll's "Genetic theory of Morphological Evolution", such as "mosaic pleiotropy", exemplified by the dual role of Sox2 reflected in its regulatory region. Conclusions: We were able to elucidate some aspects of the evolution of gene regulation for three genes associated with pluripotency. Based on the expected return on investment for the community, we encourage other scientists to contribute experimental data on gene regulation (original work as well as data collected for reviews) to the UCSC system, to enable studies of the evolution of gene regulation on a large scale, and to report their findings. Reviewers: This article was reviewed by Dr. Gustavo Glusman and Dr. Juan Caballero, Institute for Systems Biology, Seattle, USA (nominated by Dr. Doron Lancet, Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel), Dr. Niels Grabe, TIGA Center (BIOQUANT) and Medical Systems Biology Group, Institute of Medical Biometry and Informatics, University Hospital Heidelberg, Germany (nominated by Dr. Mikhail Gelfand, Department of Bioinformatics, Institute of Information Transfer Problems, Russian Academy of Science, Moscow, Russian Federation) and Dr. Franz-Josef Müller, Center for Regenerative Medicine, The Scripps Research Institute, La Jolla, CA, USA and University Hospital for Psychiatry and Psychotherapy (part of ZIP gGmbH), University of Kiel, Germany (nominated by Dr. Trey Ideker, University of California, San Diego, La Jolla CA, United States).

    Exploration Methods to Investigate the Evolution of Transcriptional Regulation in the Eukaryotic Genome

    Background: Computational tools are developed for investigating transcriptional regulation, in particular transcription factor binding sites (TFBS), in an evolutionary context. Existing sequence-based tools for predicting such binding sites do not consider their actual functionality, although it is known that, besides the base sequence, many other aspects are relevant for binding and for the effects of that binding. In eukaryotes in particular, a perfectly matching sequence motif is neither necessary nor sufficient for a functional transcription factor binding site. Published work in the field of transcriptional regulation frequently focuses on the prediction of putative transcription factor binding sites based on sequence similarity to known binding sites. Furthermore, among the related software, only a small number of tools implement visualization of the evolution of transcription factor binding sites or the integration of other regulation-related data. The interface of many tools is made for computer scientists, although the actual interpretation of their outcome requires profound biological background knowledge. Results and Discussion: The tool presented in this thesis, "ReXSpecies", is a web application. It is therefore ready to use for the end user without installation and provides a graphical user interface. Besides extensive automation of analyses of transcriptional regulation (the only required input is the genomic coordinates of a regulatory region), new techniques to visualize the evolution of transcription factor binding sites were developed. Furthermore, an interface to genome browsers was implemented to enable scientists to comprehensively analyze their regulatory regions with respect to other regulation-relevant data. ReXSpecies contains a novel algorithm that searches for evolutionarily conserved patterns of transcription factor binding sites, which could imply functionality.
    Such patterns were verified using some known transcription factor binding sites of genes involved in pluripotency. In the appendix, the efficiency and correctness of the algorithm used are discussed. Furthermore, a novel algorithm to color phylogenetic trees intuitively is presented. In the thesis, new possibilities to render evolutionarily conserved sets of transcription factor binding sites are developed. The thesis also discusses the evolutionary conservation of regulation and its context dependency. An important source of errors in the analysis of regulatory regions using comparative genetics is probably the finding and aligning of homologous regulatory regions. Some alternatives to using sequence similarity alone are discussed. Outlook: Other possibilities to find (functionally) homologous regulatory regions (besides the whole-genome alignments currently used) are BLAST searches, local alignments, homology databases and alignment-free approaches. Using one or more of these alternatives could reduce the number of artifacts by reducing the number of regions that are erroneously declared homologous. To achieve more robust predictions of transcription, the author suggests using other regulation-related data besides sequence data alone. Therefore, the use and extension of existing tools, in particular of systems biology, is proposed.
    German abstract (translated): Background: This thesis deals with the development of computational tools for investigating transcriptional regulation, in particular transcription factor binding sites (TFBS), in the context of their evolution. Existing (sequence-based) methods for predicting the location of such binding sites have the drawback of neglecting the functionality of these sites, although it is known that, besides the nucleotide sequence, many other factors influence the binding of transcription factors and the effect of that binding. In eukaryotes in particular, a perfectly matching sequence is neither a necessary nor a sufficient criterion for the presence of a functional transcription factor binding site. Previous studies on the topic often concentrate on finding potential binding sites based on their sequence similarity to already known binding sites. Furthermore, only a few of the existing tools so far allow the evolution of transcription factor binding sites to be visualized and examined in the context of results on other regulation-relevant influences. The tools are generally designed to be operated by computer scientists only, although the actual evaluation requires sound biological background knowledge. Results and Discussion: The tool "ReXSpecies", developed in the course of my work, is available to end users as a web application with a graphical user interface and without the installation of additional software. Besides extensive automation of such analyses (the only required input is the coordinates of the regulatory region to be examined), it offers new techniques for visualizing the evolution of transcription factor binding sites and an interface to genome browsers, enabling a comprehensive analysis of a given regulatory region that takes other sources of evidence into account. In particular, ReXSpecies contains a novel algorithm that searches for evolutionarily conserved patterns of transcription factor binding sites. Such patterns can indicate a functional significance of the binding sites involved and may thus reveal important binding sites. Using the regulatory regions of pluripotency-related genes as an example, some of the known binding sites that are generally regarded as established were recognized as relevant in terms of their evolutionary conservation. In addition, the appendix addresses the efficiency and correctness of the pattern-search algorithm used. The appendix also presents a novel algorithm for intuitively coloring the nodes of phylogenetic trees. In this context, novel ways of displaying such patterns in the context of their evolution are presented. The thesis discusses that the phylogeny of transcription factor binding sites follows different rules than that of other intergenic DNA; transcription factor binding sites are features of a "higher order" than nucleotide sequences. It is further conjectured that most predicted binding sites could indeed be bound by transcription factors, but that not every binding site is of key importance for gene regulation. Finding homologous regulatory regions is considered the most important source of error when analyzing regulatory regions across species. Several approaches to this are discussed; the present version of ReXSpecies uses a precomputed multiple genome alignment. Outlook: Other ways of finding (functionally) homologous regulatory regions are BLAST searches, sequence-based local alignments, homology databases and the alignment-free approach. From the use of these techniques in future versions of ReXSpecies, the author hopes to obtain more complete and more accurate sets of homologous regions, which could reduce the number of artificial patterns. Finally, to arrive at robust predictions of functional transcription factor binding sites, the author proposes gathering as much other data as possible, besides nucleotide sequences and their evolution, in a comprehensive analysis. To this end, the use and extension of existing tools, in particular from systems biology, is recommended.

    Visualization and Exploration of Conserved Regulatory Modules Using ReXSpecies 2

    Background: The prediction of transcription factor binding sites is difficult for many reasons. Thus, filtering methods are needed to enrich for biologically relevant (true positive) matches in the large number of computational predictions that are frequently generated from promoter sequences. Results: ReXSpecies 2 filters predictions of transcription factor binding sites and generates a set of figures displaying them in evolutionary context. More specifically, it uses position-specific scoring matrices to search for motifs that specify transcription factor binding sites. It removes redundant matches and filters the remaining matches by the phylogenetic group that the matrices belong to. It then identifies potential transcriptional modules and generates figures that highlight such modules, taking evolution into consideration. Module formation, scoring by evolutionary criteria and visual cues reduce the number of predictions to a manageable scale. Identification of transcription factor binding sites of particular functional importance is left to expert filtering. ReXSpecies 2 interacts with genome browsers to enable scientists to filter predictions together with other sequence-related data. Conclusions: Based on ReXSpecies 2, we derive plausible hypotheses about the regulation of pluripotency. Our tool is designed to analyze transcription factor binding site predictions considering their common pattern of occurrence, highlighting their evolutionary history.

    R Packages for Data Quality Assessments and Data Monitoring: A Software Scoping Review with Recommendations for Future Developments

    Data quality assessments (DQA) are necessary to ensure valid research results. Despite the growing availability of tools of relevance for DQA in the R language, a systematic comparison of their functionalities is missing. Therefore, we review R packages related to data quality (DQ) and assess their scope against a DQ framework for observational health studies. Based on a systematic search, we screened more than 140 R packages related to DQA in the Comprehensive R Archive Network. From these, we selected packages which target at least three of the four DQ dimensions (integrity, completeness, consistency, accuracy) in a reference framework. We evaluated the resulting 27 packages for general features (e.g., usability, metadata handling, output types, descriptive statistics) and the possible breadth of assessment. To facilitate comparisons, we applied all packages to a publicly available dataset from a cohort study. We found that the packages’ scope varies considerably regarding functionalities and usability. Only three packages follow a DQ concept, and some offer an extensive rule-based issue analysis. However, the reference framework does not include a few implemented functionalities, and it should be broadened accordingly. Improved use of metadata to empower DQA and enhanced user-friendliness, such as GUIs and reports that grade the severity of DQ issues, stand out as the main directions for future developments.

    Differential Network Analysis Applied to Preoperative Breast Cancer Chemotherapy Response

    In silico approaches are increasingly considered to improve breast cancer treatment. One of these treatments, neoadjuvant TFAC chemotherapy, is used in cases where application of preoperative systemic therapy is indicated. Estimating response to treatment allows or improves clinical decision-making and this, in turn, may be based on a good understanding of the underlying molecular mechanisms. Ever-increasing amounts of high-throughput data become available for integration into functional networks. In this study, we applied our software tool ExprEssence to identify specific mechanisms relevant for TFAC therapy response from a gene/protein interaction network. We contrasted the resulting active subnetwork with the subnetworks of two other such methods, OptDis and KeyPathwayMiner. We could show that the ExprEssence subnetwork is more related to the mechanistic functional principles of TFAC therapy than the subnetworks of the other two methods, despite the simplicity of ExprEssence. We were able to validate our method by recovering known mechanisms. As an application example of our method, we identified a mechanism that may further explain the synergism between paclitaxel and doxorubicin in TFAC treatment: Paclitaxel may attenuate MELK gene expression, resulting in lower levels of its target MYBL2, already associated with doxorubicin synergism in hepatocellular carcinoma cell lines. We tested our hypothesis in three breast cancer cell lines, confirming it in part. In particular, the predicted effect on MYBL2 could be validated, and a synergistic effect of paclitaxel and doxorubicin could be demonstrated in the breast cancer cell lines SKBR3 and MCF-7.

    ExprEssence-condensed network describing the 16 most and 16 least active interactions between the E40 genes/proteins.

    For each gene, its mean expression level is visualized for non-responders (left) and responders (right) by color (green for low, white for intermediate, red for high expression). Interactions between the genes/proteins are represented by a line. Stimulations are indicated by an arrow on the target, inhibitions by a t-bar. The up- (red) and down-regulation (green) of interactions are also color-coded. Full gene names can be found in Table S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0081784#pone.0081784.s007).

    Expression analysis of MYBL2 protein after treatment with paclitaxel (Taxol, T) and doxorubicin (Adriamycin, A) in several cell lines by Western blotting (non-tumorigenic cell line MCF-10A and breast cancer cell lines MCF-7, BT-20 and SKBR3).

    Single treatment with T or A for 48 h (T (48 h); A (48 h)), combined treatment for 48 h (T + A (48 h)) or successive treatment with each for 24 h (T (24 h), A (24 h)) was applied. Quantification of western blotting results was carried out at least three times with individually passaged cells. Representative western blots are displayed on top of the graphs. Proliferative alterations were detected via Proliferating Cell Nuclear Antigen (PCNA). Loading controls were labeling of the housekeeping protein β-actin and stain-free imaging of the SDS-PAGE gels prior to the blotting procedure. Mean ± SD values (n = 3). *: p<0.05; **: p<0.01; ***: p<0.001 as compared to control treatment (unpaired t test).

    Expression levels of MELK and MYBL2 protein in the non-tumorigenic cell line MCF-10A in contrast to the breast cancer cell lines MCF-7, BT-20 and SKBR3 detected by immunofluorescence.

    Note that MELK protein levels were below detection threshold while MYBL2 protein was abundant in all cell lines. The strongest MYBL2 signal was reached in the cell line SKBR3. MELK and MYBL2 protein: green; cell nuclei: blue.