61 research outputs found

    A Spectral Clustering Algorithm Improved by P Systems

    Spectral clustering has difficulty finding clusters when a dataset has large differences in density, and its clustering result depends on the selection of initial centers. To overcome these shortcomings, we propose a novel spectral clustering algorithm based on the membrane computing framework, called the MSC algorithm, whose idea is to use a membrane clustering algorithm to realize the clustering component of spectral clustering. A tissue-like P system is used as the computing framework, where each object in the cells denotes a set of cluster centers and a velocity-location model is used as the evolution rule. Under the control of the evolution-communication mechanism, the tissue-like P system can obtain a good clustering partition for each dataset. The proposed algorithm is evaluated on three artificial datasets and ten UCI datasets and compared with classical spectral clustering algorithms. The comparison results demonstrate the advantage of the proposed algorithm.
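    A minimal sketch of the velocity-location idea mentioned in this abstract, under stated assumptions: the rule is read here as a particle-swarm-style update in which each object is one candidate set of cluster centers. The tissue-like P system (cells, communication rules) and the spectral embedding are omitted, and the names `sse` and `velocity_location_search` and the parameters `w`, `c1`, `c2` are hypothetical, not taken from the paper.

```python
import random

def sse(centers, points):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
        for pt in points
    )

def velocity_location_search(points, k, n_objects=8, n_steps=40, seed=0,
                             w=0.6, c1=1.5, c2=1.5):
    """Evolve candidate center sets with a velocity-location update (assumed PSO form)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each object is one candidate set of k centers, started at random data points.
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(n_objects)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_objects)]
    best_pos = [[c[:] for c in obj] for obj in pos]      # best set seen by each object
    best_fit = [sse(obj, points) for obj in pos]
    g = min(range(n_objects), key=best_fit.__getitem__)  # index of the best object
    g_pos, g_fit = [c[:] for c in best_pos[g]], best_fit[g]
    history = [g_fit]
    for _ in range(n_steps):
        for i in range(n_objects):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Velocity: inertia + pull toward own best + pull toward global best.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (best_pos[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (g_pos[j][d] - pos[i][j][d]))
                    # Location: move the center along its velocity.
                    pos[i][j][d] += vel[i][j][d]
            fit = sse(pos[i], points)
            if fit < best_fit[i]:
                best_fit[i], best_pos[i] = fit, [c[:] for c in pos[i]]
                if fit < g_fit:
                    g_fit, g_pos = fit, [c[:] for c in pos[i]]
        history.append(g_fit)
    return g_pos, g_fit, history
```

    In a spectral clustering setting, `points` would be the rows of the spectral embedding rather than the raw data. The sketch ignores the fact that center `j` in one object need not correspond to center `j` in another, which a fuller implementation would have to handle.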

    Analysis of the transcriptional program governing meiosis and gametogenesis in yeast and mammals

    During meiosis a competent diploid cell replicates its DNA once and then undergoes two consecutive divisions followed by haploid gamete differentiation. Important aspects of meiotic development that distinguish it from mitotic growth include a highly increased rate of recombination, formation of the synaptonemal complex that aligns the homologous chromosomes, and separation of the homologues and sister chromatids during meiosis I and II without an intervening S-phase. Budding yeast is an excellent model organism for studying meiosis and gametogenesis and is accordingly among the best-studied eukaryotic systems in this context. Knowledge from these studies has provided important insights into meiotic development in higher eukaryotes. This was possible because sporulation in yeast and spermatogenesis in higher eukaryotes are analogous developmental pathways that involve conserved genes. For budding yeast, a large amount of data is available from numerous genome-scale studies on gene expression and deletion phenotypes of meiotic development and sporulation. In contrast, mammalian gametogenesis had not been studied on a large scale until recently. It was unclear whether an expression profiling study using germ cells and testicular somatic control cells that had undergone lengthy purification procedures would yield interpretable results. We have therefore carried out a pioneering expression profiling study of male germ cells from Rattus norvegicus using Affymetrix U34A and B GeneChips. This work resulted in the first comprehensive large-scale expression profiling analysis of mammalian male germ cells undergoing mitotic growth, meiosis and gametogenesis. We have identified 1268 differentially expressed genes in germ cells at different developmental stages, which were organized into four distinct expression clusters that reflect somatic, mitotic, meiotic and post-meiotic cell types. 
This included 293 as yet uncharacterized transcripts whose expression pattern suggests that they are involved in spermatogenesis and fertility. A group of 121 transcripts was expressed only in meiotic (spermatocytes) and postmeiotic germ cells (round spermatids) but not in dividing germ cells (spermatogonia), Sertoli cells or two somatic control tissues (brain and skeletal muscle). Functional analysis reveals that most of the known genes in this group fulfill essential functions during meiosis, spermiogenesis (the process of sperm maturation) and fertility. It is therefore quite possible that some of the approximately 30 uncharacterized transcripts in this group also contribute to these processes. A web-accessible database (called reXbase, later integrated into GermOnline) has been developed for our expression profiling study of mammalian male meiosis; it summarizes annotation information and shows a graphical display of the expression profile of every gene covered in our study. In the budding yeast Saccharomyces cerevisiae, entry into meiosis and subsequent progression through sporulation and gametogenesis are driven by a highly regulated transcriptional program activated by signal pathways responding to nutritional and cell-type cues. Abf1p, a general transcription factor, has previously been demonstrated to participate in the induction of numerous mitotic as well as early and middle meiotic genes. In the current study we have addressed the question of how Abf1p transcriptionally coordinates mitotic growth and meiotic development on a genome-wide level. Because ABF1 is an essential gene, we used the temperature-sensitive allele abf1-1. A phenotypic analysis of mutant cells revealed that ABF1 plays an important role in cell separation during mitosis, meiotic development, and spore formation. 
To identify genes whose expression depends on Abf1p in growing and sporulating cells, we have performed expression profiling experiments using Affymetrix S98 GeneChips, comparing wild-type and abf1-1 mutant cells at both permissive and restrictive temperature. We have identified 504 genes whose normal expression depends on functional ABF1. By combining the expression profiling data with data from genome-wide DNA binding assays (ChIP-chip) and in silico predictions of potential Abf1p-binding sites in the yeast genome, we were able to define direct target genes: genes whose expression decreases in the absence of functional ABF1 and whose promoters are bound by Abf1p and/or contain a predicted binding site. Among 352 such bona fide direct target genes we found many involved in ribosome biogenesis, translation, vegetative growth and meiotic development, which could therefore account for the observed growth and sporulation defects of abf1-1 mutant cells. Furthermore, the fact that two members of the septin family (CDC3 and CDC10) were found to be direct target genes suggests a novel role for Abf1p in cytokinesis. This was further substantiated by the observation that chitin localization and septin ring formation are perturbed in abf1-1 mutant cells.

    Mesenchymal and stem-like prostate cancer linked to therapy-induced lineage plasticity and metastasis

    Bioinformatic analysis of 94 patient-derived xenografts (PDXs), cell lines, and organoids (PCOs) identifies three intrinsic transcriptional subtypes of metastatic castration-resistant prostate cancer: androgen receptor (AR) pathway + prostate cancer (PC) (ARPC), mesenchymal and stem-like PC (MSPC), and neuroendocrine PC (NEPC). A sizable proportion of castration-resistant and metastatic stage PC (M-CRPC) cases are admixtures of ARPC and MSPC. Analysis of clinical datasets and mechanistic studies indicates that MSPC arises from ARPC as a consequence of therapy-induced lineage plasticity. AR blockade with enzalutamide induces (1) transcriptional silencing of TP53 and hence dedifferentiation to a hybrid epithelial and mesenchymal and stem-like state and (2) inhibition of BMP signaling, which promotes resistance to AR inhibition. Enzalutamide-tolerant LNCaP cells re-enter the cell cycle in response to neuregulin and generate metastasis in mice. Combined inhibition of HER2/3 and AR or mTORC1 exhibits efficacy in models of ARPC and MSPC or MSPC, respectively. These results define MSPC, trace its origin to therapy-induced lineage plasticity, and reveal its sensitivity to HER2/3 inhibition.

    Genetic and molecular mechanisms of sarcomas

    Sarcomas are heterogeneous malignant mesenchymal tumors with diverse biological features and unique clinical characteristics, and their genetic alterations are highly variable. With the development of sequencing technologies, efficient and practical approaches to detecting gene expression and gene variants contribute to the prediction of patient prognosis and the choice of treatment modalities. Given the rarity of sarcomas, comprehensive transcriptomic or genomic profiles are still lacking for many subtypes. In the present thesis, by applying sequencing technology to sarcoma cohorts, combined with bioinformatics data analysis and molecular biology experiments, we have revealed new biological mechanisms dictating sarcoma behavior and provided insights for clinical applications. In Paper I, we characterized the gene signatures related to poor prognosis, first-line treatment failure, and chemotherapy resistance in Ewing sarcoma (ES). High expression of IGF2 was associated with shorter overall survival in ES patients and promoted cell proliferation, radiation resistance, and apoptosis inhibition in vitro. The transcriptome analysis of clinical samples and cell lines uncovered an IGF-dependent signature that is potentially related to stem cell-like signatures in ES. Paper II continued to highlight the transcriptome signatures in ES. Here, we identified prognosis-related RNA-binding proteins (RBPs) and constructed an RBP-based prognostic risk model that showed stable predictive power for evaluating overall survival in clinical samples. Within the model, NSUN7 is considered an independent favorable prognostic marker, which was also validated by immunohistochemistry. In Paper III, we discovered that TERT promoter mutations were present in 45% of patients in a cohort of 190 patients with conventional chondrosarcoma (CHS). The mutation was significantly associated with recurrence, distant metastasis, and high tumor grade. 
The heterogeneity of primary tumors and the altered mutational status between asynchronous metastatic lesions revealed that CHS is a multiclonal disease that progresses through branching evolution. In Paper IV, we identified three clusters with distinct transcriptomic and genomic patterns in synovial sarcoma (SS): SS cluster I (SSC-I) was characterized by hyperproliferation, immune cell silencing, and poor prognosis; SSC-II by high vascularity and stromal component with a better clinical outcome; and SSC-III by epithelial components with genomic complexity and checkpoint-mediated immune suppression. Collectively, the present thesis illustrates the pathogenic mechanisms of ES, CHS, and SS through the analysis of transcriptomic and genomic data, identifies prognostic biomarkers, and, at the level of clinical application, provides strong evidence for patient stratification, risk prediction, and personalized treatment assessment.

    Integrated Genomic Analysis of the Ubiquitin Pathway across Cancer Types

    Protein ubiquitination is a dynamic and reversible process of adding single ubiquitin molecules or various ubiquitin chains to target proteins. Here, using multidimensional omic data of 9,125 tumor samples across 33 cancer types from The Cancer Genome Atlas, we perform comprehensive molecular characterization of 929 ubiquitin-related genes and 95 deubiquitinase genes. Among them, we systematically identify top somatic driver candidates, including mutated FBXW7 with cancer-type-specific patterns and amplified MDM2 showing a mutually exclusive pattern with BRAF mutations. Ubiquitin pathway genes tend to be upregulated in cancer mediated by diverse mechanisms. By integrating pan-cancer multiomic data, we identify a group of tumor samples that exhibit worse prognosis. These samples are consistently associated with the upregulation of cell-cycle and DNA repair pathways, characterized by mutated TP53, MYC/TERT amplification, and APC/PTEN deletion. Our analysis highlights the importance of the ubiquitin pathway in cancer development and lays a foundation for developing relevant therapeutic strategies.

    The transcriptome response of leaves of the resurrection plant, Xerophyta humilis to desiccation

    Includes bibliographical references. In angiosperms, desiccation tolerance, a genetic trait that enables tissues to survive the loss of more than 95% of cellular water, is widely observed in seeds but is found in the vegetative tissues of only a small group of species known as the resurrection plants. Xerophyta humilis is a small resurrection plant indigenous to Southern Africa. In this study, the hypothesis that vegetative desiccation tolerance is derived from an adaptation of seed desiccation tolerance was tested by characterizing changes in the transcriptome of X. humilis leaves during desiccation. The mRNA transcript abundance of a set of 1680 X. humilis genes was analyzed at 6 different stages of water loss in the leaves. Functional enrichment analysis showed that genes down-regulated during desiccation were enriched for genes involved in photosynthesis, cellular developmental processes, and transcription regulator activity. Three distinct clusters of up-regulated genes were identified. The earliest set of up-regulated genes was enriched for genes associated with the turnover of proteins and the simultaneous synthesis of proteins required for protection. Enrichment also included genes associated with lipid body synthesis, as well as the transport of storage proteins to vacuoles. Two groups of late desiccation up-regulated genes were also identified; their expression increased only at later stages of desiccation and remained high in the desiccated leaves.

    Activation of seed-specific genes in leaves and roots of the desiccation tolerant plant, Xerophyta humilis

    Includes abstract. Includes bibliographical references (leaves 131-169). The ability of tissues to survive almost complete loss of cellular water is a trait found throughout the plant kingdom. While this desiccation tolerance is common in seeds of most angiosperms, it is rare in their vegetative tissues. Xerophyta humilis (Bak.) Dur and Schintz belongs to a small group of resurrection angiosperms and possesses the ability to withstand extreme desiccation of greater than 90% in both its seeds and vegetative tissues and to return to active metabolism upon rehydration. We have tested the hypothesis that vegetative desiccation tolerance in angiosperms has evolved as an adaptation of seed desiccation tolerance.

    A COMPUTATIONAL APPROACH FOR ACCESSING PHOSPHORYLATED RESPONSE REGULATOR CONFORMATIONS AND SIGNALING COMPLEXES INVOLVING THE FUNGAL PHOSPHORELAY PROTEIN YPD1

    Two-component signaling is the primary means by which bacteria, archaea and certain eukaryotes sense and respond to their environments. Signal transfer proceeds through sequential His-to-Asp phosphorylation of upstream histidine kinases and downstream response regulators. These systems share highly modular designs and have been incorporated into a myriad of cellular processes. The highly labile chemical nature of phosphoaspartate and phosphohistidine leads to relatively short experimental lifetimes, making study of the modified signaling proteins challenging. The focus of this research was to develop computational and experimental approaches for characterizing phosphorylated two-component signaling proteins. Following an introductory chapter, the first experimental section presents a computational technique for simulating the activation of individual response regulator proteins. This is accomplished using known experimental data on conserved active site chemistry to define a common set of restraints to drive each simulation. The protocol was verified on five genetically diverse response regulators with known experimental structures. The second section applies this principle to signaling complexes to study the effects of phosphorylation on protein-protein interactions within the Saccharomyces cerevisiae osmoregulatory signaling system. The third section describes the experimental characterization, using X-ray crystallography, of a specific signaling complex from Saccharomyces cerevisiae between the response regulator Ssk1 and a point mutant (G68Q) of the histidine phosphotransfer protein Ypd1. This mutation occurs near the active site of both proteins and appears to interfere with phosphotransfer. Further in silico studies were performed to observe the role of G68 in catalysis of phosphotransfer.

    Finding regions of aberrant DNA copy number associated with tumor phenotype

    DNA copy number alterations are a hallmark of cancer. Understanding their role in tumor progression can help improve diagnosis, prognosis and therapy selection for cancer patients. High-resolution, genome-wide measurements of DNA copy number changes for large cohorts of tumors are currently available, owing to technologies like microarray-based array comparative hybridization (arrayCGH). In this thesis, we present a computational pipeline for statistical analysis of tumor cohorts, which can help extract relevant patterns of copy number aberrations and infer their association with various phenotypical indicators. The main challenges are the instability of classification models due to the high dimensionality of the arrays compared to the small number of tumor samples, as well as the large correlations between copy number estimates measured at neighboring loci. We show that the feature ranking given by several widely-used methods for feature selection is biased due to the large correlations between features. In order to correct for the bias and instability of the feature ranking, we introduce methods for consensus segmentation of the set of arrays. We present three algorithms for consensus segmentation, which are based on identifying recurrent DNA breakpoints or DNA regions of constant copy number profile. The segmentation constitutes the basis for computing a set of super-features, corresponding to the regions. We use the super-features for supervised classification and we compare the models to baseline models trained on probe data. We validated the methods by training models for prediction of the phenotype of breast cancers and neuroblastoma tumors. We show that the multivariate segmentation affords higher model stability, in general improves prediction accuracy and facilitates model interpretation. One of our most important biological results refers to the classification of neuroblastoma tumors. 
We show that patients belonging to different age subgroups are characterized by distinct copy number patterns, with the largest discrepancy when the subgroups are defined as older or younger than 16-18 months. We thereby confirm the recommendation for a higher age cutoff than 12 months (current clinical practice) for differential diagnosis of neuroblastoma.
The abnormal multiplicity of certain DNA segments (copy number aberrations) is one of the most prominent hallmarks of cancer. Understanding the role of this feature in tumor growth could contribute substantially to improving cancer diagnosis, prognosis, and therapy, and thus help in selecting individualized therapies. Microarray-based technologies such as 'Array Comparative Hybridization' (array-CGH) make it possible to produce high-resolution, genome-wide copy number maps of tumor tissue. The subject of this thesis is the development of a software pipeline for the statistical analysis of tumor cohorts that makes it possible to derive relevant patterns of abnormal copy numbers and to associate them with various phenotypic characteristics. This is done using machine learning methods for classification and feature selection, with a focus on the interpretability of the learned models (regularized linear methods and decision-tree-based models). The main challenges for the methods are the high dimensionality of the data, set against only a comparatively small number of measured tumor samples, and the high correlation between copy numbers measured in neighboring genomic regions. Consequently, the results of feature selection depend strongly on the choice of training dataset, which severely limits reproducibility across different clinical datasets. 
This thesis shows that the feature ranking produced by various common methods is strongly distorted by the high correlation coefficients between individual predictors. To correct this bias and the instability of the feature ranking, we introduce a dimension-reducing step into our pipeline that consists of jointly segmenting the arrays in a multivariate fashion. We present three algorithms for this multivariate segmentation, based on identifying recurrent DNA breakpoints or genomic regions with constant copy number profiles. By summarizing the DNA copy number values within each region, the multivariate segmentation forms the basis for computing a smaller set of 'super-features'. Compared with classification methods that operate at the level of individual array probes, supervised classification based on the super-features improves both the interpretability and the stability of the models. We validate the methods by training prediction models on breast cancer and neuroblastoma datasets. Here we show that the multivariate segmentation step achieves higher model stability without a loss of prediction quality. The dimension of the problem is reduced considerably (up to 200-fold fewer features), which makes multivariate segmentation not only an effective means of predicting phenotypes but also a suitable preprocessing step for later integrative analyses with other data types. The interpretability of the models is also improved, which makes it possible to identify important relationships between copy number changes and phenotype. 
For example, we show that a co-amplification in the immediate vicinity of the ERBB2 gene locus is a highly informative predictor for distinguishing inflammatory from non-inflammatory breast cancers. We thereby confirm the hypothesis, common in the literature, that the size of an amplicon is related to the cancer subtype. For neuroblastoma tumors, we show that subgroups defined by patient age can be characterized by copy number patterns. This is possible in particular when an age threshold of 16 to 18 months is used to define the groups, which is also where the highest prediction accuracy is obtained. We thus provide further evidence for the recommendation to use a threshold higher than twelve months for the differential diagnosis of neuroblastoma.
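    The super-feature step described in this abstract can be illustrated with a small sketch. It assumes the consensus segmentation has already produced a shared list of breakpoints, and it summarizes each region by the mean of its probe values; the abstract does not specify the summary statistic, so the mean is an assumption, and the name `super_features` is hypothetical.

```python
from statistics import mean

def super_features(profiles, breakpoints):
    """Collapse probe-level copy number profiles into one value per region.

    profiles    -- list of samples, each a list of probe values ordered along the genome
    breakpoints -- probe indices at which a new region starts (shared by all samples)
    """
    n_probes = len(profiles[0])
    # Region boundaries: start of the array, each breakpoint, end of the array.
    bounds = [0] + sorted(breakpoints) + [n_probes]
    regions = [(a, b) for a, b in zip(bounds, bounds[1:]) if a < b]
    # One super-feature per region and sample: the mean of the probes in that region.
    return [[mean(sample[a:b]) for a, b in regions] for sample in profiles]

# Two samples, six probes, regions [0:2], [2:5], [5:6].
profiles = [
    [0.0, 0.0, 1.0, 1.0, 1.0, -1.0],
    [0.2, 0.0, 0.8, 1.0, 1.2, -0.5],
]
features = super_features(profiles, breakpoints=[2, 5])
```

    Each sample is reduced from six probe values to three region values, which a classifier can then use in place of the correlated probe-level data.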