15 research outputs found

    GSA-PCA : gene set generation by principal component analysis of the Laplacian matrix of a metabolic network

    The original publication is available at http://www.biomedcentral.com/1471-2105/13/197. Publication of this article was funded by the Stellenbosch University Open Access Fund. Abstract. Background: Gene Set Analysis (GSA) has proven to be a useful approach to microarray analysis. However, most of the method development for GSA has focused on the statistical tests to be used rather than on the generation of sets that will be tested. Existing methods of set generation are often overly simplistic. The creation of sets from individual pathways (in isolation) is a poor reflection of the complexity of the underlying metabolic network. We have developed a novel approach to set generation via the use of Principal Component Analysis of the Laplacian matrix of a metabolic network. We have analysed a relatively simple data set to show the difference in results between our method and the current state-of-the-art pathway-based sets. Results: The sets generated with this method are semi-exhaustive and capture much of the topological complexity of the metabolic network. The semi-exhaustive nature of this method has also allowed us to design a hypergeometric enrichment test to determine which genes are likely responsible for set significance. We show that our method finds significant aspects of biology that would be missed (i.e. false negatives) and addresses the false positive rates found with the use of simple pathway-based sets. Conclusions: The set generation step for GSA is often neglected but is a crucial part of the analysis as it defines the full context for the analysis. As such, set generation methods should be robust and yield as complete a representation of the extant biological knowledge as possible. The method reported here achieves this goal and is demonstrably superior to previous set analysis methods. Publishers' Version
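The two ingredients named in this abstract, principal component analysis of a network Laplacian and a hypergeometric enrichment test, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy network, the rule of taking the `top_k` genes with the largest absolute loadings, and all function names are assumptions made purely for illustration.

```python
from math import comb

import numpy as np


def graph_laplacian(adjacency):
    """Return the combinatorial Laplacian L = D - A of an undirected graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def component_gene_sets(laplacian, genes, n_components=2, top_k=2):
    """Build one gene set per leading principal component of the Laplacian.

    Hypothetical rule: each set holds the top_k genes with the largest
    absolute loadings on that component.
    """
    # PCA via eigen-decomposition of the covariance of the column-centred matrix.
    centred = laplacian - laplacian.mean(axis=0)
    covariance = centred.T @ centred / (laplacian.shape[0] - 1)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    sets = []
    for index in order:
        loadings = np.abs(eigenvectors[:, index])
        top = np.argsort(loadings)[::-1][:top_k]
        sets.append({genes[i] for i in top})
    return sets


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy network (assumed): four genes connected in a chain g1 - g2 - g3 - g4.
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_genes = ["g1", "g2", "g3", "g4"]
toy_sets = component_gene_sets(graph_laplacian(toy_adjacency), toy_genes)
print(toy_sets)
```

In this sketch the enrichment question would be posed as: given `population` genes in total, `successes` of which are significant, how likely is it that a set of `draws` genes contains at least `k` significant ones by chance?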

    Single-Cell Transcriptomics of Regulatory T Cells Reveals Trajectories of Tissue Adaptation.

    Non-lymphoid tissues (NLTs) harbor a pool of adaptive immune cells with largely unexplored phenotype and development. We used single-cell RNA-seq to characterize 35,000 CD4+ regulatory (Treg) and memory (Tmem) T cells in mouse skin and colon, their respective draining lymph nodes (LNs) and spleen. In these tissues, we identified Treg cell subpopulations with distinct degrees of NLT phenotype. Subpopulation pseudotime ordering and gene kinetics were consistent in recruitment to skin and colon, yet the initial NLT-priming in LNs and the final stages of NLT functional adaptation reflected tissue-specific differences. Predicted kinetics were recapitulated using an in vivo melanoma-induction model, validating key regulators and receptors. Finally, we profiled human blood and NLT Treg and Tmem cells, and identified cross-mammalian conserved tissue signatures. In summary, we describe the relationship between Treg cell heterogeneity and recruitment to NLTs through the combined use of computational prediction and in vivo validation.

    A global network for operational flood risk reduction

    Every year riverine flooding affects millions of people in developing countries, due to the large population exposure in the floodplains and the lack of adequate flood protection measures. Preparedness and monitoring are effective ways to reduce flood risk. State-of-the-art technologies relying on satellite remote sensing as well as numerical hydrological and weather predictions can detect and monitor severe flood events at a global scale. This paper describes the emerging role of the Global Flood Partnership (GFP), a global network of scientists, users, private and public organizations active in global flood risk management. Currently, a number of GFP member institutes regularly share results from their experimental products, developed to predict and monitor where and when flooding is taking place in near real-time. GFP flood products have already been used on several occasions by national environmental agencies and humanitarian organizations to support emergency operations and to reduce the overall socio-economic impacts of disasters. This paper describes a range of global flood products developed by GFP partners, and how these provide complementary information to support and improve current global flood risk management for large-scale catastrophes. We also discuss existing challenges and ways forward to turn current experimental products into an integrated flood risk management platform to improve rapid access to flood information and increase resilience to flood events at global scale.

    Data-driven methods for exploratory analysis in chemometrics and scientific experimentation

    Thesis (MSc)--Stellenbosch University, 2014. Abstract. Background: New methods to facilitate exploratory analysis of scientific data are in high demand. There is an abundance of available data, used only for confirmatory analysis, from which new hypotheses can be drawn. To this end, two new exploratory techniques are developed: one for chemometrics and another for visualisation of fundamental scientific experiments. The former transforms large-scale multiple raw HPLC/UV-vis data into a conserved set of putative features, something not often attempted outside of mass spectrometry. The latter method ('StatNet') applies network techniques to the results of designed experiments to gain a new perspective on variable relations. Results: The data format resulting from untargeted chemometric processing was amenable to both chemical and statistical analysis. It proved to have integrity when machine-learning techniques were applied to infer attributes of the experimental set-up. The visualisation techniques were equally successful in generating hypotheses, and were easily extended to three different types of experimental results. Conclusion: The overall aim was to create useful tools for hypothesis generation in a variety of data. This has largely been achieved through a combination of novel and existing techniques. It is hoped that the methods presented here will be further applied and developed.
