GSA-PCA : gene set generation by principal component analysis of the Laplacian matrix of a metabolic network
The original publication is available at http://www.biomedcentral.com/1471-2105/13/197
Publication of this article was funded by the Stellenbosch University Open Access Fund.
Abstract
Background
Gene Set Analysis (GSA) has proven to be a useful approach to microarray analysis. However, most of the method development for GSA has focused on the statistical tests to be used rather than on the generation of sets that will be tested. Existing methods of set generation are often overly simplistic. The creation of sets from individual pathways (in isolation) is a poor reflection of the complexity of the underlying metabolic network. We have developed a novel approach to set generation via the use of Principal Component Analysis of the Laplacian matrix of a metabolic network. We have analysed a relatively simple data set to show the difference in results between our method and the current state-of-the-art pathway-based sets.
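As a rough illustration of the idea named above, the following minimal sketch builds the Laplacian matrix of a small toy network and reads candidate sets off its leading principal components. The abstract does not give the details of the procedure, so everything here is an assumption: the six-node network, the gene names, the helper names `laplacian` and `pca_gene_sets`, and the rule that a node joins a set when its absolute loading reaches a threshold.

```python
import numpy as np


def laplacian(adjacency):
    """Return the combinatorial graph Laplacian L = D - A."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a


def pca_gene_sets(adjacency, genes, n_components=2, threshold=0.3):
    """Derive one candidate set per principal component of the Laplacian.

    Assumption: a gene belongs to a set when the absolute value of its
    loading on that component is at least `threshold`.
    """
    lap = laplacian(adjacency)
    centred = lap - lap.mean(axis=0)  # column-centre before PCA
    # PCA via SVD: the rows of vt are unit-length principal axes.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    sets = []
    for axis in vt[:n_components]:
        sets.append([g for g, w in zip(genes, axis) if abs(w) >= threshold])
    return sets


# Hypothetical toy network: two triangles joined by a single bridge edge.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
adjacency = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
toy_sets = pca_gene_sets(adjacency, genes)
print(toy_sets)
```

Because each set comes from a component of the whole network rather than from a single pathway, a set can span pathway boundaries, which is the property the abstract emphasises.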
Results
The sets generated with this method are semi-exhaustive and capture much of the topological complexity of the metabolic network. The semi-exhaustive nature of this method has also allowed us to design a hypergeometric enrichment test to determine which genes are likely responsible for set significance. We show that our method finds significant aspects of biology that would be missed (i.e. false negatives) and addresses the false positive rates found with the use of simple pathway-based sets.
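A hypergeometric enrichment test of the kind mentioned above can be sketched as follows. The abstract does not say how the test is set up, so this assumes one plausible framing: given a population of generated sets, some of which are significant, ask whether a gene appears in significant sets more often than chance would predict. The function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric X.

    population: total number of sets
    successes:  number of significant sets
    draws:      number of sets containing the gene of interest
    observed:   number of significant sets containing that gene
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 sets, 3 significant; the gene is in 3 sets,
# all of which are significant.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
print(p_value)
```

A small p-value under this framing would flag the gene as a likely driver of set significance; any multiple-testing correction would need to be applied separately.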
Conclusions
The set generation step for GSA is often neglected but is a crucial part of the analysis as it defines the full context for the analysis. As such, set generation methods should be robust and yield as complete a representation of the extant biological knowledge as possible. The method reported here achieves this goal and is demonstrably superior to previous set analysis methods.
Publishers' version
Single-Cell Transcriptomics of Regulatory T Cells Reveals Trajectories of Tissue Adaptation.
Non-lymphoid tissues (NLTs) harbor a pool of adaptive immune cells with largely unexplored phenotype and development. We used single-cell RNA-seq to characterize 35,000 CD4+ regulatory (Treg) and memory (Tmem) T cells in mouse skin and colon, their respective draining lymph nodes (LNs) and spleen. In these tissues, we identified Treg cell subpopulations with distinct degrees of NLT phenotype. Subpopulation pseudotime ordering and gene kinetics were consistent in recruitment to skin and colon, yet the initial NLT-priming in LNs and the final stages of NLT functional adaptation reflected tissue-specific differences. Predicted kinetics were recapitulated using an in vivo melanoma-induction model, validating key regulators and receptors. Finally, we profiled human blood and NLT Treg and Tmem cells, and identified cross-mammalian conserved tissue signatures. In summary, we describe the relationship between Treg cell heterogeneity and recruitment to NLTs through the combined use of computational prediction and in vivo validation.
A global network for operational flood risk reduction
Every year riverine flooding affects millions of people in developing countries, due to the large population exposure in the floodplains and the lack of adequate flood protection measures. Preparedness and monitoring are effective ways to reduce flood risk. State-of-the-art technologies relying on satellite remote sensing as well as numerical hydrological and weather predictions can detect and monitor severe flood events at a global scale. This paper describes the emerging role of the Global Flood Partnership (GFP), a global network of scientists, users, private and public organizations active in global flood risk management. Currently, a number of GFP member institutes regularly share results from their experimental products, developed to predict and monitor where and when flooding is taking place in near real-time. GFP flood products have already been used on several occasions by national environmental agencies and humanitarian organizations to support emergency operations and to reduce the overall socio-economic impacts of disasters. This paper describes a range of global flood products developed by GFP partners, and how these provide complementary information to support and improve current global flood risk management for large scale catastrophes. We also discuss existing challenges and ways forward to turn current experimental products into an integrated flood risk management platform to improve rapid access to flood information and increase resilience to flood events at global scale.
Data-driven methods for exploratory analysis in chemometrics and scientific experimentation
Thesis (MSc)--Stellenbosch University, 2014.
ENGLISH ABSTRACT: Background
New methods to facilitate exploratory analysis in scientific data are in high
demand. There is an abundance of available data used only for confirmatory
analysis from which new hypotheses can be drawn. To this end, two new
exploratory techniques are developed: one for chemometrics and another for
visualisation of fundamental scientific experiments. The former transforms
large-scale multiple raw HPLC/UV-vis data into a conserved set of putative
features - something not often attempted outside of mass spectrometry. The
latter method ('StatNet') applies network techniques to the results of designed
experiments to gain a new perspective on relations between variables.
Results
The resultant data format from un-targeted chemometric processing was
amenable to both chemical and statistical analysis. It proved to have integrity
when machine-learning techniques were applied to infer attributes of
the experimental set-up. The visualisation techniques were equally successful
in generating hypotheses, and were easily extendible to three different types
of experimental results.
Conclusion
The overall aim was to create useful tools for hypothesis generation in a
variety of data. This has been largely achieved through a combination of novel
and existing techniques. It is hoped that the methods here presented are
further applied and developed.
AFRIKAANS SUMMARY: Background
New methods to facilitate exploratory analysis in scientific data are in great
demand. There is an abundance of available data that is used only for
confirmatory analysis, from which new hypotheses can be drawn up. To this end,
two new exploratory techniques are developed: one for chemometrics and another
for the visualisation of fundamental scientific experiments. The former
transforms large-scale multiple raw HPLC/UV-vis data into a conserved set of
putative features - something not often attempted outside of mass
spectrometry. The latter method ('StatNet') applies network techniques to the
results of designed experiments in order to gain a new perspective on relations
between variables.
Results
The resulting data format from the untargeted chemometric processing was
amenable to both chemical and statistical analysis. It was shown to have
integrity when machine-learning techniques were applied to infer properties of
the experimental set-up. The visualisation techniques were equally successful
in generating hypotheses, and were also easily extendible to three different
types of experimental results.
Conclusion
The main aim was to create useful tools for hypothesis generation in a variety
of data. This has largely been achieved through a combination of original and
existing techniques. Hopefully the methods presented here will be applied and
developed further.