
    Novel evolutionary algorithm identifies interactions driving infestation of Triatoma dimidiata, a Chagas disease vector

    Chagas disease is a lethal, neglected tropical disease. Unfortunately, aggressive insecticide-spraying campaigns have not been able to eliminate domestic infestation of Triatoma dimidiata, the native vector in Guatemala. To target interventions toward houses most at risk of infestation, comprehensive socioeconomic and entomologic surveys were conducted in two towns in Jutiapa, Guatemala. Given the exhaustively large search space associated with combinations of risk factors, traditional statistics are limited in their ability to discover risk factor interactions. Two recently developed statistical evolutionary algorithms, specifically designed to accommodate risk factor interactions and heterogeneity, were applied to this large combinatorial search space and used in tandem to identify sets of risk factor combinations associated with infestation. The optimal model includes 10 risk factors in what is known as a third-order disjunctive normal form (i.e., infested households have chicken coops AND deteriorated bedroom walls OR an accumulation of objects AND dirt floors AND total number of occupants ≥ 5 AND years of electricity ≥ 5 OR poor hygienic condition ratings AND adobe walls AND deteriorated walls AND dogs). Houses with dirt floors and deteriorated walls have been reported previously as risk factors and align well with factors currently targeted by Ecohealth interventions to minimize infestation. However, the tandem evolutionary algorithms also identified two new socioeconomic risk factors (i.e., households having many occupants and years of electricity ≥ 5). Identifying key risk factors may help with the development of new Ecohealth interventions and/or reduce the survey time needed to identify houses most at risk.
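
    To make the reported model concrete, the following is a minimal Python sketch of how a third-order disjunctive normal form rule over household risk factors can be evaluated; the field names and example household are hypothetical and only paraphrase the optimal model quoted above.

```python
# Hypothetical illustration of evaluating a third-order disjunctive normal form
# (DNF) risk model like the one reported in the abstract. Field names and the
# example household are invented for demonstration only.

def predicted_infested(h: dict) -> bool:
    """Return True if the household matches any of the three risk-factor conjunctions."""
    clause_1 = h["chicken_coop"] and h["deteriorated_bedroom_walls"]
    clause_2 = (h["object_accumulation"] and h["dirt_floor"]
                and h["occupants"] >= 5 and h["years_electricity"] >= 5)
    clause_3 = (h["poor_hygiene_rating"] and h["adobe_walls"]
                and h["deteriorated_walls"] and h["dogs"])
    return clause_1 or clause_2 or clause_3

example_household = {
    "chicken_coop": False, "deteriorated_bedroom_walls": False,
    "object_accumulation": True, "dirt_floor": True,
    "occupants": 6, "years_electricity": 7,
    "poor_hygiene_rating": False, "adobe_walls": True,
    "deteriorated_walls": True, "dogs": False,
}
print(predicted_infested(example_household))  # True: matches the second clause
```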

    Computational Stem Cell Biology: Open Questions and Guiding Principles

    Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more researchers to join this exciting area.

    Uncovering vector, parasite, blood meal and microbiome patterns from mixed-DNA specimens of the Chagas disease vector Triatoma dimidiata

    Chagas disease, considered a neglected disease by the World Health Organization, is caused by the protozoan parasite Trypanosoma cruzi, and transmitted by >140 triatomine species across the Americas. In Central America, the main vector is Triatoma dimidiata, an opportunistic blood meal feeder inhabiting both domestic and sylvatic ecotopes. Given the diversity of interacting biological agents involved in the epidemiology of Chagas disease, having simultaneous information on the dynamics of the parasite, the vector, the gut microbiome of the vector, and the blood meal source would facilitate identifying key biotic factors associated with the risk of T. cruzi transmission. In this study, we developed a RADseq-based analysis pipeline to study mixed-species DNA extracted from T. dimidiata abdomens. To evaluate the efficacy of the method across spatial scales, we used a nested spatial sampling design that spanned from individual villages within Guatemala to major biogeographic regions of Central America. Information from each biotic source was distinguished with bioinformatics tools and used to evaluate the prevalence of T. cruzi infection and predominant Discrete Typing Units (DTUs) in the region, the population genetic structure of T. dimidiata, gut microbial diversity, and the blood meal history. An average of 3.25 million reads per specimen were obtained, with approximately 1% assigned to the parasite, 20% to the vector, 11% to bacteria, and 4% to putative blood meals. Using a total of 6,405 T. cruzi SNPs, we detected nine infected vectors harboring two distinct DTUs: TcI and a second unidentified strain, possibly TcIV. Vector specimens were sufficiently variable for population genomic analyses, with a total of 25,710 T. dimidiata SNPs across all samples that were sufficient to detect geographic genetic structure at both local and regional scales. We observed a diverse microbial community, with significantly higher bacterial species richness in infected T. dimidiata abdomens than in those that were not infected. UniFrac analysis suggests a common assemblage of bacteria associated with infection, which co-occurs with the typical gut microbial community derived from the local environment. We identified vertebrate blood meals from five T. dimidiata abdomens, including chicken, dog, duck and human; however, additional detection methods would be necessary to confidently identify blood meal sources from most specimens. Overall, our study shows this method is effective for simultaneously generating genetic data on vectors and their associated parasites, along with ecological information on feeding patterns and microbial interactions, which may be followed up with complementary approaches such as PCR-based parasite detection and 18S eukaryotic and 16S bacterial barcoding.
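
    As a rough illustration of the per-specimen bookkeeping this kind of pipeline performs, the sketch below tallies the fraction of reads attributed to each biotic source, assuming an upstream aligner has already labelled every read; the labels and counts are hypothetical and this is not the authors' pipeline.

```python
# Minimal sketch of summarising mixed-source reads per specimen, assuming an
# upstream aligner has already labelled each read as parasite, vector, bacteria,
# blood meal, or unassigned. Labels and input values are hypothetical.
from collections import Counter

def summarise_read_sources(read_labels):
    """Return the fraction of reads assigned to each biotic source."""
    counts = Counter(read_labels)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}

labels = ["vector"] * 200 + ["bacteria"] * 110 + ["parasite"] * 10 + \
         ["blood_meal"] * 40 + ["unassigned"] * 640
print(summarise_read_sources(labels))
# roughly 20% vector, 11% bacteria, 1% parasite, 4% blood meal, rest unassigned
```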

    High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

    Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). As such, there has been a surge of interest in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g., lung cancer) or to re-examine experiments for reasons different from those examined in the initial, publishing study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, such as manually aggregating, filtering and formatting the data. Methods to automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses. Methods: We present a straightforward, simple method, robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality checked by comparing parametric distributions and quantile normalized to enable direct comparison of expression levels for subsequent meta-analyses. Results: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that when conducting a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization offered minimal improvement over using the raw data. Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and effective means of automating normalization procedures not only empower meta-analytic approaches, but aid in reproducibility by providing a standard way of approaching the problem. Data availability: a matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for GDS file pre-processing is available from the authors upon request.
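
    The pre-processing step named above, quantile normalization, can be sketched generically as follows; this is a textbook implementation for a genes-by-samples matrix, not the authors' production code.

```python
# Minimal sketch of quantile normalisation for a genes-x-samples expression
# matrix, the pre-processing step described above. Generic illustration only.
import numpy as np

def quantile_normalize(matrix: np.ndarray) -> np.ndarray:
    """Force every column (sample) to share the same empirical distribution."""
    ranks = np.argsort(np.argsort(matrix, axis=0), axis=0)  # rank of each value within its column
    sorted_cols = np.sort(matrix, axis=0)
    reference = sorted_cols.mean(axis=1)                     # mean value at each rank across samples
    return reference[ranks]

expr = np.random.lognormal(mean=5, sigma=1, size=(1000, 4))  # toy expression data
normalized = quantile_normalize(expr)
print(np.allclose(np.sort(normalized[:, 0]), np.sort(normalized[:, 1])))  # True
```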

    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the growing number of gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of the faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, where it produces focused clusters that conform well to known biological facts. Furthermore, in silico hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in silico biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004).
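
    The core consensus idea, that genes clustered together across many datasets form candidate consistently co-expressed sets, can be illustrated with the short sketch below; it is a generic consensus co-membership computation using k-means purely for illustration, not the published Bi-CoPaM/UNCLES algorithm.

```python
# Minimal sketch of consensus co-membership across multiple expression datasets:
# the fraction of datasets in which each pair of genes falls in the same cluster.
# Generic illustration only, not the Bi-CoPaM/UNCLES method itself.
import numpy as np
from sklearn.cluster import KMeans

def consensus_comembership(datasets, n_clusters=3, seed=0):
    """Fraction of datasets in which each gene pair lands in the same cluster."""
    n_genes = datasets[0].shape[0]
    consensus = np.zeros((n_genes, n_genes))
    for X in datasets:  # each X: genes x conditions
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
        consensus += (labels[:, None] == labels[None, :]).astype(float)
    return consensus / len(datasets)

rng = np.random.default_rng(0)
datasets = [rng.normal(size=(50, 8)) for _ in range(4)]  # 50 genes x 8 conditions, 4 datasets
C = consensus_comembership(datasets)
always_together = np.argwhere(np.triu(C == 1.0, k=1))    # gene pairs co-clustered in every dataset
print(len(always_together))
```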

    Clinical Outcomes and Quality of Life in Recipients of Livers Donated after Cardiac Death

    Donation after cardiac death (DCD) has expanded in the last decade in the US; however, DCD liver utilization has flattened in recent years due to poor outcomes. We examined clinical and quality of life (QOL) outcomes of DCD recipients by conducting a retrospective and cross-sectional review of patients from 2003 to 2010. We compared clinical outcomes of DCD recipients (n=60) to those of donation after brain death (DBD) liver recipients (n=669) during the same time period. DCD recipients had significantly lower rates of 5-year graft survival (P<0.001) and a trend toward lower rates of 5-year patient survival (P=0.064) when compared to the DBD cohort. In order to examine QOL outcomes in our cohorts, we administered the Short Form Liver Disease Quality of Life questionnaire to 30 DCD and 60 DBD recipients. The DCD recipients reported lower generic and liver-specific QOL. We further stratified the DCD cohort by the presence of ischemic cholangiopathy (IC). Patients with IC reported lower QOL when compared to DBD recipients and those DCD recipients without IC (P<0.05). While the results are consistent with clinical experience, this is the first report of QOL in DCD recipients using standardized measures. These data can be used to guide future comparative effectiveness studies.

    Elusive Copy Number Variation in the Mouse Genome

    Array comparative genomic hybridization (aCGH) to detect copy number variants (CNVs) in mammalian genomes has led to a growing awareness of the potential importance of this category of sequence variation as a cause of phenotypic variation. Yet there are large discrepancies between studies, so that the extent of the genome affected by CNVs is unknown. We combined molecular and aCGH analyses of CNVs in inbred mouse strains to investigate this question. Using a 2.1-million-probe array we identified 1,477 deletions and 499 gains in 7 inbred mouse strains. Molecular characterization indicated that approximately one third of the CNVs detected by the array were false positives, and we estimate the false negative rate to be more than 50%. We show that low concordance between studies is largely due to the molecular nature of CNVs, many of which consist of a series of smaller deletions and gains interspersed by regions where the DNA copy number is normal. Our results indicate that CNVs detected by arrays may be the coincidental co-localization of smaller CNVs, whose presence is more likely to perturb an aCGH hybridization profile than the effect of an isolated, small copy number alteration. Our findings help explain the hitherto unexplored discrepancies between array-based studies of copy number variation in the mouse genome.
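
    For readers unfamiliar with aCGH-based CNV calling, the sketch below shows the generic idea of flagging runs of consecutive probes whose log2 ratios exceed a threshold as gains or deletions; it is an illustrative toy, not the detection pipeline used in the study.

```python
# Minimal sketch of how candidate CNVs are typically called from aCGH probe
# log2 ratios: runs of consecutive probes above/below a threshold are flagged as
# gains/deletions. Generic illustration only, not the study's pipeline.
import numpy as np

def call_cnv_segments(log2_ratios, threshold=0.5, min_probes=5):
    """Return (start, end, 'gain'|'deletion') for runs of >= min_probes aberrant probes."""
    state = np.where(log2_ratios > threshold, 1, np.where(log2_ratios < -threshold, -1, 0))
    segments, start = [], 0
    for i in range(1, len(state) + 1):
        if i == len(state) or state[i] != state[start]:
            if state[start] != 0 and i - start >= min_probes:
                segments.append((start, i, "gain" if state[start] == 1 else "deletion"))
            start = i
    return segments

ratios = np.concatenate([np.zeros(20), -np.ones(8), np.zeros(15), np.ones(6) * 0.8, np.zeros(10)])
print(call_cnv_segments(ratios))  # [(20, 28, 'deletion'), (43, 49, 'gain')]
```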

    Study of Intraventricular Cerliponase Alfa for CLN2 Disease

    BACKGROUND Recombinant human tripeptidyl peptidase 1 (cerliponase alfa) is an enzyme-replacement therapy that has been developed to treat neuronal ceroid lipofuscinosis type 2 (CLN2) disease, a rare lysosomal disorder that causes progressive dementia in children. METHODS In a multicenter, open-label study, we evaluated the effect of intraventricular infusion of cerliponase alfa every 2 weeks in children with CLN2 disease who were between the ages of 3 and 16 years. Treatment was initiated at a dose of 30 mg, 100 mg, or 300 mg; all the patients then received the 300-mg dose for at least 96 weeks. The primary outcome was the time until a 2-point decline in the score on the motor and language domains of the CLN2 Clinical Rating Scale (which ranges from 0 to 6, with 0 representing no function and 3 representing normal function in each of the two domains), which was compared with the time until a 2-point decline in 42 historical controls. We also compared the rate of decline in the motor–language score between the two groups, using data from baseline to the last assessment with a score of more than 0, divided by the length of follow-up (in units of 48 weeks). RESULTS Twenty-four patients were enrolled, 23 of whom constituted the efficacy population. The median time until a 2-point decline in the motor–language score was not reached for treated patients and was 345 days for historical controls. The mean (±SD) unadjusted rate of decline in the motor–language score per 48-week period was 0.27±0.35 points in treated patients and 2.12±0.98 points in 42 historical controls (mean difference, 1.85; P<0.001). Common adverse events included convulsions, pyrexia, vomiting, hypersensitivity reactions, and failure of the intraventricular device. In 2 patients, infections developed in the intraventricular device that was used to administer the infusion, which required antibiotic treatment and device replacement. CONCLUSIONS Intraventricular infusion of cerliponase alfa in patients with CLN2 disease resulted in less decline in motor and language function than that in historical controls. Serious adverse events included failure of the intraventricular device and device-related infections. (Funded by BioMarin Pharmaceutical and others; CLN2 ClinicalTrials.gov numbers, NCT01907087 and NCT02485899.)
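
    The rate-of-decline measure described above is simple arithmetic: the drop in the motor–language score from baseline to the last assessment with a score above 0, divided by the follow-up length expressed in 48-week units. A small sketch with invented numbers:

```python
# Minimal sketch of the per-48-week decline rate described in the abstract.
# The example patient and numbers are invented for illustration only.
def decline_rate_per_48_weeks(baseline_score, last_score, follow_up_weeks):
    """Score points lost per 48-week period of follow-up."""
    return (baseline_score - last_score) / (follow_up_weeks / 48)

# A hypothetical patient who drops from 5 to 4 points over 96 weeks of follow-up:
print(decline_rate_per_48_weeks(5, 4, 96))  # 0.5 points per 48-week period
```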