
    Novel evolutionary algorithm identifies interactions driving infestation of Triatoma dimidiata, a Chagas disease vector

    Chagas disease is a lethal, neglected tropical disease. Unfortunately, aggressive insecticide-spraying campaigns have not been able to eliminate domestic infestation of Triatoma dimidiata, the native vector in Guatemala. To target interventions toward houses most at risk of infestation, comprehensive socioeconomic and entomologic surveys were conducted in two towns in Jutiapa, Guatemala. Given the exhaustively large search space associated with combinations of risk factors, traditional statistics are limited in their ability to discover risk factor interactions. Two recently developed statistical evolutionary algorithms, specifically designed to accommodate risk factor interactions and heterogeneity, were applied to this large combinatorial search space and used in tandem to identify sets of risk factor combinations associated with infestation. The optimal model includes 10 risk factors in what is known as a third-order disjunctive normal form (i.e., infested households have chicken coops AND deteriorated bedroom walls OR an accumulation of objects AND dirt floors AND total number of occupants ≥ 5 AND years of electricity ≥ 5 OR poor hygienic condition ratings AND adobe walls AND deteriorated walls AND dogs). Houses with dirt floors and deteriorated walls have been reported previously as risk factors and align well with factors currently targeted by Ecohealth interventions to minimize infestation. However, the tandem evolutionary algorithms also identified two new socioeconomic risk factors (i.e., households having many occupants and years of electricity ≥ 5). Identifying key risk factors may help with the development of new Ecohealth interventions and/or reduce the survey time needed to identify houses most at risk.
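
    To make the reported model concrete, the following is a minimal Python sketch of how a third-order disjunctive normal form rule over household risk factors can be evaluated; the field names and example household are hypothetical and only paraphrase the optimal model quoted above.

```python
# Hypothetical illustration of evaluating a third-order disjunctive normal form
# (DNF) risk model like the one reported in the abstract. Field names and the
# example household are invented for demonstration only.

def predicted_infested(h: dict) -> bool:
    """Return True if the household matches any of the three risk-factor conjunctions."""
    clause_1 = h["chicken_coop"] and h["deteriorated_bedroom_walls"]
    clause_2 = (h["object_accumulation"] and h["dirt_floor"]
                and h["occupants"] >= 5 and h["years_electricity"] >= 5)
    clause_3 = (h["poor_hygiene_rating"] and h["adobe_walls"]
                and h["deteriorated_walls"] and h["dogs"])
    return clause_1 or clause_2 or clause_3

example_household = {
    "chicken_coop": False, "deteriorated_bedroom_walls": False,
    "object_accumulation": True, "dirt_floor": True,
    "occupants": 6, "years_electricity": 7,
    "poor_hygiene_rating": False, "adobe_walls": True,
    "deteriorated_walls": True, "dogs": False,
}
print(predicted_infested(example_household))  # True: matches the second clause
```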

    Computational Stem Cell Biology: Open Questions and Guiding Principles

    Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more researchers to join this exciting area.

    Uncovering vector, parasite, blood meal and microbiome patterns from mixed-DNA specimens of the Chagas disease vector Triatoma dimidiata

    Chagas disease, considered a neglected disease by the World Health Organization, is caused by the protozoan parasite Trypanosoma cruzi, and transmitted by >140 triatomine species across the Americas. In Central America, the main vector is Triatoma dimidiata, an opportunistic blood meal feeder inhabiting both domestic and sylvatic ecotopes. Given the diversity of interacting biological agents involved in the epidemiology of Chagas disease, having simultaneous information on the dynamics of the parasite, the vector, the gut microbiome of the vector, and the blood meal source would facilitate identifying key biotic factors associated with the risk of T. cruzi transmission. In this study, we developed a RADseq-based analysis pipeline to study mixed-species DNA extracted from T. dimidiata abdomens. To evaluate the efficacy of the method across spatial scales, we used a nested spatial sampling design that spanned from individual villages within Guatemala to major biogeographic regions of Central America. Information from each biotic source was distinguished with bioinformatics tools and used to evaluate the prevalence of T. cruzi infection and predominant Discrete Typing Units (DTUs) in the region, the population genetic structure of T. dimidiata, gut microbial diversity, and the blood meal history. An average of 3.25 million reads per specimen were obtained, with approximately 1% assigned to the parasite, 20% to the vector, 11% to bacteria, and 4% to putative blood meals. Using a total of 6,405 T. cruzi SNPs, we detected nine infected vectors harboring two distinct DTUs: TcI and a second unidentified strain, possibly TcIV. Vector specimens were sufficiently variable for population genomic analyses, with a total of 25,710 T. dimidiata SNPs across all samples that were sufficient to detect geographic genetic structure at both local and regional scales. We observed a diverse microbial community, with significantly higher bacterial species richness in infected T. dimidiata abdomens than in those that were not infected. UniFrac analysis suggests a common assemblage of bacteria associated with infection, which co-occurs with the typical gut microbial community derived from the local environment. We identified vertebrate blood meals from five T. dimidiata abdomens, including chicken, dog, duck and human; however, additional detection methods would be necessary to confidently identify blood meal sources from most specimens. Overall, our study shows this method is effective for simultaneously generating genetic data on vectors and their associated parasites, along with ecological information on feeding patterns and microbial interactions, which may be followed up with complementary approaches such as PCR-based parasite detection and 18S eukaryotic and 16S bacterial barcoding.
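
    As a rough illustration of the per-specimen bookkeeping this kind of pipeline performs, the sketch below tallies the fraction of reads attributed to each biotic source, assuming an upstream aligner has already labelled every read; the labels and counts are hypothetical and this is not the authors' pipeline.

```python
# Minimal sketch of summarising mixed-source reads per specimen, assuming an
# upstream aligner has already labelled each read as parasite, vector, bacteria,
# blood meal, or unassigned. Labels and input values are hypothetical.
from collections import Counter

def summarise_read_sources(read_labels):
    """Return the fraction of reads assigned to each biotic source."""
    counts = Counter(read_labels)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}

labels = ["vector"] * 200 + ["bacteria"] * 110 + ["parasite"] * 10 + \
         ["blood_meal"] * 40 + ["unassigned"] * 640
print(summarise_read_sources(labels))
# roughly 20% vector, 11% bacteria, 1% parasite, 4% blood meal, rest unassigned
```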

    High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

    Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). As such, there has been a surge of interest in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g., lung cancer) or to re-examine experiments for reasons different from those examined in the initial, publishing study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, such as manually aggregating, filtering and formatting the data. Methods to automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses. Methods: We present a straightforward, simple method, robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality checked by comparing parametric distributions and quantile normalized to enable direct comparison of expression levels for subsequent meta-analyses. Results: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that when conducting a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization offered minimal improvement over using the raw data. Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and effective means of automating normalization procedures not only empower meta-analytic approaches, but aid in reproducibility by providing a standard way of approaching the problem. Data availability: a matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for GDS file pre-processing is available from the authors upon request.
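
    The pre-processing step named above, quantile normalization, can be sketched generically as follows; this is a textbook implementation for a genes-by-samples matrix, not the authors' production code.

```python
# Minimal sketch of quantile normalisation for a genes-x-samples expression
# matrix, the pre-processing step described above. Generic illustration only.
import numpy as np

def quantile_normalize(matrix: np.ndarray) -> np.ndarray:
    """Force every column (sample) to share the same empirical distribution."""
    ranks = np.argsort(np.argsort(matrix, axis=0), axis=0)  # rank of each value within its column
    sorted_cols = np.sort(matrix, axis=0)
    reference = sorted_cols.mean(axis=1)                     # mean value at each rank across samples
    return reference[ranks]

expr = np.random.lognormal(mean=5, sigma=1, size=(1000, 4))  # toy expression data
normalized = quantile_normalize(expr)
print(np.allclose(np.sort(normalized[:, 0]), np.sort(normalized[:, 1])))  # True
```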

    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the growing number of gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of the faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, where it produces focused clusters that conform well to known biological facts. Furthermore, in silico hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in silico biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004).
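
    The core consensus idea, that genes clustered together across many datasets form candidate consistently co-expressed sets, can be illustrated with the short sketch below; it is a generic consensus co-membership computation using k-means purely for illustration, not the published Bi-CoPaM/UNCLES algorithm.

```python
# Minimal sketch of consensus co-membership across multiple expression datasets:
# the fraction of datasets in which each pair of genes falls in the same cluster.
# Generic illustration only, not the Bi-CoPaM/UNCLES method itself.
import numpy as np
from sklearn.cluster import KMeans

def consensus_comembership(datasets, n_clusters=3, seed=0):
    """Fraction of datasets in which each gene pair lands in the same cluster."""
    n_genes = datasets[0].shape[0]
    consensus = np.zeros((n_genes, n_genes))
    for X in datasets:  # each X: genes x conditions
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
        consensus += (labels[:, None] == labels[None, :]).astype(float)
    return consensus / len(datasets)

rng = np.random.default_rng(0)
datasets = [rng.normal(size=(50, 8)) for _ in range(4)]  # 50 genes x 8 conditions, 4 datasets
C = consensus_comembership(datasets)
always_together = np.argwhere(np.triu(C == 1.0, k=1))    # gene pairs co-clustered in every dataset
print(len(always_together))
```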

    Clinical Outcomes and Quality of Life in Recipients of Livers Donated after Cardiac Death

    Donation after cardiac death (DCD) has expanded in the last decade in the US; however, DCD liver utilization has flattened in recent years due to poor outcomes. We examined clinical and quality of life (QOL) outcomes of DCD recipients by conducting a retrospective and cross-sectional review of patients from 2003 to 2010. We compared clinical outcomes of DCD recipients (n=60) to those of donation after brain death (DBD) liver recipients (n=669) during the same time period. DCD recipients had significantly lower rates of 5-year graft survival (P<0.001) and a trend toward lower rates of 5-year patient survival (P=0.064) when compared to the DBD cohort. In order to examine QOL outcomes in our cohorts, we administered the Short Form Liver Disease Quality of Life questionnaire to 30 DCD and 60 DBD recipients. The DCD recipients reported lower generic and liver-specific QOL. We further stratified the DCD cohort by the presence of ischemic cholangiopathy (IC). Patients with IC reported lower QOL when compared to DBD recipients and those DCD recipients without IC (P<0.05). While the results are consistent with clinical experience, this is the first report of QOL in DCD recipients using standardized measures. These data can be used to guide future comparative effectiveness studies.

    Elusive Copy Number Variation in the Mouse Genome

    Array comparative genomic hybridization (aCGH) to detect copy number variants (CNVs) in mammalian genomes has led to a growing awareness of the potential importance of this category of sequence variation as a cause of phenotypic variation. Yet there are large discrepancies between studies, so that the extent of the genome affected by CNVs is unknown. We combined molecular and aCGH analyses of CNVs in inbred mouse strains to investigate this question. Using a 2.1-million-probe array we identified 1,477 deletions and 499 gains in 7 inbred mouse strains. Molecular characterization indicated that approximately one third of the CNVs detected by the array were false positives, and we estimate the false negative rate to be more than 50%. We show that low concordance between studies is largely due to the molecular nature of CNVs, many of which consist of a series of smaller deletions and gains interspersed by regions where the DNA copy number is normal. Our results indicate that CNVs detected by arrays may be the coincidental co-localization of smaller CNVs, whose presence is more likely to perturb an aCGH hybridization profile than the effect of an isolated, small copy number alteration. Our findings help explain the hitherto unexplored discrepancies between array-based studies of copy number variation in the mouse genome.
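
    For readers unfamiliar with aCGH-based CNV calling, the sketch below shows the generic idea of flagging runs of consecutive probes whose log2 ratios exceed a threshold as gains or deletions; it is an illustrative toy, not the detection pipeline used in the study.

```python
# Minimal sketch of how candidate CNVs are typically called from aCGH probe
# log2 ratios: runs of consecutive probes above/below a threshold are flagged as
# gains/deletions. Generic illustration only, not the study's pipeline.
import numpy as np

def call_cnv_segments(log2_ratios, threshold=0.5, min_probes=5):
    """Return (start, end, 'gain'|'deletion') for runs of >= min_probes aberrant probes."""
    state = np.where(log2_ratios > threshold, 1, np.where(log2_ratios < -threshold, -1, 0))
    segments, start = [], 0
    for i in range(1, len(state) + 1):
        if i == len(state) or state[i] != state[start]:
            if state[start] != 0 and i - start >= min_probes:
                segments.append((start, i, "gain" if state[start] == 1 else "deletion"))
            start = i
    return segments

ratios = np.concatenate([np.zeros(20), -np.ones(8), np.zeros(15), np.ones(6) * 0.8, np.zeros(10)])
print(call_cnv_segments(ratios))  # [(20, 28, 'deletion'), (43, 49, 'gain')]
```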

    Study of Intraventricular Cerliponase Alfa for CLN2 Disease

    BACKGROUND Recombinant human tripeptidyl peptidase 1 (cerliponase alfa) is an enzyme-replacement therapy that has been developed to treat neuronal ceroid lipofuscinosis type 2 (CLN2) disease, a rare lysosomal disorder that causes progressive dementia in children. METHODS In a multicenter, open-label study, we evaluated the effect of intraventricular infusion of cerliponase alfa every 2 weeks in children with CLN2 disease who were between the ages of 3 and 16 years. Treatment was initiated at a dose of 30 mg, 100 mg, or 300 mg; all the patients then received the 300-mg dose for at least 96 weeks. The primary outcome was the time until a 2-point decline in the score on the motor and language domains of the CLN2 Clinical Rating Scale (which ranges from 0 to 6, with 0 representing no function and 3 representing normal function in each of the two domains), which was compared with the time until a 2-point decline in 42 historical controls. We also compared the rate of decline in the motor–language score between the two groups, using data from baseline to the last assessment with a score of more than 0, divided by the length of follow-up (in units of 48 weeks). RESULTS Twenty-four patients were enrolled, 23 of whom constituted the efficacy population. The median time until a 2-point decline in the motor–language score was not reached for treated patients and was 345 days for historical controls. The mean (±SD) unadjusted rate of decline in the motor–language score per 48-week period was 0.27±0.35 points in treated patients and 2.12±0.98 points in 42 historical controls (mean difference, 1.85; P<0.001). Common adverse events included convulsions, pyrexia, vomiting, hypersensitivity reactions, and failure of the intraventricular device. In 2 patients, infections developed in the intraventricular device that was used to administer the infusion, which required antibiotic treatment and device replacement. CONCLUSIONS Intraventricular infusion of cerliponase alfa in patients with CLN2 disease resulted in less decline in motor and language function than that in historical controls. Serious adverse events included failure of the intraventricular device and device-related infections. (Funded by BioMarin Pharmaceutical and others; CLN2 ClinicalTrials.gov numbers, NCT01907087 and NCT02485899.)
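
    The rate-of-decline measure described above is simple arithmetic: the drop in the motor–language score from baseline to the last assessment with a score above 0, divided by the follow-up length expressed in 48-week units. A small sketch with invented numbers:

```python
# Minimal sketch of the per-48-week decline rate described in the abstract.
# The example patient and numbers are invented for illustration only.
def decline_rate_per_48_weeks(baseline_score, last_score, follow_up_weeks):
    """Score points lost per 48-week period of follow-up."""
    return (baseline_score - last_score) / (follow_up_weeks / 48)

# A hypothetical patient who drops from 5 to 4 points over 96 weeks of follow-up:
print(decline_rate_per_48_weeks(5, 4, 96))  # 0.5 points per 48-week period
```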