159 research outputs found

    Predictive gene lists for breast cancer prognosis: A topographic visualisation study

    Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGLs), small subsets of genes selected from the very large pool of candidates available in DNA microarray experiments, is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high-dimensional spaces. In this work we outline a different approach based on an unsupervised, patient-specific nonlinear topographic projection of predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) are used to construct two-dimensional projective visualisation plots of the 70-dimensional PGL of each patient. Classifiers are also constructed on the resulting projections to identify the prognosis indicator of each patient and to investigate whether, a posteriori, the two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, based on the projections derived from the original dataset. Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results. Conclusion: The random correlation effect to an arbitrary outcome, induced by small subset selection from very high-dimensional interrelated gene expression profiles, leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempt at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.

    R-Gada: a fast and flexible pipeline for copy number analysis in association studies

    Background: Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association. Results: Here we present a new R package that integrates: (i) data import from the most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the copy number calls, identification of recurrent CNVs, multivariate analysis of population structure, and tools for performing association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset) we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (3 million probe arrays) on a single-core computer, and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis. Conclusions: The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.

    puma: a Bioconductor package for propagating uncertainty in microarray analysis

    BACKGROUND: Most analyses of microarray data are based on point estimates of expression levels and ignore the uncertainty of such estimates. By determining uncertainties from Affymetrix GeneChip data and propagating these uncertainties to downstream analyses it has been shown that we can improve results of differential expression detection, principal component analysis and clustering. Previously, implementations of these uncertainty propagation methods have only been available as separate packages, written in different languages. Previous implementations have also suffered from being very costly to compute and, in the case of differential expression detection, have been limited in the experimental designs to which they can be applied. RESULTS: puma is a Bioconductor package incorporating a suite of analysis methods for use on Affymetrix GeneChip data. puma extends the differential expression detection methods of previous work from the 2-class case to the multi-factorial case. puma can be used to automatically create design and contrast matrices for typical experimental designs, which can be used both within the package itself and in other Bioconductor packages. The implementation of differential expression detection methods has been parallelised, leading to significant decreases in processing time on a range of computer architectures. puma incorporates the first R implementation of an uncertainty propagation version of principal component analysis, and an implementation of a clustering method based on uncertainty propagation. All of these techniques are brought together in a single, easy-to-use package with clear, task-based documentation. CONCLUSION: For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. These methods can be used to improve results from more traditional analyses of microarray data. puma also offers improvements in terms of scope and speed of execution over previously available methods. puma is recommended for anyone working with the Affymetrix GeneChip platform for gene expression analysis and can also be applied more generally.

    Self- and peer assessment may not be an accurate measure of PBL tutorial process

    Background: Universidade Cidade de São Paulo adopted a problem-based learning (PBL) strategy as the predominant method for teaching and learning medicine. Self, peer and tutor marks of the educational process are taken into account as part of the final grade, which also includes assessment of content. This study compared the different perspectives (and grades) of evaluators during tutorials with first-year medical students over seven semesters, from 2004 to 2007 (n = 349). Methods: The tutorial evaluation method comprised the students' self assessment (SA) (10%), tutor assessment (TA) (80%) and peer assessment (PA) (10%), which were combined to calculate a final educational process grade for each tutorial. We compared these three grades from each tutorial for seven semesters using ANOVA and a post hoc test. Results: A total of 349 students participated, with 199 (57%) women and 150 (42%) men. The SA and PA scores were consistently greater than the TA scores. Moreover, the SA and PA groups did not show a statistical difference in any semester evaluated, while both differed from tutor assessment in all semesters (Kruskal-Wallis, Dunn's test). The Spearman rank order showed a significant (p < 0.0001) and positive correlation for the SA and PA groups (r = 0.806); this was not observed when we compared TA with PA (r = 0.456) or TA with SA (r = 0.376). Conclusion: Peer- and self-assessment marks might be reliable but not valid for the PBL tutorial process, especially if these assessments are used for summative assessment, composing the final grade. This article suggests reconsideration of the use of summative assessment for self-evaluation in PBL tutorials.
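The weighting described in this abstract (self assessment 10%, tutor assessment 80%, peer assessment 10%) can be illustrated with a minimal Python sketch. The function name `process_grade` and the 0 to 10 mark scale are assumptions made for illustration; the abstract states only the weights.

```python
# Weights taken from the abstract; everything else here is illustrative.
WEIGHTS = {"SA": 0.10, "TA": 0.80, "PA": 0.10}


def process_grade(self_mark: float, tutor_mark: float, peer_mark: float) -> float:
    """Return the weighted educational-process grade for one tutorial.

    Marks are assumed to share a common scale (hypothetically 0 to 10).
    """
    return (
        WEIGHTS["SA"] * self_mark
        + WEIGHTS["TA"] * tutor_mark
        + WEIGHTS["PA"] * peer_mark
    )


if __name__ == "__main__":
    # Hypothetical marks: self and peer marks higher than the tutor mark,
    # the pattern the abstract reports.
    print(round(process_grade(9.0, 7.0, 9.5), 2))
```

Because the tutor mark carries 80% of the weight, inflated self and peer marks shift the combined grade only slightly; in the hypothetical case above the result is about 7.45 against a tutor mark of 7.0.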

    The C:N:P:S stoichiometry of soil organic matter

    The formation and turnover of soil organic matter (SOM) includes the biogeochemical processing of the macronutrient elements nitrogen (N), phosphorus (P) and sulphur (S), which alters their stoichiometric relationships to carbon (C) and to each other. We sought patterns among soil organic C, N, P and S in data for c. 2000 globally distributed soil samples, covering all soil horizons. For non-peat soils, strong negative correlations (p < 0.001) were found between N:C, P:C and S:C ratios and % organic carbon (OC), showing that SOM of soils with low OC concentrations (high in mineral matter) is rich in N, P and S. The results can be described approximately with a simple mixing model in which nutrient-poor SOM (NPSOM) has N:C, P:C and S:C ratios of 0.039, 0.0011 and 0.0054, while nutrient-rich SOM (NRSOM) has corresponding ratios of 0.12, 0.016 and 0.016, so that P is especially enriched in NRSOM compared to NPSOM. The trends hold across a range of ecosystems, for topsoils, including O horizons, and subsoils, and across different soil classes. The major exception is that tropical soils tend to have low P:C ratios especially at low N:C. We suggest that NRSOM comprises compounds selected by their strong adsorption to mineral matter. The stoichiometric patterns established here offer a new quantitative framework for SOM classification and characterisation, and provide important constraints to dynamic soil and ecosystem models of carbon turnover and nutrient dynamics
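The two-end-member mixing idea in this abstract can be sketched in Python. The end-member ratios are the values quoted in the abstract; the linear form of the mixing, weighted by the fraction of organic carbon held in NRSOM (`f_rich`), is an assumption made for illustration, since the abstract does not spell out the model equations. The function names are hypothetical.

```python
# End-member element:carbon ratios quoted in the abstract.
NPSOM = {"N": 0.039, "P": 0.0011, "S": 0.0054}  # nutrient-poor SOM
NRSOM = {"N": 0.12, "P": 0.016, "S": 0.016}     # nutrient-rich SOM


def mixed_ratios(f_rich: float) -> dict:
    """Element:C ratios of a mixture of the two end members.

    f_rich is the assumed fraction of total organic carbon present as NRSOM
    (0 = pure NPSOM, 1 = pure NRSOM). Weighting by carbon fraction makes the
    mixture ratio a linear combination of the end-member ratios.
    """
    if not 0.0 <= f_rich <= 1.0:
        raise ValueError("f_rich must lie between 0 and 1")
    return {
        element: (1.0 - f_rich) * NPSOM[element] + f_rich * NRSOM[element]
        for element in NPSOM
    }


def enrichment() -> dict:
    """Ratio of NRSOM to NPSOM element:C values for each element."""
    return {element: NRSOM[element] / NPSOM[element] for element in NPSOM}


if __name__ == "__main__":
    print(mixed_ratios(0.25))
    print(enrichment())
```

The enrichment factors computed this way are roughly 3 for N and S but over 14 for P, which matches the abstract's statement that P is especially enriched in NRSOM.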

    Hierarchical Models in the Brain

    This paper describes a general model that subsumes many parametric models for continuous data. The model comprises hidden layers of state-space or dynamic causal models, arranged so that the output of one provides input to another. The ensuing hierarchy furnishes a model for many types of data, of arbitrary complexity. Special cases range from the general linear model for static data to generalised convolution models, with system noise, for nonlinear time-series analysis. Crucially, all of these models can be inverted using exactly the same scheme, namely, dynamic expectation maximization. This means that a single model and optimisation scheme can be used to invert a wide range of models. We present the model and a brief review of its inversion to disclose the relationships among, apparently, diverse generative models of empirical data. We then show that this inversion can be formulated as a simple neural network and may provide a useful metaphor for inference and learning in the brain

    Depletion of B2 but Not B1a B Cells in BAFF Receptor-Deficient ApoE−/− Mice Attenuates Atherosclerosis by Potently Ameliorating Arterial Inflammation

    We have recently identified conventional B2 cells as atherogenic and B1a cells as atheroprotective in hypercholesterolemic ApoE−/− mice. Here, we examined the development of atherosclerosis in BAFF-R deficient ApoE−/− mice because B2 cells but not B1a cells are selectively depleted in BAFF-R deficient mice. We fed BAFF-R−/− ApoE−/− (BaffR.ApoE DKO) and BAFF-R+/+ ApoE−/− (ApoE KO) mice a high fat diet (HFD) for 8 weeks. B2 cells were significantly reduced by 82%, 81%, 94% and 72% in blood, peritoneal fluid, spleen and peripheral lymph nodes respectively, while B1a cells and non-B lymphocytes were unaffected. Aortic atherosclerotic lesions, assessed by oil red-O stained lipid accumulation and CD68+ macrophage accumulation, were decreased by 44% and 50% respectively. B cells were absent in atherosclerotic lesions of BaffR.ApoE DKO mice, as were IgG1 and IgG2a immunoglobulins produced by B2 cells, despite low but measurable numbers of B2 cells and IgG1 and IgG2a immunoglobulin concentrations in plasma. Plasma IgM and IgM deposits in atherosclerotic lesions were also reduced. BAFF-R deficiency in ApoE−/− mice was also associated with a reduced expression of VCAM-1 and fewer macrophages, dendritic cells, CD4+ and CD8+ T cell infiltrates and PCNA+ cells in lesions. The expression of the proinflammatory cytokines TNF-α and IL1-β and the proinflammatory chemokine MCP-1 was also reduced. Body weight and plasma cholesterols were unaffected in BaffR.ApoE DKO mice. Our data indicate that B2 cells are important contributors to the development of atherosclerosis and that targeting the BAFF-R to specifically reduce atherogenic B2 cell numbers while preserving atheroprotective B1a cell numbers may be a potential therapeutic strategy to reduce atherosclerosis by potently reducing arterial inflammation.

    Transthyretin Aggregation Pathway toward the Formation of Distinct Cytotoxic Oligomers

    Characterization of small oligomers formed at an early stage of amyloid formation is critical to understanding the molecular mechanism of the pathogenic aggregation process. Here we identified and characterized cytotoxic oligomeric intermediates populated during the transthyretin (TTR) aggregation process. Under the amyloid-forming conditions, TTR initially forms a dimer through interactions between outer strands. The dimers then associate to form a hexamer with a spherical shape, which serves as a building block to self-assemble into cytotoxic oligomers. Notably, wild-type (WT) TTR tends to form linear oligomers, while a TTR variant (G53A) prefers forming annular oligomers with pore-like structures. Structural analyses of the amyloidogenic intermediates using circular dichroism (CD) and solid-state NMR reveal that the dimer and oligomers have a significant degree of native-like β-sheet structures (35–38%), but with more disordered regions (~60%) than those of native TTR. The TTR variant oligomers are also less structured than WT oligomers. The partially folded nature of the oligomeric intermediates might be a common structural property of cytotoxic oligomers. The higher flexibility of the dimer and oligomers may also compensate for the entropic loss due to the oligomerization of the monomers.

    Mining and state-space modeling and verification of sub-networks from large-scale biomolecular networks

    Background: Biomolecular networks dynamically respond to stimuli and implement cellular function. Understanding these dynamic changes is the key challenge for cell biologists. As biomolecular networks grow in size and complexity, the model of a biomolecular network must become more rigorous to keep track of all the components and their interactions. In general this presents the need for computer simulation to manipulate and understand the biomolecular network model. Results: In this paper, we present a novel method to model the regulatory system which executes a cellular function and can be represented as a biomolecular network. Our method consists of two steps. First, a novel scale-free network clustering approach is applied to the large-scale biomolecular network to obtain various sub-networks. Second, a state-space model is generated for the sub-networks and simulated to predict their behavior in the cellular context. The modeling results represent hypotheses that are tested against high-throughput data sets (microarrays and/or genetic screens) for both the natural system and perturbations. Notably, the dynamic modeling component of this method depends on the automated network structure generation of the first component and the sub-network clustering, which are both essential to make the solution tractable. Conclusion: Experimental results on time series gene expression data for the human cell cycle indicate that our approach is promising for sub-network mining and simulation from large-scale biomolecular networks.

    Characterization of transcriptional networks in blood stem and progenitor cells using high-throughput single-cell gene expression analysis

    Cellular decision-making is mediated by a complex interplay of external stimuli with the intracellular environment, in particular transcription factor regulatory networks. Here we have determined the expression of a network of 18 key haematopoietic transcription factors in 597 single primary blood stem and progenitor cells isolated from mouse bone marrow. We demonstrate that different stem/progenitor populations are characterized by distinctive transcription factor expression states, and through comprehensive bioinformatic analysis reveal positively and negatively correlated transcription factor pairings, including previously unrecognized relationships between Gata2, Gfi1 and Gfi1b. Validation using transcriptional and transgenic assays confirmed direct regulatory interactions consistent with a regulatory triad in immature blood stem cells, where Gata2 may function to modulate cross-inhibition between Gfi1 and Gfi1b. Single-cell expression profiling therefore identifies network states and allows reconstruction of network hierarchies involved in controlling stem cell fate choices, and provides a blueprint for studying both normal development and human disease