15,805 research outputs found

    Automated data integration for developmental biological research

    In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from these publicly available data to benefit our studies of individual genes, and how to use them to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research.

    Systematic Planning of Genome-Scale Experiments in Poorly Studied Species

    Genome-scale datasets have been used extensively in model organisms to screen for specific candidates or to predict functions for uncharacterized genes. However, despite the availability of extensive knowledge in model organisms, the planning of genome-scale experiments in poorly studied species is still based on the intuition of experts or heuristic trials. We propose that computational and systematic approaches can be applied to drive the experiment planning process in poorly studied species based on available data and knowledge in closely related model organisms. In this paper, we suggest a computational strategy for recommending genome-scale experiments based on their capability to interrogate diverse biological processes to enable protein function assignment. To this end, we use the data-rich functional genomics compendium of the model organism to quantify the accuracy of each dataset in predicting each specific biological process and the overlap in such coverage between different datasets. Our approach uses an optimized combination of these quantifications to recommend an ordered list of experiments for accurately annotating most proteins in the poorly studied related organisms to most biological processes, as well as a set of experiments that target each specific biological process. The effectiveness of this experiment-planning system is demonstrated for two related yeast species: the model organism Saccharomyces cerevisiae and the comparatively poorly studied Saccharomyces bayanus. Our system recommended a set of S. bayanus experiments based on an S. cerevisiae microarray data compendium. In silico evaluations estimate that fewer than 10% of the experiments could achieve functional coverage similar to that of the whole microarray compendium. This estimate was confirmed by performing the recommended experiments in S. bayanus, which significantly reduced the labor needed to characterize the poorly studied genome. This experiment-planning framework could readily be adapted to the design of other types of large-scale experiments as well as to other groups of organisms.
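
    The abstract above describes ranking experiments by how accurately each dataset predicts each biological process, discounted by overlap with other datasets. The abstract does not give the optimization itself, so the following is only a minimal sketch of one plausible reading: a greedy ordering by marginal coverage gain. The function name `rank_experiments`, the dataset names, and the accuracy values are all hypothetical and are not taken from the paper.

```python
def rank_experiments(accuracy):
    """Order datasets greedily by marginal gain in per-process accuracy.

    `accuracy` maps dataset name -> {process name: score in [0, 1]}.
    This is an illustrative heuristic, not the authors' published method.
    """
    best = {}                  # best score reached so far for each process
    remaining = set(accuracy)  # datasets not yet placed in the ordering
    order = []

    def gain(dataset):
        # Only improvement over what is already covered counts,
        # so datasets that overlap with earlier picks are penalized.
        return sum(max(0.0, score - best.get(process, 0.0))
                   for process, score in accuracy[dataset].items())

    while remaining:
        # Sorting first makes ties resolve alphabetically.
        pick = max(sorted(remaining), key=gain)
        order.append(pick)
        remaining.remove(pick)
        for process, score in accuracy[pick].items():
            best[process] = max(best.get(process, 0.0), score)
    return order


# Hypothetical toy compendium with made-up scores.
toy = {
    "heat_shock": {"stress": 0.9, "ribosome": 0.2},
    "diauxic": {"stress": 0.8, "respiration": 0.7},
    "cell_cycle": {"mitosis": 0.9},
}
ranking = rank_experiments(toy)
print(ranking)
```

    In this toy input, `heat_shock` ranks last even though it has the highest single score, because most of what it covers is already covered by `diauxic`; that is the overlap penalty the abstract alludes to.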

    Novel Methods for Multivariate Ordinal Data applied to Genetic Diplotypes, Genomic Pathways, Risk Profiles, and Pattern Similarity

    Introduction: Conventional statistical methods for multivariate data (e.g., discriminant/regression) are based on the (generalized) linear model, i.e., the data are interpreted as points in a Euclidean space of independent dimensions. The dimensionality of the data is then reduced by assuming the components to be related by a specific function of known type (linear, exponential, etc.), which allows the distance of each point from a hyperspace to be determined. While mathematically elegant, these approaches may have shortcomings when applied to real-world applications where the relative importance, the functional relationship, and the correlation among the variables tend to be unknown. Still, in many applications, each variable can be assumed to have at least an “orientation”, i.e., it can reasonably be assumed that, if all other conditions are held constant, an increase in this variable is either “good” or “bad”. The direction of this orientation can be known or unknown. In genetics, for instance, having more “abnormal” alleles may increase the risk (or magnitude) of a disease phenotype. In genomics, the expression of several related genes may indicate disease activity. When screening for security risks, more indicators of atypical behavior may raise more concern; in face or voice recognition, more indicators being similar may increase the likelihood of a person being identified. Methods: In 1998, we developed a nonparametric method for analyzing multivariate ordinal data to assess the overall risk of HIV infection based on different types of behavior or the overall protective effect of barrier methods against HIV infection. By using u-statistics, rather than the marginal likelihood, we were able to increase the computational efficiency of this approach by several orders of magnitude. Results: We applied this approach to assessing the immunogenicity of a vaccination strategy in cancer patients. While discussing the pitfalls of the conventional methods for linking quantitative traits to haplotypes, we realized that this approach could be easily modified into a statistically valid alternative to previously proposed approaches. We have now begun to use the same methodology to correlate activity of anti-inflammatory drugs along genomic pathways with disease severity of psoriasis based on several clinical and histological characteristics. Conclusion: Multivariate ordinal data are frequently observed when assessing semiquantitative characteristics, such as risk profiles (genetic, genomic, or security) or similarity of pattern (faces, voices, behaviors). The conventional methods require empirical validation, because the functions and weights chosen cannot be justified on theoretical grounds. The proposed statistical method for analyzing profiles of ordinal variables is intrinsically valid. Since no additional assumptions need to be made, the often time-consuming empirical validation can be skipped. Keywords: ranking; nonparametric; robust; scoring; multivariate
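
    The abstract above does not spell out the scoring rule, so the following is only a sketch of one common way to build u-statistic scores from a partial ordering of ordinal profiles. It assumes every variable has already been oriented so that a larger value is "worse", and that one profile outranks another only if it is at least as large on every variable and strictly larger on at least one. The names `dominates` and `u_scores` are hypothetical.

```python
def dominates(a, b):
    """True if profile a is >= b on every variable and > b on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def u_scores(profiles):
    """Score each profile as (#profiles it dominates) - (#profiles dominating it).

    Pairs that cannot be ordered (one variable higher, another lower)
    contribute nothing, so no weights or functional form are assumed.
    """
    return [
        sum(dominates(p, q) for q in profiles)
        - sum(dominates(q, p) for q in profiles)
        for p in profiles
    ]


# Four hypothetical subjects measured on two ordinal risk indicators.
subjects = [(0, 0), (1, 0), (0, 1), (1, 1)]
scores = u_scores(subjects)
print(scores)
```

    The two middle subjects cannot be ordered against each other, so they receive the same score without any weighting of the two indicators.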

    Conserved host response to highly pathogenic avian influenza virus infection in human cell culture, mouse and macaque model systems

    Background: Understanding host response to influenza virus infection will facilitate development of better diagnoses and therapeutic interventions. Several different experimental models have been used as a proxy for human infection, including cell cultures derived from human cells, mice, and non-human primates. Each of these systems has been studied extensively in isolation, but little effort has been directed toward systematically characterizing the conservation of host response on a global level beyond known immune signaling cascades. Results: In the present study, we employed a multivariate modeling approach to characterize and compare the transcriptional regulatory networks between these three model systems after infection with a highly pathogenic avian influenza virus of the H5N1 subtype. Using this approach we identified functions and pathways that display similar behavior and/or regulation, including the well-studied impact on the interferon response and the inflammasome. Our results also suggest a primary response role for airway epithelial cells in initiating hypercytokinemia, which is thought to contribute to the pathogenesis of H5N1 viruses. We further demonstrate that we can use a transcriptional regulatory model from the human cell culture data to make highly accurate predictions about the behavior of important components of the innate immune system in tissues from whole organisms. Conclusions: This is the first demonstration of a global regulatory network modeling conserved host response between in vitro and in vivo models.

    Privacy and Accountability in Black-Box Medicine

    Black-box medicine, the use of big data and sophisticated machine learning techniques for health-care applications, could be the future of personalized medicine. Black-box medicine promises to make it easier to diagnose rare diseases and conditions, identify the most promising treatments, and allocate scarce resources among different patients. But to succeed, it must overcome two separate, but related, problems: patient privacy and algorithmic accountability. Privacy is a problem because researchers need access to huge amounts of patient health information to generate useful medical predictions. And accountability is a problem because black-box algorithms must be verified by outsiders to ensure they are accurate and unbiased, but this means giving outsiders access to this health information. This article examines the tension between the twin goals of privacy and accountability and develops a framework for balancing that tension. It proposes three pillars for an effective system of privacy-preserving accountability: substantive limitations on the collection, use, and disclosure of patient information; independent gatekeepers regulating information sharing between those developing and verifying black-box algorithms; and information-security requirements to prevent unintentional disclosures of patient information. The article examines and draws on a similar debate in the field of clinical trials, where disclosing information from past trials can lead to new treatments but also threatens patient privacy.

    EGFR associated expression profiles vary with breast tumor subtype

    Background: The epidermal growth factor receptor (EGFR/HER1) and its downstream signaling events are important for regulating cell growth and behavior in many epithelial tumor types. In breast cancer, the role of EGFR is complex and appears to vary relative to important clinical features including estrogen receptor (ER) status. To investigate EGFR signaling using a genomics approach, several breast basal-like and luminal epithelial cell lines were examined for sensitivity to EGFR inhibitors. An EGFR-associated gene expression signature was identified in the basal-like SUM102 cell line and was used to classify a diverse set of sporadic breast tumors. Results: In vitro, breast basal-like cell lines were more sensitive to EGFR inhibitors than luminal cell lines. The basal-like tumor-derived lines were also the most sensitive to carboplatin, which acted synergistically with cetuximab. An EGFR-associated signature was developed in vitro and evaluated on 241 primary breast tumors; three distinct clusters of genes were evident in vivo, two of which were predictive of poor patient outcomes. These EGFR-associated poor prognostic signatures were highly expressed in almost all basal-like tumors and many of the HER2+/ER- and Luminal B tumors. Conclusion: These results suggest that breast basal-like cell lines are sensitive to EGFR inhibitors and carboplatin, and this combination may also be synergistic. In vivo, the EGFR signatures were of prognostic value, were associated with tumor subtype, and were uniquely associated with the high expression of distinct EGFR-RAS-MEK pathway genes.

    A cell–ECM screening method to predict breast cancer metastasis

    Breast cancer preferentially spreads to the bone, brain, liver, and lung. The clinical patterns of this tissue-specific spread (tropism) cannot be explained by blood flow alone, yet our understanding of what mediates tropism to these physically and chemically diverse tissues is limited. While the microenvironment has been recognized as a critical factor in governing metastatic colonization, the role of the extracellular matrix (ECM) in mediating tropism has not been thoroughly explored. We created a simple biomaterial platform with systematic control over the ECM protein density and composition to determine if integrin binding governs how metastatic cells differentiate between secondary tissue sites. Instead of examining individual behaviors, we compiled large patterns of phenotypes associated with adhesion to and migration on these controlled ECMs. In combining this novel analysis with a simple biomaterial platform, we created an in vitro fingerprint that is predictive of in vivo metastasis. This rapid biomaterial screen also provided information on how β1, α2, and α6 integrins might mediate metastasis in patients, providing insights beyond a purely genetic analysis. We propose that this approach of screening many cell–ECM interactions, across many different heterogeneous cell lines, is predictive of in vivo behavior, and is much simpler, faster, and more economical than complex 3D environments or mouse models. We also propose that when specifically applied toward the question of tissue tropism in breast cancer, it can be used to provide insight into certain integrin subunits as therapeutic targets.

    Graph Theory and Networks in Biology

    In this paper, we present a survey of the use of graph theoretical techniques in Biology. In particular, we discuss recent work on identifying and modelling the structure of bio-molecular networks, as well as the application of centrality measures to interaction networks and research on the hierarchical structure of such networks and network motifs. Work on the link between structural network properties and dynamics is also described, with emphasis on synchronization and disease propagation. Comment: 52 pages, 5 figures, Survey Paper
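
    As a small illustration of the centrality measures mentioned in the abstract above, the sketch below computes normalized degree centrality, the simplest such measure, for an undirected interaction network. The survey covers several measures and the abstract does not single this one out; the function name `degree_centrality` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return node -> degree / (n - 1) for an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored and
    duplicate edges are counted once.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # a self-loop does not connect to another node
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical protein interaction network with one hub, "A".
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(toy_edges)
print(max(centrality, key=centrality.get))
```

    Dividing by n - 1 puts scores on a 0 to 1 scale, so a node connected to every other node scores 1 regardless of network size.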