650 research outputs found

    Semi-supervised learning for the identification of syn-expressed genes from fused microarray and in situ image data

    Background: Gene expression measurements during the development of the fly Drosophila melanogaster are routinely used to find functional modules of temporally co-expressed genes. Complementary large data sets of in situ RNA hybridization images for different stages of the fly embryo elucidate the spatial expression patterns. Results: Using a semi-supervised approach, constrained clustering with mixture models, we can find clusters of genes exhibiting spatio-temporal similarities in expression, or syn-expression. The temporal gene expression measurements are taken as primary data, for which pairwise constraints are computed in an automated fashion from raw in situ images without the need for manual annotation. We investigate the influence of these pairwise constraints on the clustering and discuss the biological relevance of our results. Conclusion: Spatial information contributes to a detailed, biologically meaningful analysis of temporal gene expression data. Semi-supervised learning provides a flexible, robust and efficient framework for integrating data sources of differing quality and abundance.

    Learning Sparse Representations for Fruit-Fly Gene Expression Pattern Image Annotation and Retrieval

    Background: Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords. Results: In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes. Conclusions: We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features leads to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yield better results. The electronic version of this article is the complete one and can be found online at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-13-10

    A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis

    Background: Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by a unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Existing qualitative methods, enhanced by quantitative analysis based on the computational tools we present in this paper, would provide promising ways of addressing key scientific questions. Results: We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/. Conclusions: Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods. The electronic version of this article is the complete one and can be found online at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-37

    SHIRAZ: an automated histology image annotation system for zebrafish phenomics

    Histological characterization is used in clinical and research contexts as a highly sensitive method for detecting the morphological features of disease and abnormal gene function. Histology has recently been accepted as a phenotyping method for the forthcoming Zebrafish Phenome Project, a large-scale community effort to characterize the morphological, physiological, and behavioral phenotypes resulting from mutations in all known genes in the zebrafish genome. In support of this project, we present a novel content-based image retrieval system for the automated annotation of images containing histological abnormalities in the developing eye of the larval zebrafish.

    Prediction of gene expression in embryonic structures of Drosophila melanogaster.

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.

    A conserved major facilitator superfamily member orchestrates a subset of O-glycosylation to aid macrophage tissue invasion

    Aberrant display of the truncated core1 O-glycan T-antigen is a common feature of human cancer cells that correlates with metastasis. Here we show that T-antigen in Drosophila melanogaster macrophages is involved in their developmentally programmed tissue invasion. Higher macrophage T-antigen levels require an atypical major facilitator superfamily (MFS) member that we named Minerva, which enables macrophage dissemination and invasion. We characterize for the first time the T and Tn glycoform O-glycoproteome of the Drosophila melanogaster embryo, and determine that Minerva increases the presence of T-antigen on proteins in pathways previously linked to cancer, most strongly on the sulfhydryl oxidase Qsox1, which we show is required for macrophage tissue entry. Minerva’s vertebrate ortholog, MFSD1, rescues the minerva mutant’s migration and T-antigen glycosylation defects. We thus identify a key conserved regulator that orchestrates O-glycosylation on a protein subset to activate a program governing migration steps important for both development and cancer metastasis.
