1,563 research outputs found
Three Steps to Heaven: Semantic Publishing in a Real World Workflow
Semantic publishing offers the promise of computable papers, enriched
visualisation and a realisation of the linked data ideal. In reality, however,
the publication process contrives to prevent richer semantics while culminating
in a 'lumpen' PDF. In this paper, we discuss a web-first approach to
publication, and describe a three-tiered approach which integrates with the
existing authoring tooling. Critically, although it adds limited semantics, it
does provide value to all the participants in the process: the author, the
reader and the machine.
Comment: Published as part of SePublica 201
Generalized regular expressions—A language for synthesis of programs with branching in loops
Regular expressions are generalized so that, besides letters from a finite alphabet, they may also contain natural numbers. Within the framework of these generalized expressions, the task of inductively synthesizing a program from its sample run is formalized. Special automata recognizing the sets defined by generalized expressions are introduced, and their equivalence problem is shown to be recursively solvable. The set-theoretic properties of the sets defined by generalized expressions are also studied.
clustComp, a bioconductor package for the comparison of clustering results
clustComp is an open source Bioconductor package that implements different techniques for the comparison of two gene expression clustering results. These include flat versus flat and hierarchical versus flat comparisons. The visualization of the similarities is provided by means of a bipartite graph, whose layout is heuristically optimized. Its flexibility allows a suitable visualization for both small and large datasets.
This work was supported by the Ramón Areces Foundation.
Prediction of gene expression in embryonic structures of Drosophila melanogaster.
Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms
Importing ArrayExpress datasets into R/Bioconductor
Summary: ArrayExpress is one of the largest public repositories of microarray datasets. R/Bioconductor provides a comprehensive suite of microarray analysis and integrative bioinformatics software. However, easy ways of importing datasets from ArrayExpress into R/Bioconductor have been lacking. Here, we present such a tool that is suitable for both interactive and automated use.