Three Steps to Heaven: Semantic Publishing in a Real World Workflow
Semantic publishing offers the promise of computable papers, enriched
visualisation and a realisation of the linked data ideal. In reality, however,
the publication process contrives to prevent richer semantics while culminating
in a 'lumpen' PDF. In this paper, we discuss a web-first approach to
publication, and describe a three-tiered approach which integrates with the
existing authoring tooling. Critically, although it adds limited semantics, it
does provide value to all the participants in the process: the author, the
reader and the machine. Comment: Published as part of SePublica 201
Generalized regular expressions: A language for synthesis of programs with branching in loops
Regular expressions are generalized to the effect that, besides letters from a finite alphabet, they may also contain natural numbers. Within the framework of these generalized expressions, the task of the inductive synthesis of programs from their sample runs is formalized. Special automata recognizing the sets defined by generalized expressions are introduced, and their equivalence problem is shown to be recursively solvable. The set-theoretic properties of the sets defined by generalized expressions are also studied.
clustComp, a bioconductor package for the comparison of clustering results
clustComp is an open source Bioconductor package that implements different techniques for the comparison of two gene expression clustering results. These include flat versus flat and hierarchical versus flat comparisons. The visualization of the similarities is provided by means of a bipartite graph, whose layout is heuristically optimized. Its flexibility allows a suitable visualization for both small and large datasets. This work was supported by the Ramón Areces Foundation.
Prediction of gene expression in embryonic structures of Drosophila melanogaster.
Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms
Proposal for a Standard Representation of Two-Dimensional Gel Electrophoresis Data
The global analysis of proteins is now feasible due to improvements in techniques such as two-dimensional gel electrophoresis (2-DE), mass spectrometry, yeast two-hybrid
systems and the development of bioinformatics applications. The experiments form
the basis of proteomics, and present significant challenges in data analysis, storage and
querying. We argue that a standard format for proteome data is required to enable
the storage, exchange and subsequent re-analysis of large datasets. We describe the
criteria that must be met for the development of a standard for proteomics. We have
developed a model to represent data from 2-DE experiments, including difference
gel electrophoresis along with image analysis and statistical analysis across multiple
gels. This part of proteomics analysis is not represented in current proposals for
proteomics standards. We are working with the Proteomics Standards Initiative to
develop a model encompassing biological sample origin, experimental protocols, a
number of separation techniques and mass spectrometry. The standard format will
facilitate the development of central repositories of data, enabling results to be verified
or re-analysed, and the correlation of results produced by different research groups
using a variety of laboratory techniques.