
    eXframe: reusable framework for storage, analysis and visualization of genomics experiments

    Background: Genome-wide experiments are routinely conducted to measure gene expression, DNA-protein interactions and epigenetic status. Structured metadata for these experiments is imperative for a complete understanding of experimental conditions, to enable consistent data processing and to allow retrieval, comparison, and integration of experimental results. Even though several repositories have been developed for genomics data, only a few provide annotation of samples and assays using controlled vocabularies. Moreover, many of them are tailored for a single type of technology or measurement and do not support the integration of multiple data types. Results: We have developed eXframe - a reusable web-based framework for genomics experiments that provides 1) the ability to publish structured data compliant with accepted standards, 2) support for multiple data types including microarrays and next generation sequencing, and 3) query, analysis and visualization integration tools (enabled by consistent processing of the raw data and annotation of samples), and is available as open-source software. We present two case studies where this software is currently being used to build repositories of genomics experiments - one contains data from hematopoietic stem cells and another from Parkinson's disease patients. Conclusion: The web-based framework eXframe offers structured annotation of experiments as well as uniform processing and storage of molecular data from microarray and next generation sequencing platforms. The framework allows users to query and integrate information across species, technologies, measurement types and experimental conditions. Our framework is reusable and freely modifiable - other groups or institutions can deploy their own custom web-based repositories based on this software. It is interoperable with the most important data formats in this domain. We hope that other groups will not only use eXframe, but also contribute their own useful modifications.

    A Molecular Biology Database Digest

    Computational Biology or Bioinformatics has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large scale data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper introduces current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. This paper is primarily intended for an audience of computer scientists with a limited background in Biology.

    PeptiCKDdb: peptide- and protein-centric database for the investigation of genesis and progression of chronic kidney disease

    The peptiCKDdb is a publicly available database platform dedicated to support research in the field of chronic kidney disease (CKD) through identification of novel biomarkers and molecular features of this complex pathology. PeptiCKDdb collects peptidomics and proteomics datasets manually extracted from published studies related to CKD. Datasets from peptidomics or proteomics, human case/control studies on CKD and kidney or urine profiling were included. Data from 114 publications (studies of body fluids and kidney tissue: 26 peptidomics and 76 proteomics manuscripts on human CKD, and 12 focusing on healthy proteome profiling) are currently deposited and the content is updated quarterly. Extracted datasets include information about the experimental setup, clinical study design, discovery-validation sample sizes and list of differentially expressed proteins (P-value < 0.05). A dedicated interactive web interface, equipped with a multiparametric search engine, data export and visualization tools, enables easy browsing of the data and comprehensive analysis. In conclusion, this repository might serve as a source of data for integrative analysis or a knowledgebase for scientists seeking confirmation of their findings and as such, is expected to facilitate the modeling of molecular mechanisms underlying CKD and identification of biologically relevant biomarkers. Database URL: www.peptickddb.com

    Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

    This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work.

    Phase Retrieval with Application to Optical Imaging

    This review article provides a contemporary overview of phase retrieval in optical imaging, linking the relevant optical physics to the information processing methods and algorithms. Its purpose is to describe the current state of the art in this area, identify challenges, and suggest a vision and areas where signal processing methods can have a large impact on optical imaging and on the world of imaging at large, with applications in a variety of fields ranging from biology and chemistry to physics and engineering.

    Constraining the Atmospheric Composition of the Day-Night Terminators of HD 189733b: Atmospheric Retrieval with Aerosols

    A number of observations have shown that Rayleigh scattering by aerosols dominates the transmission spectrum of HD 189733b at wavelengths shortward of 1 μm. In this study, we retrieve a range of aerosol distributions consistent with transmission spectroscopy between 0.3-24 μm that were recently re-analyzed by Pont et al. (2013). To constrain the particle size and the optical depth of the aerosol layer, we investigate the degeneracies between aerosol composition, temperature, planetary radius, and molecular abundances that prevent unique solutions for transit spectroscopy. Assuming that the aerosol is composed of MgSiO3, we suggest that a vertically uniform aerosol layer over all pressures with a monodisperse particle size smaller than about 0.1 μm and an optical depth in the range 0.002-0.02 at 1 μm provides statistically meaningful solutions for the day/night terminator regions of HD 189733b. Generally, we find that a uniform aerosol layer provides adequate fits to the data if the optical depth is less than 0.1 and the particle size is smaller than 0.1 μm, irrespective of the atmospheric temperature, planetary radius, aerosol composition, and gaseous molecules. Strong constraints on the aerosol properties are provided by spectra at wavelengths shortward of 1 μm as well as longward of 8 μm, if the aerosol material has absorption features in this region. We show that these are the optimal wavelengths for quantifying the effects of aerosols, which may guide the design of future space observations. The present investigation indicates that the current data offer sufficient information to constrain some of the aerosol properties of HD 189733b, but the chemistry in the terminator regions remains uncertain. Comment: Transferred to ApJ and accepted. 11 pages, 10 figures, 1 table

    Data collection methods for task-based information access in molecular medicine

    An important area of improving access to health information is the study of task-based information access in the health domain. This is a significant challenge towards developing focused information retrieval (IR) systems. Due to the complexities of this context, its study requires multiple and often tedious means of data collection, which yields a lot of data for analysis, but also allows triangulation so as to increase the reliability of the findings. In addition to traditional means of data collection, such as questionnaires, interviews and observation, there are novel opportunities provided by lifelogging technologies such as the SenseCam. Together they yield an understanding of information needs, the sources used, and their access strategies. The present paper examines the strengths and weaknesses of the traditional and the more novel means of data collection and addresses the challenges in their application in molecular medicine, which intensively uses digital information sources.

    Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.

    Identification of functionally related enzymes by learning-to-rank methods

    Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually starts from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.

    Explant Analysis of Total Disc Replacement

    Explant analysis of human disc prostheses allows early evaluation of the host response to the prosthesis and the response of the prosthesis to the host. Furthermore, early predictions of failure and wear can be obtained. Thus far, about 2-3% of disc prostheses have been removed. Observed wear patterns are similar to those of appendicular prostheses, including abrasions/scratching, burnishing, surface deformation, fatigue, and embedded debris. Chemically, the polymeric components have shown little degradation in short-term implantation. In metal-on-metal prostheses, the histologic responses consist of large numbers of metallic particles with occasional macrophages and giant cells. Only rare cases of significant inflammatory response from polymeric debris have been seen.