
    Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. Report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.

    Sticks, balls or a ribbon? Results of a formative user study with bioinformaticians

    User interfaces in modern bioinformatics tools are designed for experts. They are too complicated for novice users such as bench biologists. This report presents the full results of a formative user study as part of a domain and requirements analysis to enhance user interfaces and collaborative environments for multidisciplinary teamwork. Contextual field observations, questionnaires and interviews with bioinformatics researchers of different levels of expertise and various backgrounds were performed in order to gain insight into their needs and working practices. The analysed results are presented as a user profile description and user requirements for designing user interfaces that support the collaboration of multidisciplinary research teams in scientific collaborative environments. Although the number of participants limits the generalisability of the findings, the combination of recurrent observations with other user analysis techniques in real-life settings makes the contribution of this user study novel.

    Managing a portal of digital web resources by content syndication

    As users become more accustomed to continuous Internet access, they will have less patience with the offering of disparate resources. A new generation of portals is being designed that aids users in navigating resource space and in processing the data they retrieved. Such portals offer added value by means of content syndication: the effort to have multiple, federated resources co-operate in order to profit optimally from their synergy. A portal that offers these advantages, however, can only be of lasting value if it is sustainable. We sketch a way to set up and run an organisation that can manage a content syndication portal in a sustainable way.

    BcCluster: a bladder cancer database at the molecular level

    Background: Bladder cancer (BC) has two clearly distinct phenotypes. Non-muscle invasive BC has a good prognosis and is treated with tumor resection and intravesical therapy, whereas muscle invasive BC has a poor prognosis and usually requires systemic cisplatin-based chemotherapy either prior to or after radical cystectomy. Neoadjuvant chemotherapy is not often used for patients undergoing cystectomy. High-throughput analytical omics techniques are now available that allow the identification of individual molecular signatures to characterize the invasive phenotype. However, a large amount of data produced by omics experiments is not easily accessible since it is often scattered over many publications or stored in supplementary files. Objective: To develop a novel open-source database, BcCluster (http://www.bccluster.org/), dedicated to the comprehensive molecular characterization of muscle invasive bladder carcinoma. Materials: A database was created containing all reported molecular features significant in invasive BC. The query interface was developed in the Ruby programming language (version 1.9.3) using the Rails web framework (version 4.1.5) (http://rubyonrails.org/). Results: BcCluster contains the data from 112 published references, providing 1,559 statistically significant features relative to BC invasion. The database also holds 435 protein-protein interaction data and 92 molecular pathways significant in BC invasion. The database can be used to retrieve binding partners and pathways for any protein of interest. We illustrate this possibility using survivin, a known BC biomarker. Conclusions: BcCluster is an online database for retrieving molecular signatures relative to BC invasion. This application offers a comprehensive view of BC invasiveness at the molecular level and allows formulation of research hypotheses relevant to this phenotype.

    Identification of novel molecular signatures of IgA nephropathy through an integrative -omics analysis

    IgA nephropathy (IgAN) is the most prevalent among primary glomerular diseases worldwide. Although our understanding of IgAN has advanced significantly, its underlying biology and potential drug targets are still unexplored. We investigated a combinatorial approach for the analysis of IgAN-relevant -omics data, aiming at identification of novel molecular signatures of the disease. Nine published urinary proteomics datasets were collected and the reported differentially expressed proteins in IgAN vs. healthy controls were integrated into known biological pathways. Proteins participating in these pathways were subjected to multi-step assessment, including investigation of IgAN transcriptomics datasets (Nephroseq database), their reported protein-protein interactions (STRING database), kidney tissue expression (Human Protein Atlas) and literature mining. Through this process, from an initial dataset of 232 proteins significantly associated with IgAN, 20 pathways were predicted, yielding 657 proteins for further analysis. Step-wise evaluation highlighted 20 proteins of possibly high relevance to IgAN and/or kidney disease. Experimental validation of 3 predicted relevant proteins, adenylyl cyclase-associated protein 1 (CAP1), SHC-transforming protein 1 (SHC1) and prolylcarboxypeptidase (PRCP), was performed by immunostaining of human kidney sections. Collectively, this study presents an integrative procedure for -omics data exploitation, giving rise to biologically relevant results.

    Updates in metabolomics tools and resources: 2014-2015

    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources, in the form of tools, software, and databases, is currently lacking. Thus, here we provide an overview of freely available and open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

    A Path to Implement Precision Child Health Cardiovascular Medicine.

    Congenital heart defects (CHDs) affect approximately 1% of live births and are a major source of childhood morbidity and mortality even in countries with advanced healthcare systems. Along with phenotypic heterogeneity, the underlying etiology of CHDs is multifactorial, involving genetic, epigenetic, and/or environmental contributors. Clear dissection of the underlying mechanism is a powerful step to establish individualized therapies. However, for the majority of CHDs the underlying genetic and environmental factors are yet to be clearly diagnosed, and even fewer have effective therapies. Although the survival rate for CHDs is steadily improving, there is still a significant unmet need for refining diagnostic precision and establishing targeted therapies to optimize life quality and to minimize future complications. In particular, proper identification of disease-associated genetic variants in humans has been challenging, and this greatly impedes our ability to delineate gene-environment interactions that contribute to the pathogenesis of CHDs. Implementing a systematic multileveled approach can establish a continuum from phenotypic characterization in the clinic to molecular dissection using combined next-generation sequencing platforms and validation studies in suitable models at the bench. Key elements necessary to advance the field are: first, proper delineation of the phenotypic spectrum of CHDs; second, defining the molecular genotype/phenotype by combining whole-exome sequencing and transcriptome analysis; third, integration of phenotypic, genotypic, and molecular datasets to identify molecular networks contributing to CHDs; fourth, generation of relevant disease models and multileveled experimental investigations. In order to achieve all these goals, access to high-quality biological specimens from well-defined patient cohorts is a crucial step. Therefore, establishing a CHD BioCore provides essential infrastructure and is a critical step on the path toward precision child health cardiovascular medicine.