
    The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation

    Background. 
The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community.

Description. 
SADI – Semantic Automated Discovery and Integration – is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services “stack”, SADI services consume and produce instances of OWL classes following a small number of very straightforward best practices. In addition, we provide codebases that support these best practices, and plug-in tools for popular developer and client software that dramatically simplify the deployment of services by providers, and the discovery and utilization of those services by their consumers.

Conclusions.
SADI Services are fully compliant with, and utilize only, foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user needs, and to automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behavior we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in static triple-stores, thus facilitating the intersection of Web services and Semantic Web technologies.
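The core SADI pattern described above (a service consumes RDF instances of an input OWL class and returns new properties attached to the same subject URIs) can be sketched in miniature. This is a conceptual sketch only: the class and property URIs below are hypothetical, RDF graphs are modeled as plain (subject, predicate, object) triples, and real SADI services exchange serialized RDF documents over HTTP.

```python
# Conceptual sketch of the SADI "graph decoration" pattern.
# All URIs below are hypothetical, for illustration only.
RDF_TYPE = "rdf:type"
INPUT_CLASS = "http://example.org/ont#ProteinRecord"       # input OWL class
OUTPUT_PROP = "http://example.org/ont#hasMolecularWeight"  # output property

def sadi_service(input_triples):
    """Find instances of the input class and attach new property values
    to the *same* subject URIs, so the output graph simply decorates
    the input graph the client sent."""
    output = []
    for s, p, o in input_triples:
        if p == RDF_TYPE and o == INPUT_CLASS:
            output.append((s, OUTPUT_PROP, 42000))  # the "analysis" result
    return output

request = [("http://example.org/protein/P1", RDF_TYPE, INPUT_CLASS)]
response = sadi_service(request)
# The response describes the same subject URI the client named:
assert response[0][0] == "http://example.org/protein/P1"
```

Because output is keyed to the client's own input URIs, a client can merge the response directly into its local graph, which is what makes automatic service chaining tractable.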

    Towards an interoperable healthcare information infrastructure - working from the bottom up

Historically, the healthcare system has not made effective use of information technology. On the face of things, it would seem to provide a natural and richly varied domain in which to target benefit from IT solutions. But history shows that it is one of the most difficult domains in which to bring them to fruition. This paper provides an overview of the changing context and information requirements of healthcare that help to explain these characteristics. First and foremost, the disciplines and professions that healthcare encompasses have immense complexity and diversity to deal with, in structuring knowledge about what medicine and healthcare are, how they function, and what differentiates good practice and good performance. The need to maintain macro-economic stability of the health service, faced with this and many other uncertainties, means that management bottom lines predominate over choices and decisions that have to be made within everyday individual patient services. Individual practice and care, the bedrock of healthcare, is, for this and other reasons, more and more subject to professional and managerial control and regulation. One characteristic of organisations shown to be good at making effective use of IT is their capacity to devolve decisions within the organisation to where they can be best made, for the purpose of meeting their customers' needs. IT should, in this context, contribute as an enabler and not as an enforcer of good information services. The information infrastructure must work effectively, both top down and bottom up, to accommodate these countervailing pressures.
This issue is explored in the context of infrastructure to support electronic health records. Because of the diverse and changing requirements of the huge healthcare sector, and the need to sustain health records over many decades, standardised systems must concentrate on doing the easier things well and as simply as possible, while accommodating immense diversity of requirements and practice. The manner in which the healthcare information infrastructure can be formulated and implemented to meet useful practical goals is explored, in the context of two case studies of research in CHIME at UCL and their user communities. Healthcare has severe problems both as a provider of information and as a purchaser of information systems. This has an impact on both its customer and its supplier relationships. Healthcare needs to become a better purchaser, more aware and realistic about what technology can and cannot do and where research is needed. Industry needs a greater awareness of the complexity of the healthcare domain, and the subtle ways in which information is part of the basic contract between healthcare professionals and patients, and the trust and understanding that must exist between them. It is an ideal domain for deeper collaboration between academic institutions and industry.

    1st INCF Workshop on Sustainability of Neuroscience Databases

    The goal of the workshop was to discuss issues related to the sustainability of neuroscience databases, identify problems and propose solutions, and formulate recommendations to the INCF. The report summarizes the discussions of invited participants from the neuroinformatics community as well as from other disciplines where sustainability issues have already been approached. The recommendations for the INCF involve rating, ranking, and supporting database sustainability.

    RNeXML: a package for reading and writing richly annotated phylogenetic, character, and trait data in R

    NeXML is a powerful and extensible exchange standard recently proposed to better meet the expanding needs for phylogenetic data and metadata sharing. Here we present the RNeXML package, which provides users of the R programming language with easy-to-use tools for reading and writing NeXML documents, including rich metadata, in a way that interfaces seamlessly with the extensive library of phylogenetic tools already available in the R ecosystem.
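Since NeXML is an XML dialect, its structure can be illustrated without R. The fragment below is a minimal, hand-written illustration (real NeXML documents carry namespaced annotations and far richer metadata, which is what RNeXML exposes to R users); parsing it with Python's standard library shows the kind of content a NeXML reader recovers.

```python
import xml.etree.ElementTree as ET

# Minimal, hand-written NeXML-like fragment for illustration only;
# real documents include trees, character matrices, and rich metadata.
doc = """<nexml xmlns="http://www.nexml.org/2009">
  <otus id="taxa1">
    <otu id="t1" label="Homo sapiens"/>
    <otu id="t2" label="Pan troglodytes"/>
  </otus>
</nexml>"""

NS = {"nex": "http://www.nexml.org/2009"}  # NeXML's XML namespace
root = ET.fromstring(doc)

# Recover the operational taxonomic units (OTUs) declared in the file.
labels = [otu.get("label") for otu in root.findall(".//nex:otu", NS)]
print(labels)  # ['Homo sapiens', 'Pan troglodytes']
```

RNeXML performs this kind of extraction (and the reverse, writing) from within R, so the result plugs directly into existing R phylogenetics packages.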

    The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved the interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for effective advances in bioinformatics web service technologies.

    Lessons Learned: Recommendations for Establishing Critical Periodic Scientific Benchmarking

    The dependence of life scientists on software has steadily grown in recent years. For many tasks, researchers have to decide which of the available bioinformatics software tools are most suitable for their specific needs. Additionally, researchers should be able to objectively select the software that provides the highest accuracy, the best efficiency and the highest level of reproducibility when integrated into their research projects. Critical benchmarking of bioinformatics methods, tools and web services is therefore an essential community service, as well as a critical component of reproducibility efforts. Unbiased and objective evaluations are challenging to set up and can only be effective when built and implemented around community-driven efforts, as demonstrated by the many ongoing community challenges in bioinformatics that followed the success of CASP. Community challenges bring the combined benefits of intense collaboration, transparency and standard harmonization. Open systems for the continuous evaluation of methods offer a perfect complement to community challenges, giving larger communities of users, which may extend far beyond the community of developers, a window onto development status that they can use for their specific projects. By continuous evaluation systems we mean services that are always available and periodically update their data and/or metrics according to a predefined schedule, keeping in mind that performance must always be judged within each research domain. We argue here that technology is now mature enough to bring community-driven benchmarking efforts to a higher level that should allow effective interoperability of benchmarks across related methods. New technological developments make it possible to overcome the limitations of early online benchmarking efforts, e.g. EVA.
We therefore describe OpenEBench, a novel infrastructure designed to establish a continuous automated benchmarking system for bioinformatics methods, tools and web services. OpenEBench is being developed to cater for the needs of the bioinformatics community, especially software developers, who need an objective and quantitative way to inform their decisions, as well as the larger community of end-users, in their search for unbiased and up-to-date evaluations of bioinformatics methods. As such, OpenEBench should soon become a central place for bioinformatics software developers, community-driven benchmarking initiatives, researchers using bioinformatics methods, and funders interested in the results of method evaluations.

    Rice Galaxy: An open resource for plant science

    Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer-savvy rice researchers. Findings: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer include tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, assessing population diversity, performing rice-bacterial pathogen diagnostics, and applying a suite of published genomic prediction methods. A prototype Rice Galaxy compliant with Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. Conclusions: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.

    Supporting the clinical trial recruitment process through the grid

    Patient recruitment for clinical trials and studies is a large-scale task. To test a given drug, for example, it is desirable to draw on as large a pool of suitable candidates as possible, to support reliable assessment of the often moderate effects of drugs. To make such a recruitment campaign successful, it is necessary to efficiently target the petitioning of these potential subjects. Because of the necessarily large numbers involved in such campaigns, this is a problem that naturally lends itself to the paradigm of Grid technology. However, the accumulation and linkage of data sets across clinical domain boundaries pose challenges, due to the sensitivity of the data involved, that are atypical of other Grid domains. These include handling the privacy and integrity of data and, importantly, the process by which data can be collected and used, ensuring for example that patient involvement and consent are dealt with appropriately throughout the clinical trials process. This paper describes a Grid infrastructure, developed as part of the MRC-funded VOTES project (Virtual Organisations for Trials and Epidemiological Studies) at the National e-Science Centre in Glasgow, that supports these processes and the different security requirements specific to this domain.