1,294 research outputs found

    BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Get PDF
    BACKGROUND: Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. RESULTS: BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. CONCLUSION: BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere

    Bioinformatics process management: information flow via a computational journal

    Get PDF
    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples

    The Lattice Project: A Multi-model Grid Computing System

    Get PDF
    This thesis presents The Lattice Project, a system that combines multiple models of Grid computing. Grid computing is a paradigm for leveraging multiple distributed computational resources to solve fundamental scientific problems that require large amounts of computation. The system combines the traditional Service model of Grid computing with the Desktop model of Grid computing, and is thus capable of utilizing diverse resources such as institutional desktop computers, dedicated computing clusters, and machines volunteered by the general public to advance science. The production Grid system includes a fully-featured user interface, support for a large number of popular scientific applications, a robust Grid-level scheduler, and novel enhancements such as a Grid-wide file caching scheme. A substantial amount of scientific research has already been completed using The Lattice Project

    Bioinformatic workflows : G-PIPE as an implementation

    Full text link
    We present G-PIPE, a graphic pipeline generator for PISE that allows the definition of pipelines, parameterization of its component methods, and storage of metadata in XML formats. Our implementation goes beyond macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. We also discuss the role of ontologies as as guidance systems in order to provide users with the possibility to define abstract work-flows, and execute them. A relevant baseline ontology is presented. Availability: http://if-web.imb.uq.edu.a

    Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Get PDF
    BACKGROUND: Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. RESULTS: We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: (interactive), (download). CONCLUSION: From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous analytical tools

    Semiautomatic generation of CORBA interfaces for databases in molecular biology

    Get PDF
    The amount and complexity of genome related data is growing quickly. This highly interrelated data is distributed at many different sites, stored in numerous different formats, and maintained by independent data providers. CORBA, the industry standard for distributed computing, offers the opportunity to make implementation differences and distribution transparent and thereby helps to combine disparate data sources and application programs. In this thesis, the different aspects of CORBA access to molecular biology data are examined in detail. The work is motivated by a concrete application for distributed genome maps. Then, the different design issues relevant to the implementation of CORBA access layers are surveyed and evaluated. The most important of these issues is the question of how to represent data in a CORBA environment using the interface definition language IDL. Different representations have different advantages and disadvantages and the best representation is highly application specific. It is therefore in general impossible to generate a CORBA wrapper automatically for a given database. On the other hand, coding a server for each application manually is tedious and error prone. Therefore, a method is presented for the semiautomatic generation of CORBA wrappers for relational databases. A declarative language is described, which is used to specify the mapping between relations and IDL constructs. Using a set of such mapping rules, a CORBA server can be generated automatically. Additionally, the declarative mapping language allows for the support of ad-hoc queries, which are based on the IDL definitions

    Swiss EMBnet node web server

    Get PDF
    EMBnet is a consortium of collaborating bioinformatics groups located mainly within Europe (http://www.embnet.org). Each member country is represented by a ‘node', a group responsible for the maintenance of local services for their users (e.g. education, training, software, database distribution, technical support, helpdesk). Among these services a web portal with links and access to locally developed and maintained software is essential and different for each node. Our web portal targets biomedical scientists in Switzerland and elsewhere, offering them access to a collection of important sequence analysis tools mirrored from other sites or developed locally. We describe here the Swiss EMBnet node web site (http://www.ch.embnet.org), which presents a number of original services not available anywhere els
    • …
    corecore