3,459 research outputs found
An Open Framework for Extensible Multi-Stage Bioinformatics Software
In research labs, there is often a need to customise software at every step
in a given bioinformatics workflow, but traditionally it has been difficult to
obtain both a high degree of customisability and good performance.
Performance-sensitive tools are often highly monolithic, which can make
research difficult. We present a novel set of software development principles
and a bioinformatics framework, Friedrich, which is currently in early
development. Friedrich applications support both early stage experimentation
and late stage batch processing, since they simultaneously allow for good
performance and a high degree of flexibility and customisability. These
benefits are obtained in large part by basing Friedrich on the multiparadigm
programming language Scala. We present a case study in the form of a basic
genome assembler and its extension with new functionality. Our architecture has
the potential to greatly increase the overall productivity of software
developers and researchers in bioinformatics.Comment: 12 pages, 1 figure, to appear in proceedings of PRIB 201
Reactive Rules for Emergency Management
The goal of the following survey on Event-Condition-Action (ECA) Rules is to come to a common understanding and intuition on this topic within EMILI. Thus it does not give an academic overview on Event-Condition-Action Rules which would be valuable for computer scientists only. Instead the survey tries to introduce Event-Condition-Action Rules and their use for emergency management based on real-life examples from the use-cases identified in Deliverable 3.1. In this way we hope to address both, computer scientists and security experts, by showing how the Event-Condition-Action Rule technology can help to solve security issues in emergency management. The survey incorporates information from other work packages, particularly from Deliverable D3.1 and its Annexes, D4.1, D2.1 and D6.2 wherever possible
Neuroimaging study designs, computational analyses and data provenance using the LONI pipeline.
Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges--management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu
An Introduction to Programming for Bioscientists: A Python-based Primer
Computing has revolutionized the biological sciences over the past several
decades, such that virtually all contemporary research in the biosciences
utilizes computer programs. The computational advances have come on many
fronts, spurred by fundamental developments in hardware, software, and
algorithms. These advances have influenced, and even engendered, a phenomenal
array of bioscience fields, including molecular evolution and bioinformatics;
genome-, proteome-, transcriptome- and metabolome-wide experimental studies;
structural genomics; and atomistic simulations of cellular-scale molecular
assemblies as large as ribosomes and intact viruses. In short, much of
post-genomic biology is increasingly becoming a form of computational biology.
The ability to design and write computer programs is among the most
indispensable skills that a modern researcher can cultivate. Python has become
a popular programming language in the biosciences, largely because (i) its
straightforward semantics and clean syntax make it a readily accessible first
language; (ii) it is expressive and well-suited to object-oriented programming,
as well as other modern paradigms; and (iii) the many available libraries and
third-party toolkits extend the functionality of the core language into
virtually every biological domain (sequence and structure analyses,
phylogenomics, workflow management systems, etc.). This primer offers a basic
introduction to coding, via Python, and it includes concrete examples and
exercises to illustrate the language's usage and capabilities; the main text
culminates with a final project in structural bioinformatics. A suite of
Supplemental Chapters is also provided. Starting with basic concepts, such as
that of a 'variable', the Chapters methodically advance the reader to the point
of writing a graphical user interface to compute the Hamming distance between
two DNA sequences.Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables,
numerous exercises, and 19 pages of Supporting Information; currently in
press at PLOS Computational Biolog
Automatic deployment of interoperable legacy code services
The Grid Execution Management for Legacy Code Architecture (GEMLCA) enables
exposing legacy applications as Grid services without re-engineering the code, or even requiring access to the source files. The integration of current GT3 and GT4 based GEMLCA implementations with the P-GRADE Grid portal allows the creation,
execution and visualisation of complex Grid workflows composed of legacy and nonlegacy components. However, the deployment of legacy codes and mapping their
execution to Grid resources is currently done manually. This paper outlines how
GEMLCA can be extended with automatic service deployment, brokering, and
information system support. A conceptual architecture for an Automatic Deployment Service (ADS) and for an x-Service Interoperability Layer (XSILA) are introduced explaining how these mechanisms support desired features in future releases of
GEMLCA
- ā¦