1,641 research outputs found

Multiplierz: An Extensible API Based Desktop Environment for Proteomics Data Analysis

Author: Askenazi Manor
Blank Nathaniel C.
Cashorali Tanya
Ficarro Scott B.
Marto Jarrod A.
Parikh Jignesh R.
Webber James T.
Zhang Yi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

BACKGROUND. Efficient analysis of results from mass spectrometry-based proteomics experiments requires access to disparate data types, including native mass spectrometry files, output from algorithms that assign peptide sequence to MS/MS spectra, and annotation for proteins and pathways from various database sources. Moreover, proteomics technologies and experimental methods are not yet standardized; hence a high degree of flexibility is necessary for efficient support of high- and low-throughput data analytic tasks. Development of a desktop environment that is sufficiently robust for deployment in data analytic pipelines, and simultaneously supports customization for programmers and non-programmers alike, has proven to be a significant challenge.
RESULTS. We describe multiplierz, a flexible and open-source desktop environment for comprehensive proteomics data analysis. We use this framework to expose a prototype version of our recently proposed common API (mzAPI) designed for direct access to proprietary mass spectrometry files. In addition to routine data analytic tasks, multiplierz supports generation of information-rich, portable spreadsheet-based reports. Moreover, multiplierz is designed around a "zero infrastructure" philosophy, meaning that it can be deployed by end users with little or no system administration support. Finally, access to multiplierz functionality is provided via high-level Python scripts, resulting in a fully extensible data analytic environment for rapid development of custom algorithms and deployment of high-throughput data pipelines.
CONCLUSION. Collectively, mzAPI and multiplierz facilitate a wide range of data analysis tasks, spanning technology development to biological annotation, for mass spectrometry-based proteomics research.
Funding: Dana-Farber Cancer Institute; National Human Genome Research Institute (P50HG004233); National Science Foundation Integrative Graduate Education and Research Traineeship grant (DGE-0654108).

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Protein Structure Data Management System

Author: Wang Yanchao
Publication venue: ScholarWorks @ Georgia State University
Publication date: 03/08/2007

With advances in laboratory instruments and experimental techniques, the volume of protein data is growing explosively. How to efficiently store, retrieve, and modify protein data is therefore becoming a challenge that most biological scientists must face. Traditional data models such as the relational database lack support for complex data types, which is a serious limitation for protein data applications. Hence many scientists have switched to object-oriented databases (OODBs), since the object-oriented nature of life science data matches the architecture of these databases well, but many problems remain to be solved before OODB methodologies can be applied to manage protein data. One major problem is that general-purpose OODBs have no built-in data types for biological research and no built-in biological domain-specific functional operations. In this dissertation, we present an application system with built-in data types and built-in biological domain-specific functional operations that extends the OODB system by adding domain-specific layers (Protein-QL, Protein Algebra Architecture, and Protein-OODB) above the OODB to manage protein structure data. The system is composed of three parts: 1) a Client API that provides easy usage for different users; 2) Middleware, including Protein-QL, Protein Algebra Architecture, and Protein-OODB, which implements a protein domain-specific query language and optimizes complex queries, and which encapsulates the implementation details so that users can easily understand and master Protein-QL; 3) Data Storage, which stores the protein data. The system targets the protein domain, but it can be easily extended to other biological domains to build a bio-OODBMS.
In this system, protein, primary, secondary, and tertiary structures are defined as internal data types to simplify the queries in Protein-QL, so that domain scientists can easily master the query language and formulate data requests, and EyeDB is used as the underlying OODB to communicate with Protein-OODB. In addition, protein data is usually stored in PDB format, which is old, ambiguous, and inadequate; PDB data curation is therefore discussed in detail in the dissertation.

CiteSeerX

ScholarWorks @ Georgia State University

Integrating data warehouses with web data : a survey

Author: Aramburu Cabo María José
Berlanga Llavori Rafael
Pedersen Torben Bach
Pérez Martínez Juan Manuel
Publication venue: IEEE Computer Society
Publication date: 01/01/2008

This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query, and retrieve Web data and their application to DWs. The paper reviews different DW distributed architectures and the use of XML languages as an integration tool in these systems. It also introduces the problem of dealing with semistructured data in a DW. It studies Web data repositories, the design of multidimensional databases for XML data sources, and the XML extensions of OnLine Analytical Processing techniques. The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to discover the main limitations and opportunities offered by the combination of the DW and the Web fields, as well as to identify open research lines.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositori Institucional de la Universitat Jaume I

VBN

Reactive Rules for Emergency Management

Author: Brodt Simon
Bry François
Hausmann Steffen
Publication venue
Publication date: 31/08/2010

The goal of the following survey on Event-Condition-Action (ECA) Rules is to reach a common understanding and intuition on this topic within EMILI. It therefore does not give an academic overview of Event-Condition-Action Rules, which would be valuable for computer scientists only. Instead, the survey introduces Event-Condition-Action Rules and their use for emergency management based on real-life examples from the use cases identified in Deliverable 3.1. In this way we hope to address both computer scientists and security experts by showing how Event-Condition-Action Rule technology can help to solve security issues in emergency management. The survey incorporates information from other work packages, particularly from Deliverable D3.1 and its Annexes, D4.1, D2.1 and D6.2, wherever possible.

Open Access LMU

Final version of SeamFrame design

Author: Athanasiadis I.N.
Huber D.
Knapen M.J.R.
Li H.
Rizzoli A.E.
Senaldi F.
Svensson M.
Villa F.
Wien J.J.F.
Publication venue: LUND University
Publication date: 01/01/2008

Wageningen University & Research Publications

hITeQ: A new workflow-based computing environment for streamlining discovery. Application in materials science

Author: Avelino Corma
Laurent A. Baumes
Santiago Jimenez
Publication venue: Elsevier BV
Publication date: 10/01/2011

This paper presents the implementation of the recent methodology called Adaptable Time Warping (ATW) for the automatic identification of mixtures of crystallographic phases from powder X-ray diffraction (XRD) data, inside the framework of a new integrative platform named hITeQ. The methodology is encapsulated in a so-called workflow, and we explore the benefits of such an environment for streamlining discovery in R&D. Besides showing that ATW successfully identifies and classifies crystalline phases from powder XRD for the very complicated case of zeolite ITQ-33, for which a high-throughput synthesis process was employed, we stress the numerous difficulties encountered by academic laboratories and companies when integrating new software or techniques. It is shown how an integrative approach provides a real asset in terms of cost, efficiency, and speed, owing to a unique environment that supports well-defined and reusable processes, improves knowledge management, and properly handles multi-disciplinary teamwork and disparate data structures and protocols.
EU Commission FP6 (TOPCOMBI Project) is gratefully acknowledged.
Baumes, LA.; Jiménez Serrano, S.; Corma Canós, A. (2011). hITeQ: A new workflow-based computing environment for streamlining discovery. Application in materials science. Catalysis Today. 159(1):126-137. doi:10.1016/j.cattod.2010.03.067

Crossref

RiuNet

A Quick Guide for Developing Effective Bioinformatics Programming Skills

Author: A Matsunaga
Atul J. Butte
B Smith
DW Mount
Fran Lewitter
H Mangalam
I Bogdan
ITS Li
J Aerts
J Dean
J Kinser
J Kleinjung
J Tisdall
JD Tisdall
JE Stajich
Joel T. Dudley
K Chaichoompu
K Lee
M Farrar
M Halling-Brown
M Model
M Schatz
MC Schatz
MS Friedrichs
NF Noy
O Bodenreider
PJ Cock
R Chen
RA Dwyer
RC Gentleman
RCG Holland
RT Fielding
S Kumar
SB Hedges
T Oliver
T Rognes
Y Gu
Y Liu
YS Dandass
Publication venue: Public Library of Science
Publication date: 01/01/2009

Bioinformatics programming skills are becoming a necessity across many facets of biology and medicine, owed in part to the continuing explosion of biological data.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Four Lessons in Versatility or How Query Languages Adapt to the Web

Author: A. Bonifati
A. Gelder van
A. Polleres
A.C. Klug
B. Adida
B. Cooper
B. Jenner
D. Olteanu
D. Recordon
D.D. Chamberlin
D.R. Fulkerson
E. Augurusa
F. Bry
F. Wei
F. Weigel
G. Gottlob
G. Karvounarakis
H. Björklund
H. Garcia-Molina
H. Meuss
H. Przymusinska
H. Tamaki
H. Wang
H.V. Jagadish
J. Bailey
J. Euzenat
J. Pérez
J.D. Ullman
J.J. Carroll
J.V.D. Bussche
K. Kochut
K.A. Ross
K.R. Apt
K.S. Booth
L. Cabibbo
M. Habib
M. Kay
M. Marx
N. Bruno
N. Walsh
P. Boncz
P. Buneman
P. Cholak
P. O’Neil
P.G. Kolaitis
P.P. Schneider
R. Agrawal
R. Fagin
R. Goldman
R. Hull
R. Khare
R. Schenkel
S. Abiteboul
S. Al-Khalifa
S. Berger
S. Groppe
S. Trißl
T. Chen
T. Furche
T. Grust
T. Schwentick
T.C. Przymusinski
U. Assmann
W. Akhtar
W. Chen
W.L. Hsu
Z. Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Exposing not only human-centered information, but also machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposure has fractured the Web into islands of data, each in different Web formats: some providers choose XML, others RDF, and still others JSON or OWL for their data, even in similar domains. This fracturing stifles innovation, as application builders have to cope not only with one Web stack (e.g., XML technology) but with several, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: in a single query language, XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply to querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards more convenient, yet highly efficient, data access in a “Web of Data”.

CiteSeerX

Crossref

Open Access LMU