Control versus Data Flow in Parallel Database Machines

Author: Blanken Henk M.
Teeuw Wouter B.
Publication venue: IEEE Computer Society Press
Publication date: 01/01/1993

The execution of a query in a parallel database machine can be controlled by either control flow or data flow. In the former case a single system node controls the entire query execution. In the latter case the processes that execute the query, although possibly running on different nodes of the system, trigger each other. Lately, many database research projects have focused on data flow control, since it should improve response times and throughput. The authors study control versus data flow with regard to controlling the execution of database queries. An analytical model is used to compare control and data flow in order to determine which mechanism is better under which circumstances. Also, some systems using data flow techniques are described, and the authors investigate to what degree they are truly data flow. The results show that for particular types of queries data flow is very attractive, since it reduces the number of control messages and balances these messages over the node

University of Twente Research Information

Optimizing Splicing Junction Detection in Next Generation Sequencing Data on a Virtual-GRID Infrastructure

Author: Abate Francesco
Acquaviva Andrea
Ficarra Elisa
Mossucca L.
Provenzano R.
Terzo Olivier
Publication date: 01/01/2012

The new protocol for sequencing the messenger RNA in a cell, named RNA-seq, produces millions of short sequence fragments. Next Generation Sequencing technology allows more accurate analysis but increases the demand for computational resources. This paper describes the optimization of an RNA-seq analysis pipeline devoted to the detection of splicing variants, aimed at reducing computation time and providing a multi-user/multi-sample environment. This work makes two main contributions. First, we optimized a well-known algorithm called TopHat by parallelizing some sequential mapping steps. Second, we designed and implemented a hybrid virtual GRID infrastructure that efficiently executes multiple instances of TopHat running on different samples or on behalf of different users, thus optimizing the overall execution time and enabling a flexible multi-user environment

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

PORTO Publications Open Repository TOrino

Data acquisition system for the MuLan muon lifetime experiment

Author: Barnes
Brun
Chitwood
D.B. Chitwood
D.M. Webber
F. Gray
Frlez
Huffman
I. Logashenko
K.R. Lynch
S. Battu
S. Cheekatmalla
S. Dhamija
S. Rath
T.P. Gorringe
V. Tishchenko
Publication venue: Elsevier BV
Publication date: 07/02/2008

We describe the data acquisition system for the MuLan muon lifetime experiment at Paul Scherrer Institute. The system was designed to record muon decays at rates up to 1 MHz and acquire data at rates up to 60 MB/sec. The system employed a parallel network of dual-processor machines and repeating acquisition cycles of deadtime-free time segments in order to reach the design goals. The system incorporated a versatile scheme for control and diagnostics and a custom web interface for monitoring experimental conditions.

Comment: 19 pages, 8 figures, submitted to Nuclear Instruments and Methods

arXiv.org e-Print Archive

Crossref

A Survey of Parallel Data Mining

Author: Freitas Alex A.

With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining from a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms

Kent Academic Repository

Performance of the distributed central analysis in BaBar

Author: Fritsch M
Gradl W
Khan A
Mommsen RK
Petzold A
Roethel W
Smith DA
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2006

The total dataset produced by the BaBar experiment at the Stanford Linear Accelerator Center (SLAC) currently comprises roughly $3\times 10^9$ data events and an equal number of simulated events, corresponding to 23 Tbytes of real data and 51 Tbytes of simulated events. Since individual analyses typically select a very small fraction of all events, it would be extremely inefficient if each analysis had to process the full dataset. A first, centrally managed analysis step is therefore a common pre-selection (‘skimming’) of all data according to very loose, inclusive criteria to facilitate data access for later analysis. Usually, there are common selection criteria for several analyses. However, they may change over time, e.g., when new analyses are developed. Currently,$cal

Crossref

Brunel University Research Archive

Deep Space Network information system architecture study

Author: Atkinson D. J.
Beswick C. A.
Cooper L. P.
Crowe R. A.
Jenkins J. S.
Markley R. W.
Masline R. C.
Stoloff M. J.
Tausworthe R. C.
Thomas J. L.
Publication date: 15/05/1992

The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control

NASA Technical Reports Server