Does replication groups scoring reduce false positive rate in SNP interaction discovery?

Authors: Curk Tomaz, Demsar Janez, Toplak Marko, Zupan Blaz
Publication date: 01/01/2010

BACKGROUND. Computational methods that infer single nucleotide polymorphism (SNP) interactions from phenotype data may uncover new biological mechanisms in non-Mendelian diseases. However, practical aspects of such analysis face many problems. Present experimental studies typically use SNP arrays with hundreds of thousands of SNPs but record only hundreds of samples. Candidate SNP pairs inferred by interaction analysis may include a high proportion of false positives. Recently, Gayan et al. (2008) proposed to reduce the number of false positives by combining results of interaction analysis performed on subsets of data (replication groups), rather than analyzing the entire data set directly. If it performs as hypothesized, replication groups scoring could improve interaction analysis and also any type of feature ranking and selection procedure in systems biology. Because Gayan et al. do not compare their approach to the standard interaction analysis techniques, we here investigate whether replication groups indeed reduce the number of reported false positive interactions. RESULTS. A set of simulated and false interaction-imputed experimental SNP data sets was used to compare the inference of SNP-SNP interactions by means of replication groups to the standard approach where the entire data set was directly used to score all candidate SNP pairs. In all our experiments, the inference of interactions from the entire data set (i.e., without using the replication groups) reported fewer false positives. CONCLUSIONS. Compared with the direct scoring approach, the use of replication groups does not reduce false positive rates and may, depending on the data set, often perform worse.
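The two strategies compared in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not give the scoring function, the way groups are formed, or the rule for combining group results, so pair_score, the interleaved split, and the "must pass in every group" rule are hypothetical stand-ins, not the method of Gayan et al. or of the authors.

```python
from itertools import combinations

def pair_score(rows, i, j):
    # Hypothetical score (an assumption): absolute difference in case rate
    # between samples where SNP i and SNP j differ and where they agree.
    by_xor = {0: [], 1: []}
    for snps, phenotype in rows:
        by_xor[snps[i] ^ snps[j]].append(phenotype)
    if not by_xor[0] or not by_xor[1]:
        return 0.0
    rate_differ = sum(by_xor[1]) / len(by_xor[1])
    rate_agree = sum(by_xor[0]) / len(by_xor[0])
    return abs(rate_differ - rate_agree)

def direct_hits(rows, n_snps, threshold):
    # Direct approach: score every candidate pair on the entire data set.
    return {pair for pair in combinations(range(n_snps), 2)
            if pair_score(rows, *pair) >= threshold}

def replication_hits(rows, n_snps, threshold, n_groups):
    # Replication groups (assumed rule): split the samples into groups and
    # keep only the pairs that pass the threshold in every group.
    hits = None
    for k in range(n_groups):
        found = direct_hits(rows[k::n_groups], n_snps, threshold)
        hits = found if hits is None else hits & found
    return hits

# Toy data: three binary SNPs, phenotype determined by SNP 0 XOR SNP 1.
rows = [((a, b, c), a ^ b)
        for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 4
print(direct_hits(rows, 3, 0.9))
print(replication_hits(rows, 3, 0.9, n_groups=2))
```

On this noise-free toy data both strategies report the same pair. The comparison in the abstract concerns noisy data with few samples, where each group sees only a fraction of the evidence.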

Replication Data

Author: Alessandra Casella
Publication venue: 'Now Publishers'
Publication date: 01/01/2008

Replication Data

Author: Timothy Besley
Publication venue: 'Now Publishers'
Publication date: 01/01/2008

Object level physics data replication in the Grid

Author: Holtman Koen
Publication venue: 'AIP Publishing'
Publication date: 01/01/2000

To support distributed physics analysis on a scale as foreseen by the LHC experiments, 'Grid' systems are needed that manage and streamline data distribution, replication, and synchronization. We report on the development of a tool that allows large physics datasets to be managed and replicated at the granularity level of single objects. Efficient and convenient support for data extraction and replication at the level of individual objects and events will enable types of interactive data analysis that would be too inconvenient or costly to perform with tools that work on a file level only. Our tool development effort is intended both as a demonstrator project for various types of existing Grid technology and as a research effort to develop Grid technology further. The basic use case supported by our tool is one in which a physicist repeatedly selects some physics objects located at a central repository, and replicates them to a local site. The selection can be done using 'tag' or 'ntuple' analysis at the local site. The tool replicates the selected objects, and merges all replicated objects into a single coherent 'virtual' dataset. This allows all objects to be used together seamlessly, even if they were replicated at different times or from different locations. The version of the tool that is reported on in this paper replicates ORCA based physics data created by CMS in its ongoing high level trigger design studies. The basic capabilities and limitations of the tool are discussed, together with some performance results. Some tool internals are also presented. Finally, we report on experiences so far and on future plans.
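The basic use case described above (select objects at a central repository, replicate them locally, merge them into one 'virtual' dataset) can be sketched in a few lines of Python. This is an illustration only: the class VirtualDataset, its methods, and the dictionary standing in for the repository are hypothetical names and say nothing about how the actual tool or ORCA stores objects.

```python
class VirtualDataset:
    """Local view that merges objects replicated at different times."""

    def __init__(self):
        self._objects = {}  # object id -> locally replicated object

    def replicate(self, repository, selected_ids):
        # Copy only the selected objects that are not yet present locally
        # and return how many objects were actually transferred.
        copied = 0
        for object_id in selected_ids:
            if object_id not in self._objects:
                self._objects[object_id] = repository[object_id]
                copied += 1
        return copied

    def __getitem__(self, object_id):
        return self._objects[object_id]

    def __len__(self):
        return len(self._objects)

# A dictionary stands in for the central repository (an assumption).
central = {101: "event-101", 102: "event-102", 103: "event-103"}
local = VirtualDataset()
local.replicate(central, [101, 102])  # first selection
local.replicate(central, [102, 103])  # later selection, overlaps the first
print(len(local), local[103])
```

Keying the local store by object identifier is what lets objects from separate replication runs be used together, and it avoids transferring an object twice.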

CiteSeerX

Caltech Authors

Transparent Replication Using Metaprogramming in Cyan

Authors: Ugliara Fellipe A., Vieira Gustavo M. D., Guimarães José de O.
Publication date: 01/01/2017

Replication can be used to increase the availability of a service by creating many operational copies of its data, called replicas. Active replication is a form of replication that has strong consistency semantics, which makes it easier to reason about and program. However, creating replicated services using active replication still demands from the programmer knowledge of the subtleties of the replication mechanism. In this paper we show how to use the metaprogramming infrastructure of the Cyan language to shield the application programmer from these details, allowing easier creation of fault-tolerant replicated applications through simple annotations. Comment: 8 pages

arXiv.org e-Print Archive

FigShare

The Impact of Data Replication on Job Scheduling Performance in Hierarchical Data Grid

Authors: Abdi Somayeh, Mohamadi Somayeh, Pedram Hossein
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/01/2010

In data-intensive applications, data transfer is a primary cause of job execution delay. Data access time depends on bandwidth. The major bottleneck to supporting fast data access in Grids is the high latency of wide area networks and the Internet. Effective scheduling can reduce the amount of data transferred across the Internet by dispatching a job to where the needed data are present. Another solution is to use a data replication mechanism. The objective of dynamic replica strategies is to reduce file access time, which in turn reduces job runtime. In this paper we develop a job scheduling policy and a dynamic data replication strategy, called HRS (Hierarchical Replication Strategy), to improve data access efficiency. We study our approach and evaluate it through simulation. The results show that our algorithm has improved 12% over the current strategies. Comment: 11 pages, 7 figures
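The idea of dispatching a job to where its data already reside can be shown with a short Python sketch. It is not HRS: the abstract does not describe that algorithm, so the function choose_site, the site and file names, and the cost measure (bytes that would still have to be transferred) are all assumptions chosen for illustration.

```python
def choose_site(job_files, site_files, file_sizes):
    # Pick the site that would need the least data transferred to run the job.
    def missing_bytes(site):
        return sum(file_sizes[name] for name in job_files
                   if name not in site_files[site])
    # Sorting the site names first makes ties resolve alphabetically.
    return min(sorted(site_files), key=missing_bytes)

def run_job(job_files, site_files, file_sizes):
    # Dispatch the job, then replicate any missing files to the chosen site
    # so that later jobs needing the same files can run there without transfer.
    site = choose_site(job_files, site_files, file_sizes)
    transferred = [name for name in job_files if name not in site_files[site]]
    site_files[site].update(transferred)
    return site, transferred

# Hypothetical sites and file sizes in megabytes.
sizes = {"f1": 500, "f2": 200, "f3": 50}
sites = {"site_a": {"f1"}, "site_b": {"f2", "f3"}}
print(run_job(["f1", "f3"], sites, sizes))
print(run_job(["f1", "f3"], sites, sizes))
```

In the example the first job goes to the site already holding the largest file and pulls only the small one; the second identical job then needs no transfer at all.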

arXiv.org e-Print Archive

CiteSeerX