A Formalization of Linkage Analysis
This report introduces a formalization of genetic linkage analysis. Linkage analysis is a computationally hard biomathematical method whose purpose is to locate genes on the human genome. It is rooted in the new area of bioinformatics, and no formalization of the method has previously been established. We first present the biological model. On the basis of this model we establish a formalization that enables reasoning about algorithms used in linkage analysis. The formalization applies to both single-point and multipoint linkage analysis. We illustrate its use in correctness proofs of central algorithms and optimisations for linkage analysis. A further use of the formalization is to reason about alternative methods for linkage analysis. We discuss the use of MTBDDs and PDGs in linkage analysis, since they have proven efficient for other computationally hard problems involving large state spaces. We conclude that none of the techniques discussed is directly applicable to linkage analysis; however, further research is needed to investigate whether a modified version of one or more of them is applicable.
Incremental multiple objective genetic algorithms
This paper presents a new genetic algorithm approach to multi-objective optimization problems: Incremental Multiple Objective Genetic Algorithms (IMOGA). Unlike conventional MOGA methods, it takes each objective into consideration incrementally. The whole evolution is divided into as many phases as there are objectives, and one more objective is considered in each phase. Each phase is composed of two stages: first, an independent population is evolved to optimize one specific objective; second, the better-performing individuals from the evolved single-objective population and the multi-objective population evolved in the last phase are joined together by the operation of integration. The resulting population then becomes an initial multi-objective population, to which a multi-objective evolution based on the incremented objective set is applied. Experimental results show that, on most problems, IMOGA performs better than three other MOGAs: NSGA-II, SPEA and PAES. IMOGA finds more solutions in the same time span, and the quality of the solutions is better.
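The phase structure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the one-dimensional real-valued encoding, the Gaussian mutation, the truncation selection, the "best half" integration rule, and all function names (`evolve_single`, `evolve_multi`, `imoga_sketch`) are hypothetical choices made for the example. All objectives are minimized.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def mutate(x, rng, sigma=0.1, lo=0.0, hi=1.0):
    # Assumed variation operator: Gaussian perturbation clipped to [lo, hi].
    return min(hi, max(lo, x + rng.gauss(0.0, sigma)))

def evolve_single(objective, rng, size=20, generations=30):
    """Stage 1: evolve an independent population on one objective."""
    pop = [rng.random() for _ in range(size)]
    for _ in range(generations):
        children = [mutate(x, rng) for x in pop]
        # Truncation selection; the result is sorted best-first.
        pop = sorted(pop + children, key=objective)[:size]
    return pop

def evolve_multi(pop, objectives, rng, size=20, generations=30):
    """Multi-objective evolution that keeps non-dominated individuals first."""
    for _ in range(generations):
        union = pop + [mutate(x, rng) for x in pop]
        scores = [tuple(f(x) for f in objectives) for x in union]
        front = [x for x, s in zip(union, scores)
                 if not any(dominates(t, s) for t in scores)]
        rest = [x for x in union if x not in front]
        pop = (front + rest)[:size]
    return pop

def imoga_sketch(objectives, seed=0, size=20):
    """One phase per objective; each phase adds one objective to the set."""
    rng = random.Random(seed)
    multi_pop = []
    for k, objective in enumerate(objectives, start=1):
        single_pop = evolve_single(objective, rng, size)
        # Assumed integration rule: best half of the single-objective
        # population joined with the leading half of the previous
        # multi-objective population.
        merged = single_pop[: size // 2] + multi_pop[: size // 2]
        multi_pop = evolve_multi(merged, objectives[:k], rng, size)
    return multi_pop
```

Calling `imoga_sketch([lambda x: x ** 2, lambda x: (x - 1) ** 2])` runs two phases: the first considers only the first objective, and the second integrates a population evolved on the second objective before evolving on both.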
Second-generation PLINK: rising to the challenge of larger and richer datasets
PLINK 1 is a widely used open-source C/C++ toolset for genome-wide
association studies (GWAS) and research in population genetics. However, the
steady accumulation of data from imputation and whole-genome sequencing studies
has exposed a strong need for even faster and more scalable implementations of
key functions. In addition, GWAS and population-genetic data now frequently
contain probabilistic calls, phase information, and/or multiallelic variants,
none of which can be represented by PLINK 1's primary data format.
To address these issues, we are developing a second-generation codebase for
PLINK. The first major release from this codebase, PLINK 1.9, introduces
extensive use of bit-level parallelism, O(sqrt(n))-time/constant-space
Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic
improvements. In combination, these changes accelerate most operations by 1-4
orders of magnitude, and allow the program to handle datasets too large to fit
in RAM. This will be followed by PLINK 2.0, which will introduce (a) a new data
format capable of efficiently representing probabilities, phase, and
multiallelic variants, and (b) extensions of many functions to account for the
new types of information.
The second-generation versions of PLINK will offer dramatic improvements in
performance and compatibility. For the first time, users without access to
high-end computing resources can perform several essential analyses of the
feature-rich and very large genetic datasets coming into use.
Comment: 2 figures, 1 additional fil
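As a rough illustration of what "bit-level parallelism" can mean for genotype data, the sketch below packs genotypes at two bits each into a single integer and counts alternate alleles with two population counts instead of a per-sample loop. The encoding (each 2-bit field holds the alternate-allele count 0, 1 or 2, with no missing-data code) and the function names are assumptions made for this example; they are not PLINK's file format or code.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def pack_genotypes(genotypes):
    """Pack alternate-allele counts (0, 1 or 2) into 2-bit fields of one int."""
    word = 0
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2):
            raise ValueError("genotype must be 0, 1 or 2")
        word |= g << (2 * i)
    return word

def alt_allele_count(word, n):
    """Sum of all n packed 2-bit fields, computed with two popcounts."""
    # Mask selecting the low bit of every 2-bit field: ...010101.
    mask = int("01" * n, 2) if n else 0
    low = word & mask           # low bits contribute 1 each
    high = (word >> 1) & mask   # high bits contribute 2 each
    return popcount(low) + 2 * popcount(high)
```

For example, `alt_allele_count(pack_genotypes(gs), len(gs))` equals `sum(gs)` for any list `gs` of values in {0, 1, 2}, while touching every sample in a handful of word-level operations.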