38 research outputs found

International Evaluation of Research and Doctoral Training at the University of Helsinki 2005-2010 : RC-Specific Evaluation of ALKO - Algorithms and Data Analysis

Publication date: 01/01/2012

Helsingin yliopiston digitaalinen arkisto

Fifth Biennial Report : June 1999 - August 2001

Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2001

MPG.PuRe

Sixth Biennial Report : August 2001 - May 2003

Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2003

MPG.PuRe

Optimization methods for side-chain positioning and macromolecular docking

Author: Moghadasi Mohammad
Publication date: 08/04/2016

This dissertation proposes new optimization algorithms targeting protein-protein docking which is an important class of problems in computational structural biology. The ultimate goal of docking methods is to predict the 3-dimensional structure of a stable protein-protein complex. We study two specific problems encountered in predictive docking of proteins. The first problem is Side-Chain Positioning (SCP), a central component of homology modeling and computational protein docking methods. We formulate SCP as a Maximum Weighted Independent Set (MWIS) problem on an appropriately constructed graph. Our formulation also considers the significant special structure of proteins that SCP exhibits for docking. We develop an approximate algorithm that solves a relaxation of MWIS and employ randomized estimation heuristics to obtain high-quality feasible solutions to the problem. The algorithm is fully distributed and can be implemented on multi-processor architectures. Our computational results on a benchmark set of protein complexes show that the accuracy of our approximate MWIS-based algorithm predictions is comparable with the results achieved by a state-of-the-art method that finds an exact solution to SCP. The second problem we target in this work is protein docking refinement. We propose two different methods to solve the refinement problem. The first approach is based on a Monte Carlo-Minimization (MCM) search to optimize rigid-body and side-chain conformations for binding. In particular, we study the impact of optimally positioning the side-chains in the interface region between two proteins in the process of binding. We report computational results showing that incorporating side-chain flexibility in docking provides substantial improvement in the quality of docked predictions compared to the rigid-body approaches. 
Further, we demonstrate that the inclusion of unbound side-chain conformers in the side-chain search introduces significant improvement in the performance of the docking refinement protocols. In the second approach, we propose a novel stochastic optimization algorithm based on Subspace Semi-Definite programming-based Underestimation (SSDU), which aims to solve protein docking and protein structure prediction. SSDU is based on underestimating the binding energy function in a permissive subspace of the space of rigid-body motions. We apply Principal Component Analysis (PCA) to determine the permissive subspace and reduce the dimensionality of the conformational search space. We consider the general class of convex polynomial underestimators, and formulate the problem of finding such underestimators as a Semi-Definite Programming (SDP) problem. Using these underestimators, we perform a biased sampling in the vicinity of the conformational regions where the energy function is at its global minimum. Moreover, we develop an exploration procedure based on density-based clustering to detect the near-native regions even when there are many local minima residing far from each other. We also incorporate a Model Selection procedure into SSDU to pick a predictive conformation. Testing our algorithm over a benchmark of protein complexes indicates that SSDU substantially improves the quality of docking refinement compared with existing methods.

Boston University Institutional Repository (OpenBU)

The Minimum Description Length Principle for Pattern Mining: A Survey

Author: Galbrun Esther
Publication date: 28/07/2021

This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim to obtain compact high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.

arXiv.org e-Print Archive

Seventh Biennial Report : June 2003 - March 2005

Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2005

MPG.PuRe

A forensics software toolkit for DNA steganalysis.

Author: Beck Marc Bjoern
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/05/2015

Recent advances in genetic engineering have allowed the insertion of artificial DNA strands into the living cells of organisms. Several methods have been developed to insert information into a DNA sequence for the purpose of data storage, watermarking, or communication of secret messages. The ability to detect, extract, and decode messages from DNA is important for forensic data collection and for data security. We have developed a software toolkit that is able to detect the presence of a hidden message within a DNA sequence, extract that message, and then decode it. The toolkit is able to detect, extract, and decode messages that have been encoded with a variety of different coding schemes. The goal of this project is to enable our software toolkit to determine with which coding scheme a message has been encoded in DNA and then to decode it. The software package is able to decode messages that have been encoded with every variation of most of the coding schemes described in this document. The software toolkit has two different options for decoding that can be selected by the user. The first is a frequency analysis approach that is very commonly used in cryptanalysis. This approach is very fast, but is unable to decode messages shorter than 200 words accurately. The second option is using a Genetic Algorithm (GA) in combination with a Wisdom of Artificial Crowds (WoAC) technique. This approach is very time-consuming, but can decode shorter messages with much higher accuracy.
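The frequency analysis option mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic textbook version for a simple substitution cipher, not the toolkit's actual implementation, which the abstract does not describe. The names `ENGLISH_ORDER`, `frequency_guess`, and `apply_guess` are hypothetical, and the letter ordering is one commonly quoted approximation.

```python
import string
from collections import Counter

# Assumption: an approximate ordering of English letters from most to
# least frequent. Published orderings differ slightly from one another.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def frequency_guess(ciphertext):
    """Guess a substitution key by matching letter-frequency ranks."""
    letters = [c for c in ciphertext.lower() if c in string.ascii_lowercase]
    counts = Counter(letters)
    # Rank cipher letters by descending count; ties break alphabetically
    # so the result is deterministic.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    return {cipher: ENGLISH_ORDER[i] for i, cipher in enumerate(ranked)}


def apply_guess(ciphertext, mapping):
    """Apply the guessed key; characters without a mapping pass through."""
    return "".join(mapping.get(c, c) for c in ciphertext.lower())
```

On short inputs the observed ranks rarely match the language-wide ranks, which is consistent with the abstract's remark that this approach is fast but inaccurate for short messages.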

University of Louisville

Exploiting the probability of observation for efficient Bayesian network inference

Author: Mousumi Fouzia Ashraf
Publication venue: University of Central Missouri, Department of Mathematics and Computer Science
Publication date: 01/01/2013

xi, 88 leaves : ill. ; 29 cm

It is well known that the observation of a variable in a Bayesian network can affect the effective connectivity of the network, which in turn affects the efficiency of inference. Unfortunately, the observed variables may not be known until runtime, which limits the amount of compile-time optimization that can be done in this regard. This thesis considers how to improve inference when users know the likelihood of a variable being observed. It demonstrates how these probabilities of observation can be exploited to improve existing heuristics for choosing elimination orderings for inference. Empirical tests over a set of benchmark networks using the Variable Elimination algorithm show reductions of up to 50% and 70% in multiplications and summations, respectively, as well as runtime reductions of up to 55%. Similarly, tests using the Elimination Tree algorithm show reductions by as much as 64%, 55%, and 50% in recursive calls, total cache size, and runtime, respectively.
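For context on the heuristics for choosing elimination orderings that the abstract above refers to, here is a minimal sketch of one standard baseline, the min-degree heuristic. It does not use probabilities of observation; how the thesis incorporates them is not described in the abstract and is not reproduced here. The function name `min_degree_order` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def min_degree_order(adjacency):
    """Return an elimination ordering using the min-degree heuristic.

    `adjacency` maps each variable to the set of its neighbours in the
    undirected interaction graph of the network (hypothetical format).
    """
    # Work on a copy so the caller's graph is left unchanged.
    graph = {v: set(ns) for v, ns in adjacency.items()}
    order = []
    while graph:
        # Pick the variable with the fewest neighbours; ties break by name.
        v = min(graph, key=lambda u: (len(graph[u]), u))
        neighbours = graph.pop(v)
        # Eliminating v connects its remaining neighbours to each other.
        for a in neighbours:
            graph[a].discard(v)
            graph[a] |= neighbours - {a}
        order.append(v)
    return order
```

For a chain A - B - C, given as `{"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}`, this returns `["A", "B", "C"]`.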

OPUS: Open Uleth Scholarship - University of Lethbridge Research Repository

Eighth Biennial Report : April 2005 – March 2007

Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2007

MPG.PuRe

Advances in Wheat Genetics: From Genome to Field: Proceedings of the 12th International Wheat Genetics Symposium

Publication venue: Springer Science and Business Media LLC

plant genetics; plant genomics; agricultur

OAPEN Library