321,533 research outputs found
Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees
Background: The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that are of considerable interest in bioinformatics: Given an arbitrary phylogenetic tree T and weights ωxy for the paths between any two pairs of leaves (x, y), what is the collection of edge-disjoint paths between pairs of leaves that maximizes the total weight? Special cases of the MPP for binary trees and equal weights have been described previously; algorithms to solve the general MPP are still missing, however. Results: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n^4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP. Conclusions: The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.Human Evolutionary Biolog
Optimal modeling for complex system design
The article begins with a brief introduction to the theory describing optimal data compression systems and their performance. A brief outline is then given of a representative algorithm that employs these lessons for optimal data compression system design. The implications of rate-distortion theory for practical data compression system design is then described, followed by a description of the tensions between theoretical optimality and system practicality and a discussion of common tools used in current algorithms to resolve these tensions. Next, the generalization of rate-distortion principles to the design of optimal collections of models is presented. The discussion focuses initially on data compression systems, but later widens to describe how rate-distortion theory principles generalize to model design for a wide variety of modeling applications. The article ends with a discussion of the performance benefits to be achieved using the multiple-model design algorithms
On some new approaches to practical Slepian-Wolf compression inspired by channel coding
This paper considers the problem, first introduced by Ahlswede and Körner in 1975, of lossless source coding with coded side information. Specifically, let X and Y be two random variables such that X is desired losslessly at the decoder while Y serves as side information. The random variables are encoded independently, and both descriptions are used by the decoder to reconstruct X. Ahlswede and Körner describe the achievable rate region in terms of an auxiliary random variable. This paper gives a partial solution for the optimal auxiliary random variable, thereby describing part of the rate region explicitly in terms of the distribution of X and Y
Update-Efficiency and Local Repairability Limits for Capacity Approaching Codes
Motivated by distributed storage applications, we investigate the degree to
which capacity achieving encodings can be efficiently updated when a single
information bit changes, and the degree to which such encodings can be
efficiently (i.e., locally) repaired when single encoded bit is lost.
Specifically, we first develop conditions under which optimum
error-correction and update-efficiency are possible, and establish that the
number of encoded bits that must change in response to a change in a single
information bit must scale logarithmically in the block-length of the code if
we are to achieve any nontrivial rate with vanishing probability of error over
the binary erasure or binary symmetric channels. Moreover, we show there exist
capacity-achieving codes with this scaling.
With respect to local repairability, we develop tight upper and lower bounds
on the number of remaining encoded bits that are needed to recover a single
lost bit of the encoding. In particular, we show that if the code-rate is
less than the capacity, then for optimal codes, the maximum number
of codeword symbols required to recover one lost symbol must scale as
.
Several variations on---and extensions of---these results are also developed.Comment: Accepted to appear in JSA
Strategies for protecting intellectual property when using CUDA applications on graphics processing units
Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics elds, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Uni ed Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the di erent binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering
- …