A MOSAIC of methods: Improving ortholog detection through integration of algorithmic diversity
Ortholog detection (OD) is a critical step for comparative genomic analysis
of protein-coding sequences. In this paper, we begin with a comprehensive
comparison of four popular, methodologically diverse OD methods: MultiParanoid,
Blat, Multiz, and OMA. In head-to-head comparisons, these methods are shown to
significantly outperform one another 12-30% of the time. This high
complementarity motivates the presentation of the first tool for integrating
methodologically diverse OD methods. We term this program MOSAIC, or Multiple
Orthologous Sequence Analysis and Integration by Cluster optimization. Relative
to component and competing methods, we demonstrate that MOSAIC more than
quintuples the number of alignments for which all species are present, while
simultaneously maintaining or improving functional-, phylogenetic-, and
sequence identity-based measures of ortholog quality. Further, we demonstrate
that this improvement in alignment quality yields 40-280% more confidently
aligned sites. Combined, these factors translate to higher estimated levels of
overall conservation, while at the same time allowing for the detection of up
to 180% more positively selected sites. MOSAIC is available as a Python package.
MOSAIC alignments, source code, and full documentation are available at
http://pythonhosted.org/bio-MOSAIC
The System Kato: Detecting Cases of Plagiarism for Answer-Set Programs
Plagiarism detection is a growing need among educational institutions and
solutions for different purposes exist. An important field in this direction is
detecting cases of source-code plagiarism. In this paper, we present the tool
Kato for supporting the detection of this kind of plagiarism in the area of
answer-set programming (ASP). Currently, the tool is implemented for DLV
programs but it is designed to handle other logic-programming dialects as well.
We review the basic features of Kato, introduce its theoretical underpinnings,
and discuss an application of Kato for plagiarism detection in the context of
courses on logic programming at the Vienna University of Technology.
Graph Symmetry Detection and Canonical Labeling: Differences and Synergies
Symmetries of combinatorial objects are known to complicate search
algorithms, but such obstacles can often be removed by detecting symmetries
early and discarding symmetric subproblems. Canonical labeling of combinatorial
objects facilitates easy equivalence checking through quick matching. All
existing canonical labeling software also finds symmetries, but the fastest
symmetry-finding software does not perform canonical labeling. In this work, we
contrast the two problems and dissect typical algorithms to identify their
similarities and differences. We then develop a novel approach to canonical
labeling where symmetries are found first and then used to speed up the
canonical labeling algorithms. Empirical results show that this approach
outperforms state-of-the-art canonical labelers.
Comment: 15 pages, 10 figures, 1 table, Turing-10
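To make the distinction between the two problems concrete, the following is a minimal brute-force sketch, not the approach of the paper above. It assumes small undirected graphs given as edge lists over vertices 0..n-1; the function names `canonical_form` and `automorphisms` are hypothetical and chosen for illustration only. Canonical labeling picks one representative relabeling per isomorphism class, while symmetry detection enumerates the relabelings that map a graph onto itself.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Return the lexicographically smallest sorted edge tuple over all
    relabelings of the n vertices (brute force, only viable for tiny n)."""
    best = None
    for perm in permutations(range(n)):
        # Relabel every edge, normalise each edge as (small, large), then sort.
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


def automorphisms(n, edges):
    """Return every vertex permutation that maps the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    return [
        perm
        for perm in permutations(range(n))
        if {frozenset((perm[u], perm[v])) for u, v in edges} == edge_set
    ]


# Two differently labeled 3-vertex paths and a triangle (illustrative inputs).
path_a = [(0, 1), (1, 2)]
path_b = [(0, 2), (2, 1)]
triangle = [(0, 1), (1, 2), (0, 2)]

# Isomorphic graphs share a canonical form, so equivalence is a plain comparison.
same = canonical_form(3, path_a) == canonical_form(3, path_b)
different = canonical_form(3, path_a) != canonical_form(3, triangle)
```

Both functions enumerate all n! permutations, so this only illustrates the definitions; practical tools prune that search heavily.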