Extracting Conflict-free Information from Multi-labeled Trees
A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more
leaves share a label, e.g., a species name. A MUL-tree can imply multiple
conflicting phylogenetic relationships for the same set of taxa, but can also
contain conflict-free information that is of interest and yet is not obvious.
We define the information content of a MUL-tree T as the set of all
conflict-free quartet topologies implied by T, and define the maximal reduced
form of T as the smallest tree that can be obtained from T by pruning leaves
and contracting edges while retaining the same information content. We show
that any two MUL-trees with the same information content exhibit the same
reduced form. This introduces an equivalence relation on MUL-trees with
potential applications to comparing MUL-trees. We present an efficient
algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its
performance on empirical datasets in terms of both quality of the reduced tree
and the degree of data reduction achieved. Comment: Submitted to the Workshop on Algorithms in Bioinformatics 2012
(http://algo12.fri.uni-lj.si/?file=wabi
Learning to Prove Theorems via Interacting with Proof Assistants
Humans prove theorems by relying on substantial high-level reasoning and
problem-specific insights. Proof assistants offer a formalism that resembles
human mathematical reasoning, representing theorems in higher-order logic and
proofs as high-level tactics. However, human experts have to construct proofs
manually by entering tactics into the proof assistant. In this paper, we study
the problem of using machine learning to automate the interaction with proof
assistants. We construct CoqGym, a large-scale dataset and learning environment
containing 71K human-written proofs from 123 projects developed with the Coq
proof assistant. We develop ASTactic, a deep learning-based model that
generates tactics as programs in the form of abstract syntax trees (ASTs).
Experiments show that ASTactic trained on CoqGym can generate effective tactics
and can be used to prove new theorems not previously provable by automated
methods. Code is available at https://github.com/princeton-vl/CoqGym. Comment: Accepted to ICML 201
Provably Correct Control-Flow Graphs from Java Programs with Exceptions
We present an algorithm to extract flow graphs from Java bytecode, focusing on exceptional control flows. We prove its correctness, meaning that the behaviour of the extracted control-flow graph is an over-approximation of the behaviour of the original program. Thus any safety property that holds for the extracted control-flow graph also holds for the original program. This makes control-flow graphs suitable for performing different static analyses. For precision and efficiency, the extraction is performed in two phases. In the first phase the program is transformed into a BIR program, where BIR is a stack-less intermediate representation of Java bytecode; in the second phase the control-flow graph is extracted from the BIR representation. To prove the correctness of the two-phase extraction, we also define a direct extraction algorithm, whose correctness can be proven immediately. Then we show that the behaviour of the control-flow graph extracted via the intermediate representation is an over-approximation of the behaviour of the directly extracted graphs, and thus of the original program.
Treebank-based acquisition of wide-coverage, probabilistic LFG resources: project overview, results and evaluation
This paper presents an overview of a project to acquire wide-coverage, probabilistic Lexical-Functional Grammar
(LFG) resources from treebanks. Our approach is based on an automatic annotation algorithm that annotates “raw” treebank trees with LFG f-structure information approximating to basic predicate-argument/dependency structure. From the f-structure-annotated treebank
we extract probabilistic unification grammar resources. We present the annotation algorithm, the extraction of
lexical information and the acquisition of wide-coverage and robust PCFG-based LFG approximations including
long-distance dependency resolution.
We show how the methodology can be applied to multilingual, treebank-based unification grammar acquisition. Finally,
we show how simple (quasi-)logical forms can be derived automatically from the f-structures generated for the treebank trees.
Ground Profile Recovery from Aerial 3D LiDAR-based Maps
The paper presents the study and implementation of a ground detection
methodology that filters and removes forest points from a LiDAR-based 3D
point cloud using the Cloth Simulation Filtering (CSF) algorithm. The
methodology makes it possible to recover the terrain relief and create a
landscape map of a forested region. As a proof of concept, we conducted an
outdoor flight experiment, flying a hexacopter over a mixed forest region with
sharp ground changes near Innopolis city (Russia), which demonstrated
encouraging results for both ground detection and methodology robustness. Comment: 8 pages, FRUCT-2019 conference