A new balance index for phylogenetic trees
Several indices that measure the degree of balance of a rooted phylogenetic
tree have been proposed so far in the literature. In this work we define and
study a new index of this kind, which we call the total cophenetic index: the
sum, over all pairs of different leaves, of the depth of their least common
ancestor. This index makes sense for arbitrary trees, can be computed in linear
time, and has a larger range of values and a greater resolution power than
other indices like Colless' or Sackin's. We compute its maximum and minimum
values for arbitrary and binary trees, as well as exact formulas for its
expected value for binary trees under the Yule and the uniform models of
evolution. As a byproduct of this study, we obtain an exact formula for the
expected value of the Sackin index under the uniform model, a result that seems
to be new in the literature.
Comment: 24 pages, 2 figures, preliminary version presented at the JBI 201
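As an illustration of the definition above, here is a minimal Python sketch of a linear-time computation. It relies on the observation that the depth of the least common ancestor of two leaves equals the number of non-root nodes that have both leaves as descendants, so the index is the sum, over non-root nodes, of the number of leaf pairs below each node. The nested-tuple tree encoding and the function name are assumptions made for this example, not taken from the paper.

```python
from math import comb


def total_cophenetic_index(tree):
    """Total cophenetic index of a rooted tree given as nested tuples.

    Assumed encoding (hypothetical): a tuple is an internal node whose
    items are its children; anything else is a leaf.
    """
    total = 0

    def count_leaves(node, is_root):
        nonlocal total
        if not isinstance(node, tuple):
            return 1  # a leaf has exactly one descendant leaf
        k = sum(count_leaves(child, False) for child in node)
        if not is_root:
            # every pair of leaves below a non-root node has this node
            # among its common ancestors, adding 1 to their LCA depth
            total += comb(k, 2)
        return k

    count_leaves(tree, True)
    return total


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
star = ("a", "b", "c", "d")
print(total_cophenetic_index(caterpillar))  # 4
print(total_cophenetic_index(balanced))     # 2
print(total_cophenetic_index(star))         # 0
```

Each node is visited once, so the running time is linear in the size of the tree. The recursion is used for brevity; very deep trees would need an iterative traversal.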
Identification of functionally related enzymes by learning-to-rank methods
Enzyme sequences and structures are routinely used in the biological sciences
as queries to search for functionally related enzymes in online databases. To
this end, one usually starts from some notion of similarity, comparing two
enzymes by looking for correspondences in their sequences, structures or
surfaces. For a given query, the search operation results in a ranking of the
enzymes in the database, from very similar to dissimilar enzymes, while
information about the biological function of annotated database enzymes is
ignored.
In this work we show that rankings of that kind can be substantially improved
by applying kernel-based learning algorithms. This approach enables the
detection of statistical dependencies between similarities of the active cleft
and the biological function of annotated enzymes. This is in contrast to
search-based approaches, which do not take annotated training data into
account. Similarity measures based on the active cleft are known to outperform
sequence-based or structure-based measures under certain conditions. We
consider the Enzyme Commission (EC) classification hierarchy for obtaining
annotated enzymes during the training phase. The results of a set of sizeable
experiments indicate a consistent and significant improvement for a set of
similarity measures that exploit information about small cavities in the
surface of enzymes.
The Decomposition Theorem and the topology of algebraic maps
We give a motivated introduction to the theory of perverse sheaves,
culminating in the Decomposition Theorem of Beilinson, Bernstein, Deligne and
Gabber. A goal of this survey is to show how the theory develops naturally from
classical constructions used in the study of topological properties of
algebraic varieties. While most proofs are omitted, we discuss several
approaches to the Decomposition Theorem and indicate some important applications
and examples.
Comment: 117 pages. New title. Major structure changes. Final version of a
survey to appear in the Bulletin of the AM
Towards Stratification Learning through Homology Inference
A topological approach to stratification learning is developed for point
cloud data drawn from a stratified space. Given such data, our objective is to
infer which points belong to the same strata. First we define a multi-scale
notion of a stratified space, giving a stratification for each radius level. We
then use methods derived from kernel and cokernel persistent homology to
cluster the data points into different strata, and we prove a result which
guarantees the correctness of our clustering, given certain topological
conditions; some geometric intuition for these topological conditions is also
provided. Our correctness result is then given a probabilistic flavor: we give
bounds on the minimum number of sample points required to infer, with
probability, which points belong to the same strata. Finally, we give an
explicit algorithm for the clustering, prove its correctness, and apply it to
some simulated data.
Comment: 48 page
Type-safe two-level data transformation
A two-level data transformation consists of a type-level transformation of a data format coupled with value-level transformations of data instances corresponding to that format. Examples of two-level data transformations include XML schema evolution coupled with document migration, and data mappings used for interoperability and persistence. We provide a formal treatment of two-level data transformations that is type-safe in the sense that the well-formedness of the value-level transformations with respect to the type-level transformation is guarded by a strong type system. We rely on various techniques for generic functional programming to implement the formalization in Haskell.
The formalization addresses various two-level transformation scenarios, covering fully automated as well as user-driven transformations, and allowing transformations that are information-preserving or not. In each case, two-level transformations are disciplined by one-step transformation rules and type-level transformations induce value-level transformations. We demonstrate an example hierarchical-relational mapping and subsequent migration of relational data induced by hierarchical format evolution.
Fundação para a Ciência e a Tecnologia (FCT
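The idea that a type-level step induces value-level migrations can be sketched roughly as follows. This is a simplified Python illustration, not the paper's Haskell formalization: the names `TwoLevel`, `rename_field` and `conforms` are hypothetical, schemas are modelled as plain dictionaries, and conformance is checked at run time rather than guaranteed by a static type system.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Schema = Dict[str, type]    # assumed model: field name -> Python type
Record = Dict[str, object]  # a data instance of some schema


@dataclass
class TwoLevel:
    """Result of one type-level step: the new schema plus the
    value-level migrations it induces (hypothetical structure)."""
    target: Schema
    forward: Callable[[Record], Record]
    backward: Callable[[Record], Record]


def rename_field(schema: Schema, old: str, new: str) -> TwoLevel:
    """One-step rule: rename a field in the schema and in its instances."""
    if old not in schema or new in schema:
        raise ValueError("rule does not apply to this schema")
    target = {(new if k == old else k): t for k, t in schema.items()}

    def forward(rec: Record) -> Record:
        return {(new if k == old else k): v for k, v in rec.items()}

    def backward(rec: Record) -> Record:
        return {(old if k == new else k): v for k, v in rec.items()}

    return TwoLevel(target, forward, backward)


def conforms(rec: Record, schema: Schema) -> bool:
    """Run-time stand-in for the static guarantee described above."""
    return rec.keys() == schema.keys() and all(
        isinstance(rec[k], t) for k, t in schema.items()
    )


schema = {"name": str, "age": int}
step = rename_field(schema, "name", "full_name")
record = {"name": "Ada", "age": 36}
migrated = step.forward(record)
print(conforms(migrated, step.target))    # True
print(step.backward(migrated) == record)  # True
```

Renaming is information-preserving, so `backward` undoes `forward` exactly. A rule that drops a field would still produce a `forward` migration but could only offer a partial `backward`.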
Spacetime and Physical Equivalence
In this essay I begin to lay out a conceptual scheme for: (i) analysing
dualities as cases of theoretical equivalence; (ii) assessing when cases of
theoretical equivalence are also cases of physical equivalence. The scheme is
applied to gauge/gravity dualities. I expound what I argue to be their
contribution to questions about: (iii) the nature of spacetime in quantum
gravity; (iv) broader philosophical and physical discussions of spacetime.
(i)-(ii) proceed by analysing duality through four contrasts. A duality will be
a suitable isomorphism between models; the four relevant contrasts are as
follows:
(a) Bare theory: a triple of states, quantities, and dynamics endowed with
appropriate structures and symmetries; vs. interpreted theory: which is endowed
with, in addition, a suitable pair of interpretative maps.
(b) Extendable vs. unextendable theories: which can, respectively cannot, be
extended as regards their domains of application.
(c) External vs. internal interpretations: which are constructed,
respectively, by coupling the theory to another interpreted theory vs. from
within the theory itself.
(d) Theoretical vs. physical equivalence: which contrasts formal equivalence
with the equivalence of fully interpreted theories.
I apply this scheme to answering questions (iii)-(iv) for gauge/gravity
dualities. I argue that the things that are physically relevant are those that
stand in a bijective correspondence under duality: the common core of the two
models. I therefore conclude that most of the mathematical and physical
structures that we are familiar with, in these models, are largely, though
crucially never entirely, not part of that common core. Thus, the
interpretation of dualities for theories of quantum gravity compels us to
rethink the roles that spacetime, and many other tools in theoretical physics,
play in theories of spacetime.
Comment: 25 pages. Winner of the essay contest "Space and Time After Quantum
Gravity" of the University of Illinois at Chicago and the University of
Genev
Isomorphisms of types in the presence of higher-order references
We investigate the problem of type isomorphisms in a programming language
with higher-order references. We first recall the game-theoretic model of
higher-order references by Abramsky, Honda and McCusker. Solving an open
problem by Laurent, we show that two finitely branching arenas are isomorphic
if and only if they are geometrically the same, up to renaming of moves
(Laurent's forest isomorphism). We deduce from this an equational theory
characterizing isomorphisms of types in a finitary language with higher-order
references. We show however that Laurent's conjecture does not hold on
infinitely branching arenas, yielding a non-trivial type isomorphism in the
extension of this language with natural numbers.
Comment: Twenty-Sixth Annual IEEE Symposium on Logic In Computer Science (LICS
2011), Toronto : Canada (2011
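The notion of two arenas being "geometrically the same, up to renaming of moves" can be illustrated, in the finite case, by a standard canonical-form test for unordered forest isomorphism. The sketch below is only an illustration under simplifying assumptions: an arena is modelled as a finite forest with no further structure, a node is encoded as the tuple of its children, and the function names are hypothetical. It is not the construction used in the paper.

```python
def canonical(node):
    """Canonical form of an unordered rooted tree.

    Assumed encoding: a node is the tuple of its children, so a node
    without children is the empty tuple ().
    """
    return tuple(sorted(canonical(child) for child in node))


def forests_isomorphic(f, g):
    """True if two finite forests are equal up to reordering of siblings
    and roots, i.e. up to renaming of nodes."""
    return sorted(canonical(t) for t in f) == sorted(canonical(t) for t in g)


# Two forests that differ only in the order of siblings and roots.
f = [((), ((),)), ()]
g = [(), (((),), ())]
# A forest with the same number of nodes but a different shape.
h = [((), ()), ((),)]

print(forests_isomorphic(f, g))  # True
print(forests_isomorphic(f, h))  # False
```

Sorting the canonical forms of the children makes the encoding independent of sibling order, which is why equality of the sorted lists decides isomorphism for finite forests. Nothing here addresses the infinitely branching case discussed in the abstract.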
Isomorphisms of types in the presence of higher-order references (extended version)
We investigate the problem of type isomorphisms in the presence of
higher-order references. We first introduce a finitary programming language
with sum types and higher-order references, for which we build a fully abstract
games model following the work of Abramsky, Honda and McCusker. Solving an open
problem by Laurent, we show that two finitely branching arenas are isomorphic
if and only if they are geometrically the same, up to renaming of moves
(Laurent's forest isomorphism). We deduce from this an equational theory
characterizing isomorphisms of types in our language. We show however that
Laurent's conjecture does not hold on infinitely branching arenas, yielding new
non-trivial type isomorphisms in a variant of our language with natural
numbers.