Hidden breakpoints in genome alignments
During the course of evolution, an organism's genome can undergo changes that
affect the large-scale structure of the genome. These changes include gene
gain, loss, duplication, chromosome fusion, fission, and rearrangement. When
gene gain and loss occur in addition to other types of rearrangement,
breakpoints of rearrangement can exist that are only detectable by comparison
of three or more genomes. An arbitrarily large number of these "hidden"
breakpoints can exist among genomes that exhibit no rearrangements in pairwise
comparisons.
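The idea can be illustrated with a toy example. The sketch below is not the paper's median-based method: it assumes each genome is a single linear chromosome written as a list of signed gene identifiers, and it treats two genomes as rearrangement-free when their shared genes appear in the same signed order up to reversal of the whole chromosome. The helper names (`restrict`, `same_order`, `pairwise_collinear`, `has_common_order`) are hypothetical. Every pair of the three genomes is collinear, yet a brute-force search finds no single linear gene order consistent with all three, so at least one breakpoint must be hidden.

```python
from itertools import permutations, product

def restrict(genome, genes):
    """Keep only the signed genes whose identifier is in `genes`."""
    return [g for g in genome if abs(g) in genes]

def same_order(a, b):
    """True if signed orders a and b agree, allowing reversal of the whole chromosome."""
    return a == b or a == [-g for g in reversed(b)]

def pairwise_collinear(a, b):
    """Compare two genomes on their shared genes only (toy pairwise comparison)."""
    shared = {abs(g) for g in a} & {abs(g) for g in b}
    return same_order(restrict(a, shared), restrict(b, shared))

def has_common_order(genomes):
    """Brute force: is there one signed linear order that restricts to every genome?"""
    genes = sorted({abs(g) for genome in genomes for g in genome})
    for perm in permutations(genes):
        for signs in product((1, -1), repeat=len(genes)):
            candidate = [s * g for s, g in zip(signs, perm)]
            if all(
                same_order(restrict(candidate, {abs(g) for g in genome}), genome)
                for genome in genomes
            ):
                return True
    return False

# Three genomes, each missing one of the genes 1, 2, 3 (assumed toy data).
A, B, C = [1, 2], [2, 3], [3, 1]
pairs_ok = all(pairwise_collinear(x, y) for x, y in [(A, B), (A, C), (B, C)])
print("all pairs collinear:", pairs_ok)
print("common linear order exists:", has_common_order([A, B, C]))
```

The search is exponential in the number of genes and is meant only for examples of this size. Under a model that also allows circular chromosomes, these three genomes would be consistent with a single circular order, so the result depends on the linear-chromosome assumption made here.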
We present an extension of the multichromosomal breakpoint median problem to
genomes that have undergone gene gain and loss. We then demonstrate that the
median distance among three genomes can be used to calculate a lower bound on
the number of hidden breakpoints present. We provide an implementation of this
calculation, including the median distance, along with some practical
improvements to the time complexity of the underlying algorithm.
We apply our approach to measure the abundance of hidden breakpoints in
simulated data sets under a wide range of evolutionary scenarios. We
demonstrate that, in simulations, hidden breakpoint counts depend strongly on
the relative rates of inversion and gene gain/loss. Finally, we apply current
multiple genome aligners to the simulated genomes, and show that all aligners
introduce a high degree of error in hidden breakpoint counts, and that this
error grows with evolutionary distance in the simulation. Our results suggest
that hidden breakpoint error may be pervasive in genome alignments.
Comment: 13 pages, 4 figures
Nearest Neighbor Distances on a Circle: Multidimensional Case
We study the distances, called spacings, between pairs of neighboring energy
levels for the quantum harmonic oscillator. Specifically, we consider all
energy levels falling between E and E+1, and study how the spacings between
these levels change for various choices of E, particularly when E goes to
infinity. Primarily, we study the case in which the spring constant is a badly
approximable vector. We first give the proof by Boshernitzan-Dyson that the
number of distinct spacings has a uniform bound independent of E. Then, if the
spring constant has components forming a basis of an algebraic number field, we
show that, when normalized up to a unit, the spacings are from a finite set.
Moreover, in the specific case that the field has one fundamental unit, the
probability distribution of these spacings behaves quasiperiodically in log E.
We conclude by studying the spacings in the case that the spring constant is
not badly approximable, providing examples for which the number of distinct
spacings is unbounded.
Comment: Version 2 is updated to include more discussion of previous works. 17
pages with five figures. To appear in the Journal of Statistical Physics
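The uniform bound on the number of distinct spacings has a classical one-dimensional counterpart, the three-gap (Steinhaus) theorem: the points k*alpha mod 1 for k = 1..N cut the circle into arcs of at most three distinct lengths, for any alpha and N. The sketch below checks this numerically. It illustrates only that one-dimensional analogue, not the multidimensional oscillator setting of the abstract; the function name `distinct_gaps` and the parameters `alpha`, `n_points`, and `tol` are illustrative choices, and gaps are grouped with a floating-point tolerance rather than compared exactly.

```python
import math

def distinct_gaps(alpha, n_points, tol=1e-9):
    """Return the distinct arc lengths between neighboring points k*alpha mod 1."""
    points = sorted((k * alpha) % 1.0 for k in range(1, n_points + 1))
    # Gaps between consecutive points, plus the wrap-around arc closing the circle.
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])
    gaps.sort()
    # Merge values that differ only by floating-point noise.
    distinct = [gaps[0]]
    for g in gaps[1:]:
        if g - distinct[-1] > tol:
            distinct.append(g)
    return distinct

golden = (math.sqrt(5) - 1) / 2  # a badly approximable number
for n in (10, 100, 1000):
    print(n, len(distinct_gaps(golden, n)))
```

The tolerance is adequate for the small N used here, where distinct gap lengths differ by far more than rounding error; for much larger N, exact arithmetic on a rational or algebraic representation of alpha would be a safer choice.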