Dispersion and collapse of wave maps
We study numerically the Cauchy problem for equivariant wave maps from 3+1
Minkowski spacetime into the 3-sphere. On the basis of numerical evidence
combined with stability analysis of self-similar solutions we formulate two
conjectures. The first conjecture states that singularities which are produced
in the evolution of sufficiently large initial data are approached in a
universal manner given by the profile of a stable self-similar solution. The
second conjecture states that the codimension-one stable manifold of a
self-similar solution with exactly one instability determines the threshold of
singularity formation for a large class of initial data. Our results can be
considered as a toy-model for some aspects of the critical behavior in
formation of black holes.
Comment: 14 pages, LaTeX, 9 eps figures included, typos corrected
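For context, the equivariant reduction typically studied in this setting can be written as follows (a sketch under the standard corotational ansatz; the paper's exact conventions and normalizations may differ):

```latex
% Corotational ansatz: the wave map is determined by a radial profile \psi(t,r),
% and the wave-map system from 3+1 Minkowski spacetime into S^3 reduces to
\psi_{tt} = \psi_{rr} + \frac{2}{r}\,\psi_r - \frac{\sin(2\psi)}{r^{2}}
```

Self-similar solutions are then profiles depending only on $r/(T-t)$, whose stability analysis underlies the two conjectures above.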
On the Moduli Space of Singular Euclidean Surfaces
The goal of this paper is to develop some aspects of the deformation theory
of piecewise flat structures on surfaces and use this theory to construct new
geometric structures on the moduli space of Riemann surfaces.
Comment: To appear in the Handbook of Teichmuller Theory, vol. 1, ed. A. Papadopoulos, European Math. Society Series, 200
An Extended Stable Marriage Problem Algorithm for Clone Detection
Code cloning negatively affects industrial software and threatens
intellectual property. This paper presents a novel approach to detecting cloned
software by using a bijective matching technique. The proposed approach focuses
on increasing the range of similarity measures and thus enhancing the precision
of the detection. This is achieved by extending a well-known stable-marriage
problem (SMP) and demonstrating how matches between code fragments of different
files can be expressed. A prototype of the proposed approach is provided using
a proper scenario, which shows a noticeable improvement in several features of
clone detection such as scalability and accuracy.
Comment: 20 pages, 10 figures, 6 tables
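The approach above extends the classical stable-marriage problem (SMP). As background, here is a minimal Gale-Shapley sketch in Python; the names and toy preference lists are illustrative only, not the paper's extended matching of code fragments:

```python
from collections import deque

def gale_shapley(proposer_prefs, acceptor_prefs):
    """Classical Gale-Shapley stable matching.

    proposer_prefs: dict mapping each proposer to its ordered list of acceptors.
    acceptor_prefs: dict mapping each acceptor to its ordered list of proposers.
    Returns a stable matching as a dict {proposer: acceptor}.
    """
    # rank[a][p] = position of proposer p in acceptor a's preference list
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    free = deque(proposer_prefs)            # proposers without a partner
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                            # acceptor -> current proposer
    while free:
        p = free.popleft()
        a = proposer_prefs[p][next_choice[p]]   # p's best not-yet-tried acceptor
        next_choice[p] += 1
        if a not in engaged:
            engaged[a] = p                  # a was free: accept
        elif rank[a][p] < rank[a][engaged[a]]:
            free.append(engaged[a])         # a prefers p: dump current partner
            engaged[a] = p
        else:
            free.append(p)                  # a rejects p: p stays free
    return {p: a for a, p in engaged.items()}

# Toy instance: two "files" f1, f2 matched against two "fragments" g1, g2.
match = gale_shapley({'f1': ['g1', 'g2'], 'f2': ['g1', 'g2']},
                     {'g1': ['f2', 'f1'], 'g2': ['f1', 'f2']})
```

In a clone-detection setting, preference orders would come from similarity scores between code fragments; the paper's contribution is in extending this bijective matching, not in the classical algorithm itself.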
On the Similarities Between Native, Non-native and Translated Texts
We present a computational analysis of three language varieties: native,
advanced non-native, and translation. Our goal is to investigate the
similarities and differences between non-native language productions and
translations, contrasting both with native language. Using a collection of
computational methods we establish three main results: (1) the three types of
texts are easily distinguishable; (2) non-native language and translations are
closer to each other than each of them is to native language; and (3) some of
these characteristics depend on the source or native language, while others do
not, reflecting, perhaps, unified principles that similarly affect translations
and non-native language.
Comment: ACL 2016, 12 pages
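One standard way such language varieties are distinguished computationally is by comparing function-word frequency profiles. A minimal sketch follows; the feature set and similarity measure here are common illustrative choices, not necessarily the methods used in the paper:

```python
import math
from collections import Counter

# Illustrative subset of English function words (content-independent style markers).
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "for"]

def fw_profile(text):
    """Normalized function-word frequency vector for a text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity between two frequency vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Profiles of native, non-native, and translated corpora could then be compared pairwise, e.g. to check whether non-native text and translations lie closer to each other than either does to native text.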
Measure, Topology and Probabilistic Reasoning in Cosmology
I explain the difficulty of making various concepts of and relating to
probability precise, rigorous and physically significant when attempting to
apply them in reasoning about objects (e.g., spacetimes) living in
infinite-dimensional spaces, working through many examples from cosmology. I
focus on the relation of topological to measure-theoretic notions of and
relating to probability, how they diverge in unpleasant ways in the
infinite-dimensional case, and are difficult to work with on their own as well
in that context. Even in cases where an appropriate family of spacetimes is
finite-dimensional, however, and so admits a measure of the relevant sort, it
is always the case that the family is not a compact topological space, and so
does not admit a physically significant, well behaved probability measure.
Problems of a different but still deeply troubling sort plague arguments about
likelihood in that context, which I also discuss. I conclude that most standard
forms of argument used in cosmology to estimate the likelihood of the
occurrence of various properties or behaviors of spacetimes have serious
mathematical, physical and conceptual problems.
Comment: 26 pages
Clustering by compression
We present a new method for clustering based on compression. The method
does not use subject-specific features or background knowledge, and works as
follows: First, we determine a universal similarity distance, the normalized
compression distance or NCD, computed from the lengths of compressed data files
(singly and in pairwise concatenation). Second, we apply a hierarchical
clustering method. The NCD is universal in that it is not restricted to a
specific application area, and works across application area boundaries. A
theoretical precursor, the normalized information distance, co-developed by one
of the authors, is provably optimal but uses the non-computable notion of
Kolmogorov complexity. We propose precise notions of similarity metric, normal
compressor, and show that the NCD based on a normal compressor is a similarity
metric that approximates universality. To extract a hierarchy of clusters from
the distance matrix, we determine a dendrogram (binary tree) by a new quartet
method and a fast heuristic to implement it. The method is implemented and
available as public software, and is robust under choice of different
compressors. To substantiate our claims of universality and robustness, we
report evidence of successful application in areas as diverse as genomics,
virology, languages, literature, music, handwritten digits, astronomy, and
combinations of objects from completely different domains, using statistical,
dictionary, and block sorting compressors. In genomics we presented new
evidence for major questions in Mammalian evolution, based on
whole-mitochondrial genomic analysis: the Eutherian orders and the Marsupionta
hypothesis against the Theria hypothesis.
Comment: LaTeX, 27 pages, 20 figures
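The NCD described above can be computed directly from any real-world compressor, using the lengths of the singly and pairwise-concatenated compressed files. A minimal sketch using Python's zlib (the compressor choice is illustrative; the paper also reports results with dictionary and block-sorting compressors):

```python
import zlib

def C(data: bytes) -> int:
    """Length of data under a fixed real-world compressor (zlib, max level)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
       NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    Values near 0 indicate similar objects; values near 1, dissimilar ones."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

The resulting pairwise distance matrix is then fed to a hierarchical clustering step (the paper's quartet-tree heuristic) to extract a dendrogram.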
A geometric framework for modelling similarity search
The aim of this paper is to propose a geometric framework for modelling
similarity search in large and multidimensional data spaces of general nature,
which seems to be flexible enough to address such issues as analysis of
complexity, indexability, and the `curse of dimensionality.' Such a framework
is provided by the concept of the so-called similarity workload, which is a
probability metric space (query domain) with a distinguished finite
subspace (dataset), together with an assembly of concepts, techniques, and
results from metric geometry. They include such notions as metric transform,
ε-entropy, and the phenomenon of concentration of measure on
high-dimensional structures. In particular, we discuss the relevance of the
latter to understanding the curse of dimensionality. As some of those concepts
and techniques are being currently reinvented by the database community, it
seems desirable to try and bridge the gap between database research and the
relevant work already done in geometry and analysis.
Comment: 11 pages, LaTeX 2.
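The concentration-of-measure phenomenon invoked above can be observed numerically: pairwise distances between random points cluster ever more tightly around their mean as dimension grows, which is one face of the curse of dimensionality for similarity search. A minimal sketch with uniform points in the unit cube (parameters are illustrative):

```python
import math
import random

def relative_spread(dim, n_points=100, seed=0):
    """Coefficient of variation (std/mean) of pairwise Euclidean distances
    among n_points uniform random points in [0,1]^dim.
    This ratio shrinks as dim grows: distances concentrate."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(pts[i], pts[j])
             for i in range(n_points) for j in range(i + 1, n_points)]
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    return math.sqrt(var) / mean
```

When nearly all query-to-point distances are indistinguishable, index structures built on distance comparisons lose their pruning power, which is the indexability concern the framework is meant to analyze.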