The Topological Field Theory of Data: a program towards a novel strategy for data mining through data language
This paper aims to challenge the current thinking in IT on the 'Big Data' question, proposing - almost verbatim, with no formulas - a program aiming to construct an innovative methodology to perform data analytics in a way that returns an automaton as a recognizer of the data language: a Field Theory of Data. We suggest building, directly out of probing data space, a theoretical framework enabling us to extract the manifold hidden relations (patterns) that exist among data, as correlations depending on the semantics generated by the mining context. The program, which is grounded in recent innovative ways of integrating data into a topological setting, proposes the realization of a Topological Field Theory of Data, transferring and generalizing to the space of data notions inspired by physical (topological) field theories, and harnesses the theory of formal languages to define the potential semantics necessary to understand the emerging patterns.
Factory of realities: on the emergence of virtual spatiotemporal structures
The ubiquitous nature of modern Information Retrieval and Virtual World gives
rise to new realities. To what extent are these "realities" real? Which
"physics" should be applied to quantitatively describe them? In this essay I
dwell on a few examples. The first is Adaptive neural networks, which are not
networks and not neural, but still provide a service similar to classical ANNs
in an extended fashion. The second is the emergence of objects looking like
Einsteinian spacetime, which describe the behavior of an Internet surfer as
geodesic motion. The third is the demonstration of nonclassical and even
stronger-than-quantum probabilities in Information Retrieval, and their use.
Immense operable datasets provide new operationalistic environments, which
become, to an ever greater extent, "realities". In this essay, I consider the
overall Information Retrieval process as an objective physical process,
representing it according to the Melucci metaphor in terms of physical-like
experiments. Various semantic environments are treated as analogs of various
realities. The reader's attention is drawn to the topos approach to physical
theories, which provides a natural conceptual and technical framework to cope
with the new emerging realities. Comment: 21 p
Complexity and Heegaard genus of an infinite class of compact 3-manifolds
Using the theory of hyperbolic manifolds with totally geodesic boundary, we
provide for every integer n greater than 1 a class of such manifolds all having
Matveev complexity equal to n and Heegaard genus equal to n+1. All the elements
of this class have a single boundary component of genus n, and the number of
distinct members of the class grows at least exponentially with n. Comment: 15 pages, 7 figures
Regular Languages meet Prefix Sorting
Indexing strings via prefix (or suffix) sorting is, arguably, one of the most
successful algorithmic techniques developed in the last decades. Can indexing
be extended to languages? The main contribution of this paper is to initiate
the study of the sub-class of regular languages accepted by an automaton whose
states can be prefix-sorted. Starting from the recent notion of Wheeler graph
[Gagie et al., TCS 2017], which naturally extends the concept of prefix sorting
to labeled graphs, we investigate the properties of Wheeler languages, that is,
regular languages admitting an accepting Wheeler finite automaton.
Interestingly, we characterize this family as the natural extension of regular
languages endowed with the co-lexicographic ordering: when sorted, the strings
belonging to a Wheeler language are partitioned into a finite number of
co-lexicographic intervals, each formed by elements from a single Myhill-Nerode
equivalence class. Moreover: (i) We show that every Wheeler NFA (WNFA) with $n$
states admits an equivalent Wheeler DFA (WDFA) with at most $2n-1-|\Sigma|$
states that can be computed in $O(n^3)$ time. This is in sharp contrast with
general NFAs. (ii) We describe a quadratic algorithm to prefix-sort a proper
superset of the WDFAs, an $O(n\log n)$-time online algorithm to sort acyclic
WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. By
contribution (i), our algorithms can also be used to index any WNFA at the
moderate price of doubling the automaton's size. (iii) We provide a
minimization theorem that characterizes the smallest WDFA recognizing the same
language as any input WDFA. The corresponding constructive algorithm runs in
optimal linear time in the acyclic case, and in $O(n\log n)$ time in the
general case. (iv) We show how to compute the smallest WDFA equivalent to any
acyclic DFA in nearly-optimal time. Comment: added minimization theorems; uploaded submitted version; New version
with new results (W-MH theorem, linear determinization), added author:
Giovanna D'Agostino
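
The co-lexicographic interval property mentioned in this abstract can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the paper's algorithm: the two-state DFA in `DELTA` (recognizing strings of the form a*b+ when state 1 is accepting) is a hypothetical toy example, the state reached by a string stands in for its equivalence class, and the helper names `colex_key`, `run`, and `colex_runs` are made up here. It sorts all readable strings up to a given length co-lexicographically (that is, by comparing them right to left) and records how the reached state changes along the sorted order. Checking a finite sample proves nothing about the full language; it only shows what "finitely many intervals" looks like.

```python
from itertools import product

def colex_key(s):
    """Co-lexicographic order compares strings right to left,
    which is the same as ordinary order on the reversed strings."""
    return s[::-1]

# Hypothetical toy DFA (an assumption for illustration): start state 0,
# missing transitions mean the string cannot be read.
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}

def run(s):
    """Return the state reached on s, or None if s cannot be read."""
    state = 0
    for ch in s:
        state = DELTA.get((state, ch))
        if state is None:
            return None
    return state

def colex_runs(max_len, alphabet="ab"):
    """Sort all readable strings of length <= max_len co-lexicographically
    and return (sorted (string, state) pairs, sequence of state runs)."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    readable = [(w, run(w)) for w in words if run(w) is not None]
    readable.sort(key=lambda pair: colex_key(pair[0]))
    runs = []
    for _, state in readable:
        if not runs or runs[-1] != state:
            runs.append(state)  # a new maximal run (interval) starts here
    return readable, runs

if __name__ == "__main__":
    ordered, runs = colex_runs(3)
    print([w for w, _ in ordered])
    print(runs)
```

For this toy automaton every string ending in `a` (or the empty string) sorts before every string ending in `b`, so the sorted sample splits into one run per state. An automaton whose states cannot be ordered this way would instead show runs whose number keeps growing with `max_len`.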