
    The Topological Field Theory of Data: a program towards a novel strategy for data mining through data language

    This paper challenges current thinking in IT on the 'Big Data' question by proposing, almost verbatim and with no formulas, a program for constructing an innovative data-analytics methodology that returns an automaton as a recognizer of the data language: a Field Theory of Data. We propose to build, directly from probing data space, a theoretical framework for extracting the manifold hidden relations (patterns) that exist among data, as correlations depending on the semantics generated by the mining context. The program, which is grounded in recent innovative ways of integrating data into a topological setting, proposes the realization of a Topological Field Theory of Data. It transfers and generalizes to the space of data notions inspired by physical (topological) field theories, and it harnesses the theory of formal languages to define the potential semantics necessary to understand the emerging patterns.

    Factory of realities: on the emergence of virtual spatiotemporal structures

    The ubiquitous nature of modern Information Retrieval and Virtual World gives rise to new realities. To what extent are these "realities" real? Which "physics" should be applied to describe them quantitatively? In this essay I dwell on a few examples. The first is adaptive neural networks, which are neither networks nor neural, but still provide services similar to those of classical ANNs, in an extended fashion. The second is the emergence of objects resembling Einsteinian spacetime, which describe the behavior of an Internet surfer as geodesic motion. The third is the demonstration of nonclassical, and even stronger-than-quantum, probabilities in Information Retrieval, and their use. Immense operable datasets provide new operationalistic environments, which become "realities" to an ever greater extent. In this essay, I consider the overall Information Retrieval process as an objective physical process, representing it, following the Melucci metaphor, in terms of physical-like experiments. Various semantic environments are treated as analogs of various realities. The reader's attention is drawn to the topos approach to physical theories, which provides a natural conceptual and technical framework for coping with the new emerging realities. Comment: 21 p

    Complexity and Heegaard genus of an infinite class of compact 3-manifolds

    Using the theory of hyperbolic manifolds with totally geodesic boundary, we provide, for every integer n greater than 1, a class of such manifolds all having Matveev complexity equal to n and Heegaard genus equal to n+1. All the elements of this class have a single boundary component of genus n, and the number of distinct members of the class grows at least exponentially with n. Comment: 15 pages, 7 figure

    Regular Languages meet Prefix Sorting

    Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in the last decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017], which naturally extends the concept of prefix sorting to labeled graphs, we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. Interestingly, we characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: when sorted, the strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. Moreover: (i) We show that every Wheeler NFA (WNFA) with $n$ states admits an equivalent Wheeler DFA (WDFA) with at most $2n-1-|\Sigma|$ states that can be computed in $O(n^3)$ time. This is in sharp contrast with general NFAs. (ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, an $O(n\log n)$-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. By contribution (i), our algorithms can also be used to index any WNFA at the moderate price of doubling the automaton's size. (iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language as any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in $O(n\log n)$ time in the general case. (iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time. Comment: added minimization theorems; uploaded submitted version; New version with new results (W-MH theorem, linear determinization), added author: Giovanna D'Agostin
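
To make the two notions in the abstract above concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only sorts strings co-lexicographically (by comparing reversed strings) and checks by brute force whether a given state order satisfies the Wheeler graph axioms as commonly stated from Gagie et al. The function names `colex_sorted` and `is_wheeler_order`, and the `(source, target, label)` edge encoding, are assumptions made for this illustration.

```python
def colex_sorted(strings):
    """Sort strings co-lexicographically, i.e. by comparing them right to left."""
    return sorted(strings, key=lambda s: s[::-1])


def is_wheeler_order(order, edges):
    """Check the Wheeler axioms for a candidate total order of states.

    order: list of states, smallest first (hypothetical encoding).
    edges: iterable of (source, target, label) triples.
    Brute-force, quadratic in the number of edges; for illustration only.
    """
    edges = list(edges)
    rank = {state: i for i, state in enumerate(order)}
    has_incoming = {v for _, v, _ in edges}

    # States with no incoming edge must precede all states that have one.
    seen_incoming = False
    for state in order:
        if state in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    for u1, v1, a1 in edges:
        for u2, v2, a2 in edges:
            # Axiom (i): a smaller label forces a strictly smaller target.
            if a1 < a2 and not rank[v1] < rank[v2]:
                return False
            # Axiom (ii): same label, smaller source => target not larger.
            if a1 == a2 and rank[u1] < rank[u2] and not rank[v1] <= rank[v2]:
                return False
    return True


# A toy three-state automaton (hypothetical example data).
toy_edges = [(0, 1, "a"), (0, 2, "b"), (1, 2, "b")]
print(colex_sorted(["ab", "ba", "a", "b"]))
print(is_wheeler_order([0, 1, 2], toy_edges))
print(is_wheeler_order([0, 2, 1], toy_edges))
```

In the toy automaton, the order 0, 1, 2 passes both axioms, while swapping states 1 and 2 violates axiom (i), because the target of the `a`-labeled edge would then follow the target of a `b`-labeled edge.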