Extended Regular Expressions: Succinctness and Decidability
Most modern implementations of regular expression engines allow the use of variables (also called back references). The resulting extended regular expressions (which, in the literature, are also called practical regular expressions, rewbr, or regex) are able to express non-regular languages.
The present paper demonstrates that extended regular expressions cannot be minimized effectively (neither with respect to length, nor number of variables), and that the tradeoff in size between extended and "classical" regular expressions is not bounded by any recursive function. In addition to this, we prove the undecidability of several decision problems (universality, equivalence, inclusion, regularity, and cofiniteness) for extended regular expressions. Furthermore, we show that all these results hold even if the extended regular expressions contain only a single variable.
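As a small illustration of how a single back reference already leaves the regular languages, the sketch below uses Python's `re` module to recognize the copy language { ww : w in {a, b}* }, a standard example of a non-regular language. The names `COPY` and `is_square` are illustrative assumptions and do not come from the paper.

```python
import re

# One capture group and one back reference: the group matches some w over
# {a, b}, and \1 requires the same string w to appear again right after it.
COPY = re.compile(r"([ab]*)\1")


def is_square(word: str) -> bool:
    """Return True if word has the form ww for some w over {a, b}."""
    return COPY.fullmatch(word) is not None
```

For instance, `is_square("abab")` holds with w = "ab", while `is_square("abba")` does not, since the two halves differ.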
Document Spanners: From Expressive Power to Decision Problems
We examine document spanners, a formal framework for information extraction that was introduced by Fagin et al. (PODS 2013). A document spanner is a function that maps an input string to a relation over spans (intervals of positions of the string). We focus on document spanners that are defined by regex formulas, which are basically regular expressions that map matched subexpressions to corresponding spans, and on core spanners, which extend the former by standard algebraic operators and string equality selection.
First, we compare the expressive power of core spanners to three models - namely, patterns, word equations, and a rich and natural subclass of extended regular expressions (regular expressions with a repetition operator). These results are then used to analyze the complexity of query evaluation and various aspects of static analysis of core spanners. Finally, we examine the relative succinctness of different kinds of representations of core spanners and relate this to the simplification of core spanners that are extended with difference operators.
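The idea of a spanner as a function from a string to a relation over spans can be sketched with Python's `re` module, treating named groups as span variables. This is only an approximation under stated assumptions: `re.finditer` returns leftmost non-overlapping matches rather than every possible match considered by the formal model, spans are written here as 0-based half-open intervals, and the helper names `extract_spans` and `select_equal` are hypothetical.

```python
import re


def extract_spans(pattern: str, doc: str) -> list:
    """Map doc to a list of rows {variable: (start, end)}, one row per match."""
    rows = []
    for match in re.finditer(pattern, doc):
        rows.append({name: match.span(name) for name in match.groupdict()})
    return rows


def select_equal(rows: list, doc: str, x: str, y: str) -> list:
    """String equality selection: keep rows whose x and y spans hold equal text."""
    def text(span):
        start, end = span
        return doc[start:end]
    return [row for row in rows if text(row[x]) == text(row[y])]


rows = extract_spans(r"(?P<user>\w+)@(?P<host>\w+)", "ann@example bob@test")
# rows == [{"user": (0, 3), "host": (4, 11)}, {"user": (12, 15), "host": (16, 20)}]
```

Note that the equality selection compares the text under two spans, not the spans themselves, so two different positions in the document can satisfy it.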
New architectures for very deep learning
Artificial Neural Networks are increasingly being used in complex real-world applications because many-layered (i.e., deep) architectures can now be trained on large quantities of data. However, training even deeper, and therefore more powerful networks, has hit a barrier due to fundamental limitations in the design of existing networks. This thesis develops new architectures that, for the first time, allow very deep networks to be optimized efficiently and reliably. Specifically, it addresses two key issues that hamper credit assignment in neural networks: cross-pattern interference and vanishing gradients. Cross-pattern interference leads to oscillations of the network's weights that make training inefficient. The proposed Local Winner-Take-All networks reduce interference among computation units in the same layer through local competition. An in-depth analysis of locally competitive networks provides generalizable insights and reveals unifying properties that improve credit assignment. As network depth increases, vanishing gradients make a network's outputs increasingly insensitive to the weights close to the inputs, causing the failure of gradient-based training. To overcome this limitation, the proposed Highway networks regulate information flow across layers through additional skip connections which are modulated by learned computation units. Their beneficial properties are extended to the sequential domain with Recurrent Highway Networks that gain from increased depth and learn complex sequential transitions without requiring more parameters.
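A gated skip connection of the kind described above can be sketched in plain Python. The sketch assumes the commonly published Highway layer form y = T(x) * H(x) + (1 - T(x)) * x, with a tanh transform H and a sigmoid gate T; the parameter names (`W_h`, `b_h`, `W_t`, `b_t`) and the choice of tanh are assumptions for illustration, not details taken from this abstract.

```python
import math


def affine(W, b, x):
    """Compute W x + b for a weight matrix W (list of rows) and bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One layer: the gate t blends the transformed input h with the raw input x."""
    h = [math.tanh(z) for z in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(z) for z in affine(W_t, b_t, x)]    # learned gate T(x) in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

With a strongly negative gate bias the layer passes its input through almost unchanged, which is why such layers can be stacked deeply without the signal being lost.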
On the Semantics of Intensionality and Intensional Recursion
Intensionality is a phenomenon that occurs in logic and computation. In the
most general sense, a function is intensional if it operates at a level finer
than (extensional) equality. This is a familiar setting for computer
scientists, who often study different programs or processes that are
interchangeable, i.e. extensionally equal, even though they are not implemented
in the same way, so intensionally distinct. Concomitant with intensionality is
the phenomenon of intensional recursion, which refers to the ability of a
program to have access to its own code. In computability theory, intensional
recursion is enabled by Kleene's Second Recursion Theorem. This thesis is
concerned with the crafting of a logical toolkit through which these phenomena
can be studied. Our main contribution is a framework in which mathematical and
computational constructions can be considered either extensionally, i.e. as
abstract values, or intensionally, i.e. as fine-grained descriptions of their
construction. Once this is achieved, it may be used to analyse intensional
recursion.
Comment: DPhil thesis, Department of Computer Science & St John's College, University of Oxford
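The notion of a program with access to its own code can be illustrated by a quine, a program whose output is exactly its own source text. The sketch below builds a two-line Python quine and checks that property by executing it; the names `QUINE`, `SOURCE`, and `run` are illustrative, and the example only demonstrates the kind of self-reference that Kleene's theorem guarantees, not the logical framework developed in the thesis.

```python
import contextlib
import io

# Template with one %r slot; filling the slot with the template's own repr
# yields a program that reconstructs and prints its own source.
QUINE = 's = %r\nprint(s %% s)'
SOURCE = QUINE % QUINE


def run(source: str) -> str:
    """Execute source in a fresh namespace and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

Running `run(SOURCE)` returns `SOURCE` followed by the newline that `print` appends.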
ON THE FOUNDATIONS OF COMPUTABILITY THEORY
The principal motivation for this work is the observation that there are significant deficiencies in the foundations of conventional computability theory. This thesis examines the problems with conventional computability theory, including its failure to address discrepancies between theory and practice in computer science, semantic confusion in terminology, and limitations in the scope of conventional computing models. In light of these difficulties, fundamental notions are re-examined and revised definitions of key concepts such as "computer," "computable," and "computing power" are provided. A detailed analysis is conducted to determine desirable semantics and scope of applicability of foundational notions. The credibility of the revised definitions is ascertained by demonstrating their ability to address identified problems with conventional definitions. Their practical utility is established through application to examples. Other related issues, including hidden complexity in computations, subtleties related to encodings, and the cardinalities of sets involved in computing, are examined. A resource-based meta-model for characterizing computing model properties is introduced. The proposed definitions are presented as a starting point for an alternate foundation for computability theory. However, formulation of the particular concepts under discussion is not the sole purpose of the thesis. The underlying objective of this research is to open discourse on alternate foundations of computability theory and to inspire re-examination of fundamental notions.
Merging the Natural with the Artificial: The Nature of a Machine and the Collapse of Cybernetics
This thesis is concerned with the rise and fall of cybernetics, understood as an inquiry regarding the nature of a machine. The collapse of this scientific movement, usually explained by external factors such as lack of funding, will be addressed from a philosophical standpoint.
Delving deeper into the theoretical core of cybernetics, one could find that the contributions of William Ross Ashby and John von Neumann shed light onto the particular ways in which cybernetics understood the nature and behavior of a machine. Ross Ashby offered an account of the nature of a machine and then extended the scope of "the mechanical". This extension would encompass areas that will later be shown to be problematic for mechanization, such as learning and adaptation. The way in which a machine-ontology was applied would trigger effects seemingly contrary to cybernetics' own distinctive features. Von Neumann, on the other hand, tinkered with a mechanical model of the brain, realizing grave limitations that prompted him to look for an alternative for cybernetics to work on. The proposal that came out of this resulted in a serious blow against the theoretical core of cybernetics.
Why did cybernetics collapse? The contributions coming from both thinkers, in their own ways, spelled out the main tenets of the cybernetic proposal. But these very contributions led to cybernetics' own demise. The whole story can be framed under the rubric of a serious inquiry into the metaphysical underpinnings of a machine. The rise and fall of cybernetics could thus help us better understand what a machine is from a philosophical standpoint.
Although a historical component is present, my emphasis lies on a philosophical consideration of the cybernetic phenomenon. This metaphysical dissection will attempt to clarify how a machine-based ontology remained at the core of cybernetics. An emerging link will hopefully lead towards establishing a tri-partite correlation between cybernetics' own evolution, its theoretical core, and its collapse. It will hopefully show how cybernetic inquiries into the nature of a machine might have proved fatal to the very enterprise at large, due to unsolvable theoretical tensions.
Conjunctive Queries for Logic-Based Information Extraction
This thesis offers two logic-based approaches to conjunctive queries in the
context of information extraction. The first and main approach is the
introduction of conjunctive query fragments of the logics FC and FC[REG],
denoted as FC-CQ and FC[REG]-CQ respectively. FC is a first-order logic based
on word equations, where the semantics are defined by limiting the universe to
the factors of some finite input word. FC[REG] is FC extended with regular
constraints. The second approach is to consider the dynamic complexity of FC.
Comment: Based on the author's PhD thesis and contains work from two
conference publications (arXiv:2104.04758, arXiv:1909.10869) which are joint
work with Dominik D. Freydenberger
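To make the semantics described above more concrete, the following brute-force sketch evaluates a conjunction of word-equation atoms of the form x = y.z, with every variable ranging over the factors of a finite input word. It is a simplified reading under stated assumptions: constants, regular constraints, and projection are left out, the enumeration is exponential in the number of variables, and the names `factors` and `evaluate_cq` are hypothetical rather than taken from the thesis.

```python
from itertools import product


def factors(word: str) -> set:
    """All factors (contiguous substrings) of word, including the empty word."""
    n = len(word)
    return {word[i:j] for i in range(n + 1) for j in range(i, n + 1)}


def evaluate_cq(word: str, variables: tuple, atoms: list) -> list:
    """Return every assignment of factors to variables that satisfies all atoms.

    Each atom (x, y, z) stands for the word equation x = y.z (concatenation).
    """
    universe = sorted(factors(word))
    results = []
    for values in product(universe, repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(sigma[x] == sigma[y] + sigma[z] for x, y, z in atoms):
            results.append(sigma)
    return results


# Which factors x of "abab" are squares, i.e. x = y.y for some factor y?
squares = evaluate_cq("abab", ("x", "y"), [("x", "y", "y")])
```

Because the universe is restricted to factors of the input, a candidate such as y = "a" is rejected here: "aa" is not a factor of "abab", so no value of x can satisfy the atom.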
Aspects of emergent cyclicity in language and computation
This thesis has four parts, which correspond to the presentation and development of a theoretical
framework for the study of cognitive capacities qua physical phenomena, and a case study of locality conditions over natural languages.
Part I deals with computational considerations, setting the tone of the rest of the thesis, and introducing and defining critical concepts like "grammar", "automaton", and the relations between them.
Fundamental questions concerning the place of formal language theory in
linguistic inquiry, as well as the expressibility of linguistic and computational concepts in
common terms, are raised in this part.
Part II further explores the issues addressed in Part I with particular emphasis on how
grammars are implemented by means of automata, and the properties of the formal languages
that these automata generate. We will argue against the equation between effective computation
and function-based computation, and introduce examples of computable procedures which are
nevertheless impossible to capture using traditional function-based theories. The connection
with cognition will be made in the light of dynamical frustrations: the irreconcilable tension
between mutually incompatible tendencies that hold for a given dynamical system. We will
provide arguments in favour of analyzing natural language as emerging from a tension between
different systems (essentially, semantics and morpho-phonology) which impose orthogonal
requirements over admissible outputs. The concept of level of organization or scale comes to
the foreground here; and apparent contradictions and incommensurabilities between concepts
and theories are revisited in a new light: that of dynamical nonlinear systems which are
fundamentally frustrated. We will also characterize the computational system that emerges from
such an architecture: the goal is to get a syntactic component which assigns the simplest
possible structural description to sub-strings, in terms of its computational complexity. A
system which can oscillate back and forth in the hierarchy of formal languages in assigning
structural representations to local domains will be referred to as a computationally mixed
system.
Part III is where the really fun stuff starts. Field theory is introduced, and its applicability to
neurocognitive phenomena is made explicit, with all due scale considerations. Physical and
mathematical concepts are permanently interacting as we analyze phrase structure in terms of
pseudo-fractals (in Mandelbrot's sense) and define syntax as a (possibly unary) set of
topological operations over completely Hausdorff (CH) ultrametric spaces. These operations, which make field perturbations interfere, transform that initial completely Hausdorff
ultrametric space into a metric, Hausdorff space with a weaker separation axiom. Syntax, in this
proposal, is not "generative" in any traditional sense (except the "fully explicit theory" one):
rather, it partitions (technically, "parametrizes") a topological space. Syntactic dependencies are
defined as interferences between perturbations over a field, which reduce the total entropy of
the system per cycles, at the cost of introducing further dimensions where attractors
corresponding to interpretations for a phrase marker can be found.
Part IV is a sample of what we can gain by further pursuing the physics of language approach,
both in terms of empirical adequacy and theoretical elegance, not to mention the unlimited
possibilities of interdisciplinary collaboration. In this section we set our focus on island
phenomena as defined by Ross (1967), critically revisiting the most relevant literature on this
topic, and establishing a typology of constructions that are strong islands, which cannot be
violated. These constructions are particularly interesting because they limit the phase space of
what is expressible via natural language, and thus reveal crucial aspects of its underlying
dynamics. We will argue that a dynamically frustrated system which is characterized by
displaying mixed computational dependencies can provide straightforward characterizations of
cyclicity in terms of changes in dependencies in local domains.