Extension of information geometry for modelling non-statistical systems
In this dissertation, an abstract formalism extending information geometry is
introduced. This framework encompasses a broad range of modelling problems,
including possible applications in machine learning and in the information
theoretical foundations of quantum theory. Its purely geometrical foundations
make no use of probability theory, and very few assumptions are made about the
data or the models. Starting only from a divergence function, a Riemannian
geometrical structure consisting of a metric tensor and an affine connection is
constructed and its properties are investigated. The relation to information
geometry, and in particular to the geometry of exponential families of
probability distributions, is also elucidated. It turns out that this
geometrical
framework offers a straightforward way to determine whether or not a
parametrised family of distributions can be written in exponential form. Apart
from the main theoretical chapter, the dissertation also contains a chapter of
examples illustrating the application of the formalism and its geometric
properties, a brief introduction to differential geometry and a historical
overview of the development of information geometry.
Comment: PhD thesis, University of Antwerp, Advisors: Prof. dr. Jan Naudts and
Prof. dr. Jacques Tempere, December 2014, 108 pages
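The construction named in the abstract is standard in information geometry; as an illustration (in generic notation due to Eguchi, not necessarily the thesis's own), a divergence D between two parameter points induces a metric tensor and an affine connection through its derivatives at coinciding arguments:

```latex
% Metric and connection induced by a divergence D(\theta \,\|\, \theta')
% (Eguchi's standard construction; notation is generic, not the thesis's own).
g_{ij}(\theta) = -\left.\frac{\partial^{2}}
    {\partial\theta^{i}\,\partial\theta'^{j}}\,
    D(\theta \,\|\, \theta')\right|_{\theta'=\theta},
\qquad
\Gamma_{ij,k}(\theta) = -\left.\frac{\partial^{3}}
    {\partial\theta^{i}\,\partial\theta^{j}\,\partial\theta'^{k}}\,
    D(\theta \,\|\, \theta')\right|_{\theta'=\theta}.
```

For the Kullback-Leibler divergence on an exponential family, this construction recovers the usual Fisher information metric, which is the link to classical information geometry mentioned above.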
The informational mind and the information integration theory of consciousness
According to Aleksander and Morton’s informational mind hypothesis, conscious minds are state structures that are created through iconic learning. Distributed representations of colors, edges, objects, etc. are linked with proprioceptive and motor information to generate the awareness of an out-there world. The uniqueness and indivisibility of these iconically learnt states reflect the uniqueness and indivisibility of the world. This article summarizes the key claims of the informational mind hypothesis and considers them in relation to Tononi’s information integration theory of consciousness. Some suggestions are made about how the informational mind hypothesis could be experimentally tested, and its significance for work on machine consciousness is considered.
From AI with Love: Reading Big Data Poetry through Gilbert Simondon’s Theory of Transduction
Computation initiated a far-reaching re-imagination of language, not just as an information tool, but as a social, bio-physical activity in general. Modern lexicology provides an important overview of the ongoing development of textual documentation and its applications in relation to language and linguistics. At the same time, the evolution of lexical tools from the first dictionaries and graphs to algorithmically generated scatter plots of live online interaction patterns has been surprisingly swift. Modern communication and information studies from Norbert Wiener to the present day support direct parallels between coding and linguistic systems. However, most theories of computation as a model of language use remain highly indefinite and open-ended, at times purposefully ambiguous. Comparing the use of computation and semantic technologies ranging from Christopher Strachey’s early love letter templates to David Jhave Johnson’s more recent experiments with artificial neural networks (ANNs), this paper proposes the philosopher Gilbert Simondon’s theory of transduction and metastable systems as a suitable framework for understanding various ontological and epistemological ramifications in our increasingly complex and intimate interactions with machine learning. Such developments are especially clear, I argue, in the poetic reimagining of language as a space of cybernetic hybridity.
Keywords: Artificial Intelligence, Poetics, Neural Networks, Information Studies, Computation, Linguistics, Transduction, Philosophy
Item response theory in AI: Analysing machine learning classifiers at the instance level
AI systems are usually evaluated on a range of problem instances and compared to other AI systems that use different strategies. These instances are rarely independent. Machine learning, and supervised learning in particular, is a very good example of this. Given a machine learning model, its behaviour for a single instance cannot be understood in isolation but rather in relation to the rest of the data distribution or dataset. In a dual way, the results of one machine learning model for an instance can be analysed in comparison to other models. While this analysis is relative to a population or distribution of models, it can give much more insight than an isolated analysis. Item response theory (IRT) combines this duality between items and respondents to extract latent variables of the items (such as discrimination or difficulty) and the respondents (such as ability). IRT can be adapted to the analysis of machine learning experiments (and by extension to any other artificial intelligence experiments). In this paper, we see that IRT suits classification tasks perfectly, where instances correspond to items and classifiers correspond to respondents.
We perform a series of experiments with a range of datasets and classification methods to fully understand what the IRT parameters such as discrimination, difficulty and guessing mean for classification instances (and their relation to instance hardness measures) and how the estimated classifier ability can be used to compare classifier performance in a different way through classifier characteristic curves.

This work has been partially supported by the EU (FEDER) and the Ministerio de Economia y Competitividad (MINECO) in Spain grant TIN2015-69175-C4-1-R, the Air Force Office of Scientific Research under award number FA9550-17-1-0287, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences Technologies ERA-Net (CHIST-ERA) and funded by Ministerio de Economia y Competitividad (MINECO) in Spain (PCIN-2013-037), and by Generalitat Valenciana PROMETEOII/2015/013. Fernando Martinez-Plumed was also supported by INCIBE (INCIBEI-2015-27345) "Ayudas para la excelencia de los equipos de investigacion avanzada en ciberseguridad", the European Commission (Joint Research Centre) HUMAINT project (Expert Contract CT-EX2018D335821-101), and Universitat Politecnica de Valencia (PAID-06-18 Ref. SP20180210). Ricardo Prudencio was financially supported by CNPq (Brazilian Agency). Jose Hernandez-Orallo was supported by a Salvador de Madariaga grant (PRX17/00467) from the Spanish MECD for a research stay at the Leverhulme Centre for the Future of Intelligence (CFI), Cambridge, a BEST grant (BEST/2017/045) from the Valencia GVA for another research stay also at the CFI, and an FLI grant RFP2.

Martínez-Plumed, F.; Prudencio, R.; Martínez-Usó, A.; Hernández-Orallo, J. (2019). Item response theory in AI: Analysing machine learning classifiers at the instance level. Artificial Intelligence. 271:18-42. https://doi.org/10.1016/j.artint.2018.09.004
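The items-respondents correspondence described above can be sketched concretely. The following is a minimal, hypothetical illustration, assuming a two-parameter logistic (2PL) model fit by plain gradient ascent on a binary correctness matrix; the paper's actual estimation procedure and parameterisation may differ:

```python
import numpy as np

def fit_2pl(R, n_iter=2000, lr=0.05):
    """Fit a two-parameter logistic (2PL) IRT model by gradient ascent.

    R: binary response matrix (n_classifiers x n_instances),
       R[j, i] = 1 if classifier j labels instance i correctly.
    Returns (theta, a, b): classifier abilities, instance
    discriminations, and instance difficulties.
    """
    n_resp, n_items = R.shape
    rng = np.random.default_rng(0)
    theta = rng.normal(0, 0.1, n_resp)    # respondent (classifier) abilities
    a = np.ones(n_items)                  # item (instance) discriminations
    b = rng.normal(0, 0.1, n_items)       # item (instance) difficulties
    for _ in range(n_iter):
        z = a * (theta[:, None] - b)      # logits of P(correct)
        p = 1.0 / (1.0 + np.exp(-z))
        err = R - p                       # gradient of the Bernoulli log-likelihood
        theta += lr * (err * a).sum(axis=1) / n_items
        a += lr * (err * (theta[:, None] - b)).sum(axis=0) / n_resp
        b += lr * (-err * a).sum(axis=0) / n_resp
        theta -= theta.mean()             # fix location for identifiability
    return theta, a, b

# Toy example: 5 simulated "classifiers" of increasing skill on 20 instances.
rng = np.random.default_rng(1)
true_theta = np.linspace(-2, 2, 5)
true_b = rng.normal(0, 1, 20)
P = 1 / (1 + np.exp(-(true_theta[:, None] - true_b)))
R = (rng.random((5, 20)) < P).astype(float)
theta, a, b = fit_2pl(R)
print(np.argsort(theta))  # recovered ability ordering of the classifiers
```

Plotting P(correct) against ability for a fixed instance gives its item characteristic curve; aggregating over instances for a fixed classifier gives the classifier characteristic curves mentioned in the abstract.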
Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what is the source of their generalization ability.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results.
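The quantity these bounds characterize can be stated precisely. In standard statistical-learning notation (not taken from the article itself), the generalization error of a hypothesis h trained on a sample S of m i.i.d. draws from a distribution D is the gap between population risk and empirical risk:

```latex
% Generalization gap: population risk minus empirical risk
% (standard statistical-learning notation, not the article's own).
\mathrm{gen}(h) \;=\; L_{\mathcal D}(h) - \hat L_S(h)
\;=\; \mathbb{E}_{(x,y)\sim\mathcal D}\bigl[\ell(h(x), y)\bigr]
\;-\; \frac{1}{m}\sum_{i=1}^{m} \ell\bigl(h(x_i), y_i\bigr).
```

Classical bounds control this gap via the capacity of the hypothesis class (e.g. VC dimension or Rademacher complexity), while more recent lines of work bound it via properties of the trained network or the training algorithm.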
A kernel-based framework for learning graded relations from data
Driven by a large number of potential applications in areas like
bioinformatics, information retrieval and social network analysis, the problem
setting of inferring relations between pairs of data objects has recently been
investigated quite intensively in the machine learning community. To this end,
current approaches typically consider datasets containing crisp relations, so
that standard classification methods can be adopted. However, relations between
objects like similarities and preferences are often expressed in a graded
manner in real-world applications. A general kernel-based framework for
learning relations from data is introduced here. It extends existing approaches
because both crisp and graded relations are considered, and it unifies existing
approaches because different types of graded relations can be modeled,
including symmetric and reciprocal relations. This framework establishes
important links between recent developments in fuzzy set theory and machine
learning. Its usefulness is demonstrated through various experiments on
synthetic and real-world data.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
von Neumann-Morgenstern and Savage Theorems for Causal Decision Making
Causal thinking and decision making under uncertainty are fundamental aspects
of intelligent reasoning. Decision making under uncertainty has been well
studied when information is considered at the associative (probabilistic)
level. The classical Theorems of von Neumann-Morgenstern and Savage provide a
formal criterion for rational choice using purely associative information.
Causal inference often yields uncertainty about the exact causal structure, so
we consider what kinds of decisions are possible in those conditions. In this
work, we consider decision problems in which available actions and consequences
are causally connected. After recalling a previous causal decision making
result, which relies on a known causal model, we consider the case in which the
causal mechanism that controls some environment is unknown to a rational
decision maker. In this setting we state and prove a causal version of Savage's
Theorem, which we then use to develop a notion of causal games with its
respective causal Nash equilibrium. These results highlight the importance of
causal models in decision making and the variety of potential applications.Comment: Submitted to Journal of Causal Inferenc
Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with the Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics regarding correlation with
human judgments both at the segment- and at the system-level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTKparty. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality.
Comment: machine translation, machine translation evaluation, discourse
analysis. Computational Linguistics, 201
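The "simple linear combination" step described above is easy to make concrete. The following is a minimal, hypothetical sketch, assuming segment-level scores are interpolated and the weight is tuned to maximise Pearson correlation with human judgments on a development set; the paper's actual tuning procedure may differ:

```python
import numpy as np

def combined_metric(base_scores, discourse_scores, w):
    """Linear interpolation of an existing MT metric with a
    discourse-tree similarity score."""
    return (1 - w) * base_scores + w * discourse_scores

def tune_weight(base, disc, human, grid=np.linspace(0, 1, 101)):
    """Pick the interpolation weight maximising Pearson correlation
    with human judgments on a development set."""
    corrs = [np.corrcoef(combined_metric(base, disc, w), human)[0, 1]
             for w in grid]
    return grid[int(np.argmax(corrs))]

# Toy example with simulated, standardised segment-level scores.
rng = np.random.default_rng(0)
base = rng.normal(size=200)     # e.g. an existing metric's scores
disc = rng.normal(size=200)     # discourse-tree similarity scores
human = 0.6 * base + 0.4 * disc + 0.1 * rng.normal(size=200)
w = tune_weight(base, disc, human)
print(w)
```

Because the grid includes w = 0 (the base metric alone), the tuned combination can only match or improve the development-set correlation, mirroring the improvements reported in the abstract.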