
    Extension of information geometry for modelling non-statistical systems

    In this dissertation, an abstract formalism extending information geometry is introduced. This framework encompasses a broad range of modelling problems, including possible applications in machine learning and in the information-theoretical foundations of quantum theory. Its purely geometrical foundations make no use of probability theory, and very few assumptions are made about the data or the models. Starting only from a divergence function, a Riemannian geometrical structure consisting of a metric tensor and an affine connection is constructed and its properties are investigated. The relation to information geometry, and in particular to the geometry of exponential families of probability distributions, is also elucidated. It turns out that this geometrical framework offers a straightforward way to determine whether or not a parametrised family of distributions can be written in exponential form. Apart from the main theoretical chapter, the dissertation also contains a chapter of examples illustrating the application of the formalism and its geometric properties, a brief introduction to differential geometry, and a historical overview of the development of information geometry. Comment: PhD thesis, University of Antwerp, Advisors: Prof. dr. Jan Naudts and Prof. dr. Jacques Tempere, December 2014, 108 pages
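
    The abstract does not spell out the construction, but the classical divergence-induced geometry (due to Eguchi) that this formalism generalizes is obtained by differentiating the divergence at the diagonal; in one common notation, with $\partial_i = \partial/\partial\theta^i$ acting on the first argument and $\partial'_j = \partial/\partial\theta'^j$ on the second:

        g_{ij}(\theta) = -\left.\partial_i \partial'_j \, D(\theta \,\|\, \theta')\right|_{\theta'=\theta},
        \qquad
        \Gamma_{ij,k}(\theta) = -\left.\partial_i \partial_j \partial'_k \, D(\theta \,\|\, \theta')\right|_{\theta'=\theta}

    Taking D to be the Kullback-Leibler divergence of an exponential family recovers the Fisher information metric and the exponential connection, the special case to which the dissertation relates its framework.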

    The informational mind and the information integration theory of consciousness

    According to Aleksander and Morton’s informational mind hypothesis, conscious minds are state structures that are created through iconic learning. Distributed representations of colors, edges, objects, etc. are linked with proprioceptive and motor information to generate the awareness of an out-there world. The uniqueness and indivisibility of these iconically learnt states reflect the uniqueness and indivisibility of the world. This article summarizes the key claims of the informational mind hypothesis and considers them in relation to Tononi’s information integration theory of consciousness. Some suggestions are made about how the informational mind hypothesis could be experimentally tested, and its significance for work on machine consciousness is considered.
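
    Tononi's theory quantifies consciousness as the irreducibility of a system to independent parts. As a loose illustration only (not Tononi's actual Phi, which is defined via perturbations and effective information), one can measure how much statistical integration survives the best bipartition of a small binary system; the code below is a toy sketch under that simplification:

        import itertools
        import numpy as np

        def mutual_info(joint, part_a):
            # I(A; B) for a joint distribution over n binary units, given as
            # an array of shape (2,) * n; part_a lists the axes of subsystem A.
            part_b = tuple(i for i in range(joint.ndim) if i not in part_a)
            p_a = joint.sum(axis=part_b)
            p_b = joint.sum(axis=tuple(part_a))
            mi = 0.0
            for idx in np.ndindex(*joint.shape):
                p = joint[idx]
                if p > 0:
                    pa = p_a[tuple(idx[i] for i in part_a)]
                    pb = p_b[tuple(idx[i] for i in part_b)]
                    mi += p * np.log2(p / (pa * pb))
            return mi

        def toy_integration(joint):
            # Minimum mutual information over all bipartitions: zero exactly
            # when some cut splits the system into independent halves.
            n = joint.ndim
            cuts = (c for k in range(1, n // 2 + 1)
                    for c in itertools.combinations(range(n), k))
            return min(mutual_info(joint, c) for c in cuts)

        correlated = np.array([[0.5, 0.0], [0.0, 0.5]])  # two units, always equal
        print(toy_integration(correlated))               # 1.0 bit: no cut separates them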

    From AI with Love: Reading Big Data Poetry through Gilbert Simondon’s Theory of Transduction

    Computation initiated a far-reaching re-imagination of language, not just as an information tool, but as a social, bio-physical activity in general. Modern lexicology provides an important overview of the ongoing development of textual documentation and its applications in relation to language and linguistics. At the same time, the evolution of lexical tools from the first dictionaries and graphs to algorithmically generated scatter plots of live online interaction patterns has been surprisingly swift. Modern communication and information studies from Norbert Wiener to the present day support direct parallels between coding and linguistic systems. However, most theories of computation as a model of language use remain highly indefinite and open-ended, at times purposefully ambiguous. Comparing the use of computation and semantic technologies ranging from Christopher Strachey’s early love letter templates to David Jhave Johnson’s more recent experiments with artificial neural networks (ANNs), this paper proposes the philosopher Gilbert Simondon’s theory of transduction and metastable systems as a suitable framework for understanding various ontological and epistemological ramifications in our increasingly complex and intimate interactions with machine learning. Such developments are especially clear, I argue, in the poetic reimagining of language as a space of cybernetic hybridity. Keywords: Artificial Intelligence, Poetics, Neural Networks, Information Studies, Computation, Linguistics, Transduction, Philosophy
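
    Strachey's love letter program (Manchester, 1952), mentioned above, was a purely combinatorial template system: a fixed salutation and sign-off around sentences assembled from word lists. A minimal Python sketch in that spirit follows; the word lists and templates here are illustrative stand-ins, not Strachey's originals:

        import random

        ADJECTIVES = ["beloved", "darling", "dear", "loving", "sweet"]
        NOUNS = ["affection", "desire", "fancy", "heart", "longing"]
        ADVERBS = ["anxiously", "fondly", "tenderly", "wistfully"]
        VERBS = ["adores", "cherishes", "longs for", "treasures"]

        def sentence():
            # Alternate at random between a long and a short template,
            # as Strachey's generator did.
            if random.random() < 0.5:
                return (f"My {random.choice(ADJECTIVES)} {random.choice(NOUNS)} "
                        f"{random.choice(ADVERBS)} {random.choice(VERBS)} "
                        f"your {random.choice(ADJECTIVES)} {random.choice(NOUNS)}.")
            return f"You are my {random.choice(ADJECTIVES)} {random.choice(NOUNS)}."

        def love_letter():
            body = " ".join(sentence() for _ in range(5))
            # Strachey's letters were signed M.U.C., for Manchester
            # University Computer.
            return f"Darling Sweetheart,\n\n{body}\n\nYours {random.choice(ADVERBS)},\nM.U.C."

        print(love_letter())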

    Item response theory in AI: Analysing machine learning classifiers at the instance level

    [EN] AI systems are usually evaluated on a range of problem instances and compared to other AI systems that use different strategies. These instances are rarely independent. Machine learning, and supervised learning in particular, is a very good example of this. Given a machine learning model, its behaviour for a single instance cannot be understood in isolation but rather in relation to the rest of the data distribution or dataset. In a dual way, the results of one machine learning model for an instance can be analysed in comparison to other models. While this analysis is relative to a population or distribution of models, it can give much more insight than an isolated analysis. Item response theory (IRT) combines this duality between items and respondents to extract latent variables of the items (such as discrimination or difficulty) and the respondents (such as ability). IRT can be adapted to the analysis of machine learning experiments (and, by extension, to any other artificial intelligence experiments). In this paper, we see that IRT suits classification tasks perfectly, where instances correspond to items and classifiers correspond to respondents. We perform a series of experiments with a range of datasets and classification methods to fully understand what IRT parameters such as discrimination, difficulty and guessing mean for classification instances (and their relation to instance hardness measures), and how the estimated classifier ability can be used to compare classifier performance in a different way through classifier characteristic curves.

    This work has been partially supported by the EU (FEDER) and the Ministerio de Economia y Competitividad (MINECO) in Spain grant TIN2015-69175-C4-1-R, the Air Force Office of Scientific Research under award number FA9550-17-1-0287, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences Technologies ERA-Net (CHIST-ERA) and funded by MINECO in Spain (PCIN-2013-037) and by Generalitat Valenciana PROMETEOII/2015/013. Fernando Martinez-Plumed was also supported by INCIBE (INCIBEI-2015-27345) "Ayudas para la excelencia de los equipos de investigacion avanzada en ciberseguridad", the European Commission (Joint Research Centre) HUMAINT project (Expert Contract CT-EX2018D335821-101), and Universitat Politecnica de Valencia (PAID-06-18 Ref. SP20180210). Ricardo Prudencio was financially supported by CNPq (Brazilian Agency). Jose Hernandez-Orallo was supported by a Salvador de Madariaga grant (PRX17/00467) from the Spanish MECD for a research stay at the Leverhulme Centre for the Future of Intelligence (CFI), Cambridge, a BEST grant (BEST/2017/045) from the Valencia GVA for another research stay also at the CFI, and an FLI grant RFP2.

    Martínez-Plumed, F.; Prudencio, R.; Martínez-Usó, A.; Hernández-Orallo, J. (2019). Item response theory in AI: Analysing machine learning classifiers at the instance level. Artificial Intelligence 271:18-42. https://doi.org/10.1016/j.artint.2018.09.004
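
    The three latent item parameters named in the abstract fit the three-parameter logistic (3PL) model, in which the probability that a classifier of ability theta labels an instance correctly depends on the instance's discrimination a, difficulty b and guessing floor c. A minimal sketch, with parameter values invented for illustration:

        import numpy as np

        def p_correct(theta, a, b, c):
            # 3PL item response function: c is the guessing asymptote, b the
            # difficulty (ability at the curve's midpoint), a the discrimination
            # (steepness of the curve around b).
            return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

        # Item characteristic curve of one hypothetical instance, swept over abilities.
        abilities = np.linspace(-3.0, 3.0, 7)
        print(p_correct(abilities, a=1.5, b=0.0, c=0.25))

    Plotting p_correct against theta for a fixed instance gives its item characteristic curve; averaging the curves over a dataset's instances as a function of theta yields the classifier characteristic curves the paper uses to compare classifiers.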

    Generalization Error in Deep Learning

    Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what the source of their generalization ability is. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results.
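
    A representative example of the classical bounds such a survey covers is the VC-dimension bound: for a hypothesis class $\mathcal{H}$ of VC dimension $d$ and $m$ i.i.d. training samples, with probability at least $1-\delta$, every $h \in \mathcal{H}$ satisfies (in one standard form)

        R(h) \;\le\; \hat{R}(h) \;+\; \sqrt{\frac{d\left(\ln\frac{2m}{d} + 1\right) + \ln\frac{4}{\delta}}{m}}

    where $R$ is the true risk and $\hat{R}$ the empirical risk. The gap vanishes as $m$ grows, but for deep networks $d$ is typically enormous, which is precisely why classical bounds of this kind struggle to explain their observed generalization.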

    A kernel-based framework for learning graded relations from data

    Driven by a large number of potential applications in areas like bioinformatics, information retrieval and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations, so that standard classification methods can be adopted. However, relations between objects like similarities and preferences are often expressed in a graded manner in real-world applications. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and graded relations are considered, and it unifies existing approaches because different types of graded relations can be modeled, including symmetric and reciprocal relations. This framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated through various experiments on synthetic and real-world data. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
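
    Frameworks of this kind typically lift a base kernel k on single objects to a kernel on object pairs. As an illustrative sketch (a common construction in the pairwise-kernel literature, not necessarily the paper's exact one), here is the symmetrized pairwise kernel plugged into kernel ridge regression on graded labels in [0, 1]:

        import numpy as np

        def rbf(x, y, gamma=1.0):
            return np.exp(-gamma * np.sum((x - y) ** 2))

        def pair_kernel(p, q, k=rbf):
            # Symmetrized pairwise kernel: invariant to swapping the objects
            # within a pair, hence suited to symmetric graded relations.
            (a, b), (c, d) = p, q
            return k(a, c) * k(b, d) + k(a, d) * k(b, c)

        def fit_krr(pairs, y, lam=1e-2):
            # Kernel ridge regression on graded relation labels y in [0, 1].
            n = len(pairs)
            K = np.array([[pair_kernel(pairs[i], pairs[j]) for j in range(n)]
                          for i in range(n)])
            return np.linalg.solve(K + lam * np.eye(n), np.asarray(y, float))

        def predict(pairs, alpha, new_pair):
            return float(np.array([pair_kernel(p, new_pair) for p in pairs]) @ alpha)

    For reciprocal relations (where Q(a, b) = 1 - Q(b, a)), the analogous construction replaces the sum in pair_kernel with a difference of the two kernel products.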

    von Neumann-Morgenstern and Savage Theorems for Causal Decision Making

    Causal thinking and decision making under uncertainty are fundamental aspects of intelligent reasoning. Decision making under uncertainty has been well studied when information is considered at the associative (probabilistic) level. The classical theorems of von Neumann-Morgenstern and Savage provide a formal criterion for rational choice using purely associative information. Causal inference often yields uncertainty about the exact causal structure, so we consider what kinds of decisions are possible under those conditions. In this work, we consider decision problems in which available actions and consequences are causally connected. After recalling a previous causal decision making result, which relies on a known causal model, we consider the case in which the causal mechanism that controls some environment is unknown to a rational decision maker. In this setting we state and prove a causal version of Savage's theorem, which we then use to develop a notion of causal games with its respective causal Nash equilibrium. These results highlight the importance of causal models in decision making and the variety of potential applications. Comment: Submitted to Journal of Causal Inference
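
    The known-model baseline the abstract recalls is causal expected utility: an action a is valued through the interventional distribution it induces, EU(a) = sum over y of P(y | do(a)) * U(y), and a rational agent picks the maximizer. A toy sketch with a hypothetical two-action, two-outcome model (all numbers invented):

        # Interventional outcome distributions P(Y | do(A = a)) of a toy causal
        # model in which action A is the only cause of outcome Y.
        P_Y_do = {
            0: {"good": 0.3, "bad": 0.7},
            1: {"good": 0.6, "bad": 0.4},
        }
        U = {"good": 10.0, "bad": -5.0}  # utilities over outcomes

        def expected_utility(action):
            return sum(p * U[y] for y, p in P_Y_do[action].items())

        best = max(P_Y_do, key=expected_utility)
        print(best, expected_utility(best))  # the rational choice and its value

    The paper's contribution concerns the harder case where P(y | do(a)) itself is uncertain, for which it proves a Savage-style representation theorem.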

    Discourse Structure in Machine Translation Evaluation

    In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics in terms of correlation with human judgments, both at the segment and at the system level. This suggests that discourse information is complementary to the information used by many existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTK-party. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation's RST tree to the reference tree is positively correlated with translation quality. Comment: machine translation, machine translation evaluation, discourse analysis. Computational Linguistics, 2017
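
    All-subtree comparison of two labeled trees can be computed with a convolution tree kernel in the style of Collins and Duffy; the sketch below is a generic simplification, not the DiscoTK implementation, with nodes carrying RST-style labels such as nuclearity tags or relation names:

        from dataclasses import dataclass, field

        @dataclass
        class Node:
            label: str                      # e.g. an RST relation or nuclearity tag
            children: list = field(default_factory=list)

        def collect(t):
            # All nodes of a tree, gathered iteratively.
            out, stack = [], [t]
            while stack:
                n = stack.pop()
                out.append(n)
                stack.extend(n.children)
            return out

        def subtree_sim(a, b, lam=0.5):
            # Counts matching subtrees rooted at a and b, downweighting
            # deeper matches by lam per level (Collins-Duffy style).
            if a.label != b.label or len(a.children) != len(b.children):
                return 0.0
            if not a.children:              # two matching leaves
                return lam
            prod = lam
            for ca, cb in zip(a.children, b.children):
                prod *= 1.0 + subtree_sim(ca, cb, lam)
            return prod

        def tree_kernel(t1, t2, lam=0.5):
            # Sum the subtree similarity over all pairs of nodes.
            return sum(subtree_sim(a, b, lam)
                       for a in collect(t1) for b in collect(t2))

        ref = Node("Elaboration", [Node("Nucleus"), Node("Satellite")])
        hyp = Node("Elaboration", [Node("Nucleus"), Node("Satellite")])
        print(tree_kernel(ref, hyp))  # larger = more similar discourse structure

    Normalizing tree_kernel(t1, t2) by the geometric mean of the two self-kernels turns this into a similarity score that can be linearly combined with existing MT metric scores, which is the kind of combination the paper evaluates.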