523 research outputs found

    Ranking relations using analogies in biological and information networks

    Get PDF
    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S={A(1):B(1),A(2):B(2),…,A(N):B(N)}\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B ^{(N)}\}, measures how well other pairs A:B fit in with the set S\mathbf{S}. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S\mathbf{S}? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS321 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Classifying and completing word analogies by machine learning

    Get PDF
    Analogical proportions are statements of the form ‘a is to b as c is to d’, formally denoted a:b::c:d. They are the basis of analogical reasoning which is often considered as an essential ingredient of human intelligence. For this reason, recognizing analogies in natural language has long been a research focus within the Natural Language Processing (NLP) community. With the emergence of word embedding models, a lot of progress has been made in NLP, essentially assuming that a word analogy like man:king::woman:queen is an instance of a parallelogram within the underlying vector space. In this paper, we depart from this assumption to adopt a machine learning approach, i.e., learning a substitute of the parallelogram model. To achieve our goal, we first review the formal modeling of analogical proportions, highlighting the properties which are useful from a machine learning perspective. For instance, the postulates supposed to govern such proportions entail that when a:b::c:d holds, then seven permutations of a,b,c,d still constitute valid analogies. From a machine learning perspective, this provides guidelines to build training sets of positive and negative examples. Taking into account these properties for augmenting the set of positive and negative examples, we first implement word analogy classifiers using various machine learning techniques, then we approximate by regression an analogy completion function, i.e., a way to compute the missing word when we have the three other ones. Using a GloVe embedding, classifiers show very high accuracy when recognizing analogies, improving state of the art on word analogy classification. Also, the regression processes usually lead to much more successful analogy completion than the ones derived from the parallelogram assumption. © 202

    RNNs Implicitly Implement Tensor Product Representations

    Full text link
    Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies). Such regularities motivate our hypothesis that RNNs that show such regularities implicitly compile symbolic structures into tensor product representations (TPRs; Smolensky, 1990), which additively combine tensor products of vectors representing roles (e.g., sequence positions) and vectors representing fillers (e.g., particular words). To test this hypothesis, we introduce Tensor Product Decomposition Networks (TPDNs), which use TPRs to approximate existing vector representations. We demonstrate using synthetic data that TPDNs can successfully approximate linear and tree-based RNN autoencoder representations, suggesting that these representations exhibit interpretable compositional structure; we explore the settings that lead RNNs to induce such structure-sensitive representations. By contrast, further TPDN experiments show that the representations of four models trained to encode naturally-occurring sentences can be largely approximated with a bag of words, with only marginal improvements from more sophisticated structures. We conclude that TPDNs provide a powerful method for interpreting vector representations, and that standard RNNs can induce compositional sequence representations that are remarkably well approximated by TPRs; at the same time, existing training tasks for sentence representation learning may not be sufficient for inducing robust structural representations.Comment: Accepted to ICLR 201

    Contributions to the use of analogical proportions for machine learning: theoretical properties and application to recommendation

    Get PDF
    Le raisonnement par analogie est reconnu comme une des principales caractéristiques de l'intelligence humaine. En tant que tel, il a pendant longtemps été étudié par les philosophes et les psychologues, mais de récents travaux s'intéressent aussi à sa modélisation d'un point de vue formel à l'aide de proportions analogiques, permettant l'implémentation de programmes informatiques. Nous nous intéressons ici à l'utilisation des proportions analogiques à des fins prédictives, dans un contexte d'apprentissage artificiel. Dans de récents travaux, les classifieurs analogiques ont montré qu'ils sont capables d'obtenir d'excellentes performances sur certains problèmes artificiels, là où d'autres techniques traditionnelles d'apprentissage se montrent beaucoup moins efficaces. Partant de cette observation empirique, cette thèse s'intéresse à deux axes principaux de recherche. Le premier sera de confronter le raisonnement par proportion analogique à des applications pratiques, afin d'étudier la viabilité de l'approche analogique sur des problèmes concrets. Le second axe de recherche sera d'étudier les classifieurs analogiques d'un point de vue théorique, car jusqu'à présent ceux-ci n'étaient connus que grâce à leurs définitions algorithmiques. Les propriétés théoriques qui découleront nous permettront de comprendre plus précisément leurs forces, ainsi que leurs faiblesses. Comme domaine d'application, nous avons choisi celui des systèmes de recommandation. On reproche souvent à ces derniers de manquer de nouveauté ou de surprise dans les recommandations qui sont adressées aux utilisateurs. Le raisonnement par analogie, capable de mettre en relation des objets en apparence différents, nous est apparu comme un outil potentiel pour répondre à ce problème. Nos expériences montreront que les systèmes analogiques ont tendance à produire des recommandations d'une qualité comparable à celle des méthodes existantes, mais que leur complexité algorithmique cubique les pénalise trop fortement pour prétendre à des applications pratiques où le temps de calcul est une des contraintes principales. Du côté théorique, une contribution majeure de cette thèse est de proposer une définition fonctionnelle des classifieurs analogiques, qui a la particularité d'unifier les approches préexistantes. Cette définition fonctionnelle nous permettra de clairement identifier les liens sous-jacents entre l'approche analogique et l'approche par k plus-proches-voisins, tant au plan algorithmique de haut niveau qu'au plan des propriétés théoriques (taux d'erreur notamment). De plus, nous avons pu identifier un critère qui rend l'application de notre principe d'inférence analogique parfaitement certaine (c'est-à-dire sans erreur), exhibant ainsi les propriétés linéaires du raisonnement par analogie.Analogical reasoning is recognized as a core component of human intelligence. It has been extensively studied from philosophical and psychological viewpoints, but recent works also address the modeling of analogical reasoning for computational purposes, particularly focused on analogical proportions. We are interested here in the use of analogical proportions for making predictions, in a machine learning context. In recent works, analogy-based classifiers have achieved noteworthy performances, in particular by performing well on some artificial problems where other traditional methods tend to fail. Starting from this empirical observation, the goal of this thesis is twofold. The first topic of research is to assess the relevance of analogical learners on real-world, practical application problems. The second topic is to exhibit meaningful theoretical properties of analogical classifiers, which were yet only empirically studied. The field of application that was chosen for assessing the suitability of analogical classifiers in real-world setting is the topic of recommender systems. A common reproach addressed towards recommender systems is that they often lack of novelty and diversity in their recommendations. As a way of establishing links between seemingly unrelated objects, analogy was thought as a way to overcome this issue. Experiments here show that while offering sometimes similar accuracy performances to those of basic classical approaches, analogical classifiers still suffer from their algorithmic complexity. On the theoretical side, a key contribution of this thesis is to provide a functional definition of analogical classifiers, that unifies the various pre-existing approaches. So far, only algorithmic definitions were known, making it difficult to lead a thorough theoretical study. From this functional definition, we clearly identified the links between our approach and that of the nearest neighbors classifiers, in terms of process and in terms of accuracy. We were also able to identify a criterion that ensures a safe application of our analogical inference principle, which allows us to characterize analogical reasoning as some sort of linear process

    Context-Based classification of objects in topographic data

    Get PDF
    Large-scale topographic databases model real world features as vector data objects. These can be point, line or area features. Each of these map objects is assigned to a descriptive class; for example, an area feature might be classed as a building, a garden or a road. Topographic data is subject to continual updates from cartographic surveys and ongoing quality improvement. One of the most important aspects of this is assignment and verification of class descriptions to each area feature. These attributes can be added manually, but, due to the vast volume of data involved, automated techniques are desirable to classify these polygons. Analogy is a key thought process that underpins learning and has been the subject of much research in the field of artificial intelligence (AI). An analogy identifies structural similarity between a well-known source domain and a less familiar target domain. In many cases, information present in the source can then be mapped to the target, yielding a better understanding of the latter. The solution of geometric analogy problems has been a fruitful area of AI research. We observe that there is a correlation between objects in geometric analogy problem domains and map features in topographic data. We describe two topographic area feature classification tools that use descriptions of neighbouring features to identify analogies between polygons: content vector matching (CVM) and context structure matching (CSM). CVM and CSM classify an area feature by matching its neighbourhood context against those of analogous polygons whose class is known. Both classifiers were implemented and then tested on high quality topographic polygon data supplied by Ordnance Survey (Great Britain). Area features were found to exhibit a high degree of variation in their neighbourhoods. CVM correctly classified 85.38% of the 79.03% of features it attempted to classify. The accuracy for CSM was 85.96% of the 62.96% of features it tried to identify. Thus, CVM can classify 25.53% more features than CSM, but is slightly less accurate. Both techniques excelled at identifying the feature classes that predominate in suburban data. Our structure-based classification approach may also benefit other types of spatial data, such as topographic line data, small-scale topographic data, raster data, architectural plans and circuit diagrams

    Constructivism, epistemology and information processing

    Get PDF
    The author analyzes the main models of artificial intelligence which deal with the transition from one stage to another, a central problem in development. He describes the contributions of rule-based systems and connectionist systems to an explanation of this transition. He considers that Artificial Intelligence models, in spite of their limitations, establish fruitful points of contact with the constructivist position.El autor analiza los principales modelos de inteligencia artificial que dan cuenta del paso de la transición de un estudio a otro, problema central del desarrollo. Describe y señala las aportaciones de los sistemas basados en reglas así como de los sistemas conexionistas para explicar dicha transición. Considera que los modelos de inteligencia artificial, a pesar de sus limitaciones, permiten establecer puntos de contacto muy fructiferos con la posición constructivista

    System alignment supports cross-domain learning and zero-shot generalisation

    Get PDF
    Recent findings suggest conceptual relationships hold across modalities. For instance, if two concepts occur in similar linguistic contexts, they also likely occur in similar visual contexts. These similarity structures may provide a valuable signal for alignment when learning to map between domains, such as when learning the names of objects. To assess this possibility, we conducted a paired-associate learning experiment in which participants mapped objects that varied on two visual features to locations that varied along two spatial dimensions. We manipulated whether the featural and spatial systems were aligned or misaligned. Although system alignment was not required to complete this supervised learning task, we found that participants learned more efficiently when systems aligned and that aligned systems facilitated zero-shot generalisation. We fit a variety of models to individuals' responses and found that models which included an offline unsupervised alignment mechanism best accounted for human performance. Our results provide empirical evidence that people align entire representation systems to accelerate learning, even when learning seemingly arbitrary associations between two domains

    SeGMA: Semi-Supervised Gaussian Mixture Auto-Encoder

    Full text link
    We propose a semi-supervised generative model, SeGMA, which learns a joint probability distribution of data and their classes and which is implemented in a typical Wasserstein auto-encoder framework. We choose a mixture of Gaussians as a target distribution in latent space, which provides a natural splitting of data into clusters. To connect Gaussian components with correct classes, we use a small amount of labeled data and a Gaussian classifier induced by the target distribution. SeGMA is optimized efficiently due to the use of Cramer-Wold distance as a maximum mean discrepancy penalty, which yields a closed-form expression for a mixture of spherical Gaussian components and thus obviates the need of sampling. While SeGMA preserves all properties of its semi-supervised predecessors and achieves at least as good generative performance on standard benchmark data sets, it presents additional features: (a) interpolation between any pair of points in the latent space produces realistically-looking samples; (b) combining the interpolation property with disentangled class and style variables, SeGMA is able to perform a continuous style transfer from one class to another; (c) it is possible to change the intensity of class characteristics in a data point by moving the latent representation of the data point away from specific Gaussian components
    • …
    corecore