CLIP: concept learning from inference patterns
A new concept-learning method called CLIP (concept learning from inference patterns) is proposed that learns new concepts from inference patterns rather than from the positive/negative examples that most conventional concept-learning methods use. The learned concepts enable efficient inference at a more abstract level. We use a colored digraph to represent inference patterns. The graph representation is expressive enough and enables quantitative analysis of inference-pattern frequency. The learning process consists of two steps: (1) convert the original inference patterns to a colored digraph, and (2) extract a set of typical patterns that appear frequently in the digraph. The basic idea is that the smaller the digraph becomes, the smaller the amount of data to be handled and, accordingly, the more efficient the inference process that uses these data. The graph is reduced by replacing each frequently appearing graph pattern with a single node, and each reduced node represents a new concept. Experimentally, CLIP automatically generates multilevel representations from a given physical/single-level representation of a carry-chain circuit. These representations include abstract descriptions of the circuit, such as mathematical and logical descriptions.
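The extract-and-reduce steps described above can be sketched on a coloured digraph given as (source, colour, target) triples. This is a minimal illustration, not CLIP's actual representation: the choice of two-edge chains as the pattern shape and the `+`-joined concept label are simplifying assumptions.

```python
from collections import Counter

def find_frequent_pattern(edges):
    """Count two-edge colour chains u -c1-> v -c2-> w in a coloured
    digraph given as (source, colour, target) triples, and return the
    most frequent (c1, c2) pattern together with its frequency."""
    succ = {}
    for u, c, v in edges:
        succ.setdefault(u, []).append((c, v))
    chains = Counter()
    for u, c1, v in edges:
        for c2, _w in succ.get(v, []):
            chains[(c1, c2)] += 1
    return chains.most_common(1)[0]

def contract_pattern(edges, pattern):
    """Replace every occurrence of the chain pattern with a single
    'concept' edge, shrinking the graph (the reduction step)."""
    c1, c2 = pattern
    succ = {}
    for u, c, v in edges:
        succ.setdefault((u, c), []).append(v)
    new_edges, consumed = [], set()
    for u, c, v in edges:
        if c == c1:
            for w in succ.get((v, c2), []):
                new_edges.append((u, f"{c1}+{c2}", w))
                consumed.add((u, c1, v))
                consumed.add((v, c2, w))
    return new_edges + [e for e in edges if e not in consumed]
```

On a toy graph with two occurrences of the chain x-then-y, the four original edges contract to two concept edges, illustrating how each frequent pattern becomes a single, more abstract unit.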
M2GRL: A Multi-task Multi-view Graph Representation Learning Framework for Web-scale Recommender Systems
Combining graph representation learning with multi-view data (side
information) for recommendation is a trend in industry. Most existing methods
can be categorized as \emph{multi-view representation fusion}; they first build
one graph and then integrate multi-view data into a single compact
representation for each node in the graph. However, these methods raise
concerns in both engineering and algorithmic terms: 1) multi-view data are
abundant and informative in industry and may exceed the capacity of one single
vector, and 2) inductive bias may be introduced as multi-view data are often
from different distributions. In this paper, we use a \emph{multi-view
representation alignment} approach to address this issue. Particularly, we
propose a multi-task multi-view graph representation learning framework (M2GRL)
to learn node representations from multi-view graphs for web-scale recommender
systems. M2GRL constructs one graph for each single-view data, learns multiple
separate representations from multiple graphs, and performs alignment to model
cross-view relations. M2GRL chooses a multi-task learning paradigm to learn
intra-view representations and cross-view relations jointly. In addition, M2GRL
applies homoscedastic uncertainty to adaptively tune the loss weights of tasks
during training. We deploy M2GRL at Taobao and train it on 57 billion examples.
According to offline metrics and online A/B tests, M2GRL significantly
outperforms other state-of-the-art algorithms. Further exploration of diversity
recommendation in Taobao shows the effectiveness of utilizing the multiple
representations produced by M2GRL, which we argue is a promising direction
for industrial recommendation tasks with different focuses.
Comment: Accepted by KDD 2020 ads track as an oral paper. Code address: https://github.com/99731/M2GR
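The homoscedastic-uncertainty weighting mentioned in the abstract can be sketched in its standard form, where each task carries a learnable parameter s_i = log(sigma_i^2); this is the widely used Kendall-style formulation and may differ in detail from M2GRL's implementation.

```python
import math

def weighted_multitask_loss(task_losses, log_vars):
    """Homoscedastic-uncertainty task weighting: each task loss L_i is
    scaled by exp(-s_i) and regularised by s_i, where s_i = log(sigma_i^2)
    is a learnable per-task parameter. Tasks the model is uncertain about
    (large sigma_i) are automatically down-weighted during training."""
    assert len(task_losses) == len(log_vars)
    return sum(math.exp(-s) * loss + s
               for loss, s in zip(task_losses, log_vars))
```

With all s_i = 0 the weighting reduces to a plain sum of the task losses; as a task's s_i grows, its loss contributes less but pays a regularisation penalty, which prevents the trivial solution of ignoring every task.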
Graphs in machine learning: an introduction
Graphs are commonly used to characterise interactions between objects of
interest. Because they are based on a straightforward formalism, they are used
in many scientific fields from computer science to historical sciences. In this
paper, we give an introduction to some methods relying on graphs for learning.
This includes both unsupervised and supervised methods. Unsupervised learning
algorithms usually aim at visualising graphs in latent spaces and/or clustering
the nodes. Both focus on extracting knowledge from graph topologies. While most
existing techniques are only applicable to static graphs, where edges do not
evolve through time, recent developments have shown that they could be extended
to deal with evolving networks. In a supervised context, one generally aims at
inferring labels or numerical values attached to nodes using both the graph
and, when they are available, node characteristics. Balancing the two sources
of information can be challenging, especially as they can disagree locally or
globally. In both contexts, supervised and unsupervised, data can be
relational (augmented with one or several global graphs) as described above, or
graph valued. In this latter case, each object of interest is given as a full
graph (possibly completed by other characteristics). In this context, natural
tasks include graph clustering (as in producing clusters of graphs rather than
clusters of nodes in a single graph), graph classification, etc.

1 Real networks

One of the first practical studies on graphs can be dated back to the
original work of Moreno [51] in the 1930s. Since then, there has been a growing
interest in graph analysis associated with strong developments in the modelling
and the processing of these data. Graphs are now used in many scientific
fields. In Biology [54, 2, 7], for instance, metabolic networks can describe
pathways of biochemical reactions [41], while in social sciences networks are
used to represent relation ties between actors [66, 56, 36, 34]. Other examples
include powergrids [71] and the web [75]. Recently, networks have also been
considered in other areas such as geography [22] and history [59, 39]. In
machine learning, networks are seen as powerful tools to model problems in
order to extract information from data and for prediction purposes. This is the
object of this paper. For more complete surveys, we refer to [28, 62, 49, 45].
In this section, we introduce notations and highlight properties shared by most
real networks. In Section 2, we then consider methods aiming at extracting
information from a unique network. We will particularly focus on clustering
methods where the goal is to find clusters of vertices. Finally, in Section 3,
techniques that take a series of networks into account, where each network i
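As a toy illustration of the supervised setting described above, inferring labels attached to nodes from the graph structure alone, the following sketches a majority-vote label propagation; the update rule and iteration cap are illustrative choices, not a specific method from the survey.

```python
def propagate_labels(adj, seed_labels, iterations=10):
    """Majority-vote label propagation: unlabelled nodes adopt the most
    common label among their already-labelled neighbours, iterated so
    that the seed labels spread through the graph.

    adj: dict mapping each node to a list of its neighbours.
    seed_labels: dict of known node -> label assignments."""
    labels = dict(seed_labels)
    for _ in range(iterations):
        updates = {}
        for node, neighbours in adj.items():
            if node in labels:
                continue
            votes = [labels[n] for n in neighbours if n in labels]
            if votes:
                updates[node] = max(set(votes), key=votes.count)
        if not updates:  # converged: no node changed this round
            break
        labels.update(updates)
    return labels
```

On a small path graph with labelled endpoints, the two interior nodes each inherit the label of their nearest labelled neighbour, showing how graph topology alone can drive the inference when node characteristics are unavailable.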
Network Structures, Concurrency, and Interpretability: Lessons from the Development of an AI Enabled Graph Database System
This thesis describes the development of the SmartGraph, an AI-enabled graph database. The need for such a system has been independently recognized in the isolated fields of graph databases, graph computing, and computational-graph deep learning systems such as TensorFlow. Though prior works have investigated some relationships between these fields, we believe that the SmartGraph is the first system designed from conception to incorporate the most significant and useful characteristics of each. Examples include the ability to store graph-structured data, run analytics natively on this data, and run gradient-descent algorithms. It is the synergistic aspects of combining these fields that provide the most novel results presented in this dissertation. Key among them is how the notion of “graph querying” as used in graph databases can be used to solve a problem that has plagued deep learning systems since their inception: rather than attempting to embed graph-structured datasets into restrictive vector spaces, we instead allow the deep learning functionality of the system to natively perform graph querying in memory during optimization as a way of interpreting (and learning) the graph. This results in natural and interpretable processing of graph-structured data.
Graph computing systems have traditionally used distributed computing across multiple compute nodes (e.g. separate machines connected via Ethernet or the internet) to deal with large-scale datasets whilst working sequentially on problems over entire datasets. In this dissertation, we outline a distributed graph computing methodology that facilitates all of the above capabilities (even in an environment consisting of a single physical machine) while allowing for a workflow more typical of a graph database than a graph computing system: massive concurrent access with arbitrarily asynchronous execution of queries and analytics across the entire system. Further, we demonstrate how this methodology is key to the artificial intelligence capabilities of the system.
Machine learning algorithms for analysis of DNA data sets
Applications of machine learning algorithms to the analysis of DNA sequence data sets are of great importance. The present chapter is devoted to an experimental investigation of several machine learning algorithms applied to a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single-copy region and inverted repeat A of the chloroplast genome in Eucalyptus, collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as the measure of similarity. The alignment scores do not satisfy the properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors' experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with the known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set, and our novel k-committees algorithm produced the most accurate results for classification. Two new examples of synthetic data sets demonstrate that the authors' k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.
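The local alignment scores referred to above can be computed with a Smith-Waterman-style recurrence, sketched below; the scoring parameters (match/mismatch/gap) are illustrative assumptions, not the ones used in the chapter. Note that such a score is a similarity (higher means more alike) rather than a distance, which is one reason metric axioms such as the triangle inequality need not hold.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score between two sequences,
    computed with a rolling dynamic-programming row. Cells are clipped
    at zero so the best-scoring local region is found."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

A clustering or classification algorithm over DNA sequences would then use these pairwise scores directly as its similarity measure, in place of a vector-space distance.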
Evolution and learning in artificial ecosystems
A generic model is presented for ecosystems inhabited by artificial animals, or animats, that develop over time. The individual animats develop continuously by means of generic mechanisms for learning, forgetting, and decision-making. At the same time, the animat populations develop in an evolutionary process based on fixed mechanisms for sexual and asexual reproduction, mutation, and death. The animats of the ecosystems move, eat, learn, make decisions, interact with other animats, reproduce, and die. Each animat has its individual sets of homeostatic variables, sensors, and motors. It also has its own memory graph that forms the basis of its decision-making. This memory graph has an architecture (i.e. graph topology) that changes over time via mechanisms for adding and removing nodes. Our approach combines genetic algorithms, reinforcement learning, homeostatic decision-making, and dynamic concept formation. To illustrate the generality of the model, five examples of ecosystems are given, ranging from a simple world inhabited by a single frog to a more complex world in which grass, sheep, and wolves interact.
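A memory graph whose topology changes through node addition and removal, as described above, can be sketched as follows; the class and method names are hypothetical, not taken from the paper.

```python
class MemoryGraph:
    """Minimal dynamic memory graph: nodes (concepts) and directed
    edges are added as experience accrues and removed when forgotten."""

    def __init__(self):
        self.nodes = set()
        self.edges = {}  # node -> set of successor nodes

    def add_node(self, n):
        self.nodes.add(n)
        self.edges.setdefault(n, set())

    def connect(self, a, b):
        """Add a directed edge a -> b, creating nodes as needed."""
        self.add_node(a)
        self.add_node(b)
        self.edges[a].add(b)

    def forget(self, n):
        """Remove a node and every edge touching it (the forgetting
        mechanism that reshapes the graph topology over time)."""
        self.nodes.discard(n)
        self.edges.pop(n, None)
        for successors in self.edges.values():
            successors.discard(n)
```

An animat could, for instance, link a sensory concept to an action concept when the pairing is rewarded, and later forget the action node, automatically detaching it from every concept that pointed to it.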