Graphs in machine learning: an introduction
Graphs are commonly used to characterise interactions between objects of
interest. Because they are based on a straightforward formalism, they are used
in many scientific fields from computer science to historical sciences. In this
paper, we give an introduction to some methods relying on graphs for learning.
This includes both unsupervised and supervised methods. Unsupervised learning
algorithms usually aim at visualising graphs in latent spaces and/or clustering
the nodes. Both focus on extracting knowledge from graph topologies. While most
existing techniques are only applicable to static graphs, where edges do not
evolve through time, recent developments have shown that they could be extended
to deal with evolving networks. In a supervised context, one generally aims at
inferring labels or numerical values attached to nodes using both the graph
and, when they are available, node characteristics. Balancing the two sources
of information can be challenging, especially as they can disagree locally or
globally. In both contexts, supervised and unsupervised, data can be
relational (augmented with one or several global graphs) as described above, or
graph valued. In this latter case, each object of interest is given as a full
graph (possibly completed by other characteristics). In this context, natural
tasks include graph clustering (as in producing clusters of graphs rather than
clusters of nodes in a single graph), graph classification, etc.
1 Real networks
One of the first practical studies on graphs can be dated back to the
original work of Moreno [51] in the 1930s. Since then, there has been a growing
interest in graph analysis associated with strong developments in the modelling
and the processing of these data. Graphs are now used in many scientific
fields. In Biology [54, 2, 7], for instance, metabolic networks can describe
pathways of biochemical reactions [41], while in social sciences networks are
used to represent relation ties between actors [66, 56, 36, 34]. Other examples
include power grids [71] and the web [75]. Recently, networks have also been
considered in other areas such as geography [22] and history [59, 39]. In
machine learning, networks are seen as powerful tools to model problems in
order to extract information from data and for prediction purposes. This is the
subject of this paper. For more complete surveys, we refer to [28, 62, 49, 45].
In this section, we introduce notations and highlight properties shared by most
real networks. In Section 2, we then consider methods aiming at extracting
information from a single network. We will particularly focus on clustering
methods where the goal is to find clusters of vertices. Finally, in Section 3,
techniques that take a series of networks into account, where each network i
A Consistent Regularization Approach for Structured Prediction
We propose and analyze a regularization approach for structured prediction
problems. We characterize a large class of loss functions that allows
structured outputs to be naturally embedded in a linear space. We exploit this fact to
design learning algorithms using a surrogate loss approach and regularization
techniques. We prove universal consistency and finite sample bounds
characterizing the generalization properties of the proposed methods.
Experimental results are provided to demonstrate the practical usefulness of
the proposed approach.
Comment: 39 pages, 2 Tables, 1 Figure
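The idea of embedding structured outputs in a linear space can be illustrated with a small sketch. The abstract does not name a specific loss, so the example below assumes the normalized Hamming loss on label vectors in {-1, +1}^d, which can be written as a bilinear form psi(y)^T V psi(y') with the hypothetical embedding psi(y) = (1, y_1, ..., y_d) and V = diag(1/2, -1/(2d), ..., -1/(2d)). The names `embed`, `bilinear_loss` and `decode` are illustrative only, and the surrogate prediction `g` is a plain average of embedded outputs standing in for whatever a regularized estimator would return.

```python
from itertools import product


def hamming(y, z):
    """Normalized Hamming loss between two equal-length label tuples."""
    return sum(a != b for a, b in zip(y, z)) / len(y)


def embed(y):
    """Assumed embedding psi(y) = (1, y_1, ..., y_d) for y in {-1, +1}^d."""
    return (1.0,) + tuple(float(v) for v in y)


def bilinear_loss(u, v):
    """Evaluate u^T V v with V = diag(1/2, -1/(2d), ..., -1/(2d))."""
    d = len(u) - 1
    inner = sum(a * b for a, b in zip(u[1:], v[1:]))
    return 0.5 * u[0] * v[0] - inner / (2 * d)


def decode(g, candidates):
    """Map a vector g of the linear space back to the candidate output of least loss."""
    return min(candidates, key=lambda y: bilinear_loss(embed(y), g))


d = 3
outputs = list(product((-1, 1), repeat=d))

# Check that the bilinear form reproduces the Hamming loss on every pair.
max_gap = max(
    abs(hamming(y, z) - bilinear_loss(embed(y), embed(z)))
    for y in outputs
    for z in outputs
)

# Stand-in surrogate prediction: the average embedding of three sample outputs.
sample = [(1, 1, -1), (1, 1, 1), (1, -1, -1)]
g = tuple(sum(col) / len(sample) for col in zip(*(embed(y) for y in sample)))
prediction = decode(g, outputs)
print(max_gap, prediction)
```

For this particular loss, decoding reduces to a coordinate-wise majority vote over the sample outputs, which is why the brute-force search over candidates is only practical here for small d.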
Fingerprinting-Based Positioning in Distributed Massive MIMO Systems
Location awareness in wireless networks may enable many applications such as
emergency services, autonomous driving and geographic routing. Although there
are many available positioning techniques, none of them is adapted to work with
massive multiple-input multiple-output (MIMO) systems, which represent a leading 5G
technology candidate. In this paper, we discuss possible solutions for
positioning of mobile stations using a vector of signals at the base station,
equipped with many antennas distributed over the deployment area. Our main proposal
is to use fingerprinting techniques based on a vector of received signal
strengths. Methods of this kind can work in highly cluttered multipath
environments, and require just one base station, in contrast to standard
range-based and angle-based techniques. We also provide a solution for
fingerprinting-based positioning based on Gaussian process regression, and
discuss main applications and challenges.
Comment: Proc. of IEEE 82nd Vehicular Technology Conference (VTC2015-Fall)
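A minimal sketch of fingerprinting on a vector of received signal strengths follows. It is not the Gaussian process regression solution mentioned in the abstract: it swaps in weighted k-nearest-neighbour matching, a common fingerprinting baseline, to keep the example short. The antenna layout, the log-distance path-loss model with its exponent and reference power, the grid spacing and all function names are assumptions made for illustration, and the signals are noise-free.

```python
from math import dist, log10

# Hypothetical antenna positions (metres) of a single distributed base station.
ANTENNAS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]


def rss_vector(position, exponent=3.0, ref_power=-40.0):
    """Noise-free RSS (dBm) at each antenna under an assumed log-distance model."""
    return [
        ref_power - 10.0 * exponent * log10(max(dist(position, a), 0.1))
        for a in ANTENNAS
    ]


def build_database(step=1.0, size=10):
    """Offline phase: record one fingerprint per reference point on a square grid."""
    points = [(i * step, j * step) for i in range(size + 1) for j in range(size + 1)]
    return [(p, rss_vector(p)) for p in points]


def locate(observed, database, k=4):
    """Online phase: average the k reference points closest in RSS space."""
    nearest = sorted(database, key=lambda entry: dist(observed, entry[1]))[:k]
    weights = [1.0 / (dist(observed, rss) + 1e-9) for _, rss in nearest]
    total = sum(weights)
    x = sum(w * p[0] for w, (p, _) in zip(weights, nearest)) / total
    y = sum(w * p[1] for w, (p, _) in zip(weights, nearest)) / total
    return (x, y)


database = build_database()
true_position = (3.3, 6.8)
estimate = locate(rss_vector(true_position), database)
error = dist(estimate, true_position)
print(estimate, error)
```

Only the RSS vector observed at one base station is used, which mirrors the single-station setting described above; a regression model such as a Gaussian process would replace `locate` with a learned map from RSS vectors to coordinates.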
The pharmacophore kernel for virtual screening with support vector machines
We introduce a family of positive definite kernels specifically optimized for
the manipulation of 3D structures of molecules with kernel methods. The kernels
are based on the comparison of the three-point pharmacophores present in the
3D structures of molecules, a set of molecular features known to be
particularly relevant for virtual screening applications. We present a
computationally demanding exact implementation of these kernels, as well as
fast approximations related to the classical fingerprint-based approaches.
Experimental results suggest that this new approach outperforms
state-of-the-art algorithms based on the 2D structure of molecules for the
detection of inhibitors of several drug targets.
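The comparison of three-point pharmacophores can be sketched as a sum over pairs of labelled point triplets. This is a simplified illustration under stated assumptions, not the authors' exact definition: a molecule is taken to be a list of (feature label, 3D coordinates) pairs, two triplets are compared with an exact match on labels multiplied by a Gaussian on the differences of their three inter-point distances, and the bandwidth `sigma`, the feature labels and the toy coordinates are all hypothetical. Enumerating every pair of triplets corresponds to the computationally demanding exact style of computation rather than to a fast approximation.

```python
from itertools import permutations
from math import dist, exp, sqrt


def triplets(mol):
    """All ordered triplets of distinct points in a molecule."""
    return list(permutations(mol, 3))


def triplet_kernel(p, q, sigma=1.0):
    """Label match times a Gaussian on the three inter-point distance gaps."""
    if any(a[0] != b[0] for a, b in zip(p, q)):
        return 0.0
    value = 1.0
    for i in range(3):
        j = (i + 1) % 3
        gap = dist(p[i][1], p[j][1]) - dist(q[i][1], q[j][1])
        value *= exp(-gap * gap / (2.0 * sigma ** 2))
    return value


def pharmacophore_kernel(m1, m2, sigma=1.0):
    """Sum the triplet kernel over every pair of triplets from the two molecules."""
    return sum(triplet_kernel(p, q, sigma) for p in triplets(m1) for q in triplets(m2))


# Toy molecules: (feature label, coordinates); values are made up for the example.
mol_a = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (1.5, 0.0, 0.0)),
    ("hydrophobe", (0.0, 2.0, 0.0)),
    ("donor", (1.0, 1.0, 1.0)),
]
mol_b = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (1.2, 0.3, 0.0)),
    ("hydrophobe", (0.0, 2.5, 0.5)),
    ("donor", (1.0, 0.5, 1.5)),
]

k_aa = pharmacophore_kernel(mol_a, mol_a)
k_bb = pharmacophore_kernel(mol_b, mol_b)
k_ab = pharmacophore_kernel(mol_a, mol_b)
similarity = k_ab / sqrt(k_aa * k_bb)  # normalized to at most 1
print(k_ab, similarity)
```

Because only inter-point distances enter the comparison, the value does not change when a molecule is translated or rotated, and the resulting Gram matrix could be passed to a support vector machine as a precomputed kernel.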