252 research outputs found
Batch kernel SOM and related Laplacian methods for social network analysis
Large graphs are natural mathematical models for describing the structure of
the data in a wide variety of fields, such as web mining, social networks,
information retrieval, biological networks, etc. For all these applications,
automatic tools are required to get a synthetic view of the graph and to reach
a good understanding of the underlying problem. In particular, discovering
groups of tightly connected vertices and understanding the relations between
those groups is very important in practice. This paper shows how a kernel
version of the batch Self Organizing Map can be used to achieve these goals via
kernels derived from the Laplacian matrix of the graph, especially when it is
used in conjunction with more classical methods based on the spectral analysis
of the graph. The proposed method is used to explore the structure of a
medieval social network modeled through a weighted graph that has been directly
built from a large corpus of agrarian contracts
Mining a medieval social network by kernel SOM and related methods
This paper briefly presents several ways to understand the organization of a
large social network (several hundreds of persons). We compare approaches
coming from data mining for clustering the vertices of a graph (spectral
clustering, self-organizing algorithms. . .) and provide methods for
representing the graph from these analysis. All these methods are illustrated
on a medieval social network and the way they can help to understand its
organization is underlined
Overlap Removal of Dimensionality Reduction Scatterplot Layouts
Dimensionality Reduction (DR) scatterplot layouts have become a ubiquitous
visualization tool for analyzing multidimensional data items with presence in
different areas. Despite its popularity, scatterplots suffer from occlusion,
especially when markers convey information, making it troublesome for users to
estimate items' groups' sizes and, more importantly, potentially obfuscating
critical items for the analysis under execution. Different strategies have been
devised to address this issue, either producing overlap-free layouts, lacking
the powerful capabilities of contemporary DR techniques in uncover interesting
data patterns, or eliminating overlaps as a post-processing strategy. Despite
the good results of post-processing techniques, the best methods typically
expand or distort the scatterplot area, thus reducing markers' size (sometimes)
to unreadable dimensions, defeating the purpose of removing overlaps. This
paper presents a novel post-processing strategy to remove DR layouts' overlaps
that faithfully preserves the original layout's characteristics and markers'
sizes. We show that the proposed strategy surpasses the state-of-the-art in
overlap removal through an extensive comparative evaluation considering
multiple different metrics while it is 2 or 3 orders of magnitude faster for
large datasets.Comment: 11 pages and 9 figure
Characteristics of networks generated by kernel growing neural gas
This research aims to develop kernel GNG, a kernelized version of the growing
neural gas (GNG) algorithm, and to investigate the features of the networks
generated by the kernel GNG. The GNG is an unsupervised artificial neural
network that can transform a dataset into an undirected graph, thereby
extracting the features of the dataset as a graph. The GNG is widely used in
vector quantization, clustering, and 3D graphics. Kernel methods are often used
to map a dataset to feature space, with support vector machines being the most
prominent application. This paper introduces the kernel GNG approach and
explores the characteristics of the networks generated by kernel GNG. Five
kernels, including Gaussian, Laplacian, Cauchy, inverse multiquadric, and log
kernels, are used in this study
A self-organizing world: special issue of the 13th edition of the workshop on self-organizing maps and learning vector quantization, clustering and data visualization, WSOMÂż+Âż2019
Postprint (author's final draft
Urban Land Cover Classification with Missing Data Modalities Using Deep Convolutional Neural Networks
Automatic urban land cover classification is a fundamental problem in remote
sensing, e.g. for environmental monitoring. The problem is highly challenging,
as classes generally have high inter-class and low intra-class variance.
Techniques to improve urban land cover classification performance in remote
sensing include fusion of data from different sensors with different data
modalities. However, such techniques require all modalities to be available to
the classifier in the decision-making process, i.e. at test time, as well as in
training. If a data modality is missing at test time, current state-of-the-art
approaches have in general no procedure available for exploiting information
from these modalities. This represents a waste of potentially useful
information. We propose as a remedy a convolutional neural network (CNN)
architecture for urban land cover classification which is able to embed all
available training modalities in a so-called hallucination network. The network
will in effect replace missing data modalities in the test phase, enabling
fusion capabilities even when data modalities are missing in testing. We
demonstrate the method using two datasets consisting of optical and digital
surface model (DSM) images. We simulate missing modalities by assuming that DSM
images are missing during testing. Our method outperforms both standard CNNs
trained only on optical images as well as an ensemble of two standard CNNs. We
further evaluate the potential of our method to handle situations where only
some DSM images are missing during testing. Overall, we show that we can
clearly exploit training time information of the missing modality during
testing
- …