Tree Edit Distance Learning via Adaptive Symbol Embeddings
Metric learning aims to improve classification accuracy by learning a
distance measure which brings data points from the same class closer together
and pushes data points from different classes further apart. Recent research
has demonstrated that metric learning approaches can also be applied to trees,
such as molecular structures, abstract syntax trees of computer programs, or
syntax trees of natural language, by learning the cost function of an edit
distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree.
However, learning such costs directly may yield an edit distance which violates
metric axioms, is challenging to interpret, and may not generalize well. In
this contribution, we propose a novel metric learning approach for trees which
we call embedding edit distance learning (BEDL) and which learns an edit
distance indirectly by embedding the tree nodes as vectors, such that the
Euclidean distance between those vectors supports class discrimination. We
learn such embeddings by reducing the distance to prototypical trees from the
same class and increasing the distance to prototypical trees from different
classes. In our experiments, we show that BEDL improves upon the
state-of-the-art in metric learning for trees on six benchmark data sets,
ranging from computer science and biomedical data to a natural-language
processing data set containing over 300,000 nodes.
Comment: Paper at the International Conference on Machine Learning (ICML 2018), 2018-07-10 to 2018-07-15 in Stockholm, Sweden.
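As a hedged sketch of the core idea (the symbol names and embedding vectors below are invented for illustration, and pricing insertions and deletions against a zero "gap" vector is our simplifying assumption, not necessarily the paper's exact construction), edit costs derived from node embeddings look like:

```python
import numpy as np

# Illustrative symbol embeddings; in BEDL these vectors are *learned*
# so that the induced edit distance discriminates between classes.
embeddings = {
    "C": np.array([1.0, 0.0]),
    "O": np.array([0.0, 1.0]),
    "N": np.array([0.7, 0.7]),
}

def replace_cost(a, b):
    # Replacing label a by label b costs the Euclidean distance
    # between their embedding vectors.
    return float(np.linalg.norm(embeddings[a] - embeddings[b]))

def delete_cost(a):
    # Simplifying assumption: deletions (and insertions) are priced
    # against a zero "gap" vector.
    return float(np.linalg.norm(embeddings[a]))

insert_cost = delete_cost
```

Because every cost is a Euclidean distance between vectors, non-negativity, symmetry, and the triangle inequality hold automatically, which is the stated motivation for learning the embedding rather than the raw cost table.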
Leave Graphs Alone: Addressing Over-Squashing without Rewiring
Recent works have investigated the role of graph bottlenecks in preventing
long-range information propagation in message-passing graph neural networks,
causing the so-called `over-squashing' phenomenon. As a remedy, graph rewiring
mechanisms have been proposed as preprocessing steps. Graph Echo State Networks
(GESNs) are a reservoir computing model for graphs, where node embeddings are
recursively computed by an untrained message-passing function. In this paper,
we show that GESNs can achieve significantly better accuracy on six
heterophilic node classification tasks without altering the graph connectivity,
thus suggesting a different route for addressing the over-squashing problem.
Comment: Extended Abstract. Presented at the First Learning on Graphs Conference (LoG 2022), Virtual Event, December 9-12, 2022.
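A minimal NumPy sketch of the GESN idea (the hidden size, tanh nonlinearity, and spectral-radius rescaling are illustrative assumptions; the actual model's stability condition also accounts for properties of the graph, which we gloss over here):

```python
import numpy as np

rng = np.random.default_rng(0)

def gesn_embeddings(adj, features, hidden=16, rho=0.5, iters=50):
    """Compute node embeddings by iterating an *untrained* message-passing
    map toward its fixed point (reservoir computing on graphs)."""
    n, d = features.shape
    w_in = rng.uniform(-1, 1, (d, hidden))    # untrained input weights
    w = rng.uniform(-1, 1, (hidden, hidden))  # untrained recurrent weights
    # Rescale the recurrent weights so the map is contractive (sketch of
    # the echo state property).
    w *= rho / max(abs(np.linalg.eigvals(w)))
    x = np.zeros((n, hidden))
    for _ in range(iters):
        x = np.tanh(features @ w_in + adj @ x @ w)
    return x

# Toy 4-node path graph with one-hot node features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
emb = gesn_embeddings(adj, np.eye(4))
```

No graph rewiring happens here: the adjacency matrix is used as-is, and only a readout on top of `emb` would be trained.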
Fast and Deep Graph Neural Networks
We address the efficiency issue for the construction of a deep graph neural
network (GNN). The approach exploits the idea of representing each input graph
as a fixed point of a dynamical system (implemented through a recurrent neural
network), and leverages a deep architectural organization of the recurrent
units. Efficiency is gained by many aspects, including the use of small and
very sparse networks, where the weights of the recurrent units are left
untrained under the stability condition introduced in this work. This can be
viewed as a way to study the intrinsic power of the architecture of a deep GNN,
and also to provide insights for the set-up of more complex fully-trained
models. Through experimental results, we show that even without training of the
recurrent connections, the architecture of a small deep GNN is surprisingly able
to achieve or improve upon the state-of-the-art performance on a significant set
of tasks in the field of graph classification.
Comment: Pre-print of 'Fast and Deep Graph Neural Networks', accepted for AAAI 2020. This document includes the Supplementary Material.
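The construction above can be sketched as stacked, untrained reservoir layers whose pooled states feed a separately trained linear readout. Everything below — the layer sizes, the sum pooling, the contraction constant — is an illustrative assumption, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def reservoir_layer(adj, x_in, hidden=8, rho=0.5, iters=30):
    # One untrained recurrent layer: node states are obtained by iterating
    # a contractive message-passing map (weights stay random and frozen).
    w_in = rng.uniform(-1, 1, (x_in.shape[1], hidden))
    w = rng.uniform(-1, 1, (hidden, hidden))
    w *= rho / max(abs(np.linalg.eigvals(w)))  # stability rescaling (sketch)
    x = np.zeros((adj.shape[0], hidden))
    for _ in range(iters):
        x = np.tanh(x_in @ w_in + adj @ x @ w)
    return x

def deep_graph_feature(adj, features, layers=3):
    # Deep organization: each reservoir reads the states of the previous
    # layer; sum-pool every layer into one graph-level feature vector.
    pooled, x = [], features
    for _ in range(layers):
        x = reservoir_layer(adj, x)
        pooled.append(x.sum(axis=0))
    return np.concatenate(pooled)  # only a linear readout on this is trained

# Toy triangle graph with one-hot node features.
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
feat = deep_graph_feature(adj, np.eye(3))
```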
Tree Echo State Networks
In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree-structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight other characteristics of the approach throughout the paper. First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions to map the tree-structured state of a TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve results comparable to those of state-of-the-art, although more complex, neural and kernel-based models for tree-structured data.
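A hedged sketch of the two ingredients — a bottom-up untrained recursive reservoir and the two state mapping functions (root vs. mean). The sizes and the contraction constant are illustrative; in particular, TreeESN's contractive setting involves the maximum node degree, which this sketch glosses over:

```python
import numpy as np

rng = np.random.default_rng(2)
HIDDEN, LABEL_DIM = 8, 3

w_in = rng.uniform(-1, 1, (LABEL_DIM, HIDDEN))     # untrained input weights
w_hat = rng.uniform(-1, 1, (HIDDEN, HIDDEN))       # untrained recurrent weights
w_hat *= 0.5 / max(abs(np.linalg.eigvals(w_hat)))  # contractive setting (sketch)

def tree_states(node):
    """node = (label_vector, [child nodes]); returns (root_state, all_states).
    States are computed bottom-up by the same untrained transition function."""
    label, children = node
    child_sum, all_states = np.zeros(HIDDEN), []
    for child in children:
        s, states = tree_states(child)
        child_sum += s
        all_states.extend(states)
    state = np.tanh(label @ w_in + child_sum @ w_hat)
    all_states.append(state)
    return state, all_states

leaf = lambda v: (np.array(v, dtype=float), [])
tree = (np.array([1.0, 0.0, 0.0]),
        [leaf([0.0, 1.0, 0.0]), leaf([0.0, 0.0, 1.0])])

root_state, states = tree_states(tree)
root_mapping = root_state               # root state mapping
mean_mapping = np.mean(states, axis=0)  # mean state mapping
```

Either fixed-size vector (`root_mapping` or `mean_mapping`) can then be fed to a trained readout for classification or regression.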
A Deep Generative Model for Fragment-Based Molecule Generation
Molecule generation is a challenging open problem in cheminformatics.
Currently, deep generative approaches addressing the challenge belong to two
broad categories, differing in how molecules are represented. One approach
encodes molecular graphs as strings of text, and learns their corresponding
character-based language model. Another, more expressive, approach operates
directly on the molecular graph. In this work, we address two limitations of
the former: generation of invalid and duplicate molecules. To improve validity
rates, we develop a language model for small molecular substructures called
fragments, loosely inspired by the well-known paradigm of Fragment-Based Drug
Design. In other words, we generate molecules fragment by fragment, instead of
atom by atom. To improve uniqueness rates, we present a frequency-based masking
strategy that helps generate molecules with infrequent fragments. We show
experimentally that our model largely outperforms other language model-based
competitors, reaching state-of-the-art performances typical of graph-based
approaches. Moreover, generated molecules display molecular properties similar
to those in the training sample, even in the absence of explicit task-specific
supervision.
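The frequency-based masking idea can be sketched on toy token sequences (the token names, threshold, and mask symbol below are invented; the actual model operates on molecular fragments and pairs the masking with a sampling step at generation time):

```python
from collections import Counter

def mask_infrequent(sequences, min_count=2, mask="<MASK>"):
    """Replace tokens rarer than min_count with a shared mask token, so a
    language model sees one pooled symbol for all infrequent fragments."""
    counts = Counter(tok for seq in sequences for tok in seq)
    return [[tok if counts[tok] >= min_count else mask for tok in seq]
            for seq in sequences]

# Toy "fragment" sequences: f3, f4 and f5 each occur only once.
data = [["f1", "f2", "f3"],
        ["f1", "f2", "f4"],
        ["f1", "f5", "f2"]]
masked = mask_infrequent(data)
```

Pooling rare symbols this way keeps the vocabulary compact while still letting infrequent fragments appear in generated sequences, which is what drives the improved uniqueness rates.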
Wild animals' biologging through machine learning models
In recent decades, the biodiversity crisis has been characterised by the decline and extinction of many animal species worldwide. To aid in understanding the threats and causes of this demise, conservation scientists rely on remote assessments. Innovation in technology in the form of microelectromechanical systems (MEMS) has brought about great leaps forward in the understanding of animal life. MEMS are now readily available to ecologists for remotely monitoring the activities of wild animals. Since the advent of electronic tags, methods such as biologging have been increasingly applied to the study of animal ecology, providing information unattainable through other techniques. In this paper, we discuss a few relevant instances of biologging studies. We present an overview of the biologging research area, describing the evolution of the acquisition of behavioural information and the improvement provided by tags. In the second part, we review some common data analysis techniques used to identify the daily activity of animals.
Deep Echo State Networks for Diagnosis of Parkinson's Disease
In this paper, we introduce a novel approach for diagnosis of Parkinson's
Disease (PD) based on deep Echo State Networks (DeepESNs). The identification of PD
is performed by analyzing the whole time-series collected from a tablet device
during the sketching of spiral tests, without the need for feature extraction
and data preprocessing. We evaluated the proposed approach on a public dataset
of spiral tests. The results of the experimental analysis show that DeepESNs
perform significantly better than a shallow ESN model. Overall, the proposed
approach obtains state-of-the-art results in the identification of PD on this
kind of temporal data.
Comment: This is a pre-print of the paper submitted to the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2018.
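As a minimal sketch of a DeepESN on a multivariate time series (the reservoir sizes, leaky-integration constant, spectral radius, and the random stand-in for the pen-trajectory input are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def esn_layer(inputs, hidden=10, rho=0.9, leak=0.5):
    # One untrained leaky-integrator reservoir driven by a (time, dim) sequence.
    w_in = rng.uniform(-1, 1, (inputs.shape[1], hidden))
    w = rng.uniform(-1, 1, (hidden, hidden))
    w *= rho / max(abs(np.linalg.eigvals(w)))  # echo state rescaling (sketch)
    states, x = [], np.zeros(hidden)
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(u @ w_in + x @ w)
        states.append(x)
    return np.array(states)

def deep_esn_states(series, layers=3):
    # Stack reservoirs: each layer is driven by the state sequence of the
    # layer below; the concatenated states feed a trained linear readout.
    collected, x = [], series
    for _ in range(layers):
        x = esn_layer(x)
        collected.append(x)
    return np.concatenate(collected, axis=1)

series = rng.standard_normal((20, 2))  # stand-in for a sketched-spiral time series
feats = deep_esn_states(series)
```

Only a linear classifier on top of `feats` would be trained, which is what makes the whole-time-series approach feasible without feature extraction or preprocessing.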