Exploiting Vertices States in GraphESN by Weighted Nearest Neighbor
Abstract. Graph Echo State Networks (GraphESN) extend the Reservoir Computing approach to directly process graph structures. The reservoir is applied to every vertex of an input graph, realizing a contractive encoding process and resulting in a structured state isomorphic to the input. Whenever an unstructured output is required, a state mapping function maps the structured state into a fixed-size feature representation that feeds the linear readout. In this paper we propose an alternative approach, based on distance-weighted nearest neighbor, to realize a more flexible readout exploiting the state information computed for every vertex according to its individual relevance.
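As a rough sketch of the readout described above, the following toy NumPy code implements a distance-weighted nearest-neighbor vote over per-vertex reservoir states. The array shapes, the inverse-distance weighting and the vote aggregation are illustrative assumptions, not necessarily the exact scheme of the paper:

```python
import numpy as np

def wknn_readout(train_states, train_labels, test_states, k=3, eps=1e-9):
    """Distance-weighted k-NN readout over per-vertex reservoir states.

    train_states: (N, d) states of all training vertices
    train_labels: (N,)   graph-level label copied to each training vertex
    test_states:  (M, d) states of the vertices of one test graph
    Returns one predicted label for the whole test graph.
    """
    votes = {}
    for s in test_states:
        dist = np.linalg.norm(train_states - s, axis=1)
        for i in np.argsort(dist)[:k]:
            w = 1.0 / (dist[i] + eps)  # closer neighbours weigh more
            votes[train_labels[i]] = votes.get(train_labels[i], 0.0) + w
    return max(votes, key=votes.get)
```

In this sketch each test vertex contributes its own weighted votes, so vertices whose states lie close to training states of one class dominate the graph-level decision.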
Constructive Reservoir Computing neural models for structured domains
This thesis introduces and discusses new Recursive Neural Network models for the supervised learning of transductions on graphs. The two main contributions are the adoption of a constructive approach and the introduction of a stable output-feedback mechanism, both novel within the Reservoir Computing framework on which the proposed models are based. Moreover, combining a constructive strategy with Reservoir Computing models has made it possible to obtain models that are very efficient from a computational point of view.
The proposed models and strategies constitute a useful and flexible tool for the treatment of complex domains through Machine Learning techniques, and offer solutions to some of the open problems in the Reservoir Computing field.
The experimental analysis concerns the learning of structural transductions on real-world datasets from the field of Chemoinformatics.
Syntax-based machine translation using dependency grammars and discriminative machine learning
Machine translation has undergone huge improvements since the groundbreaking
introduction of statistical methods in the early 2000s, going from very
domain-specific systems that still performed relatively poorly despite the
painstaking crafting of thousands of ad-hoc rules, to general-purpose
systems automatically trained on large collections of bilingual texts which
manage to deliver understandable translations that convey the general
meaning of the original input.
These approaches, however, still perform well below the level of human
translators, typically failing to convey detailed meaning and register, and
producing translations that, while readable, are often ungrammatical and
unidiomatic.
This quality gap, which is considerably large compared to most other
natural language processing tasks, has been the focus of the research in
recent years, with the development of increasingly sophisticated models that
attempt to exploit the syntactical structure of human languages, leveraging
the technology of statistical parsers, as well as advanced machine learning
methods such as margin-based structured prediction algorithms and neural
networks.
The translation software itself became more complex in order to accommodate
the sophistication of these advanced models: the main translation
engine (the decoder) is now often combined with a pre-processor which
reorders the words of the source sentences into the target-language word order, or
with a post-processor that ranks and selects a translation according
to a fine model from a list of candidate translations generated by a coarse
model.
In this thesis we investigate the statistical machine translation problem
from various angles, focusing on translation from non-analytic languages
whose syntax is best described by fluid non-projective dependency grammars
rather than the relatively strict phrase-structure grammars or projective
dependency grammars which are most commonly used in the literature.
We propose a framework for modeling word reordering phenomena
between language pairs as transitions on non-projective source dependency
parse graphs. We quantitatively characterize reordering phenomena for the
German-to-English language pair as captured by this framework, specifically
investigating the incidence and effects of the non-projectivity of source
syntax and the non-locality of word movement w.r.t. the graph structure.
We evaluate several variants of hand-coded pre-ordering rules in order to
assess the impact of these phenomena on translation quality.
We propose a class of dependency-based source pre-ordering approaches
that reorder sentences based on flexible models trained by SVMs and
several recurrent neural network architectures.
We also propose a class of translation reranking models, both syntax-free
and source dependency-based, which make use of a type of neural network
known as graph echo state networks, which is highly flexible and requires
very few training resources, overcoming one of the main limitations
of neural network models for natural language processing tasks.
Reservoir Computing for Learning in Structured Domains
The study of learning models for the direct processing of complex data structures has gained
increasing interest within the Machine Learning (ML) community during the last decades.
In this regard, the efficiency, effectiveness and adaptivity of ML models on large classes
of data structures represent challenging and open research issues.
The paradigm under consideration is Reservoir Computing (RC), a novel and extremely
efficient methodology for modeling Recurrent Neural Networks (RNNs) for adaptive
sequence processing. RC comprises a number of different neural models, among which the
Echo State Network (ESN) is probably the most popular and most widely studied one.
Another research area of interest is represented by Recursive Neural Networks (RecNNs),
constituting a class of neural network models recently proposed for dealing with
hierarchical data structures directly.
In this thesis the RC paradigm is investigated and suitably generalized in order to
approach the problems arising from learning in structured domains. The research studies
described in this thesis cover classes of data structures characterized by increasing
complexity, from sequences to tree and graph structures. Accordingly, the research focus
moves progressively from the analysis of standard ESNs for sequence processing to the
development of new models for tree- and graph-structured domains. The analysis of ESNs
for sequence processing addresses the interesting problem of identifying and
characterizing the relevant factors which influence the reservoir dynamics and the ESN performance.
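A minimal sketch of the ESN dynamics under study may help here (toy NumPy code; the reservoir size, the uniform initialization and the spectral-radius value are illustrative assumptions, not the settings analyzed in the thesis). Rescaling the recurrent matrix to a spectral radius below 1 is one of the standard levers on reservoir dynamics:

```python
import numpy as np

def make_esn(n_in, n_res, rho=0.9, seed=0):
    """Build random input and recurrent weights, rescaling the recurrent
    matrix so its spectral radius equals rho (rho < 1 aims at stable,
    echo-state dynamics)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
    W = rng.uniform(-1.0, 1.0, (n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs):
    """Drive the untrained reservoir over an input sequence and
    collect the state at each step."""
    x = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x)
    return np.array(states)
```

Only a linear readout on the collected states would be trained; the reservoir itself stays fixed, which is the source of the paradigm's efficiency.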
Promising applications of ESNs in the emerging field of Ambient Assisted Living are also
presented and discussed. Moving towards highly structured data representations, the
ESN model is extended to deal with complex structures directly, resulting in the proposed
TreeESN, which is suitable for domains comprising hierarchical structures, and GraphESN,
which generalizes the approach to a large class of cyclic/acyclic, directed/undirected
labeled graphs. TreeESNs and GraphESNs represent both novel RC models for structured
data and extremely efficient approaches for modeling RecNNs, ultimately contributing
to the definition of an RC framework for learning in structured domains. The problem
of adaptively exploiting the state space in GraphESNs is also investigated, with specific
regard to tasks in which input graphs are required to be mapped into flat vectorial outputs,
resulting in the GraphESN-wnn and GraphESN-NG models. As a further point, the
generalization performance of the proposed models is evaluated considering both artificial
and complex real-world tasks from different application domains, including Chemistry,
Toxicology and Document Processing.
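The contractive encoding and the mapping to a flat vectorial output described above can be sketched as follows (toy NumPy code; the weight scaling that enforces contractivity and the use of a mean state mapping are illustrative assumptions, not the exact GraphESN formulation):

```python
import numpy as np

def graphesn_encode(labels, adj, W_in, W_hat, max_iter=100, tol=1e-6):
    """Iterate the reservoir over all vertices until the structured
    state converges; with a contractive W_hat the fixed point is unique.

    labels: (V, n_in) vertex labels; adj: (V, V) 0/1 adjacency.
    W_hat couples each vertex to the states of its neighbours.
    Returns the (V, n_res) structured state.
    """
    x = np.zeros((labels.shape[0], W_hat.shape[0]))
    for _ in range(max_iter):
        x_new = np.tanh(labels @ W_in.T + adj @ x @ W_hat.T)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x

def mean_state_mapping(x):
    """Map the structured state to a fixed-size vector by averaging
    over vertices, so a linear readout can produce a flat output."""
    return x.mean(axis=0)
```

The GraphESN-wnn and GraphESN-NG variants mentioned above replace this fixed averaging with more adaptive ways of exploiting the per-vertex states.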