On the effectiveness of Randomized Signatures as Reservoir for Learning Rough Dynamics
Many finance, physics, and engineering phenomena are modeled by
continuous-time dynamical systems driven by highly irregular (stochastic)
inputs. A powerful tool to perform time series analysis in this context is
rooted in rough path theory and leverages the so-called Signature Transform.
This algorithm enjoys strong theoretical guarantees but is hard to scale to
high-dimensional data. In this paper, we study a recently derived random
projection variant called Randomized Signature, obtained using the
Johnson-Lindenstrauss Lemma. We provide an in-depth experimental evaluation of
the effectiveness of the Randomized Signature approach, in an attempt to
showcase the advantages of this reservoir to the community. Specifically, we
find that this method is preferable to the truncated Signature approach and
alternative deep learning techniques in terms of model complexity, training
time, accuracy, robustness, and data hungriness.
Comment: Accepted for IEEE IJCNN 202
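The abstract above does not spell out how the reservoir is built, so the following is only a minimal sketch of one common formulation of a Randomized Signature: a hidden state driven by the increments of the input path through fixed random matrices, Z_{t+1} = Z_t + sum_i tanh(A_i Z_t + b_i) (X^i_{t+1} - X^i_t). The function name `randomized_signature`, the tanh activation, the reservoir size `k`, and the 1/sqrt(k) scaling are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def randomized_signature(path, k=32, seed=0):
    """Map a discretized path of shape (T, d) to a k-dimensional feature vector.

    Sketch of the update Z <- Z + sum_i tanh(A_i Z + b_i) * dX^i with random,
    untrained A_i (k x k) and b_i (k). All hyperparameters are assumptions.
    """
    rng = np.random.default_rng(seed)
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    # Assumed scaling: standard deviation 1/sqrt(k) keeps A_i @ z at order one.
    A = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k, k))
    b = rng.normal(0.0, 1.0, size=(d, k))
    z = rng.normal(0.0, 1.0, size=k)  # random initial reservoir state
    for dx in np.diff(path, axis=0):  # one increment of shape (d,) per time step
        fields = np.tanh(A @ z + b)   # shape (d, k); row i is tanh(A_i z + b_i)
        z = z + dx @ fields           # weight each row by its input increment
    return z


# Usage: featurize a toy two-dimensional path (time, sine wave).
t = np.linspace(0.0, 1.0, 50)
toy_path = np.stack([t, np.sin(2.0 * np.pi * t)], axis=1)
features = randomized_signature(toy_path, k=16, seed=1)
```

In a reservoir setup, only a simple readout (for example a linear or ridge regression on `features`) would be trained, while `A`, `b`, and the initial state stay fixed.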
Distributional semantics and machine learning for statistical machine translation
[EU] In this work, we study the use of distributional semantics and machine learning
to improve statistical machine translation. To that end, we propose a machine learning model
based on logistic regression to dynamically model the translation probability of phrases.
We prove that the proposed model is a generalization of the usual translation probabilities
of statistical machine translation, and we use it to incorporate both contextual and
distributional semantic information through lexical features, word clusters, and vector
representations of words. In addition, we develop another approach for integrating
distributional semantic knowledge into statistical machine translation: using bilingual
vector representations of words to model the similarity of phrase translations. Our
experiments show the usefulness of the proposed models, obtaining promising results over
a strong baseline system. Likewise, our work makes important contributions regarding
bilingual mappings of vector representations and phrase similarity measures based on
vector representations of words, which have an intrinsic value beyond machine translation
in the field of distributional semantics.
[EN] In this work, we explore the use of distributional semantics and machine learning to
improve statistical machine translation. For that purpose, we propose a logistic-regression-based
machine learning model for dynamic phrase translation probability modeling.
We prove that the proposed model can be seen as a generalization of the standard
translation probabilities used in statistical machine translation, and use it to incorporate
context and distributional semantic information through lexical, word cluster and word
embedding features. Apart from that, we explore the use of word embeddings for phrase
translation probability scoring as an alternative approach to incorporate distributional
semantic knowledge into statistical machine translation. Our experiments show the
effectiveness of the proposed models, achieving promising results over a strong baseline.
At the same time, our work makes important contributions in relation to bilingual word
embedding mappings and word-embedding-based phrase similarity measures, which go beyond
machine translation and have an intrinsic value in the field of distributional semantics
- …
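The abstract mentions phrase similarity measures based on word embeddings without saying which measure is used. As an illustration only, the sketch below scores a source and a target phrase by the cosine similarity of their averaged word vectors. This centroid measure is an assumed stand-in, not necessarily the one proposed in the work. It also assumes the two embedding tables have already been mapped into a shared bilingual space. The toy vectors and the names `phrase_vector` and `phrase_similarity` are hypothetical.

```python
import numpy as np


def phrase_vector(phrase, embeddings):
    """Average the vectors of the known words in a whitespace-tokenized phrase."""
    vectors = [embeddings[w] for w in phrase.split() if w in embeddings]
    if not vectors:
        return None  # no word of the phrase has an embedding
    return np.mean(np.asarray(vectors, dtype=float), axis=0)


def phrase_similarity(src_phrase, tgt_phrase, src_emb, tgt_emb):
    """Cosine similarity between phrase centroids (an assumed, simple measure)."""
    u = phrase_vector(src_phrase, src_emb)
    v = phrase_vector(tgt_phrase, tgt_emb)
    if u is None or v is None:
        return 0.0
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0.0 else 0.0


# Hypothetical toy embeddings, assumed to live in one shared bilingual space.
src_emb = {"casa": [1.0, 0.0], "roja": [0.0, 1.0]}
tgt_emb = {"house": [1.0, 0.0], "red": [0.0, 1.0]}

same = phrase_similarity("casa roja", "red house", src_emb, tgt_emb)
different = phrase_similarity("casa", "red", src_emb, tgt_emb)
```

A score like this could be added as one more feature alongside the usual phrase translation probabilities. How it is weighted against them is left open here.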