Making Good on LSTMs' Unfulfilled Promise
LSTMs promise much to financial time-series analysis, both temporal and cross-sectional inference, but we find that they do not deliver in a real-world financial management task. We examine an alternative called Continual Learning (CL), a memory-augmented approach which can provide transparent explanations, i.e. which memory did what and when. This work has implications for many financial applications, including credit, time-varying fairness in decision making, and more. We make three important new observations. Firstly, as well as being more explainable, time-series CL approaches outperform both LSTMs and a simple sliding-window learner using feed-forward neural networks (FFNNs). Secondly, we show that CL based on a sliding-window learner (FFNN) is more effective than CL based on a sequential learner (LSTM). Thirdly, we examine how real-world time-series noise impacts several similarity approaches used in CL memory addressing. We provide these insights using an approach called Continual Learning Augmentation (CLA), tested on a complex real-world problem: emerging-market equities investment decision making. CLA provides a test-bed because it can be built on different types of time-series learners, allowing LSTM and FFNN learners to be tested side by side. CLA is also used to test several distance approaches used in a memory recall-gate: Euclidean distance (ED), dynamic time warping (DTW), auto-encoders (AE) and a novel hybrid approach, warp-AE. We find that ED under-performs DTW and AE, while warp-AE shows the best overall performance in a real-world financial task.
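As a rough illustration of the kind of memory addressing described above, the sketch below compares a current time-series window against stored memories using Euclidean distance and dynamic time warping, and recalls the nearest memory. It is not the authors' CLA implementation; all names, sizes and the recall rule are assumptions made for illustration.

```python
# Illustrative sketch (not the authors' CLA code): compare a query window
# against stored memories and recall the closest one.
import numpy as np

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    # Straight-line distance between two equal-length windows.
    return float(np.linalg.norm(a - b))

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    # Classic O(len(a) * len(b)) dynamic time warping on 1-D series.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def recall_nearest(query: np.ndarray, memories: list, metric=dtw) -> int:
    # Hypothetical recall-gate: return the index of the closest stored memory.
    return int(np.argmin([metric(query, m) for m in memories]))

# Example with two stored return windows (illustrative data).
memories = [np.array([0.01, 0.02, -0.01]), np.array([-0.03, -0.02, 0.00])]
print(recall_nearest(np.array([0.02, 0.01, -0.02]), memories, metric=euclidean))
```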
Learning to Act with RVRL Agents
The use of reinforcement learning to guide action selection of cognitive agents has been shown to be a powerful technique for stochastic environments. Standard reinforcement learning techniques used to provide decision-theoretic policies rely, however, on explicit state-based computations of value for each state-action pair. This requires the computation of a number of values exponential in the number of state variables and actions in the system. This research extends existing work with an acquired probabilistic rule representation of an agent environment by developing an algorithm to apply reinforcement learning to values attached to the rules themselves. Structure captured by the rules is then used to learn a policy directly. The resulting value attached to each rule represents the utility of taking an action if the conditions of the rule are present in the agent’s current set of percepts. This has several advantages for planning purposes: generalization over many states and over unseen states; effective decisions can therefore be made with less training data than state-based modelling systems (e.g. Dyna Q-learning); and the problem of computation in an exponential state-action space is alleviated. The results of applying this algorithm to rules in a specific environment are presented, with comparison to standard reinforcement learning policies developed in related work.
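A hedged sketch of the core idea of attaching reinforcement-learning values to rules rather than to states follows: rules whose conditions are satisfied by the current percepts are eligible to fire, and the firing rule's value receives a Q-learning-style update. The `Rule` structure, the matching test and the learning constants are illustrative assumptions, not the paper's algorithm.

```python
# Hypothetical sketch of reinforcement learning over rules rather than states.
from dataclasses import dataclass

@dataclass
class Rule:
    conditions: frozenset   # percepts that must all be present for the rule to fire
    action: str
    value: float = 0.0      # learned utility of taking `action` when the rule fires

def matching(rules, percepts):
    # Rules whose conditions are a subset of the current percepts.
    return [r for r in rules if r.conditions <= percepts]

def select_action(rules, percepts):
    # Greedy choice among the rules that fire on the current percepts.
    fired = matching(rules, percepts)
    return max(fired, key=lambda r: r.value) if fired else None

def update(rule, reward, rules, next_percepts, alpha=0.1, gamma=0.9):
    # Q-learning-style update against the best rule applicable next.
    best_next = max((r.value for r in matching(rules, next_percepts)), default=0.0)
    rule.value += alpha * (reward + gamma * best_next - rule.value)
```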
Learning Distributed Representations for Multiple-Viewpoint Melodic Prediction
The analysis of sequences is important for extracting information from music owing to its fundamentally temporal nature. In this paper, we present a distributed model based on the Restricted Boltzmann Machine (RBM) for learning melodic sequences. The model is similar to a previous successful neural network model for natural language [2]. It is first trained to predict the next pitch in a given pitch sequence, and then extended to also make use of information in sequences of note-durations in monophonic melodies on the same task. In doing so, we also propose an efficient way of representing this additional information that takes advantage of the RBM’s structure. Results show that this RBM-based prediction model performs better than previously evaluated n-gram models and also outperforms them in certain cases. It is able to make use of information present in longer sequences more effectively than n-gram models, while scaling linearly in the number of free parameters required.
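A minimal sketch of how an RBM-style distributed model might score candidate next pitches is given below: the preceding pitches plus a candidate continuation are one-hot encoded into the visible layer, and candidates are ranked by (negative) free energy. The encoding, the sizes and the untrained weights are assumptions for illustration, not the model described above.

```python
# Hypothetical sketch of RBM-based next-pitch scoring; illustrative sizes,
# untrained weights, and visible biases omitted for brevity.
import numpy as np

N_PITCHES, CONTEXT, N_HIDDEN = 64, 4, 50
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, ((CONTEXT + 1) * N_PITCHES, N_HIDDEN))  # weights
b_h = np.zeros(N_HIDDEN)                                          # hidden biases

def encode(pitches):
    # One-hot encode CONTEXT + 1 pitches into a single visible vector.
    v = np.zeros((len(pitches), N_PITCHES))
    v[np.arange(len(pitches)), pitches] = 1.0
    return v.ravel()

def free_energy(v):
    # F(v) = -sum_j softplus(b_j + v . W_j)
    return -np.sum(np.logaddexp(0.0, b_h + v @ W))

def predict_next(context):
    # Rank every candidate next pitch by how low its free energy is.
    scores = [-free_energy(encode(list(context) + [p])) for p in range(N_PITCHES)]
    return int(np.argmax(scores))

print(predict_next([60, 62, 64, 65]))   # e.g. a fragment of a scale
```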
The Recurrent Temporal Discriminative Restricted Boltzmann Machines
Classification of sequence data is the topic of interest for dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter have the capability of learning representations. Several attempts have been made to improve performance by combining these two approaches or by increasing the processing capability of the hidden units in RNNs. This often results in complex models with a large number of learning parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution, instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state of the art. Also, the experimental results on optical character recognition, part-of-speech tagging and text chunking demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters.
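One way to write down the distinction the abstract draws between joint and conditional training, in notation assumed here rather than taken from the paper:

```latex
% Notation assumed for illustration; not taken from the paper.
% Training the joint distribution over a label sequence y_{1:T} and input
% sequence x_{1:T} requires gradients of an intractable partition function:
\[ \mathcal{L}_{\mathrm{joint}} \;=\; \log p\left(y_{1:T},\, x_{1:T}\right) \]
% whereas the conditional objective factorises over time, so that each term
% is a tractable per-step class distribution:
\[ \mathcal{L}_{\mathrm{cond}} \;=\; \sum_{t=1}^{T} \log p\left(y_t \mid x_{1:t},\, y_{1:t-1}\right) \]
```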
Sequence Classification Restricted Boltzmann Machines With Gated Units
For the classification of sequential data, dynamic Bayesian networks and recurrent neural networks (RNNs) are the preferred models: the former can explicitly model the temporal dependencies between the variables, while the latter have the capability of learning representations. The recurrent temporal restricted Boltzmann machine (RTRBM) is a model that combines these two features. However, learning and inference in RTRBMs can be difficult because of the exponential nature of their gradient computations when maximizing log-likelihoods. In this article, first, we address this intractability by optimizing a conditional rather than a joint probability distribution when performing sequence classification. This results in the "sequence classification restricted Boltzmann machine" (SCRBM). Second, we introduce gated SCRBMs (gSCRBMs), which use an information processing gate, as an integration of SCRBMs with long short-term memory (LSTM) models. In the experiments reported in this article, we evaluate the proposed models on optical character recognition, chunking, and multi-resident activity recognition in smart homes. The experimental results show that gSCRBMs achieve performance comparable to that of the state of the art in all three tasks. gSCRBMs require far fewer parameters in comparison with other recurrent networks with memory gates, in particular LSTMs and gated recurrent units (GRUs).
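A rough sketch of what a single "information processing gate" can look like follows: one sigmoid gate decides how much of a new candidate activation replaces the carried-over hidden state, giving LSTM/GRU-like memory with only one gate's worth of parameters. The parameterisation below is an assumption for illustration, not the gSCRBM's actual update.

```python
# Hypothetical single-gate recurrent step; shapes and parameters are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_step(x_t, h_prev, params):
    W, U, b, Wg, Ug, bg = params
    h_tilde = sigmoid(x_t @ W + h_prev @ U + b)    # candidate hidden activation
    g = sigmoid(x_t @ Wg + h_prev @ Ug + bg)       # information processing gate
    return g * h_tilde + (1.0 - g) * h_prev        # gated hidden-state update

# Illustrative shapes: 8 inputs, 16 hidden units.
rng = np.random.default_rng(0)
params = (rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), np.zeros(16),
          rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), np.zeros(16))
h = gated_step(rng.normal(size=8), np.zeros(16), params)
```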
Generalising the Discriminative Restricted Boltzmann Machine
We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While the DRBM was originally defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This is illustrated here with the Binomial and {-1, +1}-Bernoulli distributions. We evaluate these two DRBM variants and compare them with the original one on three benchmark datasets, namely the MNIST and USPS digit classification datasets and the 20 Newsgroups document classification dataset. Results show that each of the three compared models outperforms the remaining two on one of the three datasets, indicating that the proposed theoretical generalisation of the DRBM may be valuable in practice.
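A hedged sketch of the generalisation being described, in notation assumed here: the DRBM's class-conditional distribution sums a per-hidden-unit term F over the hidden units, and the choice of hidden-unit state space determines F.

```latex
% Notation assumed here for illustration; see the paper for the exact statement.
% For input x and class y, with weights W, U and biases b, c over hidden units j:
\[
  p(y \mid \mathbf{x}) \;=\;
  \frac{\exp\!\big(b_y + \sum_j F(c_j + U_{jy} + \sum_i W_{ji}\, x_i)\big)}
       {\sum_{y'} \exp\!\big(b_{y'} + \sum_j F(c_j + U_{jy'} + \sum_i W_{ji}\, x_i)\big)},
  \qquad
  F(s) \;=\; \log \sum_{h \in \mathcal{H}} e^{s h}.
\]
% The hidden-unit state space H determines F: for H = {0, 1} it is the usual
% softplus, F(s) = log(1 + e^s); for H = {-1, +1} it becomes F(s) = log(2 cosh s).
```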
A Distributed Model For Multiple-Viewpoint Melodic Prediction.
The analysis of sequences is important for extracting information from music owing to its fundamentally temporal nature. In this paper, we present a distributed model based on the Restricted Boltzmann Machine (RBM) for melodic sequences. The model is similar to a previous successful neural network model for natural language [2]. It is first trained to predict the next pitch in a given pitch sequence, and then extended to also make use of information in sequences of note-durations in monophonic melodies on the same task. In doing so, we also propose an efficient way of representing this additional information that takes advantage of the RBM’s structure. In our evaluation, this RBM-based prediction model performs slightly better than previously evaluated n-gram models in most cases. Results on a corpus of chorale and folk melodies show that it is able to make use of information present in longer contexts more effectively than n-gram models, while scaling linearly in the number of free parameters required.
Comparative analysis of the Portuguese and Brazilian editions of Os Livros que Devoraram o Meu Pai, by Afonso Cruz
In this work we present a comparative analysis of the Portuguese (original) and Brazilian (adapted) editions of the children's and young-adult book Os Livros que Devoraram o Meu Pai, by the Portuguese author Afonso Cruz. The analysis aims to contribute to the optimisation of the editorial processes necessarily involved in text adaptation, but which apply to any kind of editorial process. To this end, we start from sentence-level alignments of the complete work and then produce alignments at the level of the multiword lexical unit or expression using the CLUE-Aligner tool, which records in a database all pairs of paraphrastic units resulting from the alignment task. We focus essentially on the comparison of constructions with adjectival function, and this comparative analysis seeks to determine what kinds of changes were made during the adaptation process. From the study of the contrastive results based on the aligned pairs, most of which correspond to paraphrastic units, we discuss the implications of the linguistic modifications for the constitution of the new text in semantic terms, occasionally analysed also from a literary and/or cultural point of view. As a way of preserving the reception quality of the target text, we argue for an awareness of the limits imposed by a literary text, since the boundary between indispensable adaptation and excessive intervention is a fine one. This study provides a scientific basis for future work in the editing, revision and conversion of literary text to and from any variety of Portuguese.
An RNN-based Music Language Model for Improving Automatic Music Transcription
In this paper, we investigate the use of Music Language Models (MLMs) for improving Automatic Music Transcription (AMT) performance. The MLMs are trained on sequences of symbolic polyphonic music from the Nottingham dataset. We train Recurrent Neural Network (RNN)-based models, as they are capable of capturing the complex temporal structure present in symbolic music data. Similar to the function of language models in automatic speech recognition, we use the MLMs to generate a prior probability for the occurrence of a sequence. The acoustic AMT model is based on probabilistic latent component analysis, and prior information from the MLM is incorporated into the transcription framework using Dirichlet priors. We test our hybrid models on a dataset of multiple-instrument polyphonic music and report a significant 3% improvement in terms of F-measure when compared to using an acoustic-only model.
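At a very high level, the hybrid recipe is to reweight the acoustic model's frame-level pitch probabilities by a prior supplied by the symbolic music language model. The toy sketch below uses a simple linear interpolation as a stand-in for the Dirichlet-prior mechanism described above; the function names, shapes and weighting are illustrative assumptions, not the paper's framework.

```python
# Toy stand-in for combining acoustic pitch probabilities with an MLM prior.
import numpy as np

def rescore(acoustic_probs, mlm_prior, weight=0.3):
    # Blend frame-level acoustic pitch probabilities with the symbolic prior.
    combined = (1.0 - weight) * acoustic_probs + weight * mlm_prior
    return combined / combined.sum()

def decode(acoustic_frames, mlm_prior_fn, weight=0.3):
    # Greedy frame-by-frame decoding: the language model conditions on the
    # pitches chosen so far and reweights the next frame's acoustic output.
    history = []
    for frame in acoustic_frames:
        prior = mlm_prior_fn(history)
        history.append(int(np.argmax(rescore(frame, prior, weight))))
    return history

# Usage with a uniform "language model" and two random acoustic frames (88 pitches).
uniform_prior = lambda history: np.full(88, 1.0 / 88)
frames = np.random.default_rng(0).dirichlet(np.ones(88), size=2)
print(decode(frames, uniform_prior))
```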
Accuracy and interpretability trade-offs in machine learning applied to safer gambling
Responsible gambling is an area of research and industry which seeks to understand the pathways to harm from gambling and to implement programmes that reduce or prevent the harm gambling might cause. There is a growing body of research that has used gambling behavioural data to model and predict harmful gambling, and the industry is showing increasing interest in technologies that can help gambling operators better predict harm and prevent it through appropriate interventions. However, industry surveys and feedback clearly indicate that, in order to enable wider adoption of such data-driven methods, industry and policy makers require a greater understanding of how machine learning methods make these predictions. In this paper, we make use of the TREPAN algorithm for extracting decision trees from neural networks and random forests. We present the first comparative evaluation of predictive performance and tree properties for extracted trees, which is also the first comparative evaluation of knowledge extraction for safer gambling. Results indicate that TREPAN extracts better-performing trees than direct learning of decision trees from the data. Overall, trees extracted with TREPAN from different models offer a good compromise between prediction accuracy and interpretability. TREPAN can produce decision trees with extended tests, i.e. rules of different forms, so that interpretability depends on multiple factors. We present detailed results and a discussion of the trade-offs with regard to performance, interpretability and use in the gambling industry.
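As a much-simplified stand-in for tree extraction of the kind TREPAN performs (TREPAN itself grows the tree with membership queries and m-of-n tests), the sketch below fits an interpretable surrogate tree to the predictions of a trained black-box model and measures its fidelity. The data, features and model settings are invented for illustration only.

```python
# Simplified surrogate-tree extraction: learn from the black box's labels,
# not the raw data. This is not the TREPAN algorithm itself.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))                    # stand-in behavioural features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)      # stand-in harm labels

black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Query the black box on fresh samples and fit a shallow, interpretable tree
# to its predictions rather than to the original labels.
X_query = rng.normal(size=(20000, 10))
y_query = black_box.predict(X_query)
surrogate = DecisionTreeClassifier(max_depth=4).fit(X_query, y_query)

fidelity = (surrogate.predict(X_query) == y_query).mean()
print(f"fidelity to the black-box model: {fidelity:.3f}")
```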