1,088 research outputs found

Evolutionary estimation of a Coupled Markov Chain credit risk model

Author: A. Brabazon
A.J. McNeil
D. Duffie
H.M. Markowitz
J. Kennedy
J. Zhang
P.J. Schönbucher
R. Hochreiter
R. Merton
S. Hager
Y.M. Kaniovski
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

There exists a range of different models for estimating and simulating credit risk transitions to optimally manage credit risk portfolios and products. In this chapter we present a Coupled Markov Chain approach to model rating transitions and thereby default probabilities of companies. As the likelihood of the model turns out to be a non-convex function of the parameters to be estimated, we apply heuristics to find the ML estimators. To this end, we outline the model and its likelihood function, and present both a Particle Swarm Optimization algorithm and an Evolutionary Optimization algorithm to maximize the likelihood function. Numerical results are shown which suggest a further application of evolutionary optimization techniques for credit risk management.
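
The abstract names Particle Swarm Optimization as one heuristic for maximizing a non-convex likelihood. A minimal sketch of a basic global-best PSO might look like the following. The objective `toy_log_likelihood`, the coefficient values, and all function names are hypothetical stand-ins chosen for illustration; they are not the chapter's Coupled Markov Chain likelihood or its settings.

```python
import math
import random
from typing import Callable, List, Tuple


def particle_swarm_maximize(
    objective: Callable[[List[float]], float],
    bounds: List[Tuple[float, float]],
    n_particles: int = 20,
    n_iterations: int = 100,
    inertia: float = 0.7,
    cognitive: float = 1.5,
    social: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    """Maximize `objective` inside the box `bounds` with a basic global-best PSO.

    Returns the best position, its objective value, and the best value
    recorded at the start and after every iteration.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_value = [objective(p) for p in positions]
    best_index = max(range(n_particles), key=lambda i: personal_value[i])
    global_best = personal_best[best_index][:]
    global_value = personal_value[best_index]
    history = [global_value]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle toward its own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Clamp to the box so parameters stay feasible.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value > personal_value[i]:
                personal_best[i] = positions[i][:]
                personal_value[i] = value
                if value > global_value:
                    global_best = positions[i][:]
                    global_value = value
        history.append(global_value)
    return global_best, global_value, history


def toy_log_likelihood(theta: List[float]) -> float:
    """Hypothetical multimodal stand-in for a non-convex log-likelihood."""
    x, y = theta
    return -(x - 1.0) ** 2 - (y + 0.5) ** 2 + 0.5 * math.cos(5.0 * x) * math.cos(5.0 * y)


search_bounds = [(-3.0, 3.0), (-3.0, 3.0)]
best_theta, best_value, history = particle_swarm_maximize(toy_log_likelihood, search_bounds)
```

Because the swarm only ever replaces its best point with a strictly better one, `history` is non-decreasing. A heuristic like this offers no guarantee of reaching the global maximum, which is why it is usually run from several seeds.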

arXiv.org e-Print Archive

CiteSeerX

Crossref

How did the discussion go: Discourse act classification in social media conversations

Author: B O’Connor
J Bollen
K Scott
Mirko Lai
ML Larson
S Bhatia
S Hochreiter
Subhabrata Dutta
T Chakraborty
V Eisenlauer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/08/2018

We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussion using textual meanings beyond sentence level. What makes the task unique is the complete categorization of possible pragmatic roles in informal textual discussions, in contrast to extraction of question-answers, stance detection or sarcasm identification, which are very much role-specific tasks. An early attempt was made on a Reddit discussion dataset. We train our model on the same data, and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with word relevance attention mechanism achieved F1-scores of 71% and 66% respectively in predicting discourse roles of comments in Reddit and Facebook discussions. The efficiency of recurrent and convolutional architectures in learning discursive representations on the same task is presented and analyzed, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features like location, time, leaning of information etc. play their roles in characterizing online discussions on Facebook.

arXiv.org e-Print Archive

Crossref

Learning Temporal Transformations From Time-Lapse Videos

Author: GE Hinton
J Yuen
KM Kitani
R Martin-Brualla
S Hochreiter
Y Shih
Publication date: 27/08/2016

Based on life-long observations of physical, chemical, and biologic phenomena in the natural world, humans can often easily picture in their minds what an object will look like in the future. But what about computers? In this paper, we learn computational models of object transformations from time-lapse videos. In particular, we explore the use of generative models to create depictions of objects at future times. These models explore several different prediction tasks: generating a future state given a single depiction of an object, generating a future state given two depictions of an object at different times, and generating future states recursively in a recurrent framework. We provide both qualitative and quantitative evaluations of the generated results, and also conduct a human evaluation to compare variations of our models. Comment: ECCV201

arXiv.org e-Print Archive

Crossref

Evolutionary multi-stage financial scenario tree generation

Author: A. Brabazon
A. Eichhorn
C. Blum
G.C. Pflug
H. Heitsch
H.M. Markowitz
J. Dang
J. Dupačová
K. Høyland
M. Koivu
M.C. Steinbach
P. Artzner
R. Hochreiter
S.T. Rachev
T. Pennanen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Multi-stage financial decision optimization under uncertainty depends on a careful numerical approximation of the underlying stochastic process, which describes the future returns of the selected assets or asset categories. Various approaches towards an optimal generation of discrete-time, discrete-state approximations (represented as scenario trees) have been suggested in the literature. In this paper, a new evolutionary algorithm to create scenario trees for multi-stage financial optimization models is presented. Numerical results and implementation details conclude the paper.

arXiv.org e-Print Archive

Crossref

Identifying cross country skiing techniques using power meters in ski poles

Author: A Graves
F Marshland
J Jang
J Nilsson
O Rindal
S Hochreiter
T Stöggl
Y Sakurai
Publication date: 01/01/2019

Power meters are becoming a widely used tool for measuring training and racing effort in cycling, and are now also spreading to other sports. This means that increasing volumes of data can be collected from athletes, with the aim of helping coaches and athletes analyse and understand training load, racing efforts, technique etc. In this project, we have collaborated with Skisens AB, a company producing handles for cross country ski poles equipped with power meters. We have conducted a pilot study in the use of machine learning techniques on data from Skisens poles to identify which "gear" a skier is using (double poling or gears 2-4 in skating), based only on the sensor data from the ski poles. The dataset for this pilot study contained labelled time-series data from three individual skiers using four different gears recorded in varied locations and varied terrain. We systematically evaluated a number of machine learning techniques based on neural networks, with the best results obtained by an LSTM network (95% of strokes correctly classified) when a subset of data from all three skiers was used for training. As expected, accuracy dropped to 78% when the model was trained on data from only two skiers and tested on the third. To achieve better generalisation to individuals not appearing in the training set, more data is required, which is ongoing work. Comment: Presented at the Norwegian Artificial Intelligence Symposium 201

arXiv.org e-Print Archive

Crossref

Chalmers Research

Improving Search through A3C Reinforcement Learning based Conversational Agent

Author: EL Deci
G Shani
H Cuayáhuitl
J Wei
JS Bridle
RS Sutton
S Hochreiter
Publication date: 19/08/2018

We develop a reinforcement-learning-based search assistant which can assist users through a set of actions and a sequence of interactions to enable them to realize their intent. Our approach caters to subjective search, where the user is seeking digital assets such as images, which is fundamentally different from tasks that have objective and limited search modalities. Labeled conversational data is generally not available in such search tasks, and training the agent through human interactions can be time-consuming. We propose a stochastic virtual user which impersonates a real user and can be used to sample user behavior efficiently to train the agent, which accelerates the bootstrapping of the agent. We develop an A3C-based, context-preserving architecture which enables the agent to provide contextual assistance to the user. We compare the A3C agent with Q-learning and evaluate its performance on the average rewards and state values it obtains with the virtual user in validation episodes. Our experiments show that the agent learns to achieve higher rewards and better states. Comment: 17 pages, 7 figure

arXiv.org e-Print Archive

Crossref

Deep Tree Transductions - A Short Survey

Author: C Gallicchio
D Bacciu
J Clarke
M Diligenti
P Frasconi
S Hochreiter
T Cohn
Publication date: 01/01/2019

The paper surveys recent extensions of Long Short-Term Memory networks to handle tree structures from the perspective of learning non-trivial forms of isomorph structured transductions. It provides a discussion of modern TreeLSTM models, showing the effect of the bias induced by the direction of tree processing. An empirical analysis is performed on real-world benchmarks, highlighting that no single model is adequate to effectively approach all transduction problems. Comment: To appear in the Proceedings of the 2019 INNS Big Data and Deep Learning (INNSBDDL 2019). arXiv admin note: text overlap with arXiv:1809.0909

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Comparison of System Call Representations for Intrusion Detection

Author: A Sharma
AP Kosoresow
E Eskin
FA Gers
G Creech
H He
J McHugh
N Srivastava
S Hochreiter
SA Hofmeyr
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/05/2019

Over the years, artificial neural networks have been applied successfully in many areas including IT security. Yet, neural networks can only process continuous input data. This is particularly challenging for security-related non-continuous data like system calls. This work focuses on four different options to preprocess sequences of system calls so that they can be processed by neural networks. These input options are based on one-hot encoding and learning word2vec or GloVe representations of system calls. As an additional option, we analyze whether the mapping of system calls to their respective kernel modules is an adequate generalization step for (a) replacing system calls or (b) enhancing system call data with additional information regarding their context. However, when performing such preprocessing steps it is important to ensure that no relevant information is lost during the process. The overall objective of system call based intrusion detection is to categorize sequences of system calls as benign or malicious behavior. Therefore, this scenario is used to evaluate the different input options as a classification task. The results show that each of the four different methods is a valid option when preprocessing input data, but the use of kernel modules only is not recommended because too much information is lost during the mapping process. Comment: 12 pages, 1 figure, submitted to CISIS 201
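
Of the input options listed in the abstract, one-hot encoding is the simplest to illustrate. The sketch below is only a plain-Python illustration of the idea, not the authors' preprocessing pipeline; the helper names and the two example traces are hypothetical.

```python
from typing import Dict, List


def build_vocabulary(traces: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct system call name a stable integer index."""
    names = sorted({call for trace in traces for call in trace})
    return {name: index for index, name in enumerate(names)}


def one_hot_encode(trace: List[str], vocab: Dict[str, int]) -> List[List[float]]:
    """Turn one trace into a list of one-hot vectors, one vector per call."""
    encoded = []
    for call in trace:
        vector = [0.0] * len(vocab)
        vector[vocab[call]] = 1.0  # raises KeyError for calls missing from vocab
        encoded.append(vector)
    return encoded


# Hypothetical traces; the names are ordinary Linux system calls.
example_traces = [["open", "read", "close"], ["open", "write", "close"]]
vocab = build_vocabulary(example_traces)
encoded_first = one_hot_encode(example_traces[0], vocab)
```

Each vector is as long as the vocabulary, so the representation grows with the number of distinct system calls. Learned embeddings such as word2vec or GloVe, the other options the abstract mentions, produce shorter dense vectors instead.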

arXiv.org e-Print Archive

Crossref

RePAD: Real-time Proactive Anomaly Detection for Time Series

Author: J Wu
J Xu
J-C Lin
M-C Lee
N Zou
RC Staudemeyer
S Hochreiter
WD Fisher
X Ma
Z Zhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/10/2021

During the past decade, many anomaly detection approaches have been introduced in different fields such as network monitoring, fraud detection, and intrusion detection. However, they require an understanding of data patterns and often need a long off-line period to build a model or network for the target data. Providing real-time and proactive anomaly detection for streaming time series without human intervention and domain knowledge is highly valuable, since it greatly reduces human effort and enables appropriate countermeasures to be undertaken before disastrous damage, failure, or another harmful event occurs. However, this issue has not been well studied yet. To address it, this paper proposes RePAD, which is a Real-time Proactive Anomaly Detection algorithm for streaming time series based on Long Short-Term Memory (LSTM). RePAD utilizes short-term historic data points to predict and determine whether or not the upcoming data point is a sign that an anomaly is likely to happen in the near future. By dynamically adjusting the detection threshold over time, RePAD is able to tolerate minor pattern changes in time series and detect anomalies either proactively or on time. Experiments based on two time series datasets collected from the Numenta Anomaly Benchmark demonstrate that RePAD is able to proactively detect anomalies and provide early warnings in real time without human intervention and domain knowledge. Comment: 12 pages, 8 figures, the 34th International Conference on Advanced Information Networking and Applications (AINA 2020
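
The idea of comparing a prediction error against a threshold that adapts over time can be sketched in a few lines. This is not the RePAD algorithm. As loudly labeled assumptions, the sketch replaces the LSTM predictor with a moving average of recent points and uses a generic mean-plus-three-standard-deviations rule for the threshold. The function name, parameters, and readings are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def detect_anomalies(series: List[float], window: int = 3, warmup: int = 4) -> List[int]:
    """Return indices whose prediction error exceeds an adaptive threshold.

    Assumptions (not taken from the paper): the predictor is the mean of the
    previous `window` points, and the threshold is mean + 3 * std of all
    prediction errors observed so far.
    """
    errors: List[float] = []
    flagged: List[int] = []
    for t in range(window, len(series)):
        predicted = mean(series[t - window:t])
        error = abs(series[t] - predicted)
        # Only start judging once enough errors exist to estimate a spread.
        if len(errors) >= warmup:
            threshold = mean(errors) + 3 * pstdev(errors)
            if error > threshold:
                flagged.append(t)
        # Every error feeds the threshold, so it keeps adapting over time.
        errors.append(error)
    return flagged


# Hypothetical sensor readings with a single spike at index 10.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0, 9.9, 10.2, 25.0, 10.1]
flagged = detect_anomalies(readings)
```

Because the spike's own error is added to the history, the threshold widens afterwards and the recovery point at index 11 is not flagged. A design that excluded flagged errors from the history would keep the threshold tight but would also flag that recovery point.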

arXiv.org e-Print Archive

Crossref

Regularized Neural User Model for Goal-Oriented Spoken Dialogue Systems

Author: E Levin
J Schatzmann
O Pietquin
S Chandramohan
S Hochreiter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

User simulation is widely used to generate artificial dialogues in order to train statistical spoken dialogue systems and perform evaluations. This paper presents a neural network approach for user modeling that exploits an encoder-decoder bidirectional architecture with a regularization layer for each dialogue act. In order to minimize the impact of data sparsity, the dialogue act space is compressed according to the user goal. Experiments on the Dialogue State Tracking Challenge 2 (DSTC2) dataset provide significant results at dialogue act and slot level predictions, outperforming previous neural user modeling approaches in terms of F1 score. Spanish Minister of Science under grants TIN2014-54288-C4-4-R and TIN2017-85854-C4-3-R and by the EU H2020 EMPATHIC project grant number 769872

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación