
    How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition

    Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become popular and prevalent, particularly in supervised learning formats, their implementations by the host are highly variable. Without careful planning, a supervised learning competition is vulnerable to overfitting, where the winning solutions are so closely tuned to the particular set of provided data that they cannot generalize to the underlying problem of interest to the host. Based on our experience, this paper outlines important considerations for strategically designing relevant and informative data sets that maximize the learning outcome of hosting a competition. It also describes a post-competition analysis that enables robust and efficient assessment of the strengths and weaknesses of solutions from different competitors, as well as a greater understanding of which regions of the input space are well solved. The post-competition analysis, which complements the leaderboard, uses exploratory data analysis and generalized linear models (GLMs). The GLMs not only expand the range of results we can explore, they also support more detailed analysis of individual sub-questions, including similarities and differences between algorithms across different types of scenarios, universally easy or hard regions of the input space, and different learning objectives. When coupled with a strategically planned data generation approach, these methods provide richer and more informative summaries that enhance the interpretation of results beyond the rankings on the leaderboard. The methods are illustrated with a recently completed competition to evaluate algorithms capable of detecting, identifying, and locating radioactive materials in an urban environment. Comment: 36 pages
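    As a rough illustration of the kind of GLM-based post-competition analysis described above, the sketch below fits a binomial GLM of per-scenario detection success on scenario descriptors and competitor identity. The column names, scenario features, and simulated data are hypothetical stand-ins, not the competition's actual variables.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical post-competition results table: one row per (competitor, scenario)
# pair; the scenario descriptors are illustrative only.
rng = np.random.default_rng(0)
n = 300
source_strength = rng.uniform(0.2, 3.0, size=n)
shielded = rng.integers(0, 2, size=n)
prob = 1.0 / (1.0 + np.exp(-(0.8 * source_strength - 1.2 * shielded - 0.3)))
results = pd.DataFrame({
    "competitor": rng.choice(["A", "B", "C"], size=n),
    "source_strength": source_strength,
    "shielded": shielded,
    "detected": rng.binomial(1, prob),   # 1 = source correctly located in that scenario
})

# Binomial GLM of detection success on scenario descriptors and competitor:
# the coefficients highlight universally hard regions of the input space and
# competitor differences after adjusting for scenario difficulty.
fit = smf.glm(
    "detected ~ C(competitor) + source_strength + C(shielded)",
    data=results,
    family=sm.families.Binomial(),
).fit()
print(fit.summary())
```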

    Cross-modal Recurrent Models for Weight Objective Prediction from Multimodal Time-series Data

    We analyse multimodal time-series data corresponding to weight, sleep and steps measurements. We focus on predicting whether a user will successfully achieve his/her weight objective. For this, we design several deep long short-term memory (LSTM) architectures, including a novel cross-modal LSTM (X-LSTM), and demonstrate their superiority over baseline approaches. The X-LSTM improves parameter efficiency by processing each modality separately and allowing for information flow between them by way of recurrent cross-connections. We present a general hyperparameter optimisation technique for X-LSTMs, which allows us to significantly improve on the LSTM and a prior state-of-the-art cross-modal approach, using a comparable number of parameters. Finally, we visualise the model's predictions, revealing implications about latent variables in this task. Comment: To appear in NIPS ML4H 2017 and NIPS TSW 2017
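    The exact X-LSTM architecture is specified in the paper; as a minimal sketch of the general idea (one recurrent stream per modality, with cross-connections that let information flow between streams), here is a hedged Keras example. The 30-step window, feature dimensions, and layer sizes are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical shapes: 30 daily time steps of weight, sleep and steps features.
T, D_WEIGHT, D_SLEEP, D_STEPS = 30, 1, 4, 1

weight_in = layers.Input((T, D_WEIGHT))
sleep_in = layers.Input((T, D_SLEEP))
steps_in = layers.Input((T, D_STEPS))

# First recurrent layer: one small LSTM per modality (parameter-efficient streams).
w1 = layers.LSTM(16, return_sequences=True)(weight_in)
s1 = layers.LSTM(16, return_sequences=True)(sleep_in)
p1 = layers.LSTM(16, return_sequences=True)(steps_in)

# Cross-connections: each second-layer LSTM also sees the other modalities'
# first-layer outputs, letting information flow between streams.
w2 = layers.LSTM(16)(layers.Concatenate()([w1, s1, p1]))
s2 = layers.LSTM(16)(layers.Concatenate()([s1, w1, p1]))
p2 = layers.LSTM(16)(layers.Concatenate()([p1, w1, s1]))

# Binary prediction: will the user reach his/her weight objective?
out = layers.Dense(1, activation="sigmoid")(layers.Concatenate()([w2, s2, p2]))
model = Model([weight_in, sleep_in, steps_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```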

    Decision Making for Rapid Information Acquisition in the Reconnaissance of Random Fields

    Research into several aspects of robot-enabled reconnaissance of random fields is reported. The work has two major components: the underlying theory of information acquisition in the exploration of unknown fields, and the results of experiments on how humans use sensor-equipped robots to perform a simulated reconnaissance exercise. The theoretical framework extends prior work on robotic exploration by ourselves and others. Several new figures of merit for evaluating exploration strategies are proposed and compared. Using concepts from differential topology and information theory, we develop the theoretical foundation of search strategies aimed at rapid discovery of topological features (locations of critical points and critical level sets) of a priori unknown differentiable random fields. The theory enables the study of efficient reconnaissance strategies in which the tradeoff between speed and accuracy can be understood. The proposed approach to rapid discovery of topological features has led in a natural way to the creation of parsimonious reconnaissance routines that do not rely on any prior knowledge of the environment. The design of topology-guided search protocols uses a mathematical framework that quantifies the relationship between what is discovered and what remains to be discovered. The quantification rests on an information-theory-inspired model whose properties allow us to treat search as a problem in optimal information acquisition. A central theme in this approach is that "conservative" and "aggressive" search strategies can be precisely defined, and search decisions regarding "exploration" vs. "exploitation" choices are informed by the rate at which the information metric is changing. Comment: 34 pages, 20 figures
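    The paper's information-theoretic framework (critical points and level sets of random fields) is far richer than a few lines of code, but the "conservative" vs. "aggressive" trade-off it describes can be caricatured with a greedy rule that scores candidate waypoints by expected information gain per unit of travel. Everything below (the grid belief, the entropy-per-distance rate, and the aggressiveness parameter) is an illustrative assumption, not the paper's formulation.

```python
import numpy as np

def entropy(p):
    """Bernoulli entropy of per-cell beliefs about the unknown field."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def next_waypoint(belief, cells, robot_xy, aggressiveness=1.0):
    """Greedy choice of the cell maximizing information gain per travel cost.
    Larger `aggressiveness` discounts distance, producing bolder moves toward
    unexplored regions; smaller values keep the search local and conservative."""
    gains = entropy(belief[cells[:, 0], cells[:, 1]])
    dists = np.linalg.norm(cells - robot_xy, axis=1) + 1e-6
    return cells[np.argmax(gains / dists ** (1.0 / aggressiveness))]

# Hypothetical 50x50 belief grid: one corner already surveyed, the rest unknown.
belief = np.full((50, 50), 0.5)
belief[:10, :10] = 0.05
cells = np.argwhere(np.ones_like(belief, dtype=bool))
robot = np.array([5.5, 5.5])

print(next_waypoint(belief, cells, robot, aggressiveness=1.0))  # conservative: refines nearby cells
print(next_waypoint(belief, cells, robot, aggressiveness=2.0))  # aggressive: heads for the frontier
```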

    SkILL - a Stochastic Inductive Logic Learner

    Probabilistic Inductive Logic Programming (PILP) is a relatively unexplored area of Statistical Relational Learning which extends classic Inductive Logic Programming (ILP). This work introduces SkILL, a Stochastic Inductive Logic Learner, which takes probabilistically annotated data and produces First Order Logic theories. Data in several domains such as medicine and bioinformatics have an inherent degree of uncertainty that can be used to produce models closer to reality. SkILL can not only use this type of probabilistic data to extract non-trivial knowledge from databases, but it also addresses efficiency issues by introducing a novel, efficient and effective search strategy to guide the search in PILP environments. The capabilities of SkILL are demonstrated on three different datasets: (i) a synthetic toy example used to validate the system, (ii) a probabilistic adaptation of a well-known biological metabolism application, and (iii) a real-world medical dataset in the breast cancer domain. Results show that SkILL can perform as well as a deterministic ILP learner, while also being able to incorporate probabilistic knowledge that would otherwise not be considered.
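    SkILL itself operates over Prolog clauses inside a PILP engine, so the following Python toy is only meant to convey the flavour of the approach: candidate First Order Logic rules are scored against probabilistically annotated examples, and a stochastic search keeps the best-scoring rule. The clauses, coverage sets, example probabilities, and scoring function are all hypothetical, and the search is a stand-in rather than SkILL's actual strategy.

```python
import random

# Hypothetical probabilistically annotated examples: id -> probability of being positive.
examples = {"e1": 0.9, "e2": 0.8, "e3": 0.2, "e4": 0.05}

# Hypothetical candidate clauses with the set of examples each one covers (entails).
candidates = {
    "cancer(X) :- mutation(X, brca1).": {"e1", "e2"},
    "cancer(X) :- age(X, A), A > 40.":  {"e1", "e2", "e3"},
    "cancer(X) :- smoker(X).":          {"e3", "e4"},
}

def prob_accuracy(covered):
    """Reward covering high-probability examples and excluding low-probability ones."""
    return sum(p if e in covered else 1 - p for e, p in examples.items()) / len(examples)

def stochastic_search(clauses, k=2, iters=20, seed=0):
    """Toy stochastic search: repeatedly sample k candidates and keep the best seen."""
    rng = random.Random(seed)
    names = list(clauses)
    best = rng.choice(names)
    for _ in range(iters):
        challenger = max(rng.sample(names, k), key=lambda c: prob_accuracy(clauses[c]))
        if prob_accuracy(clauses[challenger]) > prob_accuracy(clauses[best]):
            best = challenger
    return best, prob_accuracy(clauses[best])

print(stochastic_search(candidates))
```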

    Generative Adversarial Networks for Financial Trading Strategies Fine-Tuning and Combination

    Systematic trading strategies are algorithmic procedures that allocate assets aiming to optimize a certain performance criterion. To obtain an edge in a highly competitive environment, the analyst needs to properly fine-tune their strategy, or discover how to combine weak signals in novel, alpha-creating ways. Both aspects, namely fine-tuning and combination, have been extensively researched using several methods, but emerging techniques such as Generative Adversarial Networks can have an impact on both. Therefore, our work proposes the use of Conditional Generative Adversarial Networks (cGANs) for trading strategy calibration and aggregation. To this end, we provide a full methodology on: (i) the training and selection of a cGAN for time-series data; (ii) how each sample is used for strategy calibration; and (iii) how all generated samples can be used for ensemble modelling. To provide evidence that our approach is well grounded, we designed an experiment with multiple trading strategies, encompassing 579 assets. We compared cGANs with an ensemble scheme and model validation methods, both suited for time series. Our results suggest that cGANs are a suitable alternative for strategy calibration and combination, providing outperformance when traditional techniques fail to generate any alpha.
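    As a minimal sketch of the conditional-GAN building blocks behind such a methodology (not the paper's actual architecture or training procedure), the Keras snippet below builds a generator that produces a synthetic future return path given a conditioning window of past returns, and a discriminator that judges (path, condition) pairs. The window lengths, layer sizes, and noise dimension are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical dimensions: condition on 60 past returns, generate the next 20.
COND_LEN, GEN_LEN, NOISE_DIM = 60, 20, 32

def build_generator():
    noise = layers.Input((NOISE_DIM,))
    cond = layers.Input((COND_LEN,))
    x = layers.Concatenate()([noise, cond])
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dense(128, activation="relu")(x)
    path = layers.Dense(GEN_LEN)(x)               # synthetic future return path
    return Model([noise, cond], path)

def build_discriminator():
    path = layers.Input((GEN_LEN,))
    cond = layers.Input((COND_LEN,))
    x = layers.Concatenate()([path, cond])
    x = layers.Dense(128, activation="relu")(x)
    real = layers.Dense(1, activation="sigmoid")(x)  # real vs. generated, given the condition
    return Model([path, cond], real)

generator, discriminator = build_generator(), build_discriminator()
# After adversarial training, samples drawn from `generator` could be replayed
# through a trading strategy to calibrate its parameters or to build an ensemble of fits.
```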