
    On the smoothness of nonlinear system identification

    We shed new light on the smoothness of optimization problems arising in prediction error parameter estimation of linear and nonlinear systems. We show that for regions of the parameter space where the model is not contractive, the Lipschitz constant and β-smoothness of the objective function might blow up exponentially with the simulation length, making it hard to numerically find minima within those regions or even to escape from them. In addition to providing theoretical understanding of this problem, this paper also proposes the use of multiple shooting as a viable solution. The proposed method minimizes the error between a prediction model and the observed values. Rather than running the prediction model over the entire dataset, multiple shooting splits the data into smaller subsets and runs the prediction model over each subset, making the simulation length a design parameter and making it possible to solve problems that would be infeasible using a standard approach. The equivalence to the original problem is obtained by including constraints in the optimization. The new method is illustrated by estimating the parameters of nonlinear systems with chaotic or unstable behavior, as well as neural networks. We also present a comparative analysis of the proposed method with multi-step-ahead prediction error minimization.
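    A minimal sketch of the multiple-shooting idea described above, in NumPy/SciPy: the data are split into segments, each segment gets its own free initial state, and soft continuity penalties (a stand-in for the paper's equality constraints) tie consecutive segments together. The toy model, segment count, and penalty weight are illustrative assumptions, not the paper's setup.

```python
# Multiple-shooting sketch for a toy scalar model x[k+1] = a*tanh(x[k]) + b*u[k].
# Assumptions for illustration: the model, 10 segments, soft penalty rho = 10.
import numpy as np
from scipy.optimize import least_squares

def simulate(theta, x0, u):
    """Free-run simulation of the toy model over one segment."""
    a, b = theta
    x = np.empty(len(u) + 1)
    x[0] = x0
    for k in range(len(u)):
        x[k + 1] = a * np.tanh(x[k]) + b * u[k]
    return x

def multiple_shooting_residuals(z, u, y, n_segments, rho):
    """Per-segment prediction-error residuals plus continuity residuals.

    z packs the model parameters followed by one free initial state per
    segment; rho weights the soft continuity terms that approximate the
    constraints making the split problem equivalent to single shooting.
    """
    theta, x0s = z[:2], z[2:]
    seg_len = len(u) // n_segments
    res = []
    for i in range(n_segments):
        sl = slice(i * seg_len, (i + 1) * seg_len)
        x = simulate(theta, x0s[i], u[sl])
        res.append(x[1:] - y[sl])                   # fit error on this segment
        if i + 1 < n_segments:                      # end of segment i must
            res.append(rho * (x[-1] - x0s[i + 1]))  # match start of segment i+1
    return np.concatenate([np.atleast_1d(r) for r in res])

# Synthetic data from the "true" system, then estimation from a poor guess.
rng = np.random.default_rng(0)
u = rng.normal(size=200)
y = simulate(np.array([0.9, 0.5]), 0.0, u)[1:] + 0.01 * rng.normal(size=200)

n_seg = 10
z0 = np.concatenate([[0.1, 0.1], np.zeros(n_seg)])  # params + free initial states
sol = least_squares(multiple_shooting_residuals, z0, args=(u, y, n_seg, 10.0))
print("estimated (a, b):", sol.x[:2])
```

    Because each simulation only runs for one segment, the effective simulation length (and hence the potential gradient blow-up) is controlled by the number of segments, which is the design parameter the abstract refers to.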

    On Merging Feature Engineering and Deep Learning for Diagnosis, Risk-Prediction and Age Estimation Based on the 12-Lead ECG

    Objective: Machine learning techniques have been used extensively for 12-lead electrocardiogram (ECG) analysis. For physiological time series, the superiority of deep learning (DL) over feature engineering (FE) approaches based on domain knowledge is still an open question. Moreover, it remains unclear whether combining DL with FE may improve performance. Methods: We considered three tasks intended to address these research gaps: cardiac arrhythmia diagnosis (multiclass-multilabel classification), atrial fibrillation risk prediction (binary classification), and age estimation (regression). We used an overall dataset of 2.3M 12-lead ECG recordings to train the following models for each task: i) a random forest taking the FE as input, as a classical machine learning approach; ii) an end-to-end DL model; and iii) a merged FE+DL model. Results: For the two classification tasks, FE yielded results comparable to DL while requiring significantly less data; FE was outperformed by DL for the regression task. For all tasks, merging FE with DL did not improve performance over DL alone. Conclusion: We found that for traditional 12-lead ECG-based diagnosis tasks DL did not yield a meaningful improvement over FE, while it significantly improved performance on the nontraditional regression task. We also found that combining FE with DL did not improve over DL alone, which suggests that the engineered features were redundant with the features learned by DL. Significance: Our findings provide important recommendations on which machine learning strategy and data regime to choose with respect to the task at hand for the development of new machine learning models based on the 12-lead ECG.
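    A hypothetical sketch of what a merged FE+DL model can look like, in PyTorch: a 1-D CNN encodes the raw 12-lead signal and its embedding is concatenated with the engineered-feature vector before a linear head. The layer sizes, names, and merge-by-concatenation choice are assumptions for illustration; the abstract does not specify the paper's architecture.

```python
# Merged FE+DL sketch: concatenate a CNN embedding of the raw ECG with
# engineered features. All dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

class MergedECGModel(nn.Module):
    def __init__(self, n_features: int, n_outputs: int):
        super().__init__()
        # DL branch: encode the raw (batch, 12 leads, time) ECG signal.
        self.encoder = nn.Sequential(
            nn.Conv1d(12, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),      # -> (batch, 64)
        )
        # Head sees the DL embedding and the FE vector side by side.
        self.head = nn.Linear(64 + n_features, n_outputs)

    def forward(self, ecg: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([self.encoder(ecg), features], dim=1)
        return self.head(merged)

model = MergedECGModel(n_features=20, n_outputs=1)   # e.g. an AF-risk logit
out = model(torch.randn(4, 12, 5000), torch.randn(4, 20))
print(out.shape)  # torch.Size([4, 1])
```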

    Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness

    The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle, while powerful, might need some refinement to explain recent developments. We refine the concept of exploding gradients by reformulating the problem in terms of the cost function smoothness, which gives insight into higher-order derivatives and the existence of regions with many close local minima. We also clarify the distinction between vanishing gradients and the need for the RNN to learn attractors to fully use its expressive power. Through the lens of these refinements, we shed new light on recent developments in the RNN field, namely stable RNNs and unitary (or orthogonal) RNNs.
    Comment: To appear in the Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020, PMLR Volume 108. This paper was previously titled "The trade-off between long-term memory and smoothness for recurrent networks"; the current version subsumes all previous versions.
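    A toy illustration of the smoothness argument above, assuming a scalar linear recurrence x[k+1] = w·x[k] + u[k] (an assumption for illustration, not the paper's model): the gradient of a free-run loss with respect to w contains powers of w up to the horizon length, so it stays bounded in the contractive regime |w| < 1 and grows exponentially with the horizon otherwise, and the same holds for higher-order derivatives, i.e. the cost loses smoothness.

```python
# Gradient of a free-run simulation loss for x[k+1] = w*x[k] + u[k],
# computed by forward sensitivity analysis. Illustrative toy example.
import numpy as np

def loss_grad(w, u, y):
    """Gradient of 0.5 * sum((x[k+1] - y[k])**2) with respect to w."""
    x, dx, g = 0.0, 0.0, 0.0
    for k in range(len(u)):
        dx = x + w * dx        # d x[k+1] / d w (chain rule through the recurrence)
        x = w * x + u[k]       # advance the state
        g += (x - y[k]) * dx   # accumulate the loss gradient
    return g

rng = np.random.default_rng(0)
for T in (10, 50, 100):
    u, y = rng.normal(size=T), np.zeros(T)
    print(f"T={T}: |grad| at w=0.9 -> {abs(loss_grad(0.9, u, y)):.3g}, "
          f"at w=1.1 -> {abs(loss_grad(1.1, u, y)):.3g}")
# w = 0.9 (contractive): the gradient stays bounded as T grows.
# w = 1.1 (non-contractive): the gradient explodes with T.
```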