3,058 research outputs found

    Least third-order cumulant method with adaptive regularization parameter selection for neural networks

    Get PDF
    This paper introduces an interesting property of the least third-order cumulant objective function: the solution is optimal when the gradients of the mean squared error and the third-order cumulant error are both zero vectors. The optimal solutions are independent of the value of the regularization parameter λ. An adaptive regularization parameter selection method is also derived to control the convergence of the mean squared error and cumulant error terms. The proposed selection method is able to tunnel through sub-optimal solutions, whose locations are controllable, by changing the value of the regularization parameter. Consequently, the least third-order cumulant method with the adaptive regularization parameter selection method is theoretically capable of estimating an optimal solution when applied to regression problems.
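
    The idea can be illustrated with a small numerical sketch. The code below is an assumption-laden toy, not the paper's algorithm: it minimizes a composite objective of mean squared error plus λ times a third-order-cumulant penalty on the residuals, and uses a simple illustrative heuristic to adapt λ so that both gradient terms are driven toward zero vectors; the function names and the adaptive rule are hypothetical.

```python
# Toy sketch (not the paper's algorithm): gradient descent on a composite
# objective J(w) = E_mse(w) + lam * E_toc(w), where E_toc is the squared
# third-order cumulant of the residuals. The adaptive rule for lam is an
# illustrative heuristic only.
import numpy as np

def residuals(w, X, y):
    return X @ w - y

def grad_mse(w, X, y):
    r = residuals(w, X, y)
    return 2.0 * X.T @ r / len(y)

def grad_toc(w, X, y):
    # gradient of the squared third-order cumulant of the residuals
    r = residuals(w, X, y)
    c3 = np.mean(r ** 3)                      # third-order cumulant (zero-mean assumption)
    return 2.0 * c3 * 3.0 * X.T @ (r ** 2) / len(y)

def fit(X, y, lam=1.0, lr=1e-2, steps=5000):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        g_mse, g_toc = grad_mse(w, X, y), grad_toc(w, X, y)
        # illustrative adaptive rule: rescale lam by the ratio of gradient norms
        # so that neither error term dominates as both are driven toward zero
        lam = np.clip(lam * np.linalg.norm(g_toc) / (np.linalg.norm(g_mse) + 1e-12), 1e-4, 1e4)
        w -= lr * (g_mse + lam * g_toc)
    return w
```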

    Rails Quality Data Modelling via Machine Learning-Based Paradigms

    Get PDF

    Application of Sparse Identification of Nonlinear Dynamics for Physics-Informed Learning

    Get PDF
    Advances in machine learning and deep neural networks have enabled complex engineering tasks such as image recognition, anomaly detection, regression, and multi-objective optimization, to name but a few. The complexity of the algorithm architecture, e.g., the number of hidden layers in a deep neural network, typically grows with the complexity of the problems it is required to solve, leaving little room for interpreting (or explaining) the path that leads to a specific solution. This drawback is particularly relevant for autonomous aerospace and aviation systems, where certification requires a complete understanding of the algorithm's behavior in all possible scenarios. Including physics knowledge in such data-driven tools may improve the interpretability of the algorithms, thus enhancing model validation against events with low probability but high relevance for system certification. Such events include, for example, spacecraft or aircraft sub-system failures, for which data may not be available in the training phase. This paper investigates a recent physics-informed learning algorithm for the identification of system dynamics and shows how the governing equations of a system can be extracted from data using sparse regression. The learned relationships can be used as a surrogate model which, unlike typical data-driven surrogate models, relies on the learned underlying dynamics of the system rather than a large number of fitting parameters. The work shows that the algorithm can reconstruct the differential equations underlying the observed dynamics from a single trajectory when no uncertainty is involved. However, the training set size must increase when dealing with stochastic systems, e.g., nonlinear dynamics with random initial conditions.
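
    For context, sparse identification of nonlinear dynamics (SINDy) fits a sparse coefficient matrix Ξ so that Ẋ ≈ Θ(X)Ξ for a library of candidate functions Θ. The sketch below is a minimal, generic version using a polynomial library and sequentially thresholded least squares; it is not the paper's implementation, and the threshold and library choices are assumptions.

```python
# Generic SINDy-style sketch: polynomial candidate library plus
# sequentially thresholded least squares.
import numpy as np

def candidate_library(X):
    """Polynomial features up to second order for a state matrix X (n_samples x n_states)."""
    n, d = X.shape
    cols = [np.ones((n, 1)), X]
    cols += [(X[:, i] * X[:, j]).reshape(-1, 1) for i in range(d) for j in range(i, d)]
    return np.hstack(cols)

def sindy(X, Xdot, threshold=0.1, iters=10):
    """Sparse regression Xdot ≈ Theta(X) @ Xi via sequentially thresholded least squares."""
    Theta = candidate_library(X)
    Xi, *_ = np.linalg.lstsq(Theta, Xdot, rcond=None)
    for _ in range(iters):
        small = np.abs(Xi) < threshold          # prune small coefficients
        Xi[small] = 0.0
        for k in range(Xdot.shape[1]):          # refit the surviving terms per state
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], Xdot[:, k], rcond=None)
    return Xi

# In practice Xdot would come from finite differences of a measured trajectory;
# the nonzero rows of Xi then indicate which library terms enter each equation.
```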

    Dynamic learning with neural networks and support vector machines

    Get PDF
    The neural network approach has proven to be a universal approximator for nonlinear continuous functions with arbitrary accuracy and has been very successful for various learning and prediction tasks. However, supervised learning using neural networks has some limitations because of the black-box nature of their solutions, experimental network parameter selection, the danger of overfitting, and convergence to local minima instead of global minima. In certain applications, fixed neural network structures do not address the effect on prediction performance as the amount of available data increases. Three new approaches are proposed to address these limitations of supervised learning using neural networks and to improve prediction accuracy. (1) Dynamic learning model using an evolutionary connectionist approach. In certain applications, the amount of available data increases over time. The optimization process determines the number of input neurons and the number of neurons in the hidden layer. The corresponding globally optimized neural network structure is iteratively and dynamically reconfigured and updated as new data arrive to improve prediction accuracy. (2) Improving generalization capability using a recurrent neural network and Bayesian regularization. A recurrent neural network has the inherent capability of developing an internal memory, which may naturally extend beyond the externally provided lag space. Moreover, by adding a penalty term on the sum of connection weights, the Bayesian regularization approach is applied to the network training scheme to improve generalization performance and lower the susceptibility to overfitting. (3) Adaptive prediction model using support vector machines. The learning process of support vector machines focuses on minimizing an upper bound of the generalization error that comprises the sum of the empirical training error and a regularized confidence interval, which eventually results in better generalization performance. Further, this learning process is iteratively and dynamically updated after every occurrence of new data in order to capture the most current features hidden inside the data sequence. All the proposed approaches have been successfully applied and validated on applications related to software reliability prediction and electric power load forecasting. Quantitative results show that the proposed approaches achieve better prediction accuracy than existing approaches.
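
    As a concrete illustration of the third approach, the sketch below refits a support vector regressor on a sliding window after every new observation so that the model tracks the most recent data. The window size, lag order, kernel, and the use of scikit-learn's SVR are assumptions for illustration, not the dissertation's exact scheme.

```python
# Illustrative adaptive one-step-ahead forecaster: retrain on a sliding
# window each time a new observation arrives.
import numpy as np
from sklearn.svm import SVR

def lag_matrix(series, lags):
    """Turn a 1-D array into (X, y) pairs using the previous `lags` values as features."""
    X = np.array([series[i - lags:i] for i in range(lags, len(series))])
    y = np.array(series[lags:])
    return X, y

def adaptive_forecast(series, lags=4, window=100):
    """Refit the model after every new observation and predict one step ahead."""
    series = np.asarray(series, dtype=float)
    preds = []
    for t in range(window, len(series)):
        X, y = lag_matrix(series[t - window:t], lags)
        model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)
        preds.append(model.predict(series[t - lags:t].reshape(1, -1))[0])
    return np.array(preds)
```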

    G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima

    Full text link
    Deep neural networks (DNNs) have demonstrated promising results in various complex tasks. However, current DNNs encounter challenges with over-parameterization, especially when there is limited training data available. To enhance the generalization capability of DNNs, the Mixup technique has gained popularity. Nevertheless, it still produces suboptimal outcomes. Inspired by the successful Sharpness-Aware Minimization (SAM) approach, which establishes a connection between the sharpness of the training loss landscape and model generalization, we propose a new learning framework called Generalized-Mixup (G-Mix), which combines the strengths of Mixup and SAM for training DNN models. The theoretical analysis provided demonstrates how the developed G-Mix framework enhances generalization. Additionally, to further optimize DNN performance with the G-Mix framework, we introduce two novel algorithms: Binary G-Mix and Decomposed G-Mix. These algorithms partition the training data into two subsets based on the sharpness-sensitivity of each example to address the issue of "manifold intrusion" in Mixup. Both theoretical explanations and experimental results reveal that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models, achieving state-of-the-art performance. Comment: 19 pages, 23 figures
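
    A rough sketch of how Mixup can be combined with a SAM-style two-step update, in the spirit of G-Mix, is given below. The hyperparameters and the BG-Mix/DG-Mix partitioning are not reproduced; this is an illustrative composite, not the authors' released code.

```python
# Illustrative PyTorch training step: Mixup on the inputs, then a SAM-style
# ascent/descent update on the weights.
import torch
import torch.nn.functional as F

def mixup(x, y, alpha=0.2):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], y, y[perm], lam

def gmix_step(model, optimizer, x, y, rho=0.05, alpha=0.2):
    x_mix, y_a, y_b, lam = mixup(x, y, alpha)

    # 1) ascent step: perturb weights toward higher loss (sharpness direction)
    loss = lam * F.cross_entropy(model(x_mix), y_a) + (1 - lam) * F.cross_entropy(model(x_mix), y_b)
    loss.backward()
    eps = []
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack([p.grad.norm() for p in model.parameters() if p.grad is not None]))
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append((p, e))
    optimizer.zero_grad()

    # 2) descent step: gradient at the perturbed weights, applied to the originals
    loss = lam * F.cross_entropy(model(x_mix), y_a) + (1 - lam) * F.cross_entropy(model(x_mix), y_b)
    loss.backward()
    with torch.no_grad():
        for p, e in eps:
            p.sub_(e)            # restore the original weights before stepping
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```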

    AdaER: An Adaptive Experience Replay Approach for Continual Lifelong Learning

    Full text link
    Continual lifelong learning is a machine learning framework inspired by human learning, where learners are trained to continuously acquire new knowledge in a sequential manner. However, the non-stationary nature of streaming training data poses a significant challenge known as catastrophic forgetting, which refers to the rapid forgetting of previously learned knowledge when new tasks are introduced. While some approaches, such as experience replay (ER), have been proposed to mitigate this issue, their performance remains limited, particularly in the class-incremental scenario, which is considered natural and highly challenging. In this paper, we present a novel algorithm, called adaptive experience replay (AdaER), to address the challenge of continual lifelong learning. AdaER consists of two stages: memory replay and memory update. In the memory replay stage, AdaER introduces a contextually-cued memory recall (C-CMR) strategy, which selectively replays memories that are most conflicting with the current input data in terms of both data and task. Additionally, AdaER incorporates an entropy-balanced reservoir sampling (E-BRS) strategy to enhance the performance of the memory buffer by maximizing information entropy. To evaluate the effectiveness of AdaER, we conduct experiments on established supervised continual lifelong learning benchmarks, specifically focusing on class-incremental learning scenarios. The results demonstrate that AdaER outperforms existing continual lifelong learning baselines, highlighting its efficacy in mitigating catastrophic forgetting and improving learning performance. Comment: 18 pages, 26 figures
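
    The buffer-update idea can be sketched as a reservoir that biases evictions toward the buffer's current majority class, pushing its label distribution toward maximum entropy. This is a simplified stand-in for the E-BRS strategy (and omits the C-CMR replay stage entirely); the class and method names are hypothetical.

```python
# Simplified entropy-balancing reservoir buffer for replay-based continual learning.
import random
from collections import Counter

class EntropyBalancedReservoir:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []          # list of (x, y) pairs
        self.seen = 0

    def add(self, x, y):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((x, y))
            return
        # classic reservoir acceptance probability
        if random.random() < self.capacity / self.seen:
            # evict a sample from the currently most frequent class, which
            # pushes the buffer's label distribution toward maximum entropy
            counts = Counter(label for _, label in self.buffer)
            majority = counts.most_common(1)[0][0]
            idx = random.choice([i for i, (_, label) in enumerate(self.buffer) if label == majority])
            self.buffer[idx] = (x, y)

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))
```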

    A Magnetorheological Damper with Embedded Piezoelectric Force Sensor: Experiment and Modeling

    Get PDF
    This chapter first describes the configuration, fabrication, calibration, and performance tests of the devised self-sensing MR damper. Then, a black-box identification approach for modeling the forward and inverse dynamics of the self-sensing MR damper is presented; it is developed by synthesizing a NARX model and a neural network within a Bayesian inference framework to enhance generalization capability.
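
    A NARX-style neural model maps lagged excitations and responses to the current response. The sketch below is a generic forward-dynamics version using scikit-learn's MLPRegressor as a stand-in for the chapter's network and Bayesian training; the lag orders and layer size are assumptions.

```python
# Generic NARX-style forward model: past excitations u and forces y -> current force.
import numpy as np
from sklearn.neural_network import MLPRegressor

def narx_regressors(u, y, nu=3, ny=3):
    """Feature vectors [u(t-1..t-nu), y(t-1..t-ny)] paired with target y(t)."""
    start = max(nu, ny)
    X = np.array([np.r_[u[t - nu:t], y[t - ny:t]] for t in range(start, len(y))])
    return X, y[start:]

def fit_forward_model(u, y):
    # an inverse model would swap the roles of u and y in the same construction
    X, target = narx_regressors(np.asarray(u, dtype=float), np.asarray(y, dtype=float))
    return MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000).fit(X, target)
```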