283 research outputs found

Adaptive Normalized Risk-Averting Training For Deep Neural Networks

Author: Lo James
Oates Tim
Wang Zhiguang
Publication date: 02/03/2016

This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness on global and local convexity lower-bounded by the standard $L_p$-norm error. By analyzing the gradient on the convexity index $\lambda$, we explain why learning $\lambda$ adaptively using gradient descent works. In practice, we show how this method improves training of deep neural networks to solve visual recognition tasks on the MNIST and CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptrons and Denoised Auto-encoders is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other specific tricks in DNNs. Other than unsupervised pretraining, it provides a new perspective to address the non-convex optimization problem in DNNs. Comment: AAAI 2016, 0.39%~0.4% ER on MNIST with single 32-32-256-10 ConvNets, code available at https://github.com/cauchyturing/ANRA

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Adopting Robustness and Optimality in Fitting and Learning

Author: Lo James
Oates Tim
Wang Zhiguang
Publication date: 18/10/2023

We generalized a modified exponentialized estimator by pushing the robust-optimal (RO) index $\lambda$ to $-\infty$ for achieving robustness to outliers by optimizing a quasi-Minimin function. The robustness is realized and controlled adaptively by the RO index without any predefined threshold. Optimality is guaranteed by expansion of the convexity region in the Hessian matrix to largely avoid local optima. Detailed quantitative analyses of both robustness and optimality are provided. The results of the proposed experiments on fitting tasks for three noisy non-convex functions and the digits recognition task on the MNIST dataset consolidate the conclusions. Comment: arXiv admin note: text overlap with arXiv:1506.0269

arXiv.org e-Print Archive

inTformer: A Time-Embedded Attention-Based Transformer for Crash Likelihood Prediction at Intersections Using Connected Vehicle Data

Author: Abdel-Aty Mohamed
Anik B. M. Tazbiul Hassan
Islam Zubayer
Publication date: 07/07/2023

The real-time crash likelihood prediction model is an essential component of the proactive traffic safety management system. Over the years, numerous studies have attempted to construct a crash likelihood prediction model in order to enhance traffic safety, but mostly on freeways. In the majority of the existing studies, researchers have primarily employed a deep learning-based framework to identify crash potential. Lately, Transformer has emerged as a potential deep neural network that fundamentally operates through attention-based mechanisms. Transformer has several functional benefits over extant deep learning models such as Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), etc. First, Transformer can readily handle long-term dependencies in a data sequence. Second, Transformer can process all elements in a data sequence in parallel during training. Finally, Transformer does not have the vanishing gradient issue. Realizing the immense possibility of Transformer, this paper proposes inTersection-Transformer (inTformer), a time-embedded attention-based Transformer model that can effectively predict intersection crash likelihood in real-time. The proposed model was evaluated using connected vehicle data extracted from INRIX's Signal Analytics Platform. The data was formatted in parallel and stacked at different timesteps to develop nine inTformer models. The best inTformer model achieved a sensitivity of 73%. This model was also compared to earlier studies on crash likelihood prediction at intersections and with several established deep learning models trained on the same connected vehicle dataset. In every scenario, this inTformer outperformed the benchmark models, confirming the viability of the proposed inTformer architecture. Comment: 29 pages, 7 figures, 9 table

arXiv.org e-Print Archive

Peer-to-Peer Energy Trading in Smart Residential Environment with User Behavioral Modeling

Author: Timilsina Ashutosh
Publication venue: UKnowledge
Publication date: 01/01/2023

Electric power systems are transforming from a centralized unidirectional market to a decentralized open market. With this shift, the end-users have the possibility to actively participate in local energy exchanges, with or without the involvement of the main grid. Rapidly falling prices for Renewable Energy Technologies (RETs), supported by their ease of installation and operation, together with Electric Vehicle (EV) and Smart Grid (SG) technologies that make bidirectional flow of energy possible, have contributed to this changing landscape on the distribution side of the traditional power grid. Trading energy among users in a decentralized fashion has been referred to as Peer-to-Peer (P2P) Energy Trading, which has attracted significant attention from the research and industry communities in recent times. However, previous research has mostly focused on engineering aspects of P2P energy trading systems, often neglecting the central role of users in such systems. P2P trading mechanisms require active participation from users to decide factors such as selling prices, storing versus trading energy, and selection of energy sources, among others. The complexity of these tasks, paired with the limited cognitive and time capabilities of human users, can result in sub-optimal decisions or even abandonment of such systems if performance is not satisfactory. Therefore, it is of paramount importance for P2P energy trading systems to incorporate user behavioral modeling that captures users’ individual trading behaviors, preferences, and perceived utility in a realistic and accurate manner. Often, such user behavioral models are not known a priori in real-world settings, and therefore need to be learned online as the P2P system is operating. In this thesis, we design novel algorithms for P2P energy trading.
By exploiting a variety of statistical, algorithmic, machine learning, and behavioral economics tools, we propose solutions that are able to jointly optimize the system performance while taking into account and learning a realistic model of user behavior. The results in this dissertation have been published in IEEE Transactions on Green Communications and Networking 2021, Proceedings of IEEE Global Communication Conference 2022, Proceedings of IEEE Conference on Pervasive Computing and Communications 2023 and ACM Transactions on Evolutionary Learning and Optimization 2023

University of Kentucky

EARLY-WARNING PREDICTION FOR MACHINE FAILURES IN AUTOMATED INDUSTRIES USING ADVANCED MACHINE LEARNING TECHNIQUES

Author: Singh Satnam
Publication venue: CSUSB ScholarWorks
Publication date: 01/12/2023

This Culminating Experience Project explores the use of machine learning algorithms to detect machine failure. The research questions are: Q1) How does the quality of input data, including issues such as outliers and noise, impact the accuracy and reliability of machine failure prediction models in industrial settings? Q2) How does the integration of SMOTE with feature engineering techniques influence the overall performance of machine learning models in detecting and preventing machine failures? Q3) What is the performance of different machine learning algorithms in predicting machine failures, and which algorithm is the most effective? The research findings are: Q1) Effective outlier handling is vital for predictive maintenance: the variable distributions initially showed a right-skewed pattern but became more centralized after correction, with correlations between specific sensors showing potential for further exploration. Q2) Data balancing through SMOTE and feature engineering is essential due to the rarity of actual failure instances. Substantial challenges are observed when predicting 'Failure' instances, with a lower true positive rate (73%), resulting in low precision (0.02) and recall (0.73) for 'Failure' predictions. This is further reflected in the low F1-Score (0.03) for 'Failure', indicating a trade-off between precision and recall. Despite a commendable overall accuracy of 94%, the class imbalance within the dataset (92,200 'Running' instances vs. 126 'Failure' instances) remains a contributing factor to the model's limitations. Q3) Machine learning algorithm performance varies, with CatBoost excelling in accuracy and failure detection. The choice of algorithm and continuous model refinement are critical for enhanced predictive accuracy in industrial contexts. The main conclusions are: Q1) Addressing outliers in data preprocessing significantly enhances the accuracy of machine failure prediction models.
Q2) Class imbalance in the equipment failure data must be addressed: only 0.14% of the dataset represents actual failures, while 99.86% represents non-failure data. This extreme class disparity can result in biased models that underperform on underrepresented classes, which is a common problem in machine learning. Q3) CatBoost outperforms other algorithms in predicting machine failures, with reported figures of 92% accuracy and correct predictions 99% of the time, and further exploration of diverse data and algorithms is needed for tailored industrial applications. Future research areas include advanced outlier handling, sensor relationships, and data balancing for improved model accuracy. Addressing rare failures, enhancing model performance, and exploring diverse machine learning algorithms are critical for advancing predictive maintenance
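
As a rough check of the figures quoted in this abstract, the short sketch below recomputes the failure share from the stated class counts and the F1-score from the stated precision and recall. The variable names are illustrative only, and the inputs are the rounded values given in the abstract, so the computed F1 need not match the reported 0.03 exactly.

```python
# Values copied from the abstract; variable names are hypothetical.
running_count = 92_200
failure_count = 126

# Share of the dataset made up of actual failures.
failure_share = failure_count / (running_count + failure_count)

# F1 is the harmonic mean of precision and recall.
precision = 0.02  # rounded value as reported
recall = 0.73
f1 = 2 * precision * recall / (precision + recall)

print(f"failure share: {failure_share:.2%}")
print(f"F1 from rounded precision and recall: {f1:.3f}")
```

The failure share rounds to the 0.14% stated in the conclusions. The F1 computed from the rounded inputs comes out slightly above 0.03, which would be consistent with the precision having been rounded up to 0.02 before reporting, although the abstract does not say so.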

CSUSB ScholarWorks

On Tilted Losses in Machine Learning: Theory and Applications

Author: Beirami Ahmad
Li Tian
Sanjabi Maziar
Smith Virginia
Publication date: 01/06/2023

Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: We show that TERM can increase or decrease the influence of outliers, respectively, to enable fairness or robustness; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to the tail probability of losses. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be efficiently solved relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches. Comment: arXiv admin note: substantial text overlap with arXiv:2007.0116
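
To make the idea of tuning the impact of individual losses concrete, here is a minimal sketch of a tilted average of per-sample losses, assuming the commonly stated form (1/t) * log(mean(exp(t * loss))). The function name `tilted_risk`, the tilt parameter `t`, and the sample losses are illustrative assumptions and are not taken from the authors' code.

```python
import numpy as np

def tilted_risk(losses, t):
    """Tilted average of per-sample losses: (1/t) * log(mean(exp(t * losses))).

    t > 0 weights large losses more heavily, t < 0 weights them less,
    and t = 0 is treated as the ordinary mean (the limiting case).
    """
    losses = np.asarray(losses, dtype=float)
    if t == 0:
        return float(losses.mean())
    z = t * losses
    m = z.max()  # subtract the max before exponentiating for numerical stability
    return float((m + np.log(np.mean(np.exp(z - m)))) / t)

# Three small losses and one outlier (made-up values).
losses = np.array([0.1, 0.2, 0.3, 5.0])
for t in (-50.0, 0.0, 50.0):
    print(t, tilted_risk(losses, t))
```

With a large positive `t` the value moves toward the largest loss, and with a large negative `t` it moves toward the smallest, which is one way to read the abstract's statement that the influence of outliers can be increased or decreased.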

arXiv.org e-Print Archive

Application of Spectral Solution and Neural Network Techniques in Plasma Modeling for Electric Propulsion

Author: Whitman Joseph R.
Publication venue: AFIT Scholar
Publication date: 01/09/2018

A solver for Poisson's equation was developed using the Radix-2 FFT method first invented by Carl Friedrich Gauss. Its performance was characterized using simulated data and identical boundary conditions to those found in a Hall Effect Thruster. The characterization showed errors below machine-zero with noise-free data, and above 20% noise-to-signal strength, the error increased linearly with the noise. This solver can be implemented into AFRL's plasma simulator, the Thermophysics Universal Research Framework (TURF) and used to quickly and accurately compute the electric field based on charge distributions. The validity of a machine learning approach and data-based complex system modeling approach was demonstrated. To this end, several multilayer perceptrons were created and validated against AFRL-provided Hall Thruster test data, with two networks showing mean error below 1% and standard deviations below 10%. These results, while not ready for implementation as a replacement for lookup tables, strongly suggest paths for future work and the development of networks that would be acceptable in such a role, saving both RAM space and time in plasma simulations
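
For readers unfamiliar with spectral Poisson solvers, the sketch below shows the general technique in one dimension. It is not the thesis's solver: it assumes periodic boundary conditions rather than the Hall Effect Thruster boundary conditions mentioned above, it solves u'' = f with that sign convention, and it relies on NumPy's FFT routines. The function name `solve_poisson_periodic` and the test problem are hypothetical.

```python
import numpy as np

def solve_poisson_periodic(f, length):
    """Solve u'' = f on a periodic 1-D grid and return the zero-mean solution."""
    n = f.size
    # Angular wavenumbers matching NumPy's FFT ordering.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    f_hat = np.fft.fft(f)
    u_hat = np.zeros_like(f_hat)
    nonzero = k != 0
    # In Fourier space, differentiation twice multiplies by -k^2.
    u_hat[nonzero] = -f_hat[nonzero] / k[nonzero] ** 2
    # The k = 0 mode is left at zero, which fixes the free additive constant.
    return np.fft.ifft(u_hat).real

n = 64  # a power of two, as a radix-2 FFT would expect
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u_exact = np.sin(3.0 * x)
f = -9.0 * u_exact  # second derivative of sin(3x)
u = solve_poisson_periodic(f, 2.0 * np.pi)
max_error = np.max(np.abs(u - u_exact))
print(max_error)
```

On this smooth, noise-free test problem the recovered solution agrees with the exact one to roughly floating-point precision, which is in the spirit of the near machine-zero error the abstract reports for noise-free data.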

AFIT Scholar (Air Force Institute of Technology)

Increasing the robustness of deep neural networks against adversarial attacks and solving other prominent problems in the application of machine learning

Author: Alafandi Jalal
Publication date: 01/01/2023

REAL-PhD