Adaptive Normalized Risk-Averting Training For Deep Neural Networks
This paper proposes a set of new error criteria and learning approaches,
Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex
optimization problem in training deep neural networks (DNNs). Theoretically, we
demonstrate its effectiveness on global and local convexity lower-bounded by
the standard $L^p$-norm error. By analyzing the gradient on the convexity index
$\lambda$, we explain why learning $\lambda$ adaptively by gradient descent
works. In practice, we show how this method improves training
of deep neural networks to solve visual recognition tasks on the MNIST and
CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results
comparable or superior to those reported in recent literature on the same tasks
using standard ConvNets + MSE/cross entropy. Performance on deep/shallow
multilayer perceptrons and Denoising Auto-encoders is also explored. ANRAT can
be combined with other quasi-Newton training methods, innovative network
variants, regularization techniques and other specific tricks in DNNs. Other
than unsupervised pretraining, it provides a new perspective to address the
non-convex optimization problem in DNNs.
Comment: AAAI 2016, 0.39%~0.4% error rate on MNIST with a single 32-32-256-10 ConvNet,
code available at https://github.com/cauchyturing/ANRA
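The core idea lends itself to a short illustration. Below is a minimal PyTorch
sketch of a log-sum-exp risk-averting loss with a trainable convexity index
$\lambda$, in the spirit of ANRAT; the class name `RiskAvertingLoss`, the exact
normalization, the squared-error term inside the exponent, and the penalty that
keeps $\lambda$ well-behaved are assumptions here, not the paper's precise
formulation.

```python
import torch

class RiskAvertingLoss(torch.nn.Module):
    """Sketch of a normalized risk-averting error with a trainable
    convexity index lambda. By Jensen's inequality, the log-sum-exp
    form is lower-bounded by the plain mean error and tends to it as
    lambda -> 0+; larger lambda expands the convex region of the
    criterion."""

    def __init__(self, init_lambda: float = 1.0, penalty: float = 0.1):
        super().__init__()
        # Parameterize log(lambda) so lambda stays positive under SGD.
        self.log_lambda = torch.nn.Parameter(
            torch.log(torch.tensor(init_lambda)))
        self.penalty = penalty  # assumed regularizer, not ANRAT's exact term

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        lam = self.log_lambda.exp()
        err = (pred - target).pow(2).sum(dim=1)   # per-sample squared error
        n = torch.tensor(float(err.numel()))
        lse = torch.logsumexp(lam * err, dim=0) - torch.log(n)
        # The penalty term discourages lambda from collapsing toward zero.
        return lse / lam + self.penalty / lam

# lambda is learned jointly with the network weights by gradient descent,
# which is the adaptive behavior the abstract analyzes:
criterion = RiskAvertingLoss()
pred = torch.randn(8, 10, requires_grad=True)
target = torch.randn(8, 10)
loss = criterion(pred, target)
loss.backward()   # gradients flow to both the predictions and lambda
```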
Adopting Robustness and Optimality in Fitting and Learning
We generalized a modified exponentialized estimator by pushing the
robust-optimal (RO) index $\lambda$ to $-\infty$ for achieving robustness to
outliers by optimizing a quasi-Minimin function. The robustness is realized and
controlled adaptively by the RO index without any predefined threshold.
Optimality is guaranteed by expansion of the convexity region in the Hessian
matrix to largely avoid local optima. Detailed quantitative analyses of both
robustness and optimality are provided. The results of the proposed experiments
on fitting tasks for three noisy non-convex functions and the digit recognition
task on the MNIST dataset consolidate the conclusions.
Comment: arXiv admin note: text overlap with arXiv:1506.0269
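To make the mechanism concrete, the toy numpy sketch below shows how the
per-sample weights induced by an exponentialized estimator,
$w_i \propto \exp(\lambda r_i^2)$, suppress outliers as the RO index
$\lambda$ is pushed negative. It illustrates the tilting effect only; the
function name `tilted_weights`, the sine-fitting setup, and the weight form
are illustrative assumptions, not the paper's quasi-Minimin procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy observations of y = sin(x) with a few gross outliers injected.
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x) + 0.05 * rng.standard_normal(50)
y[[5, 20, 40]] += 5.0

def tilted_weights(residuals: np.ndarray, ro_index: float) -> np.ndarray:
    """Per-sample weights w_i proportional to exp(lambda * r_i^2).
    Negative lambda down-weights large residuals, so pushing
    lambda -> -inf suppresses the outliers entirely."""
    z = ro_index * residuals**2
    z -= z.max()                      # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

residuals = y - np.sin(x)
for lam in (0.0, -1.0, -10.0):
    w = tilted_weights(residuals, lam)
    print(f"lambda={lam:6.1f}  total weight on outliers: "
          f"{w[[5, 20, 40]].sum():.4f}")
# lambda = 0 weights all samples equally; more negative lambda drives
# the outliers' share of the total weight toward zero.
```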
inTformer: A Time-Embedded Attention-Based Transformer for Crash Likelihood Prediction at Intersections Using Connected Vehicle Data
The real-time crash likelihood prediction model is an essential component of
the proactive traffic safety management system. Over the years, numerous
studies have attempted to construct a crash likelihood prediction model in
order to enhance traffic safety, but mostly on freeways. In the majority of the
existing studies, researchers have primarily employed a deep learning-based
framework to identify crash potential. Lately, the Transformer has emerged as a
promising deep neural network architecture that fundamentally operates through
attention-based mechanisms. The Transformer has several functional benefits
over extant deep learning models such as Long Short-Term Memory (LSTM)
networks and Convolutional Neural Networks (CNNs). Firstly, it can readily
handle long-term dependencies in a data sequence. Secondly, it can process all
elements of a data sequence in parallel during training. Finally, it does not
suffer from the vanishing gradient problem. Recognizing the immense potential of the
Transformer, this paper proposes inTersection-Transformer (inTformer), a
time-embedded attention-based Transformer model that can effectively predict
intersection crash likelihood in real-time. The proposed model was evaluated
using connected vehicle data extracted from INRIX's Signal Analytics Platform.
The data was formatted in parallel and stacked at different timesteps to develop
nine inTformer models. The best inTformer model achieved a sensitivity of 73%.
This model was also compared to earlier studies on crash likelihood prediction
at intersections and with several established deep learning models trained on
the same connected vehicle dataset. In every scenario, the inTformer
outperformed the benchmark models, confirming the viability of the proposed
inTformer architecture.
Comment: 29 pages, 7 figures, 9 tables
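Since the abstract does not give the exact architecture, the following PyTorch
sketch is only one plausible reading of a "time-embedded attention-based
Transformer" for binary crash-likelihood prediction. The class name
`InTformerSketch`, the slot-based time embedding, the layer sizes, and the
last-step pooling are all hypothetical, not the published inTformer design.

```python
import torch
import torch.nn as nn

class InTformerSketch(nn.Module):
    """Hypothetical time-embedded Transformer encoder for binary
    crash-likelihood prediction at intersections."""

    def __init__(self, n_features: int, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        # Assumed discretization: 288 five-minute slots per day.
        self.time_embed = nn.Embedding(288, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (batch, timesteps, n_features); t: (batch, timesteps) slot ids.
        h = self.input_proj(x) + self.time_embed(t)  # add time embedding
        h = self.encoder(h)                          # self-attention over sequence
        return torch.sigmoid(self.head(h[:, -1]))   # crash likelihood in [0, 1]

model = InTformerSketch(n_features=8)
x = torch.randn(4, 12, 8)               # 12 timesteps of 8 CV-derived features
t = torch.randint(0, 288, (4, 12))      # time-of-day slot per timestep
print(model(x, t).shape)                # torch.Size([4, 1])
```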
Peer-to-Peer Energy Trading in Smart Residential Environment with User Behavioral Modeling
Electric power systems are transforming from a centralized unidirectional market to a decentralized open market. With this shift, end-users can actively participate in local energy exchanges, with or without the involvement of the main grid. Rapidly falling prices for Renewable Energy Technologies (RETs), their ease of installation and operation, and the facilitation of bidirectional energy flow by Electric Vehicle (EV) and Smart Grid (SG) technologies have all contributed to this changing landscape on the distribution side of the traditional power grid.
Trading energy among users in a decentralized fashion has been referred to as Peer-to-Peer (P2P) Energy Trading, which has attracted significant attention from the research and industry communities in recent times. However, previous research has mostly focused on engineering aspects of P2P energy trading systems, often neglecting the central role of users in such systems. P2P trading mechanisms require active participation from users to decide factors such as selling prices, storing versus trading energy, and selection of energy sources, among others. The complexity of these tasks, paired with the limited cognitive and time capabilities of human users, can result in sub-optimal decisions or even abandonment of such systems if performance is not satisfactory. Therefore, it is of paramount importance for P2P energy trading systems to incorporate user behavioral modeling that captures users’ individual trading behaviors, preferences, and perceived utility in a realistic and accurate manner. Often, such user behavioral models are not known a priori in real-world settings, and therefore need to be learned online as the P2P system is operating.
In this thesis, we design novel algorithms for P2P energy trading. By exploiting a variety of statistical, algorithmic, machine learning, and behavioral economics tools, we propose solutions that jointly optimize system performance while taking into account and learning realistic models of user behavior. The results in this dissertation have been published in IEEE Transactions on Green Communications and Networking 2021, Proceedings of the IEEE Global Communications Conference 2022, Proceedings of the IEEE Conference on Pervasive Computing and Communications 2023, and ACM Transactions on Evolutionary Learning and Optimization 2023.
Early-Warning Prediction for Machine Failures in Automated Industries Using Advanced Machine Learning Techniques
This Culminating Experience Project explores the use of machine learning algorithms to detect machine failure. The research questions are: Q1) How does the quality of input data, including issues such as outliers and noise, impact the accuracy and reliability of machine failure prediction models in industrial settings? Q2) How does the integration of SMOTE with feature engineering techniques influence the overall performance of machine learning models in detecting and preventing machine failures? Q3) What is the performance of different machine learning algorithms in predicting machine failures, and which algorithm is the most effective?
The research findings are: Q1) Effective outlier handling is vital for predictive maintenance: the variable distributions initially showed a right-skewed pattern but became more centralized after rectification, with correlations between specific sensors showing potential for further exploration. Q2) Data balancing through SMOTE and feature engineering is essential due to the rarity of actual failure instances. Substantial challenges are observed when predicting 'Failure' instances, with a lower true positive rate (73%), resulting in low precision (0.02) and recall (0.73) for 'Failure' predictions. This is further reflected in the low F1-score (0.03) for 'Failure', indicating a trade-off between precision and recall. Despite a commendable overall accuracy of 94%, the class imbalance within the dataset (92,200 'Running' instances vs. 126 'Failure' instances) remains a contributing factor to the model's limitations. Q3) Machine learning algorithm performance varies, with CatBoost excelling in accuracy and failure detection. The choice of algorithm and continuous model refinement are critical for enhanced predictive accuracy in industrial contexts.
The main conclusions are: Q1) Addressing outliers in data preprocessing significantly enhances the accuracy of machine failure prediction models. Q2) Equipment failure class imbalance must be addressed: the findings showed a significant imbalance in the failure data, with only 0.14% of the dataset representing actual failures and 99.86% pertaining to non-failure data; this extreme class disparity can produce biased models that underperform on the underrepresented class, a common problem in machine learning. Q3) CatBoost outperforms the other algorithms in predicting machine failures, with 92% accuracy and correct failure detections 99% of the time, though further exploration of diverse data and algorithms is needed for tailored industrial applications. Future research areas include advanced outlier handling, sensor relationships, and data balancing for improved model accuracy. Addressing rare failures, enhancing model performance, and exploring diverse machine learning algorithms are critical for advancing predictive maintenance.
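As a rough illustration of the Q2/Q3 pipeline, the sketch below oversamples a
synthetic, similarly imbalanced dataset with SMOTE and fits a CatBoost
classifier. The synthetic data, feature count, and hyperparameters are
stand-ins, not the project's actual sensor data or tuning.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE
from catboost import CatBoostClassifier

# Synthetic stand-in for the sensor dataset: roughly 0.14% failure rate.
X, y = make_classification(n_samples=20000, n_features=10,
                           weights=[0.9986], flip_y=0.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority ('Failure') class on the training split only,
# so the held-out test set keeps its realistic imbalance.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

model = CatBoostClassifier(iterations=200, verbose=False, random_seed=0)
model.fit(X_bal, y_bal)
print(classification_report(y_te, model.predict(X_te), digits=3))
```

Oversampling before the split would leak synthetic neighbors of test points
into training, which is one reason reported precision on rare classes can
collapse at deployment time.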
On Tilted Losses in Machine Learning: Theory and Applications
Exponential tilting is a technique commonly used in fields such as
statistics, probability, information theory, and optimization to create
parametric distribution shifts. Despite its prevalence in related fields,
tilting has not seen widespread use in machine learning. In this work, we aim
to bridge this gap by exploring the use of tilting in risk minimization. We
study a simple extension to ERM -- tilted empirical risk minimization (TERM) --
which uses exponential tilting to flexibly tune the impact of individual
losses. The resulting framework has several useful properties: we show that
TERM can increase or decrease the influence of outliers to enable fairness or
robustness, respectively; has variance-reduction properties that can
benefit generalization; and can be viewed as a smooth approximation to the tail
probability of losses. Our work makes rigorous connections between TERM and
related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and
distributionally robust optimization (DRO). We develop batch and stochastic
first-order optimization methods for solving TERM, provide convergence
guarantees for the solvers, and show that the framework can be efficiently
solved relative to common alternatives. Finally, we demonstrate that TERM can
be used for a multitude of applications in machine learning, such as enforcing
fairness between subgroups, mitigating the effect of outliers, and handling
class imbalance. Despite the straightforward modification TERM makes to
traditional ERM objectives, we find that the framework can consistently
outperform ERM and deliver competitive performance with state-of-the-art,
problem-specific approaches.
Comment: arXiv admin note: substantial text overlap with arXiv:2007.0116
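The tilted objective itself is compact enough to state in code. The numpy
sketch below implements the batch TERM risk,
$\tilde{R}_t(\theta) = \tfrac{1}{t}\log\bigl(\tfrac{1}{N}\sum_i e^{t\,\ell_i}\bigr)$,
and the tilted per-sample weights its gradient induces; the function names and
toy loss values are illustrative only.

```python
import numpy as np

def term_objective(losses: np.ndarray, t: float) -> float:
    """Batch tilted empirical risk: (1/t) * log(mean(exp(t * losses))).
    t -> 0 recovers ERM; t > 0 emphasizes the largest losses
    (fairness / worst-case focus); t < 0 suppresses them (robustness)."""
    z = t * losses
    return (z.max() + np.log(np.mean(np.exp(z - z.max())))) / t

def term_weights(losses: np.ndarray, t: float) -> np.ndarray:
    """The TERM gradient is a reweighted sum of per-sample gradients,
    with weights w_i proportional to exp(t * loss_i)."""
    z = t * losses
    w = np.exp(z - z.max())                 # shift for numerical stability
    return w / w.sum()

losses = np.array([0.10, 0.20, 0.15, 3.00])   # last sample is an outlier
for t in (-2.0, 0.5, 2.0):
    w = term_weights(losses, t)
    print(f"t={t:+.1f}  risk={term_objective(losses, t):.3f}  "
          f"outlier weight={w[-1]:.3f}")
# Negative t pushes the outlier's weight toward zero (robustness);
# positive t concentrates weight on it (worst-case emphasis).
```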
Application of Spectral Solution and Neural Network Techniques in Plasma Modeling for Electric Propulsion
A solver for Poisson's equation was developed using the radix-2 FFT method first invented by Carl Friedrich Gauss. Its performance was characterized using simulated data and boundary conditions identical to those found in a Hall-effect thruster. The characterization showed errors below machine zero with noise-free data; above 20% noise-to-signal strength, the error increased linearly with the noise. This solver can be integrated into AFRL's plasma simulator, the Thermophysics Universal Research Framework (TURF), and used to quickly and accurately compute the electric field from charge distributions. The validity of a machine learning, data-based approach to modeling complex systems was also demonstrated. To this end, several multilayer perceptrons were created and validated against AFRL-provided Hall thruster test data, with two networks showing mean error below 1% and standard deviations below 10%. These results, while not ready for implementation as a replacement for lookup tables, strongly suggest paths for future work and the development of networks that would be acceptable in such a role, saving both RAM and time in plasma simulations.
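As a rough sketch of the spectral approach, the numpy snippet below solves the
normalized Poisson equation $\nabla^2\phi = -\rho$ on a periodic 2-D grid via
the FFT. Periodic boundaries and normalized units are simplifying assumptions
here, since the thesis matches Hall-effect thruster boundary conditions; the
function name `poisson_fft_2d` is illustrative.

```python
import numpy as np

def poisson_fft_2d(rho: np.ndarray, dx: float) -> np.ndarray:
    """Spectral solve of  laplacian(phi) = -rho  (normalized units) on a
    periodic 2-D grid. In Fourier space the equation becomes
    -k^2 * phi_hat = -rho_hat, so each mode is solved by one division."""
    ny, nx = rho.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=dx)
    k2 = kx[None, :]**2 + ky[:, None]**2
    k2[0, 0] = 1.0                      # guard the k = 0 (mean) mode
    phi_hat = np.fft.fft2(rho) / k2
    phi_hat[0, 0] = 0.0                 # pin the arbitrary mean potential
    return np.real(np.fft.ifft2(phi_hat))

# Self-check on a radix-2-friendly grid: rho = cos(x) has exact solution
# phi = cos(x), so the error should sit near machine zero.
n, L = 64, 2.0 * np.pi
x = np.linspace(0.0, L, n, endpoint=False)
rho = np.tile(np.cos(x), (n, 1))
phi = poisson_fft_2d(rho, L / n)
print(np.max(np.abs(phi - rho)))        # ~1e-15 with noise-free data
```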