65 research outputs found

    A new Potential-Based Reward Shaping for Reinforcement Learning Agent

    Potential-based reward shaping (PBRS) is a category of machine learning methods that aims to improve the learning speed of a reinforcement learning agent by extracting and utilizing extra knowledge while performing a task. There are two steps in the process of transfer learning: extracting knowledge from previously learned tasks and transferring that knowledge to a target task. The latter step is well discussed in the literature, with various methods proposed for it, while the former has been explored less. With this in mind, the type of knowledge that is transmitted is very important and can lead to considerable improvement. In the literature on both transfer learning and potential-based reward shaping, a subject that has never been addressed is the knowledge gathered during the learning process itself. In this paper, we present a novel potential-based reward shaping method that attempts to extract knowledge from the learning process. The proposed method extracts knowledge from episodes' cumulative rewards. It has been evaluated in the Arcade Learning Environment, and the results indicate an improvement in the learning process for both single-task and multi-task reinforcement learning agents.
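For orientation, the standard potential-based shaping term adds gamma * Phi(s') - Phi(s) to the environment reward. The sketch below shows only that generic form. It is not the paper's method: the abstract does not say how the potential is built from episodes' cumulative rewards, so the `potential` callable, the function name `shaped_reward`, and the zero-potential convention for terminal states are all assumptions made for illustration.

```python
from typing import Callable, Hashable

State = Hashable


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return reward + gamma * Phi(next_state) - Phi(state).

    `potential` is a hypothetical user-supplied function Phi. Treating the
    potential of a terminal state as zero is a common convention and is
    assumed here, not taken from the abstract.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


if __name__ == "__main__":
    # Toy potential on integer states: Phi(s) = s (illustrative only).
    print(shaped_reward(1.0, 2, 3, lambda s: float(s), gamma=0.5))
```

Because the shaping term telescopes along a trajectory, this form is the one usually chosen when the goal is to speed up learning without changing which policies are optimal.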

    Sampling-Based Nonlinear MPC of Neural Network Dynamics with Application to Autonomous Vehicle Motion Planning

    Control of machine learning models has emerged as an important paradigm for a broad range of robotics applications. In this paper, we present a sampling-based nonlinear model predictive control (NMPC) approach for control of neural network dynamics. We show its design in two parts: 1) formulating conventional optimization-based NMPC as a Bayesian state estimation problem, and 2) using particle filtering/smoothing to achieve the estimation. Through a principled sampling-based implementation, this approach can potentially make effective searches in the control action space for optimal control and also facilitate computation toward overcoming the challenges caused by neural network dynamics. We apply the proposed NMPC approach to motion planning for autonomous vehicles. The specific problem considers nonlinear unknown vehicle dynamics modeled as neural networks as well as dynamic on-road driving scenarios. The approach shows significant effectiveness in successful motion planning in case studies. Comment: To appear in 2022 American Control Conference (ACC).

    Energy-Efficient Deadline-Aware Edge Computing: Bandit Learning with Partial Observations in Multi-Channel Systems

    In this paper, we consider a task offloading problem in a multi-access edge computing (MEC) network, in which edge users can either use their local processing unit to compute their tasks or offload their tasks to a nearby edge server through multiple communication channels, each with different characteristics. The main objective is to maximize the energy efficiency of the edge users while meeting computing task deadlines. In the multi-user multi-channel offloading scenario, users are distributed and have only partial observations of the system states. We formulate this problem as a stochastic optimization problem and leverage contextual neural multi-armed bandit models to develop an energy-efficient deadline-aware solution, dubbed E2DA. The proposed E2DA framework relies only on partial state information (i.e., computation task features) to make offloading decisions. Through extensive numerical analysis, we demonstrate that the E2DA algorithm can efficiently learn an offloading policy and achieve close-to-optimal performance in comparison with several baseline policies that optimize energy consumption and/or response time. Furthermore, we provide a comprehensive set of results on the MEC system performance for various applications such as augmented reality (AR) and virtual reality (VR). Comment: 2023 IEEE Global Communications Conference

    EEG-based multi-modal emotion recognition using bag of deep features: An optimal feature selection approach

    Much attention has been paid to the recognition of human emotions with the help of electroencephalogram (EEG) signals based on machine learning technology. Recognizing emotions is a challenging task due to the non-linear property of the EEG signal. This paper presents an advanced signal processing method using a deep neural network (DNN) for emotion recognition based on EEG signals. The spectral and temporal components of the raw EEG signal are first retained in a 2D spectrogram before the extraction of features. The pre-trained AlexNet model is used to extract the raw features from the 2D spectrogram for each channel. To reduce the feature dimensionality, a spatial- and temporal-based bag of deep features (BoDF) model is proposed. A series of vocabularies consisting of 10 cluster centers for each class is calculated using the k-means clustering algorithm. Lastly, the emotion of each subject is represented using the histogram of the vocabulary set collected from the raw features of a single channel. Features extracted from the proposed BoDF model have considerably smaller dimensions. The proposed model achieves better classification accuracy than recently reported work when validated on the SJTU SEED and DEAP data sets. For optimal classification performance, we use a support vector machine (SVM) and k-nearest neighbor (k-NN) to classify the extracted features for the different emotional states of the two data sets. The BoDF model achieves 93.8% accuracy on the SEED data set and 77.4% accuracy on the DEAP data set, which is more accurate than other state-of-the-art methods of human emotion recognition. © 2019 by the authors. Licensee MDPI, Basel, Switzerland. Funding: This research was funded by Higher Education Commission (HEC): Tdf/67/2017. Scopus
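The histogram step described above can be illustrated with a minimal sketch: each raw feature vector is assigned to its nearest cluster center, and the assignment counts form the representation. This is a generic bag-of-features encoding, not the authors' implementation. It assumes the cluster centers were already computed (for example by k-means), and the function names, the squared Euclidean distance, and the normalization to relative frequencies are illustrative choices not stated in the abstract.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_center(vec: Vector, centers: Sequence[Vector]) -> int:
    """Index of the center closest to `vec` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centers[i])),
    )


def bag_of_features_histogram(
    features: Sequence[Vector], centers: Sequence[Vector]
) -> List[float]:
    """Normalized histogram of nearest-center assignments.

    `centers` plays the role of the vocabulary; how it is obtained
    (k-means, number of centers per class) is outside this sketch.
    """
    counts = [0] * len(centers)
    for vec in features:
        counts[nearest_center(vec, centers)] += 1
    total = len(features)
    if total == 0:
        return [0.0] * len(centers)
    return [c / total for c in counts]


if __name__ == "__main__":
    vocabulary = [[0.0, 0.0], [10.0, 10.0]]  # hypothetical 2-word vocabulary
    raw = [[1.0, 1.0], [0.0, 0.0], [9.0, 9.0], [1.0, 0.0]]
    print(bag_of_features_histogram(raw, vocabulary))
```

The resulting vector has one entry per vocabulary word, so its length is fixed by the vocabulary size rather than by the number or dimension of the raw features, which is where the dimensionality reduction comes from.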

    Image Local Features Description through Polynomial Approximation

    This work introduces a novel local patch descriptor that remains invariant under varying conditions of orientation, viewpoint, scale, and illumination. The proposed descriptor incorporates polynomials of various degrees to approximate the local patch within the image. Before feature detection and approximation, the image micro-texture is eliminated through a guided image filter that can preserve the edges of the objects. Rotation invariance is achieved by aligning the local patch around the Harris corner through the dominant orientation shift algorithm. Weighted threshold histogram equalization (WTHE) is employed to make the descriptor insensitive to illumination changes. The correlation coefficient is used instead of Euclidean distance to improve the matching accuracy. The proposed descriptor has been extensively evaluated on the Oxford affine covariant regions dataset and the absolute and transition tilt dataset. The experimental results show that our proposed descriptor can categorize features with more distinctiveness in comparison to state-of-the-art descriptors. © 2013 IEEE. This work was supported by the Qatar National Library. Scopus

    Malicious UAV detection using integrated audio and visual features for public safety applications

    Unmanned aerial vehicles (UAVs) have become popular in surveillance, security, and remote monitoring. However, they also pose serious security threats to public privacy. The timely detection of a malicious drone is currently an open research issue for security provisioning companies. Recently, the problem has been addressed by a plethora of schemes. However, each scheme has a limitation, such as extreme weather conditions or huge dataset requirements. In this paper, we propose a novel framework consisting of hybrid handcrafted and deep features to detect and localize malicious drones from their sound and image information. The respective datasets include sounds and occluded images of birds, airplanes, and thunderstorms, with variations in resolution and illumination. Various kernels of the support vector machine (SVM) are applied to classify the features. Experimental results validate the improved performance of the proposed scheme compared to other related methods.

    A Phase Field Model for Rate-Dependent Ductile Fracture

    In this study, a phase field viscoplastic model is proposed to model the influence of the loading rate on ductile fracture, one of the main causes of failure in metallic alloys. To this aim, the effects of the phase field are incorporated in Peric's viscoplastic model; the model can efficiently be converted to a standard rate-independent model. The novel aspects of this work include describing a coupling between rate-dependent plasticity and the phase field formulation by defining an energy function that contains the energy dissipation caused by plastic deformation as well as the fracture process and elastic energy. In addition, the equations required to develop the numerical solution are presented. The governing equations are determined by a minimization principle that results in balance laws for the coupled displacement-phase field problem. Furthermore, an implicit integration algorithm for a viscoplasticity model coupled with a phase field is presented for a three-dimensional stress state. The proposed algorithm can be utilized for different constitutive models of rate-dependent and rate-independent plasticity coupled with fracture by changing the definition of the plastic multiplier. In addition, to control the influence of the plastic deformation and its work on the crack propagation, a threshold variable is defined in the model. Finally, using the proposed model, the influence of the loading rate on the responses of different specimens in one-dimensional and multi-dimensional cases is investigated, and the accuracy of the results is verified by comparing them with existing experimental and numerical results. The obtained results show that the model can simulate the impact of the loading rate on the material response, and the gradual change of the fracture phase from ductile to brittle caused by increasing the loading rate.

    The Influence of Assembly Force on the Material Loss at the Metallic Head-Neck Junction of Hip Implants Subjected to Cyclic Fretting Wear

    The impaction force required to assemble the head and stem components of hip implants has been shown to play a major role in the mechanics of the taper junction. However, it is not clear whether the assembly force affects fretting wear, which normally occurs at the junction. In this study, an adaptive finite element model was developed for a CoCr/CoCr head-neck junction with an angular mismatch of 0.01° in order to simulate the fretting wear process and predict the material loss under various assembly forces and over a high number of gait cycles. The junction was assembled with 2, 3, 4, and 5 kN and then subjected to 1,025,000 cycles of normal walking gait loading. The findings showed that material removal due to fretting wear increased with the assembly force. High assembly forces induced greater contact pressures over larger contact regions at the interface, which, in turn, resulted in more material loss and wear damage to the surface when compared to lower assembly forces. Although a high assembly force (greater than 4 kN) can further improve the initial strength and stability of the taper junction, it appears that it also increases the degree of fretting wear. Further studies are needed to investigate the assembly force in other taper designs, angular mismatches, and material combinations.

    Nonlinear modeling and dynamic analysis of bioengineering hyper-elastic tubes based on different material models

    In this research, nonlinear vibrations of a hyper-elastic tube accounting for large deflection and moderate rotation have been examined. The hyper-elastic tube is assumed to be surrounded by a nonlinear hardening elastic medium. Different types of hyper-elastic material models are presented and discussed, including the neo-Hookean, Mooney-Rivlin, Ishihara, and Yeoh models. The efficacy of these models in nonlinear vibration modeling and analysis of hyper-elastic tubes has been examined. A modified von Kármán strain is used to consider both large deflection and moderate rotation. The governing equations are obtained from the strain energy functions of the above-mentioned hyper-elastic material models. The nonlinear governing equation of the tube contains cubic and quintic terms and is solved via the extended Hamiltonian method, leading to a closed form of the nonlinear vibration frequency. The effect of the hyper-elastic models and their material parameters on the nonlinear vibrational frequency of tubes has been studied. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature. Scopus