59 research outputs found

    Sample-efficient deep reinforcement learning from single agent to multiple agents

    University of Technology Sydney, Faculty of Engineering and Information Technology. Deep reinforcement learning (DRL) has recently become a very popular research topic. However, it usually suffers from sample inefficiency due to a lack of effective exploration, instability, or the temporal credit assignment problem. High sample complexity leads to large computational costs and hinders the deployment of DRL techniques in practice. Despite the many methods proposed to address this challenge, further improvements are still needed. This thesis contributes sample-efficient DRL methods for continuous control from two perspectives: single agent and multiple agents. Specifically, the key contributions include an uncertainty-regularized policy learning method for the single-agent setting and two ensemble learning frameworks for the multi-agent setting. Importantly, this thesis highlights that the multi-agent methods can be seen as bridging gaps among on-policy RL, off-policy RL, and evolutionary algorithms. Our approach achieves consistent improvements over the baseline methods and gives novel insight into how to combine different methods so as to get the best of each.
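
    The abstract does not give the exact formulation, so the following is only a generic sketch of one common way to realize uncertainty-regularized action selection: an ensemble of Q-estimates scores each candidate action, and the policy prefers actions whose mean value is high but whose ensemble disagreement (a proxy for epistemic uncertainty) is low. The function name, the penalty form, and the coefficient `beta` are illustrative assumptions, not the thesis's method.

```python
# Minimal sketch: penalize ensemble disagreement when picking an action.
import numpy as np

def uncertainty_regularized_action(q_ensemble: np.ndarray, beta: float = 0.5) -> int:
    """q_ensemble: shape (n_members, n_actions) of Q-value estimates."""
    mean_q = q_ensemble.mean(axis=0)      # expected return per action
    std_q = q_ensemble.std(axis=0)        # ensemble disagreement per action
    regularized = mean_q - beta * std_q   # downweight uncertain actions
    return int(np.argmax(regularized))

# Toy usage: 5 ensemble members scoring 3 discrete actions.
rng = np.random.default_rng(0)
q = rng.normal(size=(5, 3))
print(uncertainty_regularized_action(q))
```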

    Reinforcement learning strategies using Monte-Carlo to solve the blackjack problem

    Blackjack is a classic casino game in which the player attempts to outsmart the dealer by drawing a combination of cards whose face values add up to at most 21 but exceed the value of the dealer's hand. This study considers a simplified variation of blackjack in which the dealer plays no active role after the first two draws. A separate game regime is modeled for each of one to ten multiples of the conventional 52-card deck. Irrespective of the number of standard decks used, the game is played as a randomized discrete-time process. To determine the optimal course of action in terms of policy, we train an agent (a decision maker) to optimize over the decision space of the game, treating the procedure as a finite Markov decision process. To choose the most effective course of action, we mainly study Monte Carlo-based reinforcement learning approaches and compare them with Q-learning, dynamic programming, and temporal-difference methods. The performance of the distinct model-free policy iteration techniques is presented in this study, framing the game as a reinforcement learning problem.
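
    To make the Monte Carlo approach concrete, here is a textbook-style sketch of first-visit Monte Carlo control with an epsilon-greedy policy on a heavily simplified blackjack variant (infinite deck, ace always counted as 11, dealer stands on its first two cards). The environment, state encoding, and hyperparameters are illustrative assumptions, not the paper's exact game regimes.

```python
# First-visit Monte Carlo control on a toy blackjack variant.
import random
from collections import defaultdict

CARDS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]  # ace counted as 11

def draw():
    return random.choice(CARDS)

def play_episode(Q, eps=0.1):
    """Return the visited (state, action) pairs and the terminal reward."""
    player, dealer = draw() + draw(), draw() + draw()  # dealer never hits
    trajectory = []
    while player <= 21:
        state = (player, dealer)
        greedy = max((0, 1), key=lambda a: Q[(state, a)])
        action = greedy if random.random() > eps else random.choice((0, 1))
        trajectory.append((state, action))
        if action == 0:          # stick
            break
        player += draw()         # hit
    if player > 21:
        reward = -1.0
    else:
        reward = 1.0 if player > dealer else (0.0 if player == dealer else -1.0)
    return trajectory, reward

def mc_control(episodes=50_000):
    Q, counts = defaultdict(float), defaultdict(int)
    for _ in range(episodes):
        trajectory, reward = play_episode(Q)
        seen = set()
        for state, action in trajectory:     # first-visit update
            if (state, action) in seen:
                continue
            seen.add((state, action))
            counts[(state, action)] += 1
            Q[(state, action)] += (reward - Q[(state, action)]) / counts[(state, action)]
    return Q

if __name__ == "__main__":
    Q = mc_control()
    print("Q(player=16, dealer=18):", Q[((16, 18), 0)], Q[((16, 18), 1)])
```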

    Learning to View: Decision Transformers for Active Object Detection

    Active perception describes a broad class of techniques that couple planning and perception systems so that the robot moves in ways that give it more information about the environment. In most robotic systems, perception is independent of motion planning. For example, traditional object detection is passive: it operates only on the images it receives. However, we can improve the results if we allow planning to consume detection signals and move the robot to collect views that maximize the quality of the results. In this paper, we use reinforcement learning (RL) methods to control the robot in order to obtain images that maximize detection quality. Specifically, we propose using a Decision Transformer with online fine-tuning, which first optimizes the policy on a pre-collected expert dataset and then improves the learned policy by exploring better solutions in the environment. We evaluate the performance of the proposed method on an interactive dataset collected from an indoor scenario simulator. Experimental results demonstrate that our method outperforms all baselines, including the expert policy and pure offline RL methods. We also provide exhaustive analyses of the reward distribution and observation space. Comment: Accepted to ICRA 202
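
    As background on the model family used here, the sketch below shows the kind of return-conditioned sequence a Decision Transformer-style policy consumes: each timestep contributes a (return-to-go, state, action) triple, and at inference the history is prompted with a target return. The helper names and the toy trajectory are assumptions for illustration; they are not the paper's implementation.

```python
# Building the return-conditioned token sequence for a DT-style policy.
import numpy as np

def returns_to_go(rewards, gamma=1.0):
    """Suffix sums of (discounted) rewards: R_t = sum_{k>=t} gamma^(k-t) r_k."""
    rtg = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

def build_sequence(states, actions, rewards, target_return):
    """Interleave (return-to-go, state, action) triples for conditioning."""
    rtg = target_return - np.concatenate([[0.0], np.cumsum(rewards)[:-1]])
    return [(float(r), s, a) for r, s, a in zip(rtg, states, actions)]

# Toy trajectory: 3 steps of 4-dim states and discrete actions.
states = [np.zeros(4), np.ones(4), 2 * np.ones(4)]
actions = [0, 1, 1]
rewards = [0.1, 0.2, 0.7]
print(returns_to_go(rewards))
print(build_sequence(states, actions, rewards, target_return=1.0))
```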

    Offline Skill Graph (OSG): A Framework for Learning and Planning using Offline Reinforcement Learning Skills

    Reinforcement learning has received wide interest due to its success in competitive games. Yet its adoption in everyday applications (e.g., industrial, home, or healthcare settings) remains limited. In this paper, we address this limitation by presenting a framework for planning over offline skills and solving complex tasks in real-world environments. Our framework comprises three modules that together enable the agent to learn from previously collected data and generalize over it to solve long-horizon tasks. We demonstrate our approach by testing it on a robotic arm that is required to solve complex tasks.
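
    The abstract only names the idea of planning over offline skills, so the following is a generic illustration of that pattern: nodes are abstract states the robot can reach, edges are offline-learned skills that move between them, and a simple graph search chains skills into a long-horizon plan. The node and skill names are hypothetical and not taken from the paper.

```python
# Chaining offline-learned skills by searching a skill graph.
from collections import deque

SKILL_GRAPH = {
    "home":           [("reach_above_drawer", "above_drawer")],
    "above_drawer":   [("grasp_handle", "handle_grasped")],
    "handle_grasped": [("pull_open", "drawer_open")],
    "drawer_open":    [("place_object", "object_in_drawer")],
}

def plan_skills(start, goal):
    """Breadth-first search over the skill graph; returns a skill sequence."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, plan = queue.popleft()
        if node == goal:
            return plan
        for skill, nxt in SKILL_GRAPH.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, plan + [skill]))
    return None  # goal unreachable with the available skills

print(plan_skills("home", "object_in_drawer"))
```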

    VAPOR: Legged Robot Navigation in Outdoor Vegetation Using Offline Reinforcement Learning

    We present VAPOR, a novel method for autonomous legged robot navigation in unstructured, densely vegetated outdoor environments using offline Reinforcement Learning (RL). Our method trains a novel RL policy using an actor-critic network and arbitrary data collected in real outdoor vegetation. Our policy uses height- and intensity-based cost maps derived from 3D LiDAR point clouds, a goal cost map, and processed proprioception data as state inputs, and learns the physical and geometric properties of the surrounding obstacles such as height, density, and solidity/stiffness. The fully trained policy's critic network is then used to evaluate the quality of dynamically feasible velocities generated from a novel context-aware planner. Our planner adapts the robot's velocity space based on the presence of entrapment-inducing vegetation and narrow passages in dense environments. We demonstrate our method's capabilities on a Spot robot in complex real-world outdoor scenes, including dense vegetation. We observe that VAPOR's actions improve success rates by up to 40%, decrease the average current consumption by up to 2.9%, and decrease the normalized trajectory length by up to 11.2% compared to existing end-to-end offline RL and other outdoor navigation methods.
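
    To show the general shape of "critic scores planner candidates," here is a stripped-down sketch: a sampling-based planner proposes dynamically feasible (linear, angular) velocities inside adapted limits, and a learned critic ranks them. The critic below is a placeholder scoring function and the velocity limits are illustrative assumptions, not VAPOR's trained network or parameters.

```python
# Ranking planner-generated velocity candidates with a (placeholder) critic.
import numpy as np

def placeholder_critic(state: np.ndarray, action: np.ndarray) -> float:
    """Stand-in for a learned Q(state, action); prefers slower, straighter motion."""
    return -0.5 * abs(action[1]) - 0.1 * action[0]

def best_velocity(state, v_max=1.0, w_max=0.8, n_samples=64, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Sample candidate (linear, angular) velocities inside the adapted limits.
    candidates = np.column_stack([
        rng.uniform(0.0, v_max, n_samples),
        rng.uniform(-w_max, w_max, n_samples),
    ])
    scores = [placeholder_critic(state, a) for a in candidates]
    return candidates[int(np.argmax(scores))]

state = np.zeros(8)  # e.g. processed proprioception features
print(best_velocity(state))
```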

    Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning

    The pre-train and fine-tune paradigm in machine learning has had dramatic success in a wide range of domains because using existing data or pre-trained models from the internet enables quick and easy learning of new tasks. We aim to enable this paradigm in robotic reinforcement learning, allowing a robot to learn a new task with little human effort by leveraging data and models from the internet. However, reinforcement learning often requires significant human effort in the form of manual reward specification or environment resets, even if the policy is pre-trained. We introduce RoboFuME, a reset-free fine-tuning system that pre-trains a multi-task manipulation policy from diverse datasets of prior experiences and self-improves online to learn a target task with minimal human intervention. Our insights are to utilize calibrated offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy in the presence of distribution shifts, and to leverage pre-trained vision-language models (VLMs) to build a robust reward classifier for autonomously providing reward signals during the online fine-tuning process. On a diverse set of five real robot manipulation tasks, we show that our method can incorporate data from an existing robot dataset collected at a different institution and improve on a target task within as little as 3 hours of autonomous real-world experience. We also demonstrate in simulation experiments that our method outperforms prior works that use different RL algorithms or different approaches for predicting rewards. Project website: https://robofume.github.i
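
    The following is only a schematic sketch of the reward-labeling idea described in the abstract: a pre-trained success classifier labels each observation with a sparse reward so no human has to specify rewards online. `success_probability` is a placeholder standing in for a VLM-based classifier, and the threshold, task prompt, and loop structure are assumptions, not RoboFuME's actual implementation.

```python
# Labeling rewards with a (placeholder) success classifier during fine-tuning.
import numpy as np

def success_probability(image: np.ndarray, task_prompt: str) -> float:
    """Placeholder: a real system would query a fine-tuned vision-language model."""
    return float(image.mean() > 0.5)

def label_reward(image, task_prompt, threshold=0.5):
    """Sparse reward from the classifier, so no human labels rewards online."""
    return 1.0 if success_probability(image, task_prompt) >= threshold else 0.0

# Toy usage: label a batch of observations before adding them to a replay buffer.
observations = [np.random.rand(64, 64, 3) for _ in range(4)]
rewards = [label_reward(obs, "fold the cloth") for obs in observations]
print(rewards)
```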

    ToP-ToM: Trust-aware Robot Policy with Theory of Mind

    Theory of Mind (ToM) is a fundamental cognitive architecture that endows humans with the ability to attribute mental states to others. Humans infer the desires, beliefs, and intentions of others by observing their behavior and, in turn, adjust their own actions to facilitate better interpersonal communication and team collaboration. In this paper, we investigate a trust-aware robot policy with theory of mind in a multiagent setting where a human collaborates with a robot against another human opponent. We show that by focusing only on team performance, the robot may resort to a reverse-psychology trick, which poses a significant threat to trust maintenance: the human's trust in the robot collapses when they discover the robot's deceptive behavior. To mitigate this problem, we adopt a robot theory-of-mind model to infer the human's trust beliefs, including true belief and false belief (an essential element of ToM). We design a dynamic trust-aware reward function based on these trust beliefs to guide robot policy learning, which aims to balance team performance against the risk of human trust collapse caused by robot reverse psychology. The experimental results demonstrate the importance of the ToM-based robot policy for human-robot trust and the effectiveness of our ToM-based policy in multiagent interaction settings.
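
    As a hedged illustration of what a trust-aware reward can look like, the sketch below blends team reward with a penalty whenever the robot's recommendation contradicts its own belief (the deceptive "reverse psychology" pattern the abstract warns about), weighted by the inferred human trust. The weighting scheme and inputs are illustrative assumptions, not the paper's exact formulation.

```python
# A toy trust-aware reward: trade team performance against deception penalties.

def trust_aware_reward(team_reward: float,
                       robot_recommends: int,
                       robot_believes_best: int,
                       human_trust: float,
                       deception_penalty: float = 1.0,
                       alpha: float = 0.7) -> float:
    """Blend task reward with a trust-maintenance term."""
    deceptive = robot_recommends != robot_believes_best
    # Deception is most damaging when the human currently trusts the robot.
    trust_term = -deception_penalty * human_trust if deceptive else 0.0
    return alpha * team_reward + (1.0 - alpha) * trust_term

print(trust_aware_reward(team_reward=1.0, robot_recommends=0,
                         robot_believes_best=1, human_trust=0.9))
```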

    Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning

    Learning from demonstration (LfD) is a popular technique that uses expert demonstrations to learn robot control policies. However, the difficulty of acquiring expert-quality demonstrations limits the applicability of LfD methods: real-world data collection is often costly, and the quality of the demonstrations depends greatly on the demonstrator's abilities and safety concerns. A number of works have leveraged data augmentation (DA) to inexpensively generate additional demonstration data, but most DA works generate augmented data in a random fashion and ultimately produce highly suboptimal data. In this work, we propose Guided Data Augmentation (GuDA), a human-guided DA framework that generates expert-quality augmented data. The key insight of GuDA is that while it may be difficult to demonstrate the sequence of actions required to produce expert data, a user can often easily identify when an augmented trajectory segment represents task progress. Thus, the user can impose a series of simple rules on the DA process to automatically generate augmented samples that approximate expert behavior. To extract a policy from GuDA, we use off-the-shelf offline reinforcement learning and behavior cloning algorithms. We evaluate GuDA on a physical robot soccer task as well as simulated D4RL navigation tasks, a simulated autonomous driving task, and a simulated soccer task. Empirically, we find that GuDA enables learning from a small set of potentially suboptimal demonstrations and substantially outperforms a DA strategy that samples augmented data randomly.
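
    To make the "simple user rules" idea concrete, here is a small sketch in the same spirit: random augmentations (rigid translations of a 2D navigation segment) are generated cheaply, and a hand-written progress check keeps only segments that end closer to the goal than they start. The environment, rule, and augmentation are illustrative assumptions, not GuDA's actual rules or tasks.

```python
# Keep only augmented segments that a user-defined rule judges as task progress.
import numpy as np

GOAL = np.array([5.0, 5.0])

def random_translation(segment: np.ndarray, rng) -> np.ndarray:
    """Augment a (T, 2) trajectory segment by shifting it rigidly."""
    return segment + rng.uniform(-2.0, 2.0, size=2)

def makes_progress(segment: np.ndarray) -> bool:
    """User rule: the segment should end nearer to the goal than it began."""
    return np.linalg.norm(segment[-1] - GOAL) < np.linalg.norm(segment[0] - GOAL)

def guided_augment(segment, n_samples=100, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    candidates = (random_translation(segment, rng) for _ in range(n_samples))
    return [c for c in candidates if makes_progress(c)]

demo = np.array([[0.0, 0.0], [0.5, 0.4], [1.0, 1.1]])  # short suboptimal segment
kept = guided_augment(demo)
print(f"kept {len(kept)} of 100 augmented segments")
```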
    • …