327 research outputs found

Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation

Authors: Chi Yuejie, Ding Wenhao, Shi Laixi, Zhao Ding
Publication date: 25/10/2023

Robustness has been extensively studied in reinforcement learning (RL) to handle various forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this work, we consider one critical type of robustness against spurious correlation, where different portions of the state are not causally related but are correlated through unobserved confounders. These spurious correlations are ubiquitous in real-world tasks; for instance, a self-driving car usually observes heavy traffic in the daytime and light traffic at night due to unobservable human activity. A model that learns such useless or even harmful correlations could fail catastrophically when the confounder in the test case deviates from the training one. Although well motivated, enabling robustness against spurious correlation poses significant challenges, since the uncertainty set, shaped by the unobserved confounder and causal structure, is difficult to characterize and identify. Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge. To solve this issue, we propose Robust State-Confounded Markov Decision Processes (RSC-MDPs) and theoretically demonstrate their superiority in avoiding learning spurious correlations compared with other robust RL counterparts. We also design an empirical algorithm to learn the robust optimal policy for RSC-MDPs, which outperforms all baselines in eight realistic self-driving and manipulation tasks. Comment: Accepted to NeurIPS 202

arXiv.org e-Print Archive

Seasonal variability does not impact in vitro fertilization success

Authors: Bai Haiyan, Gao Ming, Liu Xitong, Mol Ben W., Shi Juanzi, Shi Wenhao
Publication venue: Springer Science and Business Media LLC
Publication date: 20/11/2019

Peer reviewed
Publisher PD

Aberdeen University Research

Monash University Research Portal

Solving Math Word Problems with Reexamination

Authors: Bin Yi, Ding Yujuan, Ng See-Kiong, Shi Wenhao, Yang Yang
Publication date: 19/11/2023

Math word problem (MWP) solving aims to understand a descriptive math problem and calculate the result; previous efforts have mostly been devoted to upgrading individual technical modules. This paper offers a different perspective, a reexamination process during training, by introducing a pseudo-dual task to enhance MWP solving. We propose a pseudo-dual (PseDual) learning scheme to model this process, which is model-agnostic and thus can be adapted to any existing MWP solver. The pseudo-dual task is specifically defined as filling the numbers in the expression back into the original word problem with numbers masked. To facilitate effective joint learning of the two tasks, we further design a scheduled fusion strategy for the number infilling task, which smoothly switches the input from the ground-truth math expressions to the predicted ones. Empirical studies show that our pseudo-dual learning scheme is effective when applied to several representative MWP solvers. The codes and trained models are available at: https://github.com/steven640pixel/PsedualMWP. Comment: To appear at NeurIPS2023 Workshop on MATH-A
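The number-masking step that this abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the `[NUM]` placeholder, the regular expression, and the helper names `mask_numbers` and `infill_numbers` are assumptions made for illustration, and the sketch fills numbers back in their original order rather than predicting them with a model.

```python
import re

# Hypothetical pattern: integers and simple decimals only.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def mask_numbers(problem, mask="[NUM]"):
    """Replace every number in the problem text with a mask token.

    Returns the masked text and the numbers in order of appearance.
    """
    numbers = NUMBER.findall(problem)
    masked = NUMBER.sub(mask, problem)
    return masked, numbers


def infill_numbers(masked, numbers, mask="[NUM]"):
    """Fill the given numbers back into the masked text, left to right."""
    text = masked
    for number in numbers:
        text = text.replace(mask, number, 1)
    return text


problem = "Tom has 3 apples and buys 2.5 kg more."
masked, numbers = mask_numbers(problem)
restored = infill_numbers(masked, numbers)
```

In the scheme described above, the numbers to fill in would come from the solution expression (ground truth or predicted) rather than from the original text, so this round trip only shows the data transformation, not the learning task.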

arXiv.org e-Print Archive

LLaSM: Large Language and Speech Model

Authors: Chen Guangyao, Dong Siwei, Huang Wenhao, Shi Daochen, Shi Yemin, Shu Yu, Xiang Qiqi, Zhang Ruihua
Publication date: 16/09/2023

Multi-modal large language models have garnered significant interest recently. Most existing work, however, focuses on vision-language multi-modal models that provide strong capabilities in following vision-and-language instructions. We claim that speech is also an important modality through which humans interact with the world. Hence, it is crucial for a general-purpose assistant to be able to follow multi-modal speech-and-language instructions. In this work, we propose the Large Language and Speech Model (LLaSM). LLaSM is an end-to-end trained large multi-modal speech-language model with cross-modal conversational abilities, capable of following speech-and-language instructions. Our early experiments show that LLaSM offers a more convenient and natural way for humans to interact with artificial intelligence. We also release a large speech instruction-following dataset, LLaSM-Audio-Instructions. Code and demo are available at https://github.com/LinkSoul-AI/LLaSM and https://huggingface.co/spaces/LinkSoul/LLaSM. The LLaSM-Audio-Instructions dataset is available at https://huggingface.co/datasets/LinkSoul/LLaSM-Audio-Instructions

arXiv.org e-Print Archive

Non-Autoregressive Sentence Ordering

Authors: Bin Yi, Ding Yujuan, Ji Bin, Shi Wenhao, Yang Yang, Zhang Jipeng
Publication date: 19/10/2023

Existing sentence ordering approaches generally employ encoder-decoder frameworks with a pointer network to recover coherence by recurrently predicting each sentence step by step. Such an autoregressive manner only leverages unilateral dependencies during decoding and cannot fully explore the semantic dependency between sentences for ordering. To overcome these limitations, we propose a novel Non-Autoregressive Ordering Network, dubbed NAON, which explores bilateral dependencies between sentences and predicts the sentence for each position in parallel. We claim that the non-autoregressive manner is not just applicable but also particularly suitable to the sentence ordering task because of two peculiar characteristics of the task: 1) each generation target has a deterministic length, and 2) the sentences and positions should match exclusively. Furthermore, to address the repetition issue of the naive non-autoregressive Transformer, we introduce an exclusive loss to constrain the exclusiveness between positions and sentences. To verify the effectiveness of the proposed model, we conduct extensive experiments on several commonly used datasets, and the experimental results show that our method outperforms all the autoregressive approaches and yields competitive performance compared with state-of-the-art methods. The codes are available at: https://github.com/steven640pixel/nonautoregressive-sentence-ordering. Comment: Accepted at Findings of EMNLP202

arXiv.org e-Print Archive

CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller

Authors: Boots Byron, Meng Xiangyun, Shi Guanya, Tan Jie, Yang Yuxiang, Yu Wenhao, Zhang Tingnan
Publication date: 27/10/2023

We present CAJun, a novel hierarchical learning and control framework that enables legged robots to jump continuously with adaptive jumping distances. CAJun consists of a high-level centroidal policy and a low-level leg controller. In particular, we use reinforcement learning (RL) to train the centroidal policy, which specifies the gait timing, base velocity, and swing foot position for the leg controller. The leg controller optimizes motor commands for the swing and stance legs according to the gait timing to track the swing foot target and base velocity commands using optimal control. Additionally, we reformulate the stance leg optimizer in the leg controller to speed up policy training by an order of magnitude. By combining RL with optimal control methods, our system achieves the versatility of learning while enjoying the robustness of control methods, making it easily transferable to real robots. We show that after 20 minutes of training on a single GPU, CAJun can achieve continuous, long jumps with adaptive distances on a Go1 robot with small sim-to-real gaps. Moreover, the robot can jump across gaps with a maximum width of 70 cm, which is over 40% wider than existing methods. Comment: Please visit https://yxyang.github.io/cajun/ for additional result

arXiv.org e-Print Archive

Deep Time-Stream Framework for Click-Through Rate Prediction by Tracking Interest Evolution

Authors: Chen Qing-Guo, Hu Yao, Li Ming, Shi Shu-Ting, Tang Jun, Zheng Wenhao, Zhu Jianke
Publication date: 08/01/2020

Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation. Recently, deep learning models have been proposed to learn the representation of users' overall interests, while ignoring the fact that interests may change dynamically over time. We argue that it is necessary to consider continuous-time information in CTR models to track user interest trends from rich historical behaviors. In this paper, we propose a novel Deep Time-Stream framework (DTS), which introduces time information through an ordinary differential equation (ODE). DTS continuously models the evolution of interests using a neural network, and is thus able to tackle the challenge of dynamically representing users' interests based on their historical behaviors. In addition, our framework can be seamlessly applied to any existing deep CTR model by leveraging the additional Time-Stream Module, while no changes are made to the original CTR model. Experiments on a public dataset as well as a real industry dataset with billions of samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with existing methods. Comment: 8 pages. arXiv admin note: text overlap with arXiv:1809.03672 by other author

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Non-Autoregressive Math Word Problem Solver with Unified Tree Structure

Authors: Bin Yi, Han Mengqun, Ng See-Kiong, Shen Heng Tao, Shi Wenhao, Wang Lei, Yang Yang
Publication date: 28/10/2023

Existing MWP solvers employ a sequence or binary tree to represent the solution expression and decode it from the given problem description. However, such structures fail to handle the variants that can be derived via mathematical manipulation; e.g., (a_1 + a_2) * a_3 and a_1 * a_3 + a_2 * a_3 can both be valid solutions for the same problem but are formulated as different expression sequences or trees. The multiple solution variants, which depict different possible solving procedures for the same input problem, raise two issues: 1) making it hard for the model to learn the mapping function between the input and output spaces effectively, and 2) incorrectly marking a valid expression variant as wrong during evaluation. To address these issues, we introduce a unified tree structure to represent a solution expression, where the elements are permutable and identical for all the expression variants. We propose a novel non-autoregressive solver, named MWP-NAS, to parse the problem and deduce the solution expression based on the unified tree. For evaluating the possible expression variants, we design a path-based metric to evaluate the partial accuracy of expressions of a unified tree. The results from extensive experiments conducted on Math23K and MAWPS demonstrate the effectiveness of our proposed MWP-NAS. The codes and checkpoints are available at: https://github.com/mengqunhan/MWP-NAS. Comment: Accepted at EMNLP202
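The evaluation problem raised in this abstract, two expression variants that differ as token sequences yet always give the same answer, can be shown with a small sketch. It only illustrates the issue and is not the unified tree or the path-based metric from the paper; the function names and the sample values are assumptions.

```python
from fractions import Fraction
from itertools import product

# The two variants quoted in the abstract, written as strings and as functions.
SEQ_A = "( a_1 + a_2 ) * a_3"
SEQ_B = "a_1 * a_3 + a_2 * a_3"


def variant_a(a1, a2, a3):
    return (a1 + a2) * a3


def variant_b(a1, a2, a3):
    return a1 * a3 + a2 * a3


def tokens(expression):
    """Split a space-separated expression into its token sequence."""
    return expression.split()


def agree_on(values):
    """Check that both variants give the same result for every triple of values."""
    return all(
        variant_a(*triple) == variant_b(*triple)
        for triple in product(values, repeat=3)
    )


# Exact rational arithmetic avoids floating-point rounding in the comparison.
samples = [Fraction(1, 2), Fraction(3), Fraction(-2, 7)]
sequences_match = tokens(SEQ_A) == tokens(SEQ_B)
values_match = agree_on(samples)
```

A metric that compares token sequences would mark one variant as wrong when the other is the reference, even though both compute the same value, which is the second issue the abstract lists.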

arXiv.org e-Print Archive

The interplay between thermodynamics and kinetics in the solid-state synthesis of layered oxides.

Authors: Bai Jianming, Bianchini Matteo, Ceder Gerbrand, Clément Raphaële J, Kim Haegyeom, Kitchaev Daniil, Ouyang Bin, Shi Tan, Sun Wenhao, Wang Feng, Wang Jingyang, Wang Yan, Xiao Penghao, Zhang Mingjian, Zhang Yaqian
Publication venue: eScholarship, University of California
Publication date: 01/10/2020

In the synthesis of inorganic materials, reactions often yield non-equilibrium kinetic byproducts instead of the thermodynamic equilibrium phase. Understanding the competition between thermodynamics and kinetics is a fundamental step towards the rational synthesis of target materials. Here, we use in situ synchrotron X-ray diffraction to investigate the multistage crystallization pathways of the important two-layer (P2) sodium oxides Na0.67MO2 (M = Co, Mn). We observe a series of fast non-equilibrium phase transformations through metastable three-layer O3, O3' and P3 phases before formation of the equilibrium two-layer P2 polymorph. We present a theoretical framework to rationalize the observed phase progression, demonstrating that even though P2 is the equilibrium phase, compositionally unconstrained reactions between powder precursors favour the formation of non-equilibrium three-layered intermediates. These insights can guide the choice of precursors and parameters employed in the solid-state synthesis of ceramic materials, and constitute a step forward in unravelling the complex interplay between thermodynamics and kinetics during materials synthesis.

eScholarship - University of California