State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning

Author: Ma Shuai
Yu Jia Yuan
Publication date: 29/11/2018

In the MDP framework, the general reward function takes three arguments (current state, action, and successor state), but it is often simplified to a function of two arguments (current state and action). The former is called a transition-based reward function, whereas the latter is called a state-based reward function. When the objective involves only the expected cumulative reward, this simplification works perfectly. However, when the objective is risk-sensitive, this simplification leads to an incorrect value. We present state-augmentation transformations (SATs), which preserve the reward sequences as well as the reward distributions and the optimal policy in risk-sensitive reinforcement learning. First, for risk-sensitive scenarios, we prove that for every MDP with a stochastic transition-based reward function there exists an MDP with a deterministic state-based reward function such that, for any given (randomized) policy for the first MDP, there exists a corresponding policy for the second MDP under which both Markov reward processes share the same reward sequence. Second, we use an inventory control problem to illustrate two situations that require the proposed SATs. One is applying Q-learning (or other learning methods) to MDPs with transition-based reward functions; the other is applying methods designed for Markov processes with deterministic state-based reward functions to Markov processes with general reward functions. We show the advantage of the SATs by considering Value-at-Risk as an example, which is a risk measure defined on the reward distribution rather than on measures of the distribution (such as mean and variance). We illustrate the error in the reward distribution estimation that results from the direct use of Q-learning, and show how the SATs enable a variance formula to work on Markov processes with general reward functions.
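The gap between the two reward forms can be seen in a toy one-step example. The sketch below is not the authors' construction: the state names, probabilities, and rewards are hypothetical, and the augmentation shown (pairing the state with its successor) is only a minimal illustration of the idea. It suggests that replacing a transition-based reward by its expectation keeps the mean but changes the variance, while an augmented state with a deterministic reward keeps the full distribution.

```python
from fractions import Fraction

# Hypothetical one-step process: from state "s" the chain moves to "a" or "b"
# with equal probability, and the reward depends on the successor state.
P = {"a": Fraction(1, 2), "b": Fraction(1, 2)}  # P(s' | s)
R = {"a": Fraction(0), "b": Fraction(2)}        # transition-based reward r(s, s')


def reward_distribution(pairs):
    """Collect (reward, probability) pairs into a {reward: probability} dict."""
    dist = {}
    for reward, prob in pairs:
        dist[reward] = dist.get(reward, Fraction(0)) + prob
    return dist


def moments(dist):
    """Return (mean, variance) of a {reward: probability} distribution."""
    mean = sum(r * p for r, p in dist.items())
    var = sum((r - mean) ** 2 * p for r, p in dist.items())
    return mean, var


# True one-step reward distribution under the transition-based reward.
true_dist = reward_distribution((R[s2], P[s2]) for s2 in P)

# Naive simplification: replace r(s, s') by its expectation, a state-based r(s).
r_state = sum(R[s2] * P[s2] for s2 in P)
naive_dist = {r_state: Fraction(1)}

# Illustrative augmentation: the augmented state is the pair (s, s'), so the
# reward becomes a deterministic function of the augmented state.
aug_reward = {("s", s2): R[s2] for s2 in P}
aug_prob = {("s", s2): P[s2] for s2 in P}
aug_dist = reward_distribution((aug_reward[x], aug_prob[x]) for x in aug_prob)

true_mean, true_var = moments(true_dist)
naive_mean, naive_var = moments(naive_dist)
aug_mean, aug_var = moments(aug_dist)
```

Here the naive state-based reward matches the true mean but reports zero variance, whereas the augmented process reproduces the true distribution exactly; any risk measure that depends on the distribution, such as Value-at-Risk, would be affected in the same way.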

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Ccl5 establishes an autocrine high-grade glioma growth regulatory circuit critical for mesenchymal glioblastoma survival

Author: Gutmann David H
Hambardzumyan Dolores
Ma Yu
Pan Yuan
Smithson Laura J
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Crossref

Digital Commons@Becker

AEDNet: Adaptive Edge-Deleting Network For Subgraph Matching

Author: Lan Zixun
Ma Fei
Ma Ye
Yu Limin
Yuan LingLong
Publication venue: 'Elsevier BV'
Publication date: 08/11/2022

Subgraph matching is the problem of finding all subgraphs in a data graph that are isomorphic to a given query graph. It is an NP-hard problem, yet it has found applications in many areas. Many learning-based methods have been proposed for graph matching, whereas few have been designed for subgraph matching. The subgraph matching problem is generally more challenging, mainly because the two graphs differ in size, which results in a considerably large solution space. In addition, extra edges in the data graph that connect to the matched nodes may cause two matched nodes of the two graphs to have different adjacency structures and thus to be identified as distinct objects. Because of these extra edges, existing learning-based methods often fail to generate sufficiently similar node-level embeddings for matched nodes. This study proposes a novel Adaptive Edge-Deleting Network (AEDNet) for subgraph matching. The proposed method is trained in an end-to-end fashion. In AEDNet, a novel sample-wise adaptive edge-deleting mechanism removes extra edges to ensure consistency of the adjacency structure of matched nodes, while a unidirectional cross-propagation mechanism ensures consistency of the features of matched nodes. We applied the proposed method to six datasets with graph sizes varying from 20 to 2300. Our evaluations on six open datasets demonstrate that the proposed AEDNet outperforms six state-of-the-art methods and is much faster than the exact methods on large graphs.
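For orientation, the problem statement itself can be sketched as a brute-force search. This is not AEDNet and not one of the exact methods the abstract compares against; it is a minimal enumeration under the assumption that a match is an injective node mapping sending every query edge to a data edge, with extra data edges allowed. The function name and the example graphs are hypothetical.

```python
from itertools import permutations


def subgraph_matches(query_edges, data_edges):
    """Enumerate injective node mappings that send every query edge to a data edge.

    Graphs are undirected and given as lists of node pairs. Extra edges in the
    data graph are allowed (non-induced matching), which is an assumption of
    this sketch.
    """
    q_nodes = sorted({u for edge in query_edges for u in edge})
    d_nodes = sorted({u for edge in data_edges for u in edge})
    d_set = {frozenset(edge) for edge in data_edges}

    matches = []
    # Try every ordered choice of distinct data nodes as images of the query nodes.
    for image in permutations(d_nodes, len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        if all(frozenset((mapping[u], mapping[v])) in d_set for u, v in query_edges):
            matches.append(mapping)
    return matches


# Query: a path 0-1-2. Data: a triangle a-b-c with a pendant node d attached to c.
query = [(0, 1), (1, 2)]
data = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
matches = subgraph_matches(query, data)
```

The number of candidate mappings grows factorially with graph size, which is one way to see why the solution space becomes large when the data graph is much bigger than the query graph.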

arXiv.org e-Print Archive

Axially deformed relativistic Hartree Bogoliubov with separable pairing force

Author: G. A. Lalazissis
I. Talmi
M. Abramowitz
P. Ring
Yuan Tian
Zhong-yu Ma
Publication venue: 'American Physical Society (APS)'
Publication date: 13/08/2009

A separable form of pairing interaction in the $^{1}S_{0}$ channel has been introduced and successfully applied in the description of both static and dynamic properties of superfluid nuclei. By adjusting the parameters to reproduce the pairing properties of the Gogny force in nuclear matter, this separable pairing force is successful in depicting the pairing properties of ground states and vibrational excitations of spherical nuclei on almost the same footing as the original Gogny force. In this article, we extend these investigations for Relativistic Hartree Bogoliubov theory in deformed nuclei with axial symmetry (RHBZ) using the same separable pairing interaction. In order to preserve translational invariance we construct one- and two-dimensional Talmi-Moshinsky brackets for the cylindrical harmonic oscillator basis. We show that the matrix elements of this force can then be expanded in a series of separable terms. The convergence of this expansion is investigated for various deformations. We observe a relatively fast convergence. This allows for a considerable reduction in computing time as compared to RHBZ-calculations with the full Gogny force in the pairing channel. As an example we solve the RHBZ equations with this separable pairing force for the ground states of the chain of Sm-isotopes. Good agreement with the experimental data as well as with other theoretical results is achieved.

Comment: 8 pages, 5 figures. Accepted by Phys. Rev.

arXiv.org e-Print Archive

Crossref