Subset Sampling and Its Extensions

Author: Huang Jinchao
Wang Sibo
Publication date: 21/07/2023

This paper studies the \emph{subset sampling} problem. The input is a set $\mathcal{S}$ of $n$ records together with a function $\textbf{p}$ that assigns each record $v\in\mathcal{S}$ a probability $\textbf{p}(v)$. A query returns a random subset $X$ of $\mathcal{S}$, where each record $v\in\mathcal{S}$ is sampled into $X$ independently with probability $\textbf{p}(v)$. The goal is to store $\mathcal{S}$ in a data structure to answer queries efficiently. If $\mathcal{S}$ fits in memory, the problem is interesting when $\mathcal{S}$ is dynamic. We develop a dynamic data structure with $\mathcal{O}(1+\mu_{\mathcal{S}})$ expected \emph{query} time, $\mathcal{O}(n)$ space and $\mathcal{O}(1)$ amortized expected \emph{update}, \emph{insert} and \emph{delete} time, where $\mu_{\mathcal{S}}=\sum_{v\in\mathcal{S}}\textbf{p}(v)$. The query time and space are optimal. If $\mathcal{S}$ does not fit in memory, the problem is difficult even if $\mathcal{S}$ is static. Under this scenario, we present an I/O-efficient algorithm that answers a \emph{query} in $\mathcal{O}\left((\log^*_B n)/B+(\mu_\mathcal{S}/B)\log_{M/B} (n/B)\right)$ amortized expected I/Os using $\mathcal{O}(n/B)$ space, where $M$ is the memory size, $B$ is the block size and $\log^*_B n$ is the number of iterative $\log_2(.)$ operations we need to perform on $n$ before going below $B$. In addition, when each record is associated with a real-valued key, we extend the \emph{subset sampling} problem to the \emph{range subset sampling} problem, in which we require that the keys of the sampled records fall within a specified input range $[a,b]$. For this extension, we provide a solution under the dynamic setting, with $\mathcal{O}(\log n+\mu_{\mathcal{S}\cap[a,b]})$ expected \emph{query} time, $\mathcal{O}(n)$ space and $\mathcal{O}(\log n)$ amortized expected \emph{update}, \emph{insert} and \emph{delete} time.

Comment: 17 pages
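To make the problem statement concrete, here is a minimal Python sketch of a subset sampling query. It is not the data structure from the paper. `naive_subset_sample` flips one coin per record and therefore takes $\mathcal{O}(n)$ time per query. `uniform_subset_sample` covers only the special case where every record has the same probability $p$ (an assumption made for illustration) and uses geometric skips, so its expected time is $\mathcal{O}(1+pn)$, which matches the $\mathcal{O}(1+\mu_{\mathcal{S}})$ form in that case. Both function names are hypothetical.

```python
import math
import random


def naive_subset_sample(probs, rng):
    """Include each record v independently with probability probs[v].

    probs maps record -> probability. Runs in O(n) time per query.
    """
    return [v for v, p in probs.items() if rng.random() < p]


def uniform_subset_sample(records, p, rng):
    """Sample each record with the SAME probability p via geometric skips.

    Illustrative special case only: expected time is O(1 + p * len(records)).
    """
    if p <= 0.0:
        return []
    if p >= 1.0:
        return list(records)
    log_q = math.log1p(-p)  # log(1 - p), strictly negative here
    out = []
    i = -1
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1]
        # floor(log(u) / log(1 - p)) is the number of records skipped
        # before the next sampled one (geometric distribution).
        i += 1 + int(math.log(u) / log_q)
        if i >= len(records):
            return out
        out.append(records[i])
```

The skip-based variant touches only the records it returns, plus one final overshoot, which is why the query cost depends on the expected output size rather than on $n$.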

arXiv.org e-Print Archive

Nuclear Matter and Neutron Stars from Relativistic Brueckner-Hartree-Fock Theory

Author: Tong Hui
Wang Chencan
Wang Sibo
Publication venue: 'American Astronomical Society'
Publication date: 27/10/2022

The momentum and isospin dependence of the single-particle potential for the in-medium nucleon are the key quantities in the Relativistic Brueckner-Hartree-Fock (RBHF) theory. They depend on how the scalar and the vector components of the single-particle potential are extracted inside nuclear matter. In contrast to the RBHF calculations in the Dirac space with the positive-energy states (PESs) only, the single-particle potential can be determined in a unique way by the RBHF theory together with the negative-energy states (NESs), i.e., the RBHF theory in the full Dirac space. The saturation properties of symmetric and asymmetric nuclear matter in the full Dirac space are systematically investigated based on the realistic Bonn nucleon-nucleon potentials. In order to further specify the importance of the calculations in the full Dirac space, the neutron star properties are investigated. The direct URCA process in neutron star cooling will happen at density $\rho_{\rm{DURCA}}=0.43,~0.48,~0.52$ fm$^{-3}$ with the proton fractions $Y_{p,\rm{DURCA}}=0.13$. The radii of a $1.4M_\odot$ neutron star are predicted to be $R_{1.4M_\odot}=11.97,~12.13,~12.27$ km, and their tidal deformabilities are $\Lambda_{1.4M_\odot}=376,~405,~433$ for the potentials Bonn A, B, and C. Compared with the results obtained in the Dirac space with PESs only, the full-Dirac-space RBHF calculation predicts the softest symmetry energy, which would be more favored by the gravitational-wave (GW) detection from GW170817. Furthermore, the results from full-Dirac-space RBHF theory are consistent with the recent astronomical observations of massive neutron stars and simultaneous mass-radius measurement.

arXiv.org e-Print Archive

TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents

Author: Ma Yuexin
Manocha Dinesh
Wang Wenping
Yang Ruigang
Zhang Sibo
Zhu Xinge
Publication date: 09/04/2019

To safely and efficiently navigate in complex urban traffic, autonomous vehicles must make responsible predictions in relation to surrounding traffic-agents (vehicles, bicycles, pedestrians, etc.). A challenging and critical task is to explore the movement patterns of different traffic-agents and predict their future trajectories accurately to help the autonomous vehicle make reasonable navigation decisions. To solve this problem, we propose a long short-term memory-based (LSTM-based) real-time traffic prediction algorithm, TrafficPredict. Our approach uses an instance layer to learn instances' movements and interactions and has a category layer to learn the similarities of instances belonging to the same type to refine the prediction. In order to evaluate its performance, we collected trajectory datasets in a large city consisting of varying conditions and traffic densities. The dataset includes many challenging scenarios where vehicles, bicycles, and pedestrians move among one another. We evaluate the performance of TrafficPredict on our new dataset and highlight its higher accuracy for trajectory prediction by comparing with prior prediction methods.

Comment: Accepted by AAAI(Oral) 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Offline Experience Replay for Continual Offline Reinforcement Learning

Author: Gai Sibo
He Li
Wang Donglin
Publication date: 23/05/2023

The capability of continuously learning new skills via a sequence of pre-collected offline datasets is desirable for an agent. However, consecutively learning a sequence of offline tasks likely leads to the catastrophic forgetting issue under resource-limited scenarios. In this paper, we formulate a new setting, continual offline reinforcement learning (CORL), where an agent learns a sequence of offline reinforcement learning tasks and pursues good performance on all learned tasks with a small replay buffer without exploring any of the environments of all the sequential tasks. For consistently learning on all sequential tasks, an agent requires acquiring new knowledge and meanwhile preserving old knowledge in an offline manner. To this end, we introduced continual learning algorithms and experimentally found experience replay (ER) to be the most suitable algorithm for the CORL problem. However, we observe that introducing ER into CORL encounters a new distribution shift problem: the mismatch between the experiences in the replay buffer and trajectories from the learned policy. To address such an issue, we propose a new model-based experience selection (MBES) scheme to build the replay buffer, where a transition model is learned to approximate the state distribution. This model is used to bridge the distribution bias between the replay buffer and the learned model by filtering the data from offline data that most closely resembles the learned model for storage. Moreover, in order to enhance the ability to learn new tasks, we retrofit the experience replay method with a new dual behavior cloning (DBC) architecture to avoid the disturbance of behavior-cloning loss on the Q-learning process. In general, we call our algorithm offline experience replay (OER). Extensive experiments demonstrate that our OER method outperforms SOTA baselines in widely used MuJoCo environments.

Comment: 9 pages, 4 figures

arXiv.org e-Print Archive

Properties of $^{208}$Pb predicted from the relativistic equation of state in the full Dirac space

Author: Gao Jing
Tong Hui
Wang Chencan
Wang Sibo
Publication venue: 'American Physical Society (APS)'
Publication date: 29/12/2022

Relativistic Brueckner-Hartree-Fock (RBHF) theory in the full Dirac space allows one to determine uniquely the momentum dependence of scalar and vector components of the single-particle potentials. In order to extend this new method from nuclear matter to finite nuclei, as a first step, properties of $^{208}$Pb are explored by using the microscopic equation of state for asymmetric nuclear matter and a liquid droplet model. The neutron and proton density distributions, the binding energies, the neutron and proton radii, and the neutron skin thickness in $^{208}$Pb are calculated. In order to further compare the charge densities predicted from the RBHF theory in the full Dirac space with the experimental charge densities, the differential cross sections and the electric charge form factors in the elastic electron-nucleus scattering are obtained by using the phase-shift analysis method. The results from the RBHF theory are in good agreement with the experimental data. In addition, the uncertainty arising from variations of the surface term parameter $f_0$ in the liquid droplet model is also discussed.

arXiv.org e-Print Archive

Rayleigh Quotient Graph Neural Networks for Graph-level Anomaly Detection

Author: Dong Xiangyu
Wang Sibo
Zhang Xingyi
Publication date: 27/03/2024

Graph-level anomaly detection has gained significant attention as it finds applications in various domains, such as cancer diagnosis and enzyme prediction. However, existing methods fail to capture the spectral properties of graph anomalies, resulting in unexplainable framework design and unsatisfying performance. In this paper, we re-investigate the spectral differences between anomalous and normal graphs. Our main observation shows a significant disparity in the accumulated spectral energy between these two classes. Moreover, we prove that the accumulated spectral energy of the graph signal can be represented by its Rayleigh Quotient, indicating that the Rayleigh Quotient is a driving factor behind the anomalous properties of graphs. Motivated by this, we propose Rayleigh Quotient Graph Neural Network (RQGNN), the first spectral GNN that explores the inherent spectral features of anomalous graphs for graph-level anomaly detection. Specifically, we introduce a novel framework with two components: the Rayleigh Quotient learning component (RQL) and Chebyshev Wavelet GNN with RQ-pooling (CWGNN-RQ). RQL explicitly captures the Rayleigh Quotient of graphs and CWGNN-RQ implicitly explores the spectral space of graphs. Extensive experiments on 10 real-world datasets show that RQGNN outperforms the best rival by 6.74% in Macro-F1 score and 1.44% in AUC, demonstrating the effectiveness of our framework. Our code is available at https://github.com/xydong127/RQGNN
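As an illustration of the quantity the abstract refers to, the Rayleigh quotient of a graph signal $x$ can be written as $x^\top L x / x^\top x$. The sketch below is not taken from the RQGNN code. It assumes an undirected graph given as a dense symmetric adjacency matrix and the unnormalized Laplacian $L = D - A$ (the paper may use a different normalization), and the function name `rayleigh_quotient` is hypothetical.

```python
def rayleigh_quotient(adj, x):
    """Return x^T L x / x^T x for the unnormalized Laplacian L = D - A.

    adj: symmetric adjacency matrix as a list of lists (assumed undirected).
    x:   graph signal, one value per node.
    """
    n = len(adj)
    # For symmetric A, x^T L x equals the sum over edges i<j of
    # a_ij * (x_i - x_j)^2, so L never has to be built explicitly.
    numerator = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            numerator += adj[i][j] * (x[i] - x[j]) ** 2
    denominator = sum(v * v for v in x)
    if denominator == 0:
        raise ValueError("Rayleigh quotient is undefined for the zero signal")
    return numerator / denominator
```

A constant signal gives 0, and signals that vary sharply across edges give larger values, so the quotient summarizes how much of the signal's energy sits at high graph frequencies.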

arXiv.org e-Print Archive