
    Learning Stochastic Shortest Path with Linear Function Approximation

    We study the stochastic shortest path (SSP) problem in reinforcement learning with linear function approximation, where the transition kernel is represented as a linear mixture of unknown models. We call this class of SSP problems linear mixture SSPs. We propose a novel algorithm with Hoeffding-type confidence sets for learning the linear mixture SSP, which attains an $\tilde{\mathcal{O}}(d B_{\star}^{1.5}\sqrt{K/c_{\min}})$ regret. Here $K$ is the number of episodes, $d$ is the dimension of the feature mapping in the mixture model, $B_{\star}$ bounds the expected cumulative cost of the optimal policy, and $c_{\min}>0$ is the lower bound of the cost function. Our algorithm also applies to the case when $c_{\min}=0$, where an $\tilde{\mathcal{O}}(K^{2/3})$ regret is guaranteed. To the best of our knowledge, this is the first algorithm with a sublinear regret guarantee for learning linear mixture SSPs. Moreover, we design a refined Bernstein-type confidence set and propose an improved algorithm, which provably achieves an $\tilde{\mathcal{O}}(d B_{\star}\sqrt{K/c_{\min}})$ regret. Complementing the regret upper bounds, we also prove a lower bound of $\Omega(d B_{\star}\sqrt{K})$. Hence, our improved algorithm matches the lower bound up to a $1/\sqrt{c_{\min}}$ factor and poly-logarithmic factors, achieving a near-optimal regret guarantee.
    Comment: 46 pages, 1 figure. In ICML 202
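
    As a hedged illustration of the confidence-set machinery described above, the sketch below builds a ridge-regression estimate of the mixture parameter and a Hoeffding-style confidence radius. The function names, the regularizer lam, and the constants inside the radius are expository assumptions, not the paper's exact algorithm.

        import numpy as np

        def ridge_estimate(features, targets, lam=1.0):
            """Ridge-regression estimate of the mixture parameter theta.
            features: (n, d) array of aggregated feature vectors; targets: (n,)."""
            d = features.shape[1]
            gram = lam * np.eye(d) + features.T @ features   # Gram matrix Lambda_t
            theta_hat = np.linalg.solve(gram, features.T @ targets)
            return theta_hat, gram

        def hoeffding_radius(d, t, lam=1.0, delta=0.05, scale=1.0):
            """Hoeffding-style radius beta_t so that, w.h.p., the true theta lies in
            {theta : ||theta - theta_hat||_{Lambda_t} <= beta_t}."""
            return scale * np.sqrt(d * np.log((1.0 + t / lam) / delta))

    An optimism-based planner would then select, at each episode, the model inside this confidence ellipsoid with the smallest estimated cost-to-go.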

    Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition

    In recent years, speech emotion recognition has become significant in industrial applications such as call centers, social robots, and health care. Combining speech recognition with speech emotion recognition can improve feedback efficiency and quality of service; thus, speech emotion recognition has attracted much attention in both industry and academia. Since the emotions present in an utterance may occur with varied probabilities, speech emotion is often ambiguous, which poses great challenges to recognition tasks. However, previous studies commonly assigned a single label or multiple labels to each utterance with certainty, so their algorithms suffer low accuracy because of this inappropriate representation. Inspired by the optimally interacting theory, we address ambiguous speech emotions by proposing a novel multi-classifier interactive learning (MCIL) method. In MCIL, multiple different classifiers first mimic several individuals who have inconsistent cognitions of ambiguous emotions and construct new ambiguous labels (an emotion probability distribution). They are then retrained with the new labels so that their cognitions interact. This procedure enables each classifier to learn better representations of ambiguous data from the others and further improves recognition ability. Experiments on three benchmark corpora (MAS, IEMOCAP, and FAU-AIBO) demonstrate that MCIL not only improves each classifier's performance but also raises their recognition consistency from moderate to substantial.
    Comment: 10 pages, 4 figures
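
    The interactive step can be sketched as follows: each classifier outputs an emotion probability distribution, the outputs are pooled into a soft "ambiguous" label, and each classifier is retrained on the pooled labels. The uniform averaging used here is an assumption; the paper constructs the distribution from the classifiers' (possibly inconsistent) cognitions.

        import numpy as np

        def build_ambiguous_labels(prob_outputs):
            """prob_outputs: list of (n_samples, n_emotions) softmax matrices,
            one per classifier. Returns pooled soft labels for retraining."""
            stacked = np.stack(prob_outputs, axis=0)  # (n_classifiers, n, k)
            soft = stacked.mean(axis=0)               # pooled emotion distribution
            return soft / soft.sum(axis=1, keepdims=True)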

    Delving into Out-of-Distribution Detection with Vision-Language Representations

    Recognizing out-of-distribution (OOD) samples is critical for machine learning systems deployed in the open world. The vast majority of OOD detection methods are driven by a single modality (e.g., either vision or language), leaving the rich information in multi-modal representations untapped. Inspired by the recent success of vision-language pre-training, this paper enriches the landscape of OOD detection from a single-modal to a multi-modal regime. In particular, we propose Maximum Concept Matching (MCM), a simple yet effective zero-shot OOD detection method based on aligning visual features with textual concepts. We contribute in-depth analysis and theoretical insights to understand the effectiveness of MCM. Extensive experiments demonstrate that MCM achieves superior performance on a wide variety of real-world tasks. MCM with vision-language features outperforms a common baseline with pure visual features on a hard OOD task with semantically similar classes by 13.1% (AUROC). Code is available at https://github.com/deeplearning-wisc/MCM.
    Comment: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
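
    The MCM score is simple to state: a softmax over the cosine similarities between an image embedding and the class-concept text embeddings, taking the maximum as the confidence. The sketch below assumes L2-normalized features and an illustrative temperature; see the repository linked above for the authors' implementation.

        import numpy as np

        def mcm_score(image_feat, text_feats, temperature=1.0):
            """image_feat: (d,) normalized visual embedding;
            text_feats: (k, d) normalized concept embeddings.
            Larger scores suggest in-distribution; thresholding flags OOD."""
            sims = text_feats @ image_feat   # cosine similarities to each concept
            probs = np.exp(sims / temperature)
            probs /= probs.sum()             # softmax over concepts
            return probs.max()               # MCM confidence score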

    Four-dimensional Cone Beam CT Reconstruction and Enhancement using a Temporal Non-Local Means Method

    Four-dimensional cone beam computed tomography (4D-CBCT) has been developed to provide respiratory-phase-resolved volumetric imaging in image guided radiation therapy (IGRT). An inadequate number of projections in each phase bin results in low-quality 4D-CBCT images with obvious streaking artifacts. In this work, we propose two novel 4D-CBCT algorithms, an iterative reconstruction algorithm and an enhancement algorithm, both utilizing a temporal non-local means (TNLM) method. We define a TNLM energy term for a given set of 4D-CBCT images. Minimizing this term favors those 4D-CBCT images in which any anatomical feature at one spatial point at one phase can be found at a nearby spatial point at neighboring phases. 4D-CBCT reconstruction is achieved by minimizing a total energy consisting of a data fidelity term and the TNLM energy term. For image enhancement, 4D-CBCT images generated by the FDK algorithm are enhanced by minimizing the TNLM function while keeping the enhanced images close to the FDK results. A forward-backward splitting algorithm and a Gauss-Jacobi iteration method are employed to solve the problems. The algorithms are implemented on a GPU to achieve high computational efficiency. The reconstruction algorithm and the enhancement algorithm generate visually similar 4D-CBCT images, both better than the FDK results. Quantitative evaluations indicate that, compared with the FDK results, our reconstruction method improves the contrast-to-noise ratio (CNR) by a factor of 2.56~3.13 and our enhancement method increases the CNR by 2.75~3.33 times. The enhancement method also removes over 80% of the streak artifacts from the FDK results. The total computation time is ~460 sec for the reconstruction algorithm and ~610 sec for the enhancement algorithm on an NVIDIA Tesla C1060 GPU card.
    Comment: 20 pages, 3 figures, 2 tables
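
    The core of the TNLM term is a patch-similarity weight between neighboring phases; a minimal sketch follows. The Gaussian weighting and the smoothing parameter h are standard non-local means choices and are assumptions here, not the paper's exact discretization.

        import numpy as np

        def tnlm_weight(patch_a, patch_b, h=0.1):
            """Similarity weight between two same-sized patches drawn from
            neighboring respiratory phases; larger when the patches match."""
            d2 = np.mean((patch_a - patch_b) ** 2)  # mean squared difference
            return np.exp(-d2 / (h * h))            # Gaussian falloff

    Summing such weighted patch differences over all phases gives the TNLM energy; a Gauss-Jacobi iteration then pulls each voxel toward a weighted average of its temporal neighbors.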

    GPU-based Fast Low-dose Cone Beam CT Reconstruction via Total Variation

    Cone-beam CT (CBCT) has been widely used in image guided radiation therapy (IGRT) to acquire updated volumetric anatomical information before treatment fractions for accurate patient alignment. However, the excessive x-ray imaging dose from serial CBCT scans raises a clinical concern in most IGRT procedures. This imaging dose can be effectively reduced by reducing the number of x-ray projections and/or lowering the mAs level in a CBCT scan. The goal of this work is to develop a fast GPU-based algorithm to reconstruct high-quality CBCT images from undersampled and noisy projection data so as to lower the imaging dose. The CBCT is reconstructed by minimizing an energy functional consisting of a data fidelity term and a total variation regularization term. We developed a GPU-friendly version of the forward-backward splitting algorithm to solve this model, and a multi-grid technique is also employed. We test our CBCT reconstruction algorithm on a digital NCAT phantom and a head-and-neck patient case. The performance under low mAs is also validated using a physical Catphan phantom and a head-and-neck Rando phantom. We find that 40 x-ray projections are sufficient to reconstruct CBCT images with satisfactory quality for IGRT patient alignment. Phantom experiments indicate that CBCT images can be successfully reconstructed with our algorithm at levels as low as 0.1 mAs/projection. Compared with the currently widely used full-fan head-and-neck scanning protocol of about 360 projections at 0.4 mAs/projection, an overall 36-fold dose reduction is estimated to be achieved with our algorithm. Moreover, the reconstruction time is about 130 sec on an NVIDIA Tesla C1060 GPU card, estimated to be ~100 times faster than similar iterative reconstruction approaches.
    Comment: 20 pages, 10 figures. Paper was revised and more testing cases were added
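
    A hedged sketch of one forward-backward splitting iteration for the TV model is given below: a gradient step on the data fidelity ||Ax - y||^2 followed by a proximal step on the TV term. The callables fproj/bproj (forward projector A and its adjoint) and prox_tv are placeholders standing in for the GPU kernels used in the paper.

        def fbs_step(x, fproj, bproj, y, step, prox_tv, tv_weight):
            """One forward-backward splitting iteration.
            x: current image; y: measured projections;
            fproj/bproj: forward projection A and its adjoint A^T."""
            grad = bproj(fproj(x) - y)           # gradient of 0.5*||Ax - y||^2
            z = x - step * grad                  # forward (gradient) step
            return prox_tv(z, step * tv_weight)  # backward (TV proximal) step

    Iterating fbs_step with a suitable step size decreases the total energy, and the multi-grid technique mentioned above accelerates convergence by solving coarser versions of the same problem first.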