258 research outputs found

The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022

Author: Cai Qutang
Hong Guoqiang
Li Haizhou
Li Ximin
Ye Zhijian
Publication date: 23/09/2022

This technical report describes our system for tracks 1, 2 and 4 of the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). By combining several ResNet variants, our submission for track 1 attained a minDCF of 0.090 with an EER of 1.401%. By further incorporating three fine-tuned pre-trained models, our submission for track 2 achieved a minDCF of 0.072 with an EER of 1.119%. For track 4, our system consisted of voice activity detection (VAD), speaker embedding extraction, and agglomerative hierarchical clustering (AHC), followed by a re-clustering step based on a Bayesian hidden Markov model and by overlapped speech detection and handling. Our submission for track 4 achieved a diarisation error rate (DER) of 4.86%. All submissions ranked 2nd in their corresponding tracks.
Comment: System description of VoxSRC 2022: track 1, 2 and
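The equal error rate (EER) quoted in the abstract above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of that definition, assuming a simple sweep over observed scores as thresholds; the function name `equal_error_rate` and the toy scores are hypothetical illustrations, and this is not the challenge's official scoring tool.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by using every observed score as a threshold.

    A trial is accepted when its score is >= threshold.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance: a non-target trial scores at or above the threshold.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a target trial scores below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores, made up for illustration only.
target_scores = [0.9, 0.8, 0.4]
nontarget_scores = [0.5, 0.3, 0.1]
eer = equal_error_rate(target_scores, nontarget_scores)
print(f"EER = {eer:.3f}")
```

With so few trials the sweep is coarse; real evaluations typically interpolate between thresholds on many thousands of trials.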

arXiv.org e-Print Archive

CA-CentripetalNet: A novel anchor-free deep learning framework for hardhat wearing detection

Author: Cai Nian
Liu Zhijian
Ouyang Wensheng
Tian Nili
Wang Han
Zhang Chengbin
Publication date: 09/07/2023

Automatic hardhat wearing detection can strengthen safety management on construction sites, but it remains challenging because of complicated video surveillance scenes. To address the poor generalization of previous deep-learning-based methods, a novel anchor-free deep learning framework called CA-CentripetalNet is proposed for hardhat wearing detection. Two novel schemes are proposed to improve the feature extraction and utilization ability of CA-CentripetalNet: vertical-horizontal corner pooling and bounding constrained center attention. The former is designed to make comprehensive use of marginal features and internal features. The latter is designed to force the backbone to pay attention to internal features, and it is used only during training, not during detection. Experimental results indicate that CA-CentripetalNet achieves better performance than existing deep-learning-based methods, with 86.63% mAP (mean Average Precision), less memory consumption and a reasonable speed, especially in the case of small-scale hardhats and non-worn hardhats.
Comment: It has been accepted for the journal of Signal, Image and Video Processing, which is a complete version. It is noted that it has been deleted for future publishin

arXiv.org e-Print Archive

Digital literacy and subjective happiness of low-income groups: Evidence from rural China

Author: Cai Zhijian
Liu Chang
Wang Jie
Publication date: 01/01/2022

Improvements in the happiness of the rural population are an essential sign of the effectiveness of relative poverty governance. In the context of today's digital economy, assessing the relationship between digital literacy and the subjective happiness of rural low-income groups is of great practical significance. Based on data from China Family Panel Studies, the effect of digital literacy on the subjective well-being of rural low-income groups was empirically tested. A significant happiness effect of digital literacy on rural low-income groups was found. Digital literacy promotes the subjective happiness of rural low-income groups through income increase and consumption growth effects. The observed happiness effect is heterogeneous among different characteristic groups, and digital literacy significantly positively impacts the subjective happiness of rural low-income groups. Decomposition of subjective happiness into life satisfaction and job satisfaction shows that digital literacy significantly positively affects both the job and life satisfaction of rural low-income groups. This paper demonstrates that digital literacy induces a practical happiness effect. To further strengthen the subjective welfare effect of digital literacy in the construction of digital villages, the government should focus on cultivating digital literacy among low-income groups from the demand side. The construction of digital infrastructure should be actively promoted from the supply side.

PhilPapers

PubMed Central

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Author: Cai Han
Han Song
Lin Ji
Liu Zhijian
Wang Kuan
Wang Tianzhe
Publication date: 15/06/2020

We present APQ for efficient deep learning inference on resource-constrained hardware. Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them jointly. To deal with the larger design space this brings, a promising approach is to train a quantization-aware accuracy predictor that quickly estimates the accuracy of the quantized model and feeds it to the search engine to select the best fit. However, training this quantization-aware accuracy predictor requires collecting a large number of quantized pairs, which involves quantization-aware fine-tuning and is thus highly time-consuming. To tackle this challenge, we propose to transfer the knowledge from a full-precision (i.e., fp32) accuracy predictor to the quantization-aware (i.e., int8) accuracy predictor, which greatly improves the sample efficiency. Besides, collecting the dataset for the fp32 accuracy predictor only requires evaluating neural networks sampled from a pretrained once-for-all network, without any training cost, which is highly efficient. Extensive experiments on ImageNet demonstrate the benefits of our joint optimization approach. With the same accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ. Compared to the separate optimization approach (ProxylessNAS+AMC+HAQ), APQ achieves 2.3% higher ImageNet accuracy while reducing GPU hours and CO2 emissions by orders of magnitude, pushing the frontier for environmentally friendly green AI. The code and video are publicly available.
Comment: Accepted by CVPR 202

arXiv.org e-Print Archive

DSpace@MIT

Prompt Pool based Class-Incremental Continual Learning for Dialog State Tracking

Author: Cai Yucheng
Feng Junlan
Huang Yi
Liu Hong
Ou Zhijian
Zhou Yuan
Publication date: 16/11/2023

Continual learning is crucial for dialog state tracking (DST) in dialog systems, since user requirements for new functionalities are often encountered. However, most existing continual learning methods for DST require task identities during testing, which is a severe limitation in real-world applications. In this paper, we aim to address continual learning of DST in the class-incremental scenario (namely, the task identity is unknown in testing). Inspired by the recently emerging prompt tuning method that performs well on dialog systems, we propose to use the prompt pool method, where we maintain a pool of key-value paired prompts and select prompts from the pool according to the distance between the dialog history and the prompt keys. The proposed method can automatically identify tasks and select appropriate prompts during testing. We conduct experiments on the Schema-Guided Dialog (SGD) dataset and another dataset collected from a real-world dialog application. Experimental results show that the prompt pool method achieves much higher joint goal accuracy than the baseline. After combining with a rehearsal buffer, the model performance can be further improved.
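The selection step described in the abstract above (choosing prompts whose keys are closest to the dialog history) can be sketched as follows. This is a minimal illustration, assuming cosine distance over small hand-written vectors; the abstract does not specify the distance measure, and the names `cosine_distance`, `select_prompts`, the pool contents and the query vector are all hypothetical rather than taken from the paper's code.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_prompts(query, pool, top_k=2):
    """Return the values of the top_k prompts whose keys are nearest to query.

    pool is a list of (key_vector, prompt_value) pairs.
    """
    ranked = sorted(pool, key=lambda pair: cosine_distance(query, pair[0]))
    return [value for _, value in ranked[:top_k]]


# Hypothetical pool: in a real system the keys would be learned vectors and
# the query would be an encoding of the dialog history.
pool = [
    ([1.0, 0.0], "prompt_hotel"),
    ([0.0, 1.0], "prompt_flight"),
    ([0.7, 0.7], "prompt_travel"),
]
query = [0.9, 0.1]
selected = select_prompts(query, pool, top_k=2)
print(selected)
```

Because selection depends only on the query and the keys, no task identity is needed at test time, which is the property the abstract highlights for the class-incremental setting.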

arXiv.org e-Print Archive

UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt

Author: Cai Yucheng
Li Yongbin
Ma Wentao
Ou Zhijian
Shao Yuan
Si Shuzheng
Wu Yuchuan
Publication date: 20/09/2023

Recent research has shown that multi-task pre-training greatly improves a model's robustness and transfer ability, which is crucial for building a high-quality dialog system. However, most previous works on multi-task pre-training rely heavily on human-defined input formats or prompts, which are not optimal in quality and quantity. In this work, we propose to use Task-based Automatic Prompt generation (TAP) to automatically generate high-quality prompts. Using the high-quality prompts generated, we scale the corpus of the pre-trained conversation model to 122 datasets from 15 dialog-related tasks, resulting in the Universal Pre-trained Conversation Model (UniPCM), a powerful foundation model for various conversational tasks and different dialog systems. Extensive experiments have shown that UniPCM is robust to input prompts and capable of various dialog-related tasks. Moreover, UniPCM has strong transfer ability and excels in low-resource scenarios, achieving SOTA results on 9 different datasets ranging from task-oriented dialog to open-domain conversation. Furthermore, we are amazed to find that TAP can generate prompts on par with those collected with crowdsourcing. The code is released with the paper.

arXiv.org e-Print Archive

ROR-γ drives androgen receptor expression and represents a therapeutic target in castration-resistant prostate cancer.

Author: Borowsky Alexander D
Cai Demin
Chen Hong-Wu
Duan Zhijian
Evans Christopher P
Evans Ronald M
Gao Allen C
Kung Hsing-Jien
Lam Kit S
Louie Maggie C
Wang Junjian
Xiang Qiuping
Xu Jianzhen
Xu Yong
Xue Xiaoqian
Yang Joy C
Zhang Yan
Zou June X
Publication venue: eScholarship, University of California
Publication date: 01/05/2016

The androgen receptor (AR) is overexpressed and hyperactivated in human castration-resistant prostate cancer (CRPC). However, the determinants of AR overexpression in CRPC are poorly defined. Here we show that retinoic acid receptor-related orphan receptor γ (ROR-γ) is overexpressed and amplified in metastatic CRPC tumors, and that ROR-γ drives AR expression in the tumors. ROR-γ recruits nuclear receptor coactivators 1 and 3 (NCOA1 and NCOA3, also known as SRC-1 and SRC-3) to an AR-ROR response element (RORE) to stimulate AR gene transcription. ROR-γ antagonists suppress the expression of both AR and its variant AR-V7 in prostate cancer (PCa) cell lines and tumors. ROR-γ antagonists also markedly diminish genome-wide AR binding, H3K27ac abundance and expression of the AR target gene network. Finally, ROR-γ antagonists suppressed tumor growth in multiple AR-expressing, but not AR-negative, xenograft PCa models, and they effectively sensitized CRPC tumors to enzalutamide, without overt toxicity, in mice. Taken together, these results establish ROR-γ as a key player in CRPC by acting upstream of AR and as a potential therapeutic target for advanced PCa.

PubMed Central

eScholarship - University of California