An Ontology-Based Artificial Intelligence Model for Medicine Side-Effect Prediction: Taking Traditional Chinese Medicine as An Example
In this work, an ontology-based model for AI-assisted medicine side-effect
(SE) prediction is developed, and its three main components, namely the drug
model, the treatment model, and the AI-assisted prediction model, are
presented. To validate the proposed model, an ANN structure is
established and trained on two hundred and forty-two TCM prescriptions. These
data are gathered and classified from the most famous ancient TCM book and more
than one thousand SE reports, in which two ontology-based attributes, hot and
cold, are introduced to evaluate whether a prescription will cause an SE.
The preliminary results reveal a relationship between the
ontology-based attributes and the corresponding predicted indicator that can
be learnt by AI for predicting SEs, which suggests that the proposed model has
potential in AI-assisted SE prediction. However, it should be noted that the
proposed model depends heavily on sufficient clinical data, and much
deeper exploration is therefore important for enhancing the accuracy of the prediction
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models
Language model (LM) based audio generation frameworks, e.g., AudioLM, have
recently achieved new state-of-the-art performance in zero-shot audio
generation. In this paper, we explore the feasibility of LMs for zero-shot
voice conversion. An intuitive approach is to follow AudioLM: tokenizing
speech into semantic and acoustic tokens by HuBERT and SoundStream,
respectively, and converting source semantic tokens to target acoustic tokens
conditioned on acoustic tokens of the target speaker. However, such an approach
encounters several issues: 1) the linguistic content contained in semantic
tokens may get dispersed during multi-layer modeling while the lengthy speech
input in the voice conversion task makes contextual learning even harder; 2)
the semantic tokens still contain speaker-related information, which may be
leaked to the target speech, lowering the target speaker similarity; 3) the
generation diversity in the sampling of the LM can lead to unexpected outcomes
during inference, resulting in unnatural pronunciation and speech quality
degradation. To mitigate these problems, we propose LM-VC, a two-stage language
modeling approach that generates coarse acoustic tokens for recovering the
source linguistic content and target speaker's timbre, and then reconstructs
the fine acoustic details as converted speech. Specifically, to enhance
content preservation and facilitate better disentanglement, a masked prefix LM
with a mask prediction strategy is used for coarse acoustic modeling. This
model is encouraged to recover the masked content from the surrounding context
and generate target speech based on the target speaker's utterance and
corrupted semantic tokens. Besides, to further alleviate the sampling error in
the generation, an external LM, which employs window attention to capture the
local acoustic relations, is introduced to participate in the coarse acoustic
modeling
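The mask prediction idea described in this abstract can be illustrated with a minimal sketch. It assumes that "corrupted semantic tokens" means a contiguous span of token IDs replaced by a mask symbol; the abstract does not state the masking scheme, so the span-based choice, the `MASK_ID` value, and the `mask_span` helper are all hypothetical and not taken from LM-VC.

```python
MASK_ID = -1  # hypothetical mask symbol; the real vocabulary is not given in the abstract


def mask_span(tokens, start, length, mask_id=MASK_ID):
    """Return a copy of `tokens` with a contiguous span replaced by `mask_id`.

    Assumption: span masking stands in for the (unspecified) corruption
    strategy. A model trained on such input would have to recover the
    masked positions from the surrounding context.
    """
    corrupted = list(tokens)
    begin = max(start, 0)
    end = min(start + length, len(tokens))
    for i in range(begin, end):
        corrupted[i] = mask_id
    return corrupted


# Toy semantic-token sequence (made-up IDs, for illustration only).
semantic_tokens = [5, 6, 7, 8, 9]
corrupted_tokens = mask_span(semantic_tokens, start=1, length=2)
```

The original sequence is left untouched, so it can still serve as the training target for the masked positions.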
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Conveying the linguistic content and maintaining the source speech's speaking
style, such as intonation and emotion, is essential in voice conversion (VC).
However, in a low-resource situation, where only limited utterances from the
target speaker are accessible, existing VC methods struggle to meet this
requirement and capture the target speaker's timbre. In this work, a novel VC
model, referred to as MFC-StyleVC, is proposed for the low-resource VC task.
Specifically, a speaker timbre constraint generated by a clustering method is
newly proposed to guide target speaker timbre learning in different stages.
Meanwhile, to prevent over-fitting to the target speaker's limited data,
perceptual regularization constraints explicitly maintain model performance on
specific aspects, including speaking style, linguistic content, and speech
quality. Besides, a simulation mode is introduced to simulate the inference
process to alleviate the mismatch between training and inference. Extensive
experiments performed on highly expressive speech demonstrate the superiority
of the proposed method in low-resource VC.
Comment: Accepted by ICASSP 202
Beamforming Designs and Performance Evaluations for Intelligent Reflecting Surface Enhanced Wireless Communication System with Hardware Impairments
An intelligent reflecting surface (IRS) can effectively control the wavefront of
the impinging signals, and has emerged as a promising way to improve the energy
and spectrum efficiency of wireless communication systems. Most existing
studies were conducted under the assumption that the hardware operates
perfectly, without any impairment. However, both the physical transceivers and
the IRS suffer from non-negligible hardware impairments in practice, which bring
major challenges, e.g., increasing the difficulty and complexity of the
beamforming designs, and degrading the system performance. In this paper, by
taking hardware impairments into consideration, we design the transmit and
reflect beamforming and evaluate the system performance. First, we
utilize the linear minimum mean square error estimator to perform channel
estimation, and analyze the factors that affect estimation accuracy. Then, we
derive the optimal transmit beamforming vector, and propose a gradient descent
method-based algorithm to obtain a sub-optimal reflect beamforming solution.
Next, we analyze the asymptotic channel capacities by considering two types of
asymptotics with respect to the transmit power and the numbers of antennas and
reflecting elements. Finally, we analyze the power scaling law and the energy
efficiency. By comparing the performance of our proposed algorithm with the
upper bound on the performance of the globally optimal reflect beamforming
solution, the simulation results demonstrate that our proposed algorithm can
offer outstanding performance with low computational complexity. The simulation
results also show that there is no need to spend heavily on expensive antennas
to achieve both high spectral efficiency and energy efficiency when the
communication system is assisted by an IRS and suffers from hardware
impairments.
Comment: arXiv admin note: text overlap with arXiv:2004.09804,
arXiv:2004.0976
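As a rough illustration of the linear minimum mean square error (LMMSE) channel estimation mentioned in this abstract, the sketch below covers only the textbook scalar case. It assumes a real-valued observation y = h * pilot + n with zero-mean h and n of known variances; it does not model the hardware impairments or the IRS cascade channel treated in the paper, and the function names are hypothetical.

```python
def lmmse_estimate(y, pilot, channel_var, noise_var):
    """Scalar LMMSE estimate of h from y = h * pilot + n.

    Assumption: h and n are zero-mean, uncorrelated, real-valued, with
    variances `channel_var` and `noise_var`.
    """
    gain = channel_var * pilot / (channel_var * pilot ** 2 + noise_var)
    return gain * y


def lmmse_mse(pilot, channel_var, noise_var):
    """Theoretical mean square error of the scalar LMMSE estimate."""
    return channel_var * noise_var / (channel_var * pilot ** 2 + noise_var)


# Doubling the pilot amplitude lowers the estimation error in this toy model.
mse_low_power = lmmse_mse(pilot=1.0, channel_var=1.0, noise_var=1.0)
mse_high_power = lmmse_mse(pilot=2.0, channel_var=1.0, noise_var=1.0)
```

In this simplified model the error falls as pilot power grows or noise variance shrinks, which is one example of the kind of factor affecting estimation accuracy.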
汉语带标被动构式的构式化 = CONSTRUCTIONALIZATION OF MARKED PASSIVE CONSTRUCTION (MPC) IN CHINESE
Master's
MASTER OF ART