119 research outputs found

Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion

Author: Busso Carlos
Li Haizhou
Ma Bin
Sisman Berrak
Zhou Kun
Publication date: 17/09/2023

Emotional voice conversion (EVC) traditionally targets the transformation of spoken utterances from one emotional state to another, with previous research mainly focusing on discrete emotion categories. This paper departs from the norm by introducing a novel perspective: a nuanced rendering of mixed emotions and enhanced control over emotional expression. To achieve this, we propose a novel EVC framework, Mixed-EVC, which leverages only discrete emotion training labels. We construct an attribute vector that encodes the relationships among these discrete emotions, which is predicted using a ranking-based support vector machine and then integrated into a sequence-to-sequence (seq2seq) EVC framework. Mixed-EVC not only learns to characterize the input emotional style but also quantifies its relevance to other emotions during training. As a result, users can assign these attributes to achieve their desired rendering of mixed emotions. Objective and subjective evaluations confirm the effectiveness of our approach in terms of mixed emotion synthesis and control, while surpassing traditional baselines in the conversion of discrete emotions from one to another.

arXiv.org e-Print Archive

Progress in the seasonal variations of blood lipids: a mini-review.

Author: Ma Xiaochun
Wang Mansen
Yan Haichen
Zhang Haizhou
Zhang Qunye
Zhou Xiaoming
Publication venue: Providence St. Joseph Health Digital Commons
Publication date: 25/05/2020

The seasonal variations of blood lipids have recently gained increasing interest in the field of lipid metabolism. Elucidating the seasonal patterns of blood lipids is particularly helpful for the prevention and treatment of cardiovascular and cerebrovascular diseases. However, previous results remain controversial and the underlying mechanisms are still unclear. This mini-review summarizes the literature relevant to the seasonal variability of blood lipid parameters and discusses its significance in clinical diagnoses and management decisions.

Providence St. Joseph Health Digital Commons

Long Short-term Memory with Two-Compartment Spiking Neuron

Author: Li Haizhou
Ma Chenxiang
Tan Kay Chen
Wu Jibin
Yang Qu
Zhang Shimin
Publication date: 14/07/2023

The identification of sensory cues associated with potential opportunities and dangers is frequently complicated by unrelated events that separate useful cues by long delays. As a result, it remains a challenging task for state-of-the-art spiking neural networks (SNNs) to identify long-term temporal dependencies, since bridging the temporal gap necessitates an extended memory capacity. To address this challenge, we propose a novel biologically inspired Long Short-Term Memory Leaky Integrate-and-Fire spiking neuron model, dubbed LSTM-LIF. Our model incorporates carefully designed somatic and dendritic compartments that are tailored to retain short- and long-term memories. The theoretical analysis further confirms its effectiveness in addressing the notorious vanishing gradient problem. Our experimental results, on a diverse range of temporal classification tasks, demonstrate superior temporal classification capability, rapid training convergence, strong network generalizability, and high energy efficiency of the proposed LSTM-LIF model. This work, therefore, opens up a myriad of opportunities for resolving challenging temporal processing tasks on emerging neuromorphic computing machines.

arXiv.org e-Print Archive

Sparse Classifier Fusion for Speaker Verification

Author: Bin Ma
Filip Sedlak
Haizhou Li
Kong Aik Lee
Tomi Kinnunen
Ville Hautamaki
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

NIST 2007 Language Recognition Evaluation: From the Perspective of IIR

Author: Lee Kong-Aik
Li Haizhou
Ma Bin
Sim Khe-Chai
Sun Hanwu
Tong Rong
You Changhuai
Zhu Donglai
Publication venue: De La Salle University - Dasmarinas
Publication date: 01/01/2008

PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200

Waseda University Repository

Independent language modeling architecture for end-to-end ASR

Author: Chng Eng Siong
Khassanov Yerbolat
Li Haizhou
Ma Bin
Ni Chongjia
Pham Van Tung
Xu Haihua
Zeng Zhiping
Publication date: 01/01/2019

The attention-based end-to-end (E2E) automatic speech recognition (ASR) architecture allows for joint optimization of acoustic and language models within a single network. However, in a vanilla E2E ASR architecture, the decoder sub-network (subnet), which incorporates the role of the language model (LM), is conditioned on the encoder output. This means that the acoustic encoder and the language model are entangled, which prevents the language model from being trained separately on external text data. To address this problem, we propose a new architecture that separates the decoder subnet from the encoder output. In this way, the decoupled subnet becomes an independently trainable LM subnet, which can easily be updated using external text data. We study two strategies for updating the new architecture. Experimental results show that 1) the independent LM architecture benefits from external text data, achieving 9.3% and 22.8% relative character and word error rate reduction on the Mandarin HKUST and English NSC datasets, respectively; and 2) the proposed architecture works well with an external LM and generalizes to different amounts of labelled data.

arXiv.org e-Print Archive

Crossref

Nazarbayev University Repository

Robust Speaker Verification Using Short-Time Frequency with Long-Time Window and Fusion of Multi-Resolutions

Author: Bin Ma
Brian Mak
Chien-Lin Huang
Chung-Hsien Wu
Haizhou Li
Publication date: 01/01/2008

This study presents a novel approach to feature analysis for speaker verification. There are two main contributions in this paper. First, the feature analysis of short-time frequency with long-time window (SFLW) yields a compact feature that makes speaker verification more efficient. The purpose of SFLW is to take account of short-time frequency characteristics and long-time resolution at the same time. Second, the fusion of multi-resolutions is used to make speaker verification more robust. The speaker verification system can be further improved using multi-resolution features. The experimental results indicate that the proposed approaches not only reduce the processing time but also improve the performance of speaker verification.

CiteSeerX

Diagenesis of the first member of Canglangpu Formation of the Cambrian Terreneuvian in northern part of the central Sichuan Basin and its influence on porosity

Author: Bing Zou
Haizhou Qu
Hongyi An
Lianjin Zhang
Qianwen Mo
Qinyang Huang
Rongrong Zhao
Xingyu Zhang
Yu Pei
Yu Zhang
Zike Ma
Publication venue: Frontiers Media SA
Publication date: 01/01/2023

In this paper, taking the first member of the Canglangpu Formation of the Cambrian Terreneuvian in the northern central Sichuan Basin as an example, the diagenesis and its influence on porosity are systematically studied based on the observation and identification of cores, casts and cathodoluminescence thin sections. The results show that the rock types of the first member of the Canglangpu Formation are varied, including mixed rocks, carbonate rocks and clastic rocks. The specific lithology is dominated by sand-bearing oolitic dolomite, sandy oolitic dolomite, sparry oolitic dolomite and fine-grained detrital sandstone. The Cang 1 Member has experienced five types of diagenetic environments: seawater, meteoric water, evaporative seawater, shallow burial, and medium-deep burial. The main diagenetic processes under these environments include cementation, dissolution, compaction, chemical compaction, dolomitization and structural fracturing. According to the analysis, fabric-selective dissolution in the meteoric water diagenetic environment, dolomitization in the evaporative seawater environment, and non-fabric-selective dissolution, dolomitization and structural fractures in the burial diagenetic environment are beneficial to the development of pores. However, cementation, compaction and chemical compaction in medium and deep burial environments are unfavorable for the development of pores.

Directory of Open Access Journals