17 research outputs found

An improved StarGAN for emotional voice conversion: enhancing voice quality and data augmentation

Author: Chen Junjie
He Xiangheng
Rizos Georgios
Schuller Björn W.
Publication venue: 'International Speech Communication Association'
Publication date: 18/07/2021

Emotional Voice Conversion (EVC) aims to convert the emotional style of a source speech signal to a target style while preserving its content and speaker identity information. Previous emotional conversion studies do not disentangle emotional information from emotion-independent information that should be preserved, thus transforming it all in a monolithic manner and generating audio of low quality, with linguistic distortions. To address this distortion problem, we propose a novel StarGAN framework along with a two-stage training process that separates emotional features from those independent of emotion by using an autoencoder with two encoders as the generator of the Generative Adversarial Network (GAN). The proposed model achieves favourable results in both the objective evaluation and the subjective evaluation in terms of distortion, which reveals that the proposed model can effectively reduce distortion. Furthermore, in data augmentation experiments for end-to-end speech emotion recognition, the proposed StarGAN model achieves an increase of 2% in Micro-F1 and 5% in Macro-F1 compared to the baseline StarGAN model, which indicates that the proposed model is more valuable for data augmentation. Comment: Accepted by Interspeech 202
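The abstract above reports gains in both Micro-F1 and Macro-F1. The two metrics weight classes differently: Micro-F1 pools counts over all samples, while Macro-F1 averages the per-class F1 scores. A minimal sketch of the difference follows. It is illustrative only and is not taken from the paper; the function name `micro_macro_f1` and the toy labels are assumptions, and the input is assumed to be non-empty, single-label, multi-class data.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions.

    Hypothetical helper for illustration; assumes non-empty inputs.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro
```

Because Macro-F1 gives every class equal weight, an improvement concentrated on rare emotion classes would tend to move Macro-F1 more than Micro-F1.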

arXiv.org e-Print Archive

OPUS Augsburg

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era

Author: André Elisabeth
Fu Ruibo
He Xiangheng
İymen Gökçe
Liu Shuo
Mertes Silvan
Schuller Björn W.
Sezgin Metin
Tao Jianhua
Triantafyllopoulos Andreas
Tzirakis Panagiotis
Yang Zijiang
Publication date: 06/10/2022

Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human-computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. But the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions -- aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesised utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology which underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In the present overview, we outline ongoing trends and summarise state-of-the-art approaches in an attempt to provide a comprehensive overview of this exciting field. Comment: Submitted to the Proceedings of IEE

arXiv.org e-Print Archive

OPUS Augsburg

Personalised depression forecasting using mobile sensor data and ecological momentary assessment

Author: Ebert David D.
Gerczuk Maurice
Grossmann Inga
Harrer Mathias
He Xiangheng
Heber Elena
Kathan Alexander
Küster Ludwig
Milling Manuel
Rajamani Srividya Tirunellai
Schuller Björn W.
Triantafyllopoulos Andreas
Yan Tianhao
Publication venue: 'Frontiers Media SA'
Publication date: 18/11/2022

Introduction: Digital health interventions are an effective way to treat depression, but it is still largely unclear how patients’ individual symptoms evolve dynamically during such treatments. Data-driven forecasts of depressive symptoms would greatly improve the personalisation of treatments. In current forecasting approaches, models are often trained on an entire population, resulting in a general model that works overall but does not translate well to each individual in clinically heterogeneous, real-world populations. Model fairness across patient subgroups is also frequently overlooked. Personalised models tailored to the individual patient may therefore be promising. Methods: We investigate different personalisation strategies using transfer learning, subgroup models, and subject-dependent standardisation on a newly collected, longitudinal dataset of depression patients undergoing treatment with a digital intervention (N=65 patients recruited). Both passive mobile sensor data and ecological momentary assessments were available for modelling. We evaluated the models’ ability to predict symptoms of depression (Patient Health Questionnaire-2; PHQ-2) at the end of each day, and to forecast symptoms of the next day. Results: In our experiments, we achieve a best mean absolute error (MAE) of 0.801 (25% improvement) for predicting PHQ-2 values at the end of the day with subject-dependent standardisation, compared to a non-personalised baseline (MAE=1.062). For one-day-ahead forecasting, we improve the baseline of 1.539 by 12% to an MAE of 1.349 using a transfer learning approach with shared common layers. In addition, personalisation leads to fairer models at group level. Discussion: Our results suggest that personalisation using subject-dependent standardisation and transfer learning can improve predictions and forecasts, respectively, of depressive symptoms in participants of a digital depression intervention. We discuss technical and clinical limitations of this approach, avenues for future investigations, and how personalised machine learning architectures may be implemented to improve existing digital interventions for depression.
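The abstract above names subject-dependent standardisation and reports MAE values with relative improvements. The sketch below shows one common reading of these terms: z-scoring each subject's feature values with that subject's own statistics, computing MAE, and checking the quoted percentages. It is a minimal illustration under stated assumptions and not the authors' pipeline; all function names and the toy inputs are hypothetical.

```python
from statistics import mean, pstdev


def standardise_per_subject(values_by_subject):
    """Z-score each subject's values with that subject's own mean and std.

    Hypothetical helper; assumes every subject has at least one value.
    """
    out = {}
    for subject, values in values_by_subject.items():
        mu = mean(values)
        sigma = pstdev(values) or 1.0  # guard against constant features
        out[subject] = [(v - mu) / sigma for v in values]
    return out


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline
```

With the figures quoted in the abstract, `relative_improvement(1.062, 0.801)` is about 0.246 and `relative_improvement(1.539, 1.349)` is about 0.123, which matches the reported 25% and 12% after rounding.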

OPUS Augsburg

PubMed Central

FigShare

A robust welding seam identification method

Author: Ge Shuzhi S
He Xiangheng
Khyam Mohammad O
Li Xinde
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Khyam, MO ORCiD: 0000-0002-1988-2328
As an automatic welding process may experience disturbances caused by, e.g., splashes and/or welding fumes, misalignments/poor positioning, thermally induced deformations, strong arc lights, and diversified welding joints/grooves, precisely identifying the welding seam has a great influence on the welding quality achieved. In this paper, a robust method for identifying this seam is proposed. Firstly, after a welding image obtained from a structured-light vision sensor is filtered, the extended Kalman filter (EKF) is used to search for its laser stripe within a sufficiently small area in order to prevent possible disturbances. Secondly, to extract the profile of the welding seam, the least squares method is used to fit a sequence of centroids determined by scanning the columns displayed in the tracking window. Thirdly, this profile is qualitatively described and matched using a proposed character string method. Finally, the advantages of this method are compared with those of other approaches through repeated experiments.
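The centroid-fitting step described in the abstract above can be sketched as follows: scan each image column, take the intensity-weighted centroid of the laser stripe, and fit the resulting points by least squares. This is a rough illustration and not the paper's implementation. The straight-line model, the plain nested-list image, and the function names `column_centroids` and `fit_line` are all assumptions, and the abstract does not specify what curve is fitted.

```python
def column_centroids(image):
    """Intensity-weighted row centroid for every column with nonzero intensity.

    `image` is a hypothetical list of rows holding non-negative intensities.
    """
    n_rows, n_cols = len(image), len(image[0])
    points = []
    for c in range(n_cols):
        total = sum(image[r][c] for r in range(n_rows))
        if total > 0:
            centroid = sum(r * image[r][c] for r in range(n_rows)) / total
            points.append((c, centroid))
    return points


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumes at least two points with distinct x values.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept
```

In a real tracking window the fit would probably be applied piecewise, so that the break points of the seam profile are preserved rather than averaged away.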

Crossref

aCQUIRe

ACQUIRE

A welding seam identification method based on cross-modal perception

Author: Ge Shuzhi S
He Xiangheng
Khyam Mohammad O
Li Pei
Li Xinde
Publication venue: 'Emerald'
Publication date: 01/01/2019

Khyam, MO ORCiD: 0000-0002-1988-2328
Purpose: As an automatic welding process may experience disturbances caused by, for example, splashes and/or welding fumes, misalignments/poor positioning, thermally induced deformations, strong arc lights and diversified welding joints/grooves, precisely identifying the welding seam has a great influence on the welding quality. This paper aims to propose a robust method for identifying this seam based on cross-modal perception. Design/methodology/approach: First, after a welding image obtained from a structured-light vision sensor (here laser and vision are integrated into a cross-modal perception sensor) is filtered, the extended Kalman filter is used to search for its laser stripe within a sufficiently small area in order to prevent possible disturbances. Second, to extract the profile of the welding seam, the least squares method is used to fit a sequence of centroids determined by scanning the columns displayed in the tracking window. Third, this profile is qualitatively described and matched using a proposed character string method. Findings: The method is demonstrated to run in real time and to be clearly superior in terms of accuracy and robustness, although its real-time performance is not the best. Originality/value: This paper proposes a robust method for automatically identifying and tracking a welding seam.

aCQUIRe

ACQUIRE

A welding seam identification method based on cross-modal perception

Author: Mohammad Omar Khyam
Pei Li
Shuzhi Sam Ge
Xiangheng He
Xinde Li
Publication venue: 'Emerald'

Crossref

Catalytic Application and Mechanism Studies of Argentic Chloride Coupled Ag/Au Hollow Heterostructures: Considering the Interface Between Ag/Au Bimetals

Author: Changzhong Jiang
Jun Liu
Qingyong Tian
Quanguo He
Wei Wu
Xiangheng Xiao
Zhaohui Wu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

For an economical use of solar energy, photocatalysts that are sufficiently efficient, stable, and capable of harvesting light are required. Composite heterostructures composed of noble metals and semiconductors exhibit excellent catalytic performance. Here, 1D Ag/Au/AgCl hollow heterostructures are synthesized by a galvanic replacement reaction (GRR) from Ag nanowires (NWs). The catalytic properties of the as-obtained Ag/Au/AgCl hollow heterostructures with different ratios are investigated by reducing 4-nitrophenol (Nip) into 4-aminophenol (Amp) in the presence of NaBH4, and the influence of the AgCl semiconductor on the catalytic performance of the Ag/Au bimetals is also investigated. These hollow heterostructures show higher catalytic activity than pure Ag NWs. The AgCl acts as a supporting material, but excess AgCl also obstructs contact between the Ag/Au bimetals and the reactive species. Moreover, the photocatalytic performance of these hollow heterostructures is evaluated by the degradation of acid orange 7 (AO7) under UV and visible light. The Ag/Au/AgCl hollow heterostructures present higher photocatalytic activity than pure Ag NWs and commercial TiO2 (P25), and the Ag/Au bimetals enhance the photocatalytic activity of the AgCl semiconductor via the localized surface plasmon resonance (LSPR) and plasmon resonance energy transfer (PRET) mechanisms. The as-synthesized multifunctional 1D Ag/Au/AgCl hollow heterostructures could be applied in practical environmental remediation through catalysis.

Directory of Open Access Journals

Co2P Nanoparticles Wrapped in Amorphous Porous Carbon as an Efficient and Stable Catalyst for Water Oxidation

Author: Changzhong Jiang
Chongyang Tang
Dong He
Haojie Wang
Jiangchao Liu
Lanli He
Xiangheng Xiao
Xianyin Song
Zunjian Ke
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Frontiers - Publisher Connector

Cu Promoted the Dynamic Evolution of Ni-Based Catalysts for Polyethylene Terephthalate Plastic Upcycling.

Author: Dao Benjamin
Elliott Gregory
Gu Jing
He Dong
Huang Jier
Kang Hongxing
Nyakuchena James
Pan Xiaoqing
Streater Daniel
Williams Nicholas
Xiao Xiangheng
Yan Xingxu
Publication venue: eScholarship, University of California
Publication date: 05/04/2024

Upcycling plastic wastes into value-added chemicals is a promising approach to put end-of-life plastic wastes back into their ecocycle. As one of the polyesters that is used daily, polyethylene terephthalate (PET) plastic waste is employed here as the model substrate. Herein, a nickel (Ni)-based catalyst was prepared by electrochemically depositing copper (Cu) species on Ni foam (NiCu/NF). The NiCu/NF formed Cu/CuO and Ni/NiO/Ni(OH)2 core-shell structures before electrolysis and reconstructed into NiOOH and CuOOH/Cu(OH)2 active species during the ethylene glycol (EG) oxidation. After oxidation, the Cu and Ni species evolved into more reduced species. An indirect mechanism was identified as the main EG oxidation reaction (EGOR) mechanism. In EGOR, the NiCu60s/NF catalyst exhibited an optimal Faradaic efficiency (FE, 95.8%) and yield rate (0.70 mmol cm⁻² h⁻¹) for formate production. Also, over 80% FE of formate was achieved when a commercial PET plastic powder hydrolysate was applied. Furthermore, commercial PET plastic water bottle waste was employed as a substrate for electrocatalytic upcycling, and pure terephthalic acid (TPA) was recovered after only 1 h of electrolysis. Lastly, density functional theory (DFT) calculations revealed that the key role of Cu was significantly reducing the Gibbs free-energy barrier (ΔG) of the EGOR's rate-determining step (RDS), promoting the catalyst's dynamic evolution, and facilitating the C-C bond cleavage.

eScholarship - University of California

Zinc Single Atom Confinement Effects on Catalysis in 1T-Phase Molybdenum Disulfide

Author: Demetrashvili Nino
Gu Jing
He Dong
Hu Wenhui
Huang Jier
Li Zhida
Pan Xiaoqing
Trulson Gabriella
Washington Audrey
Xiao Xiangheng
Yan Xingxu
Younan Sabrina M
Publication venue: eScholarship, University of California
Publication date: 11/01/2023

Active sites are atomic sites within catalysts that drive reactions and are essential for catalysis. Spatially confining guest metals within active site microenvironments has been predicted to improve catalytic activity by altering the electronic states of active sites. Using the hydrogen evolution reaction (HER) as a model reaction, we show that intercalating zinc single atoms between layers of 1T-MoS2 (Zn SAs/1T-MoS2) enhances HER performance by decreasing the overpotential, charge transfer resistance, and kinetic barrier. The confined Zn atoms tetrahedrally coordinate to basal sulfur (S) atoms and expand the interlayer spacing of 1T-MoS2 by ∼3.4%. Under confinement, the Zn SAs donate electrons to coordinated S atoms, which lowers the free energy barrier of H* adsorption-desorption and enhances HER kinetics. In this work, which is applicable to all types of catalytic reactions and layered materials, HER performance is enhanced by controlling the coordination geometry and electronic states of transition metals confined within active-site microenvironments

PubMed Central

eScholarship - University of California