221 research outputs found

DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

Author: Li Zhifeng
Su Guinan
Yang Yanwu
Publication date: 12/11/2023

In recent years, audio-driven 3D facial animation has gained significant attention, particularly in applications such as virtual reality, gaming, and video conferencing. However, accurately modeling the intricate and subtle dynamics of facial expressions remains a challenge. Most existing studies treat facial animation as a single regression problem, which often fails to capture the intrinsic inter-modal relationship between speech signals and 3D facial animation and overlooks their inherent consistency. Moreover, because 3D audio-visual datasets are scarce, approaches trained on small samples generalize poorly, which degrades performance. To address these issues, we propose a cross-modal dual-learning framework, termed DualTalker, that aims to improve data usage efficiency and to relate cross-modal dependencies. The framework is trained jointly on the primary task (audio-driven facial animation) and its dual task (lip reading), and the two tasks share common audio/motion encoder components. Joint training makes more efficient use of the data by leveraging information from both tasks and by explicitly capitalizing on the complementary relationship between facial motion and audio. Furthermore, we introduce an auxiliary cross-modal consistency loss to mitigate the potential over-smoothing underlying the cross-modal complementary representations, enhancing the mapping of subtle facial expression dynamics. Through extensive experiments and a perceptual user study conducted on the VOCA and BIWI datasets, we demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. We have made our code and video demonstrations available at https://github.com/sabrina-su/iadf.git

arXiv.org e-Print Archive

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

Author: Deng Jiajun
Geng Mengzhe
Hu Shujie
Jin Zengrui
Li Guinan
Liu Xunying
Wang Tianzi
Xie Xurong
Publication date: 03/11/2022

Automatic recognition of disordered speech remains a highly challenging task to date. The underlying neuro-motor conditions, often compounded with co-occurring physical disabilities, make it difficult to collect the large quantities of impaired speech required for ASR system development. This paper presents novel variational auto-encoder generative adversarial network (VAE-GAN) based personalized disordered speech augmentation approaches that simultaneously learn to encode, generate and discriminate synthesized impaired speech. Separate latent features are derived to learn dysarthric speech characteristics and phoneme context representations. Self-supervised pre-trained Wav2vec 2.0 embedding features are also incorporated. Experiments conducted on the UASpeech corpus suggest the proposed adversarial data augmentation approach consistently outperformed the baseline speed perturbation and non-VAE GAN augmentation methods with trained hybrid TDNN and End-to-end Conformer systems. After LHUC speaker adaptation, the best system using VAE-GAN based augmentation produced an overall WER of 27.78% on the UASpeech test set of 16 dysarthric speakers, and the lowest published WER of 57.31% on the subset of speakers with "Very Low" intelligibility. Comment: Submitted to ICASSP 202

arXiv.org e-Print Archive

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

Author: Cui Mingyu
Deng Jiajun
Hu Shujie
Jin Zengrui
Li Guinan
Liu Xunying
Wang Tianzi
Xie Xurong
Xue Boyang
Publication date: 15/02/2023

Speaker adaptation techniques provide a powerful solution to customise automatic speech recognition (ASR) systems for individual users. Practical application of unsupervised model-based speaker adaptation techniques to data-intensive end-to-end ASR systems is hindered by the scarcity of speaker-level data and performance sensitivity to transcription errors. To address these issues, a set of compact and data-efficient speaker-dependent (SD) parameter representations is used to facilitate both speaker adaptive training and test-time unsupervised speaker adaptation of state-of-the-art Conformer ASR systems. The sensitivity to supervision quality is reduced using a confidence score-based selection of the less erroneous subset of speaker-level adaptation data. Two lightweight confidence score estimation modules are proposed to produce more reliable confidence scores. The data sparsity issue, which is exacerbated by data selection, is addressed by modelling the SD parameter uncertainty using Bayesian learning. Experiments on the benchmark 300-hour Switchboard and the 233-hour AMI datasets suggest that the proposed confidence score-based adaptation schemes consistently outperformed the baseline speaker-independent (SI) Conformer model and conventional non-Bayesian, point estimate-based adaptation using no speaker data selection. Similar consistent performance improvements were retained after external Transformer and LSTM language model rescoring. In particular, on the 300-hour Switchboard corpus, statistically significant WER reductions of 1.0%, 1.3%, and 1.4% absolute (9.5%, 10.9%, and 11.3% relative) were obtained over the baseline SI Conformer on the NIST Hub5'00, RT02, and RT03 evaluation sets respectively. Similar WER reductions of 2.7% and 3.3% absolute (8.9% and 10.2% relative) were also obtained on the AMI development and evaluation sets. Comment: IEEE/ACM Transactions on Audio, Speech, and Language Processing
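The abstract above reports each word error rate (WER) reduction twice, once in absolute points and once relative to the baseline. As a reading aid, the sketch below shows how the two figures relate. The function names are hypothetical and not taken from the paper, and the baseline values it derives are only rough back-calculations from the rounded numbers in the abstract, not results reported by the authors.

```python
def wer(substitutions, deletions, insertions, reference_words):
    """Word error rate in percent: (S + D + I) / N * 100."""
    return 100.0 * (substitutions + deletions + insertions) / reference_words


def absolute_reduction(baseline_wer, new_wer):
    """Difference in WER, in percentage points."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer, new_wer):
    """Reduction expressed as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def implied_baseline(absolute_points, relative_percent):
    """Baseline WER implied by a paired absolute/relative reduction.

    Assumption: both inputs are rounded figures, so the result is approximate.
    """
    return 100.0 * absolute_points / relative_percent


if __name__ == "__main__":
    # (absolute, relative) pairs quoted in the abstract for Switchboard.
    reported = {"Hub5'00": (1.0, 9.5), "RT02": (1.3, 10.9), "RT03": (1.4, 11.3)}
    for test_set, (abs_pts, rel_pct) in reported.items():
        base = implied_baseline(abs_pts, rel_pct)
        print(f"{test_set}: implied baseline WER of about {base:.1f}%")
```

For instance, a 1.0-point absolute reduction that is also a 9.5% relative reduction implies a baseline WER of roughly 10.5%.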

arXiv.org e-Print Archive

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Author: Cui Mingyu
Deng Jiajun
Geng Mengzhe
Hu Shujie
Jin Zengrui
Li Guinan
Liu Xunying
Meng Helen
Wang Tianzi
Publication date: 06/07/2023

Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date. Motivated by the invariance of the visual modality to acoustic signal corruption, this paper proposes an audio-visual multi-channel speech separation, dereverberation and recognition approach that fully incorporates visual information into all system components. The efficacy of the video input is consistently demonstrated in the mask-based MVDR speech separation front-end, the DNN-WPE or spectral mapping (SpecM) based speech dereverberation front-end, and the Conformer ASR back-end. Audio-visual integrated front-end architectures performing speech separation and dereverberation in a pipelined or joint fashion via mask-based WPD are investigated. The error cost mismatch between the speech enhancement front-end and ASR back-end components is minimized by end-to-end joint fine-tuning using either the ASR cost function alone, or its interpolation with the speech enhancement loss. Experiments were conducted on mixed, overlapped and reverberant speech data constructed using simulation or replay of the Oxford LRS2 dataset. The proposed audio-visual multi-channel speech separation, dereverberation and recognition systems consistently outperformed the comparable audio-only baseline by 9.1% and 6.2% absolute (41.7% and 36.0% relative) word error rate (WER) reductions. Consistent speech enhancement improvements were also obtained on PESQ, STOI and SRMR scores. Comment: IEEE/ACM Transactions on Audio, Speech, and Language Processing

arXiv.org e-Print Archive

Anti-tumour therapeutic efficacy of OX40L in murine tumour model

Author: Akiba
Ali
Bretscher
Carreno
Claire Entwisle
Cornelia S. McLean
Croft
Esther Choolun
Geng Li
Gramaglia
Gruss
Guinan
Hellstrom
Imura
June Lynam
Kashii
Kjaergaard
Maxwell
Morris
Murrium Ahmad
Murtaza
Oh
Pan
Peter Loudon
Petty
Robert C. Rees
Rogers
Selman A. Ali
Shahid Mian
Smith
Stephanie E.B. McArdle
Stuber
Todryk
Vetto
Wang
Weinberg
Whittle
Publication venue: Elsevier BV
Publication date: 22/04/2004

OX40 ligand (OX40L), a member of the TNF superfamily, is a co-stimulatory molecule involved in T cell activation. Systemic administration of mOX40L fusion protein significantly inhibited the growth of experimental lung metastasis and subcutaneous (s.c.) established colon (CT26) and breast (4T1) carcinomas. Vaccination with OX40L was significantly enhanced by combination treatment with intra-tumour injection of a disabled infectious single cycle-herpes simplex virus (DISC-HSV) vector encoding murine granulocyte macrophage-colony stimulating factor (mGM-CSF). Tumour rejection in response to OX40L therapy required functional CD4+ and CD8+ T cells and correlated with splenocyte cytotoxic T lymphocyte (CTL) activity against the AH-1 gp70 peptide of the tumour-associated antigen expressed by CT26 cells. These results demonstrate the potential role of OX40L in cancer immunotherapy.

Crossref

Nottingham Trent Institutional Repository (IRep)

Intense exercise for survival among men with metastatic castrate-resistant prostate cancer (INTERVAL-GAP4): A multicentre, randomized, controlled phase III study protocol

Author: Buzza Mark
Casey Orla
Catto James
Chan June M
Courneya Kerry S
Finn Stephen P
Galvao Daniel A
Gledhill Sam
Greenwood Rosemary
Guinan Emer M
Hart Nicolas H
Hughes Daniel C
Kenfield Stacey A
Mucci Lorelei
Newton Robert U
Plaet Stephan F.E
Plymate Stephen R
Ryan Charles J
Saad Fred
Van Blarigan Erin L
Zhang Li
Publication venue: ResearchOnline@ND
Publication date: 01/01/2018

Introduction: Preliminary evidence supports the beneficial role of physical activity on prostate cancer outcomes. This phase III randomised controlled trial (RCT) is designed to determine if supervised high-intensity aerobic and resistance exercise increases overall survival (OS) in patients with metastatic castrate-resistant prostate cancer (mCRPC). Methods and analysis: Participants (n=866) must have histologically documented metastatic prostate cancer with evidence of progressive disease on androgen deprivation therapy (defined as mCRPC). Patients can be treatment-naive for mCRPC or on first-line androgen receptor-targeted therapy for mCRPC (ie, abiraterone or enzalutamide) without evidence of progression at enrolment, and with no prior chemotherapy for mCRPC. Patients will receive psychosocial support and will be randomly assigned (1:1) to either supervised exercise (high-intensity aerobic and resistance training) or self-directed exercise (provision of guidelines), stratified by treatment status and site. Exercise prescriptions will be tailored to each participant’s fitness and morbidities. The primary endpoint is OS. Secondary endpoints include time to disease progression, occurrence of a skeletal-related event or progression of pain, and degree of pain, opiate use, physical and emotional quality of life, and changes in metabolic biomarkers. An assessment of whether immune function, inflammation, dysregulation of insulin and energy metabolism, and androgen biomarkers are associated with OS will be performed, and whether they mediate the primary association between exercise and OS will also be investigated. This study will also establish a biobank for future biomarker discovery or validation. Ethics and dissemination: Validation of exercise as medicine and its mechanisms of action will create evidence to change clinical practice. Accordingly, outcomes of this RCT will be published in international, peer-reviewed journals, and presented at national and international conferences. Ethics approval was first obtained at Edith Cowan University (ID: 13236 NEWTON), with a further 10 investigator sites since receiving ethics approval, prior to activation. Trial registration number: NCT02730338

Crossref

Harvard University - DASH

ResearchOnline@ND

University of Canberra Research Repository

Queensland University of Technology ePrints Archive

eScholarship - University of California

Research Online @ ECU

University of Queensland eSpace

Stellar Coronal and Wind Models: Impact on Exoplanets

Author: A Maggio
A Morgenthaler
A Reiners
A Strugarek
A Telleschi
A. Skumanich
AA Pevtsov
AA Vidotto
AA West
AF Lanza
B Holst van der
B Zieger
BA Nicholson
BE Wood
BJ Wargelin
BO Demory
C Catala
C Garraffo
C Johnstone
C. P. Johnstone
CM Johns-Krull
CP Folsom
CP Johnstone
D Arzoumanian
D Bisikalo
D Buzasi
D Falceta-Gonçalves
D Lai
E Shkolnik
EF Guinan
EJ Gaidos
F Bagenal
G Anglada-Escudé
G Scandariato
GAJ Hussain
GHJ Oord van den
GW Pneuman
H Lammer
H Washimi
I Baraffe
I Pillitteri
I Ribas
I. Ribas
IA Waite
Ignasi Ribas
IV Sokolov
J Donati
J Llama
J Morin
J Sanz-Forcada
J Zendejas
Jackie Villadsen
JD Alvarado-Gómez
Jeremy Lim
JF Donati
JH Debes
JI Zuluaga
JJG Lima
JM Grießmeier
JV Hollweg
K Tsinganos
KG Kislyakova
L Hartmann
L. Mestel
M Guedel
M Güdel
M Jardine
MG Sterenborg
ML Khodachenko
N Pizzolato
NJ Wright
O Cohen
O. Cohen
P Lang
P Petit
P Testa
R Fares
R Keppens
R Pallavicini
RD Jeffries
Robert P. Kraft
Rui F. Pinto
RVE Lovelace
S Boro Saikia
S Matt
S. V. Jeffers
SC Marsden
SG Parsons
SH Yang
SK Solanki
SL Li
SP Matt
SR Cranmer
T Matsakos
T Matsumoto
Takuma Matsumoto
TE Holzer
Theresa Lüftinger
TK Suzuki
V Jatenco-Pereira
V Réville
V See
V. Bourrier
WH Ip
Publication venue: Springer Science and Business Media LLC
Publication date: 22/09/2017

Surface magnetism is believed to be the main driver of coronal heating and stellar wind acceleration. Coronae are believed to be formed by plasma confined in closed magnetic coronal loops of the stars, with winds mainly originating in open magnetic field line regions. In this Chapter, we review some basic properties of stellar coronae and winds and present some existing models. In the last part of this Chapter, we discuss the effects of coronal winds on exoplanets. Comment: Chapter published in the "Handbook of Exoplanets", Editors in Chief: Juan Antonio Belmonte and Hans Deeg, Section Editor: Nuccio Lanza. Springer Reference Work.

arXiv.org e-Print Archive

Crossref

Methods for conducting international Delphi surveys to optimise global participation in core outcome set development: a case study in gastric cancer informed by a comprehensive literature review

Author: Adeyeye Ademola
Alkhaffaf Bilal
Allum William
Baiocchi Gian Luca
Baki Bahadır Emre
Beuscart Jean-Baptiste
Blazeby Jane M.
Bodur Muhammed Selim
Bruce Iain A.
Cabañas Gabriel Salcedo
Campos Cristina Marin
Candas Bahar
Cekic Arif Burak
Chaudry M. Asif
Costa Paulo M.
de Manzoni Giovanni
del Val Ismael Diez
Gisbertz Suzanne S.
Glenny Anne-Marie
Gonzalez Maria Posada
Griffiths Ewen
Guinan Emer
Guner Ali
Hagens Eliza R. C.
He Yu-long
Horbach Sophie
Lages Patrícia
Law Simon
Lee Hyuk-Joon
Li Guoxin
Li Shuangxi
Li Ziyu
Liang Han
Mecoli Christopher
Metryka Aleksandra
Nakada Koji
Neumann Philipp
Nuñez Rafael Mauricio Restrepo
Onofre Susana
O’Neill Linda
Reim Daniel
Reynolds John V.
Smith Toby O.
van Berge Henegouwen Mark I.
Vorwald Peter
Williamson Paula R.
Xu Zekuan
Xue Yingwei
Yildirim Reyyan
Zanotti Daniela
Zhao Enhao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Copyright © 2021, The Author(s). Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Background: Core outcome sets (COS) should be relevant to key stakeholders and widely applicable and usable. Ideally, they are developed for international use to allow optimal data synthesis from trials. Electronic Delphi surveys are commonly used to facilitate global participation; however, this has limitations. It is common for these surveys to be conducted in a single language, potentially excluding those not fluent in that language. The aim of this study is to summarise current approaches for optimising international participation in Delphi studies and make recommendations for future practice. Methods: A comprehensive literature review of current approaches to translating Delphi surveys for COS development was undertaken. A standardised methodology adapted from international guidance derived from 12 major sets of translation guidelines in the field of outcome reporting was developed. As a case study, this was applied to a COS project for surgical trials in gastric cancer to translate a Delphi survey into 7 target languages from regions active in gastric cancer research. Results: Three hundred thirty-two abstracts were screened and four studies addressing COS development in rheumatoid and osteoarthritis, vascular malformations and polypharmacy were eligible for inclusion. There was wide variation in methodological approaches to translation, including the number of forward translations, the inclusion of back translation, the employment of cognitive debriefing and how discrepancies and disagreements were handled. Important considerations were identified during the development of the gastric cancer survey, including establishing translation groups, timelines, understanding financial implications, strategies to maximise recruitment and regulatory approvals. The methodological approach to translating the Delphi surveys was easily reproducible by local collaborators and resulted in an additional 637 participants to the 315 recruited to complete the source language survey. Ninety-nine per cent of patients and 97% of healthcare professionals from non-English-speaking regions used translated surveys. Conclusion: Consideration of the issues described will improve planning by other COS developers and can be used to widen international participation from both patients and healthcare professionals.

This study is funded by the National Institute for Health Research (NIHR) Doctoral Research Fellowship Grant (DRF-2015-08-023). JMB is partially funded by the NIHR Bristol Biomedical Research Centre and the MRC ConDUCT-II Hub for Trials Methodology Research. PRW was funded by the MRC North West Hub for Trials Methodology Research (Grant ref: MR/K025635/01).

Universidade de Lisboa: Repositório.UL

The University of Manchester - Institutional Repository