120 research outputs found

How to Promote Learning Behavior? Exploring the Role of Gamification in Mobile Learning Apps from the Perspective of Affordance

Author: Yin Juan
Zhou Ziyue
Publication venue: AIS Electronic Library (AISeL)
Publication date: 26/07/2022

With the rapid popularization of mobile networks, various gamification elements have been widely used in many mobile learning apps. However, there is no complete model to explain the actual effect of these gamification elements on promoting user learning behavior. Drawing on affordance theory combined with the elaboration likelihood model, this paper analyzes the effects of two types of gamification affordance elements on user learning behavior in mobile learning apps. Data were collected from 23641 Shanbay Word users and analyzed. The results show that achievement visualization affordance and competition affordance affect users' learning behavior negatively; user learning intention plays a mediating role between achievement visualization affordance, competition affordance, and user learning behavior; and team learning climate plays a moderating role between user learning intention, achievement visualization affordance, and competition affordance.

AIS Electronic Library (AISeL)

When Online Auction Meets Virtual Reality: An Empirical Investigation

Author: Guo Xingyao
Jiang Ziyue
Yan Zhenbin
Zhang Yi
Zhou Zhongyun
Publication venue: AIS Electronic Library (AISeL)
Publication date: 25/01/2024

The online auction, in which a product is sold to the buyer with the highest bid, is becoming increasingly popular in e-commerce. However, the lack of authentic product details for a thorough evaluation still poses challenges to its success. Recently, virtual reality (VR) has been introduced to online auctions. We employ a unique dataset to investigate the effects of VR on auction outcomes and bidding activities. Results show that VR enhances buyers’ bidding competition, which in turn increases auction success and price, resulting in a competitive effect. Additionally, we find that VR boosts buyers’ strategic responses to the bidding war, leading to a late-bidding effect. The findings contribute to both the theory and practice of VR and online auctions in selling houses.

AIS Electronic Library (AISeL)

Semi-supervised Domain Adaptation in Graph Transfer Learning

Author: Dong Hao
Luo Xiao
Qiao Ziyue
Xiao Meng
Xiong Hui
Zhou Yuanchun
Publication date: 19/09/2023

As a specific case of graph transfer learning, unsupervised domain adaptation on graphs aims for knowledge transfer from label-rich source graphs to unlabeled target graphs. However, graphs with topology and attributes usually have considerable cross-domain disparity, and there are numerous real-world scenarios where merely a subset of nodes is labeled in the source graph. This imposes critical challenges on graph transfer learning due to serious domain shifts and label scarcity. To address these challenges, we propose a method named Semi-supervised Graph Domain Adaptation (SGDA). To deal with the domain shift, we add adaptive shift parameters to each of the source nodes, which are trained in an adversarial manner to align the cross-domain distributions of node embeddings, so that the node classifier trained on labeled source nodes can be transferred to the target nodes. Moreover, to address the label scarcity, we propose pseudo-labeling on unlabeled nodes, which improves classification on the target graph by measuring the posterior influence of nodes based on their relative position to the class centroids. Finally, extensive experiments on a range of publicly accessible datasets validate the effectiveness of our proposed SGDA in different experimental settings.
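
The idea of pseudo-labeling nodes by their position relative to class centroids can be illustrated with a minimal sketch. This is a generic nearest-centroid scheme and not the SGDA implementation: the function names `class_centroids` and `pseudo_label`, the `temperature` parameter, and the softmax-over-distances confidence are all assumptions made for illustration.

```python
import math


def class_centroids(embeddings, labels):
    """Return the mean embedding of each class over the labeled nodes."""
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def pseudo_label(embedding, centroids, temperature=1.0):
    """Assign the nearest centroid's class and a confidence in (0, 1].

    Confidence is a softmax over negative Euclidean distances
    (an illustrative choice, not taken from the paper).
    """
    labs = sorted(centroids)
    dists = [math.dist(embedding, centroids[lab]) for lab in labs]
    nearest = min(dists)  # subtracted below for numerical stability
    weights = [math.exp(-(d - nearest) / temperature) for d in dists]
    best = max(range(len(labs)), key=lambda i: weights[i])
    return labs[best], weights[best] / sum(weights)


# Hypothetical 2-D node embeddings with two labeled classes.
centroids = class_centroids([[0, 0], [0, 2], [4, 4], [6, 4]], [0, 0, 1, 1])
label, confidence = pseudo_label([0.5, 1.0], centroids)
```

In a training loop, the confidence could be used to weight or filter pseudo-labeled nodes, so that nodes far from every centroid contribute less.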

arXiv.org e-Print Archive

A Systematic Evaluation of Federated Learning on Biomedical Natural Language Processing

Author: Chen Jiandong
Peng Le
Sun Ju
Xu Ziyue
Zhang Rui
Zhou Sicheng
Publication date: 20/07/2023

Language models (LMs) like BERT and GPT have revolutionized natural language processing (NLP). However, privacy-sensitive domains, particularly the medical field, face challenges in training LMs due to limited data access and privacy constraints imposed by regulations like the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR). Federated learning (FL) offers a decentralized solution that enables collaborative learning while ensuring the preservation of data privacy. In this study, we systematically evaluate FL in medicine across 2 biomedical NLP tasks using 6 LMs encompassing 8 corpora. Our results showed that: 1) FL models consistently outperform LMs trained on individual clients' data and sometimes match the model trained with pooled data; 2) with a fixed total amount of data, LMs trained using FL with more clients exhibit inferior performance, but pre-trained transformer-based models exhibit greater resilience; 3) LMs trained using FL perform nearly on par with the model trained with pooled data when clients' data are IID, while exhibiting visible gaps with non-IID data. Our code is available at: https://github.com/PL97/FedNLP Comment: Accepted by KDD 2023 Workshop FL4Data-Minin
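
As a rough illustration of how federated learning combines client models without pooling their data, the sketch below shows a FedAvg-style aggregation step. The abstract does not name the aggregation algorithm, so FedAvg is an assumption here, and `federated_average` is a hypothetical helper rather than code from the FedNLP repository.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training examples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one non-empty weight vector and one size per client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += w * size / total
    return averaged


# Two hypothetical clients: the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the parameter vectors leave each client in this scheme; the raw training examples stay local, which is the privacy property the abstract refers to.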

arXiv.org e-Print Archive

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models

Author: Huang Rongjie
Jiang Ziyue
Ren Yi
Yang Qian
Ye Zhenhui
Zhao Zhou
Zuo Jialong
Publication date: 22/05/2023

Stutter removal is an essential scenario in the field of speech editing. However, when the speech recording contains stutters, the existing text-based speech editing approaches still suffer from: 1) the over-smoothing problem in the edited speech; 2) a lack of robustness due to the noise introduced by stutters; 3) the need for users to determine the edited region manually in order to remove the stutters. To tackle the challenges in stutter removal, we propose FluentSpeech, a stutter-oriented automatic speech editing model. Specifically, 1) we propose a context-aware diffusion model that iteratively refines the modified mel-spectrogram with the guidance of context features; 2) we introduce a stutter predictor module to inject the stutter information into the hidden sequence; 3) we also propose a stutter-oriented automatic speech editing (SASE) dataset that contains spontaneous speech recordings with time-aligned stutter labels to train the automatic stutter localization model. Experimental results on the VCTK and LibriTTS datasets demonstrate that our model achieves state-of-the-art performance on speech editing. Further experiments on our SASE dataset show that FluentSpeech can effectively improve the fluency of stuttering speech in terms of objective and subjective metrics. Code and audio samples can be found at https://github.com/Zain-Jiang/Speech-Editing-Toolkit. Comment: Accepted by ACL 2023 (Findings

arXiv.org e-Print Archive

Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech

Author: Jiang Ziyue
Liu Jinglin
Ren Yi
Yang Qian
Ye Zhenhui
Zhao Zhou
Zhe Su
Publication date: 09/10/2022

Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional effort from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). Specifically, we design a semantics-to-pronunciation attention (S2PA) module to match the semantic patterns between the input text sequence and the prior semantics in the dictionary and obtain the corresponding pronunciations; the S2PA module can be easily trained with the end-to-end TTS model without any annotated phoneme labels. Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy and improves the prosody modeling of TTS systems. Further extensive analyses demonstrate that each design in Dict-TTS is effective. The code is available at https://github.com/Zain-Jiang/Dict-TTS. Comment: Accepted by NeurIPS 202

arXiv.org e-Print Archive

Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation

Author: Du Yi
Fu Yanjie
Ning Zhiyuan
Qiao Ziyue
Wu Min
Xiao Meng
Zhou Yuanchun
Publication date: 04/09/2023

Topic inference in research proposals aims to obtain the most suitable disciplinary division from the discipline system defined by a funding agency. The agency will subsequently find appropriate peer review experts from its database based on this division. Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency. Existing methods focus on modeling this as a hierarchical multi-label classification problem, using generative models to iteratively infer the most appropriate topic information. However, these methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon where the automated inference system categorizes interdisciplinary proposals as non-interdisciplinary, causing unfairness during expert assignment. How can we address this data imbalance issue under a complex discipline system and hence resolve this unfairness? In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture. Furthermore, we utilize interpolation techniques to create a series of pseudo-interdisciplinary proposals from non-interdisciplinary ones during training, based on non-parametric indicators such as cross-topic probabilities and topic occurrence probabilities. This approach aims to reduce the bias of the system during model training. Finally, we conduct extensive experiments on a real-world dataset to verify the effectiveness of the proposed method. The experimental results demonstrate that our training strategy can significantly mitigate the unfairness generated in the topic inference task. Comment: 19 pages, Under review. arXiv admin note: text overlap with arXiv:2209.1391

arXiv.org e-Print Archive

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

Author: Jiang Ziyue
Liu Jinglin
Ma Zejun
Ren Yi
Ye Zhenhui
Yin Xiang
Zhang Chen
Zhao Zhou
Publication date: 06/06/2023

We are interested in a novel task, namely low-resource text-to-talking avatar. Given only a few-minute-long video of a talking person with the audio track as the training data and arbitrary texts as the driving input, we aim to synthesize high-quality talking portrait videos corresponding to the input text. This task has broad application prospects in the digital human industry but has not been technically achieved yet due to two challenges: (1) it is challenging to mimic the timbre from out-of-domain audio for a traditional multi-speaker text-to-speech system; (2) it is hard to render high-fidelity and lip-synchronized talking avatars with limited training data. In this paper, we introduce Adaptive Text-to-Talking Avatar (Ada-TTA), which (1) designs a generic zero-shot multi-speaker TTS model that disentangles the text content, timbre, and prosody well; and (2) embraces recent advances in neural rendering to achieve realistic audio-driven talking face video generation. With these designs, our method overcomes the aforementioned two challenges and generates identity-preserving speech and realistic talking person video. Experiments demonstrate that our method can synthesize realistic, identity-preserving, and audio-visually synchronized talking avatar videos. Comment: 6 pages, 3 figure

arXiv.org e-Print Archive