187 research outputs found
Skydiver: A Spiking Neural Network Accelerator Exploiting Spatio-Temporal Workload Balance
Spiking Neural Networks (SNNs) have been developed as a promising alternative to
Artificial Neural Networks (ANNs) due to their more realistic brain-inspired
computing models. SNNs exhibit sparse neuron firing over time, i.e.,
spatio-temporal sparsity, which makes them well suited to energy-efficient
hardware inference. However, exploiting the spatio-temporal sparsity of SNNs in
hardware leads to unpredictable and unbalanced workloads, degrading the energy
efficiency. In this work, we propose an FPGA-based convolutional SNN
accelerator called Skydiver that exploits spatio-temporal workload balance. We
propose the Approximate Proportional Relation Construction (APRC) method, which
predicts the relative workload channel-wise, and a Channel-Balanced
Workload Schedule (CBWS) method that increases the hardware workload-balance ratio
to over 90%. Skydiver was implemented on a Xilinx XC7Z045 FPGA and verified on
image segmentation and MNIST classification tasks. Results show throughput
improvements of 1.4X and 1.2X on the two tasks, respectively. Skydiver achieved
22.6 KFPS throughput and 42.4 uJ/image prediction energy on the classification
task at 98.5% accuracy.
Comment: Accepted to be published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 202
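The channel-balanced scheduling idea can be illustrated with a small sketch. This is not the paper's CBWS implementation; the greedy rule, the workload numbers, and the balance-ratio definition below are illustrative assumptions — per-channel workload estimates (such as APRC might predict) are dealt out to processing elements (PEs) so that no PE is left idle while others are overloaded.

```python
# Hypothetical sketch of channel-balanced workload scheduling in the spirit
# of Skydiver's CBWS. All names and numbers are illustrative, not from the paper.

def schedule_channels(workloads, num_pes):
    """Assign each channel (by index) to the currently least-loaded PE,
    placing heavy channels first (longest-processing-time-first)."""
    pe_loads = [0] * num_pes
    assignment = {}
    for ch in sorted(range(len(workloads)), key=lambda c: -workloads[c]):
        pe = min(range(num_pes), key=lambda p: pe_loads[p])
        assignment[ch] = pe
        pe_loads[pe] += workloads[ch]
    return assignment, pe_loads

def balance_ratio(pe_loads):
    """Mean PE load over max PE load; 1.0 means perfectly balanced."""
    return sum(pe_loads) / (len(pe_loads) * max(pe_loads))

# Example: uneven per-channel spike workloads spread across 4 PEs.
loads = [90, 10, 55, 40, 70, 20, 65, 30]
assignment, pe_loads = schedule_channels(loads, num_pes=4)
print(balance_ratio(pe_loads))  # → 0.95
```

Even this naive greedy rule pushes the balance ratio above the 90% mark the paper targets on this toy input; the real method additionally has to predict the workloads before they are known.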
DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation
While Diffusion Generative Models have achieved great success on image
generation tasks, how to efficiently and effectively incorporate them into
speech generation especially translation tasks remains a non-trivial problem.
Specifically, due to the low information density of speech data, the
transformed discrete speech unit sequence is much longer than the corresponding
text transcription, posing significant challenges to existing auto-regressive
models. Furthermore, naively applying discrete diffusion to the speech unit
sequence while disregarding the continuous space structure is suboptimal and
significantly degrades generation performance. In this paper, we
propose a novel diffusion model by applying the diffusion forward process in
the \textit{continuous} speech representation space, while employing the
diffusion backward process in the \textit{discrete} speech unit space. In this
way, we preserve the semantic structure of the continuous speech representation
space in the diffusion process and integrate the continuous and discrete
diffusion models. We conduct extensive experiments on the textless direct
speech-to-speech translation task, where the proposed method achieves
comparable results to the computationally intensive auto-regressive baselines
(500 steps on average) with significantly fewer decoding steps (50 steps).
Comment: Accepted in EMNLP 2023 main conference
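The core idea — noising in the continuous representation space while snapping each backward estimate onto the discrete unit codebook — can be sketched in a few lines. This is an illustrative toy, not the paper's model: the two-dimensional codebook, the noise schedule, and the trivial "denoiser" are all assumptions standing in for learned components.

```python
import math
import random

# Toy sketch of a hybrid diffusion step in the spirit of DiffS2UT:
# forward process in the *continuous* space, backward step projected
# onto the *discrete* speech-unit codebook.

CODEBOOK = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}

def forward_noise(x, alpha_bar, rng):
    """q(x_t | x_0): scale the clean continuous embedding, add Gaussian noise."""
    return tuple(math.sqrt(alpha_bar) * xi + math.sqrt(1 - alpha_bar) * rng.gauss(0, 1)
                 for xi in x)

def nearest_unit(x):
    """Project a continuous vector onto the nearest discrete unit."""
    return min(CODEBOOK, key=lambda u: sum((a - b) ** 2 for a, b in zip(x, CODEBOOK[u])))

def backward_step(x_t, alpha_bar):
    """One toy backward step: rescale toward the data manifold (stand-in for a
    trained denoiser), then quantize the estimate to a discrete unit."""
    x0_hat = tuple(xi / math.sqrt(alpha_bar) for xi in x_t)
    return nearest_unit(x0_hat)

rng = random.Random(0)
x0 = CODEBOOK[3]                                  # clean embedding of unit 3
x_t = forward_noise(x0, alpha_bar=0.9, rng=rng)   # mildly noised continuous vector
print(backward_step(x_t, alpha_bar=0.9))
```

The point of the projection is that the backward trajectory never leaves the set of valid speech units, while distances are still measured in the semantically structured continuous space.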
FrameFire: Enabling Efficient Spiking Neural Network Inference for Video Segmentation
Fast video recognition is essential for real-time scenarios, e.g., autonomous driving. However, applying existing Deep Neural Networks (DNNs) to individual high-resolution images is expensive due to large model sizes. Spiking Neural Networks (SNNs) have been developed as a promising alternative to DNNs due to their more realistic brain-inspired computing models. SNNs exhibit sparse neuron firing over time, i.e., spatio-temporal sparsity, which makes them well suited to energy-efficient computation. However, exploiting the spatio-temporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading energy efficiency. In this work, we therefore propose an SNN accelerator called FrameFire for efficient video processing. We introduce a Keyframe-dominated Workload Balance Schedule (KWBS) method. It accelerates the image recognition network with sparse keyframes, then records and analyzes the current workload distribution on hardware to facilitate scheduling workloads in subsequent frames. FrameFire was implemented on a Xilinx XC7Z035 FPGA and verified on video segmentation tasks. The results show that throughput is improved by 1.7× with the KWBS method. FrameFire achieved 1.04 KFPS throughput and 1.15 mJ/frame recognition energy.
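The keyframe-dominated idea — measure the workload distribution once on a keyframe, then reuse it to pre-schedule the temporally similar frames that follow — can be sketched as below. This is a hypothetical illustration, not the paper's KWBS: the spike counts, the "snake" dealing rule, and the PE count are all assumptions.

```python
# Hypothetical sketch in the spirit of FrameFire's KWBS: profile per-channel
# spike workload on a keyframe, then build a static schedule reused for
# subsequent frames until the next keyframe.

def profile_keyframe(spike_counts):
    """Rank channels by measured keyframe workload, heaviest first."""
    return sorted(range(len(spike_counts)), key=lambda c: -spike_counts[c])

def snake_schedule(ranked_channels, num_pes):
    """Deal ranked channels to PEs in back-and-forth ('snake') order so each
    PE receives a mix of heavy and light channels."""
    schedule = [[] for _ in range(num_pes)]
    for i, ch in enumerate(ranked_channels):
        lap, pos = divmod(i, num_pes)
        pe = pos if lap % 2 == 0 else num_pes - 1 - pos
        schedule[pe].append(ch)
    return schedule

# Keyframe spike counts per channel (illustrative); later frames reuse this.
key_counts = [80, 15, 60, 45, 70, 25, 55, 35]
ranked = profile_keyframe(key_counts)
schedule = snake_schedule(ranked, num_pes=4)
loads = [sum(key_counts[c] for c in pes) for pes in schedule]
print(schedule, loads)
```

Because consecutive video frames are highly correlated, a schedule derived from one sparse keyframe stays close to balanced on the frames that follow — the premise behind profiling only at keyframes.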
CEAT: Continual Expansion and Absorption Transformer for Non-Exemplar Class-Incremental Learning
In real-world applications, dynamic scenarios require models to learn new tasks
continuously without forgetting old knowledge. Experience-Replay methods store a
subset of the old images for joint training, but under stricter privacy
protection, storing old images becomes infeasible, leading to a more severe
plasticity-stability dilemma and classifier bias. To meet these challenges, we
propose a new architecture named the Continual Expansion and Absorption
Transformer (CEAT). The model learns novel knowledge by extending
expanded-fusion layers in parallel with the frozen previous parameters. After
each task ends, we losslessly absorb the extended parameters into the backbone,
ensuring that the number of parameters remains constant. To improve the learning
ability of the model, we design a novel prototype contrastive loss that reduces
the overlap between old and new classes in the feature space. Besides, to
address the classifier's bias towards new classes, we propose a novel approach
that generates pseudo-features to correct the classifier. We evaluate our method
on three standard Non-Exemplar Class-Incremental Learning (NECIL) benchmarks.
Extensive experiments demonstrate that our model significantly improves on
previous works, achieving gains of 5.38%, 5.20%, and 4.92% on CIFAR-100,
TinyImageNet, and ImageNet-Subset, respectively.
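A prototype contrastive loss of the general kind described above can be sketched as an InfoNCE-style objective over class prototypes. This is an assumption-laden illustration, not CEAT's exact formulation: the cosine similarity, temperature, and toy prototypes are all stand-ins.

```python
import math

# Illustrative prototype contrastive loss: pull a feature toward its own
# class prototype and push it away from all other prototypes, reducing
# class overlap in feature space. Not CEAT's exact loss.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def prototype_contrastive_loss(feature, label, prototypes, temperature=0.1):
    """-log softmax over prototype similarities (InfoNCE-style)."""
    logits = [cosine(feature, p) / temperature for p in prototypes]
    m = max(logits)  # subtract the max to stabilize the softmax
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[label]

# A feature near its own prototype incurs a lower loss than one near another class.
protos = [(1.0, 0.0), (0.0, 1.0)]
near = prototype_contrastive_loss((0.9, 0.1), 0, protos)
far = prototype_contrastive_loss((0.1, 0.9), 0, protos)
print(near < far)  # → True
```

Minimizing this loss simultaneously tightens each class around its prototype and enlarges the margins between old and new classes, which is the overlap-reduction effect the abstract describes.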
Genetic code expansion in Pseudomonas putida KT2440
Pseudomonas putida KT2440 is an emerging microbial chassis for bio-based chemical production from renewable feedstocks and for environmental bioremediation. However, tools for studying, engineering, and modulating protein complexes and biosynthetic enzymes in this organism are largely underdeveloped. Genetic code expansion for the incorporation of unnatural amino acids (unAAs) into proteins can advance such efforts and, furthermore, enable additional control of the strain's biological processes. In this work, we established the orthogonality of two widely used archaeal tRNA synthetase and tRNA pairs in KT2440. Following optimization of the decoding systems, four unAAs were incorporated into proteins in response to a UAG stop codon at 34.6-78% efficiency. In addition, we demonstrated the utility of genetic code expansion through the incorporation of a photocrosslinking amino acid, p-benzoyl-L-phenylalanine (pBpa), into glutathione S-transferase (GstA) and a chemosensory response regulator (CheY) for protein-protein interaction studies in KT2440. This work reports the first successful genetic code expansion in KT2440. Given the diverse structures and functions of the unAAs that have been incorporated into proteins using the archaeal systems, our research lays a solid foundation for future work to study and enhance the biological functions of KT2440.
Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training
Audio-visual active speaker detection (AV-ASD) aims to identify which visible
face is speaking in a scene with one or more persons. Most existing AV-ASD
methods prioritize capturing speech-lip correspondence. However, there is a
noticeable gap in addressing the challenges from real-world AV-ASD scenarios.
Due to the presence of low-quality, noisy videos in such cases, AV-ASD systems
without a selective listening ability cannot effectively filter out
disruptive voice components from mixed audio inputs. In this paper, we propose
a Multi-modal Speaker Extraction-to-Detection framework named `MuSED', which is
pre-trained with audio-visual target speaker extraction to learn the denoising
ability, and is then fine-tuned on the AV-ASD task. Meanwhile, to better
capture multi-modal information and cope with real-world problems such as
missing modalities, MuSED operates directly in the time domain and integrates
a multi-modal plus-and-minus augmentation strategy. Our experiments
demonstrate that MuSED substantially outperforms the state-of-the-art AV-ASD
methods and achieves 95.6% mAP on the AVA-ActiveSpeaker dataset, 98.3% AP on
the ASW dataset, and 97.9% F1 on the Columbia AV-ASD dataset, respectively. We
will publicly release the code in due course.
Comment: 10 pages
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
Natural Language Processing (NLP) aims to analyze text or speech with
computational techniques. It serves applications in domains such as
healthcare, commerce, and education. In particular, NLP has been widely
applied to the education domain and its applications have enormous potential to
help teaching and learning. In this survey, we review recent advances in NLP
with a focus on solving problems relevant to the education domain. In detail,
we begin by introducing the related background and the real-world scenarios
in education where NLP techniques could contribute. Then, we present a taxonomy
of NLP in the education domain and highlight typical NLP applications including
question answering, question construction, automated assessment, and error
correction. Next, we illustrate the task definition, challenges, and
corresponding cutting-edge techniques based on the above taxonomy. In
particular, LLM-involved methods are included for discussion due to the wide
usage of LLMs in diverse NLP applications. After that, we showcase some
off-the-shelf demonstrations in this domain. Finally, we conclude with six
promising directions for future research: more datasets in the education
domain, controllable usage of LLMs, difficulty-level control,
interpretable educational NLP, methods with adaptive learning, and integrated
systems for education. We organize all relevant datasets and papers in a
publicly available GitHub repository for reference:
https://github.com/LiXinyuan1015/NLP-for-Education
Research on the Design Methods for Green Renovation of Existing Buildings in Lingnan Region
China’s urbanization has entered a new stage with the promotion of the “Carbon Peaking and Carbon Neutrality Goals” and the “Urban Renewal Strategy”. Problems such as poor comfort, high energy consumption, and unsuitable functions of existing buildings have attracted extensive attention from society. The climate-adapted human environment created by traditional buildings in the Lingnan region offers insights for the green transformation of buildings in this area. This paper summarizes the wisdom embodied in the climate-adaptive construction of traditional Lingnan buildings and proposes a green transformation design scheme that meets the requirements of energy efficiency and comfort, providing a reference for the green renovation design of existing buildings.