96 research outputs found
Transcriptome and Comparative Gene Expression Analysis of Sogatella furcifera (Horváth) in Response to Southern Rice Black-Streaked Dwarf Virus
BACKGROUND: The white-backed planthopper (WBPH), Sogatella furcifera (Horváth), causes great damage to many crops through direct feeding or by transmitting plant viruses. Southern rice black-streaked dwarf virus (SRBSDV), transmitted by WBPH, has become a serious threat to rice production in East Asia. METHODOLOGY/PRINCIPAL FINDINGS: By de novo transcriptome assembly and massively parallel sequencing, we constructed two transcriptomes of WBPH and profiled the alteration of gene expression in response to SRBSDV infection at the transcriptional level. Over 25 million high-quality DNA sequence reads and 81,388 distinct unigenes were generated using Illumina technology from both viruliferous and non-viruliferous WBPH. WBPH has a gene ontology distribution very similar to those of two other closely related rice planthoppers, Nilaparvata lugens and Laodelphax striatellus. We also predicted 7,291 microsatellite loci, which could be useful for further evolutionary analyses. Furthermore, comparative analysis of the two transcriptomes generated from viruliferous and non-viruliferous WBPH yielded a list of candidate transcripts potentially elicited in response to viral infection. Pathway analyses of a subset of these transcripts indicated that SRBSDV infection may perturb primary metabolism and the ubiquitin-proteasome pathway. In addition, 5.5% (181 out of 3,315) of the genes in the cell cytoskeleton organization pathway showed obvious changes. Our data also demonstrated that SRBSDV infection activates the immune regulatory systems of WBPH, such as RNA interference, autophagy, and antimicrobial peptide production. CONCLUSIONS/SIGNIFICANCE: We employed massively parallel sequencing to collect ESTs from viruliferous and non-viruliferous samples of WBPH, obtaining 81,388 distinct unigenes. For the first time, we describe the direct effects of a plant virus of the family Reoviridae on the global gene expression profile of its insect vector using high-throughput sequencing. Our study provides a road map for future investigations of the interactions between Reoviridae viruses and their insect vectors, and suggests new strategies for crop protection.
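To make the comparative step concrete, here is a minimal Python sketch of how differentially expressed unigenes might be flagged from read-count tables by log2 fold change; the unigene IDs, counts, and the 2-fold cutoff are illustrative assumptions, not values or methods from the study.

```python
# Hypothetical sketch: flagging candidate SRBSDV-responsive unigenes by
# log2 fold change between viruliferous and non-viruliferous read counts.
# IDs, counts, and the cutoff below are illustrative, not from the paper.
import math

viruliferous = {"unigene_0001": 480, "unigene_0002": 35, "unigene_0003": 210}
non_viruliferous = {"unigene_0001": 120, "unigene_0002": 30, "unigene_0003": 800}

def log2_fold_change(a: int, b: int, pseudocount: float = 1.0) -> float:
    """log2 ratio with a pseudocount to avoid division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))

candidates = {
    uid: log2_fold_change(viruliferous[uid], non_viruliferous[uid])
    for uid in viruliferous
}
# Keep unigenes whose expression shifts at least 2-fold in either direction.
responsive = {uid: fc for uid, fc in candidates.items() if abs(fc) >= 1.0}
print(responsive)
```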
A simplified multi-model statistical approach for predicting the effects of forest management on land surface temperature in Fennoscandia
Forests interact with the local climate through a variety of biophysical mechanisms. Observational and modelling studies have investigated the effects of forested vs. non-forested areas, but the influence of forest management on surface temperature has received far less attention, owing to the inherent challenges of adapting climate models to represent forest dynamics. Further, climate models are complex and highly parameterized, and the time and resources their use demands limit applications. Simple yet reliable statistical models based on high-resolution maps of forest attributes representative of different development stages can link individual forest management practices to local temperature changes, and ultimately support the design of improved strategies. In this study, we investigate how forest management influences local surface temperature (LST) in Fennoscandia through a set of machine learning algorithms. We find that more developed forests are typically associated with higher LST than young or undeveloped forests. The mean multi-model estimates from our statistical system accurately reproduce the observed LST. Relative to the present state of Fennoscandian forests, fully developed forests are found to induce an annual mean warming of 0.26 °C (0.03/0.69 °C as 5th/95th percentile), and an average cooling effect in summer daytime of between -0.85 and -0.23 °C (depending on the model). By contrast, a scenario with undeveloped forests induces an annual average cooling of -0.29 °C (-0.61/-0.01 °C), but summer daytime warming that can exceed 1 °C. A weak annual mean cooling of -0.01 °C is attributed to forest harvest from 2015 to 2018, with an increase in summer daytime temperature of about 0.04 °C. Overall, this approach is a flexible option for studying the effects of forest management on LST that can be applied at various scales and for alternative management scenarios, thereby helping to improve local management strategies with consideration of effects on local climate.
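As an illustration of the multi-model statistical idea, the following Python sketch fits several regressors that map forest-attribute features to LST and averages their predictions into a multi-model mean; the features, synthetic data, and model choices are assumptions for illustration, not the paper's configuration.

```python
# Minimal sketch of a multi-model statistical approach: several regressors map
# forest-attribute features to land surface temperature (LST), and the
# multi-model mean is used as the estimate. All inputs here are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
# Toy features: e.g. stand age, canopy height, biomass density per grid cell.
X = rng.random((500, 3))
y = 10 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.2, 500)  # synthetic LST (°C)

models = [
    RandomForestRegressor(n_estimators=200, random_state=0),
    GradientBoostingRegressor(random_state=0),
    Ridge(alpha=1.0),
]
for m in models:
    m.fit(X[:400], y[:400])

# Multi-model mean prediction on held-out cells.
preds = np.column_stack([m.predict(X[400:]) for m in models])
lst_estimate = preds.mean(axis=1)
print(lst_estimate[:5])
```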
Leveraging Pre-trained AudioLDM for Text to Sound Generation: A Benchmark Study
Deep neural networks have recently achieved breakthroughs in sound generation
with text prompts. Despite their promising results, current text-to-sound generation models face issues (e.g., overfitting) when trained on small-scale datasets, significantly limiting their performance. In this paper, we investigate the use
of pre-trained AudioLDM, the state-of-the-art model for text-to-audio
generation, as the backbone for sound generation. Our study demonstrates the
advantages of using pre-trained models for text-to-sound generation, especially
in data-scarcity scenarios. In addition, experiments show that different
training strategies (e.g., training conditions) may affect the performance of
AudioLDM on datasets of different scales. To facilitate future studies, we also
evaluate various text-to-sound generation systems on several frequently used
datasets under the same evaluation protocols, allowing fair comparison and benchmarking of these methods on common ground.
Comment: EUSIPCO 202
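To make one of the compared training strategies concrete, here is a hedged PyTorch sketch of fine-tuning in a data-scarce setting by freezing a pre-trained backbone and training only a small adapter; `PretrainedBackbone`, the adapter, and the toy data are placeholders, not the actual AudioLDM training code.

```python
# Hedged sketch of one possible fine-tuning strategy: freeze a pre-trained
# backbone and train only a lightweight head on a small dataset.
import torch
import torch.nn as nn

class PretrainedBackbone(nn.Module):  # stand-in for a pre-trained model core
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(128, 128)
    def forward(self, z):
        return self.net(z)

backbone = PretrainedBackbone()
for p in backbone.parameters():      # freeze pre-trained weights
    p.requires_grad = False

head = nn.Linear(128, 128)           # small trainable adapter for the new task
opt = torch.optim.AdamW(head.parameters(), lr=1e-4)

for step in range(100):              # toy loop over a small dataset
    z = torch.randn(8, 128)          # stand-in latent batch
    target = torch.randn(8, 128)     # stand-in regression target
    loss = nn.functional.mse_loss(head(backbone(z)), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```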
Text-Driven Foley Sound Generation With Latent Diffusion Model
Foley sound generation aims to synthesise the background sound for multimedia
content. Previous models usually employ a large development set with labels as input (e.g., single numbers or one-hot vectors). In this work, we propose a
diffusion model based system for Foley sound generation with text conditions.
To alleviate the data scarcity issue, our model is initially pre-trained with
large-scale datasets and fine-tuned to this task via transfer learning using
the contrastive language-audio pretraining (CLAP) technique. We have observed
that the feature embedding extracted by the text encoder can significantly
affect the performance of the generation model. Hence, we introduce a trainable
layer after the encoder to improve the text embedding produced by the encoder.
In addition, we further refine the generated waveform by generating multiple
candidate audio clips simultaneously and selecting the best one, which is
determined in terms of the similarity score between the embedding of the
candidate clips and the embedding of the target text label. Using the proposed
method, our system ranks among the systems submitted to DCASE
Challenge 2023 Task 7. The results of the ablation studies illustrate that the
proposed techniques significantly improve sound generation performance. The
codes for implementing the proposed system are available online.
Comment: Submitted to DCASE Workshop 2023. arXiv admin note: text overlap with arXiv:2305.1590
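The candidate-selection step can be illustrated with a short PyTorch sketch: several clips are generated for one prompt, and the clip whose audio embedding is most similar to the text embedding is kept; `generate_audio`, `embed_audio`, and `embed_text` are hypothetical stand-ins for the system's generator and its CLAP-style encoders.

```python
# Sketch of candidate selection by embedding similarity: generate several
# audio clips for one prompt and keep the one closest to the text embedding.
import torch
import torch.nn.functional as F

def generate_audio(prompt: str) -> torch.Tensor:     # placeholder generator
    return torch.randn(16000)

def embed_audio(wav: torch.Tensor) -> torch.Tensor:  # placeholder audio encoder
    return torch.randn(512)

def embed_text(prompt: str) -> torch.Tensor:         # placeholder text encoder
    return torch.randn(512)

def best_candidate(prompt: str, n: int = 4) -> torch.Tensor:
    text_emb = embed_text(prompt)
    clips = [generate_audio(prompt) for _ in range(n)]
    scores = [F.cosine_similarity(embed_audio(c), text_emb, dim=0) for c in clips]
    return clips[int(torch.stack(scores).argmax())]

clip = best_candidate("footsteps on gravel")
```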
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Text-to-audio (TTA) systems have recently gained attention for their ability to synthesize general audio from text descriptions. However, previous TTA studies have suffered from limited generation quality and high computational costs. In this
study, we propose AudioLDM, a TTA system that is built on a latent space to
learn the continuous audio representations from contrastive language-audio
pretraining (CLAP) latents. The pretrained CLAP models enable us to train LDMs
with audio embedding while providing text embedding as a condition during
sampling. By learning the latent representations of audio signals and their
compositions without modeling the cross-modal relationship, AudioLDM is
advantageous in both generation quality and computational efficiency. Trained
on AudioCaps with a single GPU, AudioLDM achieves state-of-the-art TTA
performance measured by both objective and subjective metrics (e.g., Fréchet
distance). Moreover, AudioLDM is the first TTA system that enables various
text-guided audio manipulations (e.g., style transfer) in a zero-shot fashion.
Our implementation and demos are available at https://audioldm.github.io.
Comment: Accepted by ICML 2023. Demo and implementation at https://audioldm.github.io. Evaluation toolbox at https://github.com/haoheliu/audioldm_eva
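A schematic of the CLAP-conditioning idea described in the abstract follows: because CLAP embeds audio and text in a shared space, a latent diffusion model can be trained with audio embeddings as the condition and then sampled with text embeddings; every function below is an illustrative placeholder, not the released implementation.

```python
# Schematic of CLAP conditioning: train the diffusion model on the audio's
# own CLAP embedding, then swap in a text embedding from the shared space
# at sampling time. All components are illustrative stand-ins.
import torch

def clap_audio_embed(wav: torch.Tensor) -> torch.Tensor:
    return torch.randn(512)          # stand-in: 512-d shared CLAP space

def clap_text_embed(prompt: str) -> torch.Tensor:
    return torch.randn(512)          # stand-in: same space as audio embeddings

def diffusion_train_step(latent, cond):   # placeholder training step
    pass

def diffusion_sample(cond):               # placeholder sampler
    return torch.randn(8, 16, 16)         # stand-in audio latent

# Training: condition on the audio's own CLAP embedding (no text needed).
wav = torch.randn(16000)
diffusion_train_step(torch.randn(8, 16, 16), clap_audio_embed(wav))

# Sampling: condition on a text embedding instead.
latent = diffusion_sample(clap_text_embed("a dog barking in the rain"))
```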
Sparks of Large Audio Models: A Survey and Outlook
This survey paper provides a comprehensive overview of the recent
advancements and challenges in applying large language models to the field of
audio signal processing. Audio processing, with its diverse signal
representations and a wide range of sources--from human voices to musical
instruments and environmental sounds--poses challenges distinct from those
found in traditional Natural Language Processing scenarios. Nevertheless,
Large Audio Models, epitomized by transformer-based architectures, have shown marked efficacy in this sphere. By leveraging massive amounts of data, these models have demonstrated prowess in a variety of audio tasks, spanning Automatic Speech Recognition, Text-To-Speech, and Music Generation, among others. Notably, foundational audio models such as SeamlessM4T have recently started to show the ability to act as universal
translators, supporting multiple speech tasks for up to 100 languages without
any reliance on separate task-specific systems. This paper presents an in-depth
analysis of state-of-the-art methodologies regarding Foundational Large Audio Models, their performance benchmarks, and their applicability to
real-world scenarios. We also highlight current limitations and provide
insights into potential future research directions in the realm of
Large Audio Models, with the intent to spark further discussion,
thereby fostering innovation in the next generation of audio-processing
systems. Furthermore, to keep pace with the rapid development in this area, we will continually update the repository of recent articles and their open-source implementations at https://github.com/EmulationAI/awesome-large-audio-models.
Comment: work in progress, Repo URL: https://github.com/EmulationAI/awesome-large-audio-model
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Although audio generation shares commonalities across different types of
audio, such as speech, music, and sound effects, designing models for each type
requires careful consideration of specific objectives and biases that can
significantly differ from those of other types. To bring us closer to a unified
perspective of audio generation, this paper proposes a framework that utilizes
the same learning method for speech, music, and sound effect generation. Our
framework introduces a general representation of audio, called "language of
audio" (LOA). Any audio can be translated into LOA based on AudioMAE, a
self-supervised pre-trained representation learning model. In the generation
process, we translate any modality into LOA using a GPT-2 model, and we
perform self-supervised audio generation learning with a latent diffusion model
conditioned on LOA. The proposed framework naturally brings advantages such as
in-context learning abilities and reusable self-supervised pretrained AudioMAE
and latent diffusion models. Experiments on the major benchmarks of
text-to-audio, text-to-music, and text-to-speech demonstrate state-of-the-art
or competitive performance against previous approaches. Our code, pretrained
model, and demo are available at https://audioldm.github.io/audioldm2.
Comment: AudioLDM 2 project page is https://audioldm.github.io/audioldm
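The two-stage pipeline can be sketched as follows, with both stages reduced to placeholders: a GPT-2-style model maps the input modality to LOA features (AudioMAE-sized vectors are an assumption here), and an LOA-conditioned latent diffusion model produces the audio latent.

```python
# Schematic of the two-stage design: translate the input into the "language
# of audio" (LOA), then run LOA-conditioned latent diffusion. Placeholders only.
import torch

def text_to_loa(prompt: str, n_tokens: int = 64) -> torch.Tensor:
    """Stand-in for the autoregressive (GPT-2) stage producing LOA features."""
    return torch.randn(n_tokens, 768)   # AudioMAE-sized vectors (assumed)

def loa_to_audio_latent(loa: torch.Tensor) -> torch.Tensor:
    """Stand-in for the LOA-conditioned latent diffusion stage."""
    return torch.randn(8, 16, 16)

loa = text_to_loa("gentle piano over rainfall")
audio_latent = loa_to_audio_latent(loa)   # decoded to a waveform downstream
```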
WavJourney: Compositional Audio Creation with Large Language Models
Large Language Models (LLMs) have shown great promise in integrating diverse
expert models to tackle intricate language and vision tasks. Despite their
significance in advancing the field of Artificial Intelligence Generated
Content (AIGC), their potential in intelligent audio content creation remains
unexplored. In this work, we tackle the problem of creating audio content with
storylines encompassing speech, music, and sound effects, guided by text
instructions. We present WavJourney, a system that leverages LLMs to connect
various audio models for audio content generation. Given a text description of
an auditory scene, WavJourney first prompts LLMs to generate a structured
script dedicated to audio storytelling. The audio script incorporates diverse
audio elements, organized based on their spatio-temporal relationships. As a
conceptual representation of audio, the audio script provides an interactive
and interpretable rationale for human engagement. Afterward, the audio script is fed into a script compiler, which converts it into a computer program. Each line
of the program calls a task-specific audio generation model or computational
operation function (e.g., concatenate, mix). The computer program is then
executed to obtain an explainable solution for audio generation. We demonstrate
the practicality of WavJourney across diverse real-world scenarios, including
science fiction, education, and radio play. The explainable and interactive
design of WavJourney fosters human-machine co-creation in multi-round
dialogues, enhancing creative control and adaptability in audio production.
WavJourney audiolizes the human imagination, opening up new avenues for
creativity in multimedia content creation.
Comment: Project Page: https://audio-agi.github.io/WavJourney_demopage
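As a toy illustration of the script-compiler idea, the sketch below executes a hand-written audio script by calling a placeholder generation model and computational operations such as mix and concatenate; the script schema and the generator are assumptions, not WavJourney's actual format or models.

```python
# Toy script executor: each step calls a (placeholder) generation model or a
# computational operation (mix, concatenate), mirroring the compiler idea.
import numpy as np

SR = 16000

def generate(desc: str, seconds: float) -> np.ndarray:  # placeholder model call
    return np.random.randn(int(SR * seconds)) * 0.1

def mix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    n = max(len(a), len(b))
    out = np.zeros(n)
    out[:len(a)] += a
    out[:len(b)] += b
    return out

script = [
    {"op": "generate", "desc": "narrator speaking", "seconds": 3.0},
    {"op": "mix_with", "desc": "soft rain ambience", "seconds": 3.0},
    {"op": "append", "desc": "thunder clap", "seconds": 1.0},
]

track = np.zeros(0)
for step in script:
    clip = generate(step["desc"], step["seconds"])
    if step["op"] == "generate":
        track = clip
    elif step["op"] == "mix_with":
        track = mix(track, clip)
    elif step["op"] == "append":
        track = np.concatenate([track, clip])
```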