56 research outputs found

    Existence and multiplicity of positive bound states for Schrödinger equations

    Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech

    Modelling prosody variation is critical for synthesizing natural and expressive speech in end-to-end text-to-speech (TTS) systems. In this paper, a cross-utterance conditional VAE (CUC-VAE) is proposed to estimate a posterior probability distribution of the latent prosody features for each phoneme by conditioning on acoustic features, speaker information, and text features obtained from both past and future sentences. At inference time, instead of the standard Gaussian distribution used by VAE, CUC-VAE allows sampling from an utterance-specific prior distribution conditioned on cross-utterance information, which allows the prosody features generated by the TTS system to be related to the context and is more similar to how humans naturally produce prosody. The performance of CUC-VAE is evaluated via a qualitative listening test for naturalness, intelligibility and quantitative measurements, including word error rates and the standard deviation of prosody attributes. Experimental results on LJ-Speech and LibriTTS data show that the proposed CUC-VAE TTS system improves naturalness and prosody diversity with clear margins.

    Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework

    The socially-aware navigation system has evolved to adeptly avoid various obstacles while performing multiple tasks, such as point-to-point navigation, human-following, and -guiding. However, a prominent gap persists: in Human-Robot Interaction (HRI), the procedure of communicating commands to robots demands intricate mathematical formulations. Furthermore, the transition between tasks does not quite possess the intuitive control and user-centric interactivity that one would desire. In this work, we propose an LLM-driven interactive multimodal multitask robot navigation framework, termed LIM2N, to solve the above new challenge in the navigation field. We achieve this by first introducing a multimodal interaction framework where language and hand-drawn inputs can serve as navigation constraints and control objectives. Next, a reinforcement learning agent is built to handle multiple tasks with the received information. Crucially, LIM2N creates smooth cooperation among the reasoning of multimodal input, multitask planning, and adaptation and processing of the intelligent sensing modules in the complicated system. Extensive experiments are conducted in both simulation and the real world demonstrating that LIM2N has superior user needs understanding, alongside an enhanced interactive experience.

    Cross-Utterance Conditioned VAE for Speech Generation

    Speech synthesis systems powered by neural networks hold promise for multimedia production, but frequently face issues with producing expressive speech and seamless editing. In response, we present the Cross-Utterance Conditioned Variational Autoencoder speech synthesis (CUC-VAE S2) framework to enhance prosody and ensure natural speech generation. This framework leverages the powerful representational capabilities of pre-trained language models and the re-expression abilities of variational autoencoders (VAEs). The core component of the CUC-VAE S2 framework is the cross-utterance CVAE, which extracts acoustic, speaker, and textual features from surrounding sentences to generate context-sensitive prosodic features, more accurately emulating human prosody generation. We further propose two practical algorithms tailored for distinct speech synthesis applications: CUC-VAE TTS for text-to-speech and CUC-VAE SE for speech editing. The CUC-VAE TTS is a direct application of the framework, designed to generate audio with contextual prosody derived from surrounding texts. On the other hand, the CUC-VAE SE algorithm leverages real mel spectrogram sampling conditioned on contextual information, producing audio that closely mirrors real sound and thereby facilitating flexible speech editing based on text such as deletion, insertion, and replacement. Experimental results on the LibriTTS datasets demonstrate that our proposed models significantly enhance speech synthesis and editing, producing more natural and expressive speech. Comment: 13 pages.

    Structural bias in T4 RNA ligase-mediated 3′-adapter ligation

    T4 RNA ligases are commonly used to attach adapters to RNAs, but large differences in ligation efficiency make detection and quantitation problematic. We developed a ligation selection strategy using random RNAs in combination with high-throughput sequencing to gain insight into the differences in efficiency of ligating pre-adenylated DNA adapters to RNA 3′-ends. After analyzing biases in RNA sequence, secondary structure and RNA-adapter cofold structure, we conclude that T4 RNA ligases do not show significant primary sequence preference in RNA substrates, but are biased against structural features within RNAs and adapters. Specifically, RNAs with less than three unstructured nucleotides at the 3′-end and RNAs that are predicted to cofold with an adapter in unfavorable structures are likely to be poorly ligated. The effect of RNA-adapter cofold structures on ligation is supported by experiments where the ligation efficiency of specific miRNAs was changed by designing adapters to alter cofold structure. In addition, we show that using adapters with randomized regions results in higher ligation efficiency and reduced ligation bias. We propose that using randomized adapters may improve RNA representation in experiments that include a 3′-adapter ligation step.

    NtGNL1 Plays an Essential Role in Pollen Tube Tip Growth and Orientation Likely via Regulation of Post-Golgi Trafficking

    Background: Tobacco GNOM LIKE 1 (NtGNL1), a new member of the Big/GBF family, is characterized by a sec 7 domain. Thus, we proposed that NtGNL1 may function in regulating pollen tube growth for vesicle trafficking. Methodology/Principal Findings: To test this hypothesis, we used an RNAi technique to down-regulate NtGNL1 expression and found that pollen tube growth and orientation were clearly inhibited. Cytological observations revealed that both timing and behavior of endocytosis were disrupted, and endosome trafficking to prevacuolar compartments (PVC) or multivesicular bodies (MVB) was altered in pollen tube tips. Moreover, NtGNL1 seemed to partially overlap with Golgi bodies, but clearly colocalized with putative late endosome compartments. We also observed that in such pollen tubes, the Golgi apparatus disassembled and fused with the endoplasmic reticulum, indicating abnormal post-Golgi trafficking. During this process, actin organization was also remodeled. Conclusions/Significance: Thus, we revealed that NtGNL1 is essential for pollen tube growth and orientation and it likel

    6G Network AI Architecture for Everyone-Centric Customized Services

    Mobile communication standards were developed for enhancing transmission and network performance by using more radio resources and improving spectrum and energy efficiency. How to effectively address diverse user requirements and guarantee everyone's Quality of Experience (QoE) remains an open problem. The Sixth Generation (6G) mobile systems will solve this problem by utilizing heterogeneous network resources and pervasive intelligence to support everyone-centric customized services anywhere and anytime. In this article, we first coin the concept of Service Requirement Zone (SRZ) on the user side to characterize and visualize the integrated service requirements and preferences of specific tasks of individual users. On the system side, we further introduce the concept of User Satisfaction Ratio (USR) to evaluate the system's overall service ability of satisfying a variety of tasks with different SRZs. Then, we propose a network Artificial Intelligence (AI) architecture with integrated network resources and pervasive AI capabilities for supporting customized services with guaranteed QoEs. Finally, extensive simulations show that the proposed network AI architecture can consistently offer a higher USR performance than the cloud AI and edge AI architectures with respect to different task scheduling algorithms, random service requirements, and dynamic network conditions.

    Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history

    Colobines are a unique group of Old World monkeys that principally eat leaves and seeds rather than fruits and insects. We report the sequencing at 146× coverage, de novo assembly and analyses of the genome of a male golden snub-nosed monkey (Rhinopithecus roxellana) and resequencing at 30× coverage of three related species (Rhinopithecus bieti, Rhinopithecus brelichi and Rhinopithecus strykeri). Comparative analyses showed that Asian colobines have an enhanced ability to derive energy from fatty acids and to degrade xenobiotics. We found evidence for functional evolution in the colobine RNASE1 gene, encoding a key secretory RNase that digests the high concentrations of bacterial RNA derived from symbiotic microflora. Demographic reconstructions indicated that the profile of ancient effective population sizes for R. roxellana more closely resembles that of giant panda rather than its congeners. These findings offer new insights into the dietary adaptations and evolutionary history of colobine primates.

    Resource allocation in broadband wireless networks

    Published or final version. Electrical and Electronic Engineering. Doctoral. Doctor of Philosophy.