
    Research on the Application of Data Mining Technology in Educational Management System

    Data mining technology has substantially changed how people understand and analyze data. As a new kind of information-processing technology, it can provide decision makers with information support and promote the integrated development of educational reform. At present, the educational management systems of colleges and universities mainly center on handling daily routines: the mass of data is merely stored rather than put to use, and the significance hidden within it often goes undiscovered. Therefore, starting from the definition of data mining, this paper analyzes the necessity of applying data mining technology in the educational management system and then studies its application there. Keywords: Data Mining; Colleges and Universities; Educational Management System; Application. DOI: 10.7176/JEP/11-8-14. Publication date: March 31st 202

    Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

    We are interested in a novel task, namely low-resource text-to-talking avatar. Given only a few-minute-long talking-person video with its audio track as training data and arbitrary text as the driving input, we aim to synthesize high-quality talking-portrait videos corresponding to the input text. This task has broad application prospects in the digital human industry but has not yet been achieved technically, due to two challenges: (1) it is difficult for a traditional multi-speaker text-to-speech system to mimic the timbre of out-of-domain audio; and (2) it is hard to render high-fidelity, lip-synchronized talking avatars with limited training data. In this paper, we introduce Adaptive Text-to-Talking Avatar (Ada-TTA), which (1) designs a generic zero-shot multi-speaker TTS model that cleanly disentangles text content, timbre, and prosody; and (2) embraces recent advances in neural rendering to achieve realistic audio-driven talking-face video generation. With these designs, our method overcomes the two challenges above and generates identity-preserving speech and realistic talking-person video. Experiments demonstrate that our method synthesizes realistic, identity-preserving, and audio-visually synchronized talking-avatar videos. Comment: 6 pages, 3 figures
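    The two-stage flow this abstract describes (a zero-shot TTS model that disentangles content, timbre, and prosody, followed by audio-driven neural rendering) can be illustrated with a minimal sketch. All class and method names below are hypothetical placeholders, not the authors' API:

```python
# Conceptual sketch of an Ada-TTA-style two-stage inference flow.
# Every name here is an illustrative placeholder, not the paper's code.
import numpy as np

class ZeroShotTTS:
    """Stage 1: zero-shot multi-speaker TTS separating content, timbre, and prosody."""
    def extract_timbre(self, reference_audio: np.ndarray) -> np.ndarray:
        # Placeholder: a real model would return a learned speaker/timbre embedding.
        return np.array([reference_audio.mean(), reference_audio.std()])

    def synthesize(self, text: str, timbre: np.ndarray, sr: int = 16000) -> np.ndarray:
        # Placeholder waveform; a real model predicts prosody for the text and
        # decodes speech conditioned on the timbre embedding.
        duration_s = max(1, len(text) // 15)
        return np.zeros(duration_s * sr) + timbre[0]

class TalkingHeadRenderer:
    """Stage 2: audio-driven neural renderer trained on the few-minute target video."""
    def render(self, speech: np.ndarray, fps: int = 25, sr: int = 16000) -> np.ndarray:
        n_frames = int(len(speech) / sr * fps)
        # Placeholder frames; a real renderer outputs lip-synced portrait images.
        return np.zeros((n_frames, 256, 256, 3), dtype=np.uint8)

if __name__ == "__main__":
    tts, renderer = ZeroShotTTS(), TalkingHeadRenderer()
    reference_clip = np.random.randn(16000 * 5)       # 5 s of enrollment audio
    timbre = tts.extract_timbre(reference_clip)
    speech = tts.synthesize("Hello, this is a demo sentence.", timbre)
    video = renderer.render(speech)
    print(speech.shape, video.shape)
```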

    Multisystem optimization for an integrated production scheduling with resource saving problem in textile printing and dyeing

    Resource saving has become an integral aspect of manufacturing in Industry 4.0. This paper proposes a multisystem optimization (MSO) algorithm, inspired by the implicit parallelism of heuristic methods, to solve an integrated production scheduling with resource saving problem in textile printing and dyeing. First, a real-world integrated production scheduling with resource saving task is formulated as a multisystem optimization problem. The MSO algorithm is then proposed to solve multisystem optimization problems consisting of several coupled subsystems, each of which may contain multiple objectives and multiple constraints. The proposed MSO algorithm is composed of a within-subsystem evolution operator and a cross-subsystem migration operator: the former optimizes each subsystem with effective evolutionary operators, while the latter shares information among subsystems to accelerate global optimization of the whole system. Performance is tested on a set of multisystem benchmark functions and compared with an improved NSGA-II and the multiobjective multifactorial evolutionary algorithm (MO-MFEA). Simulation results show that the MSO algorithm outperforms the compared algorithms on the benchmark functions studied in this paper. Finally, the MSO algorithm is successfully applied to the proposed integrated production scheduling with resource saving problem, and the results show that MSO is a promising algorithm for the studied problem. © 2020 Haiping Ma et al.
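    The within-subsystem evolution and cross-subsystem migration structure described above can be summarized in a short sketch; the toy objectives, operators, and parameter choices are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of a multisystem optimization (MSO) loop:
# evolve each subsystem independently, then periodically migrate
# good solutions between subsystems to share information.
import numpy as np

rng = np.random.default_rng(0)

# Three coupled "subsystems", each with its own (toy) single objective over a shared encoding.
objectives = [
    lambda x: np.sum(x ** 2),           # sphere
    lambda x: np.sum(np.abs(x)),        # L1 norm
    lambda x: np.sum((x - 1.0) ** 2),   # shifted sphere
]

def evolve_subsystem(pop, f, sigma=0.1):
    """Within-subsystem evolution: Gaussian mutation with greedy selection."""
    children = pop + rng.normal(0.0, sigma, pop.shape)
    keep = np.array([f(c) < f(p) for c, p in zip(children, pop)])
    pop[keep] = children[keep]
    return pop

def migrate(pops, objectives, k=2):
    """Cross-subsystem migration: each subsystem's best k solutions replace
    the worst k solutions of the next subsystem (ring topology)."""
    for i, pop in enumerate(pops):
        j = (i + 1) % len(pops)
        best = pop[np.argsort([objectives[i](x) for x in pop])[:k]]
        worst_idx = np.argsort([objectives[j](x) for x in pops[j]])[-k:]
        pops[j][worst_idx] = best
    return pops

pops = [rng.normal(0, 2, (20, 5)) for _ in objectives]
for gen in range(200):
    pops = [evolve_subsystem(p, f) for p, f in zip(pops, objectives)]
    if gen % 10 == 0:                   # migrate every few generations
        pops = migrate(pops, objectives)

print([min(f(x) for x in p) for p, f in zip(pops, objectives)])
```

    The migration period, topology, and number of migrants are design choices; the paper's operators also handle multiple objectives and constraints per subsystem, which this single-objective sketch omits.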

    Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

    Large diffusion models have been successful in text-to-audio (T2A) synthesis tasks, but they often suffer from common issues such as semantic misalignment and poor temporal consistency due to limited natural language understanding and data scarcity. Additionally, the 2D spatial structures widely used in T2A work lead to unsatisfactory audio quality when generating variable-length audio samples, since they do not adequately prioritize temporal information. To address these challenges, we propose Make-an-Audio 2, a latent-diffusion-based T2A method that builds on the success of Make-an-Audio. Our approach includes several techniques to improve semantic alignment and temporal consistency. First, we use pre-trained large language models (LLMs) to parse the text into structured pairs for better capture of temporal information. We also introduce another structured-text encoder to aid in learning semantic alignment during the diffusion denoising process. To improve variable-length generation and enhance temporal information extraction, we design a feed-forward Transformer-based diffusion denoiser. Finally, we use LLMs to augment and transform a large amount of audio-label data into audio-text datasets to alleviate the scarcity of temporal data. Extensive experiments show that our method outperforms baseline models in both objective and subjective metrics and achieves significant gains in temporal information understanding, semantic consistency, and sound quality.
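    A minimal illustration of the structured, temporally ordered text representation mentioned above is sketched below; the rule-based parser and the pair format are stand-ins for the paper's LLM-based parsing, introduced here only for illustration:

```python
# Toy illustration of turning a free-form caption into ordered (event, order) pairs
# that a structured-text encoder could consume. This is NOT the paper's LLM pipeline;
# the split rules and the pair format are assumptions made for the example.
import re

def parse_caption(caption: str):
    """Split a caption into ordered sound events on simple temporal cue words."""
    clauses = re.split(r"\b(?:then|followed by|after that)\b", caption.lower())
    events = [c.strip(" ,.") for c in clauses if c.strip(" ,.")]
    # Attach an order tag per event.
    return [(event, "start" if i == 0 else "mid" if i < len(events) - 1 else "end")
            for i, event in enumerate(events)]

print(parse_caption("A dog barks, then a car passes by, followed by birds chirping."))
# [('a dog barks', 'start'), ('a car passes by', 'mid'), ('birds chirping', 'end')]
```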

    Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts

    Zero-shot text-to-speech aims at synthesizing voices with unseen speech prompts. Previous large-scale multispeaker TTS models have successfully achieved this goal with an enrolled recording within 10 seconds. However, most of them are designed to utilize only short speech prompts, and the limited information in short prompts significantly hinders fine-grained identity imitation. In this paper, we introduce Mega-TTS 2, a generic zero-shot multispeaker TTS model capable of synthesizing speech for unseen speakers with arbitrary-length prompts. Specifically, we 1) design a multi-reference timbre encoder to extract timbre information from multiple reference speeches, and 2) train a prosody language model with arbitrary-length speech prompts. With these designs, our model is suitable for prompts of different lengths, which extends the upper bound of speech quality for zero-shot text-to-speech. Beyond arbitrary-length prompts, we introduce arbitrary-source prompts, which leverage the probabilities derived from multiple P-LLM outputs to produce expressive and controlled prosody. Furthermore, we propose a phoneme-level autoregressive duration model to bring in-context learning capabilities to duration modeling. Experiments demonstrate that our method can not only synthesize identity-preserving speech with a short prompt from an unseen speaker but also achieve improved performance with longer speech prompts. Audio samples can be found at https://mega-tts.github.io/mega2_demo/
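    The multi-reference timbre idea can be sketched as follows: encode each reference clip into an embedding and aggregate the embeddings so that any number of prompts, of any length, can be used. The encoder and pooling below are toy stand-ins, not the Mega-TTS 2 architecture:

```python
# Toy multi-reference "timbre" aggregation: per-clip embeddings pooled with
# length weighting. Both the embedding and the pooling are illustrative stand-ins.
import numpy as np

def encode_clip(waveform: np.ndarray, dim: int = 8) -> np.ndarray:
    """Toy per-clip embedding from coarse spectral band energies."""
    spectrum = np.abs(np.fft.rfft(waveform))
    bands = np.array_split(spectrum, dim)
    return np.array([b.mean() for b in bands])

def aggregate_timbre(clips: list[np.ndarray]) -> np.ndarray:
    """Length-weighted mean over clip embeddings (a stand-in for learned pooling)."""
    embs = np.stack([encode_clip(c) for c in clips])
    weights = np.array([len(c) for c in clips], dtype=float)
    return (embs * weights[:, None]).sum(axis=0) / weights.sum()

rng = np.random.default_rng(0)
prompts = [rng.standard_normal(16000 * s) for s in (3, 10, 60)]  # 3 s, 10 s, 60 s clips
print(aggregate_timbre(prompts).shape)  # (8,)
```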

    Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

    Scaling text-to-speech to large, in-the-wild datasets has proven highly effective for generalizing timbre and speech style, particularly in zero-shot TTS. However, previous works usually encode speech into latents with an audio codec and use autoregressive language models or diffusion models to generate them, which ignores the intrinsic nature of speech and may lead to inferior or uncontrollable results. We argue that speech can be decomposed into several attributes (e.g., content, timbre, prosody, and phase), and each of them should be modeled by a module with appropriate inductive biases. From this perspective, we carefully design a novel and large zero-shot TTS system called Mega-TTS, which is trained on large-scale wild data and models different attributes in different ways: 1) instead of using latents encoded by an audio codec as the intermediate feature, we choose the spectrogram, as it separates the phase from the other attributes very well; the phase can be appropriately constructed by a GAN-based vocoder and does not need to be modeled by the language model. 2) We model timbre with global vectors, since timbre is a global attribute that changes slowly over time. 3) We further use a VQGAN-based acoustic model to generate the spectrogram and a latent-code language model to fit the distribution of prosody, since prosody changes quickly within a sentence and language models can capture both local and long-range dependencies. We scale Mega-TTS to multi-domain datasets with 20K hours of speech and evaluate its performance on unseen speakers. Experimental results demonstrate that Mega-TTS surpasses state-of-the-art TTS systems on zero-shot TTS, speech editing, and cross-lingual TTS tasks, with superior naturalness, robustness, and speaker similarity due to the proper inductive bias of each module. Audio samples are available at https://mega-tts.github.io/demo-page
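    The attribute-wise decomposition described above can be laid out as a modular pipeline sketch: a global timbre vector, prosody codes from a latent-code language model, a spectrogram from the acoustic model, and phase reconstruction left to the vocoder. All components below are stubs with hypothetical names, not the Mega-TTS implementation:

```python
# Structural sketch of modeling each speech attribute with its own module.
# Every function is a stub; shapes and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def timbre_encoder(reference_mels: np.ndarray) -> np.ndarray:
    # Global timbre vector: timbre changes slowly, so average over time frames.
    return reference_mels.mean(axis=0)

def prosody_lm(phonemes: list, timbre: np.ndarray) -> np.ndarray:
    # Stand-in for a latent-code language model sampling one prosody code per phoneme.
    return rng.integers(0, 1024, size=len(phonemes))

def acoustic_model(phonemes, prosody_codes, timbre, n_mels: int = 80) -> np.ndarray:
    # Stand-in for a VQGAN-style acoustic model producing a mel-spectrogram.
    frames_per_phoneme = 5
    return rng.standard_normal((len(phonemes) * frames_per_phoneme, n_mels))

def vocoder(mel: np.ndarray, hop: int = 256) -> np.ndarray:
    # Stand-in for the GAN vocoder, which reconstructs the phase and waveform.
    return rng.standard_normal(mel.shape[0] * hop)

reference = rng.standard_normal((500, 80))        # mel frames from an unseen speaker
phonemes = list("HH AH L OW _ W ER L D".split())  # toy phoneme sequence
timbre = timbre_encoder(reference)
mel = acoustic_model(phonemes, prosody_lm(phonemes, timbre), timbre)
print(vocoder(mel).shape)
```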

    Anchoring Cu1 species over nanodiamond-graphene for semi-hydrogenation of acetylene

    The design of cheap, non-toxic, and earth-abundant transition metal catalysts for selective hydrogenation of alkynes remains a challenge in both industry and academia. Here, we report a new atomically dispersed copper (Cu) catalyst supported on a defective nanodiamond-graphene hybrid (ND@G), which exhibits excellent catalytic performance for the selective conversion of acetylene to ethylene, i.e., with high conversion (95%), high selectivity (98%), and good stability (for more than 60 h). The unique structural feature of the Cu atoms anchored over graphene through Cu-C bonds ensures the effective activation of acetylene and easy desorption of ethylene, which is the key to the outstanding activity and selectivity of the catalyst.

    Weighted gene co-expression network analysis identifies genes related to HG Type 0 resistance and verification of hub gene GmHg1

    Introduction: The soybean cyst nematode (SCN) is a major disease in soybean production that seriously affects soybean yield. At present, there are no studies on weighted gene co-expression network analysis (WGCNA) related to SCN resistance. Methods: Here, transcriptome data from 36 soybean roots under SCN HG Type 0 (race 3) stress were used in WGCNA to identify significant modules. Results and Discussion: A total of 10,000 differentially expressed genes and 21 modules were identified, of which the module most related to SCN was turquoise. In addition, the hub gene GmHg1, which has high connectivity, was selected and its function was verified. GmHg1 encodes a serine/threonine protein kinase (PK), and the expression of GmHg1 in the SCN-resistant cultivar 'Dongnong L-204' and the SCN-susceptible cultivar 'Heinong 37' increased significantly after HG Type 0 stress. Soybean plants transformed with GmHg1-OX showed significantly increased SCN resistance, whereas GmHg1-RNAi transgenic soybean plants showed significantly reduced resistance. In the transgenic materials, the expression patterns of 11 genes in the turquoise module with the same expression trend as GmHg1 were analyzed; these 11 genes were co-expressed with GmHg1 and may be involved in soybean resistance to SCN. Our work provides a new direction for studying the molecular mechanism of soybean resistance to SCN.
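    A conceptual sketch of the WGCNA-style workflow follows: build a soft-thresholded co-expression adjacency, cluster genes into modules, and rank candidate hub genes by intramodular connectivity. It uses simulated data and simplified steps, not the authors' pipeline or the WGCNA R package:

```python
# Toy WGCNA-style co-expression analysis on simulated data:
# correlation -> soft-threshold adjacency -> hierarchical module detection -> hub gene.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_samples, n_genes, beta = 36, 200, 6           # 36 root samples, soft-threshold power 6
expr = rng.standard_normal((n_samples, n_genes))

corr = np.corrcoef(expr, rowvar=False)           # gene-gene correlation
adjacency = np.abs(corr) ** beta                 # soft-threshold adjacency
dissimilarity = 1.0 - adjacency

# Hierarchical clustering of the dissimilarity matrix into co-expression modules.
condensed = dissimilarity[np.triu_indices(n_genes, k=1)]
modules = fcluster(linkage(condensed, method="average"), t=20, criterion="maxclust")

# Hub gene of a module = gene with the highest intramodular connectivity.
module_id = 1
members = np.where(modules == module_id)[0]
connectivity = adjacency[np.ix_(members, members)].sum(axis=1)
hub_gene = members[np.argmax(connectivity)]
print(f"module {module_id}: {len(members)} genes, hub gene index {hub_gene}")
```

    In practice this kind of analysis is typically run with the WGCNA R package (including module-trait correlation against the resistance phenotype); the sketch only mirrors the main computational steps.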

    Tin Assisted Fully Exposed Platinum Clusters Stabilized on Defect-Rich Graphene for Dehydrogenation Reaction

    Tin-assisted, fully exposed Pt clusters are fabricated on a core-shell nanodiamond@graphene (ND@G) hybrid support (a-PtSn/ND@G). The atomically dispersed Pt clusters, with an average of 3 Pt atoms each, are anchored over the ND@G support with the assistance of Sn atoms as a partition agent and through Pt-C bonds between the Pt clusters and the defect-rich graphene nanoshell. The atomically dispersed Pt clusters guarantee full metal availability to the reactants, high thermal stability, and optimized adsorption/desorption behavior. This inhibits side reactions and enhances catalytic performance in the direct dehydrogenation of n-butane at a low temperature of 450 °C, leading to >98% selectivity toward olefin products; the turnover frequency (TOF) of a-PtSn/ND@G is approximately 3.9 times higher than that of the traditional Pt3Sn alloy catalyst supported on Al2O3 (Pt3Sn/Al2O3).