51 research outputs found

    A³T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

    Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation. However, all of these tasks concern speech understanding; for the inverse direction, speech synthesis, the potential of representation learning is yet to be realized because generating high-quality speech is challenging. To address this problem, we propose Alignment-Aware Acoustic-Text Pretraining (A³T), a framework that reconstructs masked acoustic signals from text input and acoustic-text alignment during training. In this way, the pretrained model can generate high-quality reconstructed spectrograms, which can be applied directly to speech editing and unseen-speaker TTS. Experiments show that A³T outperforms SOTA models on speech editing and improves multi-speaker speech synthesis without an external speaker verification model.
    Comment: under review, 12 pages, 10 figures

    Effect of asymmetric feathering angle on the aerodynamic performance of a flyable bionic flapping-wing rotor

    This work presents an experimental and numerical study of the aerodynamic behavior of a flapping-wing rotor (FWR) with different feathering amplitudes (−20°–50°, −50°–20°, and −35°–35°). For the experimental tests, an FWR weighing 18.7 g was designed. The experimental and numerical results show that, compared with the cases under a zero average stroke angle, the cases under a positive or negative average stroke angle reach a higher rotary speed for the same input voltage. Although a negative average stroke angle facilitates a higher rotary speed, the negative average stroke angle cases tend to generate the smallest lift-to-power ratio. The cases with a positive average stroke angle, on the other hand, tend to have the largest lift-to-power ratio (about 1.25 times that of the zero average stroke angle cases and about 1.6 times that of the negative average stroke angle cases). The study indicates that applying a positive average stroke angle is an effective way to further increase the aerodynamic performance of a bio-inspired FWR.

    Convolutional Neural Networks Facilitate River Barrier Detection and Evidence Severe Habitat Fragmentation in the Mekong River Biodiversity Hotspot

    Construction of river infrastructure, such as dams and weirs, is a global issue for ecosystem protection due to the fragmentation of river habitat and the hydrological alteration it causes. Accurate river barrier databases, increasingly used to determine river fragmentation for ecologically sensitive management, are challenging to generate. This is especially so in large, poorly mapped basins where only large dams tend to be recorded. The Mekong is one of the world's most biodiverse river basins but, as in many large rivers, the impacts of river infrastructure on habitat fragmentation are poorly documented. To demonstrate a solution to this, and to enable more sensitive basin management, we generated a whole-basin barrier database for the Mekong by training Convolutional Neural Network (CNN)-based object detection models, the best of which was used to identify 10,561 previously unrecorded barriers. After manual revision and merging with the existing barrier database, our new barrier database for the Mekong Basin contains 13,054 barriers. Existing databases for the Lower Mekong documented under ∼3% of the barriers recorded by the CNN combined with manual checking. The Nam Chi/Nam Mun region, eastern Thailand, is the most fragmented area within the basin, with a median [95% CI] barrier density of 15.53 [0.00–49.30] per 100 km and a Catchment Area-based Fragmentation Index value, calculated in an upstream direction, of 1,178.67 [0.00–6,418.46], due to the construction of dams and sluice gates. The CNN-based object detection framework is effective and could transform our ability to identify river barriers across many large river basins and facilitate ecologically sensitive management.

    Physics-data-driven intelligent optimization for large-scale meta-devices

    Meta-devices have gained significant attention and have been widely utilized in optical systems for focusing and imaging, owing to their light weight, high integration, and exceptional flexibility. However, because they assume the local phase approximation, traditional design methods neglect the local lattice coupling effect between adjacent meta-atoms, which harms the practical performance of meta-devices. Physics-driven or data-driven optimization algorithms can effectively address this problem. Nevertheless, these methods either involve considerable time costs or require a substantial amount of data. Here, we propose an "intelligent optimizer" based on a physics-data-driven approach that adaptively modifies the size of each meta-atom according to the sizes of its surrounding ones. This scheme mitigates the undesired local lattice coupling effect, and the proposed network model works well on thousands of datasets with a validation loss of 3×10⁻³. Experimental results show that the 1-mm-diameter metalens designed with the "intelligent optimizer" achieves a relative focusing efficiency of 93.4% (as compared to ideal focusing) and a Strehl ratio of 0.94. In contrast to the previous inverse design method, our method significantly boosts design efficiency, reducing design time by five orders of magnitude. Our design approach may set a new paradigm for devising large-scale meta-devices.
    Comment: manuscript: 19 pages, 4 figures; Supplementary Information: 11 pages, 12 figures

    Aerodynamic performance of a flyable flapping wing rotor with passive pitching angle variation

    The present work is an experimental study of the aerodynamic performance of a flapping wing rotor (FWR) and its enhancement by passive pitching angle variation (PPAV) associated with powered flapping motion. The PPAV (10°–50° in this study) was realized by a specially designed sleeve-pin unit as part of a U-shape flapping mechanism. Through experiment and analysis, it was found that the average lift produced by an FWR with PPAV was more than 100% higher than that of the baseline model, the same FWR with a constant pitching angle of 30°, under the same input power. It was also noted that the lift-voltage relationship for the FWR with PPAV was almost linear, and that its aerodynamic efficiency was also over 100% higher than that of the baseline FWR when the input voltage was under 6 V. The aerodynamic lift or efficiency of the FWR with PPAV can also be increased significantly by reducing the weight of the wings. An FWR model was fabricated and achieved vertical take-off and free flight powered by a 9 V input voltage. The PPAV mechanism provides a feasible solution for aerodynamic improvement of a bio-inspired FWR, with potential application to micro-air-vehicles (MAVs).