
    Deep Nonparametric Estimation of Intrinsic Data Structures by Chart Autoencoders: Generalization Error and Robustness

    Autoencoders have demonstrated remarkable success in learning low-dimensional latent features of high-dimensional data across various applications. Assuming that data are sampled near a low-dimensional manifold, we employ chart autoencoders, which encode data into low-dimensional latent features on a collection of charts, preserving the topology and geometry of the data manifold. Our paper establishes statistical guarantees on the generalization error of chart autoencoders, and we demonstrate their denoising capabilities by considering $n$ noisy training samples, along with their noise-free counterparts, on a $d$-dimensional manifold. By training autoencoders, we show that chart autoencoders can effectively denoise the input data with normal noise. We prove that, under proper network architectures, chart autoencoders achieve a squared generalization error in the order of $n^{-\frac{2}{d+2}}\log^4 n$, which depends on the intrinsic dimension of the manifold and only weakly depends on the ambient dimension and noise level. We further extend our theory on data with noise containing both normal and tangential components, where chart autoencoders still exhibit a denoising effect for the normal component. As a special case, our theory also applies to classical autoencoders, as long as the data manifold has a global parametrization. Our results provide a solid theoretical foundation for the effectiveness of autoencoders, which is further validated through several numerical experiments.

    VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement

    Video-to-sound generation aims to generate realistic and natural sound given a video input. However, previous video-to-sound generation methods can only generate a random or average timbre, with no control over or specialization of the generated sound timbre, so users sometimes cannot obtain the timbre they want. In this paper, we pose the task of generating sound with a specific timbre given a video input and a reference audio sample. To solve this task, we disentangle each target sound audio into three components: temporal information, acoustic information, and background information. We first use three encoders to encode these components respectively: 1) a temporal encoder to encode temporal information, which is fed with video frames since the input video shares the same temporal information as the original audio; 2) an acoustic encoder to encode timbre information, which takes the original audio as input and discards its temporal information by a temporal-corrupting operation; and 3) a background encoder to encode the residual or background sound, which uses the background part of the original audio as input. To improve the quality and temporal alignment of the generated result, we also adopt a mel discriminator and a temporal discriminator for adversarial training. Our experimental results on the VAS dataset demonstrate that our method can generate high-quality audio samples with good synchronization with events in video and high timbre similarity with the reference audio.

    Waveform-Controlled Terahertz Radiation from the Air Filament Produced by Few-Cycle Laser Pulses

    Waveform-controlled terahertz (THz) radiation is of great importance due to its potential application in THz sensing and coherent control of quantum systems. We demonstrated a novel scheme to generate waveform-controlled THz radiation from air plasma produced when carrier-envelope-phase (CEP) stabilized few-cycle laser pulses undergo filamentation in ambient air. We launched CEP-stabilized 10 fs long (~1.7 optical cycles) laser pulses at 1.8 μm into air and found that the generated THz waveform can be controlled by varying the filament length and the CEP of the driving laser pulses. Calculations using the photocurrent model and including the propagation effects reproduce the experimental results well, and the origins of various phase shifts in the filament are elucidated. Comment: 5 pages, 5 figures

    Responses of Soil Organic Carbon to Long-Term Understory Removal in Subtropical Cinnamomum camphora

    We conducted a study on a 48-year-old Cinnamomum camphora plantation in the subtropics of China, gradually removing the understory and then comparing this treatment with an undisturbed control (CK). This study analyzed the content and storage of soil organic carbon (SOC) at a soil depth of 0–60 cm. The results showed that SOC content was lower in the understory removal (UR) treatment, with decreases ranging from 5% to 34%; declines of 10.16 g·kg⁻¹ and 8.58 g·kg⁻¹ were observed in the 0–10 cm and 10–20 cm layers, respectively, with significant differences (P<0.05). Carbon storage was reduced in UR by 2% to 43%, with particularly drastic declines of 15.39 t·hm⁻² and 11.58 t·hm⁻² in the 0–10 cm (P<0.01) and 10–20 cm (P<0.01) layers, respectively. SOC content had an extremely significant (P<0.01) correlation with soil nutrients in the two stands, and the correlation coefficients of CK were higher than those of UR. Our data showed that the presence of understory favored the accumulation of soil organic carbon to a large extent. Therefore, the long-term practice of understory removal weakens the function of the forest ecosystem as a carbon sink.