AdaDiff: Adaptive Step Selection for Fast Diffusion

Authors: Jiang Yu-Gang, Shao Jie, Wu Zuxuan, Xing Zhen, Zhang Hui
Publication date: 24/11/2023

Diffusion models, a type of generative model, have achieved impressive results in generating images and videos conditioned on text. However, their generation process involves denoising for dozens of steps to produce photorealistic images or videos, which is computationally expensive. Unlike previous methods that design "one-size-fits-all" approaches for speed-up, we argue that the number of denoising steps should be sample-specific, conditioned on the richness of the input text. To this end, we introduce AdaDiff, a lightweight framework designed to learn instance-specific step usage policies, which are then used by the diffusion model for generation. AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function that balances inference time and generation quality. We conduct experiments on three image generation and two video generation benchmarks and demonstrate that our approach achieves visual quality similar to a baseline using a fixed 50 denoising steps while reducing inference time by at least 33% and by as much as 40%. Furthermore, our qualitative analysis shows that our method allocates more steps to more informative text conditions and fewer steps to simpler ones.

Comment: 10 pages, 5 figures
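The abstract describes a policy, trained by policy gradient, that picks a per-sample number of denoising steps under a reward trading quality against inference time. The following is a minimal sketch of that general idea using plain REINFORCE, not the authors' implementation: the candidate step counts, the scalar "richness" feature, the `toy_quality` function, the `time_weight` penalty, and the linear policy are all assumptions made for illustration, since the abstract does not specify the reward or the policy network.

```python
import numpy as np

STEP_CHOICES = np.array([10, 20, 30, 40, 50])  # hypothetical candidate step counts
MAX_STEPS = 50  # the fixed-step baseline mentioned in the abstract


def softmax(logits):
    shifted = logits - logits.max()  # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def toy_quality(steps, richness):
    # Hypothetical stand-in for a quality score: richer prompts need more steps.
    steps_needed = 10 + 40 * richness
    return min(1.0, steps / steps_needed)


def reward(steps, richness, time_weight=0.5):
    # Hypothetical reward: quality minus a penalty proportional to steps used.
    return toy_quality(steps, richness) - time_weight * steps / MAX_STEPS


def train_policy(num_iters=2000, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Linear softmax policy: logits = weights @ [richness, 1].
    weights = np.zeros((len(STEP_CHOICES), 2))
    baseline = 0.0  # running mean of rewards, used to reduce variance
    for _ in range(num_iters):
        richness = rng.uniform()  # toy prompt feature in [0, 1)
        features = np.array([richness, 1.0])
        probs = softmax(weights @ features)
        action = rng.choice(len(STEP_CHOICES), p=probs)
        r = reward(STEP_CHOICES[action], richness)
        # REINFORCE: grad of log pi(action) w.r.t. weights is
        # outer(onehot(action) - probs, features).
        grad_log_prob = -np.outer(probs, features)
        grad_log_prob[action] += features
        weights += lr * (r - baseline) * grad_log_prob
        baseline += 0.05 * (r - baseline)
    return weights


def choose_steps(weights, richness):
    # Greedy step selection at inference time.
    logits = weights @ np.array([richness, 1.0])
    return int(STEP_CHOICES[np.argmax(logits)])
```

Under this toy reward, a simple prompt (richness 0.05) scores higher with 20 steps than with 50, while a rich prompt (richness 0.95) scores highest with 50, which is the kind of trade-off such a policy would be trained to exploit.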

arXiv.org e-Print Archive

Is Floppy Eyelid Syndrome More Prevalent in Obstructive Sleep Apnea Syndrome Patients?

Authors: Chang-Jiang Liu, Dao-Jiang Yu, Gang Feng, Hui Li, Ping Wang, Tian-Lan Zhao, Zhen-Hai Long
Publication venue: Hindawi Limited
Publication date: 01/01/2016

Conflicting findings have been reported about the relationship between floppy eyelid syndrome (FES) and obstructive sleep apnea syndrome (OSAS). The main goal of this study was to evaluate whether FES is more prevalent in OSAS patients by performing a meta-analysis. A comprehensive literature search of the PubMed, Embase, and Cochrane databases was performed. Only studies related to the prevalence of FES in OSAS were included in the meta-analysis. We estimated a pooled odds ratio (OR) for the prevalence of FES in OSAS. In total, 6 studies with 767 participants met the inclusion criteria. Using a fixed-effects model, the pooled OR was 4.12. The test for the overall effect revealed that FES was significantly more prevalent in OSAS patients than in non-OSAS subjects (Z = 4.98, p < 0.00001). In the subgroup analysis by OSAS severity, the prevalence of FES increased with the severity of OSAS, as indicated by increasing OR values (OR = 2.56, 4.62, and 7.64 for mild, moderate, and severe OSAS). In conclusion, the results indicate that FES is more prevalent in OSAS patients. However, this result was based only on unadjusted estimates. Prospective cohort studies are needed to determine whether OSAS is an independent risk factor for FES.
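To make the pooled odds ratio and overall-effect test concrete, here is a minimal sketch of one common fixed-effect approach, inverse-variance weighting of log odds ratios. The abstract does not say which fixed-effect estimator was used (Mantel-Haenszel is another common choice), so this is an assumption, and the function names and any 2x2 counts passed in are hypothetical rather than data from the study.

```python
import math


def pooled_odds_ratio(tables):
    """Fixed-effect (inverse-variance) pooled odds ratio.

    Each table is (a, b, c, d):
      a = OSAS patients with FES,     b = OSAS patients without FES,
      c = non-OSAS subjects with FES, d = non-OSAS subjects without FES.
    Returns (pooled OR, Z statistic for the overall effect).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # variance of the log OR
        weight = 1 / variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    standard_error = math.sqrt(1 / weight_total)
    return math.exp(pooled_log_or), pooled_log_or / standard_error


def two_sided_p(z):
    # Two-sided p-value for a standard normal Z statistic.
    return math.erfc(abs(z) / math.sqrt(2))
```

With this helper, `two_sided_p(4.98)` falls below 0.00001, in line with the p-value reported for Z = 4.98.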

Crossref

Directory of Open Access Journals

PubMed Central

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding

Authors: Ai Yang, Du Hui-Peng, Jiang Xiao-Hang, Ling Zhen-Hua, Lu Ye-Xin
Publication date: 16/02/2024

This paper introduces APCodec, a novel neural audio codec targeting high waveform sampling rates and low bitrates, which integrates the strengths of parametric codecs and waveform codecs. Like parametric codecs, APCodec treats the amplitude and phase spectra as parametric audio characteristics and handles them concurrently during encoding and decoding. It is composed of an encoder and a decoder with a modified ConvNeXt v2 network as the backbone, connected by a quantizer based on the residual vector quantization (RVQ) mechanism. The encoder compresses the audio amplitude and phase spectra in parallel, merging them into a continuous latent code at a reduced temporal resolution. This code is subsequently quantized by the quantizer. Finally, the decoder reconstructs the audio amplitude and phase spectra in parallel, and the decoded waveform is obtained by the inverse short-time Fourier transform. To ensure the fidelity of the decoded audio, as waveform codecs do, spectral-level loss, quantization loss, and generative adversarial network (GAN) based loss are jointly employed for training APCodec. To support low-latency streamable inference, we employ feed-forward layers and causal convolutional layers in APCodec, incorporating a knowledge distillation training strategy to enhance the quality of the decoded audio. Experimental results confirm that our proposed APCodec can encode 48 kHz audio at a bitrate of just 6 kbps, with no significant degradation in the quality of the decoded audio. At the same bitrate, our proposed APCodec also demonstrates superior decoded audio quality and faster generation speed compared to well-known codecs such as SoundStream, Encodec, HiFi-Codec, and AudioDec.

Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
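The residual vector quantization step named in the abstract can be illustrated with a small sketch: each stage picks the nearest codeword to what the previous stages left unexplained, and the decoder sums the chosen codewords. This shows generic RVQ only, not APCodec's trained quantizer; the function names are hypothetical, and the frame rate, number of quantizers, and codebook size in the bitrate helper are illustrative values not taken from the abstract.

```python
import math

import numpy as np


def rvq_encode(latent, codebooks):
    # Quantize one latent vector stage by stage; each stage encodes the residual.
    residual = np.asarray(latent, dtype=float).copy()
    indices = []
    for codebook in codebooks:
        distances = np.sum((codebook - residual) ** 2, axis=1)
        index = int(np.argmin(distances))  # nearest codeword for this stage
        indices.append(index)
        residual = residual - codebook[index]
    return indices


def rvq_decode(indices, codebooks):
    # The reconstruction is the sum of the selected codewords across stages.
    return sum(codebook[i] for codebook, i in zip(codebooks, indices))


def bitrate_bps(frames_per_second, num_quantizers, codebook_size):
    # Each quantizer emits log2(codebook_size) bits per latent frame.
    return frames_per_second * num_quantizers * math.log2(codebook_size)
```

As an example of the arithmetic, a hypothetical configuration of 150 latent frames per second, 4 quantizers, and 1024-entry codebooks gives `bitrate_bps(150, 4, 1024)`, i.e. 6000 bits per second.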

arXiv.org e-Print Archive

VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models

Authors: Dai Qi, Hu Han, Jiang Yu-Gang, Wu Zuxuan, Xing Zhen, Zhang Hui, Zhang Zihao
Publication date: 30/11/2023

Diffusion models have achieved significant success in image and video generation. This motivates a growing interest in video editing tasks, where videos are edited according to provided text descriptions. However, most existing approaches only focus on video editing for short clips and rely on time-consuming tuning or inference. We are the first to propose Video Instruction Diffusion (VIDiff), a unified foundation model designed for a wide range of video tasks. These tasks encompass both understanding tasks (such as language-guided video object segmentation) and generative tasks (video editing and enhancement). Our model can edit and translate the desired results within seconds based on user instructions. Moreover, we design an iterative auto-regressive method to ensure consistency in editing and enhancing long videos. We provide convincing generative results for diverse input videos and written instructions, both qualitatively and quantitatively. More examples can be found at our website https://ChenHsing.github.io/VIDiff

arXiv.org e-Print Archive

Methylation status of DDIT3 gene in Chronic Myeloid Leukemia

Authors: Li Jian-yong, Lin Jiang, Qian Jun, Qian Zhen, Wang Ya-li, Yao Dong-ming, Zhu Zhao-hui
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: DNA-damage-inducible transcript 3 (DDIT3), a candidate tumor suppressor gene (TSG), has been found to be involved in the regulation of cellular growth and differentiation. Epigenetic changes in TSGs have recently been recognized as an abnormal mechanism contributing to the development of chronic myeloid leukemia (CML). The aim of this study was to investigate the methylation status of the DDIT3 gene in CML patients. Methods: The methylation status of the DDIT3 promoter was detected in bone marrow mononuclear cells from 53 patients with CML using methylation-specific PCR (MSP). The expression levels of DDIT3 and bcr/abl transcripts were determined by real-time quantitative PCR (RQ-PCR). Clinical data of these patients were collected and analyzed. Results: Aberrant methylation of the DDIT3 gene promoter was found in 35 of 53 (66%) CML cases. No correlation was found between DDIT3 promoter hypermethylation and the age, sex, hemoglobin concentration, platelet counts, chromosomal abnormalities, bcr/abl transcript, or staging of CML patients (P > 0.05), but a correlation was found between DDIT3 promoter hypermethylation and the WBC counts of CML cases (R = 0.781, P < 0.001). The level of DDIT3 transcript in CML patients was significantly lower than that in controls (median 3.28 vs 19.69, P < 0.001); however, there was no difference in the level of DDIT3 transcript between methylation-positive CML cases (0.05-65.32, median 2.13) and methylation-negative CML cases (0.12-126.04, median 3.92) (P > 0.05). Conclusion: Our results demonstrate that aberrant methylation of DDIT3 occurs frequently in CML.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central