177 research outputs found

    Z*: Zero-shot Style Transfer via Attention Rearrangement

    Despite the remarkable progress in image style transfer, formulating style in the context of art is inherently subjective and challenging. In contrast to existing learning/tuning methods, this study shows that vanilla diffusion models can directly extract style information and seamlessly integrate the generative prior into the content image without retraining. Specifically, we adopt dual denoising paths to represent content/style references in latent space and then guide the content image denoising process with style latent codes. We further reveal that the cross-attention mechanism in latent diffusion models tends to blend the content and style images, resulting in stylized outputs that deviate from the original content image. To overcome this limitation, we introduce a cross-attention rearrangement strategy. Through theoretical analysis and experiments, we demonstrate the effectiveness and superiority of the diffusion-based Zero-shot Style Transfer via Attention Rearrangement, Z-STAR.

    Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models

    Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts. Instead of designing prompts manually, Context Optimization (CoOp) has been recently proposed to learn continuous prompts using task-specific training data. Despite the performance improvements on downstream tasks, several studies have reported that CoOp suffers from overfitting in two respects: (i) the test accuracy on base classes first improves and then worsens during training; (ii) the test accuracy on novel classes keeps decreasing. However, none of the existing studies explains or mitigates these overfitting problems. In this study, we first explore the cause of overfitting by analyzing the gradient flow. Comparative experiments reveal that CoOp favors generalizable and spurious features in the early and later training stages, respectively, leading to the non-overfitting and overfitting phenomena. Given those observations, we propose Subspace Prompt Tuning (SubPT) to project the gradients in back-propagation onto the low-rank subspace spanned by the early-stage gradient flow eigenvectors during the entire training process and successfully eliminate the overfitting problem. In addition, we equip CoOp with a Novel Feature Learner (NFL) to enhance the generalization ability of the learned prompts onto novel categories beyond the training set, without requiring image training data. Extensive experiments on 11 classification datasets demonstrate that SubPT+NFL consistently boosts the performance of CoOp and outperforms the state-of-the-art CoCoOp approach. Experiments on more challenging vision downstream tasks, including open-vocabulary object detection and zero-shot semantic segmentation, also verify the effectiveness of the proposed method. Code can be found at https://tinyurl.com/mpe64f89

    Arbitrary Video Style Transfer via Multi-Channel Correlation

    Video style transfer is getting more attention in the AI community for its numerous applications, such as augmented reality and animation production. Compared with traditional image style transfer, performing this task on video presents new challenges: how to effectively generate satisfactory stylized results for any specified style while maintaining temporal coherence across frames. To this end, we propose the Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos. Specifically, MCCNet works directly on the feature space of the style and content domains, where it learns to rearrange and fuse style features based on their similarity with content features. The outputs generated by MCC are features containing the desired style patterns, which can further be decoded into images with vivid style textures. Moreover, MCCNet is also designed to explicitly align the features to the input, which ensures that the output maintains the content structures as well as the temporal continuity. To further improve the performance of MCCNet under complex lighting conditions, we also introduce an illumination loss during training. Qualitative and quantitative evaluations demonstrate that MCCNet performs well in both arbitrary video and image style transfer tasks.

    Application of diffusion tensor imaging in the diagnosis of post-stroke aphasia: a meta-analysis and systematic review

    Introduction: Diffusion Tensor Imaging (DTI) indicators of different white matter (WM) fibers and brain region lesions for post-stroke aphasia (PSA) are inconsistent in existing studies. Our study examines the consistency and differences between PSA tests performed with DTI. In addition, obtaining consistent and independent conclusions between studies was made possible by utilizing DTI in PSA assessment. Methods: In order to gather relevant studies using DTI for diagnosing PSA, we searched the Web of Science, PubMed, Embase, and CNKI databases. Based on the screening and evaluation of the included studies, the meta-analysis was used to conduct a quantitative analysis. Narrative descriptions were provided for studies that met the inclusion criteria but lacked data. Results: First, we reported on the left hemisphere. The meta-analysis showed that fractional anisotropy (FA) of the arcuate fasciculus (AF) and superior longitudinal fasciculus (SLF), inferior frontal-occipital fasciculus (IFOF), inferior longitudinal fasciculus (ILF), and uncinate fasciculus (UF) were decreased in the PSA group in comparison with the healthy controls (p < 0.00001). However, in the comparison of axial diffusivity (AD), there was no statistically significant difference in white matter fiber tracts in the dual-stream language model of the PSA group. Elevated radial diffusivity (RD) was seen only in the IFOF and ILF (p_IFOF = 0.01; p_ILF = 0.05). In the classic Broca’s area, the FA of the PSA group was decreased (p < 0.00001) while the apparent diffusion coefficient was elevated (p = 0.03). Secondly, we evaluated the white matter fiber tracts in the dual-stream language model of the right hemisphere. The FA of the PSA group was decreased only in the IFOF (p = 0.001). AD was elevated in the AF and UF (p_AF < 0.00001; p_UF = 0.009). RD was elevated in the AF and UF (p_AF = 0.01; p_UF = 0.003). The other fiber tracts did not undergo similar alterations. Conclusion: In conclusion, DTI is vital for diagnosing PSA because it detects WM changes effectively, but it still has some limitations. Due to a lack of relevant language scales and clinical manifestations, diagnosing and differentiating PSA independently remain challenging. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=365897

    Acupuncture for insomnia symptoms in hypertensive patients: a systematic review and meta-analysis

    Purpose: In the realm of pain management, traditional Chinese medicine, specifically acupuncture, has garnered increasing attention. This meta-analysis pioneers the evaluation of acupuncture’s effectiveness in treating insomnia among hypertensive patients. Methods: We conducted a comprehensive search across several databases: PubMed, Web of Science, Cochrane Library, WANFANG, China National Knowledge Infrastructure (CNKI), Sinomed, and the Chinese Journal of Science and Technology (VIP). Additionally, forward and backward articles of studies published from the inception of these databases until 10 September 2023 were reviewed. This systematic review and meta-analysis included all randomized controlled trials (RCTs) focusing on acupuncture for insomnia in hypertensive patients, without imposing language or date restrictions. We rigorously assessed all outcome measures reported in these trials. The evidence was synthesized by calculating the difference between mean differences (MD) in symptom change. The quality of the evidence was determined using the Cochrane Risk of Bias tool. This study is registered with PROSPERO under number CRD42023461760. Results: Our analysis included 16 RCTs, comprising 1,309 patients. The findings revealed that acupuncture was significantly more effective than the control group in reducing insomnia symptoms, as indicated by a greater decrease in the PSQI score (MD = −3.1, 95% CI [−3.77 to −2.62], p < 0.00001). Additionally, improvements in both systolic and diastolic blood pressure were more pronounced in the acupuncture group compared to the control group (SBP: MD = −10.31, 95% CI [−16.98 to −3.64], p = 0.002; DBP: MD = −5.71, 95% CI [−8.19 to −3.23], p < 0.00001). These results suggest that acupuncture not only improves sleep quality but also lowers blood pressure in patients suffering from hypertension and insomnia. Further research is warranted to elucidate optimal acupuncture points and the duration of treatment for maximized therapeutic effect. Systematic review registration: https://www.crd.york.ac.uk/prospero, CRD42023461760

    Carbon in Chinese grasslands: meta-analysis and theory of grazing effects

    María de Maeztu Unit of Excellence CEX2019-000940-M. Globally, livestock grazing is an important management factor influencing soil degradation, soil health and carbon (C) stocks of grassland ecosystems. However, the effects of grassland types, grazing intensity and grazing duration on C stocks are unclear across large geographic scales. To provide a more comprehensive assessment of how grazing drives ecosystem C stocks in grasslands, we compiled and analyzed data from 306 studies featuring four grassland types across China: desert steppes, typical steppes, meadow steppes and alpine steppes. Light grazing was the best management practice for desert steppes (< 2 sheep ha⁻¹) and typical steppes (3 to 4 sheep ha⁻¹), whereas medium grazing pressure was optimal for meadow steppes (5 to 6 sheep ha⁻¹) and alpine steppes (7 to 8 sheep ha⁻¹), leading to the highest ecosystem C stocks under grazing. Plant biomass (desert steppes) and soil C stocks (meadow steppes) increased under light or medium grazing, confirming the 'intermediate disturbance hypothesis'. Heavy grazing decreased all C stocks regardless of grassland ecosystem types, approximately 1.4 Mg ha⁻¹ per year for the whole ecosystem. The regrowth and regeneration of grasslands in response to grazing intensity (i.e., grazing optimization) depended on grassland types and grazing duration. In conclusion, grassland grazing is a double-edged sword. On the one hand, proper management (light or medium grazing) can maintain and even increase C stocks above- and belowground, and increase the harvested livestock products from grasslands. On the other hand, human-induced overgrazing can lead to rapid degradation of vegetation and soils, resulting in significant carbon loss and requiring long-term recovery. Grazing regimes (i.e., intensity and duration applied) must consider specific grassland characteristics to ensure stable productivity rates and optimal impacts on ecosystem C stocks.