Z-STAR: Zero-shot Style Transfer via Attention Rearrangement
Despite the remarkable progress in image style transfer, formulating style in
the context of art is inherently subjective and challenging. In contrast to
existing learning/tuning methods, this study shows that vanilla diffusion
models can directly extract style information and seamlessly integrate the
generative prior into the content image without retraining. Specifically, we
adopt dual denoising paths to represent content/style references in latent
space and then guide the content image denoising process with style latent
codes. We further reveal that the cross-attention mechanism in latent diffusion
models tends to blend the content and style images, resulting in stylized
outputs that deviate from the original content image. To overcome this
limitation, we introduce a cross-attention rearrangement strategy. Through
theoretical analysis and experiments, we demonstrate the effectiveness and
superiority of the diffusion-based Zero-shot Style Transfer via Attention
Rearrangement, Z-STAR.
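The abstract does not spell out the rearrangement rule, so the following is only a rough sketch of one possible reading: instead of letting content queries attend to style keys alone (which can pull the output away from the content image), the softmax is taken jointly over style and content keys. The function names, the style_scale parameter, and the joint-softmax formulation are all assumptions made for illustration, not the authors' confirmed method.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def naive_cross_attention(q_c, k_s, v_s):
    # Content queries attend only to style keys/values.
    d = q_c.shape[-1]
    return softmax(q_c @ k_s.T / np.sqrt(d)) @ v_s

def rearranged_attention(q_c, k_c, v_c, k_s, v_s, style_scale=1.0):
    # Hypothetical rearrangement: one softmax over [style keys, content keys],
    # so each content query keeps some weight on its own content features.
    d = q_c.shape[-1]
    logits_s = style_scale * (q_c @ k_s.T) / np.sqrt(d)
    logits_c = (q_c @ k_c.T) / np.sqrt(d)
    weights = softmax(np.concatenate([logits_s, logits_c], axis=-1))
    return weights @ np.concatenate([v_s, v_c], axis=0)
```

In this sketch, style_scale (a hypothetical knob) shifts attention mass toward or away from the style tokens; a large negative or very small value would leave the content features nearly unchanged.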
Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models
Pretrained vision-language models (VLMs) such as CLIP have shown impressive
generalization capability in downstream vision tasks with appropriate text
prompts. Instead of designing prompts manually, Context Optimization (CoOp) has
been recently proposed to learn continuous prompts using task-specific training
data. Despite the performance improvements on downstream tasks, several studies
have reported that CoOp suffers from the overfitting issue in two aspects: (i)
the test accuracy on base classes first improves and then worsens during
training; (ii) the test accuracy on novel classes keeps decreasing. However,
none of the existing studies adequately explains or mitigates such overfitting
problems. In this study, we first explore the cause of overfitting by analyzing
the gradient flow. Comparative experiments reveal that CoOp favors
generalizable and spurious features in the early and later training stages,
respectively, leading to the non-overfitting and overfitting phenomena. Given
those observations, we propose Subspace Prompt Tuning (SubPT) to project the
gradients in back-propagation onto the low-rank subspace spanned by the
early-stage gradient flow eigenvectors during the entire training process and
successfully eliminate the overfitting problem. In addition, we equip CoOp with
a Novel Feature Learner (NFL) to enhance the generalization ability of the
learned prompts to novel categories beyond the training set, without requiring
image training data. Extensive experiments on 11 classification datasets
demonstrate that SubPT+NFL consistently boosts the performance of CoOp and
outperforms the state-of-the-art CoCoOp approach. Experiments on more
challenging vision downstream tasks, including open-vocabulary object detection
and zero-shot semantic segmentation, also verify the effectiveness of the
proposed method. Code can be found at https://tinyurl.com/mpe64f89
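The gradient projection described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming the early-stage gradients are stacked as rows of a matrix and that the subspace is taken from its top right singular vectors (which are the eigenvectors of the gradient Gram matrix). The function names, the rank parameter, and the use of SVD are assumptions for illustration, not the released SubPT code.

```python
import numpy as np

def early_gradient_subspace(early_grads, rank):
    """Return an orthonormal basis (D x rank) for the dominant directions
    of the early-stage gradients.

    early_grads: array of shape (T, D), one flattened prompt gradient per row.
    """
    # Right singular vectors of G are the eigenvectors of G^T G.
    _, _, vt = np.linalg.svd(early_grads, full_matrices=False)
    return vt[:rank].T

def project_gradient(grad, basis):
    # Orthogonal projection of a gradient onto the span of the basis columns.
    return basis @ (basis.T @ grad)
```

In a training loop one would, under these assumptions, build the basis once from the first few iterations and then apply project_gradient to every later gradient before the optimizer step.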
Arbitrary Video Style Transfer via Multi-Channel Correlation
Video style transfer is getting more attention in the AI community for its
numerous applications such as augmented reality and animation production.
Compared with traditional image style transfer, performing this task on video
presents new challenges: how to effectively generate satisfactory stylized
results for any specified style, and maintain temporal coherence across frames
at the same time. Towards this end, we propose the Multi-Channel Correlation network
(MCCNet), which can be trained to fuse the exemplar style features and input
content features for efficient style transfer while naturally maintaining the
coherence of input videos. Specifically, MCCNet works directly on the feature
space of style and content domain where it learns to rearrange and fuse style
features based on their similarity with content features. The outputs generated
by MCC are features containing the desired style patterns which can further be
decoded into images with vivid style textures. Moreover, MCCNet is also
designed to explicitly align the features to the input, which ensures that the
output maintains the content structures as well as the temporal continuity. To further
improve the performance of MCCNet under complex light conditions, we also
introduce an illumination loss during training. Qualitative and quantitative
evaluations demonstrate that MCCNet performs well in both arbitrary video and
image style transfer tasks.
Application of diffusion tensor imaging in the diagnosis of post-stroke aphasia: a meta-analysis and systematic review
Introduction: Diffusion Tensor Imaging (DTI) indicators of different white matter (WM) fibers and brain region lesions for post-stroke aphasia (PSA) are inconsistent in existing studies. Our study examines the consistency and differences between PSA tests performed with DTI. In addition, obtaining consistent and independent conclusions between studies was made possible by utilizing DTI in PSA assessment.
Methods: In order to gather relevant studies using DTI for diagnosing PSA, we searched the Web of Science, PubMed, Embase, and CNKI databases. Based on the screening and evaluation of the included studies, the meta-analysis was used to conduct a quantitative analysis. Narrative descriptions were provided for studies that met the inclusion criteria but lacked data.
Results: First, we reported on the left hemisphere. The meta-analysis showed that fractional anisotropy (FA) of the arcuate fasciculus (AF) and superior longitudinal fasciculus (SLF), inferior frontal-occipital fasciculus (IFOF), inferior longitudinal fasciculus (ILF), and uncinate fasciculus (UF) were decreased in the PSA group in comparison with the healthy controls (p < 0.00001). However, in the comparison of axial diffusivity (AD), there was no statistically significant difference in white matter fiber tracts in the dual-stream language model of the PSA group. Elevated radial diffusivity (RD) was seen only in the IFOF and ILF (P_IFOF = 0.01; P_ILF = 0.05). In the classic Broca’s area, the FA of the PSA group was decreased (p < 0.00001) while the apparent diffusion coefficient was elevated (p = 0.03). Secondly, we evaluated the white matter fiber tracts in the dual-stream language model of the right hemisphere. The FA of the PSA group was decreased only in the IFOF (p = 0.001). AD was elevated in the AF and UF (P_AF < 0.00001; P_UF = 0.009). RD was elevated in the AF and UF (P_AF = 0.01; P_UF = 0.003). The other fiber tracts did not undergo similar alterations.
Conclusion: In conclusion, DTI is vital for diagnosing PSA because it detects WM changes effectively, but it still has some limitations. Due to a lack of relevant language scales and clinical manifestations, diagnosing and differentiating PSA independently remain challenging.
Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=365897
Acupuncture for insomnia symptoms in hypertensive patients: a systematic review and meta-analysis
Purpose: In the realm of pain management, traditional Chinese medicine, specifically acupuncture, has garnered increasing attention. This meta-analysis pioneers the evaluation of acupuncture’s effectiveness in treating insomnia among hypertensive patients.
Methods: We conducted a comprehensive search across several databases: PubMed, Web of Science, Cochrane Library, WANFANG, China National Knowledge Infrastructure (CNKI), Sinomed, and the Chinese Journal of Science and Technology (VIP). Additionally, forward and backward articles of studies published from the inception of these databases until 10 September 2023 were reviewed. This systematic review and meta-analysis included all randomized controlled trials (RCTs) focusing on acupuncture for insomnia in hypertensive patients, without imposing language or date restrictions. We rigorously assessed all outcome measures reported in these trials. The evidence was synthesized by calculating the difference between mean differences (MD) in symptom change. The quality of the evidence was determined using the Cochrane Risk of Bias tool. This study is registered with PROSPERO under number CRD42023461760.
Results: Our analysis included 16 RCTs, comprising 1,309 patients. The findings revealed that acupuncture was significantly more effective than the control group in reducing insomnia symptoms, as indicated by a greater decrease in the PSQI score (MD = −3.1, 95% CI [−3.77 to −2.62], p < 0.00001). Additionally, improvements in both systolic and diastolic blood pressure were more pronounced in the acupuncture group compared to the control group (SBP: MD = −10.31, 95% CI [−16.98 to −3.64], p = 0.002; DBP: MD = −5.71, 95% CI [−8.19 to −3.23], p < 0.00001). These results suggest that acupuncture not only improves sleep quality but also lowers blood pressure in patients suffering from hypertension and insomnia. Further research is warranted to elucidate optimal acupuncture points and the duration of treatment for maximized therapeutic effect.
Systematic review registration: https://www.crd.york.ac.uk/prospero, CRD42023461760
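As a rough illustration of how per-trial mean differences can be combined into a pooled MD with a 95% confidence interval, the sketch below uses fixed-effect inverse-variance weighting. The abstract does not state which pooling model the authors used, so the fixed-effect choice, the function name, and the 1.96 normal quantile are assumptions; any inputs passed to it here are placeholders, not data from the review.

```python
import math

def pool_mean_differences(mds, ses):
    """Fixed-effect inverse-variance pooling of mean differences.

    mds: per-trial mean differences
    ses: per-trial standard errors of those mean differences
    Returns (pooled_md, (ci_low, ci_high)) with a 95% normal-approximation CI.
    """
    # Each trial is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * md for w, md in zip(weights, mds)) / total
    se_pooled = math.sqrt(1.0 / total)
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
```

A random-effects model would additionally estimate between-trial variance and add it to each trial's variance before weighting; that step is omitted here for brevity.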
Carbon in Chinese grasslands: meta-analysis and theory of grazing effects
Unidad de excelencia María de Maeztu CEX2019-000940-M
Globally, livestock grazing is an important management factor influencing soil degradation, soil health and carbon (C) stocks of grassland ecosystems. However, the effects of grassland types, grazing intensity and grazing duration on C stocks are unclear across large geographic scales. To provide a more comprehensive assessment of how grazing drives ecosystem C stocks in grasslands, we compiled and analyzed data from 306 studies featuring four grassland types across China: desert steppes, typical steppes, meadow steppes and alpine steppes. Light grazing was the best management practice for desert steppes (< 2 sheep ha⁻¹) and typical steppes (3 to 4 sheep ha⁻¹), whereas medium grazing pressure was optimal for meadow steppes (5 to 6 sheep ha⁻¹) and alpine steppes (7 to 8 sheep ha⁻¹), leading to the highest ecosystem C stocks under grazing. Plant biomass (desert steppes) and soil C stocks (meadow steppes) increased under light or medium grazing, confirming the 'intermediate disturbance hypothesis'. Heavy grazing decreased all C stocks regardless of grassland ecosystem types, approximately 1.4 Mg ha⁻¹ per year for the whole ecosystem. The regrowth and regeneration of grasslands in response to grazing intensity (i.e., grazing optimization) depended on grassland types and grazing duration. In conclusion, grassland grazing is a double-edged sword. On the one hand, proper management (light or medium grazing) can maintain and even increase C stocks above- and belowground, and increase the harvested livestock products from grasslands. On the other hand, human-induced overgrazing can lead to rapid degradation of vegetation and soils, resulting in significant carbon loss and requiring long-term recovery. Grazing regimes (i.e., intensity and duration applied) must consider specific grassland characteristics to ensure stable productivity rates and optimal impacts on ecosystem C stocks.