Text-Only Image Captioning with Multi-Context Data Generation
Text-only Image Captioning (TIC) aims to build a model that accurately
describes images while being trained solely on text. Recently,
diffusion models have demonstrated remarkable capabilities in generating
high-quality images that are semantically coherent with given texts. This
presents an opportunity to generate synthetic training images for TIC. However,
we identify a challenge: images generated from simple descriptions
typically exhibit a single perspective with one or a few contexts,
which does not match the complexity of real-world scenes in the
image domain. In this paper, we propose a novel framework that addresses this
issue by introducing multi-context data generation. Starting with an initial
text corpus, our framework employs a large language model to select multiple
sentences that describe the same scene from various perspectives. These
sentences are then summarized into a single sentence with multiple contexts. We
generate simple images using the straightforward sentences and complex images
using the summarized sentences through diffusion models. Finally, we train the
model exclusively using the synthetic image-text pairs obtained from this
process. Experimental results demonstrate that our proposed framework
effectively tackles the central challenge we have identified, achieving
state-of-the-art performance on popular datasets such as MSCOCO, Flickr30k, and
SS1M.
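The data-generation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `build_training_pairs` and the three callables `select_related`, `summarize`, and `generate_image` are hypothetical names standing in for the large language model and the diffusion model, and pairing each image with the sentence that produced it is an assumption the abstract does not confirm.

```python
from typing import Callable, List, Tuple


def build_training_pairs(
    corpus: List[str],
    select_related: Callable[[str, List[str]], List[str]],  # hypothetical LLM selection step
    summarize: Callable[[List[str]], str],                   # hypothetical LLM summarization step
    generate_image: Callable[[str], object],                 # hypothetical text-to-image step
) -> List[Tuple[object, str]]:
    """Return synthetic (image, text) pairs built from a text-only corpus."""
    pairs: List[Tuple[object, str]] = []
    seen_summaries = set()
    for sentence in corpus:
        # Simple image: generated directly from the straightforward sentence.
        pairs.append((generate_image(sentence), sentence))

        # Multi-context image: group sentences about the same scene,
        # merge them into one summary sentence, and generate from that.
        related = select_related(sentence, corpus)
        if len(related) > 1:
            summary = summarize(related)
            if summary not in seen_summaries:  # avoid duplicate complex pairs
                seen_summaries.add(summary)
                pairs.append((generate_image(summary), summary))
    return pairs
```

In practice the three callables would wrap whichever language model and diffusion model are available; passing them in as arguments keeps the sketch independent of any particular library.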
CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation
To improve instance-level detection/segmentation performance, existing
self-supervised and semi-supervised methods extract either task-unrelated or
task-specific training signals from unlabeled data. We show that these two
approaches, at the two extreme ends of the task-specificity spectrum, are
suboptimal for task performance. Using too few task-specific
training signals causes underfitting to the ground-truth labels of downstream
tasks, while using too many causes overfitting to the ground-truth labels. To
this end, we propose a novel Class-Agnostic Semi-Supervised Learning (CA-SSL)
framework to achieve a more favorable task-specificity balance in extracting
training signals from unlabeled data. CA-SSL has three training stages that act
on either ground-truth labels (labeled data) or pseudo labels (unlabeled data).
This decoupling strategy avoids the complicated scheme in traditional SSL
methods that balances the contributions from both data types. In particular, we
introduce a warmup training stage to achieve a better balance in task
specificity by ignoring class information in the pseudo labels, while
preserving localization training signals. As a result, our warmup model can
better avoid underfitting/overfitting when fine-tuned on the ground-truth
labels in detection and segmentation tasks. Using 3.6M unlabeled data, we
achieve a significant performance gain of 4.7% over the ImageNet-pretrained
baseline on FCOS object detection. In addition, our warmup model demonstrates
excellent transferability to other detection and segmentation frameworks. Comment: Appeared in ECCV202
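The idea of ignoring class information in pseudo labels while keeping their localization can be illustrated with a small sketch. It rests on assumptions not stated in the abstract: the `PseudoLabel` structure, the `FOREGROUND` class id, and the `min_score` confidence filter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

FOREGROUND = 0  # assumed id for the single class-agnostic "object" label


@dataclass(frozen=True)
class PseudoLabel:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    class_id: int
    score: float


def to_class_agnostic(
    labels: List[PseudoLabel], min_score: float = 0.5
) -> List[PseudoLabel]:
    """Drop class information but keep boxes (the localization signal).

    The score threshold is an assumed filtering step, not taken from the paper.
    """
    return [
        PseudoLabel(box=label.box, class_id=FOREGROUND, score=label.score)
        for label in labels
        if label.score >= min_score
    ]
```

A model warmed up on such labels learns where objects are without committing to class predictions; the class information is then supplied by fine-tuning on the ground-truth labels.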
Spatial-Semantic Collaborative Cropping for User Generated Content
A large amount of User Generated Content (UGC) is uploaded to the Internet daily and displayed to people worldwide through the client side (mobile and PC). This requires cropping algorithms to produce aesthetic thumbnails within a specific aspect ratio on different devices. However, existing image cropping work mainly focuses on landmark or landscape images and fails to model the relations among multiple objects against the complex backgrounds found in UGC. Besides, previous methods consider only the aesthetics of the cropped images while ignoring content integrity, which is crucial for UGC cropping. In this paper, we propose a Spatial-Semantic Collaborative cropping network (S2CNet) for arbitrary user generated content, accompanied by a new cropping benchmark. Specifically, we first mine the visual genes of the potential objects. Then, the suggested adaptive attention graph recasts this task as a procedure of information association over visual nodes. The underlying spatial and semantic relations are ultimately centralized to the crop candidate through differentiable message passing, which helps our network efficiently preserve both the aesthetics and the content integrity. Extensive experiments on the proposed UGCrop5K and other public datasets demonstrate the superiority of our approach over state-of-the-art counterparts.
Chronic E-Cigarette Use Impairs Endothelial Function on the Physiological and Cellular Levels.
BACKGROUND: The harmful vascular effects of smoking are well established, but the effects of chronic use of electronic cigarettes (e-cigarettes) on endothelial function are less understood. We hypothesized that e-cigarette use causes changes in blood milieu that impair endothelial function. METHODS: Endothelial function was measured in chronic e-cigarette users, chronic cigarette smokers, and nonusers. We measured effects of participants' sera, or e-cigarette aerosol condensate, on NO and H2O2 release and cell permeability in cultured endothelial cells (ECs). RESULTS: E-cigarette users and smokers had lower flow-mediated dilation (FMD) than nonusers. Sera from e-cigarette users and smokers reduced VEGF (vascular endothelial growth factor)-induced NO secretion by ECs relative to nonuser sera, without significant reduction in endothelial NO synthase mRNA or protein levels. E-cigarette user sera caused increased endothelial release of H2O2 and greater permeability than nonuser sera. E-cigarette users and smokers exhibited changes in circulating biomarkers of inflammation, thrombosis, and cell adhesion relative to nonusers, but with distinct profiles. E-cigarette user sera had higher concentrations of the receptor for advanced glycation end products (RAGE) ligands S100A8 and HMGB1 (high mobility group box 1) than smoker and nonuser sera, and RAGE inhibition reduced permeability induced by e-cigarette user sera but did not affect NO production. CONCLUSIONS: Chronic vaping and smoking both impair FMD and cause changes in the blood that inhibit endothelial NO release. Vaping, but not smoking, causes changes in the blood that increase microvascular endothelial permeability and may have a vaping-specific effect on intracellular oxidative state. Our results suggest a role for RAGE in e-cigarette-induced changes in endothelial function.