ViCo: Engaging Video Comment Generation with Human Preference Rewards
Engaging video comments play an important role in video social media, as they
are the carrier of feelings, thoughts, or humor of the audience. Preliminary
works have made initial exploration for video comment generation by adopting
caption-style encoder-decoder models. However, comment generation presents some
unique challenges distinct from caption generation, which makes these methods
somewhat less effective at generating engaging comments. In contrast to the
objective and descriptive nature of captions, comments tend to be inherently
subjective, making it hard to quantify and evaluate the engagement of comments.
Furthermore, the scarcity of truly engaging comments brings difficulty to
collecting enough high-quality training examples. In this paper, we propose
ViCo with three novel designs to tackle the above challenges for generating
engaging Video Comments. Firstly, to quantify the engagement of comments, we
utilize the number of "likes" each comment receives as a proxy of human
preference after an appropriate debiasing procedure. Secondly, to automatically
evaluate the engagement of comments, we train a reward model to align its
judgment to the above proxy. Our user studies indicate that this reward model
effectively aligns with human judgments. Lastly, to alleviate the scarcity of
high-quality comments, an initial generator is trained on readily available but
noisy data to generate comments. Then the reward model is employed to offer
feedback on the generated comments, thus optimizing the initial generator. To
facilitate the research of video commenting, we collect a large video
comment-dataset (ViCo-20k) with rich metadata from a popular video website.
Experiments on ViCo-20k show that the comments generated by our ViCo model
exhibit the best performance in terms of both quantitative and qualitative
results, particularly when engagement is considered.
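The reward-model step described in the abstract above can be illustrated with a minimal sketch. The abstract specifies neither the loss function nor the debiasing procedure, so everything here is an assumption: the pairwise (Bradley-Terry style) ranking loss is a common choice for preference learning and is not claimed to be the authors' method, and the names `pairwise_ranking_loss` and `preference_pairs` are hypothetical. The like counts are assumed to be already debiased.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_other: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_other)).

    The loss is small when the preferred comment is scored higher.
    Assumed loss; the abstract does not state which one is used.
    """
    margin = reward_preferred - reward_other
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_pairs(comments):
    """Yield (preferred_text, other_text) pairs for comments on one video.

    `comments` is a list of (text, debiased_likes) tuples; ties are skipped.
    The debiased like counts are hypothetical inputs.
    """
    for i, (text_a, likes_a) in enumerate(comments):
        for text_b, likes_b in comments[i + 1:]:
            if likes_a > likes_b:
                yield text_a, text_b
            elif likes_b > likes_a:
                yield text_b, text_a
```

In a sketch like this, a reward model would be trained to minimize the loss over all pairs, and its scores could then serve as feedback for the comment generator.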
Visualizing Formation and Dynamics of a Three-Dimensional Sponge-like Network of a Coacervate in Real Time
Coacervates, which are formed by liquid–liquid phase separation, have been extensively explored as models for synthetic cells and membraneless organelles, so their in-depth structural analysis is crucial. However, both the inner structure dynamics and formation mechanism of coacervates remain elusive. Herein, we demonstrate real-time confocal observation of a three-dimensional sponge-like network in a dipeptide-based coacervate. In situ generation of the dipeptide allowed us to capture the emergence of the sponge-like network via unprecedented membrane folding of vesicle-shaped intermediates. We also visualized dynamic fluctuation of the network, including reversible engagement/disengagement of cross-links and a stochastic network kissing event. Photoinduced transient formation of a multiphase coacervate was achieved with a thermally responsive phase transition. Our findings expand the fundamental understanding of synthetic coacervates and provide opportunities to manipulate their physicochemical properties by engineering the inner network for potential applications in the development of artificial cells and life-like material fabrication.
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment
The pre-trained image-text models, like CLIP, have demonstrated the strong
power of vision-language representation learned from a large scale of
web-collected image-text data. In light of the well-learned visual features,
some existing works transfer image representation to video domain and achieve
good results. However, how to utilize an image-language pre-trained model (e.g.,
CLIP) for video-language pre-training (post-pretraining) is still
underexplored. In this paper, we investigate two questions: 1) what are the factors
hindering post-pretraining CLIP to further improve the performance on
video-language tasks? and 2) how to mitigate the impact of these factors?
Through a series of comparative experiments and analyses, we find that the data
scale and domain gap between language sources have great impacts. Motivated by
these, we propose an Omnisource Cross-modal Learning method equipped with a
Video Proxy mechanism on the basis of CLIP, namely CLIP-ViP. Extensive results
show that our approach improves the performance of CLIP on video-text retrieval
by a large margin. Our model also achieves SOTA results on a variety of
datasets, including MSR-VTT, DiDeMo, LSMDC, and ActivityNet. We will release
our code and pre-trained CLIP-ViP models at
https://github.com/microsoft/XPretrain/tree/main/CLIP-ViP
Target Guided Emotion Aware Chat Machine
The consistency of a response to a given post at semantic-level and
emotional-level is essential for a dialogue system to deliver human-like
interactions. However, this challenge is not well addressed in the literature,
since most of the approaches neglect the emotional information conveyed by a
post while generating responses. This article addresses this problem by
proposing a unified end-to-end neural architecture, which is capable of
simultaneously encoding the semantics and the emotions in a post and leveraging
target information for generating more intelligent responses with appropriately
expressed emotions. Extensive experiments on real-world data demonstrate that
the proposed method outperforms the state-of-the-art methods in terms of both
content coherence and emotion appropriateness. Comment: To appear on TOIS 202
Imidacloprid Alters Foraging and Decreases Bee Avoidance of Predators
Concern is growing over the effects of neonicotinoid pesticides, which can impair honey bee cognition. We provide the first demonstration that sublethal concentrations of imidacloprid can harm honey bee decision-making about danger by significantly increasing the probability of a bee visiting a dangerous food source. Apis cerana is a native bee that is an important pollinator of agricultural crops and native plants in Asia. When foraging on nectar containing 40 µg/L (34 ppb) imidacloprid, honey bees (Apis cerana) showed no aversion to a feeder with a hornet predator, and 1.8-fold more bees chose the dangerous feeder as compared to control bees. Control bees exhibited significant predator avoidance. We also give the first evidence that foraging by A. cerana workers can be inhibited by sublethal concentrations of the pesticide imidacloprid, which is widely used in Asia. Compared to bees collecting uncontaminated nectar, 23% fewer foragers returned to collect the nectar with 40 µg/L imidacloprid. Bees that did return collected 46% and 63% less nectar containing 20 µg/L and 40 µg/L imidacloprid, respectively. These results suggest that the effects of neonicotinoids on honey bee decision-making and other advanced cognitive functions should be explored. Moreover, research should extend beyond the classic model, the European honey bee (A. mellifera), to other important bee species.
Associations between reproduction and work in workers of the Asian hive bee Apis cerana
If a honey bee (Apis spp.) colony becomes queenless, about 1/3 of young workers activate their ovaries and produce haploid male-producing eggs. In doing so queenless workers maximize their inclusive fitness because the normal option of vicarious production of relatives via their queen's eggs is no longer available. But if many workers are engaged in reproduction, how does a queenless colony continue to feed its brood and forage? Here we show that in the Asian hive bee Apis cerana hypopharyngeal gland (HPG) size is larger in queenless workers than in queenright workers and that bees undertaking brood-rearing tasks have larger HPG than same-aged bees that are foraging. In queenless colonies, workers with a smaller number of ovarioles are more likely to have activated ovaries. This reinforces the puzzling observation that a large number of ovarioles reduces reproductive success in queenless A. cerana. It further suggests that reproductive workers either avoid foraging or transition to foraging later in life than nonreproductive workers. Finally, our study also showed that ovary activation and larger-than-average numbers of ovarioles had no statistically detectable influence on foraging specialization for pollen or nectar.