68 research outputs found

    ViCo: Engaging Video Comment Generation with Human Preference Rewards

    Engaging video comments play an important role in video social media, as they are the carrier of feelings, thoughts, or humor of the audience. Preliminary works have made initial explorations of video comment generation by adopting caption-style encoder-decoder models. However, comment generation presents some unique challenges distinct from caption generation, which makes these methods somewhat less effective at generating engaging comments. In contrast to the objective and descriptive nature of captions, comments tend to be inherently subjective, making it hard to quantify and evaluate the engagement of comments. Furthermore, the scarcity of truly engaging comments makes it difficult to collect enough high-quality training examples. In this paper, we propose ViCo with three novel designs to tackle the above challenges for generating engaging Video Comments. Firstly, to quantify the engagement of comments, we utilize the number of "likes" each comment receives as a proxy for human preference after an appropriate debiasing procedure. Secondly, to automatically evaluate the engagement of comments, we train a reward model to align its judgment to the above proxy. Our user studies indicate that this reward model effectively aligns with human judgments. Lastly, to alleviate the scarcity of high-quality comments, an initial generator is trained on readily available but noisy data to generate comments. Then the reward model is employed to offer feedback on the generated comments, thus optimizing the initial generator. To facilitate the research of video commenting, we collect a large video comment dataset (ViCo-20k) with rich metadata from a popular video website. Experiments on ViCo-20k show that the comments generated by our ViCo model exhibit the best performance in terms of both quantitative and qualitative results, particularly when engagement is considered.

    Visualizing Formation and Dynamics of a Three-Dimensional Sponge-like Network of a Coacervate in Real Time

    Coacervates, which are formed by liquid–liquid phase separation, have been extensively explored as models for synthetic cells and membraneless organelles, so their in-depth structural analysis is crucial. However, both the inner structure dynamics and formation mechanism of coacervates remain elusive. Herein, we demonstrate real-time confocal observation of a three-dimensional sponge-like network in a dipeptide-based coacervate. In situ generation of the dipeptide allowed us to capture the emergence of the sponge-like network via unprecedented membrane folding of vesicle-shaped intermediates. We also visualized dynamic fluctuation of the network, including reversible engagement/disengagement of cross-links and a stochastic network kissing event. Photoinduced transient formation of a multiphase coacervate was achieved with a thermally responsive phase transition. Our findings expand the fundamental understanding of synthetic coacervates and provide opportunities to manipulate their physicochemical properties by engineering the inner network for potential applications in the development of artificial cells and life-like material fabrication.

    CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment

    Pre-trained image-text models, like CLIP, have demonstrated the strong power of vision-language representation learned from a large scale of web-collected image-text data. In light of the well-learned visual features, some existing works transfer image representation to the video domain and achieve good results. However, how to utilize an image-language pre-trained model (e.g., CLIP) for video-language pre-training (post-pretraining) is still underexplored. In this paper, we investigate two questions: 1) what are the factors hindering post-pretraining CLIP from further improving the performance on video-language tasks? and 2) how to mitigate the impact of these factors? Through a series of comparative experiments and analyses, we find that the data scale and domain gap between language sources have great impacts. Motivated by these findings, we propose an Omnisource Cross-modal Learning method equipped with a Video Proxy mechanism on the basis of CLIP, namely CLIP-ViP. Extensive results show that our approach improves the performance of CLIP on video-text retrieval by a large margin. Our model also achieves SOTA results on a variety of datasets, including MSR-VTT, DiDeMo, LSMDC, and ActivityNet. We will release our code and pre-trained CLIP-ViP models at https://github.com/microsoft/XPretrain/tree/main/CLIP-ViP

    Target Guided Emotion Aware Chat Machine

    The consistency of a response to a given post at the semantic level and the emotional level is essential for a dialogue system to deliver human-like interactions. However, this challenge is not well addressed in the literature, since most approaches neglect the emotional information conveyed by a post while generating responses. This article addresses this problem by proposing a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post and leveraging target information for generating more intelligent responses with appropriately expressed emotions. Extensive experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art methods in terms of both content coherence and emotion appropriateness. Comment: To appear on TOIS 202

    Imidacloprid Alters Foraging and Decreases Bee Avoidance of Predators

    Concern is growing over the effects of neonicotinoid pesticides, which can impair honey bee cognition. We provide the first demonstration that sublethal concentrations of imidacloprid can harm honey bee decision-making about danger by significantly increasing the probability of a bee visiting a dangerous food source. Apis cerana is a native bee that is an important pollinator of agricultural crops and native plants in Asia. When foraging on nectar containing 40 µg/L (34 ppb) imidacloprid, honey bees (Apis cerana) showed no aversion to a feeder with a hornet predator, and 1.8-fold more bees chose the dangerous feeder as compared to control bees. Control bees exhibited significant predator avoidance. We also give the first evidence that foraging by A. cerana workers can be inhibited by sublethal concentrations of the pesticide imidacloprid, which is widely used in Asia. Compared to bees collecting uncontaminated nectar, 23% fewer foragers returned to collect the nectar with 40 µg/L imidacloprid. Bees that did return collected 46% and 63% less nectar containing 20 µg/L and 40 µg/L imidacloprid, respectively. These results suggest that the effects of neonicotinoids on honey bee decision-making and other advanced cognitive functions should be explored. Moreover, research should extend beyond the classic model, the European honey bee (A. mellifera), to other important bee species.

    Associations between reproduction and work in workers of the Asian hive bee Apis cerana

    If a honey bee (Apis spp.) colony becomes queenless, about 1/3 of young workers activate their ovaries and produce haploid male-producing eggs. In doing so, queenless workers maximize their inclusive fitness because the normal option of vicarious production of relatives via their queen's eggs is no longer available. But if many workers are engaged in reproduction, how does a queenless colony continue to feed its brood and forage? Here we show that in the Asian hive bee Apis cerana, hypopharyngeal gland (HPG) size is larger in queenless workers than in queenright workers and that bees undertaking brood-rearing tasks have larger HPGs than same-aged bees that are foraging. In queenless colonies, workers with a smaller number of ovarioles are more likely to have activated ovaries. This reinforces the puzzling observation that a large number of ovarioles reduces reproductive success in queenless A. cerana. It further suggests that reproductive workers either avoid foraging or transition to foraging later in life than nonreproductive workers. Finally, our study also showed that ovary activation and larger-than-average numbers of ovarioles had no statistically detectable influence on foraging specialization for pollen or nectar.