ViCo: Engaging Video Comment Generation with Human Preference Rewards
Engaging video comments play an important role in video social media, as they
are the carrier of feelings, thoughts, or humor of the audience. Preliminary
works have made initial exploration for video comment generation by adopting
caption-style encoder-decoder models. However, comment generation presents some
unique challenges distinct from caption generation, which makes these methods
somewhat less effective at generating engaging comments. In contrast to the
objective and descriptive nature of captions, comments tend to be inherently
subjective, making it hard to quantify and evaluate the engagement of comments.
Furthermore, the scarcity of truly engaging comments brings difficulty to
collecting enough high-quality training examples. In this paper, we propose
ViCo with three novel designs to tackle the above challenges for generating
engaging Video Comments. Firstly, to quantify the engagement of comments, we
utilize the number of "likes" each comment receives as a proxy of human
preference after an appropriate debiasing procedure. Secondly, to automatically
evaluate the engagement of comments, we train a reward model to align its
judgment to the above proxy. Our user studies indicate that this reward model
effectively aligns with human judgments. Lastly, to alleviate the scarcity of
high-quality comments, an initial generator is trained on readily available but
noisy data to generate comments. Then the reward model is employed to offer
feedback on the generated comments, thus optimizing the initial generator. To
facilitate research on video commenting, we collect a large video comment
dataset (ViCo-20k) with rich metadata from a popular video website.
Experiments on ViCo-20k show that the comments generated by our ViCo model
exhibit the best performance in terms of both quantitative and qualitative
results, particularly when engagement is considered.
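As a minimal sketch of how a reward model could be aligned to a like-based preference proxy, the example below builds preference pairs from comments on the same video and scores them with a pairwise ranking loss. The abstract does not state the loss or the pairing scheme, so the Bradley-Terry style loss, the function names `pairwise_ranking_loss` and `preference_pairs`, and the `(text, debiased_likes)` tuple layout are all assumptions made for illustration.

```python
import math


def pairwise_ranking_loss(score_preferred: float, score_other: float) -> float:
    """Assumed Bradley-Terry style loss: -log(sigmoid(s_preferred - s_other))."""
    margin = score_preferred - score_other
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_pairs(comments):
    """Yield (preferred_text, other_text) pairs for one video.

    `comments` is a list of (text, debiased_likes) tuples; this layout is
    hypothetical and not taken from the paper. Ties produce no pair.
    """
    ordered = sorted(comments, key=lambda c: c[1], reverse=True)
    for i, better in enumerate(ordered):
        for worse in ordered[i + 1:]:
            if better[1] > worse[1]:
                yield better[0], worse[0]


if __name__ == "__main__":
    pairs = list(preference_pairs([("a", 3), ("b", 1), ("c", 3)]))
    print(pairs)
    print(pairwise_ranking_loss(2.0, 0.0) < pairwise_ranking_loss(0.0, 2.0))
```

The loss is small when the reward model scores the more-liked comment higher and grows when the order is reversed, which is the sense in which the model's judgment would be aligned to the proxy.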
Quotients of Special Classes of Positroids
In this paper, we give a complete characterization of rank positroids
that are quotients of the uniform matroid , completing a partial
result by Benedetti-Chavez-Jiménez. Furthermore, we show that any pair of
concordant positroids with adjacent ranks are related by a cyclic shift on
their decorated permutations. We also use the concept of conecklaces to give a
full characterization of concordant lattice path matroids (LPMs).
Comment: 32 pages, 10 figures; This research was carried out as part of the
PACE program in the summer of 2023 at Peking University, Beijing; Comments
very welcome
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment
The pre-trained image-text models, like CLIP, have demonstrated the strong
power of vision-language representation learned from a large scale of
web-collected image-text data. In light of the well-learned visual features,
some existing works transfer image representation to video domain and achieve
good results. However, how to utilize an image-language pre-trained model (e.g.,
CLIP) for video-language pre-training (post-pretraining) is still
under-explored. In this paper, we investigate two questions: 1) what are the factors
hindering post-pretraining CLIP from further improving the performance on
video-language tasks? and 2) how to mitigate the impact of these factors?
Through a series of comparative experiments and analyses, we find that the data
scale and domain gap between language sources have great impacts. Motivated by
these findings, we propose an Omnisource Cross-modal Learning method equipped with a
Video Proxy mechanism on the basis of CLIP, namely CLIP-ViP. Extensive results
show that our approach improves the performance of CLIP on video-text retrieval
by a large margin. Our model also achieves SOTA results on a variety of
datasets, including MSR-VTT, DiDeMo, LSMDC, and ActivityNet. We will release
our code and pre-trained CLIP-ViP models at
https://github.com/microsoft/XPretrain/tree/main/CLIP-ViP
TeViS: Translating Text Synopses to Video Storyboards
A video storyboard is a roadmap for video creation which consists of
shot-by-shot images to visualize key plots in a text synopsis. Creating video
storyboards, however, remains challenging: it not only requires cross-modal
association between high-level texts and images but also demands long-term
reasoning to make transitions smooth across shots. In this paper, we propose a
new task called Text synopsis to Video Storyboard (TeViS) which aims to
retrieve an ordered sequence of images as the video storyboard to visualize the
text synopsis. We construct a MovieNet-TeViS dataset based on the public
MovieNet dataset. It contains 10K text synopses each paired with keyframes
manually selected from corresponding movies by considering both relevance and
cinematic coherence. To benchmark the task, we present strong CLIP-based
baselines and a novel VQ-Trans. VQ-Trans first encodes text synopsis and images
into a joint embedding space and uses vector quantization (VQ) to improve the
visual representation. Then, it auto-regressively generates a sequence of
visual features for retrieval and ordering. Experimental results demonstrate
that VQ-Trans significantly outperforms prior methods and the CLIP-based
baselines. Nevertheless, there is still a large gap compared to human
performance, suggesting room for promising future work. The code and data are
available at: https://ruc-aimind.github.io/projects/TeViS/
Comment: Accepted to ACM Multimedia 202
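The vector quantization step mentioned in the TeViS abstract can be illustrated with a nearest-codeword lookup. This is only a sketch of generic VQ: the abstract does not describe the codebook, the distance measure, or how quantized features feed the auto-regressive decoder, so the helper names `squared_distance` and `quantize`, the toy codebook, and the use of squared Euclidean distance are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return (index, codeword) of the codebook entry nearest to `vector`.

    Generic vector quantization; the codebook here is a hypothetical list of
    vectors, not the learned codebook from the paper.
    """
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(vector, codebook[i]),
    )
    return index, codebook[index]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    print(quantize([0.9, 1.2], toy_codebook))
```

Replacing each continuous visual feature with its nearest codeword gives a discrete representation, which is the general effect of VQ that the abstract refers to.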
Characterization of a Novel Lytic Bacteriophage φEC14 that Infects Enterobacter cloacae Clinical Isolates
Enterobacter cloacae, an important agent associated with nosocomial infection, is often resistant to expanded-spectrum cephalosporins. Bacteriophage therapy is a possible alternative to chemotherapy against bacterial infection. In this study, we have characterized a newly isolated bacteriophage, φEC14, which is specifically lytic to E. cloacae. Transmission electron microscopy revealed that phage φEC14 had an icosahedral head and long contractile tail, morphologically similar to phages belonging to family Siphoviridae. Pulsed-Field Gel Electrophoresis (PFGE) showed that the size of φEC14 virion DNA was in the range of 23.0-48.5 kb. Restriction analysis showed that lytic phage φEC14 was a double-stranded DNA virus, which could be cut by some restriction endonucleases. SDS-PAGE of phage proteins exhibited one major band and six minor bands with molecular weights ranging from 6.5 to 66.4 kDa. In a one-step growth experiment, phage φEC14 had a short latent period of 10 minutes and a burst size of 50 PFU/cell. A better understanding of the biological features of lytic bacteriophage φEC14 would facilitate the development of an alternative agent to control the spread of multidrug-resistant E. cloacae.
Comparison of Proangiogenic Effects of Adipose-Derived Stem Cells and Foreskin Fibroblast Exosomes on Artificial Dermis Prefabricated Flaps
Large prefabricated flaps often suffer from necrosis or poor healing due to a lack of new blood vessels and related factors that promote angiogenesis. The innovative use of adipose-derived stem cell exosomes (ADSC-Exo) resolves the problem of vascularization of prefabricated flaps. We analyzed the differential microRNA (miRNA) expression in ADSC-Exo using next-generation sequencing (NGS) technology to explore their potential mechanisms in promoting vascularization. We observed that ADSC-Exo could significantly promote the vascularization of artificial dermis prefabricated flaps compared with human foreskin fibroblast exosomes. NGS indicated that some miRNAs were differentially expressed between the two types of exosomes. Bioinformatics analysis suggested that significantly upregulated hsa-miR-760 and significantly downregulated hsa-miR-423-3p in ADSC-Exo could regulate the expression of the ITGA5 and HDAC5 genes, respectively, to promote the vascularization of skin flaps. In summary, ADSC-Exo can promote skin-flap vascularization, and thereby resolve the problem of insufficient neovascularization of artificial dermis prefabricated flaps, thus expanding the application of prefabricated skin-flap transplantation.
Reconstruction of perianal skin defect using modified keystone flap after perianal tumor resection
Purpose: The large resection area of perianal tumors makes the skin defect hard to reconstruct. The keystone flap has seen growing application in the repair of skin defects. Herein, we aimed to explore the efficacy of the keystone flap in the repair of skin defects after perianal tumor resection. Methods: This study is a retrospective review of patients diagnosed with perianal tumor from January 2010 to November 2021. A standardized data collection template was used to collect variables. The detailed process of the reconstructive surgery is carefully described in this article. After surgery, the healing process was closely observed. Results: Twenty patients underwent keystone flap repair. The average wound size before closure measured 3.5 × 4.9 cm². Primary wound healing was achieved, and the flap survived during the follow-up period, which ranged from 6 to 24 months. No severe complications occurred; slight edema was noticed in one patient. Conclusion: The application of the keystone flap is a promising way to repair skin defects after tumor removal, and the complication rate after surgery was low. It can be concluded that this method is an effective and reliable way to repair perianal skin defects.