7 research outputs found

    On the Difference of BERT-style and CLIP-style Text Encoders

    Full text link
    Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, with BERT as a representative model. Recently, contrastive language-image pretraining (CLIP) has also attracted attention, especially its vision models, which achieve excellent performance on a broad range of vision tasks. However, few studies have examined the text encoders learned by CLIP. In this paper, we analyze the difference between BERT-style and CLIP-style text encoders through three experiments: (i) general text understanding, (ii) vision-centric text understanding, and (iii) text-to-image generation. Experimental analyses show that although CLIP-style text encoders underperform BERT-style ones on general text understanding tasks, they are equipped with a unique ability for cross-modal association, i.e., synesthesia, which more closely resembles human perception.
    Comment: Natural Language Processing. 10 pages, 1 figure. Findings of ACL-202
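    To make the comparison concrete (this is a minimal sketch, not the authors' evaluation protocol; the checkpoints and pooling choices below are assumptions), one could obtain sentence embeddings from a BERT-style encoder and a CLIP-style text encoder via Hugging Face Transformers and compare how similarly they score a paraphrase pair:

```python
# Minimal sketch (assumed setup, not the paper's exact protocol):
# embed a paraphrase pair with a BERT-style and a CLIP-style text encoder
# and compare the resulting cosine similarities.
import torch
from transformers import AutoTokenizer, BertModel, CLIPTextModel

sentences = ["a dog running on the beach", "a puppy sprinting along the shore"]

# BERT-style encoder: mean-pool the final hidden states.
bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()

# CLIP-style text encoder: use the pooled (EOS-token) output.
clip_tok = AutoTokenizer.from_pretrained("openai/clip-vit-base-patch32")
clip = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32").eval()

@torch.no_grad()
def bert_embed(texts):
    batch = bert_tok(texts, padding=True, return_tensors="pt")
    hidden = bert(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)     # mean pooling

@torch.no_grad()
def clip_embed(texts):
    batch = clip_tok(texts, padding=True, return_tensors="pt")
    return clip(**batch).pooler_output              # EOS-token embedding

for name, embed in [("BERT", bert_embed), ("CLIP", clip_embed)]:
    a, b = embed(sentences)
    sim = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
    print(f"{name}-style similarity: {sim:.3f}")
```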

    Plum: Prompt Learning using Metaheuristic

    Full text link
    Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, demonstrating their effectiveness in black-box prompt learning and Chain-of-Thought prompt tuning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown, opening the door to a cornucopia of possibilities in prompt optimization. We release all the code at \url{https://github.com/research4pan/Plum}
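    As one concrete instance of the metaheuristics listed above (hill climbing; this is an illustrative sketch, not the released Plum implementation, and the candidate edits and scoring function are assumptions), a discrete, gradient-free prompt search against a black-box scorer might look like this:

```python
# Minimal hill-climbing sketch for black-box, gradient-free prompt search.
# `score_prompt` is a hypothetical stand-in for any black-box evaluation,
# e.g. accuracy of an LLM on a small validation set given the prompt.
import random

def score_prompt(prompt: str) -> float:
    """Hypothetical black-box scorer; replace with a real LLM evaluation."""
    # Toy objective: prefer prompts that ask for step-by-step reasoning.
    return prompt.lower().count("step") - 0.01 * len(prompt)

def neighbors(prompt: str) -> list[str]:
    """Generate candidate discrete edits: append a phrase or drop a sentence."""
    phrases = ["Let's think step by step.", "Answer concisely.",
               "Explain your reasoning.", "Check the result."]
    sentences = prompt.split(". ")
    cands = [prompt + " " + p for p in phrases]                 # append a phrase
    cands += [". ".join(sentences[:i] + sentences[i + 1:])      # drop a sentence
              for i in range(len(sentences)) if len(sentences) > 1]
    return cands

def hill_climb(seed: str, iters: int = 50, rng=random.Random(0)) -> str:
    best, best_score = seed, score_prompt(seed)
    for _ in range(iters):
        cand = rng.choice(neighbors(best))
        s = score_prompt(cand)
        if s > best_score:              # greedy: keep only improving edits
            best, best_score = cand, s
    return best

print(hill_climb("Solve the problem."))
```

    Simulated annealing or tabu search would reuse the same neighborhood and scorer, differing only in the acceptance rule.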

    TeViS: Translating Text Synopses to Video Storyboards

    Full text link
    A video storyboard is a roadmap for video creation, consisting of shot-by-shot images that visualize the key plots of a text synopsis. Creating video storyboards, however, remains challenging: it not only requires cross-modal association between high-level text and images but also demands long-term reasoning to keep transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS), which aims to retrieve an ordered sequence of images as the video storyboard to visualize the text synopsis. We construct the MovieNet-TeViS dataset based on the public MovieNet dataset. It contains 10K text synopses, each paired with keyframes manually selected from the corresponding movies by considering both relevance and cinematic coherence. To benchmark the task, we present strong CLIP-based baselines and a novel model, VQ-Trans. VQ-Trans first encodes the text synopsis and images into a joint embedding space and uses vector quantization (VQ) to improve the visual representation. Then, it auto-regressively generates a sequence of visual features for retrieval and ordering. Experimental results demonstrate that VQ-Trans significantly outperforms prior methods and the CLIP-based baselines. Nevertheless, there is still a large gap compared to human performance, suggesting room for promising future work. The code and data are available at: \url{https://ruc-aimind.github.io/projects/TeViS/}
    Comment: Accepted to ACM Multimedia 202
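    To make the vector-quantization step concrete (a generic VQ lookup as commonly used in VQ models, not the released VQ-Trans code; the codebook size and feature dimensions are assumptions), the sketch below snaps continuous visual features to their nearest codebook entries with a straight-through gradient:

```python
# Generic vector-quantization layer: nearest-codebook lookup with a
# straight-through estimator, illustrating the VQ step described above.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes: int = 512, dim: int = 256):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        nn.init.uniform_(self.codebook.weight, -1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor):
        # z: (batch, seq, dim) continuous visual features.
        codes = self.codebook.weight[None].expand(z.size(0), -1, -1)
        dists = torch.cdist(z, codes)               # (batch, seq, num_codes)
        idx = dists.argmin(dim=-1)                  # nearest code index
        z_q = self.codebook(idx)                    # quantized features
        z_q = z + (z_q - z).detach()                # straight-through gradient
        return z_q, idx

# Toy usage: quantize a batch of 8-step visual feature sequences.
vq = VectorQuantizer()
feats = torch.randn(2, 8, 256)
quantized, code_ids = vq(feats)
print(quantized.shape, code_ids.shape)  # torch.Size([2, 8, 256]) torch.Size([2, 8])
```

    The resulting discrete code sequence is what an autoregressive decoder could then predict step by step for retrieval and ordering.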

    Resilience-Oriented Planning of Urban Distribution System Source–Network–Load–Storage in the Context of High-Penetrated Building-Integrated Resources

    No full text
    Building-integrated flexible resources offer an economical way to accommodate a high penetration of renewable energy sources (RESs) and can be coordinated to achieve cost-effective supply. This paper proposes a resilience-oriented planning model for the source–network–load–storage of an urban distribution system with a high penetration of building-integrated resources. In this model, source–network–load–storage resources are planned cost-optimally, including lines, soft open points (SOPs), building-integrated photovoltaics (BIPVs), building-integrated wind turbines (BIWTs), and building-integrated energy storage systems (ESSs). To enhance recovery capability during extreme faults, fault scenarios are incorporated into distribution system operation through multiple coupled recovery stages. Resilience-oriented planning is a difficult problem because of its source–network–load–storage couplings and normal–fault couplings. The original planning problem is reformulated as a mixed-integer linear program (MILP), which is then solved with a two-stage method and evaluated using multi-dimensional metrics. The proposed methodology is benchmarked on a Portuguese 54-node urban distribution system to verify its effectiveness in improving system economy and resilience. Case studies show that the proposed methodology can exploit the optimal synergies of different source–network–load–storage components and enhance system dispatchability.
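    To illustrate the two-stage structure in miniature (a toy example with made-up numbers and a brute-force search, not the paper's MILP formulation or solution method), the sketch below enumerates first-stage binary investment decisions and, for each, evaluates expected operation plus load-shedding cost over a normal scenario and a fault scenario:

```python
# Toy two-stage planning sketch (illustrative numbers only, not the paper's model):
# stage 1 chooses binary investments; stage 2 evaluates operation and
# load-shedding cost under a normal scenario and a fault scenario.
from itertools import product

INVESTMENTS = {"new_line": 120.0, "bipv": 80.0, "ess": 60.0}   # capital cost
SCENARIOS = {"normal": 0.95, "fault": 0.05}                    # probabilities

def operation_cost(build: dict, scenario: str) -> float:
    """Hypothetical stage-2 cost: more supply and less shedding with more assets."""
    demand = 100.0
    supply = 70.0 + (25.0 if build["bipv"] else 0.0)
    if scenario == "fault":
        supply *= 0.8 if build["new_line"] else 0.5    # extra line adds redundancy
        supply += 15.0 if build["ess"] else 0.0        # ESS rides through the fault
    shedding = max(0.0, demand - supply)
    return 1.0 * min(demand, supply) + 50.0 * shedding  # energy cost + shed penalty

best = None
for choice in product([0, 1], repeat=len(INVESTMENTS)):
    build = dict(zip(INVESTMENTS, choice))
    capex = sum(cost for name, cost in INVESTMENTS.items() if build[name])
    expected_opex = sum(p * operation_cost(build, s) for s, p in SCENARIOS.items())
    total = capex + expected_opex
    if best is None or total < best[0]:
        best = (total, build)

print(f"best plan: {best[1]}  total cost: {best[0]:.1f}")
```

    A real MILP replaces the enumeration with binary investment variables and continuous dispatch variables linked by linear constraints, so the problem stays tractable at the scale of a 54-node system.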