218 research outputs found

Video Captioning via Hierarchical Reinforcement Learning

Author: Chen Wenhu
Wang William Yang
Wang Xin
Wang Yuan-Fang
Wu Jiawei
Publication date: 29/03/2018

Video captioning is the task of automatically generating a textual description of the actions in a video. Although previous work (e.g., sequence-to-sequence models) has shown promising results in abstracting a coarse description of a short video, it is still very challenging to caption a video containing multiple fine-grained actions with a detailed description. This paper aims to address the challenge by proposing a novel hierarchical reinforcement learning framework for video captioning, where a high-level Manager module learns to design sub-goals and a low-level Worker module recognizes the primitive actions to fulfill the sub-goal. With this compositional framework to reinforce video captioning at different levels, our approach significantly outperforms all the baseline methods on a newly introduced large-scale dataset for fine-grained video captioning. Furthermore, our non-ensemble model has already achieved the state-of-the-art results on the widely-used MSR-VTT dataset.

Comment: CVPR 2018, with supplementary material

arXiv.org e-Print Archive

Crossref

XL-NBT: A Cross-lingual Neural Belief Tracking Framework

Author: Chen Jianshu
Chen Wenhu
Su Yu
Wang William Yang
Wang Xin
Yan Xifeng
Yu Dong
Publication date: 01/01/2018

Task-oriented dialog systems are becoming pervasive, and many companies heavily rely on them to complement human agents for customer service in call centers. With globalization, the need for providing cross-lingual customer support becomes more urgent than ever. However, cross-lingual support poses great challenges: it requires a large amount of additional annotated data from native speakers. In order to bypass the expensive human annotation and achieve the first step towards the ultimate goal of building a universal dialog system, we set out to build a cross-lingual state tracking framework. Specifically, we assume that there exists a source language with dialog belief tracking annotations while the target languages have no annotated dialog data of any form. Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data. We then distill and transfer its knowledge to the student state tracker in target languages. We specifically discuss two types of common parallel resources, a bilingual corpus and a bilingual dictionary, and design different transfer learning strategies accordingly. Experimentally, we successfully use an English state tracker as the teacher to transfer its knowledge to both Italian and German trackers and achieve promising results.

Comment: 13 pages, 5 figures, 3 tables, accepted to EMNLP 2018 conference

arXiv.org e-Print Archive

Crossref

Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering

Author: Chen Wenhu
Ma Xueguang
Wang Yubo
Publication date: 05/09/2023

Large-scale language models (LLMs), such as ChatGPT, are capable of generating human-like responses for various downstream tasks, such as task-oriented dialogues and question answering. However, applying LLMs to medical domains remains challenging due to their inability to leverage domain-specific knowledge. In this study, we present the Large-scale Language Models Augmented with Medical Textbooks (LLM-AMT), which integrates authoritative medical textbooks as the cornerstone of its design, enhancing its proficiency in the specialized domain through plug-and-play modules comprising a Hybrid Textbook Retriever, supplemented by the Query Augmenter and the LLM Reader. Experimental evaluation on three open-domain medical question-answering tasks reveals a substantial enhancement in both the professionalism and accuracy of the LLM responses when utilizing LLM-AMT, exhibiting an improvement ranging from 11.4% to 13.2%. We found that medical textbooks, despite being 100 times smaller as a retrieval corpus, serve as a more valuable external knowledge source than Wikipedia in the medical domain. Our experiments show that textbook augmentation results in a performance improvement ranging from 9.7% to 12.2% over Wikipedia augmentation.

arXiv.org e-Print Archive

Tandem Phosphorothioate Modifications for DNA Adsorption Strength and Polarity Control on Gold Nanoparticles

Author: Ding Jinsong
Liu Juewen
Wang Feng
Zhou Wenhu
Publication venue: American Chemical Society (ACS)
Publication date: 10/09/2014

This document is the Accepted Manuscript version of a Published Work that appeared in final form in Applied Materials & Interfaces, copyright © American Chemical Society after peer review and technical editing by publisher. To access the final edited and published work see Zhou, W., Wang, F., Ding, J., & Liu, J. (2014). Tandem Phosphorothioate Modifications for DNA Adsorption Strength and Polarity Control on Gold Nanoparticles. ACS Applied Materials & Interfaces, 6(17), 14795–14800. https://doi.org/10.1021/am504791b

Unmodified DNA was recently used to functionalize gold nanoparticles via DNA base adsorption. Compared to thiolated DNA, however, the application of unmodified DNA is limited by the lack of sequence generality, adsorption polarity control and poor adsorption stability. We report that these problems can be solved using phosphorothioate (PS) DNA. PS DNA binds to gold mainly via the sulfur atom and is thus less sequence dependent. The adsorption affinity is ranked to be thiol > PS > adenine > thymine. Tandem PS improves adsorption strength, allows tunable DNA density, and the resulting conjugates are functional at a low cost.

University of Waterloo || Natural Sciences and Engineering Research Council || Foundation for Shenghua Scholar of Central South University || National Natural Science Foundation of China || Grant No. 2130119

University of Waterloo's Institutional Repository