286 research outputs found

Online Metric-Weighted Linear Representations for Robust Visual Tracking

Author: Dick Anthony
Li Xi
Shen Chunhua
Zhang Zhongfei
Zhuang Yueting
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/07/2015

In this paper, we propose a visual tracker based on a metric-weighted linear representation of appearance. In order to capture the interdependence of different feature dimensions, we develop two online distance metric learning methods using proximity comparison information and structured output learning. The learned metric is then incorporated into a linear representation of appearance. We show that online distance metric learning significantly improves the robustness of the tracker, especially on those sequences exhibiting drastic appearance changes. In order to bound growth in the number of training samples, we design a time-weighted reservoir sampling method. Moreover, we enable our tracker to automatically perform object identification during the process of object tracking, by introducing a collection of static template samples belonging to several object classes of interest. Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame. Experimental results on challenging video sequences demonstrate the effectiveness of the method for both inter-frame tracking and object identification.
Comment: 51 pages. Appearing in IEEE Transactions on Pattern Analysis and Machine Intelligence
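The abstract mentions a time-weighted reservoir sampling method for bounding the number of training samples but does not spell it out. As a rough illustration only, the sketch below uses a standard weighted reservoir scheme (Efraimidis-Spirakis A-Res) with an assumed exponential time weight; the function name, the `rate` parameter, and the weighting rule are hypothetical and are not taken from the paper.

```python
import heapq
import math
import random


def time_weighted_reservoir(stream, capacity, rate=0.01, seed=0):
    """Keep at most `capacity` items from `stream`, favouring recent ones.

    Assumption (not from the paper): the item arriving at step t gets
    weight exp(rate * t). A-Res keeps the items with the largest keys
    u ** (1 / weight); we compare log-keys to avoid float overflow.
    """
    rng = random.Random(seed)
    heap = []  # min-heap of (log_key, step, item); smallest key is evicted first
    for t, item in enumerate(stream):
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
        log_key = math.log(u) * math.exp(-rate * t)  # equals log(u) / weight
        if len(heap) < capacity:
            heapq.heappush(heap, (log_key, t, item))
        elif log_key > heap[0][0]:
            heapq.heapreplace(heap, (log_key, t, item))
    return [item for _, _, item in heap]


if __name__ == "__main__":
    kept = time_weighted_reservoir(range(1000), capacity=10)
    print(sorted(kept))
```

The buffer never grows beyond `capacity`, and later samples are more likely to survive, which is the general behaviour the abstract describes.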

arXiv.org e-Print Archive

Adelaide Research & Scholarship

Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths

Author: Hu Huanhang
Li Yanan
Lin Yuetan
Wang Donghui
Zhuang Yueting
Publication date: 19/03/2017

Zero-shot recognition aims to accurately recognize objects of unseen classes by using a shared visual-semantic mapping between the image feature space and the semantic embedding space. This mapping is learned on training data of seen classes and is expected to transfer to unseen classes. In this paper, we tackle this problem by exploiting the intrinsic relationship between the semantic space manifold and the transfer ability of the visual-semantic mapping. We formalize their connection and cast zero-shot recognition as a joint optimization problem. Motivated by this, we propose a novel framework for zero-shot recognition, which contains dual visual-semantic mapping paths. Our analysis shows that this framework can not only apply prior semantic knowledge to infer the underlying semantic manifold in the image feature space, but also generate an optimized semantic embedding space, which can enhance the transfer ability of the visual-semantic mapping to unseen classes. The proposed method is evaluated for zero-shot recognition on four benchmark datasets, achieving outstanding results.
Comment: Accepted as a full paper in IEEE Computer Vision and Pattern Recognition (CVPR) 201
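To make the opening idea concrete, the following minimal sketch shows the generic shared-mapping baseline the abstract starts from: project an image feature into the semantic space and pick the nearest unseen-class embedding. It is not the dual-path framework itself; the linear map `W`, the function names, and the toy class embeddings are all illustrative assumptions.

```python
import math


def project(W, x):
    """Apply a linear visual-semantic map W (list of rows) to feature x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def zero_shot_predict(W, x, class_embeddings):
    """Return the unseen class whose embedding is closest to the projected x."""
    z = project(W, x)
    return max(class_embeddings, key=lambda name: cosine(z, class_embeddings[name]))


if __name__ == "__main__":
    # Hypothetical map, assumed to have been learned on seen classes (3-D to 2-D).
    W = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
    unseen = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}  # toy semantic embeddings
    print(zero_shot_predict(W, [0.9, 0.1, 0.5], unseen))
```

In this baseline the semantic embeddings are fixed; the abstract's contribution is to also optimize that embedding space, which this sketch does not attempt.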

arXiv.org e-Print Archive

Crossref

Video Question Answering via Attribute-Augmented Attention Network Learning

Author: Chen Long
Li Yimeng
Xiao Jun
Ye Yunan
Zhao Zhou
Zhuang Yueting
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/07/2017

Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question. However, existing visual question answering approaches mainly tackle static image question answering, and may be ineffective for video question answering because they do not adequately model the temporal dynamics of video content. In this paper, we study the problem of video question answering by modeling its temporal dynamics with a frame-level attention mechanism. We propose the attribute-augmented attention network learning framework, which enables joint frame-level attribute detection and unified video representation learning for video question answering. We then incorporate a multi-step reasoning process into our proposed attention network to further improve performance. We construct a large-scale video question answering dataset. We conduct experiments on both multiple-choice and open-ended video question answering tasks to show the effectiveness of the proposed method.
Comment: Accepted for SIGIR 201
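As a hedged illustration of what frame-level attention can look like in its simplest form, the sketch below scores each frame feature against a question vector with a dot product, normalizes the scores with a softmax, and pools the frames by those weights. The scoring function and all names are assumptions made for illustration; the attribute-augmented network and multi-step reasoning described in the abstract are not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_frames(frames, question):
    """Return (attention weights, weighted-average frame feature).

    frames: list of equal-length feature vectors, one per video frame.
    question: vector of the same length (assumed dot-product scoring).
    """
    scores = [sum(f * q for f, q in zip(frame, question)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return weights, pooled


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy per-frame features
    question = [2.0, 0.0]                          # toy question encoding
    weights, pooled = attend_frames(frames, question)
    print(weights, pooled)
```

Frames that align with the question receive larger weights, so the pooled vector emphasizes the parts of the video most relevant to the question.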

arXiv.org e-Print Archive

Crossref

Intrusion of polyethylene glycol into solid-state nanopores

Author: Li Yibing
Sun Yueting
Xu Chengliang
Publication date: 01/01/2018

The intrusion of aqueous PEG solution into solid-state nanopores under mechanical pressure is experimentally investigated. By using hydrophobic nanoporous silica with a broad range of pore sizes, the characteristic size of PEG chains in water while penetrating nanopores is measured and analyzed; it increases with the molecular weight of PEG and decreases with its concentration. Its sensitivity to molecular weight is relatively limited due to nano-confinement. The inclusion of PEG in the intruding liquid imposes a rate effect on the intrusion pressure and inhibits extrusion from the nanopores.

Crossref

University of Birmingham Research Portal

PubMed Central