229 research outputs found

    Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

    Motion representation plays a vital role in human action recognition in videos. In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach. The OFF is derived from the definition of optical flow and is orthogonal to the optical flow. The derivation also provides theoretical support for using the difference between two frames. By directly calculating pixel-wise spatiotemporal gradients of the deep feature maps, the OFF can be embedded in any existing CNN-based video action recognition framework with only a slight additional cost. It enables the CNN to extract spatial and temporal information simultaneously, in particular the temporal information between frames. This simple but powerful idea is validated by experimental results. The network with OFF fed only with RGB inputs achieves a competitive accuracy of 93.3% on UCF-101, which is comparable with the result obtained by two streams (RGB and optical flow), but is 15 times faster. Experimental results also show that OFF is complementary to other motion modalities such as optical flow. When the proposed method is plugged into the state-of-the-art video action recognition framework, it achieves 96.0% and 74.2% accuracy on UCF-101 and HMDB-51, respectively. The code for this project is available at https://github.com/kevin-ssy/Optical-Flow-Guided-Feature.
    Comment: CVPR 2018. Code available at https://github.com/kevin-ssy/Optical-Flow-Guided-Feature
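    The abstract describes OFF as pixel-wise spatiotemporal gradients of feature maps: spatial derivatives of one frame's features stacked with the temporal difference to the next frame. A minimal NumPy sketch of that idea (central differences stand in for the Sobel filters used in the paper; the function name and shapes are illustrative, not the authors' API):

    ```python
    import numpy as np

    def optical_flow_guided_feature(f1, f2):
        """Sketch of OFF for a single-channel (H, W) feature map pair
        from two consecutive frames: stack spatial gradients of the
        current frame's features with the temporal frame difference."""
        # Spatial gradients via central differences (the paper uses Sobel filters).
        gy, gx = np.gradient(f1)
        # Temporal gradient: element-wise difference between the two frames.
        gt = f2 - f1
        # OFF concatenates the three gradient channels.
        return np.stack([gx, gy, gt], axis=0)

    f1 = np.random.rand(8, 8).astype(np.float32)
    f2 = np.random.rand(8, 8).astype(np.float32)
    off = optical_flow_guided_feature(f1, f2)
    print(off.shape)  # (3, 8, 8)
    ```

    Because the operation is just differencing, it adds almost no cost on top of the backbone CNN, which is the efficiency argument the abstract makes against computing full optical flow.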

    Revisiting Classifier: Transferring Vision-Language Models for Video Recognition

    Transferring knowledge from task-agnostic pre-trained deep models to downstream tasks is an important topic in computer vision research. With the growth of computational capacity, open-source vision-language models pre-trained at large scale, in both model architecture and amount of data, are now available. In this study, we focus on transferring knowledge for video classification tasks. Conventional methods randomly initialize the linear classifier head for vision classification, leaving the use of the text encoder for downstream visual recognition tasks unexplored. In this paper, we revise the role of the linear classifier and replace it with knowledge from the pre-trained model: we utilize the well-pretrained language model to generate good semantic targets for efficient transfer learning. The empirical study shows that our method improves both the performance and the training speed of video classification, with a negligible change in the model. Our simple yet effective tuning paradigm achieves state-of-the-art performance and efficient training in various video recognition scenarios, i.e., zero-shot, few-shot, and general recognition. In particular, our paradigm achieves state-of-the-art accuracy of 87.8% on Kinetics-400, and also surpasses previous methods by 20~50% absolute top-1 accuracy under zero-shot and few-shot settings on five popular video datasets. Code and models can be found at https://github.com/whwu95/Text4Vis .
    Comment: Accepted by AAAI-2023. Camera-ready version
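    The core idea, replacing a randomly initialized classifier head with semantic targets from a pre-trained text encoder, can be sketched as using (frozen) text embeddings of the class names as the classifier weight matrix, so logits become cosine similarities. A minimal sketch under that assumption (all names and shapes are hypothetical, not the Text4Vis API):

    ```python
    import numpy as np

    def text_initialized_classifier(video_feats, class_text_embeds):
        """Sketch: classify video features against text-encoder embeddings
        of the class names instead of a randomly initialized linear head."""
        # L2-normalize both sides so the logits are cosine similarities.
        v = video_feats / np.linalg.norm(video_feats, axis=-1, keepdims=True)
        t = class_text_embeds / np.linalg.norm(class_text_embeds, axis=-1, keepdims=True)
        return v @ t.T  # (batch, num_classes) similarity logits

    feats = np.random.rand(4, 512)    # features from a visual encoder (toy data)
    texts = np.random.rand(400, 512)  # e.g. one embedding per Kinetics-400 class name
    logits = text_initialized_classifier(feats, texts)
    print(logits.shape)  # (4, 400)
    ```

    Because the class weights carry semantics rather than random noise, such a head needs no per-class training data, which is consistent with the zero-shot results the abstract reports.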

    Do Managers In Chinese Family Firms Learn From The Market? Evidence From Chinese Private Placement

    Recent empirical papers report managers’ learning in merger and acquisition (M&A) decisions, and family control is central in many countries. Does learning exist in family firms’ financing decisions? Based on announced private placements by Chinese family firms, we investigate the relation between managers’ final decisions in family firms and the market reaction to the announcement. Our analysis suggests a non-linear relation between managers’ learning and family control. Managers generally learn from the market when making final decisions, but family involvement can reduce this probability. Supplementary tests indicate that managers in family firms with low ownership are less likely to learn from the market than those in family firms with high ownership. Further analysis suggests that corporate governance can influence managers’ learning: family members’ participation in purchasing the placed shares, and their service as top managers, make managers’ learning less likely when ownership is low. Independent directors in family firms do not play their due role in supervising the behavior of managers and large shareholders.

    Market Feedback And Managers’ Decisions In Private Placement – Evidence From Chinese Family Firms

    What effect does market feedback have on managers’ private placement decisions in family firms? Based on information asymmetry, agency theory, and corporate governance theory, we investigate the relationship between managers’ final decisions and market feedback to the announcement. We find that managers in family firms accept market feedback in decision-making and that their attitude can be affected by many external factors. Managers tend to listen to the market when family firms are non-high-tech, when family members participate in purchasing the placed shares, when family members serve as managers, and when the separation of control rights from ownership is small.

    Energy-Based Models For Speech Synthesis

    Recently there has been a lot of interest in non-autoregressive (non-AR) models for speech synthesis, such as FastSpeech 2 and diffusion models. Unlike AR models, these models have no autoregressive dependencies among outputs, which makes inference efficient. This paper expands the range of available non-AR models with another member, energy-based models (EBMs). The paper describes how noise contrastive estimation, which relies on the comparison between positive and negative samples, can be used to train EBMs. It proposes a number of strategies for generating effective negative samples, including using high-performing AR models. It also describes how sampling from EBMs can be performed using Langevin Markov chain Monte Carlo (MCMC). The use of Langevin MCMC enables connections to be drawn between EBMs and currently popular diffusion models. Experiments on the LJSpeech dataset show that the proposed approach offers improvements over Tacotron 2.
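    Langevin MCMC sampling from an EBM p(x) ∝ exp(−E(x)) is a standard procedure: repeated gradient descent on the energy plus injected Gaussian noise. A toy sketch on a quadratic energy (a standard Gaussian target); the function name and step schedule are illustrative, not the paper's configuration:

    ```python
    import numpy as np

    def langevin_sample(grad_energy, x0, steps=500, step_size=1e-2, rng=None):
        """Sketch of (unadjusted) Langevin MCMC for an EBM p(x) ∝ exp(-E(x)).
        grad_energy: callable returning the gradient of the energy, ∇E(x)."""
        rng = np.random.default_rng(0) if rng is None else rng
        x = x0.copy()
        for _ in range(steps):
            noise = rng.standard_normal(x.shape)
            # Langevin update: descend the energy, then inject scaled noise.
            x = x - step_size * grad_energy(x) + np.sqrt(2 * step_size) * noise
        return x

    # Toy energy E(x) = ||x||^2 / 2, whose density is a standard Gaussian;
    # 1000 independent scalar chains, so the final samples should have std ≈ 1.
    samples = langevin_sample(lambda x: x, np.zeros(1000))
    print(samples.std())
    ```

    The link to diffusion models mentioned in the abstract comes from exactly this update: denoising diffusion sampling is a sequence of noise-perturbed gradient steps of the same form, with a score network standing in for −∇E.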

    TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance

    Grasp pose estimation is an important problem for robots interacting with the real world. However, most existing methods require exact 3D object models to be available beforehand, or a large amount of grasp annotations for training. To avoid these problems, we propose TransGrasp, a category-level grasp pose estimation method that predicts grasp poses for a category of objects by labeling only one object instance. Specifically, we perform grasp pose transfer across a category of objects based on their shape correspondences, and propose a grasp pose refinement module to further fine-tune the gripper grasp poses so as to ensure successful grasps. Experiments demonstrate the effectiveness of our method in achieving high-quality grasps with the transferred grasp poses. Our code is available at https://github.com/yanjh97/TransGrasp.
    Comment: Accepted to the European Conference on Computer Vision (ECCV) 2022
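    The transfer step can be pictured as mapping grasp contact points from the one labeled source instance to an unlabeled target instance through point correspondences. A toy sketch assuming the two point clouds are already aligned and using nearest neighbours as the correspondence (the paper instead learns dense correspondences from a categorical shape model; names and shapes here are hypothetical):

    ```python
    import numpy as np

    def transfer_grasp(src_points, tgt_points, src_grasp_idx):
        """Sketch of category-level grasp transfer: move grasp contact
        points from a labeled source shape to a target shape via
        nearest-neighbour point correspondences."""
        grasp_pts = src_points[src_grasp_idx]  # (k, 3) contact points on the source
        # For each source contact, find the closest point on the target shape.
        d = np.linalg.norm(tgt_points[None, :, :] - grasp_pts[:, None, :], axis=-1)
        return tgt_points[d.argmin(axis=1)]    # (k, 3) transferred contact points

    rng = np.random.default_rng(0)
    src = rng.random((200, 3))                 # toy source point cloud
    tgt = src + 0.001                          # toy "same-category" target instance
    moved = transfer_grasp(src, tgt, np.array([0, 5, 9]))
    print(moved.shape)  # (3, 3)
    ```

    Transferred contacts inherit any source labeling error plus correspondence error, which is why the abstract pairs the transfer with a refinement module that fine-tunes the gripper pose before execution.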

    Reform of Training Ways of Engineering Practice and Innovation Ability for Petroleum Engineering Students

    To improve students’ ability in engineering practice and innovation, the Petroleum Engineering Experiment Teaching Center reformed its training methods, including construction of a two-way practice teaching system and reform of the experimental teaching organization and management mode, the experimental teaching mode, the experimental teaching contents, the experimental teaching assessment methods, and the form and content of college students’ extracurricular scientific activities. These reforms help to create a favorable environment of independent learning and independent experimentation, training students’ engineering and innovation capabilities, and have achieved fruitful results in practice.

    [μ-2,2′-Dimethyl-2,2′-(p-phenylene)dipropyl]bis[chloridobis(2-methyl-2-phenylpropyl)tin(IV)]

    The molecular structure of the title compound, [Sn2(C10H13)4(C14H20)Cl2], is a binuclear centrosymmetric complex, in which the Sn atoms are four-coordinated by three C atoms and one Cl atom in a distorted tetrahedral geometry.