Theoretic Analysis and Extremely Easy Algorithms for Domain Adaptive Feature Learning
Domain adaptation problems arise in a variety of applications, where a
training dataset from the \textit{source} domain and a test dataset from the
\textit{target} domain typically follow different distributions. The primary
difficulty in designing effective learning models to solve such problems lies
in how to bridge the gap between the source and target distributions. In this
paper, we provide a comprehensive analysis of feature learning algorithms used in
conjunction with linear classifiers for domain adaptation. Our analysis shows
that in order to achieve good adaptation performance, the second moments of the
source domain distribution and target domain distribution should be similar.
Based on our new analysis, a novel extremely easy feature learning algorithm
for domain adaptation is proposed. Furthermore, our algorithm is extended by
leveraging multiple layers, leading to a deep linear model. We evaluate the
effectiveness of the proposed algorithms on domain adaptation tasks using the
Amazon review dataset and the spam dataset from the ECML/PKDD 2006
discovery challenge. Comment: ijca
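The abstract's central claim, that the second moments of the source and target feature distributions should be similar, can be illustrated with a CORAL-style whitening-and-recoloring transform. This is a standard moment-matching technique sketched here for illustration only; it is not the paper's own algorithm, and all names are invented:

```python
import numpy as np

def match_second_moments(Xs, Xt, eps=1e-6):
    """Whiten the source features, then re-color them with the target
    covariance so that the second moments of the two domains match."""
    def mat_pow(C, p):
        # Symmetric matrix power via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return (V * np.maximum(w, eps) ** p) @ V.T
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return (Xs - Xs.mean(0)) @ mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5) + Xt.mean(0)

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.5])  # source features
Xt = rng.normal(size=(500, 3)) @ np.diag([1.0, 2.0, 1.5])  # target features
Xs_adapted = match_second_moments(Xs, Xt)
# The adapted source covariance now tracks the target covariance.
print(np.allclose(np.cov(Xs_adapted, rowvar=False),
                  np.cov(Xt, rowvar=False), atol=1e-3))
```

A linear classifier trained on `Xs_adapted` then sees features whose second moments agree with the target domain, which is exactly the condition the analysis above identifies.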
Automation of the design of obstacle limitation surfaces
Air safety is one of the most sensitive issues in the world of aeronautics. Even though the fatality rate per accident is relatively low compared with other means of transport such as the car, the media echo and economic impact of any aviation catastrophe are far higher. For this reason, international and local aviation organizations have created a set of regulations intended to reduce and prevent possible causes of disaster. The core of these regulations is the aeronautical easement: a series of restrictions that protect the airspace around airport facilities and their vicinity by banning the construction or presence of any object that would compromise the safety of aircraft operations. This document describes the development of an application that designs airport easements in accordance with ICAO and BOE regulations and standards, which in turn requires analysing and understanding those regulations. The application is written in the C# programming language, using Microsoft Visual Studio 2012 as the interface for user interaction. Finally, the relevant commands and .dll libraries must be assimilated and implemented in the application's code so that the easements can be represented in the AUTOCAD environment.
Visualizing topological edge states of single and double bilayer Bi supported on multibilayer Bi(111) films
Freestanding single-bilayer Bi(111) is a two-dimensional topological
insulator with edge states propagating along its perimeter. Given the
interlayer coupling observed experimentally, the topological nature of Bi(111)
thin films and the impact of the supporting substrate on the topmost Bi bilayer
are still under debate. Here, combining scanning tunneling microscopy and
first-principles calculations, we systematically study the electronic
properties of Bi(111) thin films grown on a NbSe2 substrate. Two types of
non-magnetic edge structures, i.e., a conventional zigzag edge and a 2x1
reconstructed edge, coexist alternately at the boundaries of single bilayer
islands, the topological edge states of which exhibit remarkably different
energy and spatial distributions. Prominent edge states are persistently
visualized at the edges of both single and double bilayer Bi islands,
regardless of the underlying thickness of Bi(111) thin films. We provide an
explanation for the topological origin of the observed edge states that is
verified with first-principles calculations. Our paper clarifies the
long-standing controversy regarding the topology of Bi(111) thin films and
reveals the tunability of topological edge states via edge modifications. Comment: 36 pages, 10 figures
MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space
As an essential step towards computer creativity, automatic poetry generation
has gained increasing attention in recent years. Though recent neural models make
prominent progress on some criteria of poetry quality, generated poems still
suffer from poor diversity. Studies in the related literature show that
different factors, such as life experience and historical background, influence
the composition styles of poets, which contributes considerably to the high
diversity of human-authored poetry. Inspired by this, we propose
MixPoet, a novel model that absorbs multiple factors to create various styles
and promote diversity. Based on a semi-supervised variational autoencoder, our
model disentangles the latent space into some subspaces, with each conditioned
on one influence factor by adversarial training. In this way, the model learns
a controllable latent variable to capture and mix generalized factor-related
properties. Different factor mixtures lead to diverse styles and hence further
differentiate generated poems from each other. Experiment results on Chinese
poetry demonstrate that MixPoet improves both diversity and quality against
three state-of-the-art models. Comment: 8 pages, 5 figures, published in AAAI 202
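The idea of a latent space disentangled into factor-conditioned subspaces can be sketched as follows. This is a toy illustration with hypothetical factor names; the actual MixPoet model learns these subspaces with a semi-supervised variational autoencoder and adversarial training, none of which is reproduced here:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical influence factors, each owning one slice (subspace) of the
# latent vector, with one Gaussian prior mean per factor class.
subspace_dim = 4
priors = {
    "experience": {"military": rng.normal(size=subspace_dim),
                   "civilian": rng.normal(size=subspace_dim)},
    "era":        {"prosperous": rng.normal(size=subspace_dim),
                   "turbulent":  rng.normal(size=subspace_dim)},
}

def mix_latent(choices, noise=0.1):
    """Assemble one controllable latent variable by sampling around the
    chosen class prior of every factor subspace, then concatenating."""
    parts = [priors[factor][cls] + noise * rng.normal(size=subspace_dim)
             for factor, cls in choices.items()]
    return np.concatenate(parts)

# Different factor mixtures yield different latent codes, hence styles.
z1 = mix_latent({"experience": "military", "era": "turbulent"})
z2 = mix_latent({"experience": "civilian", "era": "turbulent"})
print(z1.shape)  # (8,)
```

Conditioning a decoder on such mixed latents is what lets different factor combinations differentiate the generated poems from each other.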
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
Current captioning approaches tend to generate correct but "generic"
descriptions that lack real-world knowledge, e.g., named entities and
contextual information. Considering that Vision-Language Pre-Training (VLP)
models master massive such knowledge from large-scale web-harvested data, it is
promising to utilize the generalizability of VLP models to incorporate
knowledge into image descriptions. However, using VLP models faces challenges:
zero-shot inference suffers from knowledge hallucination, leading to
low-quality descriptions, while the generic bias introduced by downstream-task
fine-tuning hinders the VLP model from expressing its knowledge. To address these concerns, we
propose a simple yet effective method called Knowledge-guided Replay
(K-Replay), which enables the retention of pre-training knowledge during
fine-tuning. Our approach consists of two parts: (1) a knowledge prediction
task on automatically collected replay exemplars to continuously awaken the VLP
model's memory about knowledge, thus preventing the model from collapsing into
the generic pattern; (2) a knowledge distillation constraint to improve the
faithfulness of generated descriptions, thereby alleviating knowledge
hallucination. To evaluate knowledge-enhanced descriptions, we construct a
novel captioning benchmark KnowCap, containing knowledge of landmarks, famous
brands, special foods and movie characters. Experimental results show that our
approach effectively incorporates knowledge into descriptions, outperforming a
strong VLP baseline by 20.9 points (78.7->99.6) in CIDEr score and 20.5
percentage points (34.0%->54.5%) in knowledge recognition accuracy. Our code
and data are available at https://github.com/njucckevin/KnowCap. Comment: Accepted at ACM Multimedia (ACMMM) 202
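The two-part objective described above (a replay-based knowledge-prediction loss plus a distillation constraint) can be sketched as a combined loss. The weights and function names below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q) along the last axis.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def k_replay_loss(finetune_loss, student_logits, teacher_logits,
                  replay_loss, alpha=1.0, beta=0.5):
    """Total loss = fine-tuning loss
                  + alpha * knowledge-prediction loss on replay exemplars
                  + beta  * distillation KL to the frozen pre-trained model.
    alpha and beta are illustrative weights, not the paper's values."""
    distill = kl_div(softmax(teacher_logits), softmax(student_logits)).mean()
    return finetune_loss + alpha * replay_loss + beta * distill

logits = np.array([[2.0, 0.5, -1.0]])
# With identical student/teacher logits the distillation term vanishes.
print(k_replay_loss(0.8, logits, logits, 0.3))
```

The replay term keeps pre-training knowledge active during fine-tuning, while the distillation term pulls the student's predictions toward the frozen pre-trained teacher.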
The power of question translation training in multilingual reasoning: Broadened scope and deepened insights
Bridging the significant gap between large language models' English and non-English performance presents a great challenge. While some previous studies attempt to mitigate this gap with translated training data, the recently proposed question alignment approach leverages the model's English expertise to improve multilingual performance with minimal use of expensive, error-prone translation. In this paper, we explore how broadly this method can be applied by examining its effects on reasoning with executable code and reasoning with common sense. We also explore how to apply this approach efficiently to extremely large language models using proxy-tuning. Experimental results on the multilingual reasoning benchmarks mGSM, mSVAMP and xCSQA demonstrate that the question alignment approach can boost multilingual performance across diverse reasoning scenarios, model families, and sizes. For instance, when applied to the LLaMA2 models, our method brings an average accuracy improvement of 12.2% on mGSM, even with the 70B model. To understand the mechanism of its success, we analyze the representation space, chain-of-thought outputs and translation data scales, revealing how question translation training strengthens language alignment within LLMs and shapes their working patterns.
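Proxy-tuning, mentioned above as the route to extremely large models, steers a large base model at decoding time by adding the logit offset between a small tuned expert and its untuned counterpart. A minimal sketch with toy logits (the vocabulary and all numbers are invented):

```python
import numpy as np

def proxy_tuned_logits(base, expert, anti_expert):
    """Proxy-tuning, sketched: shift the large base model's next-token
    logits by the difference between a small tuned expert and its untuned
    counterpart, so the base model inherits the tuning without training."""
    return base + (expert - anti_expert)

# Toy next-token logits over a 4-token vocabulary (numbers are invented).
base        = np.array([2.0, 1.0, 0.5, 0.0])  # large base model
anti_expert = np.array([1.5, 1.2, 0.4, 0.1])  # small untuned model
expert      = np.array([0.5, 3.0, 0.4, 0.1])  # small question-aligned model

steered = proxy_tuned_logits(base, expert, anti_expert)
print(int(steered.argmax()))  # 1: the expert's preferred token now wins
```

Only the two small models need training, which is why the approach scales to a 70B base model at negligible tuning cost.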
A fundamental diagram based interpretable framework for traffic flow estimation and prediction by combining a Markovian model with deep learning
UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning
In recent times, there has been a growing interest in developing effective
perception techniques for combining information from multiple modalities. This
involves aligning features obtained from diverse sources to enable more
efficient training with larger datasets and constraints, as well as leveraging
the wealth of information contained in each modality. 2D and 3D Human Pose
Estimation (HPE) are two critical perceptual tasks in computer vision, which
have numerous downstream applications, such as action recognition,
human-computer interaction, and object tracking. Yet, the correlation between
images and 2D/3D human poses has rarely been explicitly studied using a
contrastive paradigm. In this paper, we propose
UniHPE, a unified Human Pose Estimation pipeline, which aligns features from
all three modalities, i.e., 2D human pose estimation, lifting-based and
image-based 3D human pose estimation, in the same pipeline. To align more than
two modalities at the same time, we propose a novel singular value based
contrastive learning loss, which better aligns different modalities and further
boosts the performance. In our evaluation, UniHPE achieves remarkable
performance metrics: MPJPE mm on the Human3.6M dataset and PAMPJPE
mm on the 3DPW dataset. Our proposed method holds immense potential to
advance the field of computer vision and contribute to various applications.
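The abstract does not spell out the singular-value-based loss, so the sketch below shows only the standard symmetric InfoNCE alignment between two paired modality embeddings that such a loss would extend; treat it as background, not UniHPE's actual objective:

```python
import numpy as np

def info_nce(a, b, temp=0.1):
    """Symmetric InfoNCE over two sets of paired embeddings (a[i] matches
    b[i]). A standard cross-modal alignment loss; UniHPE's singular-value
    variant is not reproduced here."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temp                       # pairwise similarities
    def nll_of_diagonal(l):
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(log_p).mean()
    return 0.5 * (nll_of_diagonal(logits) + nll_of_diagonal(logits.T))

rng = np.random.default_rng(2)
img  = rng.normal(size=(8, 16))                   # e.g. image features
pose = img + 0.01 * rng.normal(size=(8, 16))      # well-aligned pose features
rand = rng.normal(size=(8, 16))                   # unrelated features
print(info_nce(img, pose) < info_nce(img, rand))  # aligned pairs score lower
```

Minimizing such a loss pulls matched image/2D-pose/3D-pose embeddings together in the shared space, which is the alignment the pipeline above relies on.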
Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation
Learning-based methods have dominated the 3D human pose estimation (HPE)
tasks with significantly better performance in most benchmarks than traditional
optimization-based methods. Nonetheless, 3D HPE in the wild is still the
biggest challenge of learning-based models, whether with 2D-3D lifting,
image-to-3D, or diffusion-based methods, since the trained networks implicitly
learn camera intrinsic parameters and domain-specific 3D human pose
distributions and estimate poses as a statistical average. On the other hand, the
optimization-based methods estimate results case-by-case, which can predict
more diverse and sophisticated human poses in the wild. By combining the
advantages of optimization-based and learning-based methods, we propose the
Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE to solve the
problem of cross-domain and in-the-wild 3D HPE. Our multi-hypothesis ZeDO
achieves state-of-the-art (SOTA) performance on Human3.6M as minMPJPE mm
without training with any 2D-3D or image-3D pairs. Moreover, our
single-hypothesis ZeDO achieves SOTA performance on 3DPW dataset with PA-MPJPE
mm on cross-dataset evaluation, which even outperforms learning-based
methods trained on 3DPW.
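The optimization-based half of such a pipeline can be sketched as gradient descent on the 2D reprojection error of a 3D pose under a simple orthographic camera. ZeDO's diffusion prior and multi-hypothesis machinery are omitted; everything below is an illustrative assumption:

```python
import numpy as np

def refine_pose(pose3d_init, kp2d, steps=200, lr=0.1):
    """Optimization-based refinement, sketched: gradient descent on the 2D
    reprojection error of a 3D pose under an orthographic camera (projection
    simply drops the z coordinate). ZeDO additionally alternates such updates
    with a diffusion-model prior, which is omitted here."""
    pose = pose3d_init.copy()
    for _ in range(steps):
        residual = pose[:, :2] - kp2d      # per-joint 2D reprojection error
        grad = np.zeros_like(pose)
        grad[:, :2] = 2.0 * residual       # gradient of the squared error
        pose -= lr * grad
    return pose

rng = np.random.default_rng(3)
gt = rng.normal(size=(17, 3))                    # ground-truth 3D joints
kp2d = gt[:, :2]                                 # observed 2D keypoints
init = gt + 0.5 * rng.normal(size=(17, 3))       # noisy initial 3D estimate
refined = refine_pose(init, kp2d)
print(np.abs(refined[:, :2] - kp2d).max() < 1e-3)  # x, y fit the 2D evidence
```

In this toy the depth coordinate receives no gradient, which is precisely the ambiguity a learned prior (ZeDO's diffusion model) is needed to resolve.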