The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification

Author: Peng Yuxin
Xiao Tianjun
Xu Yichong
Yang Kuiyuan
Zhang Jiaxing
Zhang Zheng
Publication date: 24/11/2014

Fine-grained classification is challenging because categories can only be discriminated by subtle and local differences. Variance in pose, scale, or rotation usually makes the problem more difficult. Most fine-grained classification systems follow the pipeline of finding the foreground object or object parts (where) to extract discriminative features (what). In this paper, we propose to apply visual attention to the fine-grained classification task using a deep neural network. Our pipeline integrates three types of attention: the bottom-up attention that proposes candidate patches, the object-level top-down attention that selects patches relevant to a certain object, and the part-level top-down attention that localizes discriminative parts. We combine these attentions to train domain-specific deep nets, then use them to improve both the what and where aspects. Importantly, we avoid using expensive annotations like bounding boxes or part information from end to end. The weak supervision constraint makes our work easier to generalize. We have verified the effectiveness of the method on subsets of the ILSVRC2012 dataset and the CUB200_2011 dataset. Our pipeline delivered significant improvements and achieved the best accuracy under the weakest supervision condition. The performance is competitive against other methods that rely on additional annotations.

arXiv.org e-Print Archive

Crossref

Attentional Neural Network: Feature Selection Using Cognitive Feedback

Author: Song Sen
Wang Qian
Zhang Jiaxing
Zhang Zheng
Publication date: 19/11/2014

Attentional Neural Network is a new framework that integrates top-down cognitive bias and bottom-up feature extraction in one coherent architecture. The top-down influence is especially effective when dealing with high noise or difficult segmentation problems. Our system is modular and extensible. It is also easy to train and cheap to run, and yet can accommodate complex behaviors. We obtain classification accuracy better than or competitive with state-of-the-art results on the MNIST variation dataset, and disentangle overlaid digits with high success rates. We view such a general-purpose framework as an essential foundation for a larger system emulating the cognitive abilities of the whole brain. Comment: Poster in Neural Information Processing Systems (NIPS) 201

arXiv.org e-Print Archive

CiteSeerX

Spectral Unsupervised Domain Adaptation for Visual Recognition

Author: Huang Jiaxing
Lu Shijian
Zhang Jingyi
Publication date: 10/06/2021

Unsupervised domain adaptation (UDA) aims to learn a well-performing model in an unlabeled target domain by leveraging labeled data from one or multiple related source domains. It remains a great challenge due to 1) the lack of annotations in the target domain and 2) the rich discrepancy between the distributions of source and target data. We propose Spectral UDA (SUDA), an efficient yet effective UDA technique that works in the spectral space and is generic across different visual recognition tasks in detection, classification and segmentation. SUDA addresses UDA challenges from two perspectives. First, it mitigates inter-domain discrepancies by a spectrum transformer (ST) that maps source and target images into spectral space and learns to enhance domain-invariant spectra while simultaneously suppressing domain-variant spectra. To this end, we design a novel adversarial multi-head spectrum attention that leverages contextual information to identify domain-variant and domain-invariant spectra effectively. Second, it mitigates the lack of annotations in the target domain by introducing multi-view spectral learning, which aims to learn comprehensive yet confident target representations by maximizing the mutual information among multiple ST augmentations capturing different spectral views of each target sample. Extensive experiments over different visual tasks (e.g., detection, classification and segmentation) show that SUDA achieves superior accuracy and that it is complementary to state-of-the-art UDA methods, giving consistent performance boosts with little extra computation.
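
The idea of enhancing some spectra while suppressing others can be illustrated with a minimal sketch. It is not the SUDA implementation: it uses a 1-D signal instead of an image, a naive discrete Fourier transform, and hand-picked per-frequency weights where the abstract describes weights learned by spectrum attention. The function names `dft`, `idft`, and `reweight_spectrum` are hypothetical.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real-valued 1-D signal."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def reweight_spectrum(signal, weights):
    """Scale each frequency component by a weight, then map back.

    ASSUMPTION: `weights` are fixed by hand here. A weight above 1 enhances a
    frequency band and a weight below 1 suppresses it.
    """
    spectrum = dft(signal)
    weighted = [w * c for w, c in zip(weights, spectrum)]
    return idft(weighted)


signal = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]

# Keeping every frequency unchanged reconstructs the input.
unchanged = reweight_spectrum(signal, [1.0] * len(signal))

# Keeping only the zero-frequency component leaves the signal mean.
dc_only = reweight_spectrum(signal, [1.0] + [0.0] * (len(signal) - 1))
```

For a real-valued output, the weights should be symmetric (the weight at index k equal to the weight at index n - k), as they are in both calls above.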

arXiv.org e-Print Archive

Facile Preparation of Bimetallic MOF-derived Supported Tungstophosphoric Acid Composites for Biodiesel Production

Author: Jin Jiaxing
Luo Linmin
Wu Yaping
Zhang Qiuyun
Zhang Yutao
Publication venue: Periodica Polytechnica Budapest University of Technology and Economics
Publication date: 28/08/2023

In this work, the novel TPA@C-NiZr-MOF catalyst was synthesized by the impregnation of tungstophosphoric acid (TPA) on a NiZr-based metal-organic framework (NiZr-MOF) followed by calcination at up to 300 °C. The as-prepared catalyst materials were structurally, morphologically, and texturally characterized by XRD, FTIR, temperature-programmed desorption of NH3 (TPD-NH3), N2 physisorption, SEM, TEM, and XPS. The prepared catalyst can be used as an efficient heterogeneous catalyst for biodiesel production from oleic acid (OA) with methanol. The results indicated that, in comparison to TPA@NiZr-MOF, the TPA@C-NiZr-MOF catalyst calcined at 300 °C exhibits excellent catalytic performance, probably owing to the synergistic effect between TPA and the metal oxide skeletons, high acidity, and larger surface area and pore size. Additionally, the TPA@C-NiZr-MOF catalyst can be reused for up to six cycles with an acceptable conversion. This study showed that bimetallic MOF-derived composite materials can be used as a potential alternative heterogeneous catalyst for biorefinery applications.

Periodica Polytechnica (Budapest University of Technology and Economics)

Vision-Language Models for Vision Tasks: A Survey

Author: Huang Jiaxing
Jin Sheng
Lu Shijian
Zhang Jingyi
Publication date: 16/02/2024

Most visual recognition studies rely heavily on crowd-labelled data in deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consuming visual recognition paradigm. To address the two challenges, Vision-Language Models (VLMs) have been intensively investigated recently. VLMs learn rich vision-language correlation from web-scale image-text pairs that are almost infinitely available on the Internet, and enable zero-shot predictions on various visual recognition tasks with a single VLM. This paper provides a systematic review of vision-language models for various visual recognition tasks, including: (1) the background that introduces the development of visual recognition paradigms; (2) the foundations of VLM that summarize the widely-adopted network architectures, pre-training objectives, and downstream tasks; (3) the widely-adopted datasets in VLM pre-training and evaluations; (4) the review and categorization of existing VLM pre-training methods, VLM transfer learning methods, and VLM knowledge distillation methods; (5) the benchmarking, analysis and discussion of the reviewed methods; (6) several research challenges and potential research directions that could be pursued in future VLM studies for visual recognition. A project associated with this survey has been created at https://github.com/jingyi0000/VLM_survey
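
Zero-shot prediction with a VLM is commonly done by comparing an image embedding against text embeddings of candidate class prompts. The sketch below illustrates that general recipe only and is not taken from the survey. The embeddings are toy, hand-made vectors standing in for the outputs of real image and text encoders, and the function names and the `temperature` value are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_emb, text_embs, temperature=0.07):
    """Pick the class prompt whose text embedding best matches the image.

    ASSUMPTION: `text_embs` maps each prompt to an embedding that a text
    encoder would produce; no model is loaded or trained here.
    """
    labels = list(text_embs)
    logits = [cosine(image_emb, text_embs[label]) / temperature for label in labels]
    # Numerically stable softmax over the scaled similarities.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Toy embeddings (hypothetical values, chosen by hand for illustration).
text_embs = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.2, 0.1]

label, probs = zero_shot_classify(image_emb, text_embs)
print(label)
```

Because the classes are defined only by their text prompts, adding a new class in this sketch means adding one more prompt embedding, with no retraining step.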

arXiv.org e-Print Archive

The Validity of CET-6 among Chinese Students Studying Overseas

Author: Bo Zhang
Cunyi Liu
Huilin Qi
Jiaxing Xiong
Publication venue: Scitech Research Organisation
Publication date: 18/12/2019

This paper focuses on the validity of the College English Test Band 6 (CET-6) in the overseas life of Chinese students, to find out whether CET-6 scores truly reflect students’ English language ability and whether the scores can be used as proof of English language proficiency. We conducted the survey using quantitative research methods with 50 samples at Universiti Putra Malaysia (UPM). After the collection and analysis of data, some current issues with the assessment standards of CET-6 were found, and suggestions are given to improve the validity of CET-6.

Scitech Research Journals