
11 research outputs found

SCA-PVNet: Self-and-Cross Attention Based Aggregation of Point Cloud and Multi-View for 3D Object Retrieval

Author: Cheng Yi
Guo Aiyuan
Li Yiqun
Lin Dongyun
Mao Shangbo
Publication date: 20/07/2023

To address 3D object retrieval, substantial efforts have been made to generate highly discriminative descriptors of 3D objects represented by a single modality, e.g., voxels, point clouds or multi-view images. It is promising to leverage the complementary information from multi-modality representations of 3D objects to further improve retrieval performance. However, multi-modality 3D object retrieval has rarely been developed and analyzed on large-scale datasets. In this paper, we propose self-and-cross attention based aggregation of point cloud and multi-view images (SCA-PVNet) for 3D object retrieval. With deep features extracted from point clouds and multi-view images, we design two types of feature aggregation modules, namely the In-Modality Aggregation Module (IMAM) and the Cross-Modality Aggregation Module (CMAM), for effective feature fusion. IMAM leverages a self-attention mechanism to aggregate multi-view features, while CMAM exploits a cross-attention mechanism to let point cloud features interact with multi-view features. The final descriptor of a 3D object for retrieval is obtained by concatenating the aggregated features from both modules. Extensive experiments and analysis are conducted on three datasets, ranging from small to large scale, to show the superiority of the proposed SCA-PVNet over state-of-the-art methods.

arXiv.org e-Print Archive

Attire detection and retrieval based on region proposals with convolutional neural network

Author: Mao Shangbo
Publication date: 01/01/2017

Region Proposals with Convolutional Neural Network Features (RCNN), an object detection algorithm, performs well on the Visual Object Classes Challenge 2012 [1]. There are two main approaches to improving its performance. The first is to apply a high-capacity Convolutional Neural Network (CNN) with region proposals to localize and segment the object. The other is to perform supervised pre-training when the labelled data is insufficient. The goal of this project is to build an attire detection system using Region Proposals with Convolutional Neural Network Features. In order to study RCNN, we introduce some concepts related to it. We explain the definitions of object detection, Neural Network (NN) and Convolutional Neural Network (CNN) in detail. The description of RCNN contains two parts. The first part is the method of region proposal, and the second part is the CNN architecture. Then we describe the attire detection system and the process of dataset construction in detail. Finally, we summarize and discuss the testing results. The testing results show that RCNN performs well on attire object detection. The mean average precision (mAP) over all categories is 57.26%. Based on the testing results, we find that the quality and amount of training data have a great effect on the performance of the attire detection system. Master of Science (Signal Processing)

DR-NTU (Digital Repository of NTU)

Missing Data Estimation for Traffic Volume by Searching an Optimum Closed Cut in Urban Networks

Author: Guoqiang Mao
Shangbo Wang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Deep residual pooling network for texture recognition

Author: Chia Liang Tien
Mao Shangbo
Rajan Deepu
Publication date: 01/01/2021

Current deep learning-based texture recognition methods extract spatial orderless features from pre-trained deep learning models that are trained on large-scale image datasets. These methods either produce high dimensional features or have multiple steps like dictionary learning, feature encoding and dimension reduction. In this paper, we propose a novel end-to-end learning framework that not only overcomes these limitations, but also demonstrates faster learning. The proposed framework incorporates a residual pooling layer consisting of a residual encoding module and an aggregation module. The residual encoder preserves the spatial information for improved feature learning, and the aggregation module generates an orderless feature for classification through simple averaging. The feature has the lowest dimension among previous deep texture recognition approaches, yet it achieves state-of-the-art performance on benchmark texture recognition datasets such as FMD, DTD, 4D Light and one industry dataset used for metal surface anomaly detection. Additionally, the proposed method obtains comparable results on the MIT-Indoor scene recognition dataset. Our codes are available at https://github.com/maoshangbo/DRP-Texture-Recognition. This work was conducted within the Rolls-Royce@NTU Corporate Lab under the project DACS 2.1: Artificial Intelligence (AI) for Smart Image Understanding with support from the Industry Alignment Fund (IAF) Singapore under the Corp Lab@University Scheme.

DR-NTU (Digital Repository of NTU)

Unsupervised feature learning with sparse Bayesian auto-encoding based extreme learning machine

Author: Cui Dongshun
Huang Guang-Bin
Mao Shangbo
Zhang Guanghao
Publication date: 01/01/2020

Extreme learning machine (ELM) is a popular method in machine learning with extremely few parameters, fast learning speed and model efficiency. Unsupervised feature learning based on ELM is receiving growing research attention. Recently the ELM auto-encoder (ELM-AE) was proposed for this task, which develops ELM-based compact feature learning without sacrificing its elegant solution. In contrast to ELM-AE and the subsequent ℓ1-regularized ELM-AE, we introduce a sparse Bayesian learning scheme into ELM-AE for better generalization capability. A parallel training strategy is also integrated to improve the time-efficiency of multi-output sparse Bayesian learning. Furthermore, hidden nodes are pruned for better performance and efficiency according to the estimated variances of the prior distribution of the output weights. Experiments on several datasets verify the effectiveness and efficiency of our proposed ELM-AE for unsupervised feature learning, compared with PCA, NMF, ELM-AE and ℓ1-regularized ELM-AE.

DR-NTU (Digital Repository of NTU)

Towards Enhanced Recovery and System Stability: Analytical Solutions for Dynamic Incident Effects in Road Networks

Author: Changle Li
Guoqiang Mao
Shangbo Wang
Wenwei Yue
Zhigang Xu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Crossref

R-ELMNet: regularized extreme learning machine network

Author: Cui Dongshun
Huang Guang-Bin
Li Yue
Mao Shangbo
Zhang Guanghao
Publication date: 01/01/2020

Principal component analysis network (PCANet), an unsupervised shallow network, is notably effective on datasets of various sizes. It applies a two-layer convolution with PCA as the filter learning method, followed by a block-wise histogram post-processing stage. Following the structure of PCANet, extreme learning machine auto-encoder (ELM-AE) variants have been employed in place of PCA, giving rise to the extreme learning machine network (ELMNet) and the hierarchical ELMNet. ELMNet emphasizes the importance of orthogonal projection while overlooking non-linearity. Hierarchical ELMNet introduces complex pre-processing to overcome the drawback of non-linear ELM-AE. In this paper, we analyze the intrinsic characteristics of ELM-AE variants and accordingly propose a regularized ELM-AE, which combines non-linearity learning capability with approximately orthogonal projection. Experiments on image classification show its effectiveness compared to supervised convolutional neural networks and related shallow networks on unsupervised feature learning.

DR-NTU (Digital Repository of NTU)

R-ELMNet: Regularized extreme learning machine network

Author: Dongshun Cui
Guang-Bin Huang
Guanghao Zhang
Shangbo Mao
Yue Li
Publication venue: 'Elsevier BV'

Crossref