178 research outputs found

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

Authors: Lu Hanqing, Wang Jinqiao, Wu Yi, Zhao Chaoyang, Zhao Xu, Zhu Yousong
Publication date: 09/08/2017

Region-based Convolutional Neural Network (CNN) detectors such as Faster R-CNN and R-FCN have shown promising results for object detection by combining a region proposal subnetwork with a classification subnetwork. Although R-FCN achieves higher detection speed while maintaining detection performance, its position-sensitive score maps ignore global structure information. To fully explore the local and global properties, in this paper, we propose a novel fully convolutional network, named CoupleNet, to couple the global structure with local parts for object detection. Specifically, the object proposals obtained by the Region Proposal Network (RPN) are fed into the coupling module, which consists of two branches. One branch adopts position-sensitive RoI (PSRoI) pooling to capture the local part information of the object, while the other employs RoI pooling to encode the global and context information. Next, we design different coupling strategies and normalization methods to make full use of the complementary advantages of the global and local branches. Extensive experiments demonstrate the effectiveness of our approach. We achieve state-of-the-art results on all three challenging datasets, i.e., a mAP of 82.7% on VOC07, 80.4% on VOC12, and 34.4% on COCO. Code will be made publicly available. Comment: Accepted by ICCV 201
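The coupling step described in this abstract can be pictured with a small sketch. It assumes each branch yields one score per class for a proposal, that scores are L2-normalized before being combined, and that the combination is an element-wise sum, product, or maximum. The function names, the choice of L2 normalization, and the example numbers are illustrative assumptions, not the authors' implementation.

```python
import math


def l2_normalize(scores):
    """Scale a score vector to unit L2 norm (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm > 0 else list(scores)


def couple(local_scores, global_scores, strategy="sum"):
    """Combine per-class scores from a local branch and a global branch.

    The strategy names are hypothetical labels for three simple
    element-wise combinations.
    """
    local_n = l2_normalize(local_scores)
    global_n = l2_normalize(global_scores)
    if strategy == "sum":
        return [a + b for a, b in zip(local_n, global_n)]
    if strategy == "product":
        return [a * b for a, b in zip(local_n, global_n)]
    if strategy == "max":
        return [max(a, b) for a, b in zip(local_n, global_n)]
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up scores for a two-class problem.
local_scores = [3.0, 4.0]    # e.g. from the PSRoI-pooling branch
global_scores = [0.0, 2.0]   # e.g. from the RoI-pooling branch
coupled_sum = couple(local_scores, global_scores, "sum")
coupled_max = couple(local_scores, global_scores, "max")
```

Normalizing first keeps one branch from dominating the other purely because its raw scores have a larger magnitude.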

arXiv.org e-Print Archive

IUPUIScholarWorks

The design, education and evolution of a robotic baby

Author: Zhu Hanqing
Publication venue: Georgia Institute of Technology
Publication date: 10/01/2023

Inspired by Alan Turing’s idea of a child machine, I introduce the formal definition of a robotic baby: an integrated system with minimal world knowledge at birth, capable of learning incrementally and interactively, and of adapting to the world. Within the definition, fundamental capabilities and system characteristics of the robotic baby are identified and presented as the system-level requirements. As a minimum viable prototype, the Baby architecture is proposed with a systems engineering design approach to satisfy the system-level requirements, and it has been verified and validated with simulations and experiments on a robotic system. The capabilities of the robotic baby are demonstrated in natural language acquisition and semantic parsing in English and Chinese, as well as in natural language grounding, natural language reinforcement learning, natural language programming, and system introspection for explainability. Furthermore, the education and evolution of the robotic baby are illustrated with real-world robotic demonstrations. Inspired by genetic inheritance in human beings, knowledge inheritance in robotic babies and its benefits for evolution are discussed. Ph.D

Scholarly Materials And Research @ Georgia Tech

Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers

Authors: Gu Jiaqi, Jiang Zixuan, Pan David Z., Zhu Hanqing
Publication date: 24/05/2023

Transformers have achieved great success in machine learning applications. Normalization techniques, such as Layer Normalization (LayerNorm, LN) and Root Mean Square Normalization (RMSNorm), play a critical role in accelerating and stabilizing the training of Transformers. While LayerNorm recenters and rescales input vectors, RMSNorm only rescales the vectors by their RMS value. Despite being more computationally efficient, RMSNorm may compromise the representation ability of Transformers. There is currently no consensus regarding the preferred normalization technique, as some models employ LayerNorm while others utilize RMSNorm, especially in recent large language models. It is challenging to convert Transformers with one normalization to the other type. While there is an ongoing disagreement between the two normalization types, we propose a solution to unify two mainstream Transformer architectures, Pre-LN and Pre-RMSNorm Transformers. By removing the inherent redundant mean information in the main branch of Pre-LN Transformers, we can reduce LayerNorm to RMSNorm, achieving higher efficiency. We further propose the Compressed RMSNorm (CRMSNorm) and Pre-CRMSNorm Transformer based on a lossless compression of the zero-mean vectors. We formally establish the equivalence of Pre-LN, Pre-RMSNorm, and Pre-CRMSNorm Transformer variants in both training and inference. This implies that Pre-LN Transformers can be substituted with Pre-(C)RMSNorm counterparts at almost no cost, offering the same arithmetic functionality along with a free efficiency improvement. Experiments demonstrate that we can reduce the training and inference time of Pre-LN Transformers by up to 10%. Comment: 15 pages, 5 tables, code available at https://github.com/ZixuanJiang/pre-rmsnorm-transforme
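The reduction from LayerNorm to RMSNorm mentioned in this abstract rests on a simple identity: once a vector has zero mean, its variance equals its mean square, so LayerNorm and RMSNorm give the same result. The sketch below checks this numerically in plain Python. It omits the learnable gain and bias, and the function names and the `eps` value are assumptions made for illustration, not code from the authors' repository.

```python
import math


def layer_norm(x, eps=1e-6):
    """LayerNorm without gain/bias: recenter, then rescale by the std."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [(v - mu) / math.sqrt(var + eps) for v in x]


def rms_norm(x, eps=1e-6):
    """RMSNorm without gain: rescale by the root mean square only."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]


def recenter(x):
    """Subtract the mean so the vector has zero mean."""
    mu = sum(x) / len(x)
    return [v - mu for v in x]


x = [1.0, 2.0, 4.0, 7.0]
ln_out = layer_norm(x)
rms_out = rms_norm(recenter(x))

# Largest element-wise difference between the two outputs.
max_gap = max(abs(a - b) for a, b in zip(ln_out, rms_out))
```

On a vector that is not zero-mean the two normalizations differ, which is why the recentering has to be accounted for elsewhere in the network before LayerNorm can be replaced.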

arXiv.org e-Print Archive

Feature Distilled Tracking

Authors: Lu Hanqing, Wang Jinqiao, Wang Peisong, Wu Yi, Zhu Guibo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2017

Feature extraction and representation is one of the most important components for fast, accurate, and robust visual tracking. Very deep convolutional neural networks (CNNs) provide effective tools for feature extraction with good generalization ability. However, extracting features with very deep CNN models requires high-performance hardware because of their high computational complexity, which limits their use in real-time applications. To alleviate this problem, we aim to obtain small, fast-to-execute shallow models for visual tracking through model compression. Specifically, we propose a small feature distilled network (FDN) for tracking by imitating the intermediate representations of a much deeper network. The FDN extracts rich visual features at higher speed than the original deeper network. To speed it up further, we introduce a shift-and-stitch method that reduces the arithmetic operations while keeping the spatial resolution of the distilled feature maps unchanged. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle scale variation of the target. Comprehensive experimental results on object tracking benchmark datasets show that the proposed approach achieves a 5x speed-up with performance competitive with state-of-the-art deep trackers.
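Imitating the intermediate representations of a deeper network is commonly expressed as a regression loss between student and teacher features. The sketch below uses a mean squared error over flattened feature vectors. The loss choice, the function name, and the example values are assumptions for illustration; the abstract does not state which objective the FDN is trained with.

```python
def distillation_loss(student_feat, teacher_feat):
    """Mean squared error between student and teacher feature vectors."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


# Made-up flattened feature maps from a shallow student and a deep teacher.
student = [0.5, 1.0, 1.5]
teacher = [1.0, 1.0, 1.0]
loss = distillation_loss(student, teacher)
```

Minimizing such a loss pushes the shallow network's features toward the deeper network's, so the tracker built on top can keep using similar features at lower cost.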

IUPUIScholarWorks

Improving the indoor thermal environment with ceiling radiant terminals

Authors: Hu Mingle, Huang Linsheng, Liu Zehua, Wang Hanqing, Zhu Hui
Publication venue: SINTEF Academic Press
Publication date: 01/01/2021

A CFD (Computational Fluid Dynamics) simulation model of the porous ceiling radiant air-conditioning system was established to study the influence of the ceiling temperature and envelope temperature (including the temperature of the walls and the floor of a room) on the thermal environment in a room equipped with such a system. The results showed that, for the summer condition, higher ceiling temperatures would result in a higher indoor air temperature and a higher Predicted Percentage Dissatisfied (PPD), which meant potential discomfort for occupants of the room. For the winter condition, however, a higher ceiling temperature up to 28°C would result in a lower PPD, thus improving thermal comfort. Considering energy conservation, thermal comfort could be assured if the ceiling temperature was not more than 28°C. As for the effect of envelope temperature, the results showed that an increase in the envelope temperature during summer could result in a higher indoor air temperature, but the thermal comfort of occupants could still be ensured under such conditions. Considering both thermal comfort and energy conservation, a ceiling temperature of 18°C (underside surface temperature of the ceiling) and an envelope temperature between 26°C and 32°C proved appropriate for the summer. Similarly, based on the simulation results, a ceiling temperature of 26°C and an envelope temperature between 8°C and 11°C were found appropriate for the winter. The results indicated that, for the porous ceiling radiant air-conditioning system, the ceiling temperature should be controlled to increase the ratio of radiant heat transfer in the summer, and the envelope temperature should be lowered to improve the energy conservation of the system. In the winter, heat transfer by radiation from the porous ceiling would account for a larger ratio; therefore, the system showed good heating capacity and energy-conservation performance in winter.
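For readers unfamiliar with PPD, the index is conventionally derived from the Predicted Mean Vote (PMV) using the relation given in ISO 7730. The sketch below implements that standard relation. Whether the study computed PPD exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def ppd_from_pmv(pmv):
    """Predicted Percentage Dissatisfied (%) from Predicted Mean Vote.

    Standard ISO 7730 relation; PMV runs from -3 (cold) to +3 (hot),
    with 0 meaning thermally neutral.
    """
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv ** 4 - 0.2179 * pmv ** 2)


ppd_neutral = ppd_from_pmv(0.0)        # minimum of the curve, 5 %
ppd_slightly_warm = ppd_from_pmv(1.0)
ppd_slightly_cool = ppd_from_pmv(-1.0)
```

The curve is symmetric around PMV = 0 and never drops below 5%, so a lower PPD means the simulated conditions are closer to thermal neutrality.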

SINTEF Open

DREAMPlaceFPGA-MP: An Open-Source GPU-Accelerated Macro Placer for Modern FPGAs with Cascade Shapes and Region Constraints

Authors: Jiang Zhixing, Pan David Z., Rajarathnam Rachel Selina, Xiong Zhili, Zhu Hanqing
Publication date: 14/11/2023

FPGA macro placement plays a pivotal role in routability and timing closure in the modern FPGA physical design flow. In modern FPGAs, macros can be subject to complex cascade shape constraints that require instances to be placed in consecutive sites. In addition, in real-world FPGA macro placement scenarios, designs can have various region constraints that specify boundaries within which certain design instances and macros must be placed. In this work, we present DREAMPlaceFPGA-MP, an open-source GPU-accelerated FPGA macro placer that efficiently generates legal placements for macros while honoring cascade shape requirements and region constraints. By treating multiple macros in a cascade shape as a single large instance and restricting instances to their respective regions, DREAMPlaceFPGA-MP obtains roughly legal placements. The macros are then legalized in multiple steps to efficiently handle cascade shapes and region constraints. Our experimental results demonstrate that DREAMPlaceFPGA-MP is among the top contestants of the MLCAD 2023 FPGA Macro-Placement Contest.
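The idea of treating a cascade as one large instance confined to a region can be pictured with a toy sketch. It assumes a vertical cascade of equally wide macros stacked in consecutive sites of one column and a rectangular region constraint. The `Region` class, the function names, and the coordinates are all hypothetical and are not taken from DREAMPlaceFPGA-MP.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Axis-aligned rectangular region constraint (hypothetical)."""
    x_lo: float
    y_lo: float
    x_hi: float
    y_hi: float


def merge_cascade(width, macro_heights):
    """Treat a column of stacked macros as one instance: (width, total height)."""
    return width, sum(macro_heights)


def fits_in_region(region, x, y, width, height):
    """True if an instance with lower-left corner (x, y) lies inside the region."""
    return (region.x_lo <= x and x + width <= region.x_hi
            and region.y_lo <= y and y + height <= region.y_hi)


def expand_cascade(x, y, macro_heights):
    """Recover each macro's position once the merged instance is placed."""
    positions = []
    for h in macro_heights:
        positions.append((x, y))
        y += h  # the next macro sits directly on top of this one
    return positions


heights = [2, 2, 2]                  # three macros in one cascade
width, total_height = merge_cascade(1, heights)
region = Region(0, 0, 4, 10)

ok_low = fits_in_region(region, 1, 3, width, total_height)
ok_high = fits_in_region(region, 1, 5, width, total_height)
sites = expand_cascade(1, 3, heights)
```

Placing the merged instance guarantees that the individual macros end up in consecutive sites, so only the single bounding shape has to be checked against the region.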

arXiv.org e-Print Archive