98 research outputs found
AutoSTL: Automated Spatio-Temporal Multi-Task Learning
Spatio-temporal prediction plays a critical role in smart city construction. Jointly modeling multiple spatio-temporal tasks can further promote intelligent city life by integrating their inseparable relationships. However, existing studies fail to address this joint learning problem well: they generally solve tasks individually or with a fixed task combination. The challenges lie in the tangled relations between different properties, the demand for supporting flexible combinations of tasks, and the complex spatio-temporal dependencies. To cope with these problems, we propose an Automated Spatio-Temporal multi-task Learning (AutoSTL) method to handle multiple spatio-temporal tasks jointly. First, we propose a scalable architecture consisting of advanced spatio-temporal operations to exploit the complicated dependencies. Shared modules and a feature fusion mechanism are incorporated to further capture the intrinsic relationships between tasks. Furthermore, our model automatically allocates the operations and fusion weights. Extensive experiments on benchmark datasets verify that our model achieves state-of-the-art performance. To the best of our knowledge, AutoSTL is the first automated spatio-temporal multi-task learning method.
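The automatic allocation of operations and fusion weights that the abstract describes can be pictured as a softmax-weighted mixture over candidate operations. The sketch below is a hypothetical illustration, not the paper's code: `mix_operations`, the toy operations, and the fixed `arch_logits` are all assumptions standing in for learned spatio-temporal modules and learned architecture weights.

```python
import numpy as np

# Hypothetical sketch: each candidate operation produces a feature map,
# and architecture logits (learned in the real method, fixed here for
# illustration) decide via softmax how much each contributes.

def mix_operations(x, ops, arch_logits):
    """Blend candidate operation outputs with softmax fusion weights."""
    weights = np.exp(arch_logits - arch_logits.max())
    weights /= weights.sum()
    return sum(w * op(x) for w, op in zip(weights, ops))

# Toy operations standing in for spatio-temporal modules.
ops = [lambda x: x, lambda x: 2 * x, lambda x: x ** 2]
x = np.array([1.0, 2.0])
arch_logits = np.array([0.0, 0.0, 0.0])  # equal logits -> simple average
out = mix_operations(x, ops, arch_logits)
```

With equal logits the mixture reduces to the plain average of the three operation outputs; during search, the logits would be trained so the mixture concentrates on the best operation per shared module.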
Distributed Dynamic Map Fusion via Federated Learning for Intelligent Networked Vehicles
The technology of dynamic map fusion among networked vehicles has been
developed to enlarge sensing ranges and improve sensing accuracies for
individual vehicles. This paper proposes a federated learning (FL) based
dynamic map fusion framework to achieve high map quality despite unknown
numbers of objects in fields of view (FoVs), various sensing and model
uncertainties, and missing data labels for online learning. The novelty of this
work is threefold: (1) developing a three-stage fusion scheme to predict the
number of objects effectively and to fuse multiple local maps with fidelity
scores; (2) developing an FL algorithm which fine-tunes feature models (i.e.,
representation learning networks for feature extraction) distributively by
aggregating model parameters; (3) developing a knowledge distillation method to
generate FL training labels when data labels are unavailable. The proposed
framework is implemented in the Car Learning to Act (CARLA) simulation
platform. Extensive experimental results are provided to verify the superior
performance and robustness of the developed map fusion and FL schemes. Comment: 12 pages, 5 figures, to appear in the 2021 IEEE International Conference on Robotics and Automation (ICRA).
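The parameter-aggregation step described in novelty (2), where feature models fine-tuned on individual vehicles are combined distributively, can be sketched as a sample-size-weighted federated average. This is a minimal illustration of that general pattern, not the paper's exact algorithm; the function name and toy data are assumptions.

```python
import numpy as np

# Minimal federated-averaging sketch: each vehicle fine-tunes a copy of
# the feature model locally, then a coordinator aggregates the parameter
# vectors weighted by each client's local sample count.

def federated_average(client_params, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    agg = np.zeros_like(client_params[0])
    for params, n in zip(client_params, client_sizes):
        agg += (n / total) * params
    return agg

clients = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]
sizes = [1, 3]  # the second vehicle saw three times as many samples
global_params = federated_average(clients, sizes)
```

The weighting ensures vehicles that observed more labeled (or distilled) data pull the global feature model harder, which matters when FoV contents differ widely between vehicles.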
Monad: Towards Cost-effective Specialization for Chiplet-based Spatial Accelerators
Advanced packaging offers a new design paradigm in the post-Moore era, where
many small chiplets can be assembled into a large system. Based on
heterogeneous integration, a chiplet-based accelerator can be highly
specialized for a specific workload, demonstrating extreme efficiency and cost
reduction. To fully leverage this potential, it is critical to explore both the
architectural design space for individual chiplets and different integration
options to assemble these chiplets, which have yet to be fully exploited by
existing proposals. This paper proposes Monad, a cost-aware specialization
approach for chiplet-based spatial accelerators that explores the tradeoffs
between PPA and fabrication costs. To evaluate a specialized system, we
introduce a modeling framework considering the non-uniformity in dataflow,
pipelining, and communications when executing multiple tensor workloads on
different chiplets. We propose to combine the architecture and integration
design space by uniformly encoding the design aspects for both spaces and
exploring them with a systematic ML-based approach. The experiments demonstrate
that Monad can achieve an average of 16% and 30% EDP reduction compared with
the state-of-the-art chiplet-based accelerators, Simba and NN-Baton,
respectively. Comment: To be published in ICCAD 2023.
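The uniform encoding of both design spaces that the abstract mentions, i.e. treating per-chiplet architecture choices and integration options as one searchable vector, can be sketched as below. Everything here is hypothetical: the choice lists, the toy cost model, and the exhaustive enumeration (the paper uses a systematic ML-based explorer) are illustrative stand-ins.

```python
import itertools

# Hypothetical unified design space: one axis from the per-chiplet
# architecture space, one from the integration space.
ARCH_CHOICES = [64, 128, 256]          # e.g. PE-array size per chiplet
INTEGRATION_CHOICES = ["2.5D", "3D"]   # e.g. packaging option

def cost(design):
    """Toy stand-in for a PPA/fabrication-cost model."""
    pes, integ = design
    # Assume bigger arrays cost more and "3D" halves interconnect cost.
    return pes * (0.5 if integ == "3D" else 1.0)

def exhaustive_search():
    """Enumerate the joint space and return the cheapest design."""
    return min(itertools.product(ARCH_CHOICES, INTEGRATION_CHOICES), key=cost)

best = exhaustive_search()
```

The point of the unified encoding is that the search sees architecture and integration as one space, so trade-offs like "smaller chiplets but denser packaging" are explored jointly rather than in two decoupled loops.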
TAG : Type Auxiliary Guiding for Code Comment Generation
Existing leading code comment generation approaches built on the
structure-to-sequence framework ignore the type information in the
interpretation of the code, e.g., operator, string, etc. However, introducing
the type information into the existing framework is non-trivial due to the
hierarchical dependence among the type information. In order to address the
issues above, we propose a Type Auxiliary Guiding encoder-decoder framework for
the code comment generation task which considers the source code as an N-ary
tree with type information associated with each node. Specifically, our
framework is featured with a Type-associated Encoder and a Type-restricted
Decoder which enables adaptive summarization of the source code. We further
propose a hierarchical reinforcement learning method to resolve the training
difficulties of our proposed framework. Extensive evaluations demonstrate the
state-of-the-art performance of our framework with both the auto-evaluated
metrics and case studies. Comment: Accepted at ACL 2020.
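The type-restricted decoding idea can be illustrated as masking the output vocabulary according to the current AST node's type before choosing a token. This is a conceptual sketch only; the mask table, function name, and tiny four-token vocabulary are hypothetical, and the real decoder applies such restrictions inside a learned sequence model.

```python
import numpy as np

# Hypothetical type-restricted decoding step: the node type selects a
# vocabulary mask, and disallowed tokens are suppressed before argmax.

TYPE_MASKS = {
    "operator": np.array([1, 1, 0, 0], dtype=float),  # only tokens 0-1 allowed
    "string":   np.array([0, 0, 1, 1], dtype=float),  # only tokens 2-3 allowed
}

def type_restricted_argmax(logits, node_type):
    """Pick the highest-scoring token permitted for this node type."""
    mask = TYPE_MASKS[node_type]
    masked = np.where(mask > 0, logits, -np.inf)
    return int(np.argmax(masked))

logits = np.array([0.1, 0.4, 2.0, 0.3])
tok = type_restricted_argmax(logits, "operator")
```

Even though token 2 has the highest raw score, the "operator" mask forbids it, so the decoder falls back to the best permitted token; this is how type information constrains generation without retraining the scorer.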
Learning Disentangled Semantic Representation for Domain Adaptation
Domain adaptation is an important but challenging task. Most of the existing
domain adaptation methods struggle to extract a domain-invariant
representation from a feature space in which domain information and
semantic information are entangled. Different from previous efforts on the
entangled feature space, we aim to extract the domain-invariant semantic information in the
latent disentangled semantic representation (DSR) of the data. In DSR, we
assume the data generation process is controlled by two independent sets of
variables, i.e., the semantic latent variables and the domain latent variables.
Under the above assumption, we employ a variational auto-encoder to reconstruct
the semantic latent variables and domain latent variables behind the data. We
further devise a dual adversarial network to disentangle these two sets of
reconstructed latent variables. The disentangled semantic latent variables are
finally adapted across the domains. Experimental studies confirm that our model
yields state-of-the-art performance on several domain adaptation benchmark
datasets.
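The assumption that two independent sets of latent variables control generation can be pictured as partitioning the latent code into a semantic part and a domain part, with only the semantic part forwarded to the label classifier. The sketch below is purely conceptual: the fixed linear "encoder" and the 3/3 split are assumptions, whereas the real method learns the split with a variational auto-encoder plus dual adversarial losses.

```python
import numpy as np

# Conceptual sketch of the latent partition assumed by disentangled
# semantic representation: semantic dims first, domain dims after.

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))  # toy encoder weights: 4-d input -> 6-d latent

def encode_and_split(x, semantic_dim=3):
    """Encode x linearly and split the latent into (semantic, domain)."""
    z = W @ x
    return z[:semantic_dim], z[semantic_dim:]

x = np.ones(4)
z_sem, z_dom = encode_and_split(x)
# Downstream, only z_sem would feed the classifier after adaptation,
# while adversarial training pushes z_dom to absorb domain specifics.
```

The dual adversarial networks mentioned in the abstract would act on these two halves, penalizing any domain signal leaking into `z_sem` and any semantic signal leaking into `z_dom`.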
Frequency-astigmatism asymmetric nonlinear conversion of structured light lasers
Nonlinear optics of structured light has recently delivered intriguing
fundamental physical phenomena in light-matter interactions and advanced
applications from classical imaging to quantum informatics. The mutual
interaction between spin, orbital angular momentum (OAM) and wavelength is
extensively studied in such cases. In this work, we go beyond only considering
OAM and wavelength by taking the nonlinear frequency conversion and transverse
mode astigmatism conversion as two building blocks and investigating how single
modes and complicated multiplexed modes evolve after them. In particular, we
found, from experiments and theory, a generalized law of nonlinear conversion
of structured light: the converted modes depend strongly on the sequence of
these two blocks, obeying an inherent (non)commutative rule. This effect not
only creates extended structured laser modes but also serves as a new rule for
nonlinear structured light manipulation.
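A hedged illustration of why the two building blocks need not commute, using only two textbook rules rather than the paper's derivation: second-harmonic generation (SHG) of a pure Laguerre-Gaussian mode doubles its OAM (ℓ → 2ℓ), while a π/2 astigmatic mode converter (AC) maps HG_{m,n} to an LG mode with ℓ = m − n. The coefficients α, β below are schematic.

```latex
% AC first, then SHG: a pure OAM mode at every stage.
\mathrm{HG}_{1,0}(\omega)
  \xrightarrow{\ \mathrm{AC}\ } \mathrm{LG}^{\ell=1}(\omega)
  \xrightarrow{\ \mathrm{SHG}\ } \mathrm{LG}^{\ell=2}(2\omega)

% SHG first: squaring HG_{1,0} mixes Hermite components
% (H_1(x)^2 expands over H_2 and H_0), so the later AC yields
% a multiplexed superposition rather than a pure \ell = 2 mode.
\mathrm{HG}_{1,0}(\omega)
  \xrightarrow{\ \mathrm{SHG}\ } \alpha\,\mathrm{HG}_{2,0}(2\omega) + \beta\,\mathrm{HG}_{0,0}(2\omega)
  \xrightarrow{\ \mathrm{AC}\ } \alpha\,\mathrm{LG}^{\ell=2}(2\omega) + \beta\,\mathrm{LG}^{\ell=0}(2\omega)
```

The two orderings end in different mode contents, which is the sense in which the sequence of frequency conversion and astigmatism conversion obeys a noncommutative rule.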
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding
In the era of Large Language Models (LLMs), tremendous strides have been made
in the field of multimodal understanding. However, existing advanced algorithms
are limited in effectively utilizing the immense representation capabilities
and rich world knowledge inherent to these large pre-trained models, and the
beneficial connections among tasks within the context of text-rich scenarios
have not been sufficiently explored. In this work, we introduce UniDoc, a novel
multimodal model equipped with text detection and recognition capabilities,
which are deficient in existing approaches. Moreover, UniDoc capitalizes on the
beneficial interactions among tasks to enhance the performance of each
individual task. To implement UniDoc, we perform unified multimodal instruct
tuning on the contributed large-scale instruction following datasets.
Quantitative and qualitative experimental results show that UniDoc sets
state-of-the-art scores across multiple challenging benchmarks. To the best of
our knowledge, this is the first large multimodal model capable of simultaneous
text detection, recognition, spotting, and understanding.