98 research outputs found
AutoSTL: Automated Spatio-Temporal Multi-Task Learning
Spatio-temporal prediction plays a critical role in smart city construction. Jointly modeling multiple spatio-temporal tasks can further promote intelligent city life by integrating their inseparable relationships. However, existing studies fail to address this joint learning problem well: they generally solve tasks individually or with a fixed task combination. The challenges lie in the tangled relations between different properties, the demand for supporting flexible combinations of tasks, and the complex spatio-temporal dependencies. To cope with these problems, we propose an Automated Spatio-Temporal multi-task Learning (AutoSTL) method to handle multiple spatio-temporal tasks jointly. First, we propose a scalable architecture consisting of advanced spatio-temporal operations to exploit the complicated dependencies. Shared modules and a feature fusion mechanism are incorporated to further capture the intrinsic relationships between tasks. Furthermore, our model automatically allocates the operations and fusion weights. Extensive experiments on benchmark datasets verify that our model achieves state-of-the-art performance. To the best of our knowledge, AutoSTL is the first automated spatio-temporal multi-task learning method.
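The automatic allocation of operations and fusion weights that the abstract describes can be pictured as a softmax-weighted mixture over candidate operations. The sketch below is a hypothetical illustration, not the paper's code: `mix_operations`, the toy operations, and the fixed `arch_logits` are all assumptions standing in for learned spatio-temporal modules and learned architecture weights.

```python
import numpy as np

# Hypothetical sketch: each candidate operation produces a feature map,
# and architecture logits (learned in the real method, fixed here for
# illustration) decide via softmax how much each contributes.

def mix_operations(x, ops, arch_logits):
    """Blend candidate operation outputs with softmax fusion weights."""
    weights = np.exp(arch_logits - arch_logits.max())
    weights /= weights.sum()
    return sum(w * op(x) for w, op in zip(weights, ops))

# Toy operations standing in for spatio-temporal modules.
ops = [lambda x: x, lambda x: 2 * x, lambda x: x ** 2]
x = np.array([1.0, 2.0])
arch_logits = np.array([0.0, 0.0, 0.0])  # equal logits -> simple average
out = mix_operations(x, ops, arch_logits)
```

With equal logits the mixture reduces to the plain average of the three operation outputs; during search, the logits would be trained so the mixture concentrates on the best operation per shared module.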
Distributed Dynamic Map Fusion via Federated Learning for Intelligent Networked Vehicles
The technology of dynamic map fusion among networked vehicles has been
developed to enlarge sensing ranges and improve sensing accuracies for
individual vehicles. This paper proposes a federated learning (FL) based
dynamic map fusion framework to achieve high map quality despite unknown
numbers of objects in fields of view (FoVs), various sensing and model
uncertainties, and missing data labels for online learning. The novelty of this
work is threefold: (1) developing a three-stage fusion scheme to predict the
number of objects effectively and to fuse multiple local maps with fidelity
scores; (2) developing an FL algorithm which fine-tunes feature models (i.e.,
representation learning networks for feature extraction) distributively by
aggregating model parameters; (3) developing a knowledge distillation method to
generate FL training labels when data labels are unavailable. The proposed
framework is implemented in the Car Learning to Act (CARLA) simulation
platform. Extensive experimental results are provided to verify the superior
performance and robustness of the developed map fusion and FL schemes. Comment: 12 pages, 5 figures, to appear in the 2021 IEEE International Conference on Robotics and Automation (ICRA).
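The parameter-aggregation step described in novelty (2), where feature models fine-tuned on individual vehicles are combined distributively, can be sketched as a sample-size-weighted federated average. This is a minimal illustration of that general pattern, not the paper's exact algorithm; the function name and toy data are assumptions.

```python
import numpy as np

# Minimal federated-averaging sketch: each vehicle fine-tunes a copy of
# the feature model locally, then a coordinator aggregates the parameter
# vectors weighted by each client's local sample count.

def federated_average(client_params, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    agg = np.zeros_like(client_params[0])
    for params, n in zip(client_params, client_sizes):
        agg += (n / total) * params
    return agg

clients = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]
sizes = [1, 3]  # the second vehicle saw three times as many samples
global_params = federated_average(clients, sizes)
```

The weighting ensures vehicles that observed more labeled (or distilled) data pull the global feature model harder, which matters when FoV contents differ widely between vehicles.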
Monad: Towards Cost-effective Specialization for Chiplet-based Spatial Accelerators
Advanced packaging offers a new design paradigm in the post-Moore era, where
many small chiplets can be assembled into a large system. Based on
heterogeneous integration, a chiplet-based accelerator can be highly
specialized for a specific workload, demonstrating extreme efficiency and cost
reduction. To fully leverage this potential, it is critical to explore both the
architectural design space for individual chiplets and different integration
options to assemble these chiplets, which have yet to be fully exploited by
existing proposals. This paper proposes Monad, a cost-aware specialization
approach for chiplet-based spatial accelerators that explores the tradeoffs
between PPA and fabrication costs. To evaluate a specialized system, we
introduce a modeling framework considering the non-uniformity in dataflow,
pipelining, and communications when executing multiple tensor workloads on
different chiplets. We propose to combine the architecture and integration
design space by uniformly encoding the design aspects for both spaces and
exploring them with a systematic ML-based approach. The experiments demonstrate
that Monad can achieve an average of 16% and 30% EDP reduction compared with
the state-of-the-art chiplet-based accelerators, Simba and NN-Baton,
respectively. Comment: To be published in ICCAD 2023.
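The uniform encoding of both design spaces that the abstract mentions, i.e. treating per-chiplet architecture choices and integration options as one searchable vector, can be sketched as below. Everything here is hypothetical: the choice lists, the toy cost model, and the exhaustive enumeration (the paper uses a systematic ML-based explorer) are illustrative stand-ins.

```python
import itertools

# Hypothetical unified design space: one axis from the per-chiplet
# architecture space, one from the integration space.
ARCH_CHOICES = [64, 128, 256]          # e.g. PE-array size per chiplet
INTEGRATION_CHOICES = ["2.5D", "3D"]   # e.g. packaging option

def cost(design):
    """Toy stand-in for a PPA/fabrication-cost model."""
    pes, integ = design
    # Assume bigger arrays cost more and "3D" halves interconnect cost.
    return pes * (0.5 if integ == "3D" else 1.0)

def exhaustive_search():
    """Enumerate the joint space and return the cheapest design."""
    return min(itertools.product(ARCH_CHOICES, INTEGRATION_CHOICES), key=cost)

best = exhaustive_search()
```

The point of the unified encoding is that the search sees architecture and integration as one space, so trade-offs like "smaller chiplets but denser packaging" are explored jointly rather than in two decoupled loops.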
TAG : Type Auxiliary Guiding for Code Comment Generation
Existing leading code comment generation approaches built on the
structure-to-sequence framework ignore the type information in the
interpretation of the code, e.g., operator, string, etc. However, introducing
the type information into the existing framework is non-trivial due to the
hierarchical dependence among the type information. In order to address the
issues above, we propose a Type Auxiliary Guiding encoder-decoder framework for
the code comment generation task which considers the source code as an N-ary
tree with type information associated with each node. Specifically, our
framework is featured with a Type-associated Encoder and a Type-restricted
Decoder which enables adaptive summarization of the source code. We further
propose a hierarchical reinforcement learning method to resolve the training
difficulties of our proposed framework. Extensive evaluations demonstrate the
state-of-the-art performance of our framework with both the auto-evaluated
metrics and case studies. Comment: Accepted at ACL 2020.
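The type-restricted decoding idea can be illustrated as masking the output vocabulary according to the current AST node's type before choosing a token. This is a conceptual sketch only; the mask table, function name, and tiny four-token vocabulary are hypothetical, and the real decoder applies such restrictions inside a learned sequence model.

```python
import numpy as np

# Hypothetical type-restricted decoding step: the node type selects a
# vocabulary mask, and disallowed tokens are suppressed before argmax.

TYPE_MASKS = {
    "operator": np.array([1, 1, 0, 0], dtype=float),  # only tokens 0-1 allowed
    "string":   np.array([0, 0, 1, 1], dtype=float),  # only tokens 2-3 allowed
}

def type_restricted_argmax(logits, node_type):
    """Pick the highest-scoring token permitted for this node type."""
    mask = TYPE_MASKS[node_type]
    masked = np.where(mask > 0, logits, -np.inf)
    return int(np.argmax(masked))

logits = np.array([0.1, 0.4, 2.0, 0.3])
tok = type_restricted_argmax(logits, "operator")
```

Even though token 2 has the highest raw score, the "operator" mask forbids it, so the decoder falls back to the best permitted token; this is how type information constrains generation without retraining the scorer.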
Learning Disentangled Semantic Representation for Domain Adaptation
Domain adaptation is an important but challenging task. Most of the existing
domain adaptation methods struggle to extract a domain-invariant
representation from a feature space in which domain information and
semantic information are entangled. Different from previous efforts on the
entangled feature space, we aim to extract the domain-invariant semantic information in the
latent disentangled semantic representation (DSR) of the data. In DSR, we
assume the data generation process is controlled by two independent sets of
variables, i.e., the semantic latent variables and the domain latent variables.
Under the above assumption, we employ a variational auto-encoder to reconstruct
the semantic latent variables and domain latent variables behind the data. We
further devise a dual adversarial network to disentangle these two sets of
reconstructed latent variables. The disentangled semantic latent variables are
finally adapted across the domains. Experimental studies confirm that our model
yields state-of-the-art performance on several domain adaptation benchmark
datasets.
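The assumption that two independent sets of latent variables control generation can be pictured as partitioning the latent code into a semantic part and a domain part, with only the semantic part forwarded to the label classifier. The sketch below is purely conceptual: the fixed linear "encoder" and the 3/3 split are assumptions, whereas the real method learns the split with a variational auto-encoder plus dual adversarial losses.

```python
import numpy as np

# Conceptual sketch of the latent partition assumed by disentangled
# semantic representation: semantic dims first, domain dims after.

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))  # toy encoder weights: 4-d input -> 6-d latent

def encode_and_split(x, semantic_dim=3):
    """Encode x linearly and split the latent into (semantic, domain)."""
    z = W @ x
    return z[:semantic_dim], z[semantic_dim:]

x = np.ones(4)
z_sem, z_dom = encode_and_split(x)
# Downstream, only z_sem would feed the classifier after adaptation,
# while adversarial training pushes z_dom to absorb domain specifics.
```

The dual adversarial networks mentioned in the abstract would act on these two halves, penalizing any domain signal leaking into `z_sem` and any semantic signal leaking into `z_dom`.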
Frequency-astigmatism asymmetric nonlinear conversion of structured light lasers
Nonlinear optics of structured light has recently delivered intriguing
fundamental physical phenomena in light-matter interactions and advanced
applications from classical imaging to quantum informatics. The mutual
interaction between spin, orbital angular momentum (OAM) and wavelength is
extensively studied in such cases. In this work, we go beyond only considering
OAM and wavelength by taking the nonlinear frequency conversion and transverse
mode astigmatism conversion as two building blocks and investigating how single
modes and complicated multiplexed modes evolve after them. In particular, we
found, from experiments and theory, a generalized law of nonlinear conversion
of structured light: the converted modes depend strongly on the sequence of
these two blocks, obeying an inherent (non)commutative rule. This effect not
only creates extended structured laser modes but also serves as a new rule for
nonlinear structured light manipulation.
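A hedged illustration of why the two building blocks need not commute, using only two textbook rules rather than the paper's derivation: second-harmonic generation (SHG) of a pure Laguerre-Gaussian mode doubles its OAM (ℓ → 2ℓ), while a π/2 astigmatic mode converter (AC) maps HG_{m,n} to an LG mode with ℓ = m − n. The coefficients α, β below are schematic.

```latex
% AC first, then SHG: a pure OAM mode at every stage.
\mathrm{HG}_{1,0}(\omega)
  \xrightarrow{\ \mathrm{AC}\ } \mathrm{LG}^{\ell=1}(\omega)
  \xrightarrow{\ \mathrm{SHG}\ } \mathrm{LG}^{\ell=2}(2\omega)

% SHG first: squaring HG_{1,0} mixes Hermite components
% (H_1(x)^2 expands over H_2 and H_0), so the later AC yields
% a multiplexed superposition rather than a pure \ell = 2 mode.
\mathrm{HG}_{1,0}(\omega)
  \xrightarrow{\ \mathrm{SHG}\ } \alpha\,\mathrm{HG}_{2,0}(2\omega) + \beta\,\mathrm{HG}_{0,0}(2\omega)
  \xrightarrow{\ \mathrm{AC}\ } \alpha\,\mathrm{LG}^{\ell=2}(2\omega) + \beta\,\mathrm{LG}^{\ell=0}(2\omega)
```

The two orderings end in different mode contents, which is the sense in which the sequence of frequency conversion and astigmatism conversion obeys a noncommutative rule.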
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding
In the era of Large Language Models (LLMs), tremendous strides have been made
in the field of multimodal understanding. However, existing advanced algorithms
are limited in effectively utilizing the immense representation capabilities
and rich world knowledge inherent to these large pre-trained models, and the
beneficial connections among tasks within the context of text-rich scenarios
have not been sufficiently explored. In this work, we introduce UniDoc, a novel
multimodal model equipped with text detection and recognition capabilities,
which are deficient in existing approaches. Moreover, UniDoc capitalizes on the
beneficial interactions among tasks to enhance the performance of each
individual task. To implement UniDoc, we perform unified multimodal instruct
tuning on the contributed large-scale instruction following datasets.
Quantitative and qualitative experimental results show that UniDoc sets
state-of-the-art scores across multiple challenging benchmarks. To the best of
our knowledge, this is the first large multimodal model capable of simultaneous
text detection, recognition, spotting, and understanding.