Chern Number Tunable Quantum Anomalous Hall Effect in Monolayer Transition Metal Oxides via Manipulating Magnetization Orientation
Although much effort has been made to explore the quantum anomalous Hall effect
(QAHE) in both theory and experiment, QAHE systems with tunable Chern
numbers remain limited. Here, we theoretically propose that NiAsO and
PdSbO, monolayer transition metal oxides, can realize the QAHE with tunable
Chern numbers by manipulating their magnetization orientations. When the
magnetization lies in the \textit{x-y} plane and all mirror symmetries are
broken, the low-Chern-number phase emerges. When the magnetization has a
non-zero \textit{z}-direction component, the system enters the
high-Chern-number phase, even in the presence of canted magnetization. The
global band gap can approach the room-temperature energy scale in monolayer
PdSbO (23.4 meV) when the magnetization is aligned along the
\textit{z}-direction. Using a Wannier-based tight-binding model, we establish
the phase diagram of the magnetization-induced topological phase transition.
Our work provides a high-temperature QAHE system with a tunable Chern number
for practical electronic applications.
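The notion of a Chern number jumping between phases can be illustrated with the standard Fukui-Hatsugai lattice method on a toy two-band model. The Qi-Wu-Zhang Hamiltonian below is a generic Chern-insulator stand-in, not the NiAsO/PdSbO Hamiltonian of the abstract; the mass parameter `m` plays a role loosely analogous to the magnetization-controlled gap term.

```python
import numpy as np

def h_qwz(kx, ky, m):
    # Qi-Wu-Zhang two-band toy Chern insulator (illustrative stand-in)
    sx = np.array([[0, 1], [1, 0]], complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], complex)
    return np.sin(kx) * sx + np.sin(ky) * sy + (m + np.cos(kx) + np.cos(ky)) * sz

def chern_number(m, n=40):
    """Fukui-Hatsugai lattice method: accumulate the Berry phase of each
    plaquette of a discretized Brillouin zone from gauge-invariant link
    products of lower-band eigenvectors."""
    ks = np.linspace(-np.pi, np.pi, n, endpoint=False)
    u = np.empty((n, n, 2), complex)
    for i, kx in enumerate(ks):
        for j, ky in enumerate(ks):
            _, v = np.linalg.eigh(h_qwz(kx, ky, m))
            u[i, j] = v[:, 0]          # lowest-band eigenvector
    c = 0.0
    for i in range(n):
        for j in range(n):
            i2, j2 = (i + 1) % n, (j + 1) % n
            # product of link variables around one plaquette (gauge invariant)
            prod = (np.vdot(u[i, j], u[i2, j]) * np.vdot(u[i2, j], u[i2, j2])
                    * np.vdot(u[i2, j2], u[i, j2]) * np.vdot(u[i, j2], u[i, j]))
            c += np.angle(prod)
    return round(c / (2 * np.pi))
```

For the QWZ model the topological regions `-2 < m < 0` and `0 < m < 2` carry opposite unit Chern numbers, while `|m| > 2` is trivial, mirroring how a control parameter can switch the topological phase.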
A Survey on Datasets for Decision-making of Autonomous Vehicle
Autonomous vehicles (AVs) are expected to reshape future transportation
systems, and decision-making is one of the critical modules toward high-level
automated driving. To overcome complicated scenarios that rule-based
methods cannot cope with well, data-driven decision-making approaches have
attracted increasing attention. The datasets used to develop data-driven
methods dramatically influence the performance of decision-making, so a
comprehensive view of the existing datasets is necessary. By collection
source, driving data can be divided into vehicle-, environment-, and
driver-related data. This study compares the state-of-the-art datasets in
these three categories and summarizes their features, including the sensors
used, annotations, and driving scenarios. Based on the characteristics of the
datasets, this survey also outlines their potential applications to various
aspects of AV decision-making, helping researchers find appropriate datasets
to support their own research. Finally, future trends in AV dataset
development are summarized.
MCM: Multi-condition Motion Synthesis Framework for Multi-scenario
The objective of the multi-condition human motion synthesis task is to
incorporate diverse conditional inputs, encompassing various forms like text,
music, speech, and more. This endows the task with the capability to adapt
across multiple scenarios, ranging from text-to-motion and music-to-dance,
among others. While existing research has primarily focused on single
conditions, the multi-condition human motion generation remains underexplored.
In this paper, we address these challenges by introducing MCM, a novel paradigm
for motion synthesis that spans multiple scenarios under diverse conditions.
The MCM framework is able to integrate with any DDPM-like diffusion model to
accommodate multi-conditional information input while preserving its generative
capabilities. Specifically, MCM employs a two-branch architecture consisting
of a main branch and a control branch. The control branch shares the same
structure as the main branch and is initialized with the main branch's
parameters, effectively preserving the generation ability of the main branch
while supporting multi-condition input. We also introduce a Transformer-based
diffusion model, MWNet (DDPM-like), as our main branch, which can capture the
spatial complexity and inter-joint correlations of motion sequences through a
channel-dimension self-attention module. Quantitative comparisons demonstrate
that our approach achieves SoTA results in text-to-motion and competitive
results in music-to-dance, comparable to task-specific methods. Furthermore,
the qualitative evaluation shows that MCM not only streamlines the adaptation
of methodologies originally designed for text-to-motion to domains like
music-to-dance and speech-to-gesture, eliminating the need for extensive
network re-configurations, but also enables effective multi-condition modal
control, realizing "once trained is motion need".
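The claim that the control branch preserves the main branch's generative ability can be sketched numerically. If the control branch is a copy of the main branch coupled back through a zero-initialized projection (a ControlNet-style construction; the abstract does not spell out MCM's exact coupling, so this is an assumption), the combined model reproduces the main branch exactly before any fine-tuning:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "main branch": one linear map standing in for a DDPM denoiser block
W_main = rng.standard_normal((8, 8))

def main_branch(x):
    return x @ W_main.T

# control branch: same structure, initialized as a copy of the main branch
W_ctrl = W_main.copy()
# zero-initialized output projection couples the control branch back in
W_zero = np.zeros((8, 8))

def combined(x, cond):
    # control branch also sees the extra condition signal `cond`
    h_ctrl = (x + cond) @ W_ctrl.T
    # at initialization W_zero kills the control contribution entirely,
    # so the combined model equals the pretrained main branch
    return main_branch(x) + h_ctrl @ W_zero.T
```

During training only the control-branch side would move away from this initialization, gradually injecting the conditional signal without destroying the pretrained behavior.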
ReDas: Supporting Fine-Grained Reshaping and Multiple Dataflows on Systolic Array
Current systolic arrays still suffer from low performance and PE utilization
on many real workloads due to the mismatch between the fixed array topology and
diverse DNN kernels. We present ReDas, a flexible and lightweight systolic
array that can adapt to various DNN models by supporting dynamic fine-grained
reshaping and multiple dataflows. The key idea is to construct reconfigurable
roundabout data paths using only the short connections between neighbor PEs.
A 128×128 array supports 129 different logical shapes and 3
dataflows (IS/OS/WS). Experiments on DNN models from MLPerf demonstrate that
ReDas achieves a 3.09x average speedup over state-of-the-art work.
Comment: 7 pages, 11 figures, conference
Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil the
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We've tested LAUDNet across multiple vision tasks, demonstrating its capacity
to notably reduce the latency of models like ResNet-101 by over 50% on
platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in
balancing accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
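The role of the latency predictor can be illustrated with a much simpler stand-in: compare the predicted latency of a dynamic operator (a fixed scheduling overhead plus a per-unit cost for the units actually executed) against the static version, and take the dynamic path only when it is predicted to win. The linear cost model and all numbers below are assumptions for illustration; LAUDNet's actual predictor models real GPU scheduling behavior.

```python
def predicted_dynamic_ms(overhead_ms, per_unit_ms, active_units):
    # toy linear cost model: scheduling overhead + cost of executed units only
    return overhead_ms + per_unit_ms * active_units

def use_dynamic_path(static_ms, overhead_ms, per_unit_ms, active_units):
    """Prefer the dynamic operator only when its predicted latency beats the
    static implementation on the target platform. This captures the paper's
    point that theoretical FLOP savings do not guarantee wall-clock savings
    once overheads are accounted for."""
    return predicted_dynamic_ms(overhead_ms, per_unit_ms, active_units) < static_ms
```

With few active units the dynamic path wins despite its overhead; with many, the static kernel is faster, which is exactly the trade-off the predictor is there to arbitrate.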
Dynamic Perceiver for Efficient Visual Recognition
Early exiting has become a promising approach to improving the inference
efficiency of deep networks. By structuring models with multiple classifiers
(exits), predictions for ``easy'' samples can be generated at earlier exits,
negating the need for executing deeper layers. Current multi-exit networks
typically implement linear classifiers at intermediate layers, compelling
low-level features to encapsulate high-level semantics. This sub-optimal design
invariably undermines the performance of later exits. In this paper, we propose
Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure
and the early classification task with a novel dual-branch architecture. A
feature branch serves to extract image features, while a classification branch
processes a latent code assigned for classification tasks. Bi-directional
cross-attention layers are established to progressively fuse the information of
both branches. Early exits are placed exclusively within the classification
branch, thus eliminating the need for linear separability in low-level
features. Dyn-Perceiver constitutes a versatile and adaptable framework that
can be built upon various architectures. Experiments on image classification,
action recognition, and object detection demonstrate that our method
significantly improves the inference efficiency of different backbones,
outperforming numerous competitive approaches across a broad range of
computational budgets. Evaluation on both CPU and GPU platforms substantiates
the superior practical efficiency of Dyn-Perceiver. Code is available at
https://www.github.com/LeapLabTHU/Dynamic_Perceiver
Comment: Accepted at ICCV 2023
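The early-exit inference that Dyn-Perceiver builds on reduces to a short loop: run the stages in order and return as soon as an exit classifier is confident enough. The stage and head functions below are placeholders, not the actual classification-branch blocks or cross-attention layers.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def early_exit_predict(x, stages, heads, threshold=0.9):
    """Run stages in order; after each, query that stage's exit classifier.
    Exit as soon as the top softmax probability clears the threshold, or at
    the final stage regardless. Returns (predicted class, exit depth)."""
    z = x
    for depth, (stage, head) in enumerate(zip(stages, heads)):
        z = stage(z)
        p = softmax(head(z))
        if p.max() >= threshold or depth == len(stages) - 1:
            return int(p.argmax()), depth
```

"Easy" samples clear the threshold at shallow exits and skip the remaining stages entirely, which is where the inference savings come from.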
Retinex-qDPC: automatic background rectified quantitative differential phase contrast imaging
The quality of quantitative differential phase contrast (qDPC) reconstruction
can be severely degraded by a mismatch between the backgrounds of the two
obliquely illuminated images, yielding problematic phase recovery results.
These background mismatches may result from illumination patterns,
inhomogeneous media distribution, or other defocused layers. In previous
reports, the background is calibrated manually, which is time-consuming and
unstable, since new calibrations are needed whenever the optical system is
modified. It is also impossible to calibrate the background from defocused
layers, or for highly dynamic observation where the background changes over
time. To tackle the background mismatch and increase experimental robustness,
we propose Retinex-qDPC, which uses image edge features as the data fidelity
term, yielding L2-Retinex-qDPC and L1-Retinex-qDPC for background-robust qDPC
reconstruction. The split Bregman method is used to solve the L1-Retinex-qDPC.
We compare both Retinex-qDPC models against state-of-the-art DPC
reconstruction algorithms, including total-variation regularized qDPC and
isotropic qDPC, using both simulated and experimental data. Results show that
Retinex-qDPC significantly improves phase recovery quality by suppressing the
impact of the mismatched background. Among them, L1-Retinex-qDPC outperforms
L2-Retinex-qDPC and the other state-of-the-art DPC algorithms. In general,
Retinex-qDPC increases experimental robustness against background illumination
without any modification of the optical system, which will benefit all qDPC
applications.
PoMC : An Efficient Blockchain Consensus Mechanism for the Agricultural Internet of Things
Blockchain-based agricultural IoT systems face key challenges such as high delay and low transaction throughput. Existing complicated consensus mechanisms can cause IoT devices to work inefficiently due to their limited computing, storage, and energy resources. Additionally, the many message exchanges can lead to high latency in the consensus process, which hinders real-time applications of the agricultural IoT. Therefore, we propose Proof-of-Multifactor-Capacity (PoMC), an efficient and secure consensus mechanism for the agricultural IoT. It uses the communication capacity and credibility of a node as the evidence for reaching consensus. Moreover, a senator-node lottery algorithm based on a credit mechanism and a new distributed incentive mechanism are designed to enhance security and motivate nodes to actively maintain the system. This paper analyses the performance of PoMC theoretically, including security, latency, and system throughput, and presents a comparison of its asymptotic complexity with some existing consensus mechanisms. The simulation results demonstrate that the average transaction validation latency and average consensus latency of PoMC are reduced by 10% and 23%, respectively. In addition, PoMC outperforms SENATE, PoQF, and PBFT by 56%, 60%, and 64% in terms of system throughput, respectively.
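A score-weighted senator lottery of the kind the abstract describes can be sketched as sampling without replacement, with selection probability proportional to a node's score. The abstract only says the evidence combines communication capacity and credibility; the product used below, and all node names and numbers, are assumptions for illustration, not PoMC's actual formula.

```python
import random

def node_score(capacity, credit):
    # hypothetical multifactor score; PoMC's real combination is not given here
    return capacity * credit

def senator_lottery(nodes, k, seed=0):
    """Pick k senator nodes, each draw weighted by node_score, without
    replacement. `nodes` maps name -> (capacity, credit)."""
    rng = random.Random(seed)       # deterministic for reproducibility
    pool = dict(nodes)
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(node_score(c, r) for c, r in pool.values())
        pick = rng.uniform(0, total)
        acc = 0.0
        for name, (c, r) in list(pool.items()):
            acc += node_score(c, r)
            if acc >= pick:
                chosen.append(name)
                del pool[name]      # sampled without replacement
                break
    return chosen
```

Tying the weight to credit is what lets a credit mechanism of this shape discourage misbehaving nodes: a slashed credit directly lowers future selection probability.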
Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction
To obtain high-quality positron emission tomography (PET) scans while
reducing radiation exposure to the human body, various approaches have been
proposed to reconstruct standard-dose PET (SPET) images from low-dose PET
(LPET) images. One widely adopted technique is generative adversarial
networks (GANs), yet recently, diffusion probabilistic models (DPMs) have
emerged as a compelling alternative due to their improved sample quality and
higher log-likelihood scores compared to GANs. Despite this, DPMs suffer from
two major drawbacks in real clinical settings, i.e., the computationally
expensive sampling process and the insufficient preservation of correspondence
between the conditioning LPET image and the reconstructed PET (RPET) image. To
address the above limitations, this paper presents a coarse-to-fine PET
reconstruction framework that consists of a coarse prediction module (CPM) and
an iterative refinement module (IRM). The CPM generates a coarse PET image via
a deterministic process, and the IRM samples the residual iteratively. By
delegating most of the computational overhead to the CPM, the overall sampling
speed of our method can be significantly improved. Furthermore, two additional
strategies, i.e., an auxiliary guidance strategy and a contrastive diffusion
strategy, are proposed and integrated into the reconstruction process, which
can enhance the correspondence between the LPET image and the RPET image,
further improving clinical reliability. Extensive experiments on two human
brain PET datasets demonstrate that our method outperforms the state-of-the-art
PET reconstruction methods. The source code is available at
\url{https://github.com/Show-han/PET-Reconstruction}.
Comment: Accepted and presented at MICCAI 2023. To be published in Proceedings.
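The coarse-to-fine split can be sketched in a few lines: one cheap deterministic pass produces the coarse image, then a short iterative loop handles only the residual. Everything below is a toy stand-in; the `2.0 *` scaling and the shrinking loop are illustrative placeholders, not the CPM and IRM networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def coarse_predict(lpet):
    # stand-in for the CPM: a single cheap deterministic pass
    # (illustrative dose scaling, not a trained network)
    return 2.0 * lpet

def refine_residual(coarse, steps=4):
    # stand-in for the IRM: iteratively update a residual estimate over a
    # small number of steps; a real diffusion sampler would denoise while
    # conditioning on `coarse` at every step
    residual = rng.standard_normal(coarse.shape)
    for t in range(steps, 0, -1):
        residual = residual * (t - 1) / t  # toy shrinkage; vanishes at t=1
    return residual

def reconstruct(lpet):
    coarse = coarse_predict(lpet)          # bulk of the compute, done once
    return coarse + refine_residual(coarse)  # few cheap refinement steps
```

Because the expensive part runs once while the iterative part touches only the residual, the sampling cost scales with the (short) refinement loop rather than a full diffusion trajectory, which is the speedup argument in the abstract.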