13 research outputs found
Learning Spatially-Adaptive Squeeze-Excitation Networks for Image Synthesis and Image Recognition
Learning light-weight yet expressive deep networks for both image synthesis
and image recognition remains a challenging problem. Inspired by the recent
observation that it is data-specificity that makes the multi-head
self-attention (MHSA) in the Transformer model so powerful, this paper proposes
to extend the widely adopted light-weight Squeeze-Excitation (SE) module to be
spatially adaptive, reinforcing its data specificity as a convolutional
alternative to MHSA while retaining the efficiency of SE and the inductive
bias of convolution. It presents two designs of spatially-adaptive
squeeze-excitation (SASE) modules, for image synthesis and image recognition
respectively. For image synthesis, the proposed SASE is tested in both
low-shot and one-shot learning tasks, where it outperforms prior art. For
image recognition, the proposed SASE is used as a drop-in replacement for
convolution layers in ResNets and achieves much better accuracy than the
vanilla ResNets, and slightly better accuracy than MHSA counterparts such
as the Swin-Transformer and Pyramid-Transformer on the ImageNet-1k dataset,
with significantly smaller models.
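As a rough illustration of the contrast the abstract draws, the following sketch compares a vanilla SE gate (one gate per channel, from a globally pooled descriptor) with a spatially-adaptive variant that produces a gate per channel *and* location. The shapes, the single-hidden-layer gating, and the per-pixel 1x1-conv view are our own simplifying assumptions, not the paper's actual SASE architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_gate(x, w1, w2):
    """Vanilla SE: one gate per channel, computed from global average pooling."""
    # x: (C, H, W); w1: (C//r, C); w2: (C, C//r)
    z = x.mean(axis=(1, 2))                   # squeeze -> (C,)
    g = sigmoid(w2 @ np.maximum(w1 @ z, 0))   # excitation -> (C,)
    return x * g[:, None, None]               # same gate at every location

def sase_gate(x, w1, w2):
    """Spatially-adaptive variant: the gate varies per location, computed
    from each position's channel vector (a 1x1-convolution view)."""
    C, H, W = x.shape
    z = x.reshape(C, -1)                      # (C, H*W): one descriptor per pixel
    g = sigmoid(w2 @ np.maximum(w1 @ z, 0))   # (C, H*W): per-location gates
    return x * g.reshape(C, H, W)
```

The only structural change is where the squeeze happens: pooling over all positions yields a data-dependent but spatially constant gate, while the per-pixel version makes the modulation depend on *where* as well as *what*.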
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
While increasingly deep networks are still in general desired for achieving
state-of-the-art performance, for many specific inputs a simpler network might
already suffice. Existing works exploited this observation by learning to skip
convolutional layers in an input-dependent manner. However, we argue their
binary decision scheme, i.e., either fully executing or completely bypassing
one layer for a specific input, can be enhanced by introducing finer-grained,
"softer" decisions. We therefore propose a Dynamic Fractional Skipping (DFS)
framework. The core idea of DFS is to hypothesize layer-wise quantization (to
different bitwidths) as intermediate "soft" choices to be made between fully
utilizing and skipping a layer. For each input, DFS dynamically assigns a
bitwidth to both weights and activations of each layer, where fully executing
and skipping could be viewed as two "extremes" (i.e., full bitwidth and zero
bitwidth). In this way, DFS can "fractionally" exploit a layer's expressive
power during input-adaptive inference, enabling finer-grained
accuracy-computational cost trade-offs. It presents a unified view to link
input-adaptive layer skipping and input-adaptive hybrid quantization. Extensive
experimental results demonstrate the superior tradeoff between computational
cost and model expressive power (accuracy) achieved by DFS. More visualizations
also indicate a smooth and consistent transition in the DFS behaviors,
especially the learned choices between layer skipping and different
quantizations when the total computational budgets vary, validating our
hypothesis that layer quantization could be viewed as intermediate variants of
layer skipping. Our source code and supplementary material are available at
https://github.com/Torment123/DFS.
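The core DFS idea, that quantized execution sits between full execution and skipping, can be sketched in a few lines. The uniform symmetric quantizer and the residual form of the layer below are our own minimal assumptions for illustration, not the paper's exact implementation:

```python
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of weights to the given bitwidth.
    Zero bits zeroes the weights entirely, i.e. the layer is skipped."""
    if bits == 0:
        return np.zeros_like(w)
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def dfs_layer(x, w, bits):
    """One residual layer executed 'fractionally': the chosen bitwidth
    interpolates between full execution (high bits) and skipping (0 bits,
    which reduces the layer to the identity)."""
    wq = quantize(w, bits)
    return x + np.maximum(wq @ x, 0)
```

With `bits=0` the layer degenerates to the skip connection, while larger bitwidths recover progressively more of the layer's expressive power, which is exactly the "soft" spectrum of choices the abstract describes.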
Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference
State-of-the-art convolutional neural networks (CNNs) yield record-breaking
predictive performance, yet at the cost of energy-intensive inference that
prohibits their wide deployment in resource-constrained Internet of Things
(IoT) applications. We propose a dual dynamic inference (DDI) framework
that highlights the following aspects: 1) we integrate both input-dependent and
resource-dependent dynamic inference mechanisms under a unified framework in
order to fit the varying IoT resource requirements in practice. DDI is able
both to continually suppress unnecessary costs for easy samples and to halt
inference for all samples when hard resource constraints are enforced; 2) we
propose a flexible multi-grained learning to skip (MGL2S) approach for
input-dependent inference which allows simultaneous layer-wise and channel-wise
skipping; 3) we extend DDI to complex CNN backbones such as DenseNet and show
that DDI can be applied towards optimizing any specific resource goals
including inference latency or energy cost. Extensive experiments demonstrate
the superior inference accuracy-resource trade-off achieved by DDI, as well as
the flexibility to control such trade-offs compared to existing peer methods.
Specifically, DDI can achieve up to 4 times computational savings with the same
or even higher accuracy compared to existing competitive baselines.
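The dual nature of DDI, input-dependent skipping combined with a resource-dependent halt, can be sketched as a simple controller loop. The gate function, per-layer costs, and early-break budget policy below are illustrative assumptions, not the paper's actual controller:

```python
def ddi_forward(x, layers, gate, budget):
    """Run a stack of layers with two dynamic mechanisms:
    - input-dependent skipping: `gate(x)` decides per layer whether to execute;
    - resource-dependent halting: once the compute budget is spent,
      all remaining layers are skipped regardless of the input.
    `layers` is a list of (function, cost) pairs."""
    cost = 0
    for f, layer_cost in layers:
        if cost + layer_cost > budget:
            break                     # hard resource constraint: halt here
        if gate(x):                   # easy samples may skip individual layers
            x = f(x)
            cost += layer_cost
    return x, cost
```

The same loop serves both goals the abstract lists: easy inputs spend less than the budget via the gate, and a tightened budget caps the cost for every input.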
Photoemission Evidence of a Novel Charge Order in Kagome Metal FeGe
A charge order has been discovered to emerge deep into the antiferromagnetic
phase of the kagome metal FeGe. To study its origin, the evolution of the
low-lying electronic structure across the charge order phase transition is
investigated with angle-resolved photoemission spectroscopy. We do not find
signatures of nesting between Fermi surface sections or van-Hove singularities
in zero-frequency joint density of states, and there are no obvious energy gaps
at the Fermi level, which exclude the nesting mechanism for the charge order
formation in FeGe. However, two obvious changes in the band structure have been
detected, i.e., one electron-like band around the K point and another one
around the A point move upward in energy position when the charge order forms.
These features can be well reproduced by our density-functional theory
calculations, where the charge order is primarily driven by magnetic energy
saving via large dimerizations of a quarter of Ge1-sites (in the kagome plane)
along the c-axis. Our results provide strong support for this novel charge
order formation mechanism in FeGe, in contrast to the conventional nesting
mechanism. Comment: 6 pages, 4 figures
Regionalization in the Yangtze River Delta, China, from the perspective of inter-city daily mobility
Published in Regional Studies. This paper applies a community detection algorithm to the Yangtze River Delta's (YRD) daily inter-city mobility network to produce an interaction-based regionalization, and then explores the processes underlying this regional (re-)production by comparing it with attribute-based regionalization. The results show that political boundaries and historical patterns of socio-economic integration are strikingly visible, and the effects of overlapping physical, economic, cultural and administrative spaces on regional integration are apparent. The authors conclude that the notions of 'territory' and 'network' come together as the YRD region is spatially configured, while 'regional path dependence' also seems relevant for understanding its relational formation.
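The abstract's interaction-based regionalization rests on community detection over a weighted inter-city mobility network. As a generic toy sketch (the paper's actual algorithm, data, and city set are not reproduced here), label propagation groups cities that exchange more flows with each other than with outsiders:

```python
def label_propagation(adj, iters=20):
    """Tiny label-propagation community detection on a weighted graph.
    `adj` maps each city to {neighbour: flow_weight}. Every city starts in
    its own community and repeatedly adopts the label carrying the largest
    total flow among its neighbours; ties break toward the smallest label."""
    labels = {v: v for v in adj}
    for _ in range(iters):
        for v in sorted(adj):
            scores = {}
            for u, w in adj[v].items():
                scores[labels[u]] = scores.get(labels[u], 0) + w
            if scores:
                labels[v] = max(sorted(scores), key=lambda lab: scores[lab])
    return labels
```

On a mobility network, the resulting label groups play the role of the "interaction-based regions" the paper compares against attribute-based ones.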
Using location-based social media to chart the patterns of people moving between cities: the case of Weibo users in the Yangtze River Delta
Urban-geographical research using location-based social media (LBSM) has itself been characterized by uneven geographies, in that most studies deal with Europe and North America. This implies a relative dearth of studies focusing on countries such as China, in spite of the country having the largest number of Internet users in the world. This paper addresses this lacuna by showing the research potential of LBSM services associated with Weibo, by far the most popular online social microblogging and networking service in China. To this end, we map inter-city connections within the Yangtze River Delta based on three million individuals' space-time footprints derived from Weibo. Empirical results reveal that the inter-city connections derived from Weibo present both common and specific spatial patterns associated with inter-city travel. We find that a small percentage of cities and city-dyads constitute the backbone of this inter-city network. The dominant direction of individual flows tends to be from primary cities to sub-primary cities, and from peripheral cities to primary cities. In addition, city-dyad connectivities do not strictly follow cities' positions in terms of their centralities in the hierarchical distribution. Furthermore, the effects of administrative boundaries and cities' administrative level are significant. We benchmark these insights by re-examining our findings against the backdrop of polycentric developments in the Yangtze River Delta, which confirms the potential usefulness of LBSM data for analyzing urban-geographical patterns.