ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
Vision Transformers (ViTs) have shown impressive performance and have become
a unified backbone for multiple vision tasks. However, both attention and
multi-layer perceptrons (MLPs) in ViTs are not efficient enough due to dense
multiplications, resulting in costly training and inference. To this end, we
propose to reparameterize the pre-trained ViT with a mixture of multiplication
primitives, e.g., bitwise shifts and additions, towards a new type of
multiplication-reduced model, dubbed ShiftAddViT, which aims for
end-to-end inference speedups on GPUs without the need of training from
scratch. Specifically, all multiplications among queries, keys, and values
are reparameterized by additive kernels, after mapping queries and keys to
binary codes in Hamming space. The remaining MLPs or linear layers are then
reparameterized by shift kernels. We utilize TVM to implement and optimize
those customized kernels for practical hardware deployment on GPUs. We find
that such a reparameterization on (quadratic or linear) attention maintains
model accuracy, while inevitably leading to accuracy drops when being applied
to MLPs. To marry the best of both worlds, we further propose a new mixture of
experts (MoE) framework to reparameterize MLPs by taking multiplication or its
primitives as experts, e.g., multiplication and shift, and designing a new
latency-aware load-balancing loss. Such a loss helps to train a generic router
for assigning a dynamic amount of input tokens to different experts according
to their latency. In principle, the faster experts run, the larger amount of
input tokens are assigned. Extensive experiments consistently validate the
effectiveness of our proposed ShiftAddViT, achieving up to 5.18× latency
reductions on GPUs and 42.9% energy savings, while maintaining comparable
accuracy to the original or efficient ViTs.
Comment: Accepted by NeurIPS 202
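The core trick above, replacing dense multiplications with cheaper bitwise primitives, can be illustrated with a minimal sketch: round each weight to the nearest signed power of two, so multiplying an integer activation by it reduces to a single shift plus a sign flip. This is only a toy illustration of the shift-kernel idea in plain Python; the paper's actual kernels are implemented and optimized with TVM for GPU deployment.

```python
import math

def shift_quantize(w):
    """Round a weight to the nearest signed power of two, so that
    multiplying by it becomes a bitwise shift. Returns (sign, exponent).
    Illustrative sketch of the shift-kernel idea, not the paper's code."""
    sign = 1 if w >= 0 else -1
    exp = round(math.log2(abs(w))) if w != 0 else 0
    return sign, exp

def shift_mul(x, sign, exp):
    """Multiply an integer x by sign * 2**exp using only a shift."""
    shifted = x << exp if exp >= 0 else x >> -exp
    return sign * shifted

# e.g. a weight of 0.5 becomes (sign=+1, exp=-1): a right shift by one.
```

Rounding weights to powers of two costs some precision, which is consistent with the abstract's observation that shift reparameterization hurts accuracy in MLPs; that is exactly what the proposed MoE router mitigates by keeping true multiplication as one of the experts.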
Glucose Enhances Leptin Signaling through Modulation of AMPK Activity
Leptin exerts its action by binding to and activating the long form of leptin receptors (LEPRb). LEPRb activates JAK2 that subsequently phosphorylates and activates STAT3. The JAK2/STAT3 pathway is required for leptin control of energy balance and body weight. Defects in leptin signaling lead to leptin resistance, a primary risk factor for obesity. Body weight is also regulated by nutrients, including glucose. Defects in glucose sensing also contribute to obesity. Here we report crosstalk between leptin and glucose. Glucose starvation blocked the ability of leptin to stimulate tyrosyl phosphorylation and activation of JAK2 and STAT3 in a variety of cell types. Glucose dose-dependently enhanced leptin signaling. In contrast, glucose did not enhance growth hormone-stimulated phosphorylation of JAK2 and STAT5. Glucose starvation or 2-deoxyglucose-induced inhibition of glycolysis activated AMPK and inhibited leptin signaling; pharmacological inhibition of AMPK restored the ability of leptin to stimulate STAT3 phosphorylation. Conversely, pharmacological activation of AMPK was sufficient to inhibit leptin signaling and to block the ability of glucose to enhance leptin signaling. These results suggest that glucose and/or its metabolites play a permissive role in leptin signaling, and that glucose enhances leptin sensitivity at least in part by attenuating the ability of AMPK to inhibit leptin signaling.
voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data
Relationships in scientific data, such as the numerical and spatial
distribution relations of features in univariate data, the scalar-value
combinations' relations in multivariate data, and the association of volumes in
time-varying and ensemble data, are intricate and complex. This paper presents
voxel2vec, a novel unsupervised representation learning model, which is used to
learn distributed representations of scalar values/scalar-value combinations in
a low-dimensional vector space. Its basic assumption is that if two scalar
values/scalar-value combinations have similar contexts, they usually have high
similarity in terms of features. By representing scalar values/scalar-value
combinations as symbols, voxel2vec learns the similarity between them in the
context of spatial distribution and then allows us to explore the overall
association between volumes by transfer prediction. We demonstrate the
usefulness and effectiveness of voxel2vec by comparing it with the isosurface
similarity map of univariate data and applying the learned distributed
representations to feature classification for multivariate data and to
association analysis for time-varying and ensemble data.
Comment: Accepted by IEEE Transactions on Visualization and Computer Graphics
(TVCG)
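The analogy to natural language processing above can be made concrete: quantized scalar values play the role of words, and each voxel's spatial neighborhood plays the role of a sentence window, yielding (center, context) pairs that a skip-gram-style model could then train on. The sketch below, a hedged illustration rather than voxel2vec's actual pipeline, only shows the pair-extraction step; the window radius and list-of-lists volume layout are assumptions.

```python
def voxel_context_pairs(volume, radius=1):
    """Collect (center, context) symbol pairs from each voxel's spatial
    neighborhood, analogous to building a word2vec training corpus in
    which quantized scalar values play the role of words.
    volume: a nested list indexed as volume[x][y][z] of integer symbols."""
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    pairs = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                center = volume[x][y][z]
                for dx in range(-radius, radius + 1):
                    for dy in range(-radius, radius + 1):
                        for dz in range(-radius, radius + 1):
                            if (dx, dy, dz) == (0, 0, 0):
                                continue  # skip the center voxel itself
                            u, v, w = x + dx, y + dy, z + dz
                            if 0 <= u < nx and 0 <= v < ny and 0 <= w < nz:
                                pairs.append((center, volume[u][v][w]))
    return pairs
```

Feeding such pairs to any skip-gram implementation would place symbols with similar spatial contexts near each other in the embedding space, which matches the paper's stated assumption that similar contexts imply similar features.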
NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants
Tiny deep learning has attracted increasing attention driven by the
substantial demand for deploying deep learning on numerous intelligent
Internet-of-Things devices. However, it is still challenging to unleash tiny
deep learning's full potential on both large-scale datasets and downstream
tasks due to the under-fitting issues caused by the limited model capacity of
tiny neural networks (TNNs). To this end, we propose a framework called
NetBooster to empower tiny deep learning by augmenting the architectures of
TNNs via an expansion-then-contraction strategy. Extensive experiments show
that NetBooster consistently outperforms state-of-the-art tiny deep learning
solutions.
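One way to read the expansion-then-contraction strategy is that a layer of the tiny network is temporarily replaced by a wider multi-layer block for training, then folded back so the deployed architecture is unchanged. The sketch below is a hypothetical illustration of that idea for purely linear layers (where folding is exact); the real NetBooster strategy operates on full TNN architectures with nonlinearities and is more involved. All names here are assumptions for illustration.

```python
def expand_layer(w, expand_ratio=2):
    """Replace one linear layer w (d_in x d_out, as nested lists) with two
    layers w1 (d_in x hidden) and w2 (hidden x d_out) whose product equals
    w at initialization, giving the expanded block extra trainable capacity.
    Hypothetical sketch; not the actual NetBooster procedure."""
    d_in, d_out = len(w), len(w[0])
    hidden = expand_ratio * d_in
    # w1: identity into the first d_in hidden units, zeros elsewhere
    w1 = [[1.0 if j == i else 0.0 for j in range(hidden)] for i in range(d_in)]
    # w2: original weights in the first d_in rows, zero-initialized extras
    w2 = [list(w[i]) if i < d_in else [0.0] * d_out for i in range(hidden)]
    return w1, w2

def matmul(a, b):
    """Plain-Python matrix product for the fold-back step."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def contract_layer(w1, w2):
    """Fold the trained two-layer linear block back into a single layer."""
    return matmul(w1, w2)
```

Folding is exact only when no nonlinearity sits between the expanded layers, which hints at why a practical expansion-then-contraction scheme needs more care than this toy version.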
A Survey of Multimodal Information Fusion for Smart Healthcare: Mapping the Journey from Data to Wisdom
Multimodal medical data fusion has emerged as a transformative approach in
smart healthcare, enabling a comprehensive understanding of patient health and
personalized treatment plans. In this paper, a journey from data to information
to knowledge to wisdom (DIKW) is explored through multimodal fusion for smart
healthcare. We present a comprehensive review of multimodal medical data fusion
focused on the integration of various data modalities. The review explores
different approaches such as feature selection, rule-based systems, machine
learning, deep learning, and natural language processing, for fusing and
analyzing multimodal data. This paper also highlights the challenges associated
with multimodal fusion in healthcare. By synthesizing the reviewed frameworks
and theories, it proposes a generic framework for multimodal medical data
fusion that aligns with the DIKW model. Moreover, it discusses future
directions related to the four pillars of healthcare: Predictive, Preventive,
Personalized, and Participatory approaches. The components of the comprehensive
survey presented in this paper form the foundation for more successful
implementation of multimodal fusion in smart healthcare. Our findings can guide
researchers and practitioners in leveraging the power of multimodal fusion with
the state-of-the-art approaches to revolutionize healthcare and improve patient
outcomes.
Comment: This work has been submitted to Elsevier for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.