344 research outputs found
Age-Energy Tradeoff in Fading Channels with Packet-Based Transmissions
The optimal transmission strategy to minimize the weighted combination of age
of information (AoI) and total energy consumption is studied in this paper. It
is assumed that the status update information is obtained and transmitted at
fixed rate over a Rayleigh fading channel in a packet-based wireless
communication system. A maximum transmission round on each packet is enforced
to guarantee certain reliability of the update packets. Given fixed average
transmission power, the age-energy tradeoff can be formulated as a constrained
Markov decision process (CMDP) problem considering the sensing power
consumption as well. Employing the Lagrangian relaxation, the CMDP problem is
transformed into a Markov decision process (MDP) problem. An algorithm is
proposed to obtain the optimal power allocation policy. Through simulation
results, it is shown that both age and energy efficiency can be improved by the
proposed optimal policy compared with two benchmark schemes. Also, age can be
effectively reduced at the expense of higher energy cost, and more emphasis on
energy consumption leads to higher average age at the same energy efficiency.
Overall, the tradeoff between average age and energy efficiency is identified.
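The Lagrangian relaxation described above can be sketched numerically: for a fixed multiplier lam, the relaxed problem becomes an ordinary MDP over the current age, solvable by value iteration. Everything below (state space, channel success probability, costs, discounting) is a toy stand-in, not the paper's exact model.

```python
import numpy as np

# Toy model (illustrative, not the paper's formulation): AoI states 1..A_MAX,
# action 0 = idle, action 1 = transmit with success probability P_SUCC at
# energy cost E_TX. Lagrangian relaxation folds the average-power constraint
# into the stage cost: c(s, a) = age + lam * energy.
A_MAX, P_SUCC, E_TX = 10, 0.7, 1.0

def value_iteration(lam, gamma=0.95, iters=500):
    """Solve the relaxed (unconstrained) MDP for a fixed multiplier lam."""
    V = np.zeros(A_MAX + 1)  # V[s] for age s = 1..A_MAX (index 0 unused)
    for _ in range(iters):
        V_new = V.copy()
        for s in range(1, A_MAX + 1):
            grow = min(s + 1, A_MAX)  # age keeps growing without a fresh update
            q_idle = s + gamma * V[grow]
            q_tx = (s + lam * E_TX
                    + gamma * (P_SUCC * V[1] + (1 - P_SUCC) * V[grow]))
            V_new[s] = min(q_idle, q_tx)
        V = V_new
    policy = np.zeros(A_MAX + 1, dtype=int)
    for s in range(1, A_MAX + 1):
        grow = min(s + 1, A_MAX)
        q_idle = s + gamma * V[grow]
        q_tx = s + lam * E_TX + gamma * (P_SUCC * V[1] + (1 - P_SUCC) * V[grow])
        policy[s] = int(q_tx < q_idle)
    return V, policy

# A larger lam penalizes energy more, so the policy transmits less often,
# mirroring the age-energy tradeoff reported in the abstract.
_, pol_cheap = value_iteration(lam=0.1)
_, pol_costly = value_iteration(lam=5.0)
```

Sweeping lam then traces out the tradeoff curve: each multiplier corresponds to one average-power constraint level.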
A robust two-way semi-linear model for normalization of cDNA microarray data
BACKGROUND: Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values.
METHODS: We propose a robust semiparametric method in a two-way semi-linear model (TW-SLM) for normalization of cDNA microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that: (i) the percentage of differentially expressed genes is small; or (ii) the numbers of up- and down-regulated genes are about the same, as required in the LOWESS normalization method. We conduct simulation studies to evaluate the proposed method and use a real data set from a specially designed microarray experiment to compare the performance of the proposed method with that of the LOWESS normalization approach.
RESULTS: The simulation results show that the proposed method performs better than the LOWESS normalization method in terms of mean square errors for estimated gene effects. The analysis of the real data set also shows that the proposed method yields more consistent results between the direct and the indirect comparisons and can detect more differentially expressed genes than the LOWESS method.
CONCLUSIONS: Our simulation studies and the real data example indicate that the proposed robust TW-SLM method works at least as well as the LOWESS method and works better when the underlying assumptions for the LOWESS method are not satisfied. Therefore, it is a powerful alternative to the existing normalization methods.
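For context, the style of intensity-dependent normalization that LOWESS performs, and that TW-SLM improves on, can be sketched as a binned running-median correction on the MA scale. This is NOT the TW-SLM estimator itself, only a simplified stand-in for the baseline; the function name and bin count are illustrative.

```python
import numpy as np

def ma_normalize(red, green, n_bins=10):
    """Toy intensity-dependent normalization on the MA scale: subtract a
    per-intensity-bin median of M (a crude surrogate for a LOWESS fit).
    Note it implicitly assumes up- and down-regulation roughly balance
    within each bin, exactly the assumption TW-SLM avoids."""
    M = np.log2(red) - np.log2(green)          # log-ratio per gene
    A = 0.5 * (np.log2(red) + np.log2(green))  # average log-intensity
    edges = np.quantile(A, np.linspace(0, 1, n_bins + 1))
    M_norm = M.copy()
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (A >= lo) & (A <= hi)
        if mask.any():
            M_norm[mask] = M[mask] - np.median(M[mask])  # remove bin-wise bias
    return M_norm
```

After the correction, the bulk of the normalized log-ratios is centered near zero, which is what makes dye-bias-free comparison of expression values possible.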
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
Recently, large-scale pre-trained language-image models like CLIP have shown
extraordinary capabilities for understanding spatial contents, but naively
transferring such models to video recognition still suffers from unsatisfactory
temporal modeling capabilities. Existing methods insert tunable structures into
or in parallel with the pre-trained model, which either requires
back-propagation through the whole pre-trained model and is thus
resource-demanding, or is limited by the temporal reasoning capability of the
pre-trained structure. In this work, we present DiST, which disentangles the
learning of spatial and temporal aspects of videos. Specifically, DiST uses a
dual-encoder structure, where a pre-trained foundation model acts as the
spatial encoder, and a lightweight network is introduced as the temporal
encoder. An integration branch is inserted between the encoders to fuse
spatio-temporal information. The disentangled spatial and temporal learning in
DiST is highly efficient because it avoids the back-propagation of massive
pre-trained parameters. Meanwhile, we empirically show that disentangled
learning with an extra network for integration benefits both spatial and
temporal understanding. Extensive experiments on five benchmarks show that DiST
delivers better performance than existing state-of-the-art methods by
convincing gaps. When pre-training on the large-scale Kinetics-710, we achieve
89.7% on Kinetics-400 with a frozen ViT-L model, which verifies the scalability
of DiST. Codes and models can be found at
https://github.com/alibaba-mmai-research/DiST.
Comment: ICCV 2023.
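The dual-encoder structure described above can be sketched with plain arrays: a frozen per-frame spatial encoder, a lightweight temporal encoder over its outputs, and an integration branch fusing the two. All shapes, weights, and function names below are illustrative stand-ins (the real DiST uses a pre-trained CLIP ViT as the spatial encoder), not the released code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: T frames of H x W x C pixels, spatial dim D_S, temporal dim D_T.
T, H, W, C, D_S, D_T = 8, 4, 4, 3, 16, 8

W_spatial = rng.normal(size=(H * W * C, D_S))           # stands in for the frozen backbone
W_temporal = rng.normal(size=(D_S, D_T)) * 0.1          # lightweight, trainable
W_integrate = rng.normal(size=(D_S + D_T, D_S)) * 0.1   # integration branch

def spatial_encoder(frames):
    """Frozen per-frame encoder: no gradient ever flows through W_spatial,
    which is what avoids back-propagation through the massive backbone."""
    flat = frames.reshape(T, -1)
    return flat @ W_spatial                              # (T, D_S)

def temporal_encoder(feats):
    """Lightweight temporal model: frame-to-frame differences as a motion cue."""
    diffs = np.diff(feats, axis=0, prepend=feats[:1])
    return np.tanh(diffs @ W_temporal)                   # (T, D_T)

def dist_forward(frames):
    s = spatial_encoder(frames)                          # spatial branch (frozen)
    t = temporal_encoder(s)                              # temporal branch (trained)
    fused = np.concatenate([s, t], axis=-1) @ W_integrate  # fuse spatio-temporal info
    return fused.mean(axis=0)                            # video-level feature (D_S,)

video = rng.normal(size=(T, H, W, C))
feat = dist_forward(video)
```

During training, only `W_temporal` and `W_integrate` would receive gradients, which is the source of the efficiency claim.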
Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone
Parameter-efficient tuning has become a trend in transferring large-scale
foundation models to downstream applications. Existing methods typically embed
some light-weight tuners into the backbone, where both the design and the
learning of the tuners are highly dependent on the base model. This work offers
a new tuning paradigm, dubbed Res-Tuning, which intentionally unbinds tuners
from the backbone. With both theoretical and empirical evidence, we show that
popular tuning approaches have their equivalent counterparts under our
unbinding formulation, and hence can be integrated into our framework
effortlessly. Thanks to the structural disentanglement, we manage to free the
design of tuners from the network architecture, facilitating flexible
combination of various tuning strategies. We further propose a memory-efficient
variant of Res-Tuning, where the bypass (i.e., the branch formed by a sequence of tuners)
is effectively detached from the main branch, such that the gradients are
back-propagated only to the tuners but not to the backbone. Such a detachment
also allows one-time backbone forward for multi-task inference. Extensive
experiments on both discriminative and generative tasks demonstrate the
superiority of our method over existing alternatives from the perspectives of
efficacy and efficiency. Project page:
Comment: Accepted to NeurIPS 202
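The memory-efficient detachment described in this abstract can be sketched as follows: run the frozen backbone once, cache its intermediate activations, then feed several task-specific tuner bypasses from that cache, so gradients (and extra forward passes) touch only the tuners. Block sizes, weight shapes, and names are hypothetical stand-ins, not the actual Res-Tuning architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16  # feature width (illustrative)

# Stand-ins for three frozen backbone blocks.
backbone = [rng.normal(size=(D, D)) * 0.1 for _ in range(3)]

def backbone_forward(x):
    """One-time forward through the frozen backbone; cache every block's
    output so multiple bypasses can reuse it without re-running the backbone."""
    cache = []
    for Wb in backbone:
        x = np.tanh(x @ Wb)
        cache.append(x)
    return cache

def bypass_forward(cache, tuners):
    """Detached bypass: a sequence of light tuners reading the cached (frozen)
    activations. In training, gradients would reach only the tuner weights."""
    h = np.zeros(D)
    for feat, Wt in zip(cache, tuners):
        h = h + np.tanh(feat @ Wt)  # each tuner taps one cached feature
    return h

x = rng.normal(size=D)
cache = backbone_forward(x)  # backbone runs once...

task_a = [rng.normal(size=(D, D)) * 0.01 for _ in range(3)]
task_b = [rng.normal(size=(D, D)) * 0.01 for _ in range(3)]
out_a = bypass_forward(cache, task_a)  # ...and is reused for task A
out_b = bypass_forward(cache, task_b)  # ...and task B, with no second backbone pass
```

This is what enables the "one-time backbone forward for multi-task inference" mentioned above: the expensive computation is shared, and only the cheap bypasses differ per task.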
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
A diffusion probabilistic model (DPM), which constructs a forward diffusion
process by gradually adding noise to data points and learns the reverse
denoising process to generate new samples, has been shown to handle complex
data distribution. Despite its recent success in image synthesis, applying DPMs
to video generation is still challenging due to high-dimensional data spaces.
Previous methods usually adopt a standard diffusion process, where frames in
the same video clip are destroyed with independent noises, ignoring the content
redundancy and temporal correlation. This work presents a decomposed diffusion
process via resolving the per-frame noise into a base noise that is shared
among all frames and a residual noise that varies along the time axis. The
denoising pipeline employs two jointly-learned networks to match the noise
decomposition accordingly. Experiments on various datasets confirm that our
approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based
alternatives in high-quality video generation. We further show that our
decomposed formulation can benefit from pre-trained image diffusion models and
well support text-conditioned video creation.
Comment: Accepted to CVPR 202
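The noise decomposition at the heart of this abstract can be sketched directly: each frame's noise is a mix of a base component drawn once per clip and a per-frame residual, so noised frames stay temporally correlated instead of being destroyed independently. The mixing weight `ALPHA` and the sizes below are hypothetical, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 8, 64    # frames and per-frame dimension (illustrative sizes)
ALPHA = 0.8     # weight on the shared base noise (hypothetical value)

def decomposed_noise(t_frames, dim):
    """Per-frame noise = shared base noise + per-frame residual noise.
    The sqrt weighting keeps each frame's noise approximately unit-variance,
    so it can drop into a standard Gaussian diffusion forward process."""
    base = rng.standard_normal(dim)               # one draw shared by all frames
    resid = rng.standard_normal((t_frames, dim))  # one independent draw per frame
    return ALPHA * base[None, :] + np.sqrt(1 - ALPHA**2) * resid

eps = decomposed_noise(T, D)
# Frames share the base component, so their noise is positively correlated,
# unlike the standard process where per-frame noises are independent.
```

In the full model, one network would predict the shared base noise and a second the per-frame residuals, matching this two-term decomposition.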