Machine Learning for Predictive Deployment of UAVs with Multiple Access
In this paper, a machine learning based deployment framework of unmanned
aerial vehicles (UAVs) is studied. In the considered model, UAVs are deployed
as flying base stations (BS) to offload heavy traffic from ground BSs. Due to
time-varying traffic distribution, a long short-term memory (LSTM) based
prediction algorithm is introduced to predict the future cellular traffic. To
predict the user service distribution, a KEG algorithm, which is a joint
K-means and expectation maximization (EM) algorithm based on Gaussian mixture
model (GMM), is proposed for determining the service area of each UAV. Based on
the predicted traffic, the optimal UAV positions are derived and three
multi-access techniques are compared so as to minimize the total transmit
power. Simulation results show that the proposed method can reduce the total
power consumption by up to 24% compared to the conventional method without
traffic prediction. In addition, rate splitting multiple access (RSMA) requires
less transmit power than frequency domain multiple access (FDMA) and time
domain multiple access (TDMA).
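The abstract describes KEG only as a joint K-means and EM algorithm built on a GMM. The sketch below illustrates one plausible reading, in which K-means centroids initialize the EM iterations. It is not the authors' implementation: the function names, the farthest-point initialization, the spherical covariances, and the iteration counts are all assumptions made for illustration.

```python
import math

def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def farthest_point_init(points, k):
    """Deterministic initialization (an assumption, not from the paper)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; returns k centroids."""
    centers = farthest_point_init(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist2(p, centers[c]))
            groups[j].append(p)
        # An empty cluster keeps its previous centroid.
        centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def responsibilities(points, weights, means, variances):
    """E-step: posterior probability of each component for each point."""
    resp = []
    for p in points:
        dens = [
            w * math.exp(-dist2(p, m) / (2 * v)) / (2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        total = sum(dens) or 1e-300  # guard against underflow to zero
        resp.append([d / total for d in dens])
    return resp

def em_gmm(points, centers, iters=30):
    """EM for a GMM with spherical covariances, started from K-means centroids."""
    k, n = len(centers), len(points)
    means, variances, weights = list(centers), [1.0] * k, [1.0 / k] * k
    for _ in range(iters):
        resp = responsibilities(points, weights, means, variances)
        for j in range(k):  # M-step
            nj = sum(r[j] for r in resp) + 1e-12
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(r[j] * dist2(p, (mx, my)) for r, p in zip(resp, points)) / (2 * nj)
            means[j], variances[j], weights[j] = (mx, my), max(var, 1e-6), nj / n
    resp = responsibilities(points, weights, means, variances)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels
```

Calling `em_gmm(points, kmeans(points, k))` on a list of 2-D user positions returns mixture weights, component means, variances, and a hard assignment of each user to a component, which could stand in for a per-UAV service area in this simplified setting.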
Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch
Unsupervised contrastive learning methods have recently seen significant
improvements, particularly through data augmentation strategies that aim to
produce robust and generalizable representations. However, prevailing data
augmentation methods, whether hand designed or based on foundation models, tend
to rely heavily on prior knowledge or external data. This dependence often
compromises their effectiveness and efficiency. Furthermore, the applicability
of most existing data augmentation strategies is limited when transitioning to
other research domains, especially science-related data. This limitation stems
from the paucity of prior knowledge and labeled data available in these
domains. To address these challenges, we introduce DiffAug, a novel and
efficient Diffusion-based data Augmentation technique. DiffAug aims to ensure
that the augmented and original data share a smoothed latent space, which is
achieved through diffusion steps. Unlike traditional methods, DiffAug
first mines sufficient prior semantic knowledge about the neighborhood. This
provides a constraint to guide the diffusion steps, eliminating the need for
labels, external data/models, or prior knowledge. Designed as an
architecture-agnostic framework, DiffAug provides consistent improvements.
Specifically, it improves image classification and clustering accuracy by
1.6% to 4.5%. When applied to biological data, DiffAug improves performance by up
to 10.1%, with an average improvement of 5.8%. DiffAug shows good performance
in both vision and biological domains.
Comment: arXiv admin note: text overlap with arXiv:2302.07944 by other authors
Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment
Self-supervised contrastive learning has demonstrated great potential in
learning visual representations. Despite its success on various downstream
tasks such as image classification and object detection, self-supervised
pre-training for fine-grained scenarios is not fully explored. In this paper,
we first point out that current contrastive methods are prone to memorizing
background/foreground texture and are therefore limited in localizing the
foreground object. Analysis suggests that learning to extract discriminative
texture information and learning to localize are equally crucial for self-supervised
pre-training in fine-grained scenarios. Based on our findings, we introduce
cross-view saliency alignment (CVSA), a contrastive learning framework that
first crops and swaps saliency regions of images as a novel form of view generation and
then guides the model to localize the foreground object via a cross-view
alignment loss. Extensive experiments on four popular fine-grained
classification benchmarks show that CVSA significantly improves the learned
representation.
Comment: The second version of CVSA. 10 pages, 4 figures
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Masked image modeling (MIM), an emerging self-supervised pre-training method,
has shown impressive success across numerous downstream vision tasks with
Vision Transformers (ViTs). Its underlying idea is simple: a portion of the
input image is randomly masked out and then reconstructed via the pretext
task. However, the working principle behind MIM is not well explained, and
previous studies insist that MIM primarily works for the Transformer family but
is incompatible with CNNs. In this paper, we first study interactions among
patches to understand what knowledge is learned and how it is acquired via the
MIM task. We observe that MIM essentially teaches the model to learn better
middle-order interactions among patches and extract more generalized features.
Based on this fact, we propose an Architecture-Agnostic Masked Image Modeling
framework (AMIM), which is compatible with both Transformers and CNNs in a
unified way. Extensive experiments on popular benchmarks show that our AMIM
learns better representations without explicit design and endows the backbone
model with a stronger capability to transfer to various downstream tasks for
both Transformers and CNNs.
Comment: Preprint under review (update reversion). The source code will be
released in https://github.com/Westlake-AI/openmixu
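The masking idea stated in the abstract (randomly mask a portion of the input, then reconstruct it) can be sketched in a few lines. This is a generic illustration of masked image modeling on a flat list of patches, not the AMIM method itself: the function names, the mask ratio, the constant fill value, and the choice to score only the masked patches are assumptions.

```python
import random

def random_patch_mask(num_patches, mask_ratio=0.6, seed=None):
    """Return a list of booleans; True marks a patch chosen for masking."""
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]

def apply_mask(patches, mask, fill=0.0):
    """Replace every masked patch with a constant fill patch of the same length."""
    return [[fill] * len(p) if m else list(p) for p, m in zip(patches, mask)]

def reconstruction_loss(pred, target, mask):
    """Mean squared error computed over masked patches only."""
    errs = [
        (a - b) ** 2
        for p, t, m in zip(pred, target, mask) if m
        for a, b in zip(p, t)
    ]
    return sum(errs) / len(errs) if errs else 0.0
```

In a training loop, a model (Transformer or CNN) would take the output of `apply_mask` as input, and `reconstruction_loss` would compare its prediction with the original patches.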
EVNet: An Explainable Deep Network for Dimension Reduction
Dimension reduction (DR) is commonly utilized to capture the intrinsic
structure and transform high-dimensional data into low-dimensional space while
retaining meaningful properties of the original data. It is used in various
applications, such as image recognition, single-cell sequencing analysis, and
biomarker discovery. However, contemporary non-parametric and parametric DR
techniques suffer from several significant shortcomings, such as the inability
to preserve global and local features and poor generalization performance.
Regarding explainability, it is crucial to comprehend the embedding process,
especially the contribution of each part to it, and to understand how each
feature affects the embedding results, which identifies critical components
and helps diagnose the embedding process. To
address these problems, we have developed a deep neural network method called
EVNet, which provides not only excellent structure-preserving performance
but also explainability of the DR process. EVNet starts with
data augmentation and a manifold-based loss function to improve embedding
performance. The explanation is based on saliency maps and aims to examine the
trained EVNet parameters and contributions of components during the embedding
process. The proposed techniques are integrated with a visual interface to help
the user to adjust EVNet to achieve better DR performance and explainability.
The interactive visual interface makes it easier to illustrate the data
features, compare different DR techniques, and investigate DR. An in-depth
experimental comparison shows that EVNet consistently outperforms the
state-of-the-art methods in both performance measures and explainability.
Comment: 18 pages, 15 figures, accepted by TVC
Machine Learning for Predictive Deployment of UAVs With Multiple Access
This paper presents a machine learning-based framework for the predictive deployment of unmanned aerial vehicles (UAVs) as flying base stations (BSs) to offload heavy traffic from ground BSs. To account for time-varying traffic distribution, a long short-term memory (LSTM)-based prediction algorithm is introduced to predict future cellular traffic. A joint K-means and expectation maximization (EM) algorithm based on Gaussian mixture models (GMM) is proposed to determine the service area of each UAV based on the predicted user service distribution. Based on the predicted traffic, the optimal positions of UAVs are derived, and four multiple access techniques, namely, rate splitting multiple access (RSMA), frequency domain multiple access (FDMA), time domain multiple access (TDMA), and non-orthogonal multiple access (NOMA), are compared to minimize the total transmit power. Simulation results show that the proposed method can reduce the total power consumption by up to 24% compared to the conventional method without traffic prediction. Furthermore, RSMA is found to require the lowest transmit power among the four multiple access techniques. Therefore, this paper focuses on the comparison of multiple access techniques for UAV deployment, which is essential for the efficient and effective use of UAVs as flying BSs.
Panoramic Manifold Projection (Panoramap) for Single-Cell Data Dimensionality Reduction and Visualization
Nonlinear dimensionality reduction (NLDR) methods such as t-Distributed Stochastic Neighbour Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) have been widely used for biological data exploration, especially in single-cell analysis. However, the existing methods have drawbacks in preserving the geometric and topological structures of the data. A high-dimensional data analysis method, called Panoramic manifold projection (Panoramap), was developed as an enhanced deep learning framework for structure-preserving NLDR. Panoramap enhances deep neural networks by using cross-layer geometry-preserving constraints. The constraints constitute the loss for deep manifold learning and serve as geometric regularizers for NLDR network training. Therefore, Panoramap has better performance in preserving global structures of the original data. Here, we apply Panoramap to single-cell datasets and show that Panoramap excels at delineating the cell type lineage/hierarchy and can reveal rare cell types. Panoramap can facilitate trajectory inference and has the potential to aid in the early diagnosis of tumors. Panoramap gives improved and more biologically plausible visualization and interpretation of single-cell data. Panoramap can be readily used in single-cell research domains and other research fields that involve high-dimensional data analysis.