15 research outputs found

    Machine Learning for Predictive Deployment of UAVs with Multiple Access

    Full text link
    In this paper, a machine learning based deployment framework of unmanned aerial vehicles (UAVs) is studied. In the considered model, UAVs are deployed as flying base stations (BS) to offload heavy traffic from ground BSs. Due to time-varying traffic distribution, a long short-term memory (LSTM) based prediction algorithm is introduced to predict the future cellular traffic. To predict the user service distribution, a KEG algorithm, which is a joint K-means and expectation maximization (EM) algorithm based on Gaussian mixture model (GMM), is proposed for determining the service area of each UAV. Based on the predicted traffic, the optimal UAV positions are derived and three multi-access techniques are compared so as to minimize the total transmit power. Simulation results show that the proposed method can reduce up to 24\% of the total power consumption compared to the conventional method without traffic prediction. Besides, rate splitting multiple access (RSMA) has the lower required transmit power compared to frequency domain multiple access (FDMA) and time domain multiple access (TDMA)

    Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch

    Full text link
    Unsupervised contrastive learning methods have recently seen significant improvements, particularly through data augmentation strategies that aim to produce robust and generalizable representations. However, prevailing data augmentation methods, whether hand designed or based on foundation models, tend to rely heavily on prior knowledge or external data. This dependence often compromises their effectiveness and efficiency. Furthermore, the applicability of most existing data augmentation strategies is limited when transitioning to other research domains, especially science-related data. This limitation stems from the paucity of prior knowledge and labeled data available in these domains. To address these challenges, we introduce DiffAug-a novel and efficient Diffusion-based data Augmentation technique. DiffAug aims to ensure that the augmented and original data share a smoothed latent space, which is achieved through diffusion steps. Uniquely, unlike traditional methods, DiffAug first mines sufficient prior semantic knowledge about the neighborhood. This provides a constraint to guide the diffusion steps, eliminating the need for labels, external data/models, or prior knowledge. Designed as an architecture-agnostic framework, DiffAug provides consistent improvements. Specifically, it improves image classification and clustering accuracy by 1.6%~4.5%. When applied to biological data, DiffAug improves performance by up to 10.1%, with an average improvement of 5.8%. DiffAug shows good performance in both vision and biological domains.Comment: arXiv admin note: text overlap with arXiv:2302.07944 by other author

    Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment

    Full text link
    Self-supervised contrastive learning has demonstrated great potential in learning visual representations. Despite their success on various downstream tasks such as image classification and object detection, self-supervised pre-training for fine-grained scenarios is not fully explored. In this paper, we first point out that current contrastive methods are prone to memorizing background/foreground texture and therefore have a limitation in localizing the foreground object. Analysis suggests that learning to extract discriminative texture information and localization are equally crucial for self-supervised pre-training in fine-grained scenarios. Based on our findings, we introduce cross-view saliency alignment (CVSA), a contrastive learning framework that first crops and swaps saliency regions of images as a novel view generation and then guides the model to localize on the foreground object via a cross-view alignment loss. Extensive experiments on four popular fine-grained classification benchmarks show that CVSA significantly improves the learned representation.Comment: The second version of CVSA. 10 pages, 4 figure

    Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

    Full text link
    Masked image modeling (MIM), an emerging self-supervised pre-training method, has shown impressive success across numerous downstream vision tasks with Vision transformers (ViTs). Its underlying idea is simple: a portion of the input image is randomly masked out and then reconstructed via the pre-text task. However, the working principle behind MIM is not well explained, and previous studies insist that MIM primarily works for the Transformer family but is incompatible with CNNs. In this paper, we first study interactions among patches to understand what knowledge is learned and how it is acquired via the MIM task. We observe that MIM essentially teaches the model to learn better middle-order interactions among patches and extract more generalized features. Based on this fact, we propose an Architecture-Agnostic Masked Image Modeling framework (A2^2MIM), which is compatible with both Transformers and CNNs in a unified way. Extensive experiments on popular benchmarks show that our A2^2MIM learns better representations without explicit design and endows the backbone model with the stronger capability to transfer to various downstream tasks for both Transformers and CNNs.Comment: Preprint under review (update reversion). The source code will be released in https://github.com/Westlake-AI/openmixu

    EVNet: An Explainable Deep Network for Dimension Reduction

    Full text link
    Dimension reduction (DR) is commonly utilized to capture the intrinsic structure and transform high-dimensional data into low-dimensional space while retaining meaningful properties of the original data. It is used in various applications, such as image recognition, single-cell sequencing analysis, and biomarker discovery. However, contemporary parametric-free and parametric DR techniques suffer from several significant shortcomings, such as the inability to preserve global and local features and the pool generalization performance. On the other hand, regarding explainability, it is crucial to comprehend the embedding process, especially the contribution of each part to the embedding process, while understanding how each feature affects the embedding results that identify critical components and help diagnose the embedding process. To address these problems, we have developed a deep neural network method called EVNet, which provides not only excellent performance in structural maintainability but also explainability to the DR therein. EVNet starts with data augmentation and a manifold-based loss function to improve embedding performance. The explanation is based on saliency maps and aims to examine the trained EVNet parameters and contributions of components during the embedding process. The proposed techniques are integrated with a visual interface to help the user to adjust EVNet to achieve better DR performance and explainability. The interactive visual interface makes it easier to illustrate the data features, compare different DR techniques, and investigate DR. An in-depth experimental comparison shows that EVNet consistently outperforms the state-of-the-art methods in both performance measures and explainability.Comment: 18 pages, 15 figures, accepted by TVC

    Machine Learning for Predictive Deployment of UAVs With Multiple Access

    No full text
    This paper presents a machine learning-based framework for the predictive deployment of unmanned aerial vehicles (UAVs) as flying base stations (BSs) to offload heavy traffic from ground BSs. To account for time-varying traffic distribution, a long short-term memory (LSTM)-based prediction algorithm is introduced to predict future cellular traffic. A joint K-means and expectation maximization (EM) algorithm based on Gaussian mixture models (GMM) is proposed to determine the service area of each UAV based on the predicted user service distribution. Based on the predicted traffic, the optimal positions of UAVs are derived, and four multiple access techniques, namely, rate splitting multiple access (RSMA), frequency domain multiple access (FDMA), time domain multiple access (TDMA), and non-orthogonal multiple access (NOMA), are compared to minimize the total transmit power. Simulation results show that the proposed method can reduce up to 24% of the total power consumption compared to the conventional method without traffic prediction. Furthermore, RSMA is found to require the lowest transmit power among the four multiple access techniques. Therefore, this paper focuses on the comparison of multiple access techniques for UAV deployment, which is essential for the efficient and effective use of UAVs as flying BSs

    Panoramic Manifold Projection (Panoramap) for Single-Cell Data Dimensionality Reduction and Visualization

    No full text
    Nonlinear dimensionality reduction (NLDR) methods such as t-Distributed Stochastic Neighbour Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) have been widely used for biological data exploration, especially in single-cell analysis. However, the existing methods have drawbacks in preserving data’s geometric and topological structures. A high-dimensional data analysis method, called Panoramic manifold projection (Panoramap), was developed as an enhanced deep learning framework for structure-preserving NLDR. Panoramap enhances deep neural networks by using cross-layer geometry-preserving constraints. The constraints constitute the loss for deep manifold learning and serve as geometric regularizers for NLDR network training. Therefore, Panoramap has better performance in preserving global structures of the original data. Here, we apply Panoramap to single-cell datasets and show that Panoramap excels at delineating the cell type lineage/hierarchy and can reveal rare cell types. Panoramap can facilitate trajectory inference and has the potential to aid in the early diagnosis of tumors. Panoramap gives improved and more biologically plausible visualization and interpretation of single-cell data. Panoramap can be readily used in single-cell research domains and other research fields that involve high dimensional data analysis
    corecore