103 research outputs found

    Sentiment analysis with adaptive multi-head attention in Transformer

    We propose a novel framework based on the attention mechanism to identify the sentiment of a movie review document. Previous efforts on deep neural networks with attention mechanisms have focused on encoders and decoders with fixed numbers of attention heads, so a mechanism is needed to stop the attention process automatically once no more useful information can be read from memory. In this paper, we propose an adaptive multi-head attention architecture (AdaptAttn) that varies the number of attention heads with sentence length. AdaptAttn has a data preprocessing step in which each document is classified into one of three bins, small, medium, or large, based on sentence length. Documents classified as small go through two heads in each layer, the medium group passes through four heads, and the large group is processed by eight heads. We examine the merit of our model on the Stanford large movie review dataset. The experimental results show that the F1 score of our model is on par with the baseline model. Comment: Accepted by the 4th International Conference on Signal Processing and Machine Learning
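The length-based routing described in this abstract can be sketched as a simple length-to-heads mapping; the bin cutoffs below are illustrative assumptions, since the abstract does not state the exact thresholds:

```python
def heads_for_length(num_tokens, small_max=64, medium_max=256):
    """Map document length to a head count per layer: small -> 2 heads,
    medium -> 4 heads, large -> 8 heads. The cutoffs 64/256 are
    hypothetical, not taken from the paper."""
    if num_tokens <= small_max:
        return 2
    if num_tokens <= medium_max:
        return 4
    return 8

# Each bin routes the document through a different number of heads.
print(heads_for_length(30), heads_for_length(100), heads_for_length(1000))  # → 2 4 8
```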

    Long time behaviors for the inhomogeneous NLS with a potential in $\mathbb{R}^3$

    In this article, we aim to study the scattering of the solution to the focusing inhomogeneous nonlinear Schrödinger equation with a potential, of the form \begin{align*} i\partial_t u+\Delta u- Vu=-|x|^{-b}|u|^{p-1}u \end{align*} in the energy space $H^1(\mathbb{R}^3)$. We prove a scattering criterion and then use it together with a Morawetz estimate to establish the scattering theory, which generalizes the results of Dinh \cite{DD} to the non-radially symmetric case. Comment: In this version, we correct some mistakes and change the title
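For readers outside the area, scattering in the energy space can be stated as follows; the operator convention ($H = -\Delta + V$) is our hedged reading of the setup, not quoted from the paper:

```latex
% The solution u scatters in H^1(\mathbb{R}^3) if it asymptotically
% follows the linear flow generated by H = -\Delta + V:
\exists\, u_{\pm} \in H^1(\mathbb{R}^3):\quad
\lim_{t \to \pm\infty}
\bigl\| u(t) - e^{-itH} u_{\pm} \bigr\|_{H^1(\mathbb{R}^3)} = 0.
```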

    Evolution and Efficiency in Neural Architecture Search: Bridging the Gap Between Expert Design and Automated Optimization

    The paper provides a comprehensive overview of Neural Architecture Search (NAS), emphasizing its evolution from manual design to automated, computationally driven approaches. It covers the inception and growth of NAS, highlighting its application across various domains, including medical imaging and natural language processing. The document details the shift from expert-driven design to algorithm-driven processes, exploring initial methodologies such as reinforcement learning and evolutionary algorithms. It also discusses the challenge of computational demands and the emergence of efficient NAS methodologies, such as Differentiable Architecture Search and hardware-aware NAS. The paper further elaborates on NAS's applications in computer vision, NLP, and beyond, demonstrating its versatility and potential for optimizing neural network architectures across different tasks. Future directions and challenges, including computational efficiency and integration with emerging AI domains, are addressed, showcasing NAS's dynamic nature and continued evolution toward more sophisticated and efficient architecture search methods. Comment: 7 pages, double column

    Joint Detection Algorithm for Multiple Cognitive Users in Spectrum Sensing

    Spectrum sensing technology is a crucial aspect of modern communication technology, serving as one of the essential techniques for efficiently utilizing scarce spectrum resources in crowded frequency bands. This paper first introduces three common logical-circuit decision criteria for hard decisions and analyzes their decision rigor. Building upon hard decisions, the paper then introduces a method for multi-user spectrum sensing based on soft decisions, and simulates the false alarm probability and detection probability curves corresponding to the three criteria. The simulation results for multi-user collaborative sensing indicate that collaboration significantly reduces the false alarm probability and enhances the detection probability. This approach effectively detects spectrum resources left unoccupied during idle periods, leveraging the concept of time-division multiplexing and rationalizing the redistribution of information resources. The entire computation relies on the principles of power spectral density in communication theory, applying threshold detection to the noise power and to the sum of noise and signal power. It provides a secondary decision detection, reflecting the decision performance of the logical detection methods with relative accuracy. Comment: https://aei.ewapublishing.org/article.html?pk=e24c40d220434209ae2fe2e984bcf2c
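The hard-decision stage described above can be sketched as an energy detector per cognitive user plus a logical fusion rule at the fusion center; the threshold, signal level, and sample count below are illustrative assumptions, not values from the paper:

```python
# Minimal sketch of hard-decision energy detection with three logical
# fusion rules (AND, OR, majority). All numeric parameters are illustrative.
import random

def local_decision(samples, threshold):
    """One cognitive user's energy detector: average power vs. threshold."""
    energy = sum(s * s for s in samples) / len(samples)
    return energy > threshold

def fuse(decisions, rule):
    """Combine per-user binary decisions with a logical fusion rule."""
    if rule == "and":
        return all(decisions)
    if rule == "or":
        return any(decisions)
    if rule == "majority":
        return sum(decisions) > len(decisions) / 2
    raise ValueError(rule)

random.seed(0)
users = 5
threshold = 1.5          # sits above the unit noise power
signal_amplitude = 1.2

# Signal present: each user sees signal plus unit-variance Gaussian noise.
decisions = []
for _ in range(users):
    samples = [signal_amplitude + random.gauss(0, 1) for _ in range(200)]
    decisions.append(local_decision(samples, threshold))

print(fuse(decisions, "or"), fuse(decisions, "majority"))
```

The OR rule is the most permissive (best detection probability, worst false alarm), AND the most conservative, with majority voting in between, which is the trade-off the three criteria above formalize.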

    Optimizing the Passenger Flow for Airport Security Check

    Because of the security requirements of airports and flights, passengers must undergo strict security checks before boarding. However, there are frequent complaints about the huge amount of time wasted waiting for these checks. This paper presents a potential solution aimed at optimizing gate setup procedures, tailored specifically to Chicago O'Hare International Airport. Drawing on queueing theory and Monte Carlo simulations, we propose an approach that significantly diminishes the average waiting time to a more manageable level. Additionally, our study examines and identifies the influential factors contributing to this optimization, providing a comprehensive understanding of their impact.
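A toy version of the Monte Carlo approach can be sketched as a multi-lane queue with exponential interarrival and service times; the rates and passenger counts below are illustrative stand-ins, not the paper's calibrated values:

```python
# Toy Monte Carlo model of a security-check queue: `num_lanes` parallel
# lanes, exponential interarrival and service times. Rates are illustrative.
import random

def average_wait(num_lanes, arrival_rate, service_rate, n_passengers=20000, seed=1):
    rng = random.Random(seed)
    free_at = [0.0] * num_lanes      # when each lane next becomes free
    t = 0.0
    total_wait = 0.0
    for _ in range(n_passengers):
        t += rng.expovariate(arrival_rate)          # next arrival time
        lane = min(range(num_lanes), key=lambda i: free_at[i])
        start = max(t, free_at[lane])               # wait if lane is busy
        total_wait += start - t
        free_at[lane] = start + rng.expovariate(service_rate)
    return total_wait / n_passengers

# Opening more lanes should cut the average wait sharply.
w3 = average_wait(num_lanes=3, arrival_rate=2.0, service_rate=1.0)
w5 = average_wait(num_lanes=5, arrival_rate=2.0, service_rate=1.0)
print(w3, w5)
```

Sweeping `num_lanes` in a model like this is one way to locate the kind of influential factors the study examines.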

    Hybrid FedGraph: An efficient hybrid federated learning algorithm using graph convolutional neural network

    Federated learning is an emerging paradigm for decentralized training of machine learning models on distributed clients without revealing the data to the central server. Most existing work has focused on horizontal or vertical data distributions, where each client possesses different samples with shared features, or clients share only sample indices, respectively. The hybrid scheme, however, is much less studied, even though it is far more common in the real world. In this paper, we therefore propose a generalized algorithm, FedGraph, which introduces a graph convolutional neural network to capture feature-sharing information while learning features from a subset of clients. We also develop a simple but effective clustering algorithm that aggregates the features produced by each client's deep neural network while preserving data privacy.
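The server-side aggregation step can be illustrated schematically: cluster client feature vectors, then average within each cluster. The paper's method uses a graph convolutional network; the plain distance-based grouping below is a simplified stand-in for illustration only:

```python
# Hedged sketch of cluster-then-aggregate over client feature vectors.
# The greedy radius-based grouping is a toy substitute for the paper's
# GCN-based aggregation; all data here is synthetic.
def cluster_and_aggregate(client_features, radius):
    """Greedily group clients whose vectors lie within `radius` (Euclidean)
    of a cluster's first member, then average each group."""
    clusters = []                      # list of (seed_vector, members)
    for vec in client_features:
        for seed, members in clusters:
            dist = sum((a - b) ** 2 for a, b in zip(vec, seed)) ** 0.5
            if dist <= radius:
                members.append(vec)
                break
        else:
            clusters.append((vec, [vec]))
    aggregated = []
    for _, members in clusters:
        dim = len(members[0])
        aggregated.append([sum(m[i] for m in members) / len(members)
                           for i in range(dim)])
    return aggregated

feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
print(cluster_and_aggregate(feats, radius=1.0))
```

Only the aggregated cluster means leave the server-side step, which is the privacy-preserving intent of aggregating features rather than raw data.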

    New construction of mutually orthogonal complementary sequence sets

    To address the limitations of existing design methods and the scarcity of construction parameters for mutually orthogonal complementary sequence sets (MOCSS), a construction method for MOCSS based on paraunitary (PU) matrices is proposed. The new concept of coefficient paraunitary (CPU) matrices is defined, and by employing matrix multiplication, the Kronecker product, and matrix iteration techniques, three types of PU matrices of varying sizes are constructed. Using the equivalence between PU matrices and MOCSS, a series of multi-phase MOCSS with flexible parameter selection is developed, filling a parameter gap in the existing literature. To suppress the peak-to-average power ratio (PAPR) in multi-carrier code division multiple access (MC-CDMA) systems, a class of CPU matrices with low column-vector PAPR is designed using Boolean functions. Experimental results demonstrate that the MOCSS constructed from such CPU matrices keep the column-sequence PAPR below two while maintaining flexibility in code capacity and length, providing a variety of signal selection options for such systems.
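The defining property being constructed here can be checked on the smallest classical example: for a complementary pair, the aperiodic autocorrelations sum to an impulse. The paraunitary construction in the paper is far more general; the Golay pair below only illustrates the property itself:

```python
# Verify the complementary property on the classic length-2 Golay pair:
# the aperiodic autocorrelations of a and b sum to an impulse.
def aperiodic_autocorr(seq, shift):
    """Aperiodic autocorrelation of `seq` at a nonnegative `shift`."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))

a = [1, 1]
b = [1, -1]

for shift in range(len(a)):
    total = aperiodic_autocorr(a, shift) + aperiodic_autocorr(b, shift)
    print(shift, total)   # impulse: 4 at shift 0, 0 at shift 1
```

It is this impulse-like sum, extended to whole sets of sequences, that makes MOCSS attractive for controlling interference and PAPR in MC-CDMA systems.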

    SAMPLE-BASED DYNAMIC HIERARCHICAL TRANSFORMER WITH LAYER AND HEAD FLEXIBILITY VIA CONTEXTUAL BANDIT

    Transformers require a fixed number of layers and heads, which makes them inflexible with respect to the complexity of individual samples and expensive in training and inference. To address this, we propose a sample-based Dynamic Hierarchical Transformer (DHT) whose layers and heads can be dynamically configured per data sample by solving contextual bandit problems. We use the Upper Confidence Bound algorithm to determine the number of layers and heads, and deploy combinatorial Thompson Sampling to select specific head combinations given that number. Unlike previous work that focuses on compressing trained networks for inference only, DHT not only adaptively optimizes the underlying network architecture during training but also yields a flexible network for efficient inference. To the best of our knowledge, this is the first comprehensive data-driven dynamic transformer that implements the dynamic system without any additional auxiliary neural networks. According to the experimental results, we achieve up to 74% computational savings in both training and inference with a minimal loss of accuracy.
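The bandit step for choosing a layer count can be sketched with standard UCB1; the candidate arms, reward values, and noise model below are illustrative assumptions, not the paper's setup:

```python
# UCB1 sketch for picking among candidate layer counts by observed reward
# (e.g. validation accuracy minus a compute cost). All numbers are synthetic.
import math
import random

def ucb1_select(counts, values, t):
    """Pick the arm maximizing mean + sqrt(2 ln t / n); try unseen arms first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(counts)),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

arms = [2, 4, 6]                          # candidate numbers of layers
true_reward = {2: 0.3, 4: 0.9, 6: 0.4}    # synthetic mean rewards
counts = [0] * len(arms)
values = [0.0] * len(arms)
rng = random.Random(0)

for t in range(1, 2001):
    a = ucb1_select(counts, values, t)
    r = true_reward[arms[a]] + rng.gauss(0, 0.05)   # noisy observed reward
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]        # running mean update

best = arms[max(range(len(arms)), key=lambda a: counts[a])]
print(best)   # with this reward gap, UCB concentrates its pulls on 4 layers
```

In the paper's framing the reward would come from the actual sample's loss, making the configuration contextual rather than fixed as in this toy.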

    FEDEMB: A VERTICAL AND HYBRID FEDERATED LEARNING ALGORITHM USING NETWORK AND FEATURE EMBEDDING AGGREGATION

    Federated learning (FL) is an emerging paradigm for decentralized training of machine learning models on distributed clients without revealing the data to the central server. The learning scheme may be horizontal, vertical, or hybrid (both vertical and horizontal). Most existing research on deep neural network (DNN) modeling focuses on horizontal data distributions, while vertical and hybrid schemes are much less studied. In this paper, we propose a generalized algorithm, FedEmb, for modeling vertical and hybrid DNN-based learning. Our algorithm offers higher inference accuracy, stronger privacy-preserving properties, and lower client-server communication bandwidth demands compared with existing work. The experimental results show that FedEmb is an effective method for tackling both split-feature and split-subject-space decentralized problems, achieving a 0.3% to 4.2% improvement in inference accuracy and an 88.9% reduction in time complexity over the baseline method.
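The embedding-aggregation idea for the vertical case can be illustrated schematically: each client embeds only its own feature columns for the shared subjects, and the server combines embeddings instead of raw features. The linear "embeddings" and data below are toy stand-ins for the paper's DNN-based ones:

```python
# Hedged sketch of vertical-FL embedding aggregation: clients share
# embeddings of their private feature columns, never the columns themselves.
def client_embed(feature_columns, weights):
    """One client's local embedding: a weighted sum of its private columns
    (a linear stand-in for the paper's per-client DNN)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in feature_columns]

# Three shared subjects; client A holds 2 features, client B holds 3.
client_a = [[1.0, 2.0], [0.5, 1.0], [2.0, 0.0]]
client_b = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

emb_a = client_embed(client_a, weights=[0.5, 0.5])
emb_b = client_embed(client_b, weights=[1.0, 1.0, 1.0])

# The server sees only embeddings, aligned per subject index.
aggregated = [[a, b] for a, b in zip(emb_a, emb_b)]
print(aggregated)
```

Because only the low-dimensional embeddings cross the network, this style of aggregation is also what keeps the client-server bandwidth demand low.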