53 research outputs found

    Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

    Full text link
    Pretrained transformer models have demonstrated remarkable performance across various natural language processing tasks. These models leverage the attention mechanism to capture long- and short-range dependencies in the sequence. However, the (full) attention mechanism incurs a high computational cost, quadratic in the sequence length, which is not affordable in tasks with long sequences, e.g., inputs with 8k tokens. Although sparse attention can be used to improve computational efficiency, as suggested in existing work, it has limited modeling capacity and often fails to capture complicated dependencies in long sequences. To tackle this challenge, we propose MASFormer, an easy-to-implement transformer variant with Mixed Attention Spans. Specifically, MASFormer is equipped with full attention to capture long-range dependencies, but only at a small number of layers. For the remaining layers, MASFormer employs only sparse attention to capture short-range dependencies. Our experiments on natural language modeling and generation tasks show that a decoder-only MASFormer model of 1.3B parameters can achieve performance competitive with vanilla transformers with full attention while significantly reducing computational cost (up to 75%). Additionally, we investigate the effectiveness of continual training with long sequence data and how sequence length impacts downstream generation performance, which may be of independent interest. Comment: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings)

    Research of Frequency Splitting Caused by Uneven Mass of Micro-Hemispherical Resonator Gyro

    No full text
    In practical engineering, the frequency splitting of the Hemispherical Resonator Gyro (HRG) caused by uneven mass distribution seriously affects the precision of the HRG, so the inherent frequency is an important parameter of the micro-Hemispherical Resonator Gyro (m-HRG). During processing, morphological errors and internal defects arise in the hemispherical resonator; these affect the inherent frequency and the working mode of the m-HRG and reduce its precision and performance. To improve the precision and performance of the m-HRG, this paper solves the partial differential equation of the hemispherical resonator and establishes, using ANSYS software, a three-dimensional model that accurately reflects the actual shape. Then, the modes of the hemispherical resonator in the ideal state and in the uneven mass distribution state are simulated and analyzed. The frequency splitting mechanism of the hemispherical resonator is determined by calculation and demonstration, and the frequency splitting is suppressed by partial mass elimination. The results show that the absolute balance of energy can ensure a high quality factor and minimum frequency splitting of the hemispherical resonator. Therefore, during processing of the hemispherical resonator, mass balance should be achieved as far as possible, avoiding surface damage, internal defects and uneven mass distribution, to guarantee a high quality factor Q and minimum frequency splitting

    TiO2/PFO inorganic-organic hybrid heterojunction for self-powered blue light photodetector

    No full text
    Organic-inorganic hybrid heterojunctions are very promising for low-cost, high-performance photodetection devices. We have fabricated a heterojunction composed of n-type inorganic TiO2 nanorod arrays and the p-type organic semiconductor polymer [9,9-bis-(2-ethylhexyl)-9H-fluorene-2,7-diyl] (PFO). A peak of 0.32 mA W⁻¹ was observed at 410 nm in the spectral responsivity curve. The photocurrent response was rapid, repeatable and consistent. Under weak blue light (410 nm, 75 μW cm⁻²) irradiance, the rise time and decay time were observed to be 0.16 and 0.12 s, respectively. These results demonstrate that the TiO2/PFO heterojunction is promising for the development of self-powered blue light photodetectors with high spectral selectivity

    DDSG-GAN: Generative Adversarial Network with Dual Discriminators and Single Generator for Black-Box Attacks

    No full text
    As one of the top ten security threats faced by artificial intelligence, the adversarial attack has caused scholars to think deeply from theory to practice. However, in the black-box attack scenario, how to raise the visual quality of an adversarial example (AE) and perform a more efficient query should be further explored. This study aims to use the architecture of GAN combined with the model-stealing attack to train surrogate models and generate high-quality AE. This study proposes an image AE generation method based on the generative adversarial networks with dual discriminators and a single generator (DDSG-GAN) and designs the corresponding loss function for each model. The generator can generate adversarial perturbation, and two discriminators constrain the perturbation, respectively, to ensure the visual quality and attack effect of the generated AE. We extensively experiment on MNIST, CIFAR10, and Tiny-ImageNet datasets. The experimental results illustrate that our method can effectively use query feedback to generate an AE, which significantly reduces the number of queries on the target model and can implement effective attacks

    Serum Elabela expression is decreased in hypertensive patients and could be associated with the progression of hypertensive renal damage

    No full text
    Abstract. Background: Elabela, a recently discovered hormonal peptide containing 32 amino acids, is a ligand for the apelin receptor. It can lower blood pressure and attenuate renal fibrosis. However, the clinicopathological relationship between Elabela level and renal damage caused by benign hypertension (BHT) and malignant hypertension (MHT) has not been elucidated. Therefore, we investigated the clinicopathological correlation between serum Elabela level and renal damage caused by BHT and MHT. Methods: The participants comprised 50 patients and 25 age-matched healthy adults. The 50 patients were separated into two groups: MHT (n = 25) and BHT (n = 25). We analyzed their medical histories, demographics, and clinical examinations, including physical and laboratory tests. Results: Serum Elabela level decreased gradually with a continuous increase in blood pressure from the healthy control group, to BHT, to MHT. Moreover, Elabela levels negatively correlated with BMI (R = −0.27, P = 0.02), SBP (r = −0.64, P < 0.01), DBP (r = −0.58, P < 0.01), uric acid (r = −0.39, P < 0.01), BUN (r = −0.53, P < 0.01), and Scr (r = −0.53, P < 0.01) but positively correlated with eGFR (r = 0.54, P < 0.01). Stepwise multivariate linear regression analysis showed that SBP was the variable most related to Elabela (t = −5.592, P < 0.01). Conclusions: Serum Elabela levels decreased in patients with hypertension, especially malignant hypertension, and have the potential to be a marker of hypertension-related kidney damage