
    Segatron: Segment-Aware Transformer for Language Modeling and Understanding

    Transformers are powerful for sequence modeling. Nearly all state-of-the-art language models and pre-trained language models are based on the Transformer architecture. However, the Transformer distinguishes sequential tokens only by their token position index. We hypothesize that the Transformer can generate better contextual representations when given richer positional information. To verify this, we propose a segment-aware Transformer (Segatron), which replaces the original token position encoding with a combined position encoding of paragraph, sentence, and token. We first introduce the segment-aware mechanism to Transformer-XL, a popular Transformer-based language model with memory extension and relative position encoding. We find that our method further improves the Transformer-XL base and large models, achieving 17.1 perplexity on the WikiText-103 dataset. We further investigate the pre-training masked language modeling task with Segatron. Experimental results show that BERT pre-trained with Segatron (SegaBERT) outperforms BERT with the vanilla Transformer on various NLP tasks, and outperforms RoBERTa on zero-shot sentence representation learning. Comment: Accepted by AAAI 202

    Distributional Drift Adaptation with Temporal Conditional Variational Autoencoder for Multivariate Time Series Forecasting

    Because real-world multivariate time series (MTS) are nonstationary, their distribution changes over time, a phenomenon known as distribution drift. Most existing MTS forecasting models suffer greatly from distribution drift, and their forecasting performance degrades over time. Existing methods address distribution drift by adapting to the most recently arrived data or by self-correcting according to meta-knowledge derived from future data. Despite their great success in MTS forecasting, these methods hardly capture the intrinsic distribution changes, especially from a distributional perspective. Accordingly, we propose a novel framework, the temporal conditional variational autoencoder (TCVAE), to model the dynamic distributional dependencies over time between historical observations and future data in MTSs and to infer these dependencies as a temporal conditional distribution that leverages latent variables. Specifically, a novel temporal Hawkes attention mechanism represents temporal factors, which are subsequently fed into feed-forward networks to estimate the prior Gaussian distribution of the latent variables. The representation of temporal factors further dynamically adjusts the structures of the Transformer-based encoder and decoder to distribution changes by leveraging a gated attention mechanism. Moreover, we introduce a conditional continuous normalization flow to transform the prior Gaussian into a complex, form-free distribution, which facilitates flexible inference of the temporal conditional distribution. Extensive experiments on six real-world MTS datasets demonstrate TCVAE's superior robustness and effectiveness over state-of-the-art MTS forecasting baselines. We further illustrate TCVAE's applicability through multifaceted case studies and visualization in real-world scenarios. Comment: 13 pages, 6 figures, submitted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

    RF EMF Exposure Compliance of mmWave Array Antennas for 5G User Equipment Application


    A case report of multiple aneurysmal bone cysts


    Abnormal magnetoresistance behavior in Nb thin film with rectangular antidot lattice

    Abnormal magnetoresistance behavior is found in superconducting Nb films perforated with rectangular arrays of antidots (holes). Magnetoresistance is generally found to increase with increasing magnetic field. Here we observe a reversal of this behavior, in particular at low temperature or current density. This phenomenon is due to a strong 'caging effect', in which interstitial vortices are strongly trapped among pinned multivortices. Comment: 4 pages, 2 figures

    Numerical simulation of thermal stratification in Lake Qiandaohu using an improved WRF-Lake model

    Lake thermal stratification is important for regulating lake environments and ecosystems and is sensitive to climate change and human activity. However, numerical simulation of coupled hydrodynamics and heat transfer processes in deep lakes using one-dimensional lake models remains challenging because of the insufficient representation of key parameters. In this study, Lake Qiandaohu, a deep and warm monomictic reservoir, was used as an example to investigate thermal stratification via an improved parameterization scheme of the Weather Research and Forecast (WRF)-Lake model. A comparison with in situ observations demonstrated that the default WRF-Lake model simulated the seasonal variation of the lake thermal structure well. However, the simulations exhibited cold biases in lake surface water temperature (LSWT) throughout the year and generated weaker stratification in summer, leading to an earlier cooling period in autumn. With an improved parameterization (i.e., via determination of initial lake water temperature profiles, light extinction coefficients, eddy diffusion coefficients and surface roughness lengths), the modified WRF-Lake model was able to better simulate LSWT and thermal stratification. Critically, employing realistic initial conditions for lake water temperature is essential for producing realistic hypolimnetic water temperatures. The use of time-dependent light extinction coefficients resulted in a deep thermocline and warm LSWT. Enlarging eddy diffusivity led to stronger mixing in summer and further influenced autumn cooling. The parameterized surface roughness lengths mitigated the excessive turbulent heat loss at the lake surface, improved the model performance in simulating LSWT, and generated a warm mixed layer. This study provides guidance on model parameterization for simulating the thermal structure of deep lakes and advances our understanding of the strength and evolution of lake thermal stratification under seasonal changes.

    Uplift, Climate and Biotic Changes at the Eocene-Oligocene Transition in Southeast Tibet

    The uplift history of southeastern Tibet is crucial to understanding processes driving the tectonic evolution of the Tibetan Plateau and surrounding areas. Underpinning existing palaeoaltimetric studies has been regional mapping based in large part on biostratigraphy that assumes a Neogene modernization of the highly diverse, but threatened, Asian biota. Here, with new radiometric dating and newly collected plant fossil archives, we quantify the surface height of part of Tibet's southeastern margin in the latest Eocene (~34 Ma) to be ~3 km and rising, possibly attaining its present elevation (3.9 km) in the early Oligocene. We also find that the Eocene-Oligocene transition in southeastern Tibet witnessed leaf size diminution and a floral composition change from sub-tropical/warm temperate to cool temperate, likely reflecting both uplift and secular climate change, and that by the latest Eocene floral modernization on Tibet had already taken place, implying that modernization was deeply rooted in the Paleogene.