7 research outputs found

    DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training

    Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance through alterations to the model structure or by integrating additional features such as graph relations, these approaches often require additional pre-training with external dialogue corpora. In this study, we propose DSTEA, improving Dialogue State Tracking via Entity Adaptive pre-training, which enhances the encoder by intensively training on key entities in dialogue utterances. DSTEA identifies these pivotal entities from input dialogues using four different methods: ontology information, named-entity recognition, spaCy, and the flair library. Subsequently, it employs selective knowledge masking to train the model effectively. Remarkably, DSTEA only requires pre-training, without the direct infusion of extra knowledge into the DST model. This approach resulted in substantial performance improvements for four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal accuracy increasing by up to 2.69% (from 52.41% to 55.10%). Further validation of DSTEA's efficacy was provided through comparative experiments considering various entity types and different entity adaptive pre-training configurations, such as masking strategy and masking rate.
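The selective masking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `entity_adaptive_mask`, the two masking rates, and the token-level matching against an entity set are all assumptions made for illustration. The only point it shows is that tokens identified as entities are masked more aggressively than other tokens.

```python
import random

MASK = "[MASK]"

def entity_adaptive_mask(tokens, entity_tokens, entity_rate=0.5, other_rate=0.1, seed=0):
    """Mask tokens, using a higher masking rate for entity tokens.

    All names and default rates here are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    masked, labels = [], []
    for tok in tokens:
        rate = entity_rate if tok.lower() in entity_tokens else other_rate
        if rng.random() < rate:
            masked.append(MASK)
            labels.append(tok)   # target the model would be trained to recover
        else:
            masked.append(tok)
            labels.append(None)  # position that a masked-LM loss would ignore
    return masked, labels

# Example usage with a hand-written entity set (in the paper, entities come
# from ontology information, NER, spaCy, or flair).
tokens = "i need a cheap hotel in the north".split()
masked, labels = entity_adaptive_mask(tokens, {"cheap", "hotel", "north"})
```

Setting `entity_rate=1.0` and `other_rate=0.0` masks exactly the entity tokens, which is a convenient way to check the behaviour.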

    Domain-Slot Relationship Modeling Using a Pre-Trained Language Encoder for Multi-Domain Dialogue State Tracking

    Dialogue state tracking for multi-domain dialogues is challenging because the model should be able to track dialogue states across multiple domains and slots. As using pre-trained language models is the de facto standard for natural language processing tasks, many recent studies use them to encode the dialogue context for predicting the dialogue states. Model architectures that have certain inductive biases for modeling the relationship among different domain-slot pairs are also emerging. Our work is based on these research approaches to multi-domain dialogue state tracking. We propose a model architecture that effectively models the relationship among domain-slot pairs using a pre-trained language encoder. Inspired by the way the special [CLS] token in BERT is used to aggregate the information of the whole sequence, we use multiple special tokens for each domain-slot pair that encode information corresponding to its domain and slot. The special tokens are run together with the dialogue context through the pre-trained language encoder, which effectively models the relationship among different domain-slot pairs. Our experimental results on the MultiWOZ-2.0 and MultiWOZ-2.1 datasets show that our model outperforms other models with the same setting. Our ablation studies comprise three main parts. The first shows the effectiveness of our approach in exploiting relationship modeling. The second compares the effect of using different pre-trained language encoders. The third compares different initialization methods that could be used for the special tokens. Qualitative analysis of the attention map of the pre-trained language encoder shows that our special tokens encode relevant information through the encoding process by attending to each other.
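As a rough illustration of the input construction described in this abstract, the sketch below places one special token per domain-slot pair in front of the dialogue context, so that a self-attention encoder could relate the slot tokens to each other and to the dialogue. The token naming scheme (`[hotel-area]`), the placement before the context, the use of a single token per pair, and the function name `build_input` are assumptions for illustration, not details confirmed by the abstract.

```python
def build_input(domain_slots, dialogue_tokens):
    """Build an encoder input with one special token per domain-slot pair.

    Returns the token sequence and the position of each pair's special token,
    which is where a per-slot representation would be read from the encoder.
    """
    slot_tokens = [f"[{domain}-{slot}]" for domain, slot in domain_slots]
    sequence = ["[CLS]"] + slot_tokens + ["[SEP]"] + list(dialogue_tokens) + ["[SEP]"]
    # Offset by 1 because position 0 holds [CLS].
    positions = {pair: 1 + i for i, pair in enumerate(domain_slots)}
    return sequence, positions

# Example usage with two hypothetical domain-slot pairs.
pairs = [("hotel", "area"), ("hotel", "price")]
sequence, positions = build_input(pairs, ["i", "need", "a", "hotel"])
```

Because every slot token shares the sequence with the others, the encoder's attention layers can model dependencies among domain-slot pairs without a separate relation module.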

    Deep Transfer Learning-Based Fault Diagnosis Using Wavelet Transform for Limited Data

    Although various deep learning techniques have been proposed to diagnose industrial faults, it is still challenging to obtain sufficient training samples to build the fault diagnosis model in practice. This paper presents a framework that combines wavelet transformation and transfer learning (TL) for fault diagnosis with limited target samples. The wavelet transform converts a time-series sample to a time-frequency representative image based on the extracted hidden time and frequency features of various faults. The TL technique, in turn, leverages an existing neural network, GoogLeNet, which was trained using a sufficient source data set for different target tasks. Since the data distributions of the source and target domains are considerably different in industrial practice, we partially retrain the pre-trained model of the source domain using intermediate samples that are conceptually related to the target domain. We use a reciprocating pump model to generate various combinations of faults with different severity levels and evaluate the effectiveness of the proposed method. The results show that the proposed method provides higher diagnostic accuracy than the support vector machine and the convolutional neural network under wide variations in the training data size and the fault severity. In particular, we show that the severity level of the fault condition heavily affects the diagnostic performance.
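The step of turning a time series into a time-frequency image can be sketched with a small continuous wavelet transform. The abstract does not name the wavelet, so the Morlet wavelet, the parameter `w0`, the truncation at four scale widths, and the function name `morlet_scalogram` are assumptions. The sketch uses only the standard library for clarity; a numerical library would normally be used for realistic signal lengths.

```python
import cmath
import math

def morlet_scalogram(signal, scales, w0=6.0):
    """Return |CWT| as a list of rows, one row per scale, each as long as the signal."""
    n = len(signal)
    image = []
    for s in scales:
        half = int(4 * s)  # truncate the Gaussian envelope at +/- 4 scale widths
        offsets = range(-half, half + 1)
        taps = [
            cmath.exp(1j * w0 * k / s) * math.exp(-0.5 * (k / s) ** 2) / math.sqrt(s)
            for k in offsets
        ]
        row = []
        for t in range(n):
            acc = 0j
            for k, tap in zip(offsets, taps):
                idx = t - k
                if 0 <= idx < n:  # treat samples outside the signal as zero
                    acc += signal[idx] * tap
            row.append(abs(acc))
        image.append(row)
    return image

# Example usage: a pure tone at 0.75 rad/sample, analysed at four scales.
signal = [math.sin(0.75 * i) for i in range(512)]
image = morlet_scalogram(signal, scales=[2, 4, 8, 16])
```

Each row of `image` corresponds to one scale, so the nested list can be rendered as a 2-D picture and passed to an image classifier; with `w0 = 6`, a scale `s` responds most strongly to components near `6 / s` rad/sample.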

    Tunable Berry curvature and transport crossover in topological Dirac semimetal KZnBi

    © 2021, The Author(s). Topological Dirac semimetals have emerged as a platform to engineer Berry curvature with time-reversal symmetry breaking, which allows access to diverse quantum states in a single material system. It is of interest to realize such diversity in Dirac semimetals, as it provides insight into the correlation between Berry curvature and quantum transport phenomena. Here, we report the transition between anomalous Hall and chiral fermion states in the three-dimensional topological Dirac semimetal KZnBi, which is demonstrated by tuning the direction and flux of Berry curvature. Angle-dependent magneto-transport measurements show that both anomalous Hall resistance and positive magnetoresistance are maximized at 0° between net Berry curvature and rotational axis. We find that an unexpected crossover of anomalous Hall resistance and negative magnetoresistance suddenly occurs when the angle reaches ~70°, indicating that Berry curvature strongly correlates with quantum transport of Dirac and chiral fermions. It would be interesting to tune Berry curvature within other quantum phases such as topological superconductivity.

    Coexistence of surface superconducting and three-dimensional topological Dirac states in semimetal KZnBi

    We report the discovery of a new three-dimensional (3D) topological Dirac semimetal (TDS) material, KZnBi, coexisting with a naturally formed superconducting state on the surface under ambient pressure. Using photoemission spectroscopy together with first-principles calculations, a 3D Dirac state with linear band dispersion is identified. The characteristic features of massless Dirac fermions are also confirmed by magnetotransport measurements, exhibiting an extremely small cyclotron mass of m* = 0.012 m₀ and a high Fermi velocity of v_F = 1.04×10⁶ m/s. Interestingly, superconductivity occurs below 0.85 K on the (001) surface, while the bulk remains nonsuperconducting. The captured linear temperature dependence of the upper critical field suggests a possible non-s-wave character of this surface superconductivity. Our discovery serves as a distinctive platform to study the interplay between the 3D TDS state and superconductivity.