DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances
Recent advances in pre-trained language models have significantly improved
neural response generation. However, existing methods usually view the dialogue
context as a linear sequence of tokens and learn to generate the next word
through token-level self-attention. Such token-level encoding hinders the
exploration of discourse-level coherence among utterances. This paper presents
DialogBERT, a novel conversational response generation model that enhances
previous PLM-based dialogue models. DialogBERT employs a hierarchical
Transformer architecture. To efficiently capture the discourse-level coherence
among utterances, we propose two training objectives, including masked
utterance regression and distributed utterance order ranking in analogy to the
original BERT training. Experiments on three multi-turn conversation datasets
show that our approach remarkably outperforms the baselines, such as BART and
DialoGPT, in terms of quantitative evaluation. The human evaluation suggests
that DialogBERT generates more coherent, informative, and human-like responses
than the baselines by significant margins.

Comment: Published as a conference paper at AAAI 2021
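The masked utterance regression objective described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the utterance encoder is stood in for by mean pooling, and the discourse-level context encoder by uniform attention; the point is only the shape of the objective, i.e., mask a whole utterance vector, re-encode the dialogue, and regress the original vector from its contextual representation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_utterance(token_embeddings):
    """Utterance-encoder stand-in: mean-pool token embeddings into one vector."""
    return token_embeddings.mean(axis=0)

def masked_utterance_regression_loss(utt_vectors, masked_idx, context_encoder):
    """Mask one utterance vector, re-encode the dialogue, and regress the
    masked vector from its contextual representation (MSE loss)."""
    target = utt_vectors[masked_idx].copy()
    corrupted = utt_vectors.copy()
    corrupted[masked_idx] = 0.0                      # "[MASK]" the whole utterance
    contextual = context_encoder(corrupted)          # discourse-level encoding
    predicted = contextual[masked_idx]
    return float(np.mean((predicted - target) ** 2))

# Toy dialogue: 4 utterances, each a (num_tokens, dim) embedding matrix.
dim = 8
dialogue = [rng.normal(size=(int(rng.integers(3, 7)), dim)) for _ in range(4)]
utt_vectors = np.stack([encode_utterance(u) for u in dialogue])  # (4, dim)

# Context-encoder stand-in: each position attends uniformly to all positions.
def uniform_context_encoder(vectors):
    attn = np.full((len(vectors), len(vectors)), 1.0 / len(vectors))
    return attn @ vectors

loss = masked_utterance_regression_loss(utt_vectors, masked_idx=2,
                                        context_encoder=uniform_context_encoder)
print(f"masked utterance regression loss: {loss:.4f}")
```

In the real model both encoders are Transformers and the loss is combined with utterance order ranking and next-utterance generation; the sketch only fixes the data flow of the regression term.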
Continuous Decomposition of Granularity for Neural Paraphrase Generation
While Transformers have had significant success in paragraph generation, they
treat sentences as linear sequences of tokens and often neglect their
hierarchical information. Prior work has shown that decomposing the levels of
granularity (e.g., word, phrase, or sentence) for input tokens has produced
substantial improvements, suggesting the possibility of enhancing Transformers
via more fine-grained modeling of granularity. In this work, we propose a
continuous decomposition of granularity for neural paraphrase generation
(C-DNPG). In order to efficiently incorporate granularity into sentence
encoding, C-DNPG introduces a granularity-aware attention (GA-Attention)
mechanism which extends the multi-head self-attention with: 1) a granularity
head that automatically infers the hierarchical structure of a sentence by
neurally estimating the granularity level of each input token; and 2) two novel
attention masks, namely, granularity resonance and granularity scope, to
efficiently encode granularity into attention. Experiments on two benchmarks,
including Quora question pairs and Twitter URLs, have shown that C-DNPG
outperforms baseline models by a remarkable margin and achieves
state-of-the-art results in terms of many metrics. Qualitative analysis reveals
that C-DNPG indeed captures fine-grained levels of granularity with
effectiveness.

Comment: Accepted to be published in COLING 2022
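The granularity head and mask idea can be sketched as follows. This is an illustrative stand-in, not the paper's exact resonance/scope formulas: a sigmoid head assigns each token a granularity level in (0, 1), and a resonance-style mask boosts attention between tokens whose levels agree.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def granularity_head(x, w):
    """Granularity-head stand-in: map each token vector to a scalar level in
    (0, 1); values near 1 suggest fine granularity (word level), values near 0
    suggest coarse granularity (phrase/sentence level)."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))            # sigmoid, shape (n,)

def ga_attention(x, w_g):
    """Single-head attention whose scores are modulated by granularity
    similarity (an illustrative analogue of the resonance mask)."""
    n, d = x.shape
    g = granularity_head(x, w_g)                     # per-token granularity
    scores = (x @ x.T) / np.sqrt(d)                  # vanilla attention scores
    # Tokens at similar granularity attend more to each other: weight is 1
    # when levels match and fades toward 0 as they diverge.
    resonance = 1.0 - np.abs(g[:, None] - g[None, :])
    probs = softmax(scores + np.log(resonance + 1e-9))
    return probs @ x, g, probs

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                          # 5 tokens, dim 8
out, g, probs = ga_attention(x, rng.normal(size=8))
print(g.round(2), probs.sum(axis=-1))
```

The actual GA-Attention uses two learned masks (resonance and scope) inside multi-head attention; the sketch only shows how a neurally estimated granularity level can reshape attention weights.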
Estimation of utility weights for human papilloma virus-related health states according to disease severity
RF-sputtered HfO2 Gate Insulator in High-Performance AlGaN/GaN MOS-HEMTs
We have proposed and fabricated AlGaN/GaN metal-oxide-semiconductor high-electron-mobility transistors (MOS-HEMTs) on a Si substrate employing an RF-sputtered HfO2 gate insulator for a high breakdown voltage. The HfO2 sputtering conditions, such as sputtering power and working pressure, were optimized to improve the reverse blocking characteristics. We obtained a high breakdown voltage of 1524 V, a low drain leakage current of 67 pA/mm at VDS = 100 V and VGS = -10 V, and an on/off current ratio of 2.37×10^10 at a sputtering power of 50 W and a working pressure of 3 mTorr. In addition, we discuss the mechanism of the breakdown voltage improvement and investigate the HfO2/GaN interface in the proposed devices by measuring the leakage current, capacitance-voltage characteristics, and X-ray diffraction (XRD).
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Recent studies have proposed unified user modeling frameworks that leverage
user behavior data from various applications. Many of them benefit from
utilizing users' behavior sequences as plain text, representing rich
information from any domain or system without losing generality. Hence, a
question arises: can language modeling on user history corpora help improve
recommender systems? While its versatile usability has been widely investigated
in many domains, its application to recommender systems remains
underexplored. We show that language modeling applied directly to task-specific
user histories achieves excellent results on diverse recommendation tasks.
Also, leveraging additional task-agnostic user histories delivers significant
performance benefits. We further demonstrate that our approach can provide
promising transfer learning capabilities for a broad spectrum of real-world
recommender systems, even on unseen domains and services.

Comment: 14 pages, 5 figures, 9 tables
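The idea of serializing user behavior as plain text and letting a language model predict the next event can be sketched with a toy stand-in. The event strings and the bigram "language model" below are hypothetical, chosen only to show the framing: any domain's interactions flatten into one token vocabulary, and next-token prediction over that vocabulary is the recommendation.

```python
from collections import Counter, defaultdict

# Each user's history is serialized as plain text, one token per interaction,
# so search, video, and shopping events all share one format (tokens invented
# for illustration).
histories = [
    "search:shoes click:sneaker_ad buy:running_shoes",
    "search:shoes click:sneaker_ad buy:trail_shoes",
    "watch:cooking_video search:knives buy:chef_knife",
]

# Language-model stand-in: bigram counts over the serialized corpus.
bigrams = defaultdict(Counter)
for h in histories:
    tokens = h.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1

def recommend(last_event, k=2):
    """Next-event prediction over the behavior vocabulary = recommendation."""
    return [tok for tok, _ in bigrams[last_event].most_common(k)]

print(recommend("click:sneaker_ad"))   # most frequent follow-up events
```

A real system would replace the bigram table with a pre-trained Transformer fine-tuned on the history corpus, which is what lets task-agnostic histories from other services transfer to unseen domains.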