SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER
Named Entity Recognition is the task of locating and classifying the entities in
the text. However, the Unlabeled Entity Problem in NER datasets seriously hinders
the improvement of NER performance. This paper proposes SCL-RAI to cope with
this problem. Firstly, we decrease the distance of span representations with
the same label while increasing it for different ones via span-based
contrastive learning, which relieves the ambiguity among entities and improves
the robustness of the model over unlabeled entities. Then we propose retrieval
augmented inference to mitigate the decision boundary shifting problem. Our
method significantly outperforms the previous SOTA method by 4.21% and 8.64%
F1-score on two real-world datasets. Comment: COLING 202
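The span-based contrastive objective described in this abstract can be illustrated with a generic supervised contrastive loss. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the cosine similarity measure, and the `temperature` value are all hypothetical choices, and the retrieval augmented inference step is not covered.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def span_contrastive_loss(spans, labels, temperature=0.1):
    """Generic supervised contrastive loss over span vectors (assumed form).

    Spans sharing a label are pulled together and spans with different
    labels are pushed apart. Anchors without any positive are skipped.
    """
    total, count = 0.0, 0
    n = len(spans)
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Scaled similarities between the anchor and every other span.
        logits = {
            j: cosine(spans[i], spans[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability of the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0
```

With toy vectors, the loss is lower when spans with the same label already lie close together than when the labels cut across the clusters, which is the behaviour the abstract attributes to the contrastive step.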
SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition
Distantly-Supervised Named Entity Recognition effectively alleviates the
burden of time-consuming and expensive annotation in the supervised setting.
But the context-free matching process and the limited coverage of knowledge
bases introduce inaccurate and incomplete annotation noise respectively.
Previous studies either considered only incomplete annotation noise or
indiscriminately handled both types of noise with the same strategy. In this
paper, we argue that the different causes of the two types of noise call for
different strategies in the model architecture. Therefore, we
propose SANTA to handle these two types of noise separately with (1)
Memory-smoothed Focal Loss and Entity-aware KNN to relieve the entity ambiguity
problem caused by inaccurate annotation, and (2) Boundary Mixup to alleviate
the decision boundary shifting problem caused by incomplete annotation and a
noise-tolerant loss to improve the robustness. Benefiting from our separate
tailored strategies, we confirm in the experiment that the two types of noise
are well mitigated. SANTA also achieves a new state-of-the-art on five public
datasets. Comment: Findings of ACL202
Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates
the burden of annotation, but suffers from label noise. Recent
works attempt to adopt the teacher-student framework to gradually refine the
training labels and improve the overall robustness. However, we argue that
these teacher-student methods achieve limited performance because poor network
calibration produces incorrectly pseudo-labeled samples, leading to error
propagation. Therefore, we attempt to mitigate this issue by proposing: (1)
Uncertainty-aware Teacher Learning that leverages the prediction uncertainty to
guide the selection of pseudo-labels, reducing the number of incorrect
pseudo-labels in the self-training stage. (2) Student-student Collaborative
Learning that allows the transfer of reliable labels between two student
networks instead of relying entirely on the pseudo-labels from their teacher.
Meanwhile, this approach allows a full exploration of mislabeled samples rather
than simply filtering unreliable pseudo-labeled samples. Extensive experimental
results on five DS-NER datasets demonstrate that our method is superior to
state-of-the-art teacher-student methods.
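The idea of using prediction uncertainty to guide pseudo-label selection can be sketched as follows. This is an illustration under assumptions rather than the paper's method: uncertainty is estimated here as the entropy of the mean distribution over several stochastic forward passes, and the function names and the `max_entropy` threshold are hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Return the mean class distribution and its entropy.

    prob_samples holds one probability vector per stochastic forward pass
    of the teacher for a single token (an assumed uncertainty estimate).
    """
    n = len(prob_samples)
    num_classes = len(prob_samples[0])
    mean = [sum(p[c] for p in prob_samples) / n for c in range(num_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy


def select_pseudo_labels(token_samples, max_entropy=0.5):
    """Keep a pseudo-label only where the teacher's entropy is low enough.

    Returns a dict mapping token index to the selected class index.
    """
    selected = {}
    for idx, samples in enumerate(token_samples):
        mean, entropy = predictive_entropy(samples)
        if entropy <= max_entropy:
            selected[idx] = max(range(len(mean)), key=mean.__getitem__)
    return selected
```

A token whose passes agree keeps its label, while a token whose passes disagree is left out. The abstract additionally describes exchanging reliable labels between two student networks, which this sketch does not model.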
UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt
Recent research has shown that multi-task pre-training greatly improves the
model's robustness and transfer ability, which is crucial for building a
high-quality dialog system. However, most previous works on multi-task
pre-training rely heavily on human-defined input formats or prompts, which are not
optimal in quality or quantity. In this work, we propose to use Task-based
Automatic Prompt generation (TAP) to automatically generate high-quality
prompts. Using the high-quality prompts generated, we scale the corpus of the
pre-trained conversation model to 122 datasets from 15 dialog-related tasks,
resulting in Universal Pre-trained Conversation Model (UniPCM), a powerful
foundation model for various conversational tasks and different dialog systems.
Extensive experiments have shown that UniPCM is robust to input prompts and
capable of various dialog-related tasks. Moreover, UniPCM has strong transfer
ability and excels at low resource scenarios, achieving SOTA results on 9
different datasets ranging from task-oriented dialog to open-domain
conversation. Furthermore, we are amazed to find that TAP can generate prompts
on par with those collected with crowdsourcing. The code is released with the
paper.
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
Task-oriented dialogue (TOD) models have made significant progress in recent
years. However, previous studies primarily focus on datasets written by
annotators, which has resulted in a gap between academic research and
real-world spoken conversation scenarios. While several small-scale spoken TOD
datasets have been proposed to address robustness issues such as ASR errors, they
ignore the unique challenges in spoken conversation. To tackle these limitations,
we introduce SpokenWOZ, a large-scale speech-text dataset for spoken TOD,
containing 8 domains, 203k turns, 5.7k dialogues and 249 hours of audio from
human-to-human spoken conversations. SpokenWOZ further incorporates common
spoken characteristics such as word-by-word processing and reasoning in spoken
language. Based on these characteristics, we present cross-turn slot and
reasoning slot detection as new challenges. We conduct experiments on various
baselines, including text-modal models, newly proposed dual-modal models, and
LLMs, e.g., ChatGPT. The results show that the current models still have
substantial room for improvement in spoken conversation, where the most
advanced dialogue state tracker only achieves 25.65% in joint goal accuracy and
the SOTA end-to-end model only correctly completes the user request in 52.1% of
dialogues. The dataset, code, and leaderboard are available:
https://spokenwoz.github.io/SpokenWOZ-github.io/
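For readers unfamiliar with the joint goal accuracy figure quoted in this abstract, the metric is commonly computed as the fraction of turns whose predicted dialogue state matches the gold state exactly. The sketch below assumes states are represented as slot-to-value dicts; the function name and the example slots are hypothetical, and the benchmark's own evaluation script may normalise values differently.

```python
def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state equals the gold state exactly.

    Each state is a dict mapping slot names to values (assumed format).
    A single wrong, missing, or extra slot makes the whole turn incorrect.
    """
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold_states:
        return 0.0
    correct = sum(
        1 for pred, gold in zip(predicted_states, gold_states) if pred == gold
    )
    return correct / len(gold_states)


# Toy example with made-up slots: the second turn has one wrong value.
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
pred = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
]
score = joint_goal_accuracy(pred, gold)
```

Because every slot in a turn must be right for the turn to count, the metric is strict, which is one reason reported scores can be low even for strong trackers.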
CEPC Conceptual Design Report: Volume 2 - Physics & Detector
The Circular Electron Positron Collider (CEPC) is a large international scientific facility proposed by the Chinese particle physics community to explore the Higgs boson and provide critical tests of the underlying fundamental physics principles of the Standard Model that might reveal new physics. The CEPC, to be hosted in China in a circular underground tunnel of approximately 100 km in circumference, is designed to operate as a Higgs factory producing electron-positron collisions with a center-of-mass energy of 240 GeV. The collider will also operate at around 91.2 GeV, as a Z factory, and at the WW production threshold (around 160 GeV). The CEPC will produce close to one trillion Z bosons, 100 million W bosons and over one million Higgs bosons. The vast number of bottom quarks, charm quarks and tau-leptons produced in the decays of the Z bosons also makes the CEPC an effective B-factory and tau-charm factory. The CEPC will have two interaction points where two large detectors will be located. This document is the second volume of the CEPC Conceptual Design Report (CDR). It presents the physics case for the CEPC, describes conceptual designs of possible detectors and their technological options, highlights the expected detector and physics performance, and discusses future plans for detector R&D and physics investigations. The final CEPC detectors will be proposed and built by international collaborations but they are likely to be composed of the detector technologies included in the conceptual designs described in this document. A separate volume, Volume I, recently released, describes the design of the CEPC accelerator complex, its associated civil engineering, and strategic alternative scenarios.