SU(3) trimer resonating-valence-bond state on the square lattice
We propose and study an SU(3) trimer resonating-valence-bond (tRVB) state
with point-group symmetry on the square lattice. By devising a
projected entangled-pair state representation, we show that all (connected)
correlation functions between local operators in this SU(3) tRVB state decay
exponentially, indicating its gapped nature. We further calculate the modular
$S$ and $T$ matrices by constructing all nine topological sectors on a torus
and establish the existence of $\mathbb{Z}_3$ topological order in this SU(3)
tRVB state. Comment: 6 pages, 6 figures
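For context, nine topological sectors on the torus is the anyon count of $\mathbb{Z}_3$ topological order (the quantum double of $\mathbb{Z}_3$). The LaTeX snippet below records the standard modular data for that phase as a reference sketch only; it uses textbook conventions and is not the matrices computed in the paper, so signs and basis choices may differ from the authors'.

```latex
% Standard modular data of Z_3 topological order (quantum double of Z_3).
% Reference sketch only; conventions (signs, basis) may differ from the paper.
% Anyons are labeled (e, m) with e, m in {0, 1, 2}, giving the nine sectors.
\[
  \omega = e^{2\pi i/3}, \qquad
  S_{(e,m),(e',m')} = \tfrac{1}{3}\,\omega^{\,e m' + e' m}, \qquad
  T_{(e,m),(e',m')} = \delta_{e e'}\,\delta_{m m'}\,\omega^{\,e m}.
\]
```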
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
Automatic International Classification of Diseases (ICD) coding aims to
assign multiple ICD codes to a medical note with an average of 3,000+ tokens.
This task is challenging due to the high-dimensional space of multi-label
assignment (155,000+ ICD code candidates) and the long-tail challenge: many
ICD codes are infrequently assigned, yet these infrequent codes are clinically
important. This study addresses the long-tail challenge by transforming this
multi-label classification task into an autoregressive generation task.
Specifically, we first introduce a novel pretraining objective to generate
free-text diagnoses and procedures following the SOAP structure, the medical
logic physicians use for note documentation. Second, instead of predicting
directly in the high-dimensional space of ICD codes, our model generates
lower-dimensional text descriptions, from which ICD codes are then inferred.
Third, we design
a novel prompt template for multi-label classification. We evaluate our
Generation with Prompt model on the full code assignment benchmark
(MIMIC-III-full) and the few-shot ICD code assignment benchmark
(MIMIC-III-few). Experiments on MIMIC-III-few show that our model achieves a
macro F1 of 30.2, substantially outperforming the previous MIMIC-III-full
SOTA model (macro F1 4.3) and the model specifically designed for the
few/zero-shot setting (macro F1 18.7). Finally, we design a novel ensemble
learner, a cross
attention reranker with prompts, to integrate previous SOTA and our best
few-shot coding predictions. Experiments on MIMIC-III-full show that our
ensemble learner substantially improves both macro and micro F1, from 10.4 to
14.6 and from 58.2 to 59.1, respectively. Comment: To appear in AAAI202
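As a rough illustration of the "generate descriptions, then infer codes" step described above, the sketch below maps generated free-text descriptions to their closest ICD code descriptions. This is not the authors' code: the generator stub, the tiny ICD table, and the similarity threshold are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of the "generate text, then infer codes"
# idea: the model emits free-text diagnosis/procedure descriptions, and each
# description is mapped to the closest ICD code by string similarity.
from difflib import SequenceMatcher

# A toy slice of an ICD code -> description table (real tables have 155,000+ entries).
ICD_DESCRIPTIONS = {
    "I10":   "essential (primary) hypertension",
    "E11.9": "type 2 diabetes mellitus without complications",
    "J18.9": "pneumonia, unspecified organism",
}

def generate_descriptions(note: str) -> list[str]:
    """Stand-in for the autoregressive generator (a prompted LM in the paper's setting)."""
    # There this would be model.generate(...) conditioned on the note and a prompt
    # template; here we return fixed text so the sketch runs without any model.
    return ["type 2 diabetes without complication", "pneumonia, unspecified"]

def infer_codes(descriptions: list[str], threshold: float = 0.5) -> list[str]:
    """Map each generated description to its most similar ICD description."""
    codes = []
    for text in descriptions:
        best_code, best_score = None, 0.0
        for code, desc in ICD_DESCRIPTIONS.items():
            score = SequenceMatcher(None, text.lower(), desc.lower()).ratio()
            if score > best_score:
                best_code, best_score = code, score
        if best_code is not None and best_score >= threshold:
            codes.append(best_code)
    return sorted(set(codes))

if __name__ == "__main__":
    note = "72 y/o with poorly controlled T2DM admitted for cough and fever..."
    print(infer_codes(generate_descriptions(note)))  # ['E11.9', 'J18.9']
```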
SELF-EXPLAIN: Teaching Large Language Models to Reason Complex Questions by Themselves
Large language models (LLMs) can generate intermediate reasoning steps. To
elicit reliable reasoning, the common practice is to employ few-shot
chain-of-thought (CoT) prompting, where several in-context demonstrations of
reasoning are prepended to the question. However, such chain-of-thought
examples are expensive to craft, especially for professional domains, and can
have high variance depending on human annotators. Therefore, this work
investigates whether LLMs can teach themselves to reason without human-crafted
demonstrations. We propose SELF-EXPLAIN, which has the LLM generate its own
CoT examples, inspired by "encoding specificity" in human memory retrieval. We
find that using self-explanations makes LLMs more confident, better calibrated,
and less biased
when answering complex questions. Moreover, we find prompting with
self-explanations can even significantly outperform using human-crafted CoTs on
several complex question answering datasets. Comment: Workshop on robustness
of zero/few-shot learning in foundation models @ NeurIPS 202
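The self-explanation idea can be sketched as follows. This is a minimal, API-agnostic illustration rather than the authors' implementation; the `llm` callable, the prompt wording, and the toy questions are assumptions.

```python
# Minimal sketch (not the authors' code) of self-generated CoT demonstrations:
# the model first writes its own explanation for each demo question, and those
# self-explanations are then prepended as in-context examples for a new question.
# `llm` stands for any text-completion function (prompt -> str).
from typing import Callable

def build_self_explained_demos(llm: Callable[[str], str],
                               demo_questions: list[str]) -> str:
    """Ask the model to explain and answer each demo question by itself."""
    demos = []
    for q in demo_questions:
        explanation = llm(
            f"Question: {q}\n"
            "Explain, step by step, how to answer this question, then state the answer."
        )
        demos.append(f"Question: {q}\n{explanation}")
    return "\n\n".join(demos)

def answer_with_self_explanations(llm: Callable[[str], str],
                                  demo_questions: list[str],
                                  new_question: str) -> str:
    """Prepend the self-generated demonstrations, then ask the new question."""
    prompt = (
        build_self_explained_demos(llm, demo_questions)
        + f"\n\nQuestion: {new_question}\nLet's think step by step."
    )
    return llm(prompt)

if __name__ == "__main__":
    # A trivial stand-in LLM so the sketch runs without any API.
    def fake_llm(prompt: str) -> str:
        return "Step 1: ... Step 2: ... Answer: 42"

    print(answer_with_self_explanations(fake_llm,
                                        ["What is 6 times 7?"],
                                        "What is 21 plus 21?"))
```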