37 research outputs found

Test-time Augmentation for Factual Probing

Authors: Heinzerling Benjamin; Inui Kentaro; Kamoda Go; Sakaguchi Keisuke
Publication date: 25/10/2023

Factual probing is a method that uses prompts to test if a language model "knows" certain world knowledge facts. A problem in factual probing is that small changes to the prompt can lead to large changes in model output. Previous work aimed to alleviate this problem by optimizing prompts via text mining or fine-tuning. However, such approaches are relation-specific and do not generalize to unseen relation types. Here, we propose to use test-time augmentation (TTA) as a relation-agnostic method for reducing sensitivity to prompt variations by automatically augmenting and ensembling prompts at test time. Experiments show improved model calibration, i.e., with TTA, model confidence better reflects prediction accuracy. Improvements in prediction accuracy are observed for some models, but for other models, TTA leads to degradation. Error analysis identifies the difficulty of producing high-quality prompt variations as the main challenge for TTA.
Comment: 12 pages, 4 figures, accepted to EMNLP 2023 Findings (short paper

arXiv.org e-Print Archive
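
The idea of ensembling prompt variations at test time, as described in the abstract above, can be sketched as follows. This is a minimal illustration and not the authors' implementation: `toy_predict`, the `TOY_OUTPUTS` table, the paraphrased prompts, and the choice of averaging probabilities are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical stand-in for a language model: maps each prompt to
# answer probabilities. The numbers are invented for illustration.
TOY_OUTPUTS: Dict[str, Dict[str, float]] = {
    "The capital of France is": {"Paris": 0.6, "Lyon": 0.4},
    "France's capital city is": {"Paris": 0.3, "Lyon": 0.7},
    "The capital city of France is": {"Paris": 0.9, "Lyon": 0.1},
}


def toy_predict(prompt: str) -> Dict[str, float]:
    """Return the (made-up) answer distribution for a prompt."""
    return TOY_OUTPUTS[prompt]


def ensemble(prompts: List[str],
             predict: Callable[[str], Dict[str, float]]) -> Dict[str, float]:
    """Average the answer probabilities over all prompt variations."""
    totals: Dict[str, float] = defaultdict(float)
    for prompt in prompts:
        for answer, prob in predict(prompt).items():
            totals[answer] += prob
    return {answer: total / len(prompts) for answer, total in totals.items()}


prompts = list(TOY_OUTPUTS)          # original prompt plus two paraphrases
scores = ensemble(prompts, toy_predict)
best = max(scores, key=scores.get)   # answer with the highest mean probability
print(best, round(scores[best], 2))
```

In this toy setup one paraphrase on its own would favor the wrong answer, while the averaged distribution does not, which is the kind of reduced prompt sensitivity the abstract describes.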

Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction

Authors: Coyne Steven; Galvan-Sosa Diana; Inui Kentaro; Sakaguchi Keisuke; Zock Michael
Publication date: 30/05/2023

GPT-3 and GPT-4 models are powerful, achieving high performance on a variety of Natural Language Processing tasks. However, there is a relative lack of detailed published analysis of their performance on the task of grammatical error correction (GEC). To address this, we perform experiments testing the capabilities of a GPT-3.5 model (text-davinci-003) and a GPT-4 model (gpt-4-0314) on major GEC benchmarks. We compare the performance of different prompts in both zero-shot and few-shot settings, analyzing intriguing or problematic outputs encountered with different prompt formats. We report the performance of our best prompt on the BEA-2019 and JFLEG datasets, finding that the GPT models can perform well in a sentence-level revision setting, with GPT-4 achieving a new high score on the JFLEG benchmark. Through human evaluation experiments, we compare the GPT models' corrections to source, human reference, and baseline GEC system sentences and observe differences in editing strategies and how they are scored by human raters

arXiv.org e-Print Archive

Empirical Investigation of Neural Symbolic Reasoning Strategies

Authors: Aoki Yoichi; Brassard Ana; Inui Kentaro; Kudo Keito; Kuribayashi Tatsuki; Sakaguchi Keisuke; Yoshikawa Masashi
Publication date: 16/02/2023

Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement is yet unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1, B=3, C=A+3, C?), we found that the choice of reasoning strategies significantly affects the performance, with the gap becoming even larger as the extrapolation length becomes longer. Surprisingly, we also found that certain configurations lead to nearly perfect performance, even in the case of length extrapolation. Our results indicate the importance of further exploring effective strategies for neural reasoning models.
Comment: This paper is accepted as the findings at EACL 2023, and the earlier version (non-archival) of this work got the Best Paper Award in the Student Research Workshop of AACL 202

arXiv.org e-Print Archive
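
To make the example instance from the abstract above concrete (A=1, B=3, C=A+3, C?), here is a minimal sketch of a symbolic solver that records the intermediate steps it resolves. It is only an illustration of the task format: the `solve` function, its parsing rules (comma-separated equations, addition of non-negative integers and variables only), and the step format are assumptions, not the paper's dataset generator or any of its reasoning strategies.

```python
from typing import Dict, List, Tuple


def solve(problem: str) -> Tuple[int, List[str]]:
    """Answer a query such as 'A=1, B=3, C=A+3, C?' and list the steps used."""
    *equations, query = [part.strip() for part in problem.split(",")]
    target = query.rstrip("?")
    # Map each variable to the right-hand side of its defining equation.
    defs: Dict[str, str] = dict(eq.split("=") for eq in equations)
    steps: List[str] = []

    def value(name: str) -> int:
        total = 0
        for term in defs[name].split("+"):
            term = term.strip()
            # A term is either an integer literal or another variable.
            total += int(term) if term.isdigit() else value(term)
        steps.append(f"{name}={total}")
        return total

    return value(target), steps


answer, steps = solve("A=1, B=3, C=A+3, C?")
print(answer, steps)
```

Only the variables the query depends on are resolved, so B never appears in the recorded steps for this instance.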

Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?

Authors: Aoki Yoichi; Brassard Ana; Inui Kentaro; Kudo Keito; Kuribayashi Tatsuki; Sakaguchi Keisuke; Yoshikawa Masashi
Publication date: 15/02/2023

Compositionality is a pivotal property of symbolic reasoning. However, how well recent neural models capture compositionality remains underexplored in the symbolic reasoning tasks. This study empirically addresses this question by systematically examining recently published pre-trained seq2seq models with a carefully controlled dataset of multi-hop arithmetic symbolic reasoning. We introduce a skill tree on compositionality in arithmetic symbolic reasoning that defines the hierarchical levels of complexity along with three compositionality dimensions: systematicity, productivity, and substitutivity. Our experiments revealed that among the three types of composition, the models struggled most with systematicity, performing poorly even with relatively simple compositions. That difficulty was not resolved even after training the models with intermediate reasoning steps.
Comment: accepted by EACL 202

arXiv.org e-Print Archive

RealTime QA: What's the Answer Right Now?

Authors: Asai Akari; Le Bras Ronan; Choi Yejin; Inui Kentaro; Kasai Jungo; Radev Dragomir; Sakaguchi Keisuke; Smith Noah A.; Takahashi Yoichi; Yu Xinyan
Publication date: 28/02/2024

We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applications. We build strong baseline models upon large pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing effort, and this paper presents real-time evaluation results over the past year. Our experimental results show that GPT-3 can often properly update its generation results, based on newly-retrieved documents, highlighting the importance of up-to-date information retrieval. Nonetheless, we find that GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer. This suggests an important avenue for future research: can an open-domain QA system identify such unanswerable cases and communicate with the user or even the retrieval module to modify the retrieval results? We hope that REALTIME QA will spur progress in instantaneous applications of question answering and beyond.
Comment: RealTime QA Website: https://realtimeqa.github.io

arXiv.org e-Print Archive

A comprehensive survey on quantum computer usage: How many qubits are employed for what purposes?

Authors: Fujii Keisuke; Hakoshima Hideaki; Ichikawa Tsubasa; Inui Koji; Ito Kosuke; Matsuda Ryo; Mitarai Kosuke; Miyamoto Koichi; Mizukami Wataru; Mizuta Kaoru; Mori Toshio; Nakano Yuichiro; Nakayama Akimoto; Okada Ken N.; Sugimoto Takanori; Takahira Souichi; Takemori Nayuta; Tsukano Satoyuki; Ueda Hiroshi; Watanabe Ryo; Yoshida Yuichiro
Publication date: 10/10/2023

Quantum computers (QCs), which work based on the law of quantum mechanics, are expected to be faster than classical computers in several computational tasks such as prime factoring and simulation of quantum many-body systems. In the last decade, research and development of QCs have rapidly advanced. Now hundreds of physical qubits are at our disposal, and one can find several remarkable experiments actually outperforming the classical computer in a specific computational task. On the other hand, it is unclear what the typical usages of the QCs are. Here we conduct an extensive survey on the papers that are posted in the quant-ph section in arXiv and claim to have used QCs in their abstracts. To understand the current situation of the research and development of the QCs, we evaluated the descriptive statistics about the papers, including the number of qubits employed, QPU vendors, application domains and so on. Our survey shows that the annual number of publications is increasing, and the typical number of qubits employed is about six to ten, growing along with the increase in the quantum volume (QV). Most of the preprints are devoted to applications such as quantum machine learning, condensed matter physics, and quantum chemistry, while quantum error correction and quantum noise mitigation use more qubits than the other topics. These imply that the increase in QV is fundamentally relevant, and more experiments for quantum error correction, and noise mitigation using shallow circuits with more qubits will take place.
Comment: 14 pages, 5 figures, figures regenerate

arXiv.org e-Print Archive

Successful Transcatheter Chemoembolization for Acute Jaundice in a Patient with Advanced Hepatocellular Carcinoma and Portal Vein Tumor Thrombosis: A Case Report

Authors: Hirokazu Komeichi; Keisuke Inui; Kyouichi Mizuno; Shuji Shimizu; Yasuhiro Takahashi; Yasumi Katsuta
Publication venue: Medical Association of Nippon Medical School
Publication date: 01/01/2009

Crossref

Type III Gustilo–Anderson open fracture does not justify routine prophylactic Gram-negative antibiotic coverage

Authors: Keisuke Ishii; Miyoshi Sakai; Takahiro Inui; Takashi Suzuki; Taketo Kurozumi; Yoshinobu Watanabe
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2023

Postoperative surgical site infection (SSI) is common in open long bone fractures, so early administration of prophylactic antibiotics is critical to prevent SSI. However, the necessity of initial broad-spectrum coverage for Gram-positive and -negative pathogens remains unclear. The purpose of this study was to clarify the effectiveness of prophylactic broad-spectrum antibiotics in a large, nationwide sample. We reviewed an open fracture database of prospectively collected data from 111 institutions managed by our society. A retrospective cohort study was designed to compare the rates of deep SSI between narrow- and broad-spectrum antibiotics, which were initiated within three hours after injury. A total of 1041 type III fractures were evaluated at three months after injury. Overall deep SSI rates did not differ significantly between the narrow-spectrum group (43/538, 8.0%) and broad-spectrum group (49/503, 9.8%) (p = 0.320). During propensity score-matched analysis, 425 pairs were analyzed. After matching, no significant difference in the SSI rate was seen between the narrow- and broad-spectrum groups, with 42 SSIs (9.9%) and 40 SSIs (9.4%), respectively (p = 0.816). The probability of deep SSI was not reduced by broad-spectrum antibiotics compared with narrow-spectrum antibiotics in type III open long bone fractures.

Directory of Open Access Journals
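
The unmatched comparison reported in the abstract above (43/538 versus 49/503 deep SSIs) can be checked with a short calculation. The abstract does not say which test produced p = 0.320, so this sketch assumes a Pearson chi-square test on the 2x2 table without continuity correction; the function name `chi_square_2x2` is hypothetical.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: narrow-spectrum, broad-spectrum. Columns: deep SSI, no deep SSI.
chi2, p = chi_square_2x2(43, 538 - 43, 49, 503 - 49)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under that assumption the computed p-value lands close to the reported 0.320; a different test, such as Fisher's exact test or a continuity-corrected chi-square, would give a somewhat different value.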

Closed Compression Nailing Using a New-Generation Intramedullary Nail without Autologous Bone Grafting for Humeral Shaft Nonunion

Authors: Atsuyuki Inui; Genta Fukumoto; Keisuke Oe; Ryosuke Kuroda; Takahiro Niikura; Tomoaki Fukui; Yutaka Mifune
Publication venue: Hindawi Limited
Publication date: 01/01/2021

Introduction. Although the recommended treatment for humeral shaft nonunion is compression plating with autologous bone grafting, we treated a case of humeral shaft nonunion with an intramedullary nail (IMN) without bone grafting. Presentation of Case. Osteosynthesis with IMN was performed on a 24-year-old man with a humeral shaft fracture at another hospital. However, bony union was not obtained 1 year after the first surgery, and he was referred to our institution. We treated the nonunion with exchange nailing without autologous bone grafting using compression function of the nail, leading to bony union at 7 months postoperatively. At the final follow-up 2 years and 4 months postoperatively, the patient had full range of motion in the left shoulder and elbow joints. Discussion. Compression plating with autologous bone grafting is reported to be the gold standard for the treatment of humeral shaft nonunion. IMN is advantageous for minimal invasion; however, the conventional type of IMN cannot apply compression force between fragments and does not have sufficient stability against rotational force. In this case, we used an IMN that could apply compression between the fragments and which had rotational stability via many screws. We did not perform bone grafting because the current nonunion was adjudged to be biologically active, and we achieved good functional results. Conclusion. We treated humeral shaft nonunion using IMN with compression, but without bone grafting, leading to successful clinical outcomes. This strategy might be an appropriate choice for the treatment of humeral shaft nonunion with biological activity

Directory of Open Access Journals

Minimally invasive plate osteosynthesis for humeral shaft nonunion: A report of two cases

Authors: Fukui Tomoaki; Inui Atsuyuki; Kawamoto Teruya; Kuroda Ryosuke; Mifune Yutaka; Niikura Takahiro; Oe Keisuke; Suda Yoshihito
Publication venue: Elsevier BV
Publication date: 01/12/2019

Introduction: We treated two cases of humeral shaft nonunion by minimally invasive plate osteosynthesis (MIPO) without autogenous bone grafting. Presentation of case: Case 1: An osteosynthesis with intramedullary nailing (IMN) was performed on a 17-year-old female for a humeral shaft fracture at another hospital; however, bony union was not obtained. We removed the nail and screws, then performed MIPO without autogenous bone grafting. At the final follow-up of 4 years after the surgery, she had obtained full range of motion. Case 2: Osteosynthesis with Rush pins had been performed in a 73-year-old female for a humeral shaft fracture at another hospital. Five months later, a revision surgery using IMN was performed at the same hospital; however, this led to nonunion. We removed the IMN and performed MIPO without autogenous bone grafting. At the final follow-up 2 years after surgery, she had obtained full range of motion. Discussion: The cause of nonunion is the lack of mechanical stability and/or biological activity. In these cases, from the findings of radiography and bone scintigraphy, mechanical instability was thought to be the primary cause; therefore, in order to enhance stability, we used a locking plate. Because these cases appeared to be biologically active, we decided not to use bone grafting. Both our cases successfully achieved bony union and excellent functional recovery using this method. Conclusion: We performed MIPO without exposure of the nonunion site and autogenous bone grafting in two cases of humeral shaft nonunion, and obtained successful clinical outcomes.

Institutional Repositories DataBase (IRDB)

Kobe University Repository Kernel