141 research outputs found

    Probing Physical Reasoning with Counter-Commonsense Context

    In this study, we create a CConS (Counter-commonsense Contextual Size comparison) dataset to investigate how physical commonsense affects the contextualized size comparison task; the proposed dataset consists of both contexts that fit physical commonsense and those that do not. This dataset tests the ability of language models to predict the size relationship between objects under various contexts generated from our curated noun list and templates. We measure the ability of several masked language models and generative models. The results show that while large language models can use prepositions such as "in" and "into" in the provided context to infer size relationships, they fail to use verbs and thus make incorrect judgments led by their prior physical commonsense. Comment: Accepted to ACL 2023 (Short Paper)

    Characteristics and gender differences in the medical interview skills of Japanese medical students


    Assessing the Benchmarking Capacity of Machine Reading Comprehension Datasets

    Existing analysis work in machine reading comprehension (MRC) is largely concerned with evaluating the capabilities of systems. However, the capability of datasets to benchmark language understanding precisely has not been assessed. We propose a semi-automated, ablation-based methodology for this challenge: by checking whether questions can be solved even after removing features associated with a skill requisite for language understanding, we evaluate to what degree the questions do not require the skill. Experiments on 10 datasets (e.g., CoQA, SQuAD v2.0, and RACE) with a strong baseline model show that, for example, the relative scores of a baseline model provided with content words only and with shuffled sentence words in the context are on average 89.2% and 78.5% of the original score, respectively. These results suggest that most of the questions already answered correctly by the model do not necessarily require grammatical and complex reasoning. For precise benchmarking, MRC datasets will need to take extra care in their design to ensure that questions can correctly evaluate the intended skills. Comment: 11 pages, AAAI 2020, with extra examples, data: https://github.com/Alab-NII/mrc-ablatio

    A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension

    The issue of shortcut learning is widely known in NLP and has been an important research focus in recent years. Unintended correlations in the data enable models to easily solve tasks that were meant to exhibit advanced language understanding and reasoning capabilities. In this survey paper, we focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level language understanding that also suffers from a range of shortcuts. We summarize the available techniques for measuring and mitigating shortcuts and conclude with suggestions for further progress in shortcut research. Importantly, we highlight two concerns for shortcut mitigation in MRC: (1) the lack of public challenge sets, a necessary component for effective and reusable evaluation, and (2) the lack of certain mitigation techniques that are prominent in other areas. Comment: 18 pages, 2 figures, 4 tables

    Benchmarking Machine Reading Comprehension: A Psychological Perspective

    Machine reading comprehension (MRC) has received considerable attention as a benchmark for natural language understanding. However, the conventional task design of MRC lacks explainability beyond the model interpretation, i.e., reading comprehension by a model cannot be explained in human terms. To this end, this position paper provides a theoretical basis for the design of MRC datasets based on psychology as well as psychometrics, and summarizes it in terms of the prerequisites for benchmarking MRC. We conclude that future datasets should (i) evaluate the capability of the model for constructing a coherent and grounded representation to understand context-dependent situations and (ii) ensure substantive validity by shortcut-proof questions and explanation as a part of the task design. Comment: 21 pages, EACL 202


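    From a hexane extract of wheat flour infested by the sawtoothed grain beetle [Oryzaephilus surinamensis (L.); Coleoptera; Silvanidae], three ketosteroids, cholestan-3-one (3), ergostan-3-one (4), and stigmastan-3-one (5), were obtained as a mixture and identified as arrestants for this beetle. The hexane extract of wheat damaged by the sawtoothed grain beetle, a stored-grain pest known worldwide, contains several substances with arrestant activity toward the beetle that are absent from undamaged wheat, and the structures of two of these active substances had already been elucidated. In this study, using various instrumental analyses and derivation from commercially available compounds, the unknown active substance was identified as a mixture of cholestan-3-one, ergostan-3-one, and stigmastan-3-one.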

    Striped catfish (Pangasianodon hypophthalmus) exploit food sources across anaerobic decomposition- and primary photosynthetic production-based food chains

    Dietary information from aquatic organisms is instrumental in predicting biological interactions and understanding ecosystem functionality. In freshwater habitats, generalist fish species can access a diverse array of food sources from multiple food chains. These may include primary photosynthetic production and detritus derived from both oxic and anoxic decomposition. However, the exploitation of anoxic decomposition products by fish remains insufficiently explored. This study examines the feeding habits of striped catfish (Pangasianodon hypophthalmus) at both adult and juvenile stages within a tropical reservoir, using stable carbon, nitrogen, and sulfur isotope ratios (δ¹³C, δ¹⁵N, and δ³⁴S, respectively) and fatty acid (FA) analyses. The adult catfish exhibited higher δ¹⁵N values than primary consumers that feed on primary photosynthetic producers, which suggests ingestion of food sources originating from primary photosynthetic production-based food chains. On the other hand, juvenile catfish showed lower δ¹⁵N values than primary consumers, together with low δ³⁴S values, large proportions of bacterial FA, and small proportions of polyunsaturated FA. This implies that juveniles utilize food sources from both anoxic decomposition and primary photosynthetic production-based food chains. Our results indicate that food chains based on anoxic decomposition can indeed contribute to the dietary sources of tropical fish species.