120 research outputs found

    Research in progress: report on the ICAIL 2017 doctoral consortium

    Get PDF
    This paper arose out of the 2017 international conference on AI and law doctoral consortium. There were five students who presented their Ph.D. work, and each of them has contributed a section to this paper. The paper offers a view of what topics are currently engaging students, and shows the diversity of their interests and influences

    Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond

    Full text link
    Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge engineering and artificial intelligence. Recently, Large Language Models (LLMs) have emerged as a noteworthy innovation in natural language processing (NLP), exhibiting impressive achievements across various classic NLP tasks. However, the question of whether LLMs can effectively address the task of logical reasoning, which requires gradual cognitive inference similar to human intelligence, remains unanswered. To this end, we aim to bridge this gap and provide comprehensive evaluations in this paper. Firstly, to offer systematic evaluations, we select fifteen typical logical reasoning datasets and organize them into deductive, inductive, abductive and mixed-form reasoning settings. Considering the comprehensiveness of evaluations, we include three representative LLMs (i.e., text-davinci-003, ChatGPT and BARD) and evaluate them on all selected datasets under zero-shot, one-shot and three-shot settings. Secondly, different from previous evaluations relying only on simple metrics (e.g., accuracy), we propose fine-level evaluations from objective and subjective manners, covering both answers and explanations. Additionally, to uncover the logical flaws of LLMs, problematic cases will be attributed to five error types from two dimensions, i.e., evidence selection process and reasoning process. Thirdly, to avoid the influences of knowledge bias and purely focus on benchmarking the logical reasoning capability of LLMs, we propose a new dataset with neutral content. It contains 3,000 samples and covers deductive, inductive and abductive settings. Based on the in-depth evaluations, this paper finally forms a general evaluation scheme of logical reasoning capability from six dimensions. It reflects the pros and cons of LLMs and gives guiding directions for future works.Comment: 14 pages, 11 figure

    Neuro-symbolic Computation for XAI: Towards a Unified Model

    Get PDF
    The idea of integrating symbolic and sub-symbolic approaches to make intelligent systems (IS) understandable and explainable is at the core of new fields such as neuro-symbolic computing (NSC). This work lays under the umbrella of NSC, and aims at a twofold objective. First, we present a set of guidelines aimed at building explainable IS, which leverage on logic induction and constraints to integrate symbolic and sub-symbolic approaches. Then, we reify the proposed guidelines into a case study to show their effectiveness and potential, presenting a prototype built on the top of some NSC technologies

    A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras

    Get PDF
    Categorical compositional distributional semantics is a model of natural language; it combines the statistical vector space models of words with the compositional models of grammar. We formalise in this model the generalised quantifier theory of natural language, due to Barwise and Cooper. The underlying setting is a compact closed category with bialgebras. We start from a generative grammar formalisation and develop an abstract categorical compositional semantics for it, then instantiate the abstract setting to sets and relations and to finite dimensional vector spaces and linear maps. We prove the equivalence of the relational instantiation to the truth theoretic semantics of generalised quantifiers. The vector space instantiation formalises the statistical usages of words and enables us to, for the first time, reason about quantified phrases and sentences compositionally in distributional semantics

    Deductive Verification of Chain-of-Thought Reasoning

    Full text link
    Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks. While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each only receiving their necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps where subsequent steps are more rigorously grounded on prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustfulness of generated reasoning steps. Along this process, we also improve the answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot

    An intelligent system for facility management

    Get PDF
    A software system has been developed that monitors and interprets temporally changing (internal) building environments and generates related knowledge that can assist in facility management (FM) decision making. The use of the multi agent paradigm renders a system that delivers demonstrable rationality and is robust within the dynamic environment that it operates. Agent behaviour directed at working toward goals is rendered intelligent with semantic web technologies. The capture of semantics though formal expression to model the environment, adds a richness that the agents exploit to intelligently determine behaviours to satisfy goals that are flexible and adaptable. The agent goals are to generate knowledge about building space usage as well as environmental conditions by elaborating and combining near real time sensor data and information from conventional building models. Additionally further inferences are facilitated including those about wasted resources such as unnecessary lighting and heating for example. In contrast, current FM tools, lacking automatic synchronisation with the domain and rich semantic modelling, are limited to the simpler querying of manually maintained models.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    An intelligent system for facility management

    Get PDF
    A software system has been developed that monitors and interprets temporally changing (internal) building environments and generates related knowledge that can assist in facility management (FM) decision making. The use of the multi agent paradigm renders a system that delivers demonstrable rationality and is robust within the dynamic environment that it operates. Agent behaviour directed at working toward goals is rendered intelligent with semantic web technologies. The capture of semantics though formal expression to model the environment, adds a richness that the agents exploit to intelligently determine behaviours to satisfy goals that are flexible and adaptable. The agent goals are to generate knowledge about building space usage as well as environmental conditions by elaborating and combining near real time sensor data and information from conventional building models. Additionally further inferences are facilitated including those about wasted resources such as unnecessary lighting and heating for example. In contrast, current FM tools, lacking automatic synchronisation with the domain and rich semantic modelling, are limited to the simpler querying of manually maintained models

    A Survey of Large Language Models

    Full text link
    Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.Comment: ongoing work; 51 page
    • …
    corecore