8 research outputs found

    Causal Structure Learning Supervised by Large Language Model

    Full text link
    Causal discovery from observational data is pivotal for deciphering complex relationships. Causal Structure Learning (CSL), which focuses on deriving causal Directed Acyclic Graphs (DAGs) from data, faces challenges due to vast DAG spaces and data sparsity. The integration of Large Language Models (LLMs), recognized for their causal reasoning capabilities, offers a promising direction to enhance CSL by infusing it with knowledge-based causal inferences. However, existing approaches utilizing LLMs for CSL have encountered issues, including unreliable constraints from imperfect LLM inferences and the computational intensity of full pairwise variable analyses. In response, we introduce the Iterative LLM Supervised CSL (ILS-CSL) framework. ILS-CSL innovatively integrates LLM-based causal inference with CSL in an iterative process, refining the causal DAG using feedback from LLMs. This method not only utilizes LLM resources more efficiently but also generates more robust and high-quality structural constraints compared to previous methodologies. Our comprehensive evaluation across eight real-world datasets demonstrates ILS-CSL's superior performance, setting a new standard in CSL efficacy and showcasing its potential to significantly advance the field of causal discovery. The codes are available at \url{https://github.com/tyMadara/ILS-CSL}

    Causal Inference Using LLM-Guided Discovery

    Full text link
    At the core of causal inference lies the challenge of determining reliable causal graphs solely based on observational data. Since the well-known backdoor criterion depends on the graph, any errors in the graph can propagate downstream to effect inference. In this work, we initially show that complete graph information is not necessary for causal effect inference; the topological order over graph variables (causal order) alone suffices. Further, given a node pair, causal order is easier to elicit from domain experts compared to graph edges since determining the existence of an edge can depend extensively on other variables. Interestingly, we find that the same principle holds for Large Language Models (LLMs) such as GPT-3.5-turbo and GPT-4, motivating an automated method to obtain causal order (and hence causal effect) with LLMs acting as virtual domain experts. To this end, we employ different prompting strategies and contextual cues to propose a robust technique of obtaining causal order from LLMs. Acknowledging LLMs' limitations, we also study possible techniques to integrate LLMs with established causal discovery algorithms, including constraint-based and score-based methods, to enhance their performance. Extensive experiments demonstrate that our approach significantly improves causal ordering accuracy as compared to discovery algorithms, highlighting the potential of LLMs to enhance causal inference across diverse fields

    Classifiers for modeling of mineral potential

    Get PDF
    [Extract] Classification and allocation of land-use is a major policy objective in most countries. Such an undertaking, however, in the face of competing demands from different stakeholders, requires reliable information on resources potential. This type of information enables policy decision-makers to estimate socio-economic benefits from different possible land-use types and then to allocate most suitable land-use. The potential for several types of resources occurring on the earth's surface (e.g., forest, soil, etc.) is generally easier to determine than those occurring in the subsurface (e.g., mineral deposits, etc.). In many situations, therefore, information on potential for subsurface occurring resources is not among the inputs to land-use decision-making [85]. Consequently, many potentially mineralized lands are alienated usually to, say, further exploration and exploitation of mineral deposits. Areas with mineral potential are characterized by geological features associated genetically and spatially with the type of mineral deposits sought. The term 'mineral deposits' means .accumulations or concentrations of one or more useful naturally occurring substances, which are otherwise usually distributed sparsely in the earth's crust. The term 'mineralization' refers to collective geological processes that result in formation of mineral deposits. The term 'mineral potential' describes the probability or favorability for occurrence of mineral deposits or mineralization. The geological features characteristic of mineralized land, which are called recognition criteria, are spatial objects indicative of or produced by individual geological processes that acted together to form mineral deposits. Recognition criteria are sometimes directly observable; more often, their presence is inferred from one or more geographically referenced (or spatial) datasets, which are processed and analyzed appropriately to enhance, extract, and represent the recognition criteria as spatial evidence or predictor maps. Mineral potential mapping then involves integration of predictor maps in order to classify areas of unique combinations of spatial predictor patterns, called unique conditions [51] as either barren or mineralized with respect to the mineral deposit-type sought

    Interactive Causal Structure Discovery

    Get PDF
    Multiple algorithms exist for the detection of causal relations from observational data but they are limited by their required assumptions regarding the data or by available computational resources. Only limited amount of information can be extracted from finite data but domain experts often have some knowledge of the underlying processes. We propose combining an expert’s prior knowledge with data likelihood to find models with high posterior probability. Our high-level procedure for interactive causal structure discovery contains three modules: discovery of initial models, navigation in the space of causal structures, and validation for model selection and evaluation. We present one manner of formulating the problem and implementing the approach assuming a rational, Bayesian expert which assumption we use to model the user in simulated experiments. The expert navigates greedily in the structure space using their prior information and the structures’ fit to data to find a local maximum a posteriori structure. Existing algorithms provide initial models for the navigation. Through simulated user experiments with synthetic data and use cases with real-world data, we find that the results of causal analysis can be improved by adding prior knowledge. Additionally, different initial models can lead to the expert finding different causal models and model validation helps detect overfitting and concept drift

    Learning Bayesian network equivalence classes using ant colony optimisation

    Get PDF
    Bayesian networks have become an indispensable tool in the modelling of uncertain knowledge. Conceptually, they consist of two parts: a directed acyclic graph called the structure, and conditional probability distributions attached to each node known as the parameters. As a result of their expressiveness, understandability and rigorous mathematical basis, Bayesian networks have become one of the first methods investigated, when faced with an uncertain problem domain. However, a recurring problem persists in specifying a Bayesian network. Both the structure and parameters can be difficult for experts to conceive, especially if their knowledge is tacit.To counteract these problems, research has been ongoing, on learning both the structure and parameters of Bayesian networks from data. Whilst there are simple methods for learning the parameters, learning the structure has proved harder. Part ofthis stems from the NP-hardness of the problem and the super-exponential space of possible structures. To help solve this task, this thesis seeks to employ a relatively new technique, that has had much success in tackling NP-hard problems. This technique is called ant colony optimisation. Ant colony optimisation is a metaheuristic based on the behaviour of ants acting together in a colony. It uses the stochastic activity of artificial ants to find good solutions to combinatorial optimisation problems. In the current work, this method is applied to the problem of searching through the space of equivalence classes of Bayesian networks, in order to find a good match against a set of data. The system uses operators that evaluate potential modifications to a current state. Each of the modifications is scored and the results used to inform the search. In order to facilitate these steps, other techniques are also devised, to speed up the learning process. The techniques includeThe techniques are tested by sampling data from gold standard networks and learning structures from this sampled data. These structures are analysed using various goodnessof-fit measures to see how well the algorithms perform. The measures include structural similarity metrics and Bayesian scoring metrics. The results are compared in depth against systems that also use ant colony optimisation and other methods, including evolutionary programming and greedy heuristics. Also, comparisons are made to well known state-of-the-art algorithms and a study performed on a real-life data set. The results show favourable performance compared to the other methods and on modelling the real-life data
    corecore