13 research outputs found

    Can Language Models Solve Graph Problems in Natural Language?

    Large language models (LLMs) are increasingly adopted for a variety of tasks with implicit graphical structures, such as planning in robotics, multi-hop question answering or knowledge probing, structured commonsense reasoning, and more. While LLMs have advanced the state-of-the-art on these tasks with structure implications, whether LLMs could explicitly process textual descriptions of graphs and structures, map them to grounded conceptual spaces, and perform structured operations remains underexplored. To this end, we propose NLGraph (Natural Language Graph), a comprehensive benchmark of graph-based problem solving designed in natural language. NLGraph contains 29,370 problems, covering eight graph reasoning tasks with varying complexity from simple tasks such as connectivity and shortest path up to complex problems such as maximum flow and simulating graph neural networks. We evaluate LLMs (GPT-3/4) with various prompting approaches on the NLGraph benchmark and find that 1) language models do demonstrate preliminary graph reasoning abilities, 2) the benefit of advanced prompting and in-context learning diminishes on more complex graph problems, while 3) LLMs are also (un)surprisingly brittle in the face of spurious correlations in graph and problem settings. We then propose Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based approaches to enhance LLMs in solving natural language graph problems. Build-a-Graph and Algorithmic prompting improve the performance of LLMs on NLGraph by 3.07% to 16.85% across multiple tasks and settings, while how to solve the most complicated graph reasoning tasks in our setup with language models remains an open research question. The NLGraph benchmark and evaluation code are available at https://github.com/Arthur-Heng/NLGraph. Comment: NeurIPS 2023 Spotlight
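
To make the idea of a graph problem "designed in natural language" concrete, here is a minimal sketch of how a connectivity question could be rendered as text and paired with a ground-truth answer computed by breadth-first search. The prompt wording and the helper names (`describe_graph`, `connected`, `make_problem`) are assumptions for illustration only; they are not taken from the NLGraph benchmark or its code.

```python
from collections import deque


def describe_graph(n, edges):
    """Render an undirected graph as plain English (hypothetical wording)."""
    parts = [f"In an undirected graph, the nodes are numbered from 0 to {n - 1}."]
    for u, v in edges:
        parts.append(f"Node {u} is connected to node {v}.")
    return " ".join(parts)


def connected(n, edges, src, dst):
    """Ground truth via breadth-first search: is dst reachable from src?"""
    adj = {i: [] for i in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def make_problem(n, edges, src, dst):
    """Return a (question text, expected answer) pair for one connectivity task."""
    question = (
        f"{describe_graph(n, edges)} "
        f"Question: Is there a path between node {src} and node {dst}?"
    )
    answer = "yes" if connected(n, edges, src, dst) else "no"
    return question, answer


question, answer = make_problem(4, [(0, 1), (1, 2)], 0, 3)
print(question)
print("Expected answer:", answer)
```

The question string would be sent to a language model and its reply compared against the computed answer; harder tasks such as shortest path or maximum flow would need their own reference solvers.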

    KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models

    Large language models (LLMs) demonstrate remarkable performance on knowledge-intensive tasks, suggesting that real-world knowledge is encoded in their model parameters. However, besides explorations on a few probing tasks in limited knowledge domains, it is not well understood how to evaluate LLMs' knowledge systematically and how well their knowledge abilities generalize, across a spectrum of knowledge domains and progressively complex task formats. To this end, we propose KGQuiz, a knowledge-intensive benchmark to comprehensively investigate the knowledge generalization abilities of LLMs. KGQuiz is a scalable framework constructed from triplet-based knowledge, which covers three knowledge domains and consists of five tasks with increasing complexity: true-or-false, multiple-choice QA, blank filling, factual editing, and open-ended knowledge generation. To gain a better understanding of LLMs' knowledge abilities and their generalization, we evaluate 10 open-source and black-box LLMs on the KGQuiz benchmark across the five knowledge-intensive tasks and knowledge domains. Extensive experiments demonstrate that LLMs achieve impressive performance in straightforward knowledge QA tasks, while settings and contexts requiring more complex reasoning or employing domain-specific facts still present significant challenges. We envision KGQuiz as a testbed to analyze such nuanced variations in performance across domains and task formats, and ultimately to understand, evaluate, and improve LLMs' knowledge abilities across a wide spectrum of knowledge domains and tasks

    HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention

    Twitter bot detection has become an increasingly important and challenging task to combat online misinformation, facilitate social content moderation, and safeguard the integrity of social platforms. Though existing graph-based Twitter bot detection methods achieve state-of-the-art performance, they are all based on the homophily assumption, which assumes users with the same label are more likely to be connected, making it easy for Twitter bots to disguise themselves by following a large number of genuine users. To address this issue, we propose HOFA, a novel graph-based Twitter bot detection framework that combats the heterophilous disguise challenge with a homophily-oriented graph augmentation module (Homo-Aug) and a frequency adaptive attention module (FaAt). Specifically, Homo-Aug extracts user representations with an MLP, computes a k-NN graph from them, and improves Twitter's homophily by injecting the k-NN graph. For FaAt, we propose an attention mechanism that adaptively serves as a low-pass filter along a homophilic edge and a high-pass filter along a heterophilic edge, preventing user features from being over-smoothed by their neighborhood. We also introduce a weight guidance loss to guide the frequency adaptive attention module. Our experiments demonstrate that HOFA achieves state-of-the-art performance on three widely-acknowledged Twitter bot detection benchmarks, significantly outperforming vanilla graph-based bot detection techniques and strong heterophilic baselines. Furthermore, extensive studies confirm the effectiveness of our Homo-Aug and FaAt modules, and HOFA's ability to demystify the heterophilous disguise challenge. Comment: 11 pages, 7 figures
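
The k-NN injection step can be illustrated with a small sketch. It assumes user representations are already available as plain vectors (in HOFA they come from an MLP, which is not reproduced here), uses cosine similarity as the neighbour metric, and measures homophily as the fraction of edges joining same-label users. The function names, the toy data, and the choice of metric are all assumptions, not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_edges(embeddings, k):
    """Connect every node to its k most similar nodes (undirected edges)."""
    edges = set()
    for i, emb_i in enumerate(embeddings):
        scored = sorted(
            ((cosine(emb_i, emb_j), j) for j, emb_j in enumerate(embeddings) if j != i),
            reverse=True,
        )
        for _, j in scored[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy data (hypothetical): label 0 = genuine user, label 1 = bot.
labels = [0, 0, 1, 1]
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
# Bots following genuine users: every original edge is heterophilic.
follow_edges = {(0, 2), (1, 3), (0, 3)}

augmented = follow_edges | knn_edges(embeddings, k=1)
print("homophily before:", edge_homophily(follow_edges, labels))
print("homophily after: ", edge_homophily(augmented, labels))
```

In this toy case the injected edges link users with similar representations, so the share of same-label edges rises; how much it helps on real data depends on how well the learned representations separate bots from genuine users.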

    Study on Penetration Performance of Rear Shaped Charge Warhead

    In guided munitions, the shaped charge jet (SCJ) warhead is located behind the simulation compartment (including the control cabin, the steering gear cabin, and the guidance cabin). The SCJ therefore penetrates the simulation compartment first and then the target. To study the penetration performance of the SCJ against the target plate, numerical simulation is used to evaluate the designed warhead against a steel target at different standoffs, and the depth of penetration (DOP) at the best standoff is obtained: about 128 mm into the steel target. Additionally, the penetration performance of the SCJ warhead against the target is studied by numerical simulation and experimental verification. Numerical simulation and experimental results show that the DOP of the SCJ warhead into the steel target is 50 mm without the simulation compartment and about 30 mm with it. The results show that the penetration performance of the SCJ is greatly weakened at a non-optimal standoff, but the rear shaped charge warhead still retains strong penetration performance after penetrating the simulation compartment.

    Machine Learning in Neuroimaging: A New Approach to Understand Acupuncture for Neuroplasticity

    The effects of acupuncture in facilitating neural plasticity to treat diseases have been identified by clinical and experimental studies. In the last two decades, the application of neuroimaging techniques in acupuncture research has provided visualized evidence that acupuncture promotes neuroplasticity. Recently, the integration of machine learning (ML) and neuroimaging techniques has become a focus in neuroscience and offers a new and promising approach to understanding how acupuncture facilitates neuroplasticity at the individual level. This review aims to provide an overview of this rapidly growing field by briefly introducing the ML algorithms commonly used in neuroimaging studies and analyzing the characteristics of acupuncture studies based on ML and neuroimaging, so as to provide references for future research.

    Hydrocarbon charging stage and accumulation mode of forward fault step zone in fault basin: taking the Chengbei fault step zone in Qikou Sag, Bohai Bay Basin as an example

    Fault step zones are well developed in faulted lake basins in eastern China, and forward fault step zones account for a large proportion of them. Their spatiotemporal configuration, hydrocarbon accumulation periods, and accumulation conditions play a key role in the geological evaluation of regional oil and gas exploration but are difficult to study. To further clarify the hydrocarbon accumulation mode and charging period, the authors took the Paleogene Dongying and Shahejie formations in the study area as the main target layers and selected wells Qidong 3-1, Zhang 10, and Chenghai 16 to represent the low, medium, and high fault steps of the fault area, respectively. They studied the oil and gas properties and distribution patterns of the different charging stages in the various districts of Chengbei through microscopic petrographic observation of the reservoir, fluid inclusion identification, homogenization temperature and salinity tests of aqueous inclusions, GOI statistical analysis, and other techniques. Laser Raman tests were applied to identify the gas composition of single inclusions in wells Zhang 10 and Qidong 3-1. The oil and gas accumulation time in the Chengbei fault step zone was determined by combining the single-well burial history, the paleogeothermal history, and K-Ar isotopic dating of authigenic illite from well Qidong 3-1, and the four accumulation elements of source rocks, reservoirs, preservation, and transportation in the Chengbei area were summarized. There are two charging stages in the reservoir: the first is from the end of Dongying to the early stage of Guantao, at about 16 Ma, while the second is in the early stage of Minghuazhen, starting at about 6 Ma and continuing to the present. Multiple sets of reservoir layers are developed in the study area, and the special fault step structure plays a controlling role in connecting the source rocks and reservoirs, in the sealing and preservation of oil and gas, and in the migration and accumulation of hydrocarbons. The regional accumulation conditions are relatively mature, and the process of oil and gas accumulation reflects both a lateral migration-accumulation mode of dual-source hydrocarbon supply and multi-stage accumulation and a longitudinal migration-accumulation mode of fault-sand coupling and relay climbing.

    Sichuan Rainfall Prediction Using an Analog Ensemble

    This study aimed to address the significant bias in 0–44-day precipitation forecasts under numerical weather conditions. To achieve this, we utilized observational data obtained from 156 surface stations in the Sichuan region and reanalysis grid data from the National Centers for Environmental Prediction Climate Forecast System Model version 2. Statistical analysis of the spatiotemporal characteristics of precipitation in Sichuan was conducted, followed by a correction experiment based on the Analog Ensemble algorithm for 0–44-day precipitation forecasts for different seasons in the Sichuan region. The results show that, in terms of spatial distribution, the precipitation amounts and precipitation days in Sichuan Province gradually decreased from east to west. Temporally, the highest number of precipitation days occurred in autumn, while the maximum precipitation amount was observed in summer. The Analog Ensemble algorithm effectively reduced the error in the model forecast results for different seasons in the Sichuan region. However, the correction effectiveness varied seasonally, primarily because of the differing performance of the AnEn method in relation to precipitation events of various magnitudes. Notably, the correction effect was the poorest for heavy-rain forecasts. In addition, the degree of improvement of the Analog Ensemble algorithm varied for different initial forecast times and forecast lead times. As the forecast lead time increased, the correction effect gradually weakened
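
The core idea of an analog ensemble can be sketched in a few lines: for a new model forecast, look up the past forecasts most similar to it and use the observations that verified those past forecasts as the ensemble. The version below is a deliberately simplified, single-variable sketch with hypothetical numbers; operational analog ensembles typically compare several predictor variables over a short time window with a weighted distance, and nothing here reflects the configuration used in the study.

```python
def analog_ensemble(current_forecast, past_forecasts, past_observations, k):
    """Correct a forecast using the k most similar past forecasts.

    Returns (ensemble mean, ensemble members), where each member is the
    observation that verified one of the k closest past forecasts.
    """
    if len(past_forecasts) != len(past_observations):
        raise ValueError("forecast and observation histories must align")
    if not 0 < k <= len(past_forecasts):
        raise ValueError("k must be between 1 and the history length")
    # Rank historical cases by how close their forecast was to today's forecast.
    ranked = sorted(
        range(len(past_forecasts)),
        key=lambda i: abs(past_forecasts[i] - current_forecast),
    )
    members = [past_observations[i] for i in ranked[:k]]
    return sum(members) / len(members), members


# Hypothetical daily precipitation history in mm: the model tends to over-forecast.
past_forecasts = [2.0, 10.0, 5.0, 9.0, 1.0]
past_observations = [1.0, 7.0, 3.5, 6.0, 0.5]

corrected, members = analog_ensemble(9.5, past_forecasts, past_observations, k=2)
print("analog members:", members)
print("corrected forecast:", corrected)
```

Because the corrected value is built from past observations, the quality of the correction depends on how many close analogs the history contains, which is one plausible reason rare heavy-rain events are harder to correct.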

    The Specific and Nonspecific Effects of Tai Chi and Its Possible Central Responses: A Protocol of Neuroimaging Study

    Tai Chi has been proven to be a safe and effective assistant therapy for healthcare and disease treatment. However, whether the adjuvant therapeutic effect of Tai Chi is general or disease-oriented remains uncertain. This trial focuses on exploring the specific and nonspecific effects of Tai Chi and its potential central responses. The results will deepen our understanding of the characteristics of Tai Chi exercise for adjuvant therapeutic effects and promote its application in the clinic. In this neuroimaging trial, 40 functional constipation (FC) patients and 40 healthy subjects (HS) will be recruited and will receive 10 weeks of Tai Chi exercise. The motor function, respiratory function, stool-related symptoms, quality of life, and emotional state of the participants will be evaluated at the baseline, the 5-week Tai Chi practice, and the end of practice. The potential changes in the heart rate variability and the cerebral function will be recorded by the 24 h dynamic electrocardiogram at the baseline and the functional magnetic resonance imaging at the end of practice. The possible correlations among the clinical variables, the heart rate variability, and the cerebral activity alterations in FC patients and HS will be analyzed. The healthcare and therapeutic effects of Tai Chi exercise might consist of the specific and nonspecific effects. This study provides not only a new perspective for understanding Tai Chi but also a new approach for investigating the mind-body exercise. This trial was registered in the Chinese Clinical Trial Registry (http://www.chictr.org.cn/showproj.aspx?proj=33243) on 28 November 2018 (registration number: ChiCTR1800019781; protocol version number: V1.0). This trial is currently in the stage of recruiting patients. The first patient was included on 1 December 2018. To date, 18 FC patients and 20 HS have been included. Recruitment will be completed in December 2020