134 research outputs found

    Can We `Feel' the Temperature of Knowledge? Modelling Scientific Popularity Dynamics via Thermodynamics

    Just like everything in nature, scientific topics flourish and perish. While existing literature captures an article's life cycle well via citation patterns, little is known about how scientific popularity and impact evolve for a specific topic. It would be most intuitive if we could 'feel' a topic's activity just as we perceive the weather by temperature. Here, we conceive knowledge temperature to quantify a topic's overall popularity and impact through citation network dynamics. Knowledge temperature includes two parts. One part depicts lasting impact by assessing knowledge accumulation with an analogy between topic evolution and isobaric expansion. The other part gauges temporal changes in knowledge structure, an embodiment of short-term popularity, through the rate of entropy change with internal energy, two thermodynamic variables approximated via node degree and edge number. Our analysis of representative topics with sizes ranging from 1000 to over 30000 articles reveals that the key to flourishing is a topic's ability to accumulate useful information for future knowledge generation. Topics particularly experience temperature surges when their knowledge structure is altered by influential articles. The spike is especially obvious when a single non-trivial novel research focus or a merging in topic structure appears. Overall, knowledge temperature manifests topics' distinct evolutionary cycles.

    New insights into stress changes before and after the Wenchuan Earthquake using hydraulic fracturing measurements

    This paper summarizes in situ stress data obtained by the hydraulic fracturing method over the past 10 years along the Longmenshan fault belt; these data can be divided into three segments: northern, middle, and southern. The orientation of the maximum horizontal stress rotates from north-northwest in the northern segment to northwest in the middle segment, and to west-northwest in the southern segment. The stress magnitudes are characterized by higher values at the two ends and lower values in the middle segment. Furthermore, three stress measurement campaigns in two boreholes on the northern segment of the Longmenshan fault belt, before and after the great earthquake, show a clear stress decrease of 23%–29% in the shallow crust after the earthquake. Analysis using the mathematical fitting method also indicates a decrease in the regional stress state after the earthquake. Meanwhile, the frictional characteristic indexes based on the stress measurements further imply that the frictional strength of the Longmenshan fault belt is characterized by a strong southern segment, a weak middle segment, and a moderately strong northern segment. The analysis based on the stress data implies that the northern and southern segments of the fault belt are extremely important stress concentration and transformation nodes of the regional stress regime.

    Exploring and Verbalizing Academic Ideas by Concept Co-occurrence

    Researchers usually come up with new ideas only after thoroughly comprehending vast quantities of literature. The difficulty of this procedure is exacerbated by the fact that the number of academic publications is growing exponentially. In this study, we devise a framework based on concept co-occurrence for academic idea inspiration, which has been integrated into a research assistant system. From our perspective, the fusion of two concepts that co-occur in an academic paper can be regarded as an important way in which a new idea emerges. We construct evolving concept graphs according to the co-occurrence relationship of concepts from 20 disciplines or topics. Then we design a temporal link prediction method based on a masked language model to explore potential connections between different concepts. To verbalize the newly discovered connections, we also use a pretrained language model to generate a description of an idea based on a new data structure called the co-occurrence citation quintuple. We evaluate our proposed system using both automatic metrics and human assessment. The results demonstrate that our system has broad prospects and can assist researchers in expediting the process of discovering new ideas. Comment: Accepted by ACL 202

    Graph Out-of-Distribution Generalization with Controllable Data Augmentation

    Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties. However, due to the selection bias of training and testing data (e.g., training on small graphs and testing on large graphs, or training on dense graphs and testing on sparse graphs), distribution deviation is widespread. More importantly, we often observe a hybrid structure distribution shift of both scale and density, despite a one-sided biased data partition. The spurious correlations over hybrid distribution deviation degrade the performance of previous GNN methods and show large instability among different datasets. To alleviate this problem, we propose OOD-GMixup to jointly manipulate the training distribution with controllable data augmentation in metric space. Specifically, we first extract the graph rationales to eliminate the spurious correlations due to irrelevant information. Secondly, we generate virtual samples with perturbation on the graph rationale representation domain to obtain potential OOD training samples. Finally, we propose OOD calibration to measure the distribution deviation of virtual samples by leveraging Extreme Value Theory, and further actively control the training distribution by emphasizing the impact of virtual OOD samples. Extensive studies on several real-world graph classification datasets demonstrate the superiority of our proposed method over state-of-the-art baselines. Comment: Under review

    Exploring the Limits of Historical Information for Temporal Knowledge Graph Extrapolation

    Temporal knowledge graphs, representing the dynamic relationships and interactions between entities over time, have been identified as a promising approach for event forecasting. However, a limitation of most temporal knowledge graph reasoning methods is their heavy reliance on the recurrence or periodicity of events, which makes it challenging to infer future events related to entities that lack historical interaction. In fact, the current state of affairs is often the result of a combination of historical information and underlying factors that are not directly observable. To this end, we investigate the limits of historical information for temporal knowledge graph extrapolation and propose a new event forecasting model called Contrastive Event Network (CENET) based on a novel training framework of historical contrastive learning. CENET learns both the historical and non-historical dependency to distinguish the entities most likely to match the given query. Simultaneously, through contrastive learning, it trains representations of queries to probe whether the current moment is more dependent on historical or non-historical events. These representations further help train a binary classifier, whose output is a boolean mask indicating the related entities in the search space. During the inference process, CENET employs a mask-based strategy to generate the final results. We evaluate our proposed model on five benchmark graphs. The results demonstrate that CENET significantly outperforms all existing methods in most metrics, achieving at least 8.3% relative improvement of Hits@1 over previous state-of-the-art baselines on event-based datasets. Comment: Extended version of AAAI paper arXiv:2211.1090

    Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus

    Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields. However, LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations in many real-world applications. Existing methods for detecting hallucinations in LLMs either rely on external knowledge for reference retrieval or require sampling multiple responses from the LLM for consistency verification, making them costly and inefficient. In this paper, we propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs. Our approach imitates human focus in factuality checking from three aspects: 1) focus on the most informative and important keywords in the given text; 2) focus on the unreliable tokens in historical context which may lead to a cascade of hallucinations; and 3) focus on the token properties such as token type and token frequency. Experimental results on relevant datasets demonstrate the effectiveness of our proposed method, which achieves state-of-the-art performance across all the evaluation metrics and eliminates the need for additional information. Comment: Accepted by EMNLP 2023 (main conference)

    Effects of Heterozygous Deletion of Autism-related gene Cullin-3 in mice

    Autism Spectrum Disorder (ASD) is a developmental disorder in which children display repetitive behavior, restricted range of interests, and atypical social interaction and communication. CUL3, coding for a Cullin family scaffold protein mediating assembly of ubiquitin ligase complexes through BTB domain substrate-recruiting adaptors, has been identified as a high-risk gene for autism. Although complete knockout of Cul3 results in embryonic lethality, Cul3 heterozygous mice have reduced CUL3 protein, demonstrate comparable body weight, and display minimal behavioral differences, including decreased spatial object recognition memory. In measures of reciprocal social interaction, Cul3 heterozygous mice behaved similarly to their wild-type littermates. In area CA1 of the hippocampus, reduction of Cul3 significantly increased mEPSC frequency but did not change mEPSC amplitude, baseline evoked synaptic transmission, or paired-pulse ratio. Sholl and spine analysis data suggest there is a small yet significant difference in CA1 pyramidal neuron dendritic branching and stubby spine density. Unbiased proteomic analysis of Cul3 heterozygous brain tissue revealed dysregulation of various cytoskeletal organization proteins, among others. Overall, our results suggest that Cul3 heterozygous deletion impairs spatial object recognition memory and alters cytoskeletal organization proteins, but does not cause major hippocampal neuronal morphology, functional, or behavioral abnormalities in adult global Cul3 heterozygous mice.

    ONCache: A Cache-Based Low-Overhead Container Overlay Network

    Recent years have witnessed a widespread adoption of containers. While containers simplify and accelerate application development, existing container network technologies either incur significant overhead, which hurts performance for distributed applications, or lose flexibility or compatibility, which hinders widespread deployment in production. We design and implement ONCache (Overlay Network Cache), a cache-based container overlay network, to eliminate the overhead while keeping flexibility and compatibility. We carefully analyze the difference between an overlay network and a host network, and find that an overlay network incurs extra packet processing, including encapsulating, intra-host routing, namespace traversing and packet filtering. Fortunately, the extra processing exhibits an invariance property, e.g., most packets of the same flow have the same processing results. This property motivates us to cache the extra processing results. With the proposed cache, ONCache significantly reduces the extra overhead while maintaining the same flexibility and compatibility as standard overlay networks. We implement ONCache using eBPF with only 524 lines of code, and deploy ONCache as a plugin of Antrea. With ONCache, container communication achieves performance similar to host communication. Compared to the standard overlay network, ONCache improves the throughput and request-response transaction rate by 12% and 36% for TCP (20% and 34% for UDP), while significantly reducing per-packet CPU overhead. Many distributed applications also benefit from ONCache.