
    A Tale of Two Direction Codes in Rat Retrosplenial Cortex: Uncovering the Neural Basis of Spatial Orientation in Complex Space

    Head direction (HD) cells become active only when a rat faces a particular direction and stay inactive when it faces others, producing a unimodal activity distribution. Working together in a network, HD cells are considered the neural basis supporting a sense of direction. The retrosplenial cortex (RSC) is part of the HD circuit and contains neurons that express multiple spatial signals, including a pattern of bipolar directional tuning, as recently reported in rats exploring a rotationally symmetric two-compartment space. This suggests an unexplored mechanism of the neural compass. In this thesis, I investigated whether the association between two-way firing symmetry and twofold environment symmetry reveals a general environment-symmetry-encoding property of these RSC neurons. I recorded RSC neurons in environments having onefold, twofold and fourfold symmetry. The study showed that RSC HD cells maintained a consistent global signal, whereas other RSC directional cells showed multi-fold symmetric firing patterns that reflected environment symmetry, not just globally (across all sub-compartments) but also locally (within each sub-compartment). The analyses also showed that this pattern was independent of egocentric boundary-vector coding and instead represented an allocentric spatial code. This means that these RSC cells use environmental cues to organise multiple singular tuning curves, which are sometimes combined to form a multidirectional pattern, likely via an interaction with the global HD signal. Thus, both local and global environment symmetry are encoded by local firing patterns in subspaces, which suggests cognitive mapping and abstraction of space beyond immediate perceptual bounds in RSC. The data generated from this study provide important insights for modelling direction computation. Taken together, I discuss how having two types of direction codes in RSC may help us orient more accurately and flexibly in complex and ambiguous space.
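
    Illustrative sketch (not from the thesis): the analysis above turns on comparing a cell's directional tuning curve with rotated copies of itself to quantify one-, two- or four-fold firing symmetry. The Python snippet below shows one plausible way to do that; the function names, bin count and toy bipolar cell are assumptions for illustration only.

        import numpy as np

        def tuning_curve(head_angles_deg, spikes, n_bins=60):
            """Occupancy-normalised firing rate per head-direction bin."""
            bins = np.linspace(0, 360, n_bins + 1)
            idx = np.digitize(head_angles_deg % 360, bins) - 1
            occupancy = np.bincount(idx, minlength=n_bins).astype(float)
            spike_sum = np.bincount(idx, weights=spikes, minlength=n_bins)
            return spike_sum / np.maximum(occupancy, 1e-9)

        def nfold_symmetry(curve, n):
            """Correlation of the tuning curve with itself rotated by 360/n degrees."""
            return np.corrcoef(curve, np.roll(curve, len(curve) // n))[0, 1]

        # Toy bipolar cell: fires when the rat faces ~90 or ~270 degrees.
        rng = np.random.default_rng(0)
        angles = rng.uniform(0, 360, 50_000)
        rate = np.exp(3 * np.cos(np.deg2rad(2 * (angles - 90)))) / 10
        curve = tuning_curve(angles, rng.poisson(rate))
        for n in (2, 3, 4):
            print(f"{n}-fold symmetry score: {nfold_symmetry(curve, n):.2f}")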

    Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction

    A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity. In this paper, we explore capsule networks for relation extraction in a multi-instance multi-label learning framework and propose a novel neural approach based on capsule networks with attention mechanisms. We evaluate our method on several benchmarks and demonstrate that it improves the precision of the predicted relations. In particular, we show that capsule networks improve relation extraction for multiple entity pairs. Comment: To be published in EMNLP 2018.
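
    Illustrative sketch (not the paper's code): the approach builds on dynamic routing by agreement between capsules. The numpy snippet below shows the routing step only, with illustrative shapes (8 low-level capsules voting for 3 relation capsules of dimension 4); the paper's attention mechanism and multi-instance multi-label setup are omitted.

        import numpy as np

        def squash(v, axis=-1, eps=1e-9):
            """Non-linearity that keeps a vector's direction and maps its norm into [0, 1)."""
            norm2 = np.sum(v ** 2, axis=axis, keepdims=True)
            return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

        def dynamic_routing(u_hat, n_iters=3):
            """Route prediction vectors u_hat [n_in, n_out, dim] to output capsules."""
            n_in, n_out, _ = u_hat.shape
            b = np.zeros((n_in, n_out))                               # routing logits
            for _ in range(n_iters):
                c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
                s = (c[..., None] * u_hat).sum(axis=0)                # weighted votes -> [n_out, dim]
                v = squash(s)                                         # output capsule vectors
                b = b + (u_hat * v[None]).sum(axis=-1)                # increase agreement
            return v

        rng = np.random.default_rng(0)
        u_hat = rng.normal(size=(8, 3, 4))                            # 8 input capsules, 3 relation capsules
        relation_capsules = dynamic_routing(u_hat)
        print(np.linalg.norm(relation_capsules, axis=-1).round(3))    # norms act like relation probabilities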

    Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks

    We propose a distantly supervised relation extraction approach for long-tailed, imbalanced data, which is prevalent in real-world settings. Here, the challenge is to learn accurate "few-shot" models for classes at the tail of the class distribution, for which little data is available. Inspired by the rich semantic correlations between classes at the long tail and those at the head, we take advantage of the knowledge from data-rich classes at the head of the distribution to boost the performance of the data-poor classes at the tail. First, we leverage implicit relational knowledge among class labels from knowledge graph embeddings and learn explicit relational knowledge using graph convolution networks. Second, we integrate that relational knowledge into the relation extraction model through a coarse-to-fine knowledge-aware attention mechanism. We demonstrate our results on a large-scale benchmark dataset, showing that our approach significantly outperforms other baselines, especially for long-tail relations. Comment: To be published in NAACL 2019.
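
    Illustrative sketch (assumed details, not the authors' implementation): the two ingredients described above are class-label embeddings refined by graph convolution over the label graph, and a knowledge-aware attention that weights the sentences in a bag by a relation query. The snippet below mimics both with random toy data; the dimensions, adjacency matrix and function names are assumptions.

        import numpy as np

        def gcn_layer(node_feats, adj, weight):
            """One graph-convolution pass: self-loops, degree-normalised averaging, projection, ReLU."""
            adj_hat = adj + np.eye(adj.shape[0])
            deg_inv = 1.0 / adj_hat.sum(axis=1, keepdims=True)
            return np.maximum(deg_inv * (adj_hat @ node_feats) @ weight, 0.0)

        def knowledge_aware_attention(sentence_reps, relation_query):
            """Weight a bag of sentence vectors by similarity to a relation-label query."""
            scores = sentence_reps @ relation_query
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()
            return weights @ sentence_reps

        rng = np.random.default_rng(0)
        n_relations, dim = 5, 16
        kg_label_embeddings = rng.normal(size=(n_relations, dim))     # implicit knowledge, e.g. from KG embeddings
        label_graph = (rng.random((n_relations, n_relations)) > 0.6).astype(float)
        explicit = gcn_layer(kg_label_embeddings, label_graph, rng.normal(size=(dim, dim)))

        bag = rng.normal(size=(4, dim))                               # 4 sentences mentioning one entity pair
        bag_rep = knowledge_aware_attention(bag, explicit[2])         # attend with relation #2 as the query
        print(bag_rep.shape)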

    Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba

    Entity recommendation, providing search users with an improved experience by assisting them in finding related entities for a given query, has become an indispensable feature of today's search engines. Existing studies typically consider only queries containing explicit entities; they usually fail to handle complex queries without entities, such as "what food is good for cold weather", because their models cannot infer the underlying meaning of the input text. In this work, we believe that contexts convey valuable evidence that can facilitate the semantic modeling of queries, and we take them into consideration for entity recommendation. To better model the semantics of queries and entities, we learn their representations jointly with attentive deep neural networks. We evaluate our approach using large-scale, real-world search logs from a widely used commercial Chinese search engine. Our system has been deployed in the ShenMa Search Engine and can be accessed through Alibaba's UC Browser. Results from an online A/B test suggest that the impression efficiency of the click-through rate increased by 5.1% and page views increased by 5.5%. Comment: CIKM 2019 International Workshop on Entity Retrieval. arXiv admin note: text overlap with arXiv:1511.08996 by other authors.
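
    Illustrative sketch (assumed architecture, not the deployed system): one simple way to make entity recommendation context-aware is to pool the query's token embeddings with attention weights driven by a session-context vector and rank candidate entities by cosine similarity. All embeddings below are random stand-ins.

        import numpy as np

        def attentive_pool(token_vecs, context_vec):
            """Pool token vectors with attention weights derived from the session context."""
            scores = token_vecs @ context_vec
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()
            return weights @ token_vecs

        def recommend(query_tokens, context_vec, entity_matrix, top_k=3):
            """Rank candidate entities by cosine similarity to the context-aware query vector."""
            q = attentive_pool(query_tokens, context_vec)
            q /= np.linalg.norm(q) + 1e-9
            e = entity_matrix / (np.linalg.norm(entity_matrix, axis=1, keepdims=True) + 1e-9)
            scores = e @ q
            top = np.argsort(-scores)[:top_k]
            return top, scores[top]

        rng = np.random.default_rng(0)
        dim = 32
        query_tokens = rng.normal(size=(6, dim))   # token embeddings of "what food is good for cold weather"
        context_vec = rng.normal(size=dim)         # embedding of the user's recent queries in the session
        entities = rng.normal(size=(1000, dim))    # candidate entity embeddings
        ids, scores = recommend(query_tokens, context_vec, entities)
        print(ids, scores.round(3))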

    Generative Knowledge Graph Construction: A Review

    Generative Knowledge Graph Construction (KGC) refers to methods that leverage the sequence-to-sequence framework to build knowledge graphs; this formulation is flexible and can be adapted to a wide range of tasks. In this study, we summarize the recent compelling progress in generative knowledge graph construction. We present the advantages and weaknesses of each paradigm in terms of different generation targets and provide theoretical insight and empirical analysis. Based on the review, we suggest promising research directions for the future. Our contributions are threefold: (1) we present a detailed, complete taxonomy for generative KGC methods; (2) we provide a theoretical and empirical analysis of generative KGC methods; (3) we propose several research directions that can be developed in the future. Comment: Accepted to EMNLP 2022 (oral); a public repository is available at https://github.com/zjunlp/Generative_KG_Construction_Paper
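
    Illustrative sketch (format assumed, not any specific paper's scheme): the core of generative KGC is serialising (head, relation, tail) triples into a target sequence that a sequence-to-sequence model learns to emit, then parsing the generated text back into triples. The special tokens below are illustrative.

        import re

        def linearise(triples):
            """Serialise (head, relation, tail) triples into one target string for a seq2seq model."""
            return " ".join(f"<head> {h} <rel> {r} <tail> {t}" for h, r, t in triples)

        def parse(generated):
            """Recover triples from a generated sequence (inverse of linearise)."""
            pattern = r"<head>\s*(.*?)\s*<rel>\s*(.*?)\s*<tail>\s*(.*?)(?=<head>|$)"
            return [tuple(s.strip() for s in m) for m in re.findall(pattern, generated)]

        gold = [("Marie Curie", "born_in", "Warsaw"), ("Marie Curie", "field", "physics")]
        target = linearise(gold)
        print(target)                 # training target for the decoder
        print(parse(target) == gold)  # True: the mapping is invertible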

    Towards A Unified View of Answer Calibration for Multi-Step Reasoning

    Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have broadened the scope for improving multi-step reasoning capabilities. Answer calibration strategies, such as step-level or path-level calibration, usually play a vital role in multi-step reasoning. While these strategies are effective, there remains a significant gap in our understanding of the key factors that drive their success. In this paper, we break down the design of recent answer calibration strategies and present a unified view that establishes connections between them. We then conduct a thorough evaluation of these strategies from this unified view, systematically scrutinizing step-level and path-level answer calibration across multiple reasoning paths. Our study holds the potential to illuminate key insights for optimizing multi-step reasoning with answer calibration. Comment: Work in progress.
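
    Illustrative sketch (simplified, assumed scoring scheme): path-level calibration aggregates the final answers of several sampled reasoning paths (e.g. by majority vote), while step-level calibration scores the individual steps within a path. The toy example below contrasts the two; the step scores are made up.

        from collections import Counter

        def path_level_calibration(paths):
            """Majority vote over the final answers of the sampled reasoning paths."""
            answers = [p["answer"] for p in paths]
            best, count = Counter(answers).most_common(1)[0]
            return best, count / len(answers)

        def step_level_calibration(paths):
            """Pick the path whose steps have the highest mean confidence."""
            mean = lambda xs: sum(xs) / len(xs)
            best = max(paths, key=lambda p: mean(p["step_scores"]))
            return best["answer"], mean(best["step_scores"])

        paths = [  # three sampled chains of thought for one question
            {"answer": "42", "step_scores": [0.9, 0.8, 0.95]},
            {"answer": "42", "step_scores": [0.7, 0.9, 0.85]},
            {"answer": "40", "step_scores": [0.6, 0.5, 0.7]},
        ]
        print(path_level_calibration(paths))   # ('42', 0.666...)
        print(step_level_calibration(paths))   # ('42', 0.883...)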

    SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres

    Event-centric structured prediction involves predicting structured outputs of events. In most NLP cases, event structures are complex, with manifold dependencies, and it is challenging to represent these complicated structured events effectively. To address these issues, we propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH). SPEECH models complex dependencies among structured event components with energy-based modeling and represents event classes with simple but effective hyperspheres. Experiments on two event datasets with unified annotations indicate that SPEECH performs strongly on event detection and event-relation extraction tasks. Comment: Accepted by ACL 2023 Main Conference. Code is released at https://github.com/zjunlp/SPEECH
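
    Illustrative sketch (a guess at the geometric idea, not the SPEECH model): representing each event class as a hypersphere means keeping a centre and a radius per class; an instance's energy for a class can then be how far it falls outside that class's sphere, and classification picks the lowest-energy class. The centres, radii and energy form below are assumptions.

        import numpy as np

        def hypersphere_energy(instance, centre, radius):
            """Zero inside the class hypersphere, otherwise the distance past its surface."""
            return max(np.linalg.norm(instance - centre) - radius, 0.0)

        def classify(instance, centres, radii):
            """Assign the event class whose hypersphere yields the lowest energy."""
            energies = [hypersphere_energy(instance, c, r) for c, r in zip(centres, radii)]
            return int(np.argmin(energies)), energies

        rng = np.random.default_rng(0)
        dim, n_classes = 8, 3
        centres = rng.normal(size=(n_classes, dim))
        radii = np.full(n_classes, 0.5)
        trigger_rep = centres[1] + 0.1 * rng.normal(size=dim)   # an event-trigger embedding near class 1
        label, energies = classify(trigger_rep, centres, radii)
        print(label, np.round(energies, 3))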

    Revisiting k-NN for Pre-trained Language Models

    Pre-trained Language Models (PLMs), as parametric eager learners, have become the de-facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as a lazy-learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with the textual representations of PLMs in two steps: (1) utilize k-NN as prior knowledge to calibrate the training process; (2) linearly interpolate the probability distribution predicted by k-NN with that of the PLM classifier. At the heart of our approach is k-NN-calibrated training, which treats predicted results as indicators of easy versus hard examples during training. From the perspective of application scenarios, we conduct extensive experiments on the fine-tuning and prompt-tuning paradigms, in zero-shot, few-shot and fully-supervised settings, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP. Code and datasets are available at https://github.com/zjunlp/Revisit-KNN. Comment: Work in progress.
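
    Illustrative sketch (step (2) only, with made-up numbers): the interpolation described above mixes a k-NN class distribution, built from the PLM's own representations of training examples, with the classifier head's distribution. The variable names, distance-to-probability mapping and mixing weight are assumptions.

        import numpy as np

        def knn_distribution(query_vec, train_vecs, train_labels, n_classes, k=8, temperature=1.0):
            """Turn the k nearest training representations into a class distribution."""
            dists = np.linalg.norm(train_vecs - query_vec, axis=1)
            logits = np.full(n_classes, -np.inf)
            for idx in np.argsort(dists)[:k]:
                logits[train_labels[idx]] = np.logaddexp(logits[train_labels[idx]], -dists[idx] / temperature)
            probs = np.exp(logits - logits.max())
            return probs / probs.sum()

        def interpolate(p_plm, p_knn, lam=0.3):
            """Linear interpolation of the classifier and k-NN distributions."""
            return (1 - lam) * p_plm + lam * p_knn

        rng = np.random.default_rng(0)
        train_vecs = rng.normal(size=(100, 16))        # PLM representations of training examples
        train_labels = rng.integers(0, 4, size=100)
        query = rng.normal(size=16)                    # PLM representation of a test example
        p_knn = knn_distribution(query, train_vecs, train_labels, n_classes=4)
        p_plm = np.array([0.70, 0.10, 0.15, 0.05])     # classifier head output (made up)
        print(np.round(interpolate(p_plm, p_knn), 3))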

    How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?

    Scaling language models has revolutionized a wide range of NLP tasks, yet few-shot relation extraction with large language models has received little comprehensive exploration. In this paper, we investigate two principal methodologies, in-context learning and data generation, for few-shot relation extraction with GPT-3.5 through exhaustive experiments. To enhance few-shot performance, we further propose task-related instructions and schema-constrained data generation. We observe that in-context learning can achieve performance on par with previous prompt-learning approaches, and that data generation with the large language model can boost previous solutions to obtain new state-of-the-art few-shot results on four widely studied relation extraction datasets. We hope our work can inspire future research on the capabilities of large language models in few-shot relation extraction. Code is available at https://github.com/zjunlp/DeepKE/tree/main/example/llm. Comment: SustaiNLP Workshop@ACL 2023.
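
    Illustrative sketch (prompt format assumed, not the paper's exact template): in-context learning for relation extraction amounts to composing a task instruction, the permitted relation schema, a few demonstrations and the test instance into one prompt string for the LLM. The relations and sentences below are invented examples.

        def build_re_prompt(instruction, schema, demonstrations, sentence, head, tail):
            """Compose an instruction, schema constraint, demonstrations and the test instance."""
            lines = [instruction, "Possible relations: " + ", ".join(schema), ""]
            for s, h, t, rel in demonstrations:
                lines.append(f"Sentence: {s}\nHead: {h}\nTail: {t}\nRelation: {rel}\n")
            lines.append(f"Sentence: {sentence}\nHead: {head}\nTail: {tail}\nRelation:")
            return "\n".join(lines)

        prompt = build_re_prompt(
            instruction="Classify the relation between the head and tail entities in the sentence.",
            schema=["founder_of", "capital_of", "born_in", "no_relation"],
            demonstrations=[
                ("Steve Jobs co-founded Apple in 1976.", "Steve Jobs", "Apple", "founder_of"),
                ("Paris is the capital of France.", "Paris", "France", "capital_of"),
            ],
            sentence="Marie Curie was born in Warsaw.",
            head="Marie Curie",
            tail="Warsaw",
        )
        print(prompt)   # send this string to the chosen LLM API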