A Tale of Two Direction Codes in Rat Retrosplenial Cortex: Uncovering the Neural Basis of Spatial Orientation in Complex Space
Head direction (HD) cells become active only when a rat faces a particular direction and stay inactive when it faces others, producing a unimodal activity distribution. Working together in a network, HD cells are considered the neural basis of the sense of direction. The retrosplenial cortex (RSC) is part of the HD circuit and contains neurons that express multiple spatial signals, including a pattern of bipolar directional tuning, as recently reported in rats exploring a rotationally symmetric two-compartment space. This suggests an unexplored mechanism of the neural compass. In this thesis, I investigated whether the association between two-way firing symmetry and twofold environment symmetry reveals a general environment-symmetry-encoding property of these RSC neurons. I recorded RSC neurons in environments with onefold, twofold and fourfold symmetry. RSC HD cells maintained a consistent global signal, whereas other RSC directional cells showed multi-fold symmetric firing patterns that reflected environment symmetry, not just globally (across all sub-compartments) but also locally (within each sub-compartment). The analyses also showed that this pattern was independent of egocentric boundary vector coding and instead represented an allocentric spatial code. These RSC cells therefore appear to use environmental cues to organise multiple singular tuning curves, which are sometimes combined into a multidirectional pattern, likely via an interaction with the global HD signal. Thus, both local and global environment symmetry are encoded by local firing patterns in subspaces, which interestingly suggests cognitive mapping and abstraction of space beyond immediate perceptual bounds in RSC. The data generated by this study provide important insights for modelling direction computation. Taken together, I discuss how having two types of direction codes in RSC may help us orient more accurately and flexibly in complex and ambiguous space.
Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity. In this paper, we explore capsule networks for relation extraction in a multi-instance multi-label learning framework and propose a novel neural approach based on capsule networks with attention mechanisms. We evaluate our method on different benchmarks and demonstrate that it improves the precision of the predicted relations. In particular, we show that capsule networks improve relation extraction for instances containing multiple entity pairs.
Comment: To be published in EMNLP 2018
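To make the capsule idea concrete, below is a minimal sketch of dynamic routing between two capsule layers, in the generic Sabour-style formulation rather than this paper's exact attention-augmented architecture; the tensor names and sizes are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Non-linear "squash": keeps the vector's orientation, maps its length into [0, 1).
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: predictions from lower capsules for each upper capsule,
    # shape (batch, n_lower, n_upper, d_upper).
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)   # routing logits
    for _ in range(num_iters):
        c = F.softmax(b, dim=2)                              # coupling coefficients per lower capsule
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)             # weighted sum over lower capsules
        v = squash(s)                                        # upper-capsule activity vectors
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)         # increase routing to agreeing capsules
    return v

# Toy usage: 8 lower capsules route to 5 relation capsules of dimension 16.
u_hat = torch.randn(2, 8, 5, 16)
v = dynamic_routing(u_hat)
print(v.shape)  # torch.Size([2, 5, 16])
```

The squashed output vectors play the role described above: their orientation encodes instantiation parameters and their length the probability that the corresponding entity (here, a relation) is present.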
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distantly supervised relation extraction approach for long-tailed, imbalanced data, which is prevalent in real-world settings. Here, the challenge is to learn accurate "few-shot" models for classes at the tail of the class distribution, for which little data is available. Inspired by the rich semantic correlations between classes at the long tail and those at the head, we take advantage of the knowledge from data-rich classes at the head of the distribution to boost the performance of the data-poor classes at the tail. First, we leverage implicit relational knowledge among class labels from knowledge graph embeddings and learn explicit relational knowledge using graph convolution networks. Second, we integrate that relational knowledge into the relation extraction model via a coarse-to-fine knowledge-aware attention mechanism. On a large-scale benchmark dataset, our approach significantly outperforms other baselines, especially for long-tail relations.
Comment: To be published in NAACL 2019
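As a rough illustration of the coarse-to-fine knowledge-aware attention idea, the sketch below attends over the sentences of one entity-pair bag with relation-label query vectors ordered from coarse to fine; it is a simplified stand-in for the paper's mechanism, and the function and variable names are hypothetical.

```python
import torch
import torch.nn.functional as F

def knowledge_aware_attention(sent_reps, label_embs):
    # sent_reps: (n_sent, d) representations of the sentences in one entity-pair bag.
    # label_embs: list of (d,) query vectors ordered coarse -> fine
    #             (e.g., a parent relation class, then the target relation).
    bag = None
    for q in label_embs:                       # coarse-to-fine: refine the bag representation per level
        scores = sent_reps @ q                 # (n_sent,) relevance of each sentence to this level
        alpha = F.softmax(scores, dim=0)
        level_rep = alpha @ sent_reps          # (d,) attention-pooled bag representation
        bag = level_rep if bag is None else (bag + level_rep) / 2
    return bag

# Toy usage with hypothetical sizes: 4 sentences, hidden size 32; in the paper the
# label query vectors would come from KG embeddings refined by a GCN.
sent_reps = torch.randn(4, 32)
coarse_q, fine_q = torch.randn(32), torch.randn(32)
bag_rep = knowledge_aware_attention(sent_reps, [coarse_q, fine_q])
print(bag_rep.shape)  # torch.Size([32])
```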
Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba
Entity recommendation, which improves the search experience by helping users find entities related to a given query, has become an indispensable feature of today's search engines. Existing studies typically consider only queries that contain explicit entities. They usually fail to handle complex queries without entities, such as "what food is good for cold weather", because their models cannot infer the underlying meaning of the input text. In this work, we argue that contexts convey valuable evidence that can facilitate the semantic modeling of queries, and we take them into consideration for entity recommendation. To better model the semantics of queries and entities, we learn their representations jointly with attentive deep neural networks. We evaluate our approach using large-scale, real-world search logs from a widely used commercial Chinese search engine. Our system has been deployed in the ShenMa Search Engine and can be experienced in Alibaba's UC Browser. Results from an online A/B test suggest that the impression efficiency, measured by click-through rate, increased by 5.1% and page views increased by 5.5%.
Comment: CIKM 2019 International Workshop on Entity Retrieval. arXiv admin note: text overlap with arXiv:1511.08996 by other authors
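The following is a minimal, hedged sketch of the general pattern described above: attention-pool the query (and its context) into a single vector and score candidate entities against it. It is not the deployed ShenMa/UC Browser model; the class names and dimensions are invented for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePooler(nn.Module):
    """Pools token (or context) embeddings into one vector with learned attention."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, token_embs):              # (seq_len, dim)
        alpha = F.softmax(self.scorer(token_embs).squeeze(-1), dim=0)
        return alpha @ token_embs                # (dim,) query representation

def rank_entities(query_tokens, entity_embs, pooler):
    # query_tokens: (seq_len, dim) embeddings of the query plus context terms.
    # entity_embs:  (n_entities, dim) embeddings of candidate entities.
    q = pooler(query_tokens)
    scores = entity_embs @ q                     # dot-product relevance of each entity
    return scores.argsort(descending=True)       # indices of entities, best first

# Toy usage with made-up sizes: a 6-token query, 10 candidate entities, dim 64.
pooler = AttentivePooler(64)
order = rank_entities(torch.randn(6, 64), torch.randn(10, 64), pooler)
print(order[:3])
```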
Generative Knowledge Graph Construction: A Review
Generative Knowledge Graph Construction (KGC) refers to methods that leverage the sequence-to-sequence framework for building knowledge graphs; such methods are flexible and can be adapted to a wide range of tasks. In this study, we summarize the recent compelling progress in generative knowledge graph construction. We present the advantages and weaknesses of each paradigm in terms of different generation targets and provide theoretical insight and empirical analysis. Based on the review, we suggest promising research directions for the future. Our contributions are threefold: (1) we present a detailed, complete taxonomy of generative KGC methods; (2) we provide a theoretical and empirical analysis of generative KGC methods; (3) we propose several research directions that can be developed in the future.
Comment: Accepted to EMNLP 2022 (oral); a public repository is available at https://github.com/zjunlp/Generative_KG_Construction_Paper
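As a toy illustration of what sequence-to-sequence knowledge graph construction means in practice, the sketch below linearizes triples into a target string that a generative model could be trained to emit, and parses generations back into triples; the marker tokens are an assumption, since the concrete linearizations differ across the surveyed methods.

```python
from typing import List, Tuple

def linearize_triples(triples: List[Tuple[str, str, str]]) -> str:
    """Turn (head, relation, tail) triples into one flat target string
    that a sequence-to-sequence model can be trained to generate."""
    return " ".join(f"<sub> {h} <rel> {r} <obj> {t}" for h, r, t in triples)

def parse_generation(text: str) -> List[Tuple[str, str, str]]:
    """Invert the linearization: recover triples from a generated string."""
    triples = []
    for chunk in text.split("<sub>")[1:]:
        head, rest = chunk.split("<rel>")
        rel, tail = rest.split("<obj>")
        triples.append((head.strip(), rel.strip(), tail.strip()))
    return triples

# Toy round trip on an invented triple set.
gold = [("Marie Curie", "born_in", "Warsaw"), ("Marie Curie", "field", "physics")]
target = linearize_triples(gold)
print(target)
print(parse_generation(target) == gold)  # True
```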
Towards A Unified View of Answer Calibration for Multi-Step Reasoning
Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have broadened the scope for improving multi-step reasoning capabilities. Answer calibration strategies, such as step-level or path-level calibration, usually play a vital role in multi-step reasoning. While effective, there remains a significant gap in our understanding of the key factors that drive their success. In this paper, we break down the design of recent answer calibration strategies and present a unified view that establishes connections between them. We then conduct a thorough evaluation of these strategies from this unified view, systematically scrutinizing step-level and path-level answer calibration across multiple reasoning paths. Our study holds the potential to illuminate key insights for optimizing multi-step reasoning with answer calibration.
Comment: Work in progress
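One common path-level calibration strategy covered by such a unified view is self-consistency-style voting: sample several reasoning paths and keep the majority final answer. Below is a minimal sketch under that assumption; the sample_answer callable stands in for whatever LLM call produces a CoT path and final answer, and the stub in the example is purely illustrative.

```python
import random
from collections import Counter
from typing import Callable, List

def path_level_calibration(question: str,
                           sample_answer: Callable[[str], str],
                           n_paths: int = 10) -> str:
    """Sample several chain-of-thought paths and keep the majority final answer."""
    answers: List[str] = [sample_answer(question) for _ in range(n_paths)]
    return Counter(answers).most_common(1)[0][0]

# Toy usage with a stub in place of a real LLM call (hypothetical).
stub = lambda q: random.choice(["42", "42", "42", "41"])  # noisy reasoning paths
print(path_level_calibration("What is 6 * 7?", stub))
```

Step-level calibration would instead score or filter individual reasoning steps before the final answer is aggregated, which is one of the axes the unified view compares.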
SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres
Event-centric structured prediction involves predicting structured outputs of events. In most NLP cases, event structures are complex, with manifold dependencies, and it is challenging to effectively represent these complicated structured events. To address these issues, we propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH). SPEECH models complex dependencies among event-structure components with energy-based modeling and represents event classes with simple but effective hyperspheres. Experiments on two unified-annotated event datasets indicate that SPEECH achieves superior performance in event detection and event-relation extraction tasks.
Comment: Accepted by ACL 2023 Main Conference. Code is released at https://github.com/zjunlp/SPEECH
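To give a sense of how hyperspheres can represent event classes in an energy-based setup, here is a small, hedged sketch: each class is a centre plus a radius, and the energy of a mention is zero inside the sphere and grows with distance outside it. This is an illustrative simplification, not the exact SPEECH objective; all shapes and values are invented.

```python
import torch

def hypersphere_energy(x, centers, radii):
    # x: (d,) representation of one event mention.
    # centers: (n_classes, d) hypersphere centres; radii: (n_classes,) radii.
    # Energy is low when x falls inside a class's hypersphere, high outside it.
    dist = torch.norm(centers - x, dim=-1)   # distance to each class centre
    return torch.relu(dist - radii)          # zero inside the sphere, grows outside

# Toy usage with invented sizes: 5 event classes, hidden size 16.
x = torch.randn(16)
centers, radii = torch.randn(5, 16), torch.ones(5) * 2.0
energy = hypersphere_energy(x, centers, radii)
print(energy.argmin().item())  # predicted event class = lowest energy
```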
Revisiting k-NN for Pre-trained Language Models
Pre-trained Language Models (PLMs), as parametric eager learners, have become the de facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as a lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with textual representations of PLMs in two steps: (1) utilize k-NN as prior knowledge to calibrate the training process; (2) linearly interpolate the probability distribution predicted by k-NN with that of the PLM classifier. At the heart of our approach is k-NN-calibrated training, which treats predicted results as indicators of easy versus hard examples during the training process. To cover diverse application scenarios, we conduct extensive experiments with the fine-tuning and prompt-tuning paradigms under zero-shot, few-shot and fully-supervised settings across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP. Code and datasets are available at https://github.com/zjunlp/Revisit-KNN.
Comment: Work in progress
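Step (2) above is straightforward to sketch: build a label distribution from the k nearest stored PLM representations and mix it linearly with the classifier's distribution. The code below is a simplified illustration with made-up sizes and a fixed interpolation weight, not the released implementation.

```python
import numpy as np

def knn_distribution(query_rep, store_reps, store_labels, n_classes, k=8, temp=1.0):
    """Turn the k nearest neighbours of a PLM text representation into a label distribution."""
    d = np.linalg.norm(store_reps - query_rep, axis=1)   # L2 distance to every stored example
    idx = np.argsort(d)[:k]
    w = np.exp(-d[idx] / temp)                           # closer neighbours vote harder
    p = np.zeros(n_classes)
    for i, weight in zip(idx, w):
        p[store_labels[i]] += weight
    return p / p.sum()

def interpolate(p_plm, p_knn, lam=0.5):
    """Step (2): linear interpolation of the PLM classifier and k-NN distributions."""
    return lam * p_knn + (1.0 - lam) * p_plm

# Toy usage with random stand-ins for [CLS]-style representations (hypothetical sizes).
rng = np.random.default_rng(0)
store_reps, store_labels = rng.normal(size=(100, 32)), rng.integers(0, 3, size=100)
p_knn = knn_distribution(rng.normal(size=32), store_reps, store_labels, n_classes=3)
p_final = interpolate(np.array([0.2, 0.5, 0.3]), p_knn)
print(p_final.round(3), p_final.sum())
```

Step (1), k-NN-calibrated training, would use the same neighbour distribution during training to flag examples the memory already classifies confidently versus those it does not.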
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?
Scaling language models has revolutionized widespread NLP tasks, yet few-shot relation extraction with large language models has received little comprehensive exploration. In this paper, we investigate two principal methodologies, in-context learning and data generation, for few-shot relation extraction via GPT-3.5 through exhaustive experiments. To enhance few-shot performance, we further propose task-related instructions and schema-constrained data generation. We observe that in-context learning can achieve performance on par with previous prompt learning approaches, and that data generation with the large language model can boost previous solutions to obtain new state-of-the-art few-shot results on four widely studied relation extraction datasets. We hope our work can inspire future research on the capabilities of large language models for few-shot relation extraction. Code is available at https://github.com/zjunlp/DeepKE/tree/main/example/llm.
Comment: SustaiNLP Workshop @ ACL 2023
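As an illustration of the in-context learning setup with task-related instructions, the sketch below assembles a few-shot relation-extraction prompt from an instruction, a relation schema, demonstrations, and the query instance; the wording, field format, and example schema are assumptions rather than the exact prompts used in the paper.

```python
from typing import List, Tuple

def build_re_prompt(schema: List[str],
                    demos: List[Tuple[str, str, str, str]],
                    sentence: str, head: str, tail: str) -> str:
    """Compose a few-shot relation-extraction prompt: task instruction,
    the allowed relation schema, demonstrations, then the query instance."""
    lines = [
        "Task: choose the relation between the head and tail entity in the sentence.",
        "Allowed relations: " + ", ".join(schema),
        "",
    ]
    for s, h, t, r in demos:                   # in-context demonstrations
        lines.append(f"Sentence: {s}\nHead: {h}\nTail: {t}\nRelation: {r}\n")
    lines.append(f"Sentence: {sentence}\nHead: {head}\nTail: {tail}\nRelation:")
    return "\n".join(lines)

# Toy usage with an invented schema and demonstration; in practice the schema and
# demonstrations would come from the target relation extraction dataset.
prompt = build_re_prompt(
    schema=["founded_by", "born_in", "no_relation"],
    demos=[("Jobs founded Apple in 1976.", "Apple", "Jobs", "founded_by")],
    sentence="Curie was born in Warsaw.", head="Curie", tail="Warsaw")
print(prompt)
```

Constraining the allowed relations in the instruction mirrors the schema-constrained idea: the model is steered to generate only labels (or synthetic examples) that fit the dataset's schema.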