Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Entities, as the essential elements in relation extraction tasks, exhibit
certain structure. In this work, we formulate such structure as distinctive
dependencies between mention pairs. We then propose SSAN, which incorporates
these structural dependencies within the standard self-attention mechanism and
throughout the overall encoding stage. Specifically, we design two alternative
transformation modules inside each self-attention building block to produce
attentive biases so as to adaptively regularize its attention flow. Our
experiments demonstrate the usefulness of the proposed entity structure and the
effectiveness of SSAN. It significantly outperforms competitive baselines,
achieving new state-of-the-art results on three popular document-level relation
extraction datasets. We further provide ablation and visualization to show how
the entity structure guides the model for better relation extraction. Our code
is publicly available.
Comment: Accepted to AAAI 202
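The attentive-bias idea in the SSAN abstract can be sketched in a few lines: a toy single-head self-attention in which the dependency type between each token pair contributes a learned scalar bias to the raw attention score. This is a minimal illustration, not SSAN's actual implementation; all names (`dep_ids`, `dep_bias`) are hypothetical.

```python
import numpy as np

def structured_self_attention(Q, K, V, dep_ids, dep_bias):
    """Toy single-head self-attention with an additive structural bias.

    dep_ids[i, j] indexes the dependency type between tokens i and j
    (e.g. same-entity vs. different-entity mention pairs); dep_bias maps
    each type to a scalar added to the raw attention score, loosely
    mirroring the attentive-bias idea. Illustrative only.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # standard scaled dot-product
    scores = scores + dep_bias[dep_ids]    # structural bias regularizes attention flow
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

In the paper's framing the bias comes from small transformation modules learned jointly with the encoder; here it is a fixed lookup table purely to make the mechanism concrete.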
CLOP: Video-and-Language Pre-Training with Knowledge Regularizations
Video-and-language pre-training has shown promising results for learning
generalizable representations. Most existing approaches usually model video and
text in an implicit manner, without considering explicit structural
representations of the multi-modal content. We denote such form of
representations as structural knowledge, which express rich semantics of
multiple granularities. There are related works that propose object-aware
approaches to inject similar knowledge as inputs. However, the existing methods
usually fail to effectively utilize such knowledge as regularizations to shape
a superior cross-modal representation space. To this end, we propose a
Cross-modaL knOwledge-enhanced Pre-training (CLOP) method with Knowledge
Regularizations. Our method has two key designs: 1) a simple yet effective
Structural Knowledge Prediction (SKP) task to pull together the latent
representations of similar videos; and 2) a novel Knowledge-guided sampling
approach for Contrastive Learning (KCL) to push apart cross-modal hard negative
samples. We evaluate our method on four text-video retrieval tasks and one
multi-choice QA task. The experiments show clear improvements, outperforming
prior works by a substantial margin. In addition, we provide ablations and
insights into how our method affects the latent representation space, demonstrating the
value of incorporating knowledge regularizations into video-and-language
pre-training.
Comment: ACM Multimedia 2022 (MM'22)
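The knowledge-guided contrastive objective (KCL) described above can be illustrated with a toy InfoNCE-style loss in which, for each video, the hard negative texts are selected by some external knowledge signal rather than at random. This is a minimal sketch under stated assumptions; `hard_neg_idx` and all other names are hypothetical stand-ins, not the paper's API.

```python
import numpy as np

def info_nce_with_hard_negatives(video_emb, text_emb, hard_neg_idx, temperature=0.07):
    """Toy InfoNCE loss with knowledge-selected hard negatives.

    video_emb[i] and text_emb[i] are a matched pair (the positive);
    hard_neg_idx[i] lists the indices of texts chosen as hard negatives
    for video i, standing in for a knowledge-guided sampler. Illustrative only.
    """
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sim = v @ t.T / temperature                  # scaled cosine similarities
    losses = []
    for i in range(len(v)):
        pos = sim[i, i]                          # matched pair is the positive
        negs = sim[i, hard_neg_idx[i]]           # knowledge-selected hard negatives
        logits = np.concatenate(([pos], negs))
        losses.append(-pos + np.log(np.exp(logits).sum()))  # -log softmax(pos)
    return float(np.mean(losses))
```

Pushing apart exactly the negatives that the knowledge signal flags as confusable is what shapes the representation space more than uniform random sampling would.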
Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering
Recently, video question answering has attracted growing attention. It involves answering a question based on a fine-grained understanding of multi-modal video information. Most existing methods have successfully explored deep understanding of the visual modality. We argue that a deep understanding of the linguistic modality is also essential for answer reasoning, especially for videos that contain character dialogue. To this end, we propose an Inferential Knowledge-Enhanced Integrated Reasoning method. Our method consists of two main components: 1) an Inferential Knowledge Reasoner that generates inferential knowledge for linguistic-modality inputs, revealing deeper semantics such as implicit causes, effects, and mental states; and 2) an Integrated Reasoning Mechanism that enhances video content understanding and answer reasoning by leveraging the generated inferential knowledge. Experimental results show that our method achieves significant improvements on two mainstream datasets. The ablation study further demonstrates the effectiveness of each component of our approach.
Reinspecting the Climate-Crop Yields Relationship at a Finer Scale and the Climate Damage Evaluation: Evidence from China
This paper reinvestigated the climate-crop yield relationship with a statistical model at the scale of crops’ growing stages. Compared to previous studies, our model introduced monthly climate variables into the crop production function, which enables separating yield changes induced by climate change from those caused by input variation and technical progress, as well as examining the distinct climate effects during each growing stage. Applying a fixed-effect regression model to province-level panel data on crop yields, agricultural inputs, and monthly temperature and precipitation from 1985 to 2015, we found that the effects of temperature are generally negative and those of precipitation generally positive, but that they vary across growth stages for each crop. Specifically, GDDs (i.e., growing degree days) have negative effects on spring maize yields except during the sowing and ripening stages, while precipitation in September has negative effects for summer maize. Precipitation in December and the following April is significantly harmful to winter wheat yields; for spring wheat, GDDs have positive effects during April and May, and precipitation has negative effects during the ripening period. In addition, we computed climate-induced losses based on the estimated climate-crop yield relationship, which revealed a strong tendency toward increasing yield losses for all crops, with large interannual fluctuations. Comparatively, the long-term climate effects on the yields of spring maize, summer maize, and spring wheat are more noticeable than those on winter wheat.
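The growing-degree-days (GDD) variable used in the abstract above has a standard construction: the sum, over a period, of each day's mean temperature in excess of a crop-specific base threshold. A minimal sketch follows; the 10 °C base is a common illustrative choice, not necessarily the value used in the paper.

```python
def growing_degree_days(daily_mean_temps, base_temp=10.0):
    """Accumulated growing degree days over a period.

    Sums the amount by which each day's mean temperature (in °C)
    exceeds a crop-specific base temperature; days at or below the
    base contribute zero. The 10 °C default is illustrative.
    """
    return sum(max(t - base_temp, 0.0) for t in daily_mean_temps)
```

Computed per month, series like this become the monthly climate regressors that the fixed-effect model pairs with stage-specific crop yields.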
Capturing Sentence Relations for Answer Sentence Selection with Multi-Perspective Graph Encoding
This paper focuses on the answer sentence selection task. Unlike previous work, which only models the relation between the question and each candidate sentence, we propose the Multi-Perspective Graph Encoder (MPGE) to take the relations among the candidate sentences into account and capture those relations from multiple perspectives. Using MPGE as a module, we construct two answer sentence selection models, based on traditional and pre-trained representations respectively. We conduct extensive experiments on two datasets, WikiQA and SQuAD. The results show that the proposed MPGE is effective for both types of representation. Moreover, the overall performance of our proposed model surpasses the state-of-the-art on both datasets. Additionally, we further validate the robustness of our method on the adversarial examples of AddSent and AddOneSent.
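The core idea of encoding relations among candidate sentences can be illustrated with a single mean-aggregation message-passing step over a candidate graph: each sentence representation is updated with its neighbors' representations under one relational view. This is a generic graph-encoding sketch, not MPGE itself; the adjacency matrix stands in for one hypothetical "perspective".

```python
import numpy as np

def one_graph_encoding_step(sent_emb, adj):
    """One mean-aggregation message-passing step over a candidate-sentence graph.

    sent_emb: (n, d) array of candidate-sentence representations.
    adj: (n, n) 0/1 adjacency matrix for one relational view (hypothetical).
    Each row is averaged with the mean of its neighbors, so information
    flows between related candidates. Illustrative only.
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)  # avoid division by zero
    neighbor_mean = adj @ sent_emb / deg              # mean of neighbor embeddings
    return (sent_emb + neighbor_mean) / 2.0           # blend self and neighborhood
```

A multi-perspective encoder would run steps like this over several such graphs (one per relation type) and combine the results; stacking steps lets information propagate beyond immediate neighbors.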