
    Deep Reinforcement Learning-based Image Captioning with Embedding Reward

    Image captioning is a challenging problem owing to the complexity of understanding image content and the diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved performance on this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. In this paper, however, we introduce a novel decision-making framework for image captioning. We utilize a "policy network" and a "value network" to collaboratively generate captions. The policy network serves as local guidance by providing the confidence of predicting the next word given the current state. The value network serves as global, lookahead guidance by evaluating all possible extensions of the current state. In essence, it adjusts the goal from predicting the correct words towards generating captions similar to the ground-truth captions. We train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. Extensive experiments and analyses on the Microsoft COCO dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.
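    The interplay described above, where a policy network scores the next word locally and a value network scores where each extension leads, can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: the toy networks, the state-transition rule, and the mixing weight `lam` are all assumptions made for demonstration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB, HID = 8, 16  # toy vocabulary and hidden-state sizes (illustrative)

    # Toy "policy network": maps a decoder state to next-word probabilities.
    Wp = rng.normal(size=(HID, VOCAB))

    def policy(state):
        logits = state @ Wp
        e = np.exp(logits - logits.max())
        return e / e.sum()

    # Toy "value network": scores how promising a (partial-caption) state is.
    Wv = rng.normal(size=HID)

    def value(state):
        return float(np.tanh(state @ Wv))

    def step(state, lam=0.7):
        """Pick the next word by combining local policy confidence with the
        lookahead value of each extended state (weighting is illustrative)."""
        p = policy(state)
        scores = []
        for w in range(VOCAB):
            # hypothetical transition: fold a word's weights into the state
            next_state = np.tanh(state + 0.1 * Wp[:, w])
            scores.append(lam * np.log(p[w] + 1e-12) + (1 - lam) * value(next_state))
        return int(np.argmax(scores))

    state = rng.normal(size=HID)
    word = step(state)
    ```

    In the paper both networks are trained jointly with actor-critic reinforcement learning and an embedding-based reward; the sketch only shows the decode-time collaboration.
    
    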

    Inter-Personal Relation Extraction Model Based on Bidirectional GRU and Attention Mechanism

    Inter-personal relationship extraction is an important part of knowledge extraction and is also fundamental to constructing knowledge graphs of people's relationships. Compared with traditional pattern recognition methods, deep learning methods are more prominent in relation extraction (RE) tasks. At present, research on Chinese relation extraction is mainly based on kernel-function methods and distant supervision. In this paper, we propose a Chinese relation extraction model based on a bidirectional GRU network and an attention mechanism. To reflect the structural characteristics of the Chinese language, the input is represented as word vectors. To address the problem of context memory, a bidirectional GRU neural network fuses the input vectors. Word-level features are extracted from each sentence, and a word-level attention mechanism aggregates them into a sentence-level feature. To verify the feasibility of this method, we use distant supervision to extract data from websites and compare the model with existing relation extraction methods. The experimental results show that the bidirectional GRU with attention can make full use of the feature information in sentences, and its accuracy is significantly higher than that of neural network models without an attention mechanism.
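    The core pipeline described above (word vectors → bidirectional GRU → word-level attention → sentence feature) can be sketched in numpy. This is an illustrative toy, not the paper's model: the GRU weights, dimensions, and attention parameterization are assumptions chosen only to show the data flow.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    D, H, T = 8, 6, 5  # word-vector dim, GRU hidden size, sentence length (toy)

    # Minimal GRU cell in numpy (weights are random stand-ins).
    Wz, Wr, Wh = (rng.normal(scale=0.1, size=(D + H, H)) for _ in range(3))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ Wz)                                  # update gate
        r = sigmoid(xh @ Wr)                                  # reset gate
        h_tilde = np.tanh(np.concatenate([x, r * h]) @ Wh)    # candidate state
        return (1 - z) * h + z * h_tilde

    def bigru(xs):
        """Run the toy GRU forward and backward, concatenating both states."""
        hf, hb = np.zeros(H), np.zeros(H)
        fwd, bwd = [], []
        for x in xs:
            hf = gru_step(x, hf)
            fwd.append(hf)
        for x in reversed(xs):
            hb = gru_step(x, hb)
            bwd.append(hb)
        return np.stack([np.concatenate([f, b])
                         for f, b in zip(fwd, reversed(bwd))])

    # Word-level attention: weight each position, pool into a sentence vector.
    w_att = rng.normal(size=2 * H)

    def attend(Hseq):
        scores = np.tanh(Hseq) @ w_att
        e = np.exp(scores - scores.max())
        alpha = e / e.sum()
        return alpha @ Hseq, alpha

    xs = rng.normal(size=(T, D))          # stand-in word vectors
    sent, alpha = attend(bigru(xs))
    ```

    The pooled `sent` vector would then feed a relation classifier; the attention weights `alpha` indicate which words the model considers most informative.
    
    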

    Named Entity Recognition Using BERT BiLSTM CRF for Chinese Electronic Health Records

    With the generation and accumulation of massive electronic health records (EHRs), how to effectively extract valuable medical information from them has become a popular research topic. In medical information extraction, named entity recognition (NER) is an essential natural language processing (NLP) task. This paper presents our efforts to apply neural network approaches to this task. On Chinese EHR data provided by CCKS 2019 and the Second Affiliated Hospital of Soochow University (SAHSU), we compared several neural models for NER, including BiLSTM, along with two pre-trained language models, word2vec and BERT. We found that the BERT-BiLSTM-CRF model achieves approximately a 75% F1 score, outperforming all other models in our tests.
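    In a BERT-BiLSTM-CRF tagger, the CRF layer decodes the best tag sequence from the BiLSTM's per-token emission scores plus learned tag-transition scores, via the Viterbi algorithm. A minimal numpy sketch of that decoding step is below; the tagset and all score values are toy illustrations, not the paper's trained parameters.

    ```python
    import numpy as np

    def viterbi(emissions, transitions):
        """Best tag path given per-token emission scores (T x K, e.g. from a
        BiLSTM over BERT embeddings) and tag-transition scores (K x K)."""
        T, K = emissions.shape
        score = emissions[0].copy()
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            # total[i, j]: score of reaching tag j at t via tag i at t-1
            total = score[:, None] + transitions + emissions[t][None, :]
            back[t] = total.argmax(axis=0)
            score = total.max(axis=0)
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return path[::-1]

    # Toy BIO tagset: 0=O, 1=B-Disease, 2=I-Disease (labels are illustrative).
    emis = np.array([[2., 0., 0.],
                     [0., 2., 0.],
                     [0., 0., 2.],
                     [2., 0., 0.]])
    trans = np.array([[0., 0., -9.],   # O -> I is heavily penalized
                      [0., 0., 1.],    # B -> I is encouraged
                      [0., 0., 1.]])   # I -> I is encouraged
    tags = viterbi(emis, trans)        # O, B-Disease, I-Disease, O
    ```

    The transition matrix is what lets the CRF enforce label consistency (e.g. an I- tag must follow a B- or I- tag), which per-token classifiers alone cannot guarantee.
    
    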

    PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning

    Online class-incremental continual learning is a specific task of continual learning. It aims to continuously learn new classes from a data stream whose samples are seen only once, and it suffers from the catastrophic forgetting issue, i.e., forgetting the historical knowledge of old classes. Existing replay-based methods effectively alleviate this issue by saving and replaying part of the old data in a proxy-based or contrastive-based replay manner. Although both replay manners are effective, the former tends to be biased toward new classes due to class imbalance, and the latter is unstable and hard to converge because of the limited number of samples. In this paper, we conduct a comprehensive analysis of these two replay manners and find that they can be complementary. Inspired by this finding, we propose a novel replay-based method called proxy-based contrastive replay (PCR). The key operation is to replace the contrastive samples of anchors with the corresponding proxies in the contrastive-based manner. It alleviates catastrophic forgetting by effectively addressing the imbalance issue, while maintaining faster model convergence. We conduct extensive experiments on three real-world benchmark datasets, and empirical results consistently demonstrate the superiority of PCR over various state-of-the-art methods. Comment: To appear in CVPR 2023. 10 pages, 8 figures, and 3 tables.
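    The key operation above (replacing an anchor's contrastive samples with class proxies) amounts to a contrastive loss computed against one proxy vector per class. The numpy sketch below is a toy illustration under assumptions, not the authors' implementation: the proxies are random stand-ins for classifier weights, and the temperature `tau` is an assumed hyperparameter.

    ```python
    import numpy as np

    def pcr_loss(anchor, proxies, label, tau=0.09):
        """Proxy-based contrastive loss for one anchor: the positives and
        negatives of a contrastive loss are replaced by class proxies
        (here, rows of a toy weight matrix; tau is an assumed temperature)."""
        a = anchor / np.linalg.norm(anchor)
        P = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
        sims = P @ a / tau                 # cosine similarity to each proxy
        sims -= sims.max()                 # numerical stability
        log_prob = sims - np.log(np.exp(sims).sum())
        return -log_prob[label]            # pull toward own proxy, push others

    rng = np.random.default_rng(2)
    proxies = rng.normal(size=(4, 16))              # one proxy per class (toy)
    anchor = proxies[1] + 0.01 * rng.normal(size=16)  # sample near class 1
    loss = pcr_loss(anchor, proxies, label=1)
    ```

    Because each class is represented by a single proxy rather than a handful of replayed samples, the loss sees every class on every step, which is what counters the class-imbalance bias of plain proxy-based replay while keeping contrastive-style optimization.
    
    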