10 research outputs found

    Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition

    © 2017 Elsevier Inc. Background Previous state-of-the-art systems for Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have relied on a combination of text “feature engineering” and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently time-consuming. Conversely, more modern machine learning approaches such as recurrent neural networks (RNNs) have proved capable of automatically learning effective features from either random assignments or automated word “embeddings”. Objectives (i) To create a highly accurate DNR and CCE system that avoids conventional, time-consuming feature engineering. (ii) To create richer, more specialized word embeddings by using health-domain datasets such as MIMIC-III. (iii) To evaluate our systems over three contemporary datasets. Methods Two deep learning methods, namely the Bidirectional LSTM and the Bidirectional LSTM-CRF, are evaluated. A CRF model is set as the baseline to compare the deep learning systems to a traditional machine learning approach. The same features are used for all the models. Results We have obtained the best results with the Bidirectional LSTM-CRF model, which has outperformed all previously proposed systems. The specialized embeddings have helped to cover unusual words in DrugBank and MedLine, but not in the i2b2/VA dataset. Conclusions We present a state-of-the-art system for DNR and CCE. Automated word embeddings have allowed us to avoid costly feature engineering and achieve higher accuracy. Nevertheless, the embeddings need to be retrained over datasets that are representative of the domain in order to adequately cover the domain-specific vocabulary.

    ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems

    Regularization of neural machine translation is still a significant problem, especially in low-resource settings. To mitigate this problem, we propose regressing word embeddings (ReWE) as a new regularization technique in a system that is jointly trained to predict the next word in the translation (a categorical value) and its word embedding (a continuous value). Such joint training allows the proposed system to learn the distributional properties represented by the word embeddings, empirically improving the generalization to unseen sentences. Experiments over three translation datasets have shown a consistent improvement over a strong baseline, ranging between 0.91 and 2.54 BLEU points, and also a marked improvement over a state-of-the-art system. Comment: Accepted at NAACL-HLT 2019.
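    The joint objective described above can be sketched as a standard next-word negative log-likelihood plus an embedding-regression term. In this toy version the regression term is one minus the cosine similarity between the predicted and gold embeddings, and the weight `lam` is an arbitrary choice; the paper's exact distance function and weighting are not assumed here.

```python
import math

# Sketch of a ReWE-style joint loss: categorical next-word prediction
# (cross-entropy) plus a continuous embedding-regression penalty.

def cross_entropy(probs, gold_idx):
    # negative log-likelihood of the gold next word
    return -math.log(probs[gold_idx])

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rewe_loss(probs, gold_idx, pred_emb, gold_emb, lam=0.2):
    """Joint loss: NLL on the categorical output plus lam times
    (1 - cosine) between the predicted and gold word embeddings."""
    return cross_entropy(probs, gold_idx) + lam * (1.0 - cosine(pred_emb, gold_emb))

# A perfectly regressed embedding adds no penalty ...
base = rewe_loss([0.1, 0.7, 0.2], 1, [1.0, 0.0], [1.0, 0.0])
# ... while an orthogonal one adds the full lam on top of the NLL.
off = rewe_loss([0.1, 0.7, 0.2], 1, [0.0, 1.0], [1.0, 0.0])
```

    Because the regression head is only an auxiliary loss, it can be dropped at inference time, leaving the translation system unchanged.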

    RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation

    To date, most abstractive summarisation models have relied on variants of the negative log-likelihood (NLL) as their training objective. In some cases, reinforcement learning has been added to train the models with an objective that is closer to their evaluation measures (e.g. ROUGE). However, the reward function used within the reinforcement learning approach can play a key role in performance and is still partially unexplored. For this reason, in this paper we propose two reward functions for the task of abstractive summarisation: the first function, referred to as RwB-Hinge, dynamically selects the samples for the gradient update. The second function, nicknamed RISK, leverages a small pool of strong candidates to inform the reward. In the experiments, we probe the proposed approach by fine-tuning an NLL pre-trained model over nine summarisation datasets of diverse size and nature. The experimental results show a consistent improvement over the negative log-likelihood baselines. Comment: 5th Workshop on Structured Prediction for NLP, held in conjunction with ACL-IJCNLP 2021.
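    A minimum-risk objective of the kind the RISK function builds on can be sketched as follows: softmax-normalise the model's scores for a small candidate pool, then minimise the expected cost, taken here as one minus the reward (e.g. a ROUGE score). The pool size, scores, and rewards below are illustrative, not the paper's settings.

```python
import math

def expected_risk(scores, rewards):
    """Expected cost of a candidate pool under the model's own
    distribution: softmax over model scores, cost = 1 - reward."""
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    return sum(p * (1.0 - r) for p, r in zip(probs, rewards))

# With equal scores the risk is just the average cost; raising the
# score of the high-reward candidate lowers the risk, which is the
# behaviour the gradient update encourages.
uniform = expected_risk([0.0, 0.0], [1.0, 0.0])  # average cost: 0.5
better = expected_risk([2.0, 0.0], [1.0, 0.0])   # below 0.5
```

    Minimising this quantity pushes probability mass toward the candidates the reward function prefers, rather than only toward the single reference summary as NLL does.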

    BERTTune: Fine-Tuning Neural Machine Translation with BERTScore


    Leveraging Discourse Rewards for Document-Level Neural Machine Translation

    Document-level machine translation focuses on the translation of entire documents from a source to a target language. It is widely regarded as a challenging task since the translation of the individual sentences in the document needs to retain aspects of the discourse at document level. However, document-level translation models are usually not trained to explicitly ensure discourse quality. Therefore, in this paper we propose a training approach that explicitly optimizes two established discourse metrics, lexical cohesion (LC) and coherence (COH), by using a reinforcement learning objective. Experiments over four different language pairs and three translation domains have shown that our training approach is able to achieve more cohesive and coherent document translations than other competitive approaches, yet without compromising the faithfulness to the reference translation. In the case of the Zh-En language pair, our method has achieved an improvement of 2.46 percentage points (pp) in LC and 1.17 pp in COH over the runner-up, while at the same time improving 0.63 pp in BLEU score and 0.47 pp in F_BERT. Comment: Accepted at COLING 2020.
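    As a rough illustration of the kind of signal a lexical cohesion (LC) reward provides, the toy metric below scores the fraction of content-word tokens that repeat a word from an earlier sentence in the same document. The LC metric actually optimized in the paper is more sophisticated, and the stopword list here is a placeholder.

```python
def lexical_cohesion(sentences,
                     stopwords=frozenset({"the", "a", "an", "of", "to", "and"})):
    """Toy lexical-cohesion score: the fraction of content-word tokens
    that already appeared in an earlier sentence of the document.
    Illustrates the shape of a document-level reward signal."""
    seen, repeated, total = set(), 0, 0
    for sent in sentences:
        words = [w.lower().strip(".,;:!?") for w in sent.split()]
        content = [w for w in words if w and w not in stopwords]
        for w in content:
            total += 1
            if w in seen:
                repeated += 1
        seen.update(content)
    return repeated / total if total else 0.0

# Re-using "cat" across sentences raises the score.
score = lexical_cohesion(["The cat sleeps.", "The cat purrs."])  # 0.25
```

    Because such a score is computed over the whole document, it can only act as a reward in a reinforcement learning objective, not as a per-token differentiable loss, which is why the paper resorts to RL.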

    A Shared Attention Mechanism for Interpretation of Neural Automatic Post-Editing Systems

    Automatic post-editing (APE) systems aim to correct the systematic errors made by machine translators. In this paper, we propose a neural APE system that encodes the source (src) and machine-translated (mt) sentences with two separate encoders, but leverages a shared attention mechanism to better understand how the two inputs contribute to the generation of the post-edited (pe) sentences. Our empirical observations have shown that when the mt is incorrect, the attention shifts weight toward tokens in the src sentence to properly edit the incorrect translation. The model has been trained and evaluated on the official data from the WMT16 and WMT17 APE IT-domain English-German shared tasks. Additionally, we have used the extra 500K artificial data provided by the shared task. Our system has been able to reproduce the accuracies of systems trained with the same data, while at the same time providing better interpretability.
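    The shared attention idea can be sketched as a single softmax that jointly scores the states of both encoders, so the weight mass assigned to the src side versus the mt side is directly comparable and interpretable. Dot-product scoring and the two-dimensional toy states below are simplifications, not the paper's exact architecture.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def shared_attention(query, src_states, mt_states):
    """One decoder step of an attention shared across two encoders:
    the same query scores src and mt states jointly, so one softmax
    decides how much weight each input side receives."""
    states = src_states + mt_states
    scores = [sum(q * s for q, s in zip(query, st)) for st in states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * st[d] for w, st in zip(weights, states)) for d in range(dim)]
    # total attention mass on the src side: this is the quantity that
    # grows when the mt sentence is unreliable
    src_mass = sum(weights[: len(src_states)])
    return context, src_mass

# Toy step: the query matches the src state much better than the mt
# state, so most of the mass lands on src.
context, src_mass = shared_attention([1.0, 0.0], [[2.0, 0.0]], [[0.0, 2.0]])
```

    Inspecting `src_mass` per decoding step is what makes the shared mechanism interpretable: a jump toward the src side flags positions where the machine translation is being overridden.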

    UHF RFID temperature sensor assisted with body-heat dissipation energy harvesting

    The number of wireless medical wearables has increased in recent years and is revolutionizing the current healthcare system. However, state-of-the-art systems still need to be improved, as they are bulky, battery-powered, and so require maintenance. In contrast, battery-free wearables have unlimited lifetimes, are smaller, and are cheaper. This paper describes the design of a battery-free wearable system that measures the skin temperature of the human body while at the same time collecting energy from body heat. The system is composed of a UHF RFID temperature sensor tag located on the arm of the patient. It is assisted with an extra power supply from a power-harvesting module that stores the thermal energy dissipated from the neck of the patient. This paper presents the experimental results of the stored thermal energy and characterizes the module in different conditions, e.g., still, walking indoors, and walking outdoors. Finally, the tag is tested in a fully passive condition and when it is power assisted. Our experimental results show that the communication range of the RFID sensor is improved by 100% when measurements are made every 750 ms and by 75% when measurements are made every 1000 ms, when the sensor is assisted with the power-harvesting module.
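    The trade-off between measurement rate and harvested power can be illustrated with a simple energy budget: the shortest sustainable interval between readings is the energy cost of one reading divided by the harvested power. The figures below are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def min_interval_ms(harvested_power_uw, energy_per_reading_uj):
    """Shortest sustainable interval between temperature readings (ms)
    if every reading must be paid for from harvested power alone.
    Power in microwatts (uW), energy per reading in microjoules (uJ)."""
    return 1000.0 * energy_per_reading_uj / harvested_power_uw

# Hypothetical numbers: a 3 uJ reading cost and 4 uW of harvested
# power would sustain one reading every 750 ms.
interval = min_interval_ms(4.0, 3.0)
```

    A faster measurement schedule than this break-even interval would drain the storage element, which is why the observed range improvement depends on how often the sensor samples.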