SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications
We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials. Although this was a new task, we had a total of 26 submissions across 3 evaluation scenarios. We expect the task and the findings reported in this paper to be relevant for researchers working on understanding scientific content, as well as the broader knowledge base population and information extraction communities.
Faithfulness Tests for Natural Language Explanations
Explanations of neural models aim to reveal a model’s decision-making process for its predictions. However, recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading, as they are prone to present reasons that are unfaithful to the model’s inner workings. This work explores the challenging question of evaluating the faithfulness of natural language explanations (NLEs). To this end, we present two tests. First, we propose a counterfactual input editor for inserting reasons that lead to counterfactual predictions but are not reflected by the NLEs. Second, we reconstruct inputs from the reasons stated in the generated NLEs and check how often they lead to the same predictions. Our tests can evaluate emerging NLE models, proving a fundamental tool in the development of faithful NLEs.
Stance detection with bidirectional conditional encoding
Stance detection is the task of classifying the attitude expressed in a text
towards a target such as Hillary Clinton to be "positive", "negative" or
"neutral". Previous work has assumed that either the target is mentioned in the
text or that training data for every target is given. This paper considers the
more challenging version of this task, where targets are not always mentioned
and no training data is available for the test targets. We experiment with
conditional LSTM encoding, which builds a representation of the tweet that is
dependent on the target, and demonstrate that it outperforms encoding the tweet
and the target independently. Performance is improved further when the
conditional model is augmented with bidirectional encoding. We evaluate our
approach on the SemEval 2016 Task 6 Twitter Stance Detection corpus achieving
performance second best only to a system trained on semi-automatically labelled
tweets for the test target. When such weak supervision is added, our approach
achieves state-of-the-art results.
emoji2vec: Learning Emoji Representations from their Description
Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings. There currently exist several publicly-available, pre-trained sets of word embeddings, but they contain few or no emoji representations even as emoji usage in social media has increased. In this paper we release emoji2vec, pre-trained embeddings for all Unicode emoji which are learned from their description in the Unicode emoji standard. The resulting emoji embeddings can be readily used in downstream social natural language processing applications alongside word2vec. We demonstrate, for the downstream task of sentiment analysis, that emoji embeddings learned from short descriptions outperform a skip-gram model trained on a large collection of tweets, while avoiding the need for contexts in which emoji need to appear frequently in order to estimate a representation.
The LEADING Guideline: Reporting Standards for Expert Panel, Best-Estimate Diagnosis, and Longitudinal Expert All Data (LEAD) Studies
Accurate assessments of symptoms and diagnoses are essential for health research and clinical practice but face many challenges. The absence of a single error-free measure is currently addressed by assessment methods involving experts reviewing several sources of information to achieve a more accurate or best-estimate assessment. Three bodies of work spanning medicine, psychiatry, and psychology propose similar assessment methods: The Expert Panel, the Best-Estimate Diagnosis, and the Longitudinal Expert All Data (LEAD). However, the quality of such best-estimate assessments is typically very difficult to evaluate because the assessment methods are often poorly reported and, when they are reported, the reporting quality varies substantially. Here we tackle this gap by developing reporting guidelines for such studies, using a four-stage approach: 1) drafting reporting standards accompanied by rationales and empirical evidence, which were further developed with a patient organization for depression, 2) incorporating expert feedback through a two-round Delphi procedure, 3) refining the guideline based on an expert consensus meeting, and 4) testing the guideline by i) having two researchers test it and ii) using it to examine the extent to which previously published articles report the standards. The last step also demonstrates the need for the guideline: 18 to 58% (Mean = 33%) of the standards were not reported across fifteen randomly selected studies. The LEADING guideline comprises 20 reporting standards related to four groups: The Longitudinal design; the Appropriate data; the Evaluation - experts, materials, and procedures; and the Validity group. We hope that the LEADING guideline will be useful in assisting researchers in planning, reporting, and evaluating research aiming to achieve best-estimate assessments. Open data (Delphi surveys 1 and 2), code (analyses), and material (surveys): https://osf.io/fkv4b
Distant supervision from knowledge graphs
In this chapter, we discuss approaches leveraging distant supervision for relation extraction. We start by introducing the key ideas behind distant supervision as well as their main shortcomings. We then discuss approaches that improve over the basic method, including approaches based on the at-least-one-principle along with their extensions for handling false negative labels, and approaches leveraging topic models. We also describe embeddings-based methods including methods leveraging convolutional neural networks. Finally, we discuss how to take advantage of auxiliary information to improve relation extraction.
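As a rough illustration of the basic distant supervision idea mentioned in this abstract, the sketch below labels any sentence that mentions both entities of a knowledge-graph pair with that pair's relation. It is a minimal sketch and not the chapter's own method: the toy knowledge graph `KG`, the function `distant_labels`, and the example sentences are all hypothetical, and entity matching is reduced to plain substring search.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge graph: (subject, object) -> relation.
KG: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Hawaii"): "born_in",
    ("Marie Curie", "Warsaw"): "born_in",
}


def distant_labels(sentences: List[str]) -> List[Tuple[str, str, str, str]]:
    """Label a sentence with a relation whenever it mentions both entities of a KG pair."""
    labelled = []
    for sentence in sentences:
        for (subj, obj), relation in KG.items():
            # Naive substring matching stands in for real entity linking.
            if subj in sentence and obj in sentence:
                labelled.append((sentence, subj, obj, relation))
    return labelled


corpus = [
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last week.",  # mentions both entities, wrong relation
    "Marie Curie moved to Paris.",             # only one entity of the pair is mentioned
]
labels = distant_labels(corpus)
```

The second sentence receives the `born_in` label even though it does not express that relation, which is the kind of noisy label the abstract counts among the shortcomings of the basic method.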
Dense Antihydrogen: Its Production and Storage to Envision Antimatter Propulsion
We discuss the possibility that dense antihydrogen could provide a path
towards a mechanism for a deep space propulsion system. We concentrate at
first, as an example, on Bose-Einstein Condensate (BEC) antihydrogen. In a
Bose-Einstein Condensate, matter (or antimatter) is in a coherent state
analogous to photons in a laser beam, and individual atoms lose their
independent identity. This allows many atoms to be stored in a small volume. In
the context of recent advances in producing and controlling BECs, as well as in
making antihydrogen, this could potentially provide a revolutionary path
towards the efficient storage of large quantities of antimatter, perhaps
eventually as a cluster or solid.
Whole organisms or pure compounds? entourage effect versus drug specificity
As the therapeutic use of sacred plants and fungi becomes increasingly accepted by Western medicine, a tug of war has been taking place between those who advocate the traditional consumption of whole organisms and those who defend exclusively the utilization of purified compounds. The attempt to reduce organisms to single active principles is challenged by the sheer complexity of traditional medicine. Ayahuasca, for example, is a concoction of at least two plant species containing multiple psychoactive substances with complex interactions. Similarly, cannabis contains dozens of psychoactive substances whose specific combinations in different strains correspond to different types of therapeutic and cognitive effects. The “entourage effect” refers to the synergistic effects of the multiple compounds present in whole organisms, which may potentiate clinical efficacy while attenuating side effects. In opposition to this view, mainstream pharmacology is adamant about the need to use purified substances, presumably more specific and safe. In this chapter, I will review the evidence on both sides to discuss the scientific, economic, and political implications of this controversy. The evidence indicates that it is time to embrace the therapeutic complexity of psychedelics.
Post-mortem volatiles of vertebrate tissue
Volatile emission during vertebrate decay is a complex process that is incompletely understood. It depends on many factors. The main factor is the metabolism of the microbial species present inside and on the vertebrate. In this review, we combine the results from studies on volatile organic compounds (VOCs) detected during this decay process and those on the biochemical formation of VOCs in order to improve our understanding of the decay process. Micro-organisms are the main producers of VOCs, which are by- or end-products of microbial metabolism. Many microbes are already present inside and on a vertebrate, and these can initiate microbial decay. In addition, micro-organisms from the environment colonize the cadaver. The composition of microbial communities is complex, and communities of different species interact with each other in succession. In comparison to the complexity of the decay process, the resulting volatile pattern does show some consistency. Therefore, the possible existence of a time-dependent core volatile pattern, which could be used for applications in areas such as forensics or food science, is discussed. Possible microbial interactions that might alter the process of decay are highlighted.