    An enhanced computational feature selection method for medical synonym identification via bilingualism and multi-corpus training

    Medical synonym identification is an important part of medical natural language processing (NLP). However, Chinese medical synonym identification suffers from low precision and low recall. To address this, we propose a method for identifying Chinese medical synonyms. We first selected 13 features, including both Chinese and English features. We then studied the synonym identification results of each feature alone and of different combinations of features. By comparing these identification results, we present an optimal combination of features for Chinese medical synonym identification. Experiments show that our selected features achieve a 97.37% precision rate, a 96.00% recall rate and a 97.33% F1 score.

    Taxonomy of the Crematogaster degeeri-species-assemblage in the Malagasy region (Hymenoptera: Formicidae)

    We revise the species-level taxonomy of the Crematogaster (Crematogaster) degeeri-species-assemblage, a group of related ants occurring in Madagascar and the wider Malagasy region, and further provide an identification key to all species-groups of the genus Crematogaster in this region. Within the C. degeeri-assemblage, we recognize twelve species based upon morphological data from worker, queen and male ants, as well as genetic data from the barcode region of cytochrome oxidase I. Seven new species are described: Crematogaster alafara Blaimer sp. nov., C. bara Blaimer sp. nov., C. mafybe Blaimer sp. nov., C. maina Blaimer sp. nov., C. malahelo Blaimer sp. nov., C. masokely Blaimer sp. nov., C. ramamy Blaimer sp. nov. Crematogaster tricolor Gerstäcker, 1859 (stat. rev.) and C. dentata Dalla Torre, 1893 (stat. nov.) are raised to species level, and the following new synonymies are proposed: Crematogaster degeeri lunaris Santschi, 1928 as a synonym of C. degeeri Forel, 1886; Crematogaster sewelli improba Forel, 1907 and C. sewelli mauritiana Forel, 1907 as synonyms of C. dentata Dalla Torre, 1893; and C. pacifica Santschi, 1919 as a synonym of C. lobata Emery, 1895. Species descriptions, images, distribution maps and identification keys based on worker ants, as well as on queen ants where available, are presented for all twelve species. In addition, we present a molecular gene tree for cytochrome oxidase I and summarize levels of sequence divergence within and between species of the C. degeeri-species-assemblage. Our findings are discussed in the light of previous work on Malagasy Crematogaster ants.

    Grounding Gene Mentions with Respect to Gene Database Identifiers

    We describe our submission for task 1B of the BioCreAtIvE competition, which is concerned with grounding gene mentions with respect to databases of organism gene identifiers. Several approaches to gene identification, lookup, and disambiguation are presented. Results are presented for two possible baseline systems, together with a discussion of the sources of precision and recall errors and an estimate of precision and recall for an organism-specific tagger bootstrapped from gene synonym lists and the task 1B training data.

    Faunal diversity of Paederus Fabricius, 1775 (Coleoptera: Staphylinidae) in Iran

    Beetles of the genus Paederus sensu stricto Fabricius, 1775 (Coleoptera: Staphylinidae) are often noticed because of their potency in inducing a dermal lesion, so-called linear dermatitis. This genus, which is placed in the tribe Paederini and subfamily Paederinae of Staphylinidae, currently comprises 490 species worldwide. Our study presents a short review of the former records of Paederus spp. in Iran plus some unpublished data. Field collections were made yearly during March-October (1997-2007) in northern and southern Iran and during April-June (2008-2009) in central, eastern, western and north-western Iran. The present study adds four species to the Iranian fauna of the genus Paederus: P. brevipennis Lacordaire, 1835, P. basalis Bernhauer, 1914, P. pubescens Cameron, 1914 and P. schoenherri Czwalina, 1899. Paederus brevipennis and P. schoenherri are the first members of the subgenus Harpopaederus Scheerpeltz, 1957, ever reported from Iran. Considering previous reports, museum-deposited materials and our findings, 14 species and subspecies of the genus Paederus, grouped in five subgenera, occur in Iran. These subgenera are Eopaederus Scheerpeltz, Harpopaederus Scheerpeltz, Heteropaederus Scheerpeltz, Paederus Fabricius and Poederomorphus des Cottes; however, P. duplex spectabilis Bernhauer, 1913 is not yet attributed to any of the 13 so-far defined subgenera.

    Families and genera of mosses no longer believed to occur in sub-Saharan Africa

    Twelve genera are excluded from the sub-Saharan Africa checklist based on evidence from literature or re-identification. Atractylocarpus, Chorisodontium, Ctenidium, Dicranodontium, Homalia, Isothecium, Lasiodontium, Meesia and Potamium are excluded as the collections belong to other genera, and Camptochaete, Phyllodrepanium and Ptychomnion are excluded because of evidence of mistaken (or no longer existing) localities. As a consequence, the following families are no longer known from Africa: Echinodiaceae, Lembophyllaceae, Phyllodrepaniaceae and Ptychomniaceae. Ectropothecium nishimurii O’Shea & Ochyra, nom. nov. replaces Ectropothecium mauritianum (Broth.) Nishimura, hom. illeg., and Kindbergia kenyae (Dixon ex Tosco & Piovano) O’Shea & Ochyra, comb. nov. replaces Isothecium kenyae Dixon ex Tosco & Piovano. Lasiodontium mieheanum Ochyra in S. Miehe & G. Miehe, nom. nud., is a synonym of Daltonia angustifolia Dozy & Molk. and accordingly Lasiodontium Ochyra in S. Miehe & G. Miehe, nom. nud., must be placed in synonymy with Daltonia Hook. & Taylor.

    What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

    In this paper, we claim that Vector Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting this intersection according to the rank of the shared contexts in the dependency ranked lists. This claim comes from the hypothesis that similar words do not simply occur in similar contexts, but share a larger portion of their most relevant contexts than other related words do. To test it, we describe and evaluate APSyn, a variant of Average Precision that, independently of the adopted parameters, outperforms the Vector Cosine and the co-occurrence on the ESL and TOEFL test sets. In the best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy on the TOEFL dataset, thereby beating the non-English US college applicants (whose average, as reported in the literature, is 64.50%) and several state-of-the-art approaches. Comment: in LREC 201
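
    The rank-weighted intersection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `apsyn`, the `top_n` cutoff, and the choice to weight each shared context by the inverse of its average rank in the two lists are assumptions made for the example.

```python
def apsyn(ranked_a, ranked_b, top_n=1000):
    """Rank-weighted overlap of two context lists (illustrative sketch).

    ranked_a, ranked_b: lists of unique contexts for two target words,
    sorted from most to least associated. top_n is an assumed cutoff.
    """
    # Map each of the top-N contexts to its 1-based rank.
    rank_a = {ctx: r for r, ctx in enumerate(ranked_a[:top_n], start=1)}
    rank_b = {ctx: r for r, ctx in enumerate(ranked_b[:top_n], start=1)}

    # Only contexts present in both top-N lists contribute.
    shared = rank_a.keys() & rank_b.keys()

    # Assumed weighting: inverse of the average rank, so contexts that
    # rank highly for both words count the most.
    return sum(1.0 / ((rank_a[c] + rank_b[c]) / 2.0) for c in shared)


# Toy usage with made-up context lists.
score = apsyn(["drink", "hot", "cup"], ["drink", "cup", "milk"])
```

    Under this weighting, two words that share only low-ranked contexts score close to zero, while words sharing their top contexts score high, which matches the hypothesis stated in the abstract.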

    Nomina dubia and faunistic issues with New Zealand spiders (Araneae)

    Attempts to clarify the identity of obscure New Zealand spider taxa have led to the conclusion that six species are best treated as nomina dubia [Philodromus rubrofrontus Urquhart 1891 (Philodromidae); Dictyna urquhartii Roewer 1951 (Dictynidae); Linyphia albiapiata Urquhart 1891, Linyphia cruenta Urquhart 1891, Linyphia multicolor Urquhart 1891, Linyphia pellos Urquhart 1891 (Linyphiidae)]. Four species currently listed in Araneus Clerck 1757 (Araneidae) are re-affirmed as synonyms [Araneus lineaacutus (Urquhart 1887) = Zealaranea crassa (Walckenaer 1842), Araneus powelli (Urquhart 1894) = Novaranea laevigata (Urquhart 1891), Araneus sublutius (Urquhart 1892b) = Zealaranea trinotata (Urquhart 1890), Araneus ventricosellus (Roewer 1942) = Eriophora heroine (L. Koch 1871)]. An old record of Araneus brisbanae (L. Koch 1867b) (Araneidae) from New Zealand is a misidentification of Eriophora decorosa Urquhart 1894. The family Philodromidae, the genera Dictyna Sundevall 1833 (Dictynidae) and Linyphia Latreille 1804 (Linyphiidae), as well as Tharpyna munda L. Koch 1875 (Thomisidae) and Araneus brisbanae (Araneidae) are absent from New Zealand.

    Mining Entity Synonyms with Efficient Neural Set Generation

    Mining entity synonym sets (i.e., sets of terms referring to the same entity) is an important task for many entity-leveraging applications. Previous work either ranks terms based on their similarity to a given query term, or treats the problem as a two-phase task (i.e., detecting synonymy pairs, followed by organizing these pairs into synonym sets). However, these approaches fail to model the holistic semantics of a set and suffer from error propagation. Here we propose a new framework, named SynSetMine, that efficiently generates entity synonym sets from a given vocabulary, using example sets from external knowledge bases as distant supervision. SynSetMine consists of two novel modules: (1) a set-instance classifier that jointly learns how to represent a permutation-invariant synonym set and whether to include a new instance (i.e., a term) in the set, and (2) a set generation algorithm that enumerates the vocabulary only once and applies the learned set-instance classifier to detect all entity synonym sets in it. Experiments on three real datasets from different domains demonstrate both the effectiveness and the efficiency of SynSetMine for mining entity synonym sets. Comment: AAAI 2019 camera-ready version
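
    A single-pass set generation loop of the kind described in this abstract might look like the sketch below. It is not the SynSetMine code: the greedy "join the best-scoring set or start a new one" rule, the `threshold` parameter, and the `toy_score` function (a hypothetical stand-in for the learned set-instance classifier) are all assumptions for illustration.

```python
from typing import Callable, Iterable, List, Set


def generate_synonym_sets(
    vocab: Iterable[str],
    score: Callable[[Set[str], str], float],
    threshold: float = 0.5,
) -> List[Set[str]]:
    """Enumerate the vocabulary once, growing synonym sets greedily."""
    clusters: List[Set[str]] = []
    for term in vocab:
        # Find the existing set the classifier thinks fits best.
        best_idx, best_score = -1, 0.0
        for i, cluster in enumerate(clusters):
            s = score(cluster, term)
            if s > best_score:
                best_idx, best_score = i, s
        if best_idx >= 0 and best_score > threshold:
            clusters[best_idx].add(term)   # join the best-matching set
        else:
            clusters.append({term})        # otherwise start a new set
    return clusters


def toy_score(cluster: Set[str], term: str) -> float:
    """Hypothetical classifier stand-in: fraction of set members that
    share the term's first three letters (case-insensitive)."""
    key = term.lower()[:3]
    return sum(m.lower()[:3] == key for m in cluster) / len(cluster)


sets = generate_synonym_sets(
    ["aspirin", "Aspirin", "ibuprofen", "aspirine"], toy_score
)
```

    Because each term is visited once and scored against whole sets rather than individual pairs, the loop avoids the separate pair-detection phase that the abstract identifies as a source of error propagation.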