
    Text Mining the History of Medicine

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly accessible thanks to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data efficiently. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestion of synonyms for user-entered query terms, exploration of the different concepts mentioned within search results, or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and the relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible, semantically oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.

    Gene and protein nomenclature in public databases

    BACKGROUND: Frequently, several alternative names are in use for biological objects such as genes and proteins. Applications like manual literature search, automated text-mining, named entity identification, gene/protein annotation, and linking of knowledge from different information sources require knowledge of all names in use for a given gene or protein. Various organism-specific or general public databases aim at organizing knowledge about genes and proteins. These databases can be used for deriving gene and protein name dictionaries. So far, little is known about the differences between databases in terms of size, ambiguities and overlap. RESULTS: We compiled five gene and protein name dictionaries for each of the five model organisms (yeast, fly, mouse, rat, and human) from different organism-specific and general public databases. We analyzed the degree of ambiguity of gene and protein names within and between dictionaries, and with respect to a lexicon of common English words and domain-related non-gene terms, and we compared different data sources in terms of the size of the extracted dictionaries and the overlap of synonyms between them. The study shows that the number of genes/proteins and synonyms covered in individual databases varies significantly for a given organism, and that the degree of ambiguity of synonyms varies significantly between different organisms. Furthermore, it shows that, despite considerable efforts of co-curation, the overlap of synonyms in different data sources is rather moderate and that the degree of ambiguity of gene names with common English words and domain-related non-gene terms varies depending on the considered organism. CONCLUSION: These results indicate that combining the data contained in different databases allows the generation of gene and protein name dictionaries that contain significantly more names in use than dictionaries obtained from individual data sources. Furthermore, curation of combined dictionaries considerably increases size and decreases ambiguity. The entries of the curated synonym dictionary are available for manual querying, editing, and PubMed- or Google-search via the ProThesaurus-wiki. For automated querying via custom software, we offer a web service and an exemplary client application.

    Solid 4He and the Supersolid Phase: from Theoretical Speculation to the Discovery of a New State of Matter? A Review of the Past and Present Status of Research

    The possibility of a supersolid state of matter, i.e., a crystalline solid exhibiting superfluid properties, first appeared in theoretical studies about forty years ago. After a long period of little interest due to the lack of experimental evidence, it has attracted strong experimental and theoretical attention in the last few years since Kim and Chan (Penn State, USA) reported evidence for nonclassical rotational inertia effects, a typical signature of superfluidity, in samples of solid 4He. Since this "first observation", other experimental groups have observed such effects in the response to the rotation of samples of crystalline helium, and it has become clear that the response of the solid is extremely sensitive to growth conditions, annealing processes, and 3He impurities. A peak in the specific heat in the same range of temperatures has been reported as well as anomalies in the elastic behaviour of solid 4He with a strong resemblance to the phenomena revealed by torsional oscillator experiments. Very recently, the observation of unusual mass transport in hcp solid 4He has also been reported, suggesting superflow. From the theoretical point of view, powerful simulation methods have been used to study solid 4He, but the interpretation of the data is still rather difficult; dealing with the question of supersolidity means that one has to face not only the problem of the coexistence of quantum coherence phenomena and crystalline order, exploring the realm of spontaneous symmetry breaking and quantum field theory, but also the problem of the role of disorder, i.e., how defects, such as vacancies, impurities, dislocations, and grain boundaries, participate in the phase transition mechanism. Comment: Published in J. Phys. Soc. Jpn., Vol.77, No.11, p.11101

    eGIFT: Mining Gene Information from the Literature

    BACKGROUND: With the biomedical literature continually expanding, searching PubMed for information about specific genes becomes increasingly difficult. Not only can thousands of results be returned, but gene name ambiguity leads to many irrelevant hits. As a result, it is difficult for life scientists and gene curators to rapidly get an overall picture about a specific gene from documents that mention its names and synonyms. RESULTS: In this paper, we present eGIFT (http://biotm.cis.udel.edu/eGIFT), a web-based tool that associates informative terms, called iTerms, and sentences containing them, with genes. To associate iTerms with a gene, eGIFT ranks iTerms about the gene, based on a score which compares the frequency of occurrence of a term in the gene's literature to its frequency of occurrence in documents about genes in general. To retrieve a gene's documents (Medline abstracts), eGIFT considers all gene names, aliases, and synonyms. Since many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene. An additional filtering process is applied to retain those abstracts that focus on the gene rather than mention it in passing. eGIFT's information for a gene is pre-computed and users of eGIFT can search for genes by using a name or an EntrezGene identifier. iTerms are grouped into different categories to facilitate a quick inspection. eGIFT also links an iTerm to sentences mentioning the term to allow users to see the relation between the iTerm and the gene. We evaluated the precision and recall of eGIFT's iTerms for 40 genes; between 88% and 94% of the iTerms were marked as salient by our evaluators, and 94% of the UniProtKB keywords for these genes were also identified by eGIFT as iTerms. CONCLUSIONS: Our evaluations suggest that iTerms capture highly relevant aspects of genes. Furthermore, by showing sentences containing these terms, eGIFT can provide a quick description of a specific gene. eGIFT helps not only life scientists survey results of high-throughput experiments, but also annotators find articles describing gene aspects and functions.
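
The abstract above describes ranking terms by comparing how often a term occurs in a gene's literature with how often it occurs in documents about genes in general, but it does not give the formula. The following is a minimal sketch of that idea only, not eGIFT's actual scoring: the function names, the whitespace tokenisation, the smoothed ratio of document-frequency rates and the toy abstracts are all assumptions made for illustration.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    counts = Counter()
    for doc in docs:
        # A set ensures each term is counted at most once per document.
        counts.update(set(doc.lower().split()))
    return counts


def rank_terms(gene_docs, background_docs, smoothing=1.0):
    """Rank terms by (rate in gene docs) / (smoothed rate in background docs).

    Hypothetical score: the source only says the two frequencies are compared.
    """
    gene_df = document_frequency(gene_docs)
    bg_df = document_frequency(background_docs)
    n_gene, n_bg = len(gene_docs), len(background_docs)

    scores = {}
    for term, count in gene_df.items():
        gene_rate = count / n_gene
        # Smoothing avoids division by zero for terms unseen in the background.
        bg_rate = (bg_df[term] + smoothing) / (n_bg + smoothing)
        scores[term] = gene_rate / bg_rate
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data standing in for Medline abstracts (invented for this sketch).
gene_docs = ["kinase binds receptor", "kinase phosphorylates substrate"]
background_docs = [
    "gene encodes protein",
    "gene binds receptor",
    "protein kinase gene",
    "gene expression study",
]
ranked = rank_terms(gene_docs, background_docs)
```

In this toy data, "kinase" appears in every gene document but in only one of four background documents, so it scores higher than "receptor", which appears in half of the gene documents. A real system would also need the disambiguation and filtering steps the abstract mentions before any such score is computed.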

    Deuteron and antideuteron production in Au+Au collisions at sqrt(s_NN)=200 GeV

    The production of deuterons and antideuterons in the transverse momentum range 1.1 < p_T < 4.3 GeV/c at mid-rapidity in Au+Au collisions at sqrt(s_NN)=200 GeV has been studied by the PHENIX experiment at RHIC. A coalescence analysis comparing the deuteron and antideuteron spectra with those of protons and antiprotons has been performed. The coalescence probability is equal for both deuterons and antideuterons and increases as a function of p_T, which is consistent with an expanding collision zone. Comparing the (anti)proton yield ratio, p_bar/p = 0.73 +/- 0.01, with the (anti)deuteron yield ratio, d_bar/d = 0.47 +/- 0.03, we estimate that n_bar/n = 0.64 +/- 0.04. Comment: 326 authors, 6 pages text, 5 figures, 1 Table. Submitted to PRL. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm
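
The quoted n_bar/n estimate can be reproduced with a short calculation. This is a sketch under two assumptions that the abstract does not state: that in a simple coalescence picture the (anti)deuteron yield scales as the product of the (anti)proton and (anti)neutron yields, and that the two quoted uncertainties are uncorrelated and can be combined in quadrature. The paper's actual procedure may differ.

```latex
% Assumption: simple coalescence, d \propto p \cdot n and \bar d \propto \bar p \cdot \bar n
\[
\frac{\bar d}{d} \approx \frac{\bar p}{p}\cdot\frac{\bar n}{n}
\quad\Longrightarrow\quad
\frac{\bar n}{n} \approx \frac{\bar d/d}{\bar p/p} = \frac{0.47}{0.73} \approx 0.64
\]
% Assumption: uncorrelated uncertainties, relative errors added in quadrature
\[
\frac{\delta(\bar n/n)}{\bar n/n}
\approx \sqrt{\left(\frac{0.03}{0.47}\right)^{2} + \left(\frac{0.01}{0.73}\right)^{2}}
\approx 0.065
\quad\Longrightarrow\quad
\delta(\bar n/n) \approx 0.065 \times 0.64 \approx 0.04
\]
```

Under these assumptions the result is n_bar/n of about 0.64 +/- 0.04, which matches the value quoted in the abstract.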

    Heavy Quarks and Heavy Quarkonia as Tests of Thermalization

    We present here a brief summary of new results on heavy quarks and heavy quarkonia from the PHENIX experiment as presented at the "Quark Gluon Plasma Thermalization" Workshop in Vienna, Austria in August 2005, directly following the International Quark Matter Conference in Hungary. Comment: 8 pages, 5 figures, Quark Gluon Plasma Thermalization Workshop (Vienna August 2005) Proceedings

    Production of phi mesons at mid-rapidity in sqrt(s_NN) = 200 GeV Au+Au collisions at RHIC

    We present the first results of phi meson production in the K^+K^- decay channel from Au+Au collisions at sqrt(s_NN) = 200 GeV as measured at mid-rapidity by the PHENIX detector at RHIC. Precision resonance centroid and width values are extracted as a function of collision centrality. No significant variation from the PDG accepted values is observed. The transverse mass spectra are fitted with a linear exponential function for which the derived inverse slope parameter is seen to be constant as a function of centrality. These data are also fitted by a hydrodynamic model with the result that the freeze-out temperature and the expansion velocity values are consistent with the values previously derived from fitting single hadron inclusive data. As a function of transverse momentum, the collision-scaled peripheral-to-central yield ratio R_CP for the phi is comparable to that of pions rather than that of protons. This result lends support to theoretical models which distinguish between baryons and mesons instead of particle mass for explaining the anomalous proton yield. Comment: 326 authors, 24 pages text, 23 figures, 6 tables, RevTeX 4. To be submitted to Physical Review C as a regular article. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm

    Single Electrons from Heavy Flavor Decays in p+p Collisions at sqrt(s) = 200 GeV

    The invariant differential cross section for inclusive electron production in p+p collisions at sqrt(s) = 200 GeV has been measured by the PHENIX experiment at the Relativistic Heavy Ion Collider over the transverse momentum range 0.4 <= p_T <= 5.0 GeV/c at midrapidity (|eta| <= 0.35). The contribution to the inclusive electron spectrum from semileptonic decays of hadrons carrying heavy flavor, i.e. charm quarks or, at high p_T, bottom quarks, is determined via three independent methods. The resulting electron spectrum from heavy flavor decays is compared to recent leading and next-to-leading order perturbative QCD calculations. The total cross section of charm quark-antiquark pair production is determined as sigma_(c c^bar) = 0.92 +/- 0.15 (stat.) +/- 0.54 (sys.) mb. Comment: 329 authors, 6 pages text, 3 figures. Submitted to Phys. Rev. Lett. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm

    Nuclear Modification of Electron Spectra and Implications for Heavy Quark Energy Loss in Au+Au Collisions at sqrt(s_NN)=200 GeV

    The PHENIX experiment has measured mid-rapidity transverse momentum spectra (0.4 < p_T < 5.0 GeV/c) of electrons as a function of centrality in Au+Au collisions at sqrt(s_NN)=200 GeV. Contributions from photon conversions and from light hadron decays, mainly Dalitz decays of pi^0 and eta mesons, were removed. The resulting non-photonic electron spectra are primarily due to the semi-leptonic decays of hadrons carrying heavy quarks. Nuclear modification factors were determined by comparison to non-photonic electrons in p+p collisions. A significant suppression of electrons at high p_T is observed in central Au+Au collisions, indicating substantial energy loss of heavy quarks. Comment: 330 authors, 6 pages text, 3 figures. Submitted to Phys. Rev. Lett. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm

    Mid-Rapidity Direct-Photon Production in p+p Collisions at sqrt(s) = 200 GeV

    A measurement of direct photons in p+p collisions at sqrt(s)=200 GeV is presented. A photon excess above background from pi^0 --> gamma+gamma, eta --> gamma+gamma, and other decays is observed in the transverse momentum range 5.5 < p_T < 7 GeV/c. The result is compared to a next-to-leading-order perturbative QCD calculation. Within errors, good agreement is found between the QCD calculation and the measured result. Comment: 330 authors, 7 pages text, RevTeX, 2 figures, 2 tables. Submitted to Physical Review D. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm