2,376 research outputs found

    Mining chemical information from Open patents

    RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
    Abstract: Linked Open Data presents an opportunity to vastly improve the quality of science in all fields by increasing the availability and usability of the data upon which it is based. In the chemical field, there is a huge amount of information available in the published literature, the vast majority of which is not available in machine-understandable formats. PatentEye, a prototype system for the extraction and semantification of chemical reactions from the patent literature, has been implemented and is discussed. A total of 4444 reactions were extracted from 667 patent documents that comprised 10 weeks' worth of publications from the European Patent Office (EPO), with a precision of 78% and recall of 64% with regard to determining the identity and amount of reactants employed and an accuracy of 92% with regard to product identification. NMR spectra reported as product characterisation data are additionally captured. Peer reviewed.

    Long distance gene flow facilitated by bird-dispersed seeds in wind-pollinated species: A story of hybridization and introgression between Juniperus ashei and J. ovata told by nrDNA and cpDNA

    nrDNA and cpDNA were sequenced for J. ashei and J. ovata from populations throughout their ranges. No J. ashei population was found to be pure in its nrDNA for every tree; however, all J. ashei trees in every population contained only the pure J. ashei chloroplast type. Populations of J. ovata in trans-Pecos Texas were almost pure in both nrDNA and cpDNA. Several plants in the J. ashei range contained J. ovata-type nrDNA (White Cliffs, AR, 3/10; Ranger, TX, 1/5; Waco, TX, 1/12). Every J. ashei population contained at least 1 plant with hybrid (heterozygous) nrDNA, and 3 J. ovata populations contained putative hybrids (by nrDNA), but one population had only pure J. ovata trees. The presence of ovata germplasm within J. ashei populations seems best explained by long distance bird dispersal of J. ovata seeds (thence seedlings and J. ovata trees and hybrids) in the disjunct J. ashei populations. The reason for the absence of ovata paternal cp, which is distributed by pollen, in J. ashei populations is not known. Judged by the distribution of cp data, there is very little movement of cp genomes. In contrast, nrDNA polymorphisms indicate there is considerable gene flow between J. ashei and J. ovata, but primarily in the direction of J. ovata to J. ashei, which may be explained by a combination of bird migration pattern and recurring and preferential F1-hybrid formation.

    The semantics of Chemical Markup Language (CML): dictionaries and conventions

    Abstract: The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries is used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs. Peer reviewed.

    The semantic architecture of the World-Wide Molecular Matrix (WWMM)

    Abstract: The World-Wide Molecular Matrix (WWMM) is a ten-year project to create a peer-to-peer (P2P) system for the publication and collection of chemical objects, including over 250,000 molecules. It has now been instantiated in a number of repositories which include data encoded in Chemical Markup Language (CML) and linked by URIs and RDF. The technical specification and implementation are now complete. We discuss the types of architecture required to implement nodes in the WWMM and consider the social issues involved in adoption. Peer reviewed.

    Inheritance of single copy nuclear genes (SCNGs) in artificial hybrids of Hesperocyparis arizonica x H. macrocarpa: Potential for utilization in the detection of hybridization in natural populations

    Analyses were performed on 18 artificial hybrids from a cross of Hesperocyparis arizonica (male parent) x H. macrocarpa (female parent) using 9 single copy nuclear genes (SCNGs). Three SCNGs were found to be informative: myb, 4CL and CnAIP2. Gene myb contained 5 variable sites, of which site 89 was homozygous in the parents (CC, TT), as was site 261 (GG, AA), making both useful for the detection of hybridization. All 18 hybrids were heterozygous (CT and GA) at these 2 sites, as predicted for hybrids. 4CL contained 8 variable sites, of which 1 site (591) was homozygous (TT, CC), and all 18 hybrids were heterozygous (TC) at this site as expected. CnAIP2 had two variable sites: 301 (AA, AC) and 554 (AG, AA). For site 301, 8 hybrids were AA and 10 were AC, as expected. For site 554, 10 hybrids were AA and 8 were AG, so neither site would be useful for unequivocally identifying hybrids. The inheritance of variable sites for the three SCNGs followed simple co-occurrence. Examination of myb in the 18 hybrids revealed 2 cases of cross-over in the pollen gametes.
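
    The reasoning in this abstract (a site flags hybrids unequivocally only when the two parents are homozygous for different alleles) can be illustrated with a minimal sketch. The function names `cross` and `is_diagnostic` are hypothetical, the assignment of each genotype to a particular parent is arbitrary here because the abstract does not state it, and simple diploid Mendelian segregation is assumed.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, float]:
    """Expected F1 genotype frequencies at one site, assuming each parent
    passes on one of its two alleles with equal probability.
    Genotypes are written with alleles sorted alphabetically, so 'TC'
    and 'CT' are both reported as 'CT'."""
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def is_diagnostic(parent_a: str, parent_b: str) -> bool:
    """True when both parents are homozygous for different alleles,
    so every F1 hybrid must be heterozygous at the site."""
    return (
        len(set(parent_a)) == 1
        and len(set(parent_b)) == 1
        and parent_a != parent_b
    )


# Parental genotypes as listed in the abstract (parent order is an assumption).
sites = {
    "myb 89": ("CC", "TT"),
    "myb 261": ("GG", "AA"),
    "4CL 591": ("TT", "CC"),
    "CnAIP2 301": ("AA", "AC"),
    "CnAIP2 554": ("AG", "AA"),
}

for name, (pa, pb) in sites.items():
    print(name, cross(pa, pb), "diagnostic" if is_diagnostic(pa, pb) else "not diagnostic")
```

    Under these assumptions the two myb sites and the 4CL site yield only heterozygous offspring, whereas the two CnAIP2 sites are expected to split 1:1, which is consistent with the 8:10 and 10:8 counts reported for the 18 hybrids.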

    OSCAR4: a flexible architecture for chemical text-mining

    Abstract: The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry-specific text-mining tools can be built, and its development and usage are discussed. Peer reviewed.

    Nuclear and chloroplast DNAs reveal diverse origins and mis-identifications of Juniperus cultivars from Windsor Gardens, UK, Part 3 of 3

    Ploidy was determined for 15 plants labeled as Juniperus squamata at the Windsor Gardens, UK, and revealed that 12 were tetraploids (2n=4x=44) and 3 were diploids (2n=2x=22). nrDNA (ITS) and cpDNA sequencing of the tetraploids found: 4 J. squamata (4x); 4 J. tibetica (4x) x J. squamata (4x); 2 J. sabina var. balkanensis (4x) x J. squamata (4x); and one J. chinensis var. sargentii (4x) x J. squamata (4x). Sequencing the 3 diploids revealed: 2 J. pingii (2x) x J. pingii (2x); and 1 J. pingii (2x)? x J. komarovii (2x)? Ploidy analyses of 18 additional cultivars, putatively from Juniperus davurica, J. recurva, J. rushforthiana, J. sabina, and J. virginiana, revealed 6 diploids, 5 triploids and 7 tetraploids. Cultivar 'Musgrave' (4x), by DNA, was identical to J. xpfitzeriana 'Wilhelm Pfitzer' (4x). The DNA of the 5 triploids was nearly identical to J. xpfitzeriana 'Wilhelm Pfitzer' (4x) in every case. 'Tamariscifolia' and 'Variegata' both had J. sabina var. sabina as their maternal parent, but the first had J. sabina var. balkanensis as the male parent and the second had J. sabina var. sabina as the male parent. Thus, 'Tamariscifolia' is the first discovery of a J. sabina var. balkanensis x J. s. var. sabina hybrid in cultivation. None of the 3 'davurica' cultivars proved to be J. davurica, but rather J. chinensis var. procumbens x J. chinensis var. sargentii. Cultivars J. indica and recurva 'densa' were shown to be J. indica var. caespitosa. recurva 'Embley Park' appears to be J. coxii x J. squamata var. wilsonii. J. wallichiana (=J. indica) 15460 was found to be J. rushforthiana, whereas J. wallichiana (15487) was discovered to be J. indica x J. rushforthiana. Cultivar virginiana 'cannaertii' was shown to be J. virginiana. Botanic gardens provide a great opportunity for species to hybridize with other species that are not in contact in nature. The species care and suitable habitat provided in a garden setting, as well as vegetative propagation methods, have allowed the preservation of those rare hybrids. Identification of juniper hybrids and variants is quite imprecise. DNA barcoding of cultivated plants in botanic gardens would greatly facilitate the recognition, study and utilization of rare hybrids and somatic mutations.

    Report of the Pathogenesis and Pathophysiology of Lyme Disease Subcommittee of the HHS Tick Borne Disease Working Group

    An understanding of the pathogenesis and pathophysiology of Lyme disease is key to the ultimate care of patients with Lyme disease. To better understand the various mechanisms underlying the infection caused by Borrelia burgdorferi, the Pathogenesis and Pathophysiology of Lyme Disease Subcommittee was formed to review what is currently known about the pathogenesis and pathophysiology of Lyme disease from its inception, and especially about its ability to persist in the host. To that end, the authors of this report were assembled to update our knowledge about the infectious process, identify the gaps that exist in our understanding of the process, and provide recommendations on how best to approach solutions that could lead to better means of managing patients with persistent Lyme disease.

    Ami - The Chemist's Amanuensis

    Abstract: The Ami project was a six-month Rapid Innovation project sponsored by JISC to explore the Virtual Research Environment space. The project brainstormed with chemists and decided to investigate ways to facilitate monitoring and collection of experimental data. A frequently encountered use-case was identified in which the chemist reaches the end of an experiment but finds an unexpected result. The ability to replay events can significantly help make sense of how things progressed. The project therefore concentrated on collecting a variety of dimensions of ancillary data - data that would not normally be collected due to practicality constraints. There were three main areas of investigation: 1) Development of a monitoring tool using infrared and ultrasonic sensors; 2) Time-lapse motion video capture (for example, videoing 5 seconds in every 60); and 3) Activity-driven video monitoring of the fume cupboard environs. The Ami client application was developed to control these separate logging functions. The application builds up a timeline of the events in the experiment and around the fume cupboard. The videos and data logs can then be reviewed after the experiment in order to help the chemist determine the exact timings and conditions used. The project experimented with ways in which a Microsoft Kinect could be used in a laboratory setting. Investigations suggest that it would not be an ideal device for controlling a mouse, but it shows promise for uses such as manipulating virtual molecules. Peer reviewed.

    Genomic underpinnings of head and body shape in Arctic charr ecomorph pairs

    Across its Holarctic range, Arctic charr (Salvelinus alpinus) populations have diverged into distinct trophic specialists across independent replicate lakes. The major aspect of divergence between ecomorphs is in head shape and body shape, which are ecomorphological traits reflecting niche use. However, whether the genomic underpinnings of these parallel divergences are consistent across replicates was unknown but key for resolving the substrate of parallel evolution. We investigated the genomic basis of head shape and body shape morphology across four benthivore–planktivore ecomorph pairs of Arctic charr in Scotland. Through genome-wide association analyses, we found genomic regions associated with head shape (89 SNPs) or body shape (180 SNPs) separately and 50 of these SNPs were strongly associated with both body and head shape morphology. For each trait separately, only a small number of SNPs were shared across all ecomorph pairs (3 SNPs for head shape and 10 SNPs for body shape). Signs of selection on the associated genomic regions varied across pairs, consistent with evolutionary demography differing considerably across lakes. Using a comprehensive database of salmonid QTLs newly augmented and mapped to a charr genome, we found several of the head- and body-shape-associated SNPs were within or near morphology QTLs from other salmonid species, reflecting a shared genetic basis for these phenotypes across species. Overall, our results demonstrate how parallel ecotype divergences can have both population-specific and deeply shared genomic underpinnings across replicates, influenced by differences in their environments and demographic histories