Search CORE

101 research outputs found

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: 'SAGE Publications'
Publication date: 01/07/2003
Field of study

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

Crossref

White Rose Research Online

7th German Conference on Chemoinformatics: 25 CIC-Workshop : Goslar, Germany, 6 - 8 November 2011 ; meeting abstracts / Edited by Frank Oellien, Uli Fechner and Thomas Engel

Author: Engel Thomas
Fechner Uli
Oellien Frank
Publication venue
Publication date: 01/05/2012
Field of study

Hochschulschriftenserver - Universität Frankfurt am Main

Recommended from our members

Information extraction from chemical patents

Author: Jessop David M
Publication venue: University of Cambridge
Publication date: 15/03/2011
Field of study

The automated extraction of semantic chemical data from the existing literature is demonstrated. For reasons of copyright, the work is focused on the patent literature, though the methods are expected to apply equally to other areas of the chemical literature. Hearst Patterns are applied to the patent literature in order to discover hyponymic relations describing chemical species. The acquired relations are manually validated to determine the precision of the determined hypernyms (85.0%) and of the asserted hyponymic relations (94.3%). It is demonstrated that the system acquires relations that are not present in the ChEBI ontology, suggesting that it could function as a valuable aid to the ChEBI curators. The relations discovered by this process are formalised using the Web Ontology Language (OWL) to enable re-use. PatentEye – an automated system for the extraction of reactions from chemical patents and their conversion to Chemical Markup Language (CML) – is presented. Chemical patents published by the European Patent Office over a ten-week period are used to demonstrate the capability of PatentEye – 4444 reactions are extracted with a precision of 78% and recall of 64% with regards to determining the identity and amount of reactants employed and an accuracy of 92% with regards to product identification. NMR spectra are extracted from the text using OSCAR3, which is developed to greatly increase recall. The resulting system is presented as a significant advancement towards the large-scale and automated extraction of high-quality reaction information. Extended Polymer Markup Language (EPML), a CML dialect for the description of Markush structures as they are presented in the literature, is developed. Software to exemplify and to enable substructure searching of EPML documents is presented. Further work is recommended to refine the language and code to publication-quality before they are presented to the community.Unileve

Apollo (Cambridge)

ClassyFire: automated chemical classification with a comprehensive, computable taxonomy

Author: Christoph Steinbeck
Craig Knox
David S. Wishart
Eoin Fahy
Evan Bolton
Gareth Owen
Janna Hastings
Leonid Chepelev
Roman Eisner
Russell Greiner
Shankar Subramanian
Yannick Djoumbou Feunang
Publication venue: Springer Nature
Publication date: 01/01/2016
Field of study

Additional file 5. Use cases. Text-based search on the ClassyFire web server. (A) Building the query. (B) Sparteine, one of the returned compounds

Springer - Publisher Connector

FigShare

Development of deep learning applications for the automated extraction of chemical information from scientific literature

Author: Brinkhaus Otto
Publication venue
Publication date: 01/01/2023
Field of study

This dissertation focuses on developing deep learning applications for extracting chemical information from scientific literature, particularly targeting the automated recognition of molecular structures in images. DECIMER Segmentation, a novel application, employs a Mask Region-based Convolutional Neural Network (MRCNN) model to segment chemical structures in documents, aided by a mask expansion algorithm, marking a significant advancement in processing chemical literature. The Optical Chemical Structure Recognition (OCSR) tool DECIMER Image Transformer uses an encoder-decoder architecture to convert chemical structure depictions into the machine-readable SMILES format. The model has been trained on over 450 million pairs of images and SMILES representations. Its ability to interpret various depiction styles, including hand-drawn structures, sets a new standard in OCSR. To artificially generate large and diverse OCSR training datasets using multiple cheminformatics toolkits, RanDepict was developed. The diversification of training data ensures robust model generalisation across different chemical structure depictions. A unique dataset of hand-drawn molecule images was created to evaluate the model's performance in interpreting these challenging depictions. This dataset further contributes to the understanding of automated structure recognition from diverse styles. The integration of these technologies led to the creation of DECIMER.ai, an open-source web application that combines segmentation and interpretation tools, allowing users to extract and process chemical information from literature efficiently. The work concludes with a discussion on the significance of open data in advancing molecular informatics, highlighting the potential to broader chemical research domains. By adhering to FAIR data standards and open-source principles, the tools developed for this dissertation are designed for adaptability and future development within the community

Digitale Bibliothek Thüringen

Patent Database: Their Importance in Prior Art Documentation and Patent Search

Author: Chakraborty Kajal
Singh Vikram
Vincent Lavina
Publication venue
Publication date: 01/01/2016
Field of study

In knowledge based economies the nation’s economic status depends on the production, distribution and use of knowledge and information. The recent trend in the economic growth of nations is mainly determined by innovative technological knowhow of the individuals. Intellectual property has gained attention in this era of knowledge. The vast amount of data generated through the application of intellectual assets is managed with the help of various in- silico tools. In recent days, the patent databases have gained importance due to the detailed information available on the granted patent and other details, such as, legal status of the patent applications, which are not available through any other literature search. This review paper attempts to describe different types of patent databases available, their unique features, strengths, weakness and their major purpose. This paper details the information on how to access a patent database, the relevance of patent information obtained from these databases in prior art search, patent analysis, and the drawbacks present in these patent databases

CMFRI Digital Repository

NOPR

Information retrieval and text mining technologies for chemistry

Author: Abacha A. B.
Alberts D.
Alfonso Valencia
American Chemical Society
Anália Lourenço
Aphinyanaphongs Y.
Appelt D. E.
Aramaki E.
Aronson A. R.
Asahara M.
Babych B.
Baeza-Yates R.
Bambenek J.
Barnard J. M.
Bast H.
Batista-Navarro R.
Batista-Navarro R. T.
Bian J.
Bies A.
Bikel D. M.
Blaschke C.
Brecher J. S.
Brill E.
Bunescu R.
Bunescu R. C.
Califf M. E.
Carpenter B.
Caruana R.
Chee B. W.
Chhieng D.
Chinchor N.
Chiticariu L.
Chowdhury M. F. M.
Chowdhury M. F. M.
Ciravegna F.
Cleverdon C. W.
Coden A.
Cohen R.
Collier N.
Corbett P.
Corbett P.
Cover T. M.
Craven M.
Cummings M. D.
Currano J. N.
Currano J. N.
Currano J. N.
Currano J. N.
Cutting D. R.
Davis C. H.
Dieb T. M.
Dieb T. M.
Dogan R. I.
Downs G. M.
Dunikowski L. G.
Embarek M.
Eom J.-H.
Faber J.
Fall C. J.
Fattore M.
Fennell R. W.
Freund Y.
Fujiyoshi A.
Fukuda K.
Gale W. A.
Garcelon N.
Garnier J.-P.
Garten Y.
Ginn R.
Giuliano C.
Gold S.
Grefenstette G.
Grishman R.
Gurulingappa H.
Gurulingappa H.
Gusfield D.
He Y.
Hearst M. A.
Hersh W.
Hersh W.
Hirschman L.
Hobbs J. R.
Hodge G. M.
Holzinger A.
Hsueh P.-Y.
Huber T.
Iyer S. V
Jackson P.
Joachims T.
Johnson D.
Jonnalagadda S.
Jonnalagadda S.
Julen Oyarzabal
Jurafsky D.
Kaewphan S.
Kaewphan S.
Karkaletsis V.
Katragadda S.
Kazama J.
Kazawa H.
Kelly L.
Kenny P. W.
Kim J.-D.
Kim Y.
Kleene S. C.
Kolárik C.
Kongburan W.
Kornai A.
Kraaij W.
Krallinger M.
Krallinger M.
Krallinger M.
Kremer G.
Kreuzthaler M.
Kucera H.
Lai H.
Lawson A. J.
Leaman R.
Leaman R.
Lee C.-H.
Levenshtein V. I.
Levin M. A.
Li J.
Li N.
Li Y.
Liu X.
Locke W. N.
Lovins J. B.
Lowe D. M.
Lupu M.
Lupu M.
Mackenzie C. E.
Manning C. D.
Mansouri A.
Martin E.
Martin Krallinger
Mattmann C.
Maynard D.
McCallum A.
McEwen L.
McKnight L.
McNaught A.
Meystre S. M.
Michalski S. R.
Michie D.
Mihalcea R.
Mitton R.
Miwa M.
Mollá D.
Murray-Rust P.
Müller B.
Nebel A.
Nikfarjam A.
Névéol A.
Névéol A.
Obdulia Rabal
Pang B.
Panico R.
Perez-Iratxeta C.
Ponomareva N.
Ratinov L.
Ratnaparkhi A.
Read J.
Rebholz-Schuhmann D.
Reeker L. H.
Rocchio J. J.
Rohbeck H.-G.
Rosario B.
Roth D. L.
Rupp C. J.
Rupp C. J.
Sagae K.
Salim N.
Salton G.
Sanchez-Cisneros D.
Saracevic T.
Sasaki Y.
Schapire R. E.
Schenck R.
Schenck R. J.
Schlaf A.
Schuemie M. J.
Segura Bedmar I.
Segura-Bedmar I.
Sekine S.
Sequeira E.
Settles B.
Settles B.
Sewell W.
Shen D.
Shidha M. V
Singhal A.
Smith E. G.
Stamatatos E.
Sutton C.
Sætre R.
Taylor K. T.
Tharatipyakul A.
Tomanek K.
Tomanek K.
Tsuruoka Y.
Tsuruoka Y.
Täger W.
Urbain J.
van Rijsbergen C. J.
Vapnik V. N.
Vasserman A.
Visweswaran S.
Voorhees E. M.
Wang W.
Wang Y.
Wei C.-H.
Wei C.-H.
Wermter J.
Wilbur W. J.
Willett P.
Willett P.
Williams A. J.
Witten I. H.
Workman M. L.
Wrublewski D. T.
Xu R.
Xue N.
Yan S.
Yang C.
Yang C. C.
Yang Y.
Zass E.
Zipf G. K.
Zipf G. K.
Zitnik S.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2017
Field of study

Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo Garciá -Yoldi for useful feedback and discussions during the preparation of the manuscript.info:eu-repo/semantics/publishedVersio

Universidade do Minho: RepositoriUM

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Science Inside Law: The Making of a New Patent Class in the International Patent Classification

Author: Kang Hyo Yoon
Publication venue
Publication date: 02/08/2017
Field of study

Recent studies of patents have argued that the very materiality and techniques of legal media, such as the written patent document, are vital for the legal construction of a patentable invention. Developing the centrality placed on patent documents further, it becomes important to understand how these documents are ordered and mobilized. Patent classification answers the necessity of making the virtual nature of textual claims practicable by linking written inscription to bureaucracy. Here, the epistemological organization of documents overlaps with the grid of patent administration. How are scientific inventions represented in such a process? If we examine the process of creating a new patent category within the International Patent Classification (IPC), it becomes clear that disagreements about the substance of the novel inventive subject matter have been resolved by computer simulations of patent documents in draft classifications. The practical needs of patent examiners were the most important concerns in the making of a new category. Such a lack of epistemological mediation between the scientific and legal identities of an invention depicts a legal understanding that science is already inside patent law. From an internal legal perspective, the self-referential introduction of the new patent category may make practical sense; however it becomes problematic from a technological and scientific standpoint as the remit of the patent classification also affects other social contexts and practice

RERO DOC Digital Library

Towards Inference of a Biochemical Ontology From a Metabolic Database

Author: Brooksbank
Corey
Feldman
Goto
Holcomb Wise
Krieger
The Gene Ontology Consortium.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2005
Field of study

In order to predict the metabolic fate of an arbitrary compound based solely on structure, it is useful to be able to identify substructural ‘functional groups’ that are biochemically reactive. These functional groups are the substructural elements that can be removed and replaced to transform one compound into another. This problem of identifying functional groups is related to the problem of classifying compounds. The research presented here discusses the state of the art in biochemical databases and how these sources may be applied to the problem of classifying compounds based solely on structure. We describe a biochemical informatics system for processing molecular data and describe how 100 255 compositional (hasA) relationships are inferred between 835 abstractions and 9500 metabolites from the KEGG Ligand database. Specifically, we focus on the identification of amino acids and consider ways in which the inference of biochemical ontologies for metabolites will be improved in the future

Crossref

PubMed Central

Recommended from our members

Chemical Information Bulletin

Author: American Chemical Society. Division of Chemical Information.
Currano Judith N.
Publication venue: American Chemical Society. Division of Chemical Information.
Publication date
Field of study

Created as a supplement for "the regular journals of the American Chemical Society," this publication contains annotated bibliographies of chemical documentation literature as well as information about meetings, conferences, awards, scholarships, and other news from the American Chemical Society (ACS) Division of Chemical Information (CINF)

UNT Digital Library