171 research outputs found

Last rolls of the yoyo: Assessing the human canonical protein count

Author: Southan Christopher
Publication venue: 'F1000 Research Ltd'
Publication date: 01/04/2017
In 2004, when the protein estimate from the finished human genome was only 24,000, the surprise was compounded as reviewed estimates fell to 19,000 by 2014. However, variability in the total canonical protein counts (i.e. excluding alternative splice forms) of open reading frames (ORFs) in different annotation portals persists. This work assesses these differences and possible causes. A 16-year analysis of Ensembl and UniProtKB/Swiss-Prot shows convergence to a protein number of ~20,000. The former had shown some yo-yoing, but both have now plateaued. Nine major annotation portals, reviewed at the beginning of 2017, gave a spread of counts from 21,819 down to 18,891. The 4-way cross-reference concordance (within UniProt) between Ensembl, Swiss-Prot, Entrez Gene and the Human Gene Nomenclature Committee (HGNC) drops to 18,690, indicating methodological differences in protein definitions and experimental existence support between sources. The Swiss-Prot and neXtProt evidence criteria include mass spectrometry peptide verification and also cross-references for antibody detection from the Human Protein Atlas. Notwithstanding, hundreds of Swiss-Prot entries are classified as non-coding biotypes by HGNC. The only inference that protein numbers might still rise comes from numerous reports of small ORF (smORF) discovery. However, while there have been recent cases of protein verifications from previous mis-annotation of non-coding RNA, very few have passed the Swiss-Prot curation and genome annotation thresholds. The post-genomic era has seen both advances in data generation and improvements in the human reference assembly. Notwithstanding, current numbers, while persistently discordant, show that the earlier yo-yoing has largely ceased.
Given the importance to biology and biomedicine of defining the canonical human proteome, the task will need more collaborative inter-source curation combined with broader and deeper experimental confirmation in vivo and in vitro of proteins predicted in silico. The eventual closure could well be below ~19,000.

Directory of Open Access Journals

Edinburgh Research Explorer

Retrieving GPCR data from public databases

Author: Southan Christopher
Publication venue: 'Elsevier BV'
Publication date: 01/10/2016
Crossref

Edinburgh Research Explorer

Expanding opportunities for mining bioactive chemistry from patents

Author: Bento
Christopher Southan
Filippov
Hilpert
Ihlenfeldt
Jefferson
Jensen
Liu
Okuno
Pawson
Southan
Suriyawongkul
Publication venue: 'Elsevier BV'
Publication date: 01/02/2015
Bioactive structures published in medicinal chemistry patents typically exceed those in papers by at least twofold and may precede them by several years. The Big-Bang of open automated extraction since 2012 has contributed to over 15 million patent-derived compounds in PubChem. While mapping between chemical structures, assay results and protein targets from patent documents is challenging, these relationships can be harvested using open tools and are beginning to be curated into databases.

Elsevier - Publisher Connector

Crossref

PubMed Central

Edinburgh Research Explorer

Challenges of connecting chemistry to pharmacology: perspectives from curating the IUPHAR/BPS Guide to PHARMACOLOGY

Author: Davies Jamie
Faccenda Elena
Harding Simon
Pawson Adam
Sharman Joanna
Southan Christopher
Publication venue: 'American Chemical Society (ACS)'
Publication date: 03/05/2018
Connecting chemistry to pharmacology (c2p) has been an objective of GtoPdb and its precursor IUPHAR-DB since 2003. This has been achieved by populating our database with expert-curated relationships between documents, assays, quantitative results, chemical structures, their locations within the documents and the protein targets in the assays (D-A-R-C-P). A wide range of challenges associated with this are described in this perspective, using illustrative examples from GtoPdb entries. Our selection process begins with judgements of pharmacological relevance and scientific quality. Even though we have a stringent focus for our small-data extraction we note that assessing the quality of papers has become more difficult over the last 15 years. We discuss ambiguity issues with the resolution of authors’ descriptions of A-R-C-P entities to standardised identifiers. We also describe developments that have made this somewhat easier over the same period both in the publication ecosystem as well as enhancements of our internal processes over recent years. This perspective concludes with a look at challenges for the future including the wider capture of mechanistic nuances and possible impacts of text mining on automated entity extraction.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

FigShare

Hydrolases in GtoPdb v.2023.1

Author: Alexander Stephen P.H.
Doherty Patrick
Fairlie David
Fowler Christopher J.
Overall Christopher M.
Rawlings Neil
Southan Christopher
Turner Anthony J.
Publication venue: 'Edinburgh University Library'
Publication date: 26/04/2023
Listed in this section are hydrolases not accumulated in other parts of the Concise Guide, such as monoacylglycerol lipase and acetylcholinesterase. Pancreatic lipase is the predominant mechanism of fat digestion in the alimentary system; its inhibition is associated with decreased fat absorption. CES1 is present at lower levels in the gut than CES2 (P23141), but predominates in the liver, where it is responsible for the hydrolysis of many aliphatic, aromatic and steroid esters. Hormone-sensitive lipase is also a relatively non-selective esterase associated with steroid ester hydrolysis and triglyceride metabolism, particularly in adipose tissue. Endothelial lipase is secreted from endothelial cells and regulates circulating cholesterol in high density lipoproteins.

IUPHAR/BPS Guide to Pharmacology CITE

Journal Hosting Service | The University of Edinburgh

Hydrolases (version 2019.4) in the IUPHAR/BPS Guide to Pharmacology Database

Author: Alexander Stephen P.H.
Doherty Patrick
Fairlie David
Fowler Christopher J.
Overall Christopher M.
Rawlings Neil
Southan Christopher
Turner Anthony J.
Publication venue: 'Edinburgh University Library'
Publication date: 16/09/2019
Crossref

IUPHAR/BPS Guide to Pharmacology CITE

Repository@Nottingham

Journal Hosting Service | The University of Edinburgh

SynPharm: a Guide to PHARMACOLOGY database tool for designing drug control into engineered proteins

Author: Davies Jamie
Dominguez Monedero Alazne
Harding Simon
Ireland Sam
Sharman Joanna
Southan Christopher
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/07/2018
Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

Inverse Pharmacology: approaches and tools for introducing druggability into engineered proteins

Author: Davies Jamie
Dominguez Monedero Alazne
Harding Simon
Ireland Sam
Sharman Joanna
Southan Christopher
Publication venue: 'Elsevier BV'
Publication date: 05/09/2019
Edinburgh Research Explorer

Small-molecule Bioactivity Databases

Author: Alex M. Clark
Antony J. Williams
Barry A. Bunin
Christopher Southan
Sean Ekins
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2016
Crossref

Edinburgh Research Explorer

Amino acid sequence of β-galactoside-binding bovine heart lectin. Member of a novel class of vertebrate proteins

Author: Abbott William M.
Aitken Alastair
Childs Robert A.
Feizi Ten
Southan Christopher
Publication venue: Published by Elsevier B.V.
A variety of animal tissues contain β-galactoside-binding lectins with molecular masses in the range 13–17 kDa. There is evidence that these lectins may constitute a new protein family although their function in vivo is not yet clear. In this work the major part of the amino acid sequence of the 13 kDa lectin from bovine heart muscle has been determined. Comparison of this sequence with the cDNA-deduced sequence published for the chick embryo skin lectin showed 58% homology. Comparison of the bovine lectin sequence with partial sequences from two cDNA clones from a human hepatoma library and partial amino acid sequences of human lung lectin showed 70, 40 and 85% homology, respectively. The sequences of these vertebrate lectins are thus clearly related, supporting earlier results of immunological cross-reactivity within this group of proteins. Computer searching of protein sequence databases did not detect significant homologies between the bovine lectin sequence and other known proteins.

Elsevier - Publisher Connector