Last rolls of the yoyo: Assessing the human canonical protein count
In 2004, when the protein estimate from the finished human genome was only 24,000, the surprise was compounded as reviewed estimates fell to 19,000 by 2014. However, variability in the total canonical protein counts (i.e. excluding alternative splice forms) of open reading frames (ORFs) in different annotation portals persists. This work assesses these differences and possible causes. A 16-year analysis of Ensembl and UniProtKB/Swiss-Prot shows convergence to a protein number of ~20,000. The former had shown some yo-yoing, but both have now plateaued. Nine major annotation portals, reviewed at the beginning of 2017, gave a spread of counts from 21,819 down to 18,891. The 4-way cross-reference concordance (within UniProt) between Ensembl, Swiss-Prot, Entrez Gene and the Human Gene Nomenclature Committee (HGNC) drops to 18,690, indicating methodological differences in protein definitions and experimental existence support between sources. The Swiss-Prot and neXtProt evidence criteria include mass spectrometry peptide verification and also cross-references for antibody detection from the Human Protein Atlas. Notwithstanding, hundreds of Swiss-Prot entries are classified as non-coding biotypes by HGNC. The only inference that protein numbers might still rise comes from numerous reports of small ORF (smORF) discovery. However, while there have been recent cases of protein verifications from previous mis-annotation of non-coding RNA, very few have passed the Swiss-Prot curation and genome annotation thresholds. The post-genomic era has seen both advances in data generation and improvements in the human reference assembly. Notwithstanding, current numbers, while persistently discordant, show that the earlier yo-yoing has largely ceased.
Given the importance to biology and biomedicine of defining the canonical human proteome, the task will need more collaborative inter-source curation combined with broader and deeper experimental confirmation in vivo and in vitro of proteins predicted in silico. The eventual closure could well be below ~19,000.
Expanding opportunities for mining bioactive chemistry from patents
Bioactive structures published in medicinal chemistry patents typically exceed those in papers by at least twofold and may precede them by several years. The Big-Bang of open automated extraction since 2012 has contributed to over 15 million patent-derived compounds in PubChem. While mapping between chemical structures, assay results and protein targets from patent documents is challenging, these relationships can be harvested using open tools and are beginning to be curated into databases.
Challenges of connecting chemistry to pharmacology: perspectives from curating the IUPHAR/BPS Guide to PHARMACOLOGY
Connecting chemistry to pharmacology (c2p) has been an objective of GtoPdb and its precursor IUPHAR-DB since 2003. This has been achieved by populating our database with expert-curated relationships between documents, assays, quantitative results, chemical structures, their locations within the documents and the protein targets in the assays (D-A-R-C-P). A wide range of challenges associated with this are described in this perspective, using illustrative examples from GtoPdb entries. Our selection process begins with judgements of pharmacological relevance and scientific quality. Even though we have a stringent focus for our small-data extraction, we note that assessing the quality of papers has become more difficult over the last 15 years. We discuss ambiguity issues with the resolution of authors’ descriptions of A-R-C-P entities to standardised identifiers. We also describe developments that have made this somewhat easier over the same period, both in the publication ecosystem and through enhancements of our internal processes over recent years. This perspective concludes with a look at challenges for the future, including the wider capture of mechanistic nuances and possible impacts of text mining on automated entity extraction.
Hydrolases in GtoPdb v.2023.1
Listed in this section are hydrolases not accumulated in other parts of the Concise Guide, such as monoacylglycerol lipase and acetylcholinesterase. Pancreatic lipase is the predominant mechanism of fat digestion in the alimentary system; its inhibition is associated with decreased fat absorption. CES1 is present at lower levels in the gut than CES2 (P23141), but predominates in the liver, where it is responsible for the hydrolysis of many aliphatic, aromatic and steroid esters. Hormone-sensitive lipase is also a relatively non-selective esterase associated with steroid ester hydrolysis and triglyceride metabolism, particularly in adipose tissue. Endothelial lipase is secreted from endothelial cells and regulates circulating cholesterol in high density lipoproteins.
Hydrolases & Lipases in GtoPdb v.2023.3
Listed in this section are hydrolases not accumulated in other parts of the Concise Guide, such as monoacylglycerol lipase and acetylcholinesterase. Pancreatic lipase is the predominant mechanism of fat digestion in the alimentary system; its inhibition is associated with decreased fat absorption. CES1 is present at lower levels in the gut than CES2 (P23141), but predominates in the liver, where it is responsible for the hydrolysis of many aliphatic, aromatic and steroid esters. Hormone-sensitive lipase is also a relatively non-selective esterase associated with steroid ester hydrolysis and triglyceride metabolism, particularly in adipose tissue. Endothelial lipase is secreted from endothelial cells and regulates circulating cholesterol in high density lipoproteins.
Hydrolases (version 2019.4) in the IUPHAR/BPS Guide to Pharmacology Database
Listed in this section are hydrolases not accumulated in other parts of the Concise Guide, such as monoacylglycerol lipase and acetylcholinesterase. Pancreatic lipase is the predominant mechanism of fat digestion in the alimentary system; its inhibition is associated with decreased fat absorption. CES1 is present at lower levels in the gut than CES2 (P23141), but predominates in the liver, where it is responsible for the hydrolysis of many aliphatic, aromatic and steroid esters. Hormone-sensitive lipase is also a relatively non-selective esterase associated with steroid ester hydrolysis and triglyceride metabolism, particularly in adipose tissue. Endothelial lipase is secreted from endothelial cells and regulates circulating cholesterol in high density lipoproteins.
- …