QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction.
Affinity fingerprints report the activity of small molecules across a set of assays. They make it possible to gather information about the bioactivities of structurally dissimilar compounds, where models based on chemical structure alone are often limited, and to model complex biological endpoints such as human toxicity and in vitro cancer cell line sensitivity. Here, we propose to model in vitro compound activity using computationally predicted bioactivity profiles as compound descriptors. To this end, we apply and validate a framework for the calculation of QSAR-derived affinity fingerprints (QAFFP) using a set of 1360 QSAR models generated from Ki, Kd, IC50 and EC50 data in the ChEMBL database. QAFFP thus represent a method to encode and relate compounds on the basis of their similarity in bioactivity space. To benchmark the predictive power of QAFFP, we assembled IC50 data from the ChEMBL database for 18 diverse cancer cell lines widely used in preclinical drug discovery, and for 25 diverse protein target data sets. This study complements part 1, where the performance of QAFFP in similarity searching, scaffold hopping, and bioactivity classification is evaluated. Although QAFFP are inherently noisy, we show that using them as descriptors leads to prediction errors on the test set in the range of approximately 0.65-0.95 pIC50 units, which is comparable to the estimated uncertainty of bioactivity data in ChEMBL (0.76-1.00 pIC50 units). We find that the predictive power of QAFFP is slightly worse than that of Morgan2 fingerprints and 1D and 2D physicochemical descriptors, with an effect size in the range of 0.02-0.08 pIC50 units. Including QSAR models with low predictive power in the generation of QAFFP does not improve predictive power. Given that the QSAR models we used to compute the QAFFP were selected on the basis of data availability alone, we anticipate better modeling results for QAFFP generated using more diverse and biologically meaningful targets.
Data sets and Python code are publicly available at https://github.com/isidroc/QAFFP_regression
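The idea of using predicted bioactivity profiles as descriptors can be illustrated with a minimal sketch. This is not the code from the repository above: the names `affinity_fingerprint`, `make_linear_model` and `rmse` are hypothetical, and the three toy linear models stand in for the 1360 QSAR models described in the abstract.

```python
import math
from typing import Callable, List, Sequence

# Assumption: a "QSAR model" is any callable that maps a compound's
# structural features to one predicted activity value (e.g. a pIC50).
Model = Callable[[Sequence[float]], float]


def make_linear_model(weights: Sequence[float], bias: float) -> Model:
    """Build a toy linear model; a stand-in for a trained QSAR model."""
    def predict(features: Sequence[float]) -> float:
        return bias + sum(w * x for w, x in zip(weights, features))
    return predict


def affinity_fingerprint(features: Sequence[float],
                         models: Sequence[Model]) -> List[float]:
    """Return one predicted activity per model: the compound's fingerprint."""
    return [model(features) for model in models]


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed activities."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Three hypothetical models over two made-up structural features.
toy_models = [
    make_linear_model([0.5, 0.0], 5.0),
    make_linear_model([0.0, 1.0], 4.0),
    make_linear_model([0.25, 0.25], 6.0),
]

# The fingerprint is the vector of predictions across all models.
fingerprint = affinity_fingerprint([2.0, 1.0], toy_models)

# Prediction error of some downstream model, in the same units as the data.
error = rmse([6.0, 7.0], [6.0, 6.0])
```

In this sketch the fingerprint, rather than the structural features themselves, would be passed as the descriptor vector to a downstream regression model, and `rmse` corresponds to the kind of test-set error the abstract reports in pIC50 units.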
Polythematic Structured Subject Heading System and its potential in the area of indexing and searching web documents
NTK - National Technical Library, Czech Republic
PSH is one representative of knowledge organization systems and is therefore intended for the subject indexing of documents so that they can later be retrieved more efficiently. While subject indexing was in the past the domain of professionals (librarians), this kind of intellectual work is now also passing into the hands of ordinary users. In 2009, several significant changes took place in PSH. The subject heading system is now published in the semantic format Simple Knowledge Organisation System (SKOS), in accordance with linked data principles, under a Creative Commons license. SKOS serves to represent knowledge organization systems and is built on the W3C standards RDF (Resource Description Framework) and RDFS (RDF Schema) in order to make structured vocabularies available for the semantic web. Because globally unique URIs (Uniform Resource Identifiers) are used as identifiers, PSH headings can easily be referenced and linked with other information resources. In the case of PSH, this means that some headings carry links to the Library of Congress Subject Headings and DBpedia. One of the new features available in the PSH section of the National Technical Library website is a function for creating metadata snippets with PSH headings in the Dublin Core and Common Tag formats. Written as RDF in attributes (RDFa), these snippets can be embedded directly in the body of (X)HTML documents without affecting how the documents are displayed. At present, PSH serves not only as a means of subject selection of bibliographic records; its potential is also being developed in the area of indexing and providing access to web documents.
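As a rough illustration of the kind of metadata snippet described above, the following sketch builds an RDFa fragment that attaches subject headings to the current document through the Dublin Core `subject` term. It is not the NTK generator: the function name `rdfa_subject_snippet` and the `example.org` heading URIs are hypothetical, and the RDFa 1.1 `prefix` attribute is an assumption (a generator from 2009 would more likely have used RDFa 1.0 with `xmlns:` declarations).

```python
from html import escape
from typing import Iterable, Tuple

# Namespace of the Dublin Core terms vocabulary.
DCTERMS_NS = "http://purl.org/dc/terms/"


def rdfa_subject_snippet(headings: Iterable[Tuple[str, str]]) -> str:
    """Return an HTML fragment linking the current document to headings.

    Each heading is a (uri, label) pair. The empty `about` attribute
    refers to the document that contains the fragment.
    """
    links = []
    for uri, label in headings:
        # rel + href yields the triple: <document> dcterms:subject <uri>.
        links.append(
            f'<a rel="dcterms:subject" href="{escape(uri, quote=True)}">'
            f"{escape(label)}</a>"
        )
    body = ", ".join(links)
    return f'<div prefix="dcterms: {DCTERMS_NS}" about="">{body}</div>'


# Hypothetical heading URI and label, for illustration only.
snippet = rdfa_subject_snippet(
    [("http://example.org/psh/1", "Chemistry & physics")]
)
```

The RDFa attributes (`prefix`, `about`, `rel`) only annotate the markup, so a browser renders the fragment as ordinary links while an RDFa-aware parser can extract the subject statements.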
Integration of an Automatic Indexing System within the Document Flow of a Grey Literature Repository
The Web empowered the authors of grey literature to publish their work on their own. In the case of self-published works, the author is also the indexer. Because few grey literature authors are professional indexers, this may result in poor indexing or none at all. Even though the Web made publishing easier, indexing is still hard. Nevertheless, we believe that web technologies and machine learning algorithms may help to reduce the cognitive overhead involved in indexing, and eventually make it as easy as publishing on the Web.
Committee for Coordination of Polythematic Structured Subject Heading System
The presentation recapitulates the activities of the PSH department in 2011. It focuses on cooperation between the PSH department and Wikipedia in mutual linking, on the results of an analysis of the search log of the NTK catalogue, and on the new services PSH Manager online and an interface for automatically indexing documents with PSH headings. The presentation also covers the promotion of PSH in 2011, sets out the plan of activities for 2012, and summarizes the main points of the Concept for the Development of the Polythematic Structured Subject Heading System (PSH).
Will the Chemical Probes Please Stand Up?
This work provides a uniquely comprehensive and comparative overview of chemical probe sources and targets. This will be maintained and expanded for experts and non-informaticians seeking chemical probes to use in their work. We have analysed 940 experimental and 3,670 calculated probe candidates. Together these have evidence of specific binding for 796 human proteins across the target classes. We have flagged unsuitable (i.e. potentially misleading and resource-wasting) compounds from both probe groups. Compared to ChEMBL approved drugs, probes tend to be larger and more complex structures.
How to automatically index documents with Polythematic Structured Subject Headings System
The lecture addresses the question of how to automatically assign headings of the Polythematic Structured Subject Heading System (PSH) to documents.