    Industry-scale application and evaluation of deep learning for drug target prediction

    Artificial intelligence (AI) is undergoing a revolution thanks to breakthroughs of machine learning algorithms in computer vision, speech recognition, natural language processing and generative modelling. Recent work on publicly available pharmaceutical data showed that AI methods are highly promising for drug target prediction. However, the quality of public data may differ from that of industry data, owing to measurements being reported by different labs, different measurement techniques, fewer samples, and less diverse and specialized assays. As part of a European-funded project (ExCAPE), which brought together expertise from the pharmaceutical industry, machine learning, and high-performance computing, we investigated how well machine learning models obtained from public data transfer to internal pharmaceutical industry data. Our results show that machine learning models trained on public data can indeed maintain their predictive power to a large degree when applied to industry data. Moreover, we observed that deep-learning models outperformed comparable models trained with other machine learning algorithms when applied to internal pharmaceutical company datasets. To our knowledge, this is the first large-scale study evaluating the potential of machine learning, and especially deep learning, directly in an industry-scale setting, and the first to investigate the transferability of publicly learned target prediction models to industrial bioactivity prediction pipelines.
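
    A minimal sketch of the multi-task set-up this line of work typically uses (one shared network over compound fingerprints, one binary-activity output per protein target) is given below. The class and helper names, layer sizes, and the use of PyTorch are illustrative assumptions, not the authors' published architecture.

```python
# A hedged sketch, assuming PyTorch and ECFP-style binary fingerprints
# as input; not the paper's exact model.
import torch
import torch.nn as nn

class MultiTaskTargetNet(nn.Module):
    """Shared trunk over fingerprints, one activity logit per target,
    as commonly used for fingerprint-based target prediction."""
    def __init__(self, n_features=2048, n_targets=500, hidden=1024):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.5),
        )
        self.head = nn.Linear(hidden, n_targets)  # one logit per target

    def forward(self, x):
        return self.head(self.trunk(x))  # raw logits; sigmoid at eval time

def masked_bce(logits, labels, mask):
    """Most compound/target pairs are unmeasured, so only score the
    labels that actually exist (mask = 1 where a label is present)."""
    loss = nn.functional.binary_cross_entropy_with_logits(
        logits, labels, reduction="none")
    return (loss * mask).sum() / mask.sum()
```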

    Using Machine Learning as a Surrogate Model for Agent-Based Simulations

    In this proof-of-concept work, we evaluate the performance of multiple machine-learning methods as surrogate models for use in the analysis of agent-based models (ABMs). Analysing agent-based modelling outputs can be challenging, as the relationships between input parameters and outputs can be non-linear or even chaotic, even in relatively simple models, and each model run can require significant CPU time. Surrogate modelling, in which a statistical model of the ABM is constructed to facilitate detailed model analyses, has been proposed as an alternative to computationally costly Monte Carlo methods. Here we compare multiple machine-learning methods for ABM surrogate modelling in order to determine which approaches are best suited to emulating the complex behaviour of ABMs. Our results suggest that, in most scenarios, artificial neural networks (ANNs) and gradient-boosted trees outperform Gaussian process surrogates, currently the most commonly used method for the surrogate modelling of complex computational models. ANNs produced the most accurate model replications in scenarios with high numbers of model runs, although their training times were longer than those of the other methods. We propose that agent-based modelling would benefit from using machine-learning methods for surrogate modelling, as this can facilitate more robust sensitivity analyses while also reducing the CPU time needed to calibrate and analyse the simulation.
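
    The comparison can be sketched as: sample the parameter space, label each point with one (costly) model run, then fit cheap regressors to emulate the ABM and score them on held-out points. The `run_abm` stand-in, the scikit-learn estimators, and the sample sizes below are illustrative assumptions, not the paper's experimental set-up.

```python
# A hedged sketch, assuming scikit-learn; `run_abm` is a cheap stand-in
# for an expensive agent-based model run.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

def run_abm(params):
    # Stand-in for the simulator: nonlinear response plus noise.
    x, y = params
    return np.sin(3 * x) * y + 0.1 * rng.standard_normal()

# Sample the parameter space and label each point with one model run.
X = rng.uniform(0, 1, size=(500, 2))
y = np.array([run_abm(p) for p in X])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit two candidate surrogates and compare held-out accuracy.
for name, model in [("GBT", GradientBoostingRegressor()),
                    ("GP", GaussianProcessRegressor())]:
    model.fit(X_tr, y_tr)
    print(name, r2_score(y_te, model.predict(X_te)))
```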

    Improving Language Modelling with Noise-Contrastive Estimation

    Neural language models do not scale well when the vocabulary is large. Noise-contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, it was considered an unsuccessful approach for language modelling, and a thorough investigation of the hyperparameters of NCE-based neural language models was missing. In this paper, we show that NCE can be a successful approach to neural language modelling when the network's hyperparameters are tuned appropriately. We introduce the 'search-then-converge' learning rate schedule for NCE and design a heuristic that specifies how to use this schedule. We also demonstrate the impact of other important hyperparameters, such as the dropout rate and the weight-initialisation range. With appropriate tuning, NCE-based neural language models outperform state-of-the-art single-model methods on a popular benchmark.
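
    The 'search-then-converge' idea is usually attributed to Darken and Moody: hold the learning rate near its initial value early in training (the "search" phase), then let it decay roughly as 1/t (the "converge" phase). A minimal sketch of that classic form follows; the paper's exact variant and its heuristic for choosing the switch point are not reproduced here.

```python
# A hedged sketch of the classic search-then-converge schedule
# (Darken & Moody form): lr(t) = lr0 / (1 + t / tau).
def search_then_converge(lr0, t, tau):
    """Roughly constant near lr0 while t << tau, then decays
    approximately as lr0 * tau / t once t >> tau."""
    return lr0 / (1.0 + t / tau)

# Example: lr0 = 1.0, switch point tau around epoch 10.
for epoch in range(0, 40, 10):
    print(epoch, round(search_then_converge(1.0, epoch, 10), 3))
```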

    An Open Source C++ Implementation of Multi-Threaded Gaussian Mixture Models, k-Means and Expectation Maximisation

    Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved by reformulating the EM and k-means algorithms into a MapReduce-like framework. Furthermore, the implementation uses several techniques to improve numerical stability and modelling accuracy. We demonstrate that the multi-threaded implementation achieves a speedup of an order of magnitude on a recent 16-core machine, and that it can achieve higher modelling accuracy than a previously well-established, publicly accessible implementation. The multi-threaded implementation is included as a user-friendly class in recent releases of the open-source Armadillo C++ linear algebra library. The library is provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
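
    The MapReduce-like reformulation can be sketched as: each thread computes the E-step sufficient statistics over its own chunk of the data (map), the partial statistics are summed (reduce), and the parameters are updated once from the totals. The NumPy/threading sketch below illustrates this decomposition for one EM iteration of a diagonal-covariance GMM; it is a simplified stand-in, not the Armadillo implementation itself.

```python
# A hedged sketch of MapReduce-style EM for a diagonal-covariance GMM.
# NumPy releases the GIL inside large array operations, so threads can
# overlap usefully here; the Armadillo code uses C++ threads instead.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partial_stats(chunk, weights, means, variances):
    """Map step: responsibilities and sufficient statistics for one chunk.
    chunk: (n, d); weights: (k,); means, variances: (k, d)."""
    diff = chunk[:, None, :] - means[None, :, :]
    log_p = -0.5 * (np.sum(diff**2 / variances, axis=2)
                    + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_p += np.log(weights)
    r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)              # responsibilities (n, k)
    return r.sum(0), r.T @ chunk, r.T @ chunk**2   # (Nk, Sx, Sxx)

def em_step(data, weights, means, variances, n_threads=4):
    chunks = np.array_split(data, n_threads)
    with ThreadPoolExecutor(n_threads) as pool:
        parts = list(pool.map(
            lambda c: partial_stats(c, weights, means, variances), chunks))
    # Reduce step: sum the partial statistics, then update parameters once.
    Nk = sum(p[0] for p in parts)
    Sx = sum(p[1] for p in parts)
    Sxx = sum(p[2] for p in parts)
    means_new = Sx / Nk[:, None]
    vars_new = Sxx / Nk[:, None] - means_new**2 + 1e-6  # variance floor
    return Nk / len(data), means_new, vars_new
```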