Search CORE

1,498 research outputs found

Inhibition in multiclass classification

Author: Bottou L.
Chang Y.-W.
Charles Elkan
Dempster A. P.
José M. Amigó
Kivinen J.
LeCun Y.
Lugosi G.
Platt J. C.
Ramón Huerta
Rifkin R.
Shankar Vembu
Smith B. H.
Tewari A.
Thomas Nowotny
Tsochantaridis I.
Weston J.
Publication venue: 'MIT Press - Journals'
Publication date: 01/09/2012

The role of inhibition is investigated in a multiclass support vector machine formalism inspired by the brain structure of insects. The so-called mushroom bodies have a set of output neurons, or classification functions, that compete with each other to encode a particular input. Strongly active output neurons depress or inhibit the remaining outputs without knowing which is correct or incorrect. Accordingly, we propose to use a classification function that embodies unselective inhibition and train it in the large margin classifier framework. Inhibition leads to more robust classifiers in the sense that they perform better over larger regions of hyperparameter space when assessed with leave-one-out strategies. We also show that the classifier with inhibition is a tight bound to probabilistic exponential models and is Bayes consistent for 3-class problems. These properties make this approach useful for data sets with a limited number of labeled examples. For larger data sets, there is no significant advantage over other multiclass SVM approaches.

Crossref

PubMed Central

Sussex Research Online

Crossref

Directory of Open Access Journals

Red de Bibliotecas Virtuales de Ciencias Sociales de América Latina y El Caribe

DIALNET

Sussex Research Online

Repositorio de Objetos de Docencia e Investigación de la Universidad de Cádiz

idUS. Depósito de Investigación Universidad de Sevilla

Unconventional machine learning of genome-wide human cancer data

Author: Bajaj Sweta R.
Chittenden Thomas W.
Cilfone Nicholas
Gamel Omar E.
Gujja Sharvari
Gulcher Jeffrey R.
Li Richard Y.
Lidar Daniel A.
Publication date: 13/05/2020

Recent advances in high-throughput genomic technologies coupled with exponential increases in computer processing and memory have allowed us to interrogate the complex aberrant molecular underpinnings of human disease from a genome-wide perspective. While the deluge of genomic information is expected to increase, a bottleneck in conventional high-performance computing is rapidly approaching. Inspired in part by recent advances in physical quantum processors, we evaluated several unconventional machine learning (ML) strategies on actual human tumor data. Here we show for the first time the efficacy of multiple annealing-based ML algorithms for classification of high-dimensional, multi-omics human cancer data from The Cancer Genome Atlas. To assess algorithm performance, we compared these classifiers to a variety of standard ML methods. Our results indicate the feasibility of using annealing-based ML to provide competitive classification of human cancer types and associated molecular subtypes, and superior performance with smaller training datasets, thus providing compelling empirical evidence for the potential future application of unconventional computing architectures in the biomedical sciences.

arXiv.org e-Print Archive

Directory of Open Access Journals

Development of Machine Learning Models for Generation and Activity Prediction of the Protein Tyrosine Kinase Inhibitors

Author: Kassab Ryan
Publication venue: Chapman University Digital Commons
Publication date: 01/08/2022

The field of computational drug discovery and development continues to grow at a rapid pace, using generative machine learning approaches to present us with solutions to high dimensional and complex problems in drug discovery and design. In this work, we present a platform of Machine Learning based approaches for generation and scoring of novel kinase inhibitor molecules. We utilized a binary Random Forest classification model to develop a Machine Learning based scoring function to evaluate the generated molecules on Kinase Inhibition Likelihood. By training the model on several chemical features of each known kinase inhibitor, we were able to create a metric that captures the differences between a SRC Kinase Inhibitor and a non-SRC Kinase Inhibitor. We implemented the scoring function into a Biased and Unbiased Bayesian Optimization framework to generate molecules based on features of SRC Kinase Inhibitors. We then used similarity metrics such as Tanimoto Similarity to assess their closeness to that of known SRC Kinase Inhibitors. The molecules generated from this experiment demonstrated potential for belonging to the SRC Kinase Inhibitor family though chemical synthesis would be needed to confirm the results. The top molecules generated from the Unbiased and Biased Bayesian Optimization experiments were calculated to respectively have Tanimoto Similarity scores of 0.711 and 0.709 to known SRC Kinase Inhibitors. With calculated Kinase Inhibition Likelihood scores of 0.586 and 0.575, the top molecules generated from the Bayesian Optimization demonstrate a disconnect between the similarity scores to known SRC Kinase Inhibitors and the calculated Kinase Inhibition Likelihood score. It was found that implementing a bias into the Bayesian Optimization process had little effect on the quality of generated molecules. 
In addition, several molecules generated from the Bayesian Optimization process were sent to the School of Pharmacy for chemical synthesis, which gives the experiment more concrete results. The results of this study demonstrated that generating molecules through Bayesian Optimization techniques could aid in the generation of molecules for a specific kinase family, but further expansions of the techniques would be needed for substantial results.
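
The Tanimoto Similarity mentioned in this abstract is, in its usual form, the size of the intersection of two fingerprint bit sets divided by the size of their union. The sketch below only illustrates that formula. The fingerprints are made-up sets of bit indices, the function name `tanimoto` is hypothetical, and nothing here reflects the fingerprint type or tooling the thesis actually used.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = fp_a | fp_b
    if not union:
        # Both fingerprints are empty; treating them as identical is a
        # convention chosen for this sketch, not something from the source.
        return 1.0
    return len(fp_a & fp_b) / len(union)


# Hypothetical fingerprints: indices of the bits that are set.
known_inhibitor = {1, 4, 7, 9, 12}
generated_molecule = {1, 4, 9, 15}

# 3 shared bits out of 6 distinct bits overall.
score = tanimoto(known_inhibitor, generated_molecule)
print(f"Tanimoto similarity: {score:.3f}")
```

A score of 1.0 means the two bit sets are identical and 0.0 means they share no bits, so the reported values of about 0.71 indicate substantial but incomplete overlap with known inhibitors.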

Chapman University Digital Commons

Computer modelling of metabolic adaptions during mitochondrial dysfunction and machine learning to predict novel mitochondrial disease genes

Author: Smith Alexander Gary
Publication venue: University of Cambridge
Publication date: 26/09/2019

Mitochondria are organelles found in almost every eukaryote and are primarily responsible for generating chemical energy in the form of adenosine triphosphate. This thesis investigates two main causes of mitochondrial dysfunction: mitochondrial toxicity arising from side-effects of drugs; and mitochondrial diseases arising from defects in nuclear-encoded genes. Novel chemical entities being developed as drug leads are screened for cellular toxicity in which mitochondrial dysfunction is a major cause. However, our lack of understanding of the metabolic adaptations to mitochondrial dysfunction limits the accurate screening of mitochondrial dysfunction for pharmaceutical companies, thus preventing potentially useful drugs from being developed. To further our understanding of these adaptations, I analysed a large-scale metabolomics data set of rats administered a known mitochondrial complex III inhibitor. The analyses revealed many perturbed pathways which can be exploited as biomarkers of mild mitochondrial dysfunction, a condition which is currently clinically undetectable during the drug development process. To direct future studies on mitochondrial dysfunction, a multi-organ model of mitochondrial metabolism was generated and used to simulate inhibition of the mitochondrial respiratory complexes. The simulations of complex III inhibition accurately predicted many of the metabolite behaviours identified in the metabolomics analyses and provided theories for their significance. Simulations of the other complexes’ inhibitions identified many unique behaviours which can be used to direct future studies, studies which would greatly improve our understanding of the metabolic adaptations and provide higher confidence biomarkers. Mitochondrial dysfunction is linked to many late onset diseases such as Parkinson’s, and inborn errors of mitochondrial metabolism cause severe neurological and physiological diseases. 
Patients with suspected mitochondrial disease have their DNA sequenced and analysed. Diagnosis of mitochondrial disease by sequencing requires knowledge of the mitochondrial proteome, which is currently incomplete. A predicted mitochondrial proteome was generated using a support vector machine trained using the abundance of protein localisation data available in the MitoMiner database. The support vector machine identified 442 novel mitochondrial proteins. The success rate of diagnosing mitochondrial disease using sequencing is currently limited by our inability to filter and prioritise a patient’s DNA variants. Patients who do not have a variant in one of the already known mitochondrial disease genes are usually left with hundreds of potential disease-causing variants. A probability of being disease-causing for each gene in the mitochondrial proteome was generated using two trained neural networks. The networks were trained on a large number of different data sources for differentiating mitochondrial disease genes, including protein-protein interaction network metrics, gene tissue expression and protein evolution. The predicted probabilities allow for better filtering and prioritisation of a patient’s variants for candidate disease-causing genes to be experimentally verified. The predicted mitochondrial proteome and the predicted disease-causing probabilities are currently used in an NGS analysis pipeline at the MRC Mitochondrial Biology Unit for diagnosing mitochondrial disease patient samples.

Apollo (Cambridge)

TIMMA-R : an R package for predicting synergistic multi-targeted drug combinations in cancer cell lines or patient-derived samples

Author: Al-Lazikani
Barretina
Davis
Duşa
Gaulton
Halling-Brown
Hopkins
Jing Tang
Krister Wennerberg
Liye He
Lê
Pemovska
Sun
Tang
Tang
Tero Aittokallio
Tyner
Vempati
Yang
Zhao
Publication date: 31/01/2015

Network pharmacology-based prediction of multi-targeted drug combinations is becoming a promising strategy to improve anticancer efficacy and safety. We developed a logic-based network algorithm, called Target Inhibition Interaction using Maximization and Minimization Averaging (TIMMA), which predicts the effects of drug combinations based on their binary drug-target interactions and single-drug sensitivity profiles in a given cancer sample. Here, we report the R implementation of the algorithm (TIMMA-R), which is much faster than the original MATLAB code. The major extensions include modeling of multiclass drug-target profiles and network visualization. We also show that the TIMMA-R predictions are robust to the intrinsic noise in the experimental data, thus making it a promising high-throughput tool to prioritize drug combinations in various cancer types for follow-up experimentation or clinical applications.

Crossref

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Machine learning of visual object categorization: an application of the SUSTAIN model

Author: Cangelosi A
Carmantini GS
Wills AJ
Publication venue: Austin, TX
Publication date: 12/08/2014

Formal models of categorization are psychological theories that try to describe the process of categorization in a lawful way, using the language of mathematics. Their mathematical formulation makes it possible for the models to generate precise, quantitative predictions. SUSTAIN (Love, Medin & Gureckis, 2004) is a powerful formal model of categorization that has been used to model a range of human experimental data, describing the process of categorization in terms of an adaptive clustering principle. Love et al. (2004) suggested a possible application of the model in the field of object recognition and categorization. The present study explores this possibility, investigating at the same time the utility of using a formal model of categorization in a typical machine learning task. The image categorization performance of SUSTAIN on a well-known image set is compared with that of a linear Support Vector Machine, confirming that SUSTAIN can perform image categorization with reasonable accuracy, albeit at a rather high computational cost.

Plymouth Electronic Archive and Research Library

Leukemia multiclass assessment and classification from Microarray and RNA-seq technologies integration at gene expression level

Author: Caba Pérez Octavio
Castillo Secilla Daniel
Gálvez Gómez Juan Manuel
Herrera Maldonado Luis Javier
Prados Salazar José Carlos
Rojas Ruiz Fernando José
Rojas Ruiz Ignacio
Valenzuela Cansino Olga
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

In recent years, the number of available biological experiments has increased significantly due to the widespread use of massive sequencing data. Furthermore, continuous developments in machine learning and high-performance computing are allowing faster and more efficient analysis and processing of this type of data. However, biological information about a given disease is normally scattered, owing to the use of different sequencing technologies and different manufacturers in different experiments over the years and around the world. Thus, it is now of paramount importance to integrate biologically related data correctly in order to achieve genuine benefits from them. For this purpose, this work presents an integration of multiple Microarray and RNA-seq platforms, which has led to the design of a multiclass study by collecting samples from the four main types of leukemia, quantified at the gene expression level. Subsequently, in order to find a set of differentially expressed genes with the highest discernment capability among different types of leukemia, an innovative parameter referred to as coverage is presented here. This parameter allows assessing the number of different pathologies that a given gene is able to discern. It has been evaluated together with other widely known parameters by means of an ANOVA statistical test, which corroborated its filtering power when the identified genes are subjected to a machine learning process at multiclass level. The optimal tuning of the gene extraction parameters by means of this statistical test led to the selection of 42 highly relevant expressed genes. Using the minimum-Redundancy Maximum-Relevance (mRMR) feature selection algorithm, these genes were reordered and assessed with four different classification techniques. Outstanding results were achieved by taking only the first ten genes of the ranking into consideration.
Finally, specific literature was consulted on this last subset of genes, revealing that practically all of them are associated with biological processes related to leukemia. In light of these results, this study underlines the relevance of considering a new parameter which facilitates the identification of highly valid expressed genes for simultaneously discerning multiple types of leukemia. This work was supported by Project TIN2015-71873-R (Spanish Ministry of Economy and Competitiveness -MINECO- and the European Regional Development Fund -ERDF) and Junta de Andalucía (P12–TIC–2082).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Repositorio Institucional Universidad de Granada

Fondo Bibliográfico Digital Institucional

Multiclass methods in the analysis of metabolomic datasets: the example of raspberry cultivar volatile compounds detected by GC-MS and PTR-MS

Author: Aprea Eugenio
Biasioli Franco
Cappellin Luca
Gasperi Flavia
Granitto Pablo Miguel
Romano Andrea
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

Multiclass sample classification and marker selection are cutting-edge problems in metabolomics. In the present study we address the classification of 14 raspberry cultivars having different levels of gray mold (Botrytis cinerea) susceptibility. We characterized raspberry cultivars by two headspace analysis methods, namely solid-phase microextraction/gas chromatography–mass spectrometry (SPME/GC–MS) and proton transfer reaction-mass spectrometry (PTR-MS). Given the high number of classes, advanced data mining methods are necessary. Random Forest (RF), Penalized Discriminant Analysis (PDA), Discriminant Partial Least Squares (dPLS) and Support Vector Machine (SVM) have been employed for cultivar classification, and Random Forest-Recursive Feature Elimination (RF-RFE) has been used to perform feature selection. In particular, the most important GC–MS and PTR-MS variables related to gray mold susceptibility of the selected raspberry cultivars have been investigated. Moving from GC–MS profiling to the more rapid and less invasive PTR-MS fingerprinting leads to a cultivar characterization which is still related to the corresponding Botrytis susceptibility level, and therefore marker identification is still possible.
Affiliation: Cappellin, Luca. Fondazione Edmund Mach. Research and Innovation Centre; Italy
Affiliation: Aprea, Eugenio. Fondazione Edmund Mach. Research and Innovation Centre; Italy
Affiliation: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y Sistemas; Argentina
Affiliation: Romano, Andrea. Fondazione Edmund Mach. Research and Innovation Centre; Italy
Affiliation: Gasperi, Flavia. Fondazione Edmund Mach. Research and Innovation Centre; Italy
Affiliation: Biasioli, Franco. Fondazione Edmund Mach. Research and Innovation Centre; Italy

CONICET Digital

Archivio istituzionale della ricerca - Fondazione Edmund Mach

Archivio istituzionale della ricerca - Università di Padova