2,613 research outputs found

    White box radial basis function classifiers with component selection for clinical prediction models

    Objective: To propose a new flexible and sparse classifier that results in interpretable decision support systems. Methods: Support vector machines (SVMs) for classification are very powerful methods to obtain classifiers for complex problems. Although the performance of these methods is consistently high, and non-linearities and interactions between variables can be handled efficiently when using non-linear kernels such as the radial basis function (RBF) kernel, their use in domains where interpretability is an issue is hampered by their lack of transparency. Many feature selection algorithms have been developed to allow for some interpretation, but the impact of the different input variables on the prediction still remains unclear. Alternative models using additive kernels are restricted to main effects, reducing their usefulness in many applications. This paper proposes a new approach to expand the RBF kernel into interpretable and visualizable components, including main and two-way interaction effects. In order to obtain a sparse model representation, an iterative l1-regularized parametric model using the interpretable components as inputs is proposed. Results: Results on toy problems illustrate the ability of the method to select the correct contributions and an improved performance over standard RBF classifiers in the presence of irrelevant input variables. For a 10-dimensional XOR problem, an SVM using the standard RBF kernel obtains an area under the receiver operating characteristic curve (AUC) of 0.947, whereas the proposed method achieves an AUC of 0.997. The latter additionally identifies the relevant components. In a second 10-dimensional artificial problem, the underlying class probability follows a logistic regression model. An SVM with the RBF kernel results in an AUC of 0.975, as opposed to 0.994 for the presented method. The proposed method is applied to two benchmark datasets: the Pima Indian diabetes and the Wisconsin Breast Cancer dataset. 
The AUC is in both cases comparable to those of the standard method (0.826 versus 0.826 and 0.990 versus 0.996) and those reported in the literature. The selected components are consistent with different approaches reported in other work. However, this method is able to visualize the effect of each of the components, allowing for interpretation of the learned logic by experts in the application domain. Conclusions: This work proposes a new method to obtain flexible and sparse risk prediction models. The proposed method performs as well as a support vector machine using the standard RBF kernel, but has the additional advantage that the resulting model can be interpreted by experts in the application domain. © 2013 Elsevier B.V
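
    The abstract above does not spell out how the RBF kernel is expanded into components. As a minimal sketch of why a per-variable view is possible at all, the snippet below checks the well-known property that an RBF kernel factorizes into a product of one-dimensional RBF kernels, one per input variable. The parameterization exp(-||x - z||^2 / sigma^2), the function names, and the sample values are assumptions for illustration only, not the paper's method.

```python
import math

def rbf_kernel(x, z, sigma):
    """Full RBF kernel, assumed here as exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)

def rbf_per_dimension(x, z, sigma):
    """One-dimensional RBF kernel value for each input variable."""
    return [math.exp(-((a - b) ** 2) / sigma ** 2) for a, b in zip(x, z)]

# Hypothetical points and kernel width, chosen only for illustration.
x = [0.2, -1.0, 0.5]
z = [0.0, 0.3, 0.4]
sigma = 1.5

full = rbf_kernel(x, z, sigma)
factors = rbf_per_dimension(x, z, sigma)
product = math.prod(factors)  # product of the per-variable kernels

print(full, product)  # the two values agree up to floating-point error
```

    Because the exponent is a sum over variables, the kernel is a product over variables; any expansion into main and interaction effects has to start from this structure.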

    Machine Learning Methods Enable Predictive Modeling of Antibody Feature:Function Relationships in RV144 Vaccinees

    The adaptive immune response to vaccination or infection can lead to the production of specific antibodies to neutralize the pathogen or recruit innate immune effector cells for help. The non-neutralizing role of antibodies in stimulating effector cell responses may have been a key mechanism of the protection observed in the RV144 HIV vaccine trial. In an extensive investigation of a rich set of data collected from RV144 vaccine recipients, we here employ machine learning methods to identify and model associations between antibody features (IgG subclass and antigen specificity) and effector function activities (antibody dependent cellular phagocytosis, cellular cytotoxicity, and cytokine release). We demonstrate via cross-validation that classification and regression approaches can effectively use the antibody features to robustly predict qualitative and quantitative functional outcomes. This integration of antibody feature and function data within a machine learning framework provides a new, objective approach to discovering and assessing multivariate immune correlates.
    Funding: U.S. Military HIV Research Program; Collaboration for AIDS Vaccine Discovery (OPP1032817); National Institutes of Health (U.S.) (3R01AI080289-02S1); National Institutes of Health (U.S.) (5R01AI080289-03); United States. Army Medical Research and Materiel Command (National Institute of Allergy and Infectious Diseases (U.S.) Interagency Agreement Y1-AI-2642-12); Henry M. Jackson Foundation for the Advancement of Military Medicine (U.S.) (United States. Dept. of Defense Cooperative Agreement W81XWH-07-2-0067)
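
    The RV144 abstract above relies on cross-validation to show that predictions are robust. A minimal sketch of how k-fold cross-validation partitions samples is given below; the function name, the number of samples, the number of folds, and the seed are hypothetical, and the abstract does not state which cross-validation scheme the authors used.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical example: 10 samples split into 5 folds.
splits = list(k_fold_indices(10, 5))
for train, test in splits:
    print(sorted(test), len(train))
```

    Each sample is held out exactly once, so every prediction used to score the model comes from a model that never saw that sample during training.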

    Explaining Support Vector Machines: A Color Based Nomogram.

    PROBLEM SETTING: Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Because they can be applied with a large choice of kernels, a wide variety of data can be analysed using these tools. Machine learning owes its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models. OBJECTIVE: In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions, each of which depends on a single input variable or at most two input variables. RESULTS: Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. CONCLUSIONS: This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package, and two apps and a movie are provided to illustrate the possibilities offered by the method.
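
    The claim above that polynomial kernels up to the second degree can be represented exactly follows from simple algebra: (x·z + c)^2 expands into a constant, terms involving one variable, and terms involving pairs of variables. The sketch below verifies this numerically; the kernel form (x·z + c)^2, the function names, and the sample values are assumptions for illustration and are not taken from the authors' R package.

```python
from itertools import combinations

def poly2_kernel(x, z, c=1.0):
    """Second-degree polynomial kernel, assumed here as (x . z + c)^2."""
    return (sum(a * b for a, b in zip(x, z)) + c) ** 2

def poly2_contributions(x, z, c=1.0):
    """Split the kernel into constant, single-variable and pairwise terms."""
    d = len(x)
    constant = c ** 2
    # Terms that depend on variable i only.
    main = [2 * c * x[i] * z[i] + (x[i] * z[i]) ** 2 for i in range(d)]
    # Terms that depend on exactly two variables i < j.
    pairs = {(i, j): 2 * x[i] * z[i] * x[j] * z[j]
             for i, j in combinations(range(d), 2)}
    return constant, main, pairs

# Hypothetical points, chosen only for illustration.
x = [1.0, -2.0, 0.5]
z = [0.3, 0.7, -1.2]

constant, main, pairs = poly2_contributions(x, z)
total = constant + sum(main) + sum(pairs.values())
print(poly2_kernel(x, z), total)  # agree up to floating-point error
```

    No term involves three or more variables, which is why a degree-2 kernel fits the stated definition of explainability, while higher degrees and RBF kernels need an approximation.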

    The potential application of artificial intelligence for diagnosis and management of glaucoma in adults

    BACKGROUND: Glaucoma is the most frequent cause of irreversible blindness worldwide. There is no cure, but early detection and treatment can slow the progression and prevent loss of vision. It has been suggested that artificial intelligence (AI) has potential application for detection and management of glaucoma. SOURCES OF DATA: This literature review is based on articles published in peer-reviewed journals. AREAS OF AGREEMENT: There have been significant advances in both AI and imaging techniques that are able to identify the early signs of glaucomatous damage. Machine and deep learning algorithms show capabilities equivalent to human experts, if not superior. AREAS OF CONTROVERSY: Concerns that the increased reliance on AI may lead to deskilling of clinicians. GROWING POINTS: AI has potential to be used in virtual review clinics, telemedicine and as a training tool for junior doctors. Unsupervised AI techniques offer the potential of uncovering currently unrecognized patterns of disease. If this promise is fulfilled, AI may then be of use in challenging cases or where a second opinion is desirable. AREAS TIMELY FOR DEVELOPING RESEARCH: There is a need to determine the external validity of deep learning algorithms and to better understand how the 'black box' paradigm reaches results

    Improved Alzheimer’s disease detection by MRI using multimodal machine learning algorithms

    Dementia is a major medical problem that challenges public health systems around the world. It occurs mostly in older adults (age > 60). There are currently no drugs that cure the disease, and it can directly impair memory and reduce a person's ability to perform daily activities. Health experts and computer scientists have been researching this problem for the past twenty years, yet there is still an urgent need to find the characteristics that enable the identification of dementia. The aim of the work presented in this thesis is to propose a supervised machine learning model for the prediction and classification of Alzheimer's disease (AD) in elderly people. To that end, we conducted several experiments on open-access brain imaging data, including demographic MRI data from 373 scan sessions of 150 patients. In the first two studies, we applied single ML models, support vector machines (SVM) and pruned decision trees, to predict dementia on the same dataset. In the first experiment, with SVM, we achieved 70% prediction accuracy for late-stage dementia, with a precision (classification of true dementia subjects) of 75%. In the second experiment, with J48 pruned decision trees, accuracy improved to 88.73% and precision reached 92.4%. To extend this work, we moved from single models to multi-model approaches. In the comparative machine learning study, we applied principal component analysis as a feature reduction technique. This approach identifies the highly correlated features in the dataset that are closely associated with dementia type. 
By applying three models simultaneously (KNN, LR, and SVM), it was possible to identify an ideal model for classifying dementia subjects. Compared with SVM, the KNN and LR models classified AD subjects with 97.6% and 98.3% accuracy, respectively, which is higher than in the previous experiments. However, given the severity of AD in older adults, true AD positives must not be missed. To improve the classification of true AD subjects among all subjects, we ran three independent experiments that added two new models, Naïve Bayes (NB) and artificial neural networks, alongside SVM and KNN. In the first experiment, the models were developed independently with manual feature selection; the results suggested that KNN 3 is the optimal model, with a classification accuracy of 91.32%. In the second experiment, the same models were tested with a limited set of highly correlated features: SVM produced a classification accuracy of 96.12% and NB a 98.21% classification rate for true AD subjects. Finally, in the third experiment, we combined these four models into a new hybrid model. The hybrid model's performance was validated with an AU-ROC value of 0.991 (reported as 99.1% classification accuracy). All these experimental results suggest that the ensemble modelling approach with wrapping is an optimal solution for classifying AD subjects.
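
    The abstract above reports accuracy, precision, and AU-ROC side by side. These are different quantities, and a minimal sketch on made-up data shows how each is computed. The labels, scores, and 0.5 threshold below are hypothetical and unrelated to the thesis data; AUC is computed here with the rank (pairwise comparison) formulation.

```python
def accuracy(y_true, y_pred):
    """Fraction of all subjects classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else 0.0

def auc(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = positive) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.45, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), precision(y_true, y_pred), auc(y_true, scores))
```

    On this toy data the three metrics take three different values, which illustrates why an AU-ROC value is not, in general, interchangeable with a classification accuracy.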

    Analytical fusion of multimodal magnetic resonance imaging to identify pathological states in genetically selected Marchigian Sardinian alcohol-preferring (msP) rats

    [EN] Alcohol abuse is one of the most alarming issues for health authorities. It is estimated that at least 23 million European citizens are affected by alcoholism, at a cost of around 270 million euros. Excessive alcohol consumption is associated with physical harm and, although it damages most body organs, the liver, pancreas, and brain are the most severely affected. Alcohol-related disorders involve not only physical harm; other psychiatric disorders such as depression are often comorbid. Alcohol is also involved in many violent behaviors and traffic injuries. Taken together, this reflects the high complexity of alcohol-related disorders and suggests the involvement of multiple brain systems. With the emergence of non-invasive diagnostic techniques such as neuroimaging or EEG, many neurobiological factors have been shown to be fundamental to the acquisition and maintenance of addictive behaviors, the risk of relapse, and the validity of available treatment alternatives. Alterations in brain structure and function reflected in non-invasive imaging studies have been repeatedly investigated. However, the extent to which imaging measures can precisely characterize and differentiate pathological stages of the disease, which is often accompanied by other pathologies, is not clear. Animal models have elucidated the role of the neurobiological mechanisms that parallel alcohol misuse. Combining animal research with non-invasive neuroimaging studies is therefore a key tool for advancing understanding of the disorder. As the volume of data of very diverse nature available in clinical and research settings increases, data sets and methodologies must be integrated to explore the multidimensional aspects of psychiatric disorders. Complementing conventional mass-univariate statistics, interest in the predictive power of statistical machine learning applied to neuroimaging data is currently growing in the scientific community. 
This doctoral thesis covers most of the aspects mentioned above. Starting from a well-established animal model in alcohol research, Marchigian Sardinian rats, we performed multimodal neuroimaging studies at several stages of the alcohol experimental design, including the etiological mechanisms modulating high alcohol consumption (in comparison to Wistar control rats), alcohol consumption, and treatment with the opioid antagonist Naltrexone, a well-established drug in clinics but with heterogeneous response. Multimodal magnetic resonance imaging acquisition included Diffusion Tensor Imaging (DTI), structural imaging, and the calculation of MRI-derived relaxometry maps. We designed an analytical framework based on two algorithms widely used in the neuroimaging field, Random Forest and Support Vector Machine, combined in a wrapper fashion. The approach was applied to the same dataset with two different aims: exploring the validity of the approach for discriminating experimental stages at the subject level, and establishing predictive models at the voxel level to identify key anatomical regions modified during the course of the experiment. As expected, combining multiple magnetic resonance imaging modalities resulted in enhanced predictive power (between 3 and 16%) with heterogeneous modality contribution. Surprisingly, we identified some inborn alterations correlating with high alcohol preference, and thalamic neuroadaptations related to Naltrexone efficacy. In addition, a reproducible contribution of DTI- and relaxometry-related biomarkers was repeatedly identified, guiding further studies in alcohol research. In summary, this research demonstrates the feasibility of incorporating multimodal neuroimaging, machine learning algorithms, and animal research to advance the understanding of alcohol-related disorders.
[ES] Alcohol abuse is one of the greatest concerns of health authorities in the European Union. Excessive alcohol consumption affects the whole body to a greater or lesser degree, with the pancreas and liver being the most severely affected. In addition, the central nervous system suffers alcohol-related deterioration, which frequently occurs alongside other psychiatric pathologies such as depression or other addictions such as compulsive gambling. The presence of these comorbidities demonstrates the complexity of the pathology, in which many neuronal systems interact with one another. Magnetic resonance (MR) imaging has helped in the study of psychiatric diseases, facilitating the discovery of neurological mechanisms that are fundamental to the development and maintenance of alcohol addiction, relapse, and the effect of available treatments. Despite these advances, more research is still needed to identify the biological bases that contribute to the disease. In this respect, animal models help to isolate the factors related solely to alcohol while controlling for other factors that facilitate the development of alcoholism. Magnetic resonance studies in laboratory animals and their subsequent evaluation in humans play a fundamental role in understanding psychiatric pathologies such as alcohol addiction. Magnetic resonance imaging has been integrated into clinical settings as a non-invasive diagnostic test. As the volume of data increases, tools and methodologies are needed that can fuse information of very different nature and thereby establish increasingly accurate diagnostic criteria. The predictive power of tools derived from artificial intelligence, such as machine learning, complements traditional statistical methods. This work addresses most of these aspects. Multimodal magnetic resonance data were obtained from a validated model in research on pathologies derived from alcohol consumption, the Marchigian-Sardinian rats developed at the University of Camerino (Italy), whose alcohol consumption is comparable to that of humans. For each animal, data were acquired before and after alcohol consumption and under two abstinence conditions (with and without treatment with Naltrexone, an anti-relapse medication used as pharmacotherapy in alcoholism). The multimodal magnetic resonance data, consisting of diffusion, relaxometry, and structural images, were fused in a multivariate analytical scheme incorporating two tools commonly used on neuroimaging-derived data, Random Forest and Support Vector Machine. Our scheme was applied with two distinct objectives: on the one hand, to determine the experimental phase of a subject from biomarkers and, on the other, to identify brain systems susceptible to alteration due to substantial alcohol intake and their evolution during abstinence. Our results showed that when biomarkers derived from multiple neuroimaging modalities are fused in a single analysis, they produce more accurate diagnoses than those derived from a single modality (up to a 16% improvement). Biomarkers derived from diffusion and relaxometry images discriminate between experimental states. Some innate aspects related to later alcohol consumption behaviors were also identified, as was the relationship between treatment response and magnetic resonance data. In summary, this thesis demonstrates that the use of multimodal magnetic resonance data in animal models, combined in multivariate analytical schemes, is a valid tool for understanding pathologies
[CAT] Alcohol abuse is one of the greatest concerns of the health authorities of the European Union. Despite the difficulty of establishing exact figures, it is estimated that some 23 million Europeans currently suffer from diseases derived from alcoholism, at a cost to society exceeding 150,000 million euros. Excessive alcohol consumption affects the human body to a greater or lesser degree, with the pancreas and liver being the most affected. In addition, the brain suffers deterioration caused by alcohol, which frequently coexists with other pathologies such as depression or other addictions such as compulsive gambling. All this demonstrates the complexity of the disease, in which multiple neuronal systems interact with one another. Non-invasive techniques such as electroencephalography (EEG) or magnetic resonance (MR) imaging have helped in the study of psychiatric diseases, facilitating the discovery of neurological mechanisms fundamental to the development and maintenance of addiction, relapse, and the effectiveness of available treatments. Despite these advances, more research is still needed to identify the biological bases that contribute to the disease. In this direction, animal models serve to identify factors that depend solely on alcohol abuse. Magnetic resonance studies in laboratory animals and subsequent evaluation in humans would play a fundamental role in understanding alcohol use. Non-invasive diagnostic tests have been integrated into clinical settings. As the volume of data increases, tools and methodologies are needed to fuse information of very different nature and thereby establish increasingly accurate diagnostic criteria. The predictive capacity of tools developed in the field of artificial intelligence, such as machine learning, complements traditional statistical methods. This research addresses all of these aspects. Multimodal magnetic resonance data were obtained from an animal model validated in the study of pathologies related to alcohol consumption, the Marchigian-Sardinian rats developed at the University of Camerino (Italy), whose alcohol consumption is comparable to that of humans. For each animal, data were acquired before and after alcohol consumption and under two different abstinence conditions (with and without anti-relapse treatment). Multimodal magnetic resonance data consisting of diffusion, magnetic relaxometry, and structural images were fused in multivariate analytical schemes incorporating two methodologies validated in the neuroimaging field, Random Forest and Support Vector Machine. Our scheme was applied with two distinct objectives. The first is to determine the experimental phase of a subject from biomarkers obtained by neuroimaging. The second is to identify the brain systems susceptible to alteration during substantial alcohol intake and their evolution during the treatment phase. Our results showed that using biomarkers derived from several neuroimaging modalities fused in a multivariate analysis produces more accurate diagnoses than those derived from a single modality (up to a 16% improvement). Biomarkers derived from diffusion and relaxometry images contributed to distinguishing experimental states. Innate aspects related to later alcohol preferences were also identified, as was the relationship between the response to anti-relapse treatment and magnetic resonance data. In summary, this work demonstrates that the use of multimodal magnetic resonance data in animal models, combined in multivariate analytical schemes, is a very valid tool for understanding and advancing knowledge of psychiatric pathologies such as alcoholism.
Cosa Liñán, A. (2017). Analytical fusion of multimodal magnetic resonance imaging to identify pathological states in genetically selected Marchigian Sardinian alcohol-preferring (msP) rats [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90523

    Prediction of Breast Cancer Proteins Involved in Immunotherapy, Metastasis, and RNA-Binding Using Molecular Descriptors and Artificial Neural Networks

    [Abstract] Breast cancer (BC) is a heterogeneous disease where genomic alterations, protein expression deregulation, signaling pathway alterations, hormone disruption, ethnicity and environmental determinants are involved. Due to the complexity of BC, the prediction of proteins involved in this disease is a trending topic in drug design. This work proposes an accurate prediction classifier for BC proteins using six sets of protein sequence descriptors and 13 machine-learning methods. After using a univariate feature selection for the mix of five descriptor families, the best classifier was obtained using the multilayer perceptron method (artificial neural network) and 300 features. The performance of the model is demonstrated by the area under the receiver operating characteristics (AUROC) of 0.980±0.0037, and accuracy of 0.936±0.0056 (3-fold cross-validation). Regarding the prediction of 4,504 cancer-associated proteins using this model, the best ranked cancer immunotherapy proteins related to BC were RPS27, SUPT4H1, CLPSL2, POLR2K, RPL38, AKT3, CDK3, RPS20, RASL11A and UBTD1; the best ranked metastasis driver proteins related to BC were S100A9, DDA1, TXN, PRNP, RPS27, S100A14, S100A7, MAPK1, AGR3 and NDUFA13; and the best ranked RNA-binding proteins related to BC were S100A9, TXN, RPS27L, RPS27, RPS27A, RPL38, MRPL54, PPAN, RPS20 and CSRP1. This powerful model predicts several BC-related proteins that should be deeply studied to find new biomarkers and better therapeutic targets. 
Scripts can be downloaded at https://github.com/muntisa/neural-networks-for-breast-cancer-proteins.
This work was supported by a) Universidad UTE (Ecuador), b) the Collaborative Project in Genomic Data Integration (CICLOGEN) PI17/01826 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013-2016 and the European Regional Development Funds (FEDER) - “A way to build Europe”; c) the General Directorate of Culture, Education and University Management of Xunta de Galicia ED431D 2017/16 and “Drug Discovery Galician Network” Ref. ED431G/01 and the “Galician Network for Colorectal Cancer Research” (Ref. ED431D 2017/23); d) the Spanish Ministry of Economy and Competitiveness for its support through the funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13-3503) and the European Regional Development Funds (FEDER) by the European Union; e) the Consolidation and Structuring of Competitive Research Units - Competitive Reference Groups (ED431C 2018/49), funded by the Ministry of Education, University and Vocational Training of the Xunta de Galicia endowed with EU FEDER funds; f) research grants from Ministry of Economy and Competitiveness, MINECO, Spain (FEDER CTQ2016-74881-P), Basque government (IT1045-16), and kind support of Ikerbasque, Basque Foundation for Science; and, g) Sociedad Latinoamericana de Farmacogenómica y Medicina Personalizada (SOLFAGEM).
Xunta de Galicia; ED431D 2017/16. Xunta de Galicia; ED431G/01. Xunta de Galicia; ED431D 2017/23. Xunta de Galicia; ED431C 2018/49. Gobierno Vasco; IT1045-1