121 research outputs found

    Combinatorial and dispersion activity coefficient models for molecular solutions

    A kernel-based approach for fault diagnosis in batch processes

    This article explores the potential of kernel-based techniques for discriminating between on-specification and off-specification batch runs. It combines kernel partial least squares discriminant analysis with three common approaches to analyzing batch data by means of bilinear models: landmark feature extraction, batchwise unfolding, and variablewise unfolding. Gower's idea of pseudo-sample projection is exploited to recover the contribution of the initial variables to the final model and to visualize those with the highest discriminant power. The results show that the proposed approach provides efficient fault discrimination and enables correct identification of the discriminant variables in the considered case studies.
    Vitale, R.; De Noord, OE.; Ferrer, A. (2014). A kernel-based approach for fault diagnosis in batch processes. Journal of Chemometrics. 28(8):697-707. doi:10.1002/cem.2629
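The pseudo-sample projection mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: kernel ridge regression on ±1 class labels stands in for kernel-PLS-DA, the data are synthetic, and every name here (`rbf_kernel`, `fit_kernel_ridge`, `pseudo_sample_trajectory`, `gamma`, `lam`) is hypothetical. For each variable, pseudo-samples are built that vary only that variable across its observed range while all other (mean-centered) variables stay at zero; the pseudo-samples are then passed through the kernel and projected onto the fitted model.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, y, gamma, lam):
    """Dual coefficients of kernel ridge regression (stand-in for K-PLS-DA)."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def pseudo_sample_trajectory(X, alpha, j, gamma, n_points=25):
    """Model output along pseudo-samples that vary only variable j."""
    grid = np.linspace(X[:, j].min(), X[:, j].max(), n_points)
    P = np.zeros((n_points, X.shape[1]))  # all other variables held at zero
    P[:, j] = grid
    return grid, rbf_kernel(P, X, gamma) @ alpha

# Synthetic, mean-centered data; the class label depends nonlinearly on variable 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 4))
X -= X.mean(axis=0)
y = np.where(X[:, 0] ** 2 > 1.0, 1.0, -1.0)

gamma, lam = 0.5, 0.1  # assumed hyperparameters, not tuned
alpha = fit_kernel_ridge(X, y, gamma, lam)

# Span of each trajectory: a rough indicator of how strongly a variable drives the output.
spans = []
for j in range(X.shape[1]):
    _, traj = pseudo_sample_trajectory(X, alpha, j, gamma)
    spans.append(float(traj.max() - traj.min()))
```

Plotting each trajectory against its grid gives the kind of per-variable picture the abstract describes; a variable whose trajectory is nearly flat contributes little to the discrimination under this sketch's assumptions.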

    Explaining Support Vector Machines: A Color Based Nomogram.

    PROBLEM SETTING: Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Because they can be applied with a wide choice of kernels, they can be used to analyse a large variety of data. Machine learning owes its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, SVMs find less support in areas where interpretability is important and where people are held responsible for the decisions made by models. OBJECTIVE: In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision as a sum of contributions, each of which depends on one or at most two input variables. RESULTS: Our experiments on simulated and real-life data show that the explainability of an SVM depends on the chosen parameter values (degree of the polynomial kernel, width of the RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. CONCLUSIONS: This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels, an indication of the reliability of the approximation is presented. The complete methodology is available as an R package, and two apps and a movie are provided to illustrate the possibilities offered by the method.
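The claim that second-degree polynomial kernels can be represented exactly follows from expanding the kernel. A minimal sketch, assuming a kernel of the form (x · s + c)^2 with unit scaling, and using made-up support vectors and coefficients rather than a trained model (the names `decision_direct`, `decompose_poly2`, `sv`, `coef`, `b` and `c` are all hypothetical), shows the decision value splitting into a constant, terms in one variable, and terms in two variables:

```python
import numpy as np

def decision_direct(x, sv, coef, b, c=1.0):
    """Direct evaluation: f(x) = sum_i coef_i * (x . sv_i + c)^2 + b."""
    return float(coef @ ((sv @ x + c) ** 2) + b)

def decompose_poly2(x, sv, coef, b, c=1.0):
    """Split f(x) into a constant, one-variable terms and two-variable terms."""
    w = coef @ sv                       # (d,):   sum_i coef_i * sv_i
    M = (sv * coef[:, None]).T @ sv     # (d, d): sum_i coef_i * sv_i sv_i^T
    constant = float(b + c**2 * coef.sum())
    main = 2.0 * c * w * x + np.diag(M) * x**2   # contribution of each single variable
    d = x.shape[0]
    pairs = {(p, q): float(2.0 * M[p, q] * x[p] * x[q])
             for p in range(d) for q in range(p + 1, d)}  # pairwise interactions
    return constant, main, pairs

# Made-up "model": 5 support vectors in 3 dimensions, coef_i = alpha_i * y_i.
rng = np.random.default_rng(1)
sv = rng.normal(size=(5, 3))
coef = rng.normal(size=5)
b = 0.3
x = np.array([0.5, -1.2, 2.0])

constant, main, pairs = decompose_poly2(x, sv, coef, b)
direct = decision_direct(x, sv, coef, b)
total = constant + main.sum() + sum(pairs.values())
```

Here `total` reproduces `direct` up to floating-point rounding, which is what makes this case exactly explainable in the abstract's sense. For an RBF kernel no such finite expansion exists, so only an approximation by one- and two-variable terms is possible.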
