170 research outputs found

    Variable selection towards classification of digital images: identification of altered glucose levels in serum

    classed as 125 mg/dL). Herein, we propose a method to identify control, pre-diabetic, or diabetic simulated and real-world samples based on their glucose levels, using classification-based variable selection algorithms [successive projections algorithm (SPA) or genetic algorithm (GA)] coupled to linear discriminant analysis (SPA-LDA and GA-LDA) to analyze red–green–blue digital images. Images were recorded after the glucose enzymatic reaction, whereby 250 μL of the reactant content of each sample was captured using a common cell phone camera. Processing was applied to the images at the pixel level: 72.2% of the pixels were correctly classified as control, 79.2% as pre-diabetic, and 90.9% as diabetic using the SPA-LDA algorithm, and 76.8% as control, 81.4% as pre-diabetic, and 91.7% as diabetic using the GA-LDA algorithm, in the validation set containing nine simulated samples. Eight real-world samples were measured as an external test set, for which the accuracy using GA-LDA was 92%, with sensitivities ranging from 70% to 100% and specificities ranging from 90% to 99%. This method shows the potential of variable selection techniques coupled with digital image analysis towards blood glucose monitoring

    TTWD-DA: A MATLAB toolbox for discriminant analysis based on trilinear three-way data

    Three-way trilinear data is increasingly used in chemical and biochemical applications. This type of data is composed of three-way structures representing two different signal responses and one sample dimension distributed in a 3D structure, such as fluorescence excitation-emission matrices (EEMs), spectral-pH responses, spectral-kinetic responses, and spectral-electric potential responses, among others. Herein, we describe a new MATLAB toolbox, termed “TTWD-DA”, for classification of trilinear three-way data using discriminant analysis techniques (linear discriminant analysis [LDA], quadratic discriminant analysis [QDA], and partial least squares discriminant analysis [PLS-DA]). These discrimination techniques were coupled to multivariate deconvolution techniques by means of parallel factor analysis (PARAFAC) and the Tucker3 algorithm. The toolbox is based on a user-friendly graphical interface, where these algorithms can be easily applied. As output, multiple figures of merit are automatically calculated, such as accuracy, sensitivity and specificity. This software is freely available online
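The figures of merit named in this abstract (accuracy, sensitivity and specificity) follow directly from the counts of a confusion matrix. A minimal Python sketch is given here for illustration only; it is not code from the TTWD-DA toolbox, and the function name `figures_of_merit` and its arguments are hypothetical.

```python
def figures_of_merit(y_true, y_pred, positive):
    """Accuracy, sensitivity and specificity with one class treated as positive.

    Hypothetical helper for illustration; not part of any toolbox.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        # fraction of all samples assigned to the correct side
        "accuracy": (tp + tn) / len(pairs),
        # fraction of true positives recovered (true positive rate)
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # fraction of true negatives recovered (true negative rate)
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

For a multi-class model, the same function can be called once per class, treating that class as positive and all others as negative.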

    ATR-FTIR spectroscopy for virus identification: A powerful alternative

    In pandemic times, like the one we are witnessing for COVID-19, the discussion about new efficient and rapid techniques for diagnosis of diseases is more evident. In this mini-review, we present to the virological scientific community the potential of attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy as a diagnosis technique. Herein, we explain the operation of this technique, as well as its advantages over standard methods. In addition, we also present the multivariate analysis tools that can be used to extract useful information from the data towards classification purposes. Tools such as Principal Component Analysis (PCA), Successive Projections Algorithm (SPA), Genetic Algorithm (GA) and Linear and Quadratic Discriminant Analysis (LDA and QDA) are covered, including examples of published studies. Finally, the advantages and disadvantages of ATR-FTIR spectroscopy are emphasized, as well as future prospects in this field of study that is only growing. One of the main aims of this paper is to encourage the scientific community to explore the potential of this spectroscopic tool to detect changes in biological samples such as those caused by the presence of viruses

    Um Interior Vazio: A análise da situação dos municípios do COREDE Fronteira Noroeste

    One of the factors that influence local development is population dynamics. The movement of people to a particular area is related to the behavior of the economy, so an analysis of the links between migration and development becomes relevant. The Regional Development Council (COREDE) Fronteira Noroeste is the object of this research because of the population loss occurring in its territory, particularly in the smaller municipalities. This investigation seeks to prompt reflection on the causes of and solutions to this problem, so that a deeper study contributes to the understanding of migratory movements and development

    Uncertainty estimation and misclassification probability for classification models based on discriminant analysis and support vector machines

    Uncertainty estimation provides a quantitative value of the predictive performance of a classification model based on its misclassification probability. Low misclassification probabilities are associated with a low degree of uncertainty, indicating high trustworthiness, while high misclassification probabilities are associated with a high degree of uncertainty, indicating a high susceptibility to incorrect classification. Herein, misclassification probability estimations based on uncertainty estimation by bootstrap were developed for classification models using discriminant analysis [linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA)] and support vector machines (SVM). Principal component analysis (PCA) was used as a variable reduction technique prior to classification. Four spectral datasets were tested (1 simulated and 3 real applications) for binary and ternary classifications. Models with lower misclassification probabilities were more stable when the spectra were perturbed with white Gaussian noise, indicating better robustness. Thus, misclassification probability can be used as an additional figure of merit to assess model robustness, providing a reliable metric to evaluate the predictive performance of a classifier

    Colourimetric Determination of High-Density Lipoprotein (HDL) Cholesterol using Red-Green-Blue Digital Colour Imaging

    A rapid, low-cost and sensitive method for quantification of high-density lipoprotein (HDL) cholesterol based on enzymatic colorimetric reactions and digital image analysis was developed. The proposed method was adapted to a 96-microwell enzyme-linked immunosorbent assay (ELISA) plate, and image acquisition was performed using a conventional desktop scanner. The images were recorded using the red-green-blue (RGB) colour system, in which the resolved absorbance for each colour channel was used for multiple linear regression. The regression model presented a root mean squared error of calibration of 1.53 mg dL⁻¹ and an R² value of 0.995. Prediction was obtained with a root mean squared error of prediction of 2.42 mg dL⁻¹ and an R² of 0.993, showing a good prediction response. A limit of detection of 0.43 mg dL⁻¹ and precision better than 1.72% reinforced these results. This method was compared with a reference methodology using UV-Vis measurements at 500 nm, and no statistical difference was observed at a confidence level of 95%, showing its potential for future clinical applications
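The regression step described in this abstract can be sketched as follows. This is an illustrative outline rather than the authors' procedure: it assumes that the per-channel absorbance is computed as A = -log10(I / I0) against a blank well (a common convention in digital image colorimetry, not stated in the abstract), and the function names `channel_absorbance`, `fit_mlr`, `predict_mlr` and `rmse` are hypothetical.

```python
import numpy as np


def channel_absorbance(sample_rgb, blank_rgb):
    """Per-channel absorbance A = -log10(I / I0); I0 from a blank well (assumed)."""
    sample = np.asarray(sample_rgb, dtype=float)
    blank = np.asarray(blank_rgb, dtype=float)
    return -np.log10(sample / blank)


def fit_mlr(absorbances, concentrations):
    """Least-squares fit of c = b0 + bR*A_R + bG*A_G + bB*A_B."""
    A = np.asarray(absorbances, dtype=float)
    X = np.column_stack([np.ones(len(A)), A])  # prepend an intercept column
    y = np.asarray(concentrations, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_mlr(coef, absorbances):
    """Predicted concentrations for one or more rows of (A_R, A_G, A_B)."""
    A = np.atleast_2d(np.asarray(absorbances, dtype=float))
    return coef[0] + A @ coef[1:]


def rmse(y_true, y_pred):
    """Root mean squared error, used for both calibration and prediction sets."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Applying `rmse` to the calibration samples gives a calibration error, and applying it to held-out samples gives a prediction error, in the same units as the concentrations.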

    Advances in chemometric control of commercial diesel adulteration by kerosene using IR spectroscopy

    Adulteration is a recurrent issue in fuel screening. Contamination of commercial diesel by kerosene is very difficult to detect via the physicochemical methods applied in the market. Although the contamination may affect diesel quality and storage stability, there is a lack of efficient methodologies for this evaluation. This paper assessed the use of IR spectroscopies (MIR and NIR) coupled with partial least squares (PLS) regression, support vector machine regression (SVR), and multivariate curve resolution with alternating least squares (MCR-ALS) calibration models for quantifying and identifying the presence of kerosene adulterant in commercial diesel. Moreover, principal component analysis (PCA), successive projections algorithm (SPA), and genetic algorithm (GA) tools coupled to linear discriminant analysis were used to observe the degradation behavior of 60 samples of pure and kerosene-added diesel fuel in different concentrations over 60 days of storage. Physicochemical properties of commercial diesel with 15% kerosene remained in conformity with Brazilian screening specifications; in addition, the specified tests were not able to identify changes in the blends’ performance over time. By using multivariate classification, the samples of pure and contaminated fuel were accurately classified by aging level into two well-defined groups, and some spectral features related to fuel degradation products were detected. PLS and SVR accurately quantified kerosene in the 2.5–40% (v/v) range, reaching RMSEC < 2.59% and RMSEP < 5.56%, with high correlation between real and predicted concentrations. MCR-ALS with correlation constraint was able to identify and recover the spectral profiles of commercial diesel and kerosene adulterant from the IR spectra of contaminated blends

    Improving data splitting for classification applications in spectrochemical analyses employing a random-mutation Kennard-Stone algorithm approach

    Motivation: Data splitting is a fundamental step for building classification models with spectral data, especially in biomedical applications. This step is performed following pre-processing and prior to model construction, and consists of dividing the samples into at least training and test sets; the training set is used for model construction and the test set for model validation. Two of the most used methodologies for data splitting are the random selection (RS) and the Kennard-Stone (KS) algorithms; the former is based on a random splitting process and the latter on the calculation of the Euclidean distance between the samples. We propose an algorithm called the Morais-Lima-Martin (MLM) algorithm as an alternative method to improve data splitting in classification models. MLM is a modification of the KS algorithm that adds a random-mutation factor. Results: The performance of RS, KS and MLM is compared in simulated and six real-world biospectroscopic applications using principal component analysis linear discriminant analysis (PCA-LDA). MLM generated a better predictive performance in comparison with the RS and KS algorithms, in particular regarding sensitivity and specificity values. Classification was found to be better equilibrated using MLM. RS showed the poorest predictive response, followed by KS, which showed good accuracy towards prediction but relatively unbalanced sensitivities and specificities. These findings demonstrate the potential of this new MLM algorithm as a sample selection method for classification applications in comparison with other regular methods often applied to this type of data. Availability: The MLM algorithm is freely available for MATLAB at https://doi.org/10.6084/m9.figshare.7393517.v1.
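The idea of a distance-based split followed by a random mutation can be sketched in Python. The `kennard_stone` function follows the standard KS procedure (start from the two most distant samples, then repeatedly add the sample farthest from its nearest already-selected neighbour). The mutation step is only an assumption about how a random-mutation factor might be applied, here by swapping a fraction of samples between the two sets; the abstract does not give these details, and `random_mutation_split`, `mutation_rate` and `seed` are hypothetical names, not the published MLM implementation.

```python
import numpy as np


def kennard_stone(X, n_train):
    """Standard Kennard-Stone selection; returns (train_idx, test_idx) lists."""
    X = np.asarray(X, dtype=float)
    if not 2 <= n_train <= len(X):
        raise ValueError("n_train must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Seed the training set with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # Distance from each remaining sample to its nearest selected sample.
        min_dist = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining


def random_mutation_split(X, n_train, mutation_rate=0.1, seed=None):
    """KS split followed by random swaps between sets (assumed mutation scheme)."""
    rng = np.random.default_rng(seed)
    train, test = kennard_stone(X, n_train)
    n_swap = min(int(round(mutation_rate * len(train))), len(test))
    if n_swap > 0:
        # Choose distinct positions in each set and exchange their samples.
        train_pos = rng.choice(len(train), size=n_swap, replace=False)
        test_pos = rng.choice(len(test), size=n_swap, replace=False)
        for a, b in zip(train_pos, test_pos):
            train[a], test[b] = test[b], train[a]
    return sorted(train), sorted(test)
```

With `mutation_rate=0` the result reduces to the plain KS split, which makes the effect of the mutation step easy to compare against the baseline.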

    Raman spectral discrimination in human liquid biopsies of oesophageal transformation to adenocarcinoma

    The aim of this study was to determine whether Raman spectroscopy combined with chemometric analysis can be applied to interrogate biofluids (plasma, serum, saliva and urine) towards detecting oesophageal stages through to oesophageal adenocarcinoma (normal/squamous epithelium, inflammatory, Barrett's, low-grade dysplasia [LGD], high-grade dysplasia [HGD], and oesophageal adenocarcinoma [OAC]). The chemometric analysis of the spectral data was performed using principal component analysis (PCA), successive projections algorithm (SPA) or genetic algorithm (GA) followed by quadratic discriminant analysis (QDA). The GA-QDA model using a few selected wavenumbers for saliva and urine samples achieved 100% classification for all classes. For plasma and serum, the GA-QDA model achieved excellent accuracy in all oesophageal stages (>90%). The main GA-QDA features responsible for sample discrimination were: 1012 cm⁻¹ (C-O stretching of ribose), 1336 cm⁻¹ (Amide III and CH wagging vibrations from glycine backbone), 1450 cm⁻¹ (methylene deformation), and 1660 cm⁻¹ (Amide I). The results of this study are promising and support the concept that Raman on biofluids may become a useful and objective diagnostic tool to identify oesophageal disease stages from squamous epithelium to OAC. This article is protected by copyright. All rights reserved.

    Potential of mid-infrared spectroscopy as a non-invasive diagnostic test in urine for endometrial or ovarian cancer

    The current lack of an accurate, cost-effective and non-invasive test that would allow for screening and diagnosis of gynaecological carcinomas, such as endometrial and ovarian cancer, signals the need for alternative approaches. The potential of spectroscopic techniques in disease investigation and diagnosis has been previously demonstrated. Here, we used attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy to analyse urine samples from women with endometrial (n=10) and ovarian cancer (n=10), as well as from healthy individuals (n=10). After applying multivariate analysis and classification algorithms, biomarkers of disease were identified and high levels of accuracy were achieved for both endometrial (95% sensitivity, 100% specificity; accuracy: 95%) and ovarian cancer (100% sensitivity, 96.3% specificity; accuracy 100%). The efficacy of this approach, in combination with the non-invasive method for urine collection, suggests a potential diagnostic tool for endometrial and ovarian cancers