615 research outputs found

    Use of Microarray Test Data for Toxicogenomic Prediction-Multi-Intelligent Systems for Toxicogenomic Applications (MISTA)

    An improved machine learning pipeline for urinary volatiles disease detection: Diagnosing diabetes

    Motivation: The measurement of disease biomarkers in easily obtained bodily fluids has opened the door to a new type of non-invasive medical diagnostics. New technologies are being developed and fine-tuned to make this possibility a reality. One such technology is Field Asymmetric Ion Mobility Spectrometry (FAIMS), which allows the measurement of volatile organic compounds (VOCs) in biological samples such as urine. These VOCs are known to carry a range of information on a person's metabolism and can in principle be used for disease diagnosis. Key to the effective use of such data are well-developed data processing pipelines, which are necessary to extract the most useful information from the complex underlying biological structure.
    Results: In this study, we present a new data analysis pipeline for FAIMS data and demonstrate a number of improvements over previously used methods. We evaluate the effect of a series of candidate operational steps during data processing, such as the use of wavelet transforms, principal component analysis (PCA), and classifier ensembles. We also demonstrate the use of FAIMS data in our pipeline to diagnose diabetes from a simple urine sample using machine learning classifiers. We present results for data generated from a case-control study of 115 urine samples, collected from 72 type II diabetic patients and 43 healthy volunteers as negative controls. The resulting pipeline combines the steps that gave the best classification model performance. These include a two-dimensional discrete wavelet transform and the Wilcoxon rank-sum test for feature selection. We achieve a best ROC curve AUC of 0.825 (0.747–0.9, 95% CI) for classification of diabetes vs control. We also note that this result is robust to changes in the data pipeline and across different analysis runs, with AUC > 0.80 achieved in a range of cases.
    This is a substantial improvement in performance over previously used data processing methods in this area. Our ability to make strong statements about the ability of FAIMS to diagnose diabetes is sadly limited, as we found confounding effects from the demographics when including these data in the pipeline. The demographics alone produced a best AUC of 0.87 (0.795–0.94, 95% CI). While combining the demographics and FAIMS data improved the AUC (0.907; 0.848–0.97, 95% CI), the difference was not significant. Nevertheless, the pipeline itself shows a significant improvement in performance over more basic methods that have been used with FAIMS data in the past.
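    The steps named in this abstract (a two-dimensional discrete wavelet transform, rank-sum feature selection, and ROC AUC evaluation) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic random "images" with an injected class difference, the wavelet is a single-level Haar transform chosen for brevity, the classifier is a simple mean-difference projection, and every function name below is hypothetical. Features are ordered by |U - n1*n2/2|, which gives the same ordering as the Wilcoxon rank-sum normal-approximation p-value when there are no ties.

```python
import numpy as np


def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT; both dimensions must be even."""
    a = np.asarray(image, dtype=float)
    s = np.sqrt(2.0)
    # Transform along rows: pairwise sums (low-pass) and differences (high-pass).
    lo = (a[:, 0::2] + a[:, 1::2]) / s
    hi = (a[:, 0::2] - a[:, 1::2]) / s
    # Transform along columns on each half.
    ll = (lo[0::2] + lo[1::2]) / s
    lh = (lo[0::2] - lo[1::2]) / s
    hl = (hi[0::2] + hi[1::2]) / s
    hh = (hi[0::2] - hi[1::2]) / s
    return ll, lh, hl, hh


def wavelet_features(image):
    """Flatten all four sub-bands into one feature vector."""
    return np.concatenate([band.ravel() for band in haar_dwt2(image)])


def mann_whitney_u(pos, neg):
    """U statistic for pos vs neg; tied pairs count as 0.5."""
    diff = np.asarray(pos, float)[:, None] - np.asarray(neg, float)[None, :]
    return float(np.sum(diff > 0) + 0.5 * np.sum(diff == 0))


def rank_sum_scores(X, y):
    """Per-feature |U - n1*n2/2|; larger means stronger class separation."""
    pos, neg = X[y == 1], X[y == 0]
    half = len(pos) * len(neg) / 2.0
    return np.array([abs(mann_whitney_u(pos[:, j], neg[:, j]) - half)
                     for j in range(X.shape[1])])


def auc(scores, y):
    """ROC AUC, computed as U / (n_pos * n_neg)."""
    pos, neg = scores[y == 1], scores[y == 0]
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


def demo(seed=0, k=10):
    """Synthetic run: 72 'cases' and 43 'controls', as in the study sizes."""
    rng = np.random.default_rng(seed)
    y = np.array([1] * 72 + [0] * 43)
    images = rng.normal(size=(len(y), 16, 16))
    images[y == 1, 4:8, 4:8] += 0.8  # injected, purely synthetic class signal
    X = np.stack([wavelet_features(img) for img in images])
    order = rng.permutation(len(y))
    train, test = order[:80], order[80:]
    # Select features on the training split only, to avoid leakage.
    top = np.argsort(rank_sum_scores(X[train], y[train]))[-k:]
    Xtr, Xte = X[train][:, top], X[test][:, top]
    # Stand-in classifier: project onto the difference of class means.
    direction = Xtr[y[train] == 1].mean(axis=0) - Xtr[y[train] == 0].mean(axis=0)
    return auc(Xte @ direction, y[test])


if __name__ == "__main__":
    print("held-out AUC on synthetic data:", demo())
```

    Selecting features inside the training split matters here: ranking features on all 115 samples before splitting would let test labels influence the selection and inflate the reported AUC.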

    Estimating the concentration of physico chemical parameters in hydroelectric power plant reservoir

    The United Nations Educational, Scientific and Cultural Organization (UNESCO) defines the Amazon region and adjacent areas, such as the Pantanal, as world heritage territories, since they possess unique flora and fauna and great biodiversity. Unfortunately, these regions have increasingly been suffering from anthropogenic impacts. One of the main anthropogenic impacts in recent decades has been the construction of hydroelectric power plants. As a result, dramatic alteration of these ecosystems has been observed, including changes in water levels, decreased oxygenation and loss of downstream organic matter, with consequent intense land use and population influxes after the filling and operation of these reservoirs. This, in turn, leads to extreme loss of biodiversity in these areas, due to large-scale deforestation. The fishing industry in place before construction of dams and reservoirs, for example, has become much more intense, attracting large populations in search of work, employment and income. Environmental monitoring is fundamental for reservoir management, and several studies around the world have been performed in order to evaluate the water quality of these ecosystems. The Brazilian Amazon, in particular, goes through well-defined annual hydrological cycles, which are very important since their study aids in monitoring anthropogenic environmental impacts and can inform policy and decision making with regard to environmental management of this area. The water quality of Amazon reservoirs is greatly influenced by this defined hydrological cycle, which, in turn, causes variations in microbiological, physical and chemical characteristics. Eutrophication, one of the main processes leading to water deterioration in lentic environments, is mostly caused by anthropogenic activities, such as the release of industrial and domestic effluents into water bodies.
    Physico-chemical water parameters typically related to eutrophication are, among others, chlorophyll-a levels, transparency and total suspended solids, which can thus be used to assess the eutrophic state of water bodies. Usually, these parameters must be investigated by going out to the field and manually measuring water transparency with a Secchi disk, and by taking water samples to the laboratory in order to obtain chlorophyll-a and total suspended solid concentrations. These processes are time-consuming and require trained personnel. However, we have proposed other techniques for environmental monitoring studies that do not require fieldwork, such as remote sensing and computational intelligence. Simulations in different reservoirs were performed to determine a relationship between these physico-chemical parameters and the spectral response. Based on the in situ measurements, empirical models were established to relate these parameters to the reflectance of the reservoir measured by the satellites. The images were calibrated and atmospherically corrected. Statistical analysis using error estimation was used to identify the most accurate methodology. The neural networks were trained by hydrological cycle, and were useful for estimating the physico-chemical parameters of the water from the reflectance in the visible and NIR bands of satellite images, with better results for the period with few clouds in the regions analyzed. The present study applies a wavelet neural network to estimate water quality parameters, using concentrations from water samples collected in the Amazon reservoir and in Cefni reservoir, UK. Satellite images from Landsat and Sentinel-2 were used to train the ANN by hydrological cycle. The trained ANNs showed good agreement between observed and estimated values after atmospheric correction of the satellite images.
    The ANNs presented in the results are useful for estimating these concentrations using remote sensing and wavelet transforms for image processing. Therefore, the techniques proposed and applied in the present study are noteworthy since they can aid in evaluating important physico-chemical parameters, which, in turn, allows for identification of possible anthropogenic impacts, and is relevant to environmental management and policy decision-making processes. The test results showed that the predicted values have good accuracy, improving the efficiency of monitoring water quality parameters and confirming the reliability and accuracy of the approaches proposed for monitoring water reservoirs. This thesis contributes to the evaluation of the accuracy of different methods for estimating physico-chemical parameters from satellite images and artificial neural networks. For future work, the accuracy of the results can be improved by adding more satellite images and by testing new neural networks with applications in new water reservoirs.
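    The idea of estimating a water quality parameter from band reflectance and judging the model by an error estimate can be sketched as follows. This is an illustration under loud assumptions, not the method of the thesis: it uses a plain one-hidden-layer network rather than a wavelet neural network, the four "bands" and the target relation are entirely synthetic, and all names and hyperparameters are hypothetical. A separate model per hydrological period could be fitted the same way by calling the training function on each period's samples.

```python
import numpy as np


def make_synthetic_samples(n=200, seed=0):
    """Synthetic reflectance in four bands and a made-up target concentration."""
    rng = np.random.default_rng(seed)
    refl = rng.uniform(0.02, 0.30, size=(n, 4))  # columns: blue, green, red, NIR
    blue, green, red, nir = refl.T
    # Invented relation, for illustration only; not an empirical model.
    target = (5.0 + 40.0 * green - 30.0 * blue + 200.0 * nir * red
              + rng.normal(0.0, 0.3, n))
    return refl, target


def train_mlp(X, y, hidden=8, epochs=2000, lr=0.05, seed=0):
    """Full-batch gradient descent on mean squared error, tanh hidden layer."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        err = (H @ W2 + b2).ravel() - y     # prediction error
        losses.append(float(np.mean(err ** 2)))
        g_pred = (2.0 / len(y)) * err[:, None]
        gW2, gb2 = H.T @ g_pred, g_pred.sum(axis=0)
        gZ = (g_pred @ W2.T) * (1.0 - H ** 2)  # backprop through tanh
        gW1, gb1 = X.T @ gZ, gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses


def predict(params, X):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()


def rmse(observed, estimated):
    """Root mean squared error between observed and estimated values."""
    diff = np.asarray(observed, float) - np.asarray(estimated, float)
    return float(np.sqrt(np.mean(diff ** 2)))


def demo():
    refl, target = make_synthetic_samples()
    Xtr, Xte, ytr, yte = refl[:150], refl[150:], target[:150], target[150:]
    # Standardize with training statistics only.
    mx, sx, my, sy = Xtr.mean(axis=0), Xtr.std(axis=0), ytr.mean(), ytr.std()
    params, losses = train_mlp((Xtr - mx) / sx, (ytr - my) / sy)
    estimated = predict(params, (Xte - mx) / sx) * sy + my
    return losses, rmse(yte, estimated)


if __name__ == "__main__":
    losses, test_rmse = demo()
    print("training MSE, first vs last epoch:", losses[0], losses[-1])
    print("held-out RMSE (synthetic units):", test_rmse)
```

    Reporting the error on samples held out from training mirrors the comparison of observed and estimated values described above; the training loss alone would overstate accuracy.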