25 research outputs found

    Radiomics analysis of bone marrow biopsy locations in [18F]FDG PET/CT images for measurable residual disease assessment in multiple myeloma.

    The combination of visual assessment of whole-body [18F]FDG PET images and evaluation of bone marrow samples by Multiparameter Flow Cytometry (MFC) or Next-Generation Sequencing (NGS) is currently the most common clinical practice for the detection of Measurable Residual Disease (MRD) in Multiple Myeloma (MM) patients. In this study, radiomic features extracted from the bone marrow biopsy locations are analyzed and compared to those extracted from the whole bone marrow in order to study the representativeness of these biopsy locations in image-based MRD assessment. Whole-body [18F]FDG PET images of 39 patients with newly diagnosed MM were included in the database and visually evaluated by experts in nuclear medicine. A methodology is proposed for segmenting the biopsy sites (sternum and posterior iliac crest) from PET images and subsequently quantifying them. First, starting from the bone marrow segmentation, a segmentation of the biopsy sites is performed. Then, the segmentations are quantified by extracting SUV metrics and radiomic features from the [18F]FDG PET images, and Mann-Whitney U-tests are used to evaluate whether these features differentiate the PET+/PET- and MFC+/MFC- groups. Moreover, the correlation between whole bone marrow and biopsy sites is studied using Spearman's rank correlation coefficient (ρ). The classification performance of the radiomic features is evaluated by applying seven machine learning algorithms. Statistical analyses reveal that some image features are significant in PET+/PET- differentiation, such as SUVmax, Gray Level Non-Uniformity or Entropy, especially with a balanced database, where 16 of the features show a p value < 0.001. Correlation analyses between whole bone marrow and biopsy sites yield significant and acceptable coefficients, with 11 of the variables reaching a correlation coefficient greater than 0.7, with a maximum of 0.853. 
Machine learning algorithms demonstrate high performance in PET+/PET- classification, reaching a maximum AUC of 0.974, but not in MFC+/MFC- classification. The results demonstrate the representativeness of the sample sites as well as the effectiveness of the extracted features (SUV metrics and radiomic features) from the [18F]FDG PET images in MRD assessment in MM patients. The author E.M. received financial support through a predoctoral Fellowship (ayuda del Programa Propio de I+D+i 2020) from Universidad Politecnica de Madrid. The project was partially supported by COVITECH-CM (Plataforma cientifico-tecnologica para alerta, diagnostico, pronostico, terapia y seguimiento de la enfermedad COVID19 y futuras pandemias) and REACT-UE through the European Regional Development Fund (ERDF), the European Social Fund (ESF) and the Fund for European Aid to the Most Deprived (FEAD). Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
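The two statistics named in this abstract, Spearman's ρ and the Mann-Whitney U statistic (which is directly related to the AUC), can be illustrated with a minimal standard-library sketch. The feature values below are invented for illustration and are not data from the study; the function names are hypothetical, and p-value computation is deliberately left out.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(positive, negative):
    """U statistic of the positive group (pairs with pos > neg, ties count 0.5)."""
    ranks = average_ranks(list(positive) + list(negative))
    rank_sum_pos = sum(ranks[: len(positive)])
    return rank_sum_pos - len(positive) * (len(positive) + 1) / 2


def auc_from_u(positive, negative):
    """AUC of a single feature used as a score: U / (n_pos * n_neg)."""
    return mann_whitney_u(positive, negative) / (len(positive) * len(negative))


# Invented feature values (e.g. one SUV metric per patient), for illustration only.
whole_marrow = [2.1, 3.4, 1.8, 4.0, 2.9]
biopsy_site = [1.9, 3.0, 2.0, 3.8, 2.5]
rho = spearman_rho(whole_marrow, biopsy_site)

pet_positive = [4.0, 3.4, 2.9]
pet_negative = [2.1, 1.8, 3.0]
auc = auc_from_u(pet_positive, pet_negative)
```

In practice a statistics library would normally be used for these tests, since it also provides p-values and tie corrections.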

    Predicting the risk of cancer in adults using supervised machine learning: a scoping review

    OBJECTIVES: The purpose of this scoping review is to: (1) identify existing supervised machine learning (ML) approaches to the prediction of cancer in asymptomatic adults; (2) compare the performance of ML models with each other and (3) identify potential gaps in research. DESIGN: Scoping review using the population, concept and context approach. SEARCH STRATEGY: The PubMed search engine was used from inception to 10 November 2020 to identify literature meeting the following inclusion criteria: (1) a general adult (≥18 years) population, either sex, asymptomatic (population); (2) any study using ML techniques to derive predictive models for future cancer risk using clinical and/or demographic and/or basic laboratory data (concept) and (3) original research articles conducted in all settings in any region of the world (context). RESULTS: The search returned 627 unique articles, of which 580 were excluded because they did not meet the inclusion criteria, were duplicates or were related to benign neoplasm. Full-text reviews were conducted for 47 articles, and a final set of 10 articles was included in this scoping review. These 10 very heterogeneous studies used ML to predict future cancer risk in asymptomatic individuals. All studies reported area under the receiver operating characteristics curve (AUC) values as metrics of model performance, but no study reported measures of model calibration. 
CONCLUSIONS: Research gaps that must be addressed in order to deliver validated ML-based models to assist clinical decision-making include: (1) establishing model generalisability through validation in independent cohorts, including those from low-income and middle-income countries; (2) establishing models for all cancer types; (3) thorough comparisons of ML models with the best available clinical tools to ensure transparency of their potential clinical utility; (4) reporting of model calibration performance and (5) comparisons of different methods on the same cohort to reveal important information about model generalisability and performance.
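The distinction this review draws between discrimination (AUC) and calibration can be made concrete with a small sketch. The labels and predicted risks below are invented, and the Brier score and calibration-in-the-large are only two of several possible calibration summaries; the review itself does not prescribe a specific one.

```python
def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate (0 is ideal)."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Invented outcomes and predicted risks, for illustration only.
outcomes = [0, 0, 1, 0, 1, 1]
model_a = [0.1, 0.3, 0.2, 0.4, 0.7, 0.9]
# Model B ranks patients identically but systematically halves every risk.
model_b = [p * 0.5 for p in model_a]

auc_a, auc_b = auc(outcomes, model_a), auc(outcomes, model_b)
brier_a, brier_b = brier_score(outcomes, model_a), brier_score(outcomes, model_b)
bias_a = calibration_in_the_large(outcomes, model_a)
bias_b = calibration_in_the_large(outcomes, model_b)
```

Both hypothetical models have the same AUC because they rank individuals identically, yet model B underestimates risk across the board, which only the calibration measures reveal.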

    OMWS: A Web Service Interface for Ecological Niche Modelling

    Ecological niche modelling (ENM) experiments often involve a high number of tasks to be performed. Such tasks may consume a significant amount of computing resources and take a long time to complete, especially when using personal computers. OMWS is a Web service interface that allows more powerful computing back-ends to be remotely exploited by other applications to carry out ENM tasks. Its latest version includes a new operation that can be used to specify complex workflows in a single request, adding the possibility of using workflow management systems on parallel computing back-ends. In this paper we describe the OMWS protocol and compare its most recent version with the previous one by running the same ENM experiment using two functionally equivalent clients, each designed for one of the OMWS interface versions. Different back-end configurations were used to investigate how the performance scales for each protocol version when more processing power is made available. Results show that the new version outperforms the previous one by a factor of 2 when more computing resources are used. The latest version of OMWS contains improvements arising from different sets of requirements from the two projects that funded their implementation: EUBrazilOpenBio, with grants from the European Commission and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science and Technology (MCT), and BioVeL, with grants from the European Commission. Server infrastructure was operated through a provisioning system developed in the frame of the Spanish project CLUVIEM (TIN2013-44390-R) funded by the "Ministerio de Economía y Competitividad". Giovanni, RD.; Torres Serrano, E.; Amaral, RB.; Blanquer Espert, I.; Rebello, V.; Canhos, VP. (2015). OMWS: A Web Service Interface for Ecological Niche Modelling. Biodiversity Informatics. 10:35-44. https://doi.org/10.17161/bi.v10i0.4853

    Varying dataset resolution alters predictive accuracy of spatially explicit ensemble models for avian species distribution

    Species distribution models can be made more accurate by use of new “Spatiotemporal Exploratory Models” (STEMs), a type of spatially explicit ensemble model (SEEM) developed at the continental scale that averages regional models pixel by pixel. Although SEEMs can generate more accurate predictions of species distributions, they are computationally expensive. We compared the accuracies of each model for 11 grassland bird species and examined whether they improve accuracy at a statewide scale for fine and coarse predictor resolutions. We used a combination of survey data and citizen science data for 11 grassland bird species in Oklahoma to test a spatially explicit ensemble model at a smaller scale for its effects on accuracy of current models. We found that only four species performed best with either a statewide model or SEEM; the most accurate model for the remaining seven species varied with data resolution and performance measure. Policy implications: Determination of nonheterogeneity may depend on the spatial resolution of the examined dataset. Managers should be cautious if any regional differences are expected when developing policy from range‐wide results that show a single model or timeframe. We recommend use of standard species distribution models or other types of nonspatially explicit ensemble models for local species prediction models. Further study is necessary to understand at what point SEEMs become necessary with varying dataset resolutions. Article processing charges funded by University of Oklahoma Libraries. This work was funded by U.S. Department of Agriculture (USDA) NIFA grant 2013‐67009‐20369 to ESB and supported by the AWS Cloud Credits for Research program. CMC was supported by National Science Foundation (NSF) grants IDBR 1014891 and ABI 1458402 to ESB and Oklahoma Department of Wildlife Conservation grant F17AF01294 (W‐194‐R‐1) to M.A. Patten. 
AJC was supported by NSF grants IDBR 1014891, DGE 1545261, and DEB 0946685 and by USDA grant NIFA‐AFRI‐003536. Additional support was provided by the University Strategic Organization in "Applied Aeroecology" at the University of Oklahoma.
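The idea of an ensemble that "averages regional models pixel by pixel" can be sketched as follows. This is only an illustration under simple assumptions: each regional model is represented as a mapping from a hypothetical (row, column) pixel index to a predicted occurrence probability, and all models covering a pixel are weighted equally. How the regions are drawn and weighted in the actual study is not stated in the abstract.

```python
from collections import defaultdict


def average_regional_predictions(regional_predictions):
    """Average, pixel by pixel, the predictions of all regional models covering it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for prediction in regional_predictions:
        for pixel, value in prediction.items():
            totals[pixel] += value
            counts[pixel] += 1
    return {pixel: totals[pixel] / counts[pixel] for pixel in totals}


# Two invented, partially overlapping regional predictions (values are made up).
west_model = {(0, 0): 0.2, (0, 1): 0.4}
east_model = {(0, 1): 0.8, (0, 2): 0.6}

ensemble = average_regional_predictions([west_model, east_model])
# Pixel (0, 1) is covered by both regions, so its value is the mean of 0.4 and 0.8.
```

Pixels covered by a single region simply keep that region's prediction, while overlapping pixels are smoothed across models.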

    Combined mechanistic modeling and machine-learning approaches in systems biology - A systematic literature review

    Background and objective: Mechanistic model (MM) simulations are an effective approach, commonly employed for research and learning purposes, to investigate and understand the inherent behavior of biological systems. Recent technological advances and the wide availability of omics data have allowed Machine Learning (ML) techniques to be applied to different research fields, including systems biology. However, the availability of information regarding the analyzed biological context and of sufficient experimental data, as well as the degree of computational complexity, are some of the issues that MMs and ML techniques can each present individually. For this reason, several recent studies suggest overcoming or significantly reducing these drawbacks by combining the two methods. Given the growing interest in this hybrid approach, this review systematically investigates the studies in the scientific literature in which MMs and ML have been combined to explain biological processes at the genomics, proteomics, and metabolomics levels, or the behavior of entire cellular populations. Methods: Elsevier Scopus®, Clarivate Web of Science™ and National Library of Medicine PubMed® databases were queried using the queries reported in Table 1, resulting in 350 scientific articles. Results: Only 14 of the 350 documents returned by the comprehensive search of the three major online databases met our search criteria, i.e. presented a hybrid approach consisting of the synergistic combination of MMs and ML to treat a particular aspect of systems biology. Conclusions: Although interest in this methodology is recent, a careful analysis of the selected papers shows that examples of integration between MMs and ML are already present in systems biology, highlighting the great potential of this hybrid approach at both micro and macro biological scales.

    Machine learning models applied to the estimation of reference evapotranspiration in the Planalto Ocidental Paulista

    Evapotranspiration depends on the interaction between meteorological variables (solar radiation, air temperature, precipitation, relative humidity and wind speed) and phytosanitary conditions of agricultural crops. It is difficult to obtain reliable evapotranspiration measurements because of the high cost of implementing micrometeorological techniques and the difficulty of operating and maintaining the necessary equipment. The purpose of this research was to model the reference evapotranspiration (ETo) with machine learning techniques, using climatic data from 30 automatic weather stations in the Planalto Ocidental Paulista, State of São Paulo, Brazil, for the period 2013-2017. A comparison of the statistical performance of the techniques was carried out: the EToMLP4 model performed best (rRMSE = 0.62%), followed by EToANFIS4 (rRMSE = 0.75%), EToSVM4 (rRMSE = 1.19%) and EToGRNN4 (rRMSE = 11.05%). Performance measures on the validation set show that the proposed models are able to estimate the reference evapotranspiration, with emphasis on the MLP technique. Keywords: evapotranspiration; modeling; machine learning.
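The abstract reports model quality as rRMSE without defining it. A minimal sketch follows, assuming the common definition in which the root mean square error is divided by the mean of the observed values and expressed as a percentage; the paper may normalise differently. The function name and the evapotranspiration values are invented for illustration.

```python
from math import sqrt


def rrmse_percent(observed, estimated):
    """Relative RMSE in percent: 100 * RMSE / mean(observed) (assumed definition)."""
    if not observed or len(observed) != len(estimated):
        raise ValueError("observed and estimated must be non-empty and equally long")
    mse = sum((o - e) ** 2 for o, e in zip(observed, estimated)) / len(observed)
    return 100.0 * sqrt(mse) / (sum(observed) / len(observed))


# Invented daily reference evapotranspiration values in mm/day, not study data.
observed_eto = [4.0, 5.0, 6.0]
estimated_eto = [4.1, 4.9, 6.0]

error = rrmse_percent(observed_eto, estimated_eto)
```

Under this definition a lower percentage means the estimates sit closer to the observations relative to their typical magnitude, which is how the model ranking in the abstract reads.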

    Comparison of machine learning techniques for default prediction and application of the VNS heuristic for feature selection

    Credit scoring plays a major role for financial institutions when making credit-granting decisions. In this context, machine learning techniques have been used to develop credit scoring models, as they seek to recognize existing patterns in databases containing the credit history of borrowers and can thus infer which individuals are more likely to default. However, these databases often contain a large number of variables, some of which can be noisy, which harms the analysis. In the present work, a feature selection technique is proposed based on a variable neighborhood concept, called VNS. The applicability of the method is assessed in conjunction with seven of the main techniques used for default prediction in credit analysis problems. Its performance was compared to the feature selection obtained by the well-known PCA statistical method. The results indicate superior performance of the VNS in most of the applied tests, suggesting the robustness of the method.
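A feature selection heuristic based on variable neighborhoods can be sketched as below. This is a generic, textbook-style variable neighborhood search (shake by flipping k features, run a local search, restart from k = 1 on improvement), not the authors' implementation. The scoring function is a toy stand-in; in a credit scoring setting it would presumably be something like the cross-validated performance of a classifier on the selected variables. All names and parameters are hypothetical.

```python
import random


def local_search(mask, score):
    """Flip single features for as long as a flip improves the score."""
    best, best_score = list(mask), score(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            candidate = list(best)
            candidate[i] = not candidate[i]
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
                improved = True
    return best, best_score


def vns_feature_selection(n_features, score, k_max=3, iterations=20, seed=0):
    """Generic VNS over boolean feature masks; returns (mask, score)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current, current_score = local_search(current, score)
    for _ in range(iterations):
        k = 1
        while k <= min(k_max, n_features):
            # Shaking: jump to a random point in the k-th neighborhood
            # by flipping k distinct features.
            candidate = list(current)
            for i in rng.sample(range(n_features), k):
                candidate[i] = not candidate[i]
            candidate, candidate_score = local_search(candidate, score)
            if candidate_score > current_score:
                current, current_score = candidate, candidate_score
                k = 1  # improvement: return to the smallest neighborhood
            else:
                k += 1  # no improvement: widen the neighborhood
    return current, current_score


# Toy objective (an assumption for illustration): features 0, 2 and 5 are
# informative, every other selected feature is noise and is penalised.
INFORMATIVE = {0, 2, 5}


def toy_score(mask):
    useful = sum(1 for i, on in enumerate(mask) if on and i in INFORMATIVE)
    noisy = sum(1 for i, on in enumerate(mask) if on and i not in INFORMATIVE)
    return useful - 0.5 * noisy


selected, best = vns_feature_selection(n_features=8, score=toy_score)
```

With a real, non-separable objective the shaking step matters more, since it lets the search escape subsets that single-feature flips cannot improve.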