227 research outputs found

    Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction

    Background: Multifactor Dimensionality Reduction (MDR) has been introduced previously as a non-parametric statistical method for detecting gene-gene interactions. MDR performs a dimensional reduction by assigning multi-locus genotypes to either high- or low-risk groups and measuring the percentage of cases and controls incorrectly labelled by this classification, known as the classification error. The combination of variables that produces the lowest classification error is selected as the best or most fit model. The correctly and incorrectly labelled cases and controls can be expressed as a two-way contingency table. We sought to improve the ability of MDR to detect gene-gene interactions by replacing classification error with a different measure to score model quality. Results: In this study, we compare the detection and power of MDR using a variety of measures for two-way contingency table analysis. We simulated 40 genetic models, varying the number of disease loci in the model (2 to 5), the allele frequencies of the disease loci (0.2/0.8 or 0.4/0.6) and the broad-sense heritability of the model (0.05 to 0.3). Overall, detection using normalized mutual information (NMI) was 65.36% across all models and specific detection was 59.4%, compared with 62% detection and 52.2% specific detection using classification error. Conclusion: Of the 10 measures evaluated, the likelihood ratio and NMI consistently improve the detection and power of MDR in simulated data over classification error. These measures also reduce the inclusion of spurious variables in a multi-locus model. Thus MDR, which has already been demonstrated as a powerful tool for detecting gene-gene interactions, can be improved with the use of alternative fitness functions.
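The scoring step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the case:control threshold of 1.0, and the choice to label tied cells low-risk are all assumptions made for illustration. It labels each multi-locus genotype cell high- or low-risk, builds the two-way contingency table, and computes the classification error from it.

```python
from collections import Counter


def mdr_contingency(genotypes, status, threshold=1.0):
    """Label genotype cells high/low-risk and return (tp, fp, fn, tn).

    genotypes: list of tuples, one multi-locus genotype per individual
    status:    list of 1 (case) or 0 (control), same length as genotypes
    threshold: case:control ratio above which a cell is called high-risk
               (1.0 is an assumed value suited to balanced data)
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    tp = fp = fn = tn = 0
    for cell in set(cases) | set(controls):
        n_case, n_ctrl = cases[cell], controls[cell]
        # Assumption: ties (n_case == threshold * n_ctrl) are low-risk.
        if n_case > threshold * n_ctrl:
            tp += n_case   # cases in a high-risk cell: correctly labelled
            fp += n_ctrl   # controls in a high-risk cell: mislabelled
        else:
            fn += n_case   # cases in a low-risk cell: mislabelled
            tn += n_ctrl   # controls in a low-risk cell: correctly labelled
    return tp, fp, fn, tn


def classification_error(tp, fp, fn, tn):
    """Fraction of cases and controls incorrectly labelled."""
    return (fp + fn) / (tp + fp + fn + tn)


# Toy data: eight individuals genotyped at two loci (made-up values).
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (0, 0), (1, 1)]
status = [1, 1, 0, 0, 1, 0, 0, 1]
table = mdr_contingency(genotypes, status)
print(table, classification_error(*table))  # (4, 2, 0, 2) 0.25
```

Any alternative measure of the kind the study compares would be computed from the same four counts in place of `classification_error`.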

    Synthesis-View: visualization and interpretation of SNP association results for multi-cohort, multi-phenotype data and meta-analysis

    Background: Initial genome-wide association study (GWAS) discoveries are being further explored through the use of large cohorts across multiple and diverse populations, involving meta-analyses within large consortia and networks. Many of the additional studies characterize fewer than 100 single nucleotide polymorphisms (SNPs), often include multiple and correlated phenotypic measurements, and can include data from multiple sites, multiple studies, and multiple races/ethnicities. New approaches for visualizing the resulting data are necessary in order to fully interpret results and obtain a broad view of the trends between DNA variation and phenotypes, as well as to provide information on specific SNP and phenotype relationships. Results: The Synthesis-View software tool was designed to visually synthesize the results of these types of studies. Presented herein are multiple examples of the ways Synthesis-View can be used to report results from association studies of DNA variation and phenotypes, including the visual integration of p-values or other metrics of significance, allele frequencies, sample sizes, effect size, and direction of effect. Conclusions: Visually integrating the multiple pieces of information typical of a genetic association study requires innovative views. We have therefore created the "Synthesis-View" software for the visualization of genotype-phenotype association data in multiple cohorts. Synthesis-View is freely available for non-commercial research institutions; for full details see https://chgr.mc.vanderbilt.edu/synthesisview.

    AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment

    Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds, and the size of proprietary as well as public data sets is growing rapidly. Hence, there is a demand for computationally efficient machine learning algorithms that are easily available to researchers without extensive machine learning knowledge. Because they support the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source, state-of-the-art, high performance machine learning platform that interfaces multiple customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models by providing the full workflow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated workflow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge, as flexible applications can be created not only at a scripting level but also in a graphical programming environment. Conclusions: AZOrange is a step towards meeting the need for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models that fulfil regulatory requirements.

    A Comparison of Machine Learning and Classical Demand Forecasting Methods: A Case Study of Ecuadorian Textile Industry

    This document presents a comparison of demand forecasting methods, with the aim of improving demand forecasting and, with it, the production planning system of the Ecuadorian textile industry. These industries have difficulty providing a reliable estimate of future demand because of recent changes in the Ecuadorian context. The impact on demand for textile products has been observed in variables such as sales prices and manufacturing costs, manufacturing gross domestic product and the unemployment rate. These indicators largely determine the quality and accuracy of the forecast and also generate scenarios of uncertainty. For this reason, this work focuses on demand forecasting for textile products by comparing a set of classical methods (ARIMA, STL Decomposition, Holt-Winters) with machine learning methods (Artificial Neural Networks, Bayesian Networks, Random Forest, Support Vector Machine), taking the variables above into consideration as an essential input for the production planning and sales of the textile industries, and as support for developing demand management strategies and medium-term decision making in the sector under study. Finally, the effectiveness of the methods is demonstrated by comparing them using different indicators that evaluate the forecast error, with the Multi-layer Neural Networks giving the best results, with the least error and the best performance. The authors are grateful for the support given by the SDAS Research Group (https://sdas-group.com/). Lorente-Leyva, LL.; Alemany Díaz, MDM.; Peluffo-Ordóñez, DH.; Herrera-Granda, ID. (2021). A Comparison of Machine Learning and Classical Demand Forecasting Methods: A Case Study of Ecuadorian Textile Industry. Lecture Notes in Computer Science. 131-142. https://doi.org/10.1007/978-3-030-64580-9_11

    A General Framework for Formal Tests of Interaction after Exhaustive Search Methods with Applications to MDR and MDR-PDT

    The initial presentation of multifactor dimensionality reduction (MDR) featured cross-validation to mitigate over-fitting, computationally efficient searches of the epistatic model space, and variable construction with constructive induction to alleviate the curse of dimensionality. However, the method was unable to differentiate association signals arising from true interactions from those due to independent main effects at individual loci. This issue leads to problems in inference and interpretability for the results from MDR and its family-based complement, the MDR-pedigree disequilibrium test (MDR-PDT). A suggestion from previous work was to fit regression models post hoc to specifically evaluate the null hypothesis of no interaction for MDR or MDR-PDT models. We demonstrate with simulation that fitting a regression model on the same data as that analyzed by MDR or MDR-PDT is not a valid test of interaction. This is likely to be true for any other procedure that searches for models and then performs an uncorrected test for interaction. We also show with simulation that when strong main effects are present and the null hypothesis of no interaction is true, MDR and MDR-PDT reject at far greater than the nominal rate. We also provide a valid regression-based permutation test procedure that specifically tests the null hypothesis of no interaction and does not reject the null when only main effects are present. The regression-based permutation test implemented here conducts a valid test of interaction after a search for multilocus models, and can be applied to any method that conducts a search to find a multilocus model representing an interaction.

    Frequency-specific hippocampal-prefrontal interactions during associative learning

    Much of our knowledge of the world depends on learning associations (for example, face-name), for which the hippocampus (HPC) and prefrontal cortex (PFC) are critical. HPC-PFC interactions have rarely been studied in monkeys, whose cognitive and mnemonic abilities are akin to those of humans. We found functional differences and frequency-specific interactions between HPC and PFC of monkeys learning object pair associations, an animal model of human explicit memory. PFC spiking activity reflected learning in parallel with behavioral performance, whereas HPC neurons reflected feedback about whether trial-and-error guesses were correct or incorrect. Theta-band HPC-PFC synchrony was stronger after errors, was driven primarily by PFC to HPC directional influences and decreased with learning. In contrast, alpha/beta-band synchrony was stronger after correct trials, was driven more by HPC and increased with learning. Rapid object associative learning may occur in PFC, whereas HPC may guide neocortical plasticity by signaling success or failure via oscillatory synchrony in different frequency bands. Funding: National Institute of Mental Health (U.S.) (Conte Center Grant P50-MH094263-03); National Institute of Mental Health (U.S.) (Fellowship F32-MH081507); Picower Foundation.

    Role of N-terminal tau domain integrity on the survival of cerebellar granule neurons

    Although the role of the microtubule-binding domain of the tau protein in the modulation of microtubule assembly is widely established, other possible functions of this protein have been poorly investigated. We have analyzed the effect of adenovirally mediated expression of two fragments of the N-terminal portion of the tau protein, both lacking the microtubule-binding domain, in cerebellar granule neurons (CGNs). We found that while the expression of the tau (1-230) fragment, as well as of full-length tau, inhibits the onset of apoptosis, the tau (1-44) fragment exerts a powerful toxic action on the same neurons. The antiapoptotic action of tau (1-230) is exerted at the level of Akt-mediated activation of the caspase cascade. On the other hand, the toxic action of the (1-44) fragment is not prevented by inhibitors of CGN apoptosis, but is fully inhibited by NMDA receptor antagonists. These findings point to a novel, physiological role of the N-terminal domain of tau, but also underline that its possible proteolytic truncation mediated by apoptotic proteases may generate a highly toxic fragment that could contribute to neuronal death.

    ATHENA: A knowledge-based hybrid backpropagation-grammatical evolution neural network algorithm for discovering epistasis among quantitative trait Loci

    Background: Growing interest and burgeoning technology for discovering genetic mechanisms that influence disease processes have ushered in a flood of genetic association studies over the last decade, yet little heritability in highly studied complex traits has been explained by genetic variation. Non-additive gene-gene interactions, which are not often explored, are thought to be one source of this "missing" heritability. Methods: Stochastic methods employing evolutionary algorithms have demonstrated promise in being able to detect and model gene-gene and gene-environment interactions that influence human traits. Here we demonstrate modifications to a neural network algorithm in ATHENA (the Analysis Tool for Heritable and Environmental Network Associations) resulting in clear performance improvements for discovering gene-gene interactions that influence human traits. We employed an alternative tree-based crossover, backpropagation for locally fitting neural network weights, and incorporation of domain knowledge obtainable from publicly accessible biological databases for initializing the search for gene-gene interactions. We tested these modifications in silico using simulated datasets. Results: We show that the alternative tree-based crossover modification resulted in a modest increase in the sensitivity of the ATHENA algorithm for discovering gene-gene interactions. The performance increase was highly statistically significant when backpropagation was used to locally fit neural network weights. We also demonstrate that using domain knowledge to initialize the search for gene-gene interactions results in a large performance increase, especially when the search space is larger than the search coverage. Conclusions: We show that a hybrid optimization procedure, alternative crossover strategies, and incorporation of domain knowledge from publicly available biological databases can result in marked increases in sensitivity and performance of the ATHENA algorithm for detecting and modelling gene-gene interactions that influence a complex human trait.

    Molecular mechanisms of cell death: recommendations of the Nomenclature Committee on Cell Death 2018.

    Over the past decade, the Nomenclature Committee on Cell Death (NCCD) has formulated guidelines for the definition and interpretation of cell death from morphological, biochemical, and functional perspectives. Since the field continues to expand and novel mechanisms that orchestrate multiple cell death pathways are unveiled, we propose an updated classification of cell death subroutines focusing on mechanistic and essential (as opposed to correlative and dispensable) aspects of the process. As we provide molecularly oriented definitions of terms including intrinsic apoptosis, extrinsic apoptosis, mitochondrial permeability transition (MPT)-driven necrosis, necroptosis, ferroptosis, pyroptosis, parthanatos, entotic cell death, NETotic cell death, lysosome-dependent cell death, autophagy-dependent cell death, immunogenic cell death, cellular senescence, and mitotic catastrophe, we discuss the utility of neologisms that refer to highly specialized instances of these processes. The mission of the NCCD is to provide a widely accepted nomenclature on cell death in support of the continued development of the field.