68 research outputs found

    Credal Model Averaging: dealing robustly with model uncertainty on small data sets.

    Datasets of population dynamics typically cover a short time span. Under this condition, several alternative models often achieve similar accuracy while returning quite different predictions (model uncertainty). Bayesian model averaging (BMA) addresses this issue by averaging the predictions of the different models, using the posterior probabilities of the models as weights. However, an open problem of BMA is the choice of the prior probability of the models, which can strongly affect the inferences, especially when data are scarce. We present Credal Model Averaging (CMA), which addresses this problem by simultaneously considering a set of prior probability distributions over the models. This makes it possible to represent very weak prior knowledge about the appropriateness of the different models and to easily accommodate expert judgments, considering that in many cases the expert is not willing to commit to a single prior probability distribution. The predictions generated by CMA are intervals whose length shows the sensitivity of the predictions to the choice of the prior over the models.
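
    The interval idea in this abstract can be illustrated with a minimal sketch. It assumes the set of priors is the convex hull of a few extreme priors (an assumption made here for illustration; the abstract does not say how the set is built), and all numbers are hypothetical. Because the BMA prediction is a linear-fractional function of the prior, its minimum and maximum over such a set are reached at the extreme priors.

```python
from typing import List, Sequence, Tuple


def bma_prediction(prior: Sequence[float],
                   marginal_likelihoods: Sequence[float],
                   predictions: Sequence[float]) -> float:
    """BMA prediction: model predictions weighted by posterior model probability."""
    # Unnormalised posterior weight of each model: prior * marginal likelihood.
    weights = [p * l for p, l in zip(prior, marginal_likelihoods)]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, predictions)) / total


def cma_interval(prior_vertices: List[Sequence[float]],
                 marginal_likelihoods: Sequence[float],
                 predictions: Sequence[float]) -> Tuple[float, float]:
    """Lower and upper BMA prediction over the extreme priors of the set."""
    values = [bma_prediction(prior, marginal_likelihoods, predictions)
              for prior in prior_vertices]
    return min(values), max(values)


# Hypothetical inputs: three models, their marginal likelihoods and forecasts.
marginal_likelihoods = [0.020, 0.015, 0.005]
predictions = [10.0, 14.0, 20.0]

# Hypothetical extreme priors; the uniform prior lies inside their convex hull.
prior_vertices = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

lower, upper = cma_interval(prior_vertices, marginal_likelihoods, predictions)
uniform = bma_prediction([1 / 3, 1 / 3, 1 / 3], marginal_likelihoods, predictions)
print(f"BMA with uniform prior: {uniform:.2f}")
print(f"CMA interval: [{lower:.2f}, {upper:.2f}]")
```

    A wide interval signals that the forecast depends heavily on the prior; a narrow one signals that the data dominate.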

    Objective way to support embryo transfer: a probabilistic decision

    STUDY QUESTION: Is it feasible to identify factors that significantly affect the clinical outcome of IVF-ICSI cycles and use them to reliably design a predictor of implantation?
    SUMMARY ANSWER: The Bayesian network (BN) identified top-history embryos, female age and the insemination technique as the most relevant factors for predicting the occurrence of pregnancy (area under the curve, AUC, of 0.72). In addition, it could discriminate between no implantation and single or twin implantations in a prognostic model that can be used prospectively.
    WHAT IS KNOWN ALREADY: The key requirement for achieving a single live birth in an IVF-ICSI cycle is the capacity to estimate embryo viability in relation to maternal receptivity. Nevertheless, the lack of a strong predictor imposes several restrictions on this strategy.
    STUDY DESIGN, SIZE, DURATION: Medical histories, laboratory data and clinical outcomes of all fresh transfer cycles performed at the International Institute for Reproductive Medicine of Lugano, Switzerland, in the period 2006-2008 (n = 388 cycles) were retrospectively evaluated and analyzed.
    PARTICIPANTS/MATERIALS, SETTING, METHODS: Patients were unselected for age, sperm parameters or other infertility criteria. Before patients were admitted to treatment, uterine anomalies were excluded by diagnostic hysteroscopy. To evaluate the factors possibly related to embryo viability and maternal receptivity, the class variable was categorized as pregnancy versus no pregnancy, and the features included: female age, number of previous cycles, insemination technique, sperm of proven fertility, the number of transferred top-history embryos, the number of transferred top-quality embryos, the number of follicles >14 mm and the level of estradiol on the day of HCG administration. To assess the classifier, the performance indicators were computed by cross-validation. Two statistical models were used: the decision tree and the BN.
    MAIN RESULTS AND THE ROLE OF CHANCE: The decision tree identified the number of transferred top-history embryos, female age and the insemination technique as the features discriminating between pregnancy and no pregnancy. The model achieved an accuracy of 81.5%, which was significantly higher than that of the trivial classifier, but the increase was so modest that the model was clinically useless for predicting pregnancy. The BN could more reliably predict the occurrence of pregnancy, with an AUC of 0.72, and confirmed the importance of top-history embryos, female age and insemination technique in determining implantation. In addition, it could discriminate between no implantation, single implantation and twin implantation with AUCs of 0.72, 0.64 and 0.83, respectively.
    LIMITATIONS, REASONS FOR CAUTION: The relatively small sample of the study did not permit the inclusion of more features that could also have a role in determining the clinical outcome. The design of this study was retrospective, to identify the relevant features; a prospective study is now needed to verify the validity of the model.
    WIDER IMPLICATIONS OF THE FINDINGS: The resulting predictive model can discriminate with reasonable reliability between pregnancy and no pregnancy, and can also predict the occurrence of a single or multiple pregnancy. This could represent an effective support for deciding how many embryos and which embryos to transfer for each couple. Because the model is flexible, the number of variables in the predictor can easily be increased to include other features that may affect implantation.
    STUDY FUNDING/COMPETING INTERESTS: This study was supported by a grant, CTI Medtech Project Number: 9707.1 PFLS-L, Swiss Confederation. No competing interests are declared.
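
    The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive case (for example, a cycle ending in pregnancy) receives a higher predicted score than a randomly chosen negative case. The sketch below computes that quantity by direct pairwise comparison. The labels and scores are made up for illustration and are not study data, and the function name is hypothetical.

```python
from typing import Sequence


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = pregnancy, 0 = no pregnancy) and predicted scores.
example_labels = [1, 0, 1, 0, 0, 1]
example_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
example_auc = pairwise_auc(example_labels, example_scores)
print(f"AUC on the toy data: {example_auc:.3f}")
```

    Unlike accuracy, this measure does not reward a classifier for always predicting the majority class, which is why a model can have high accuracy and still be of little clinical use.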

    Binary credal classification under sparsity constraints.

    Binary classification is a well-known problem in statistics. Besides classical methods, several techniques such as the naive credal classifier (for categorical data) and imprecise logistic regression (for continuous data) have been proposed to handle sparse data. However, a convincing approach to classification in high-dimensional problems (i.e., when the number of attributes is larger than the number of observations) is yet to be explored in the context of imprecise probability. In this article, we propose a sensitivity analysis based on a penalised logistic regression scheme that works as a binary classifier for high-dimensional cases. We use an approach based on a set of likelihood functions (i.e., an imprecise likelihood) that assigns a set of weights to the attributes, to ensure a robust selection of the important attributes while training the model at the same time. We perform a sensitivity analysis on the weights of the penalty term, resulting in a set of sparse constraints which helps to identify imprecision in the dataset.
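
    A rough sketch of what a sensitivity analysis over penalty weights can look like is given below. It is a simplified stand-in, not the authors' method: it fits an L1-penalised logistic regression by proximal gradient descent for several values of one global penalty weight, then reports which attributes are selected under every weight and which only under some. The data, the penalty values and the function names are all hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple


def soft_threshold(x: float, t: float) -> float:
    """Proximal operator of the L1 penalty: shrink x towards zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X: Sequence[Sequence[float]], y: Sequence[int], lam: float,
                    step: float = 0.1, iters: int = 2000) -> Tuple[List[float], float]:
    """Proximal gradient descent for mean log-loss + lam * ||w||_1 (intercept unpenalised)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then shrinkage for the penalty.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def selection_sensitivity(X, y, lams: Sequence[float]) -> Tuple[Set[int], Set[int]]:
    """Attributes selected for every penalty weight, and for only some of them."""
    supports = []
    for lam in lams:
        w, _ = fit_l1_logistic(X, y, lam)
        supports.append({j for j, wj in enumerate(w) if wj != 0.0})
    always = set.intersection(*supports)
    sometimes = set.union(*supports) - always
    return always, sometimes


# Hypothetical toy data: attribute 0 is informative, 1 is weak, 2 is noise.
X = [[1.0, 0.3, 0.5], [1.0, -0.2, -0.5], [1.0, 0.4, 0.5], [-0.2, 0.1, -0.5],
     [-1.0, -0.1, 0.5], [-1.0, 0.2, -0.5], [-1.0, -0.3, 0.5], [0.2, 0.0, -0.5]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

always, sometimes = selection_sensitivity(X, y, [0.01, 0.05, 0.2])
print("selected for every penalty weight:", sorted(always))
print("selected only for some weights:", sorted(sometimes))
```

    Attributes in the second group are the ones whose selection depends on the penalty, which is one way to read "imprecision" in this simplified setting.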

    The Buckland Park air shower array

    The new Buckland Park Air Shower Array has been producing analyzed shower data since July 1984. The array is described and some preliminary performance figures are presented.

    Air quality and urban sustainable development: the application of machine learning tools

    Air quality has an effect on a population's quality of life. Because it is a dimension of sustainable urban development, governments have been concerned about this indicator. This is reflected in the literature consulted, which shows progress in forecasting pollution events to issue early warnings using conventional tools that, in the new era of big data, are becoming obsolete. There are a limited number of studies that apply machine learning tools to characterize and forecast the behavior of the environmental, social and economic dimensions of sustainable development as they pertain to air quality. This article presents an analysis of studies that developed machine learning models to forecast sustainable development and air quality. Additionally, this paper sets out to present research that studied the relationship between air quality and urban sustainable development, to identify the reliability and possible applications of these machine learning tools in different urban contexts. To that end, a systematic review was carried out, revealing that machine learning tools have been primarily used for clustering and classifying variables and indicators according to the problem analyzed, while tools such as artificial neural networks and support vector machines are the most widely used to predict different types of events. The nonlinear nature and synergy of the dimensions of sustainable development are of great interest for the application of machine learning tools.
    Molina-Gómez, NI.; Díaz-Arévalo, JL.; López Jiménez, PA. (2021). Air quality and urban sustainable development: the application of machine learning tools. International Journal of Environmental Science and Technology. 18(4):1-18. https://doi.org/10.1007/s13762-020-02896-6

    Bayesian analysis of Jolly-Seber type models; Incorporating heterogeneity in arrival and departure

    We propose the use of finite mixtures of continuous distributions in modelling the process by which new individuals that arrive in groups become part of a wildlife population. We demonstrate this approach using a data set of migrating semipalmated sandpipers (Calidris pusilla), for which we extend existing stopover models to allow individuals to differ in their stopover duration at the site. We demonstrate the use of reversible jump MCMC methods to derive posterior distributions for the model parameters and the models simultaneously. The algorithm moves between models with different numbers of arrival groups as well as between models with different numbers of behavioural groups. The approach is shown to provide new ecological insights about the stopover behaviour of semipalmated sandpipers, but it is generally applicable to any population in which animals arrive in groups and potentially exhibit heterogeneity in one or more other processes.

    Rainfall mediations in the spreading of epidemic cholera

    Following the empirical evidence of a clear correlation between rainfall events and cholera resurgence, observed in particular during the recent outbreak in Haiti, a spatially explicit model of epidemic cholera is re-examined. Specifically, we test a multivariate Poisson rainfall generator, with parameters varying in space and time, as a driver of enhanced disease transmission. The issue is relevant because predictive mathematical models may provide key insight into the course of an ongoing cholera epidemic, aiding emergency management (say, in allocating life-saving supplies or health care staff) or the evaluation of alternative management strategies. Our model consists of a set of dynamical equations (SIRB-like, i.e. subdivided into the compartments of Susceptible, Infected and Recovered individuals, and including a balance of Bacterial concentrations in the water reservoir) describing a connected network of human communities where infection results from exposure to excess concentrations of pathogens in the water. These, in turn, are driven by rainfall washout of open-air defecation sites or cesspool overflows, by hydrologic transport through waterways and by the mobility of susceptible and infected individuals. We perform an a posteriori analysis (from the beginning of the epidemic in October 2010 until December 2011) to test the reliability of the model in predicting cholera cases and in testing control measures, involving vaccination and sanitation campaigns, for the ongoing epidemic. Even though reliably predicting the timing of the epidemic resurgence proves difficult due to inter-annual rainfall variability, we find that the model can reasonably quantify the total number of reported infection cases in the selected time span. We then run a multi-seasonal prediction of the course of the epidemic until December 2015, to investigate conditions for further resurgences and endemicity of cholera in the region, with a view to policies that may lead to the eradication of the disease in Haiti. The projections, although strongly dependent on still uncertain epidemiological processes, show an endemic, seasonal pattern establishing itself in the region, which is better forestalled by improving the sanitation system alone than by vaccination alone. We thus conclude that hydrologic drivers and water resources management prove central to prediction, emergency management and long-term control of epidemic cholera.
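
    The compartment structure described in this abstract can be sketched for a single community. The code below is a strong simplification made for illustration: it drops the network, hydrologic transport and human mobility, uses a dimensionless bacterial concentration, integrates with a plain Euler scheme, and takes rainfall as a given daily series instead of a Poisson generator. The equations, the parameter names and all values are assumptions, not those of the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SIRBParams:
    """Hypothetical daily rates for a single-community SIRB sketch."""
    mu: float = 1.0 / (60 * 365)   # natural birth/death rate
    beta: float = 0.3              # maximum exposure rate
    gamma: float = 0.2             # recovery rate
    alpha: float = 0.004           # disease-induced mortality
    rho: float = 1.0 / (3 * 365)   # loss of immunity
    mu_b: float = 0.2              # decay of bacteria in water
    theta: float = 0.1             # contamination rate (dimensionless B)
    lam: float = 0.2               # rainfall enhancement per mm/day


def simulate_sirb(rain_mm_per_day: Sequence[float], params: SIRBParams,
                  population: float = 10000.0, initial_infected: float = 10.0,
                  dt: float = 0.1) -> Tuple[List[Tuple[float, float, float, float]], float]:
    """Euler integration; returns daily (S, I, R, B) states and cumulative cases."""
    p = params
    s, i, r, b = population - initial_infected, initial_infected, 0.0, 0.0
    cumulative_cases = 0.0
    steps_per_day = round(1.0 / dt)
    trajectory = []
    for rain in rain_mm_per_day:
        for _ in range(steps_per_day):
            # Saturating dose-response: exposure grows with bacterial concentration.
            new_cases = p.beta * b / (1.0 + b) * s
            ds = p.mu * (population - s) - new_cases + p.rho * r
            di = new_cases - (p.gamma + p.mu + p.alpha) * i
            dr = p.gamma * i - (p.rho + p.mu) * r
            # Rainfall amplifies contamination of the water reservoir.
            db = -p.mu_b * b + p.theta * (1.0 + p.lam * rain) * i / population
            s, i, r, b = s + dt * ds, i + dt * di, r + dt * dr, b + dt * db
            cumulative_cases += dt * new_cases
        trajectory.append((s, i, r, b))
    return trajectory, cumulative_cases


# Hypothetical scenarios: 120 dry days versus 120 days of steady rain.
dry_traj, dry_cases = simulate_sirb([0.0] * 120, SIRBParams())
wet_traj, wet_cases = simulate_sirb([10.0] * 120, SIRBParams())
print(f"cumulative cases, dry: {dry_cases:.1f}")
print(f"cumulative cases, wet: {wet_cases:.1f}")
```

    With these illustrative values the outbreak fades without rain and grows with it, which mirrors the qualitative role the abstract assigns to rainfall.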

    Credal model averaging of logistic regression for modeling the distribution of marmot burrows

    Bayesian model averaging (BMA) weights the inferences produced by a set of competing models, using as weights the models' posterior probabilities. An open problem of BMA is how to set the prior probability of the models. Credal model averaging (CMA) is a credal ensemble of Bayesian models, which generalizes BMA by substituting the single prior over the models by a set of priors. The base models of the ensemble are learned in a Bayesian fashion. We use CMA to ensemble base classifiers which are Bayesian logistic regressors, characterized by different sets of covariates. CMA returns indeterminate classifications when the classification is prior-dependent, namely when the most probable class depends on the prior probability assigned to the different models. We apply CMA for modelling the presence and absence of marmot burrows in an Alpine valley in Italy and show that it compares favorably to BMA.
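
    The notion of a prior-dependent classification can be made concrete with a small sketch. One possible way to build the set of priors, assumed here for illustration only, is to give each covariate the same inclusion probability theta and let theta range over an interval. The averaged probability of presence is then computed on a grid of theta values, and a class is returned only if it wins for every value. The model list, the interval and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Each base model: (number of covariates, marginal likelihood, P(presence | x)).
Model = Tuple[int, float, float]


def posterior_presence(theta: float, models: Sequence[Model], n_candidates: int) -> float:
    """BMA probability of presence under inclusion probability theta."""
    num = den = 0.0
    for k, likelihood, p_presence in models:
        # Prior of a model with k of n_candidates covariates, times its likelihood.
        weight = theta ** k * (1.0 - theta) ** (n_candidates - k) * likelihood
        num += weight * p_presence
        den += weight
    return num / den


def cma_classify(models: Sequence[Model], n_candidates: int,
                 theta_low: float = 0.05, theta_high: float = 0.95,
                 grid: int = 181) -> Tuple[str, float, float]:
    """Return a class only if it is the most probable for every theta on the grid."""
    thetas: List[float] = [theta_low + (theta_high - theta_low) * t / (grid - 1)
                           for t in range(grid)]
    probs = [posterior_presence(theta, models, n_candidates) for theta in thetas]
    lower, upper = min(probs), max(probs)
    if lower > 0.5:
        label = "presence"
    elif upper < 0.5:
        label = "absence"
    else:
        label = "indeterminate"
    return label, lower, upper


# Hypothetical site where the base models disagree about the most probable class.
disputed_site = [(0, 0.010, 0.30), (1, 0.020, 0.45), (2, 0.015, 0.70)]
# Hypothetical site where every base model favours presence.
clear_site = [(0, 0.010, 0.60), (1, 0.020, 0.70), (2, 0.015, 0.90)]

disputed_result = cma_classify(disputed_site, n_candidates=2)
clear_result = cma_classify(clear_site, n_candidates=2)
print("disputed site:", disputed_result[0])
print("clear site:", clear_result[0])
```

    BMA with a single prior would always return one class for the first site; the credal version instead flags that the answer depends on the prior.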