55 research outputs found
Inferring slowly changing dynamic gene-regulatory networks
Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting the random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has focused on static networks, most time-course experiments are designed to tease out temporal changes in the underlying networks. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course genomic data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on an l1-penalized likelihood that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset.
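The penalized likelihood described in this abstract can be illustrated with a small sketch. The objective below is an assumption, not the paper's exact formulation: it takes a Gaussian log-likelihood per time point, an l1 penalty on off-diagonal precision entries, and an l1 penalty on changes between consecutive time points. The function name `penalized_loglik` and the tuning parameters `lam1` and `lam2` are hypothetical.

```python
import numpy as np


def penalized_loglik(thetas, covs, lam1, lam2):
    """Illustrative objective; thetas and covs are lists of (p, p) arrays, one per time point."""
    value = 0.0
    for t, (theta, s) in enumerate(zip(thetas, covs)):
        sign, logdet = np.linalg.slogdet(theta)
        if sign <= 0:
            return -np.inf  # precision matrices must be positive definite
        # Gaussian log-likelihood up to constants: log det(Theta) - tr(S Theta)
        value += logdet - np.trace(s @ theta)
        # l1 penalty on off-diagonal entries (conditional dependencies)
        off_diag = theta - np.diag(np.diag(theta))
        value -= lam1 * np.abs(off_diag).sum()
        # l1 penalty on changes between consecutive time points
        if t > 0:
            value -= lam2 * np.abs(theta - thetas[t - 1]).sum()
    return value


# Two time points, three genes, identity sample covariances (toy inputs).
identity = np.eye(3)
static = penalized_loglik([identity, identity], [identity, identity], lam1=0.5, lam2=0.5)

# Same data, but the second network gains one edge: both penalties now apply.
changed = np.eye(3)
changed[0, 1] = changed[1, 0] = 0.2
dynamic = penalized_loglik([identity, changed], [identity, identity], lam1=0.5, lam2=0.5)
```

In this toy setting the unchanged network scores higher than the one that gains an edge, which is how a penalty of this kind favours sparse and slowly changing networks. An actual estimator would maximize such an objective over the precision matrices with a convex solver.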
Sparse model-based network inference using Gaussian graphical models
We consider the problem of estimating a sparse dynamic Gaussian graphical model by L1-penalized maximum likelihood of a structured precision matrix. The structure can consist of specific time dynamics, known presence or absence of links in the graphical model, or equality constraints on the parameters. The model is defined on the basis of partial correlations, which results in a specific class of precision matrices. A priori, L1-penalized maximum likelihood estimation in this class is extremely difficult because of the above-mentioned constraints and the computational complexity of the L1 constraint on top of the usual positive-definite constraint. The implementation is non-trivial, but we show that the computation can be done effectively by taking advantage of an efficient maximum determinant algorithm (SDPT3) developed in convex optimization. For selecting the tuning parameter, we compare several selection criteria and argue that the traditional AIC and BIC should not be expected to work. We compare our method with related methods, such as glasso (Friedman et al. 2007).
Generalized information criterion for model selection in penalized graphical models
This paper introduces an estimator of the relative directed distance between an estimated model and the true model, based on the Kullback-Leibler divergence and motivated by the generalized information criterion proposed by Konishi and Kitagawa. This estimator can be used to select a model in penalized Gaussian copula graphical models. Direct use of this estimator is not feasible in high-dimensional cases; however, we derive an efficient way to compute it that is feasible for this class of problems. Moreover, the estimator is generally appropriate for several penalties, such as the lasso, adaptive lasso and smoothly clipped absolute deviation penalty. Simulations show that the method performs similarly to the KL oracle estimator and also improves on BIC performance in terms of support recovery of the graph. Specifically, we compare our method with the Akaike information criterion, the Bayesian information criterion and cross-validation for band, sparse and dense network structures.
Spatio-temporal spread pattern of Covid-19 in Italy
This paper investigates the spatio-temporal spread pattern of Covid-19 in Italy during the first wave of infections, from February to October 2020. Disease maps of the virus infections are provided using the Besag-York-Mollié model and some spatio-temporal extensions. This modelling framework, which includes a temporal component, makes it possible to study the time evolution of the spread pattern among the 107 Italian provinces. The focus is on the effect of citizens’ mobility patterns, represented here by the three distinct phases of the first wave in Italy identified by the Italian government, which include the lockdown period. Results show the effectiveness of the lockdown and an inhomogeneous spatial trend that characterises the virus spread during the first wave. Furthermore, the results suggest that the temporal evolution of cases in each province is independent of that in the other provinces, meaning that the contagions and temporal trend may be driven by province-specific aspects rather than by people’s spatial movements.
Inferring slowly-changing dynamic gene-regulatory networks
Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has focused on static networks, most time-course experiments are designed to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on an l1-penalized likelihood that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset.
A computationally fast alternative to cross-validation in penalized Gaussian graphical models
We study the problem of selecting the regularization parameter in penalized Gaussian graphical models. When the goal is to obtain a model with good predictive power, cross-validation is the gold standard. We present a new estimator of the Kullback-Leibler loss in Gaussian graphical models which provides a computationally fast alternative to cross-validation. The estimator is obtained by approximating leave-one-out cross-validation. Our approach is demonstrated on simulated data sets for various types of graphs. The proposed formula exhibits superior performance, especially in the typical small-sample-size scenario, compared to other available alternatives to cross-validation, such as Akaike’s information criterion and generalized approximate cross-validation. We also show that the estimator can be used to improve the performance of the BIC when the sample size is small.
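For orientation, the Kullback-Leibler loss that such an estimator targets has a closed form when both models are zero-mean Gaussians. The sketch below computes that exact loss from two precision matrices. It is not the paper's estimator, which has to approximate this quantity without knowing the true model. The function name `gaussian_kl` and the example matrices are illustrative assumptions.

```python
import numpy as np


def gaussian_kl(theta_true, theta_est):
    """KL(N(0, inv(theta_true)) || N(0, inv(theta_est))) for two precision matrices."""
    p = theta_true.shape[0]
    sigma_true = np.linalg.inv(theta_true)
    product = sigma_true @ theta_est
    # log-determinant via slogdet for numerical stability
    _, logdet = np.linalg.slogdet(product)
    return 0.5 * (np.trace(product) - logdet - p)


theta_true = np.array([[2.0, 0.5], [0.5, 1.0]])
kl_same = gaussian_kl(theta_true, theta_true)  # identical models: zero loss
# Dropping the off-diagonal entry (a missed edge) gives a strictly positive loss.
kl_diag = gaussian_kl(theta_true, np.diag(np.diag(theta_true)))
```

In a simulation study, where the true precision matrix is known, a loss of this form can be evaluated along the regularization path to define an oracle against which selection criteria are compared.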
Networks as mediating variables: a Bayesian latent space approach
The use of network analysis to investigate social structures has recently grown, owing to the high availability of data and the numerous insights it can provide into different fields. Most analyses focus on the topological characteristics of networks and the estimation of relationships between the nodes. We adopt a different perspective by considering the whole network as a random variable conveying the effect of an exposure on a response. This point of view represents a classical mediation setting, where the interest lies in estimating the indirect effect, that is, the effect propagated through the mediating variable. We introduce a latent space model that maps the network into a lower-dimensional space by considering the hidden positions of the units in the network. The coordinates of each node are used as mediators in the relationship between the exposure and the response. We further extend mediation analysis in the latent space framework by using Generalised Linear Models instead of the linear models previously used in the literature, adopting an approach based on derivatives to obtain the effects of interest. A Bayesian approach allows us to obtain the entire distribution of the indirect effect, which is generally unknown, and to compute the corresponding highest density interval, which gives accurate and interpretable bounds for the mediated effect. Finally, an application to social interactions among a group of adolescents and their attitude toward substance use is presented.
Chapter Determinants of spatial intensity of stop locations on cruise passengers tracking data
This paper analyses the spatial intensity of the distribution of stop locations of cruise passengers during their visit to the destination, through a stochastic point process modelling approach on a linear network. Data collected by integrating GPS tracking technology with a questionnaire-based survey of cruise passengers visiting the city of Palermo are used to identify the main determinants that characterise their stop location patterns. The spatial intensity of stop locations is estimated through a Gibbs point process model, taking into account individual-related variables, contextual-level information, and spatial interaction among stop points. The Berman-Turner device for maximum pseudolikelihood is used, with a quadrature scheme generated on the network. This approach makes it possible to account for the linear network determined by the street configuration of the destination under analysis. The results show an influence of both socio-demographic and trip-related characteristics on the stop location patterns, as well as the relevance of distance from the main attractions and of potential interactions among cruise passengers in the stop configuration. The proposed approach offers improvements both from the methodological perspective, related to the modelling of spatial point processes on a linear network, and from the applied perspective, given that better knowledge of the determinants of the spatial intensity of visitors’ stop locations in urban contexts may orient destination management policy.
Hydrological post-processing based on approximate Bayesian computation (ABC)
This study introduces a method to quantify the conditional predictive uncertainty in hydrological post-processing contexts when it is cumbersome to calculate the likelihood (intractable likelihood). Sometimes it can be difficult to calculate the likelihood itself in hydrological modelling, especially when working with complex models or with ungauged catchments. Therefore, we propose the ABC post-processor, which replaces the requirement of calculating the likelihood function with the use of some sufficient summary statistics and synthetic datasets. The aim is to show that the conditional predictive distribution produced by the exact predictive (MCMC post-processor) is qualitatively similar to that produced by the approximate predictive (ABC post-processor). We also use the MCMC post-processor as a benchmark to make results more comparable with the proposed method. We test the ABC post-processor in two scenarios: (1) the Aipe catchment (Colombia), with a tropical climate and a spatially lumped hydrological model, and (2) the Oria catchment (Spain), with an oceanic climate and a spatially distributed hydrological model. The main finding of the study is that the approximate (ABC post-processor) conditional predictive uncertainty is almost equivalent to the exact predictive (MCMC post-processor) in both scenarios.
This study was partially supported by the Departamento del Huila Scholarship Program No. 677 (Colombia) and Colciencias, and by the Spanish Research Projects TETIS-MED (ref. CGL2014-58127-C3-3-R) and TETIS-CHANGE (ref. RTI2018-093717-B-I00). G. Adelfio's research has also been supported by the national grant of the Italian Ministry of Education, University and Research (MIUR) for the PRIN-2015 program, "Complex space-time modelling and functional analysis for probabilistic forecast of seismic events". The authors also wish to thank the editor and the two anonymous reviewers for their thoughtful comments on the revision of the manuscript.
Romero-Cuellar, J.; Abbruzzo, A.; Adelfio, G.; Francés, F. (2019). Hydrological post-processing based on approximate Bayesian computation (ABC). Stochastic Environmental Research and Risk Assessment. 33(7):1361-1373. https://doi.org/10.1007/s00477-019-01694-y
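The idea of replacing the likelihood with summary statistics and synthetic datasets can be illustrated with a basic rejection sampler. The sketch below is a toy example, not the ABC post-processor of the paper: it assumes a Normal(mu, 1) model with a Uniform(-5, 5) prior and uses the sample mean as the summary statistic. The function name `abc_rejection`, the tolerance and the simulated data are all hypothetical.

```python
import random
import statistics


def abc_rejection(observed, n_draws=5000, tolerance=0.1, seed=0):
    """Rejection ABC for the mean of a Normal(mu, 1) model with a Uniform(-5, 5) prior."""
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)  # sample mean as summary statistic
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a candidate from the prior
        synthetic = [rng.gauss(mu, 1.0) for _ in range(len(observed))]
        # keep the candidate when the synthetic summary is close to the observed one
        if abs(statistics.fmean(synthetic) - obs_summary) <= tolerance:
            accepted.append(mu)
    return accepted


# Toy "observations" simulated with a true mean of 2.0 (an assumption for illustration).
data_rng = random.Random(42)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_sample = abc_rejection(observed)
posterior_mean = statistics.fmean(posterior_sample)
```

No likelihood is evaluated at any point: candidates are kept only when data simulated from them resemble the observations. A smaller tolerance gives a closer approximation to the exact posterior at the cost of fewer accepted draws.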
- …