599,745 research outputs found

    Model-based clustering for populations of networks

    Until recently, obtaining data on populations of networks was rare. However, with the advancement of automatic monitoring devices and the growing social and scientific interest in networks, such data have become more widely available. From sociological experiments involving cognitive social structures to fMRI scans revealing large-scale brain networks of groups of patients, there is a growing awareness that we urgently need tools to analyse populations of networks and, in particular, to model the variation between networks due to covariates. We propose a model-based clustering method based on mixtures of generalized linear (mixed) models that can be employed to describe the joint distribution of a population of networks in a parsimonious manner and to identify subpopulations of networks that share certain topological properties of interest (degree distribution, community structure, effect of covariates on the presence of an edge, etc.). Maximum likelihood estimation for the proposed model can be carried out efficiently with an implementation of the EM algorithm. We assess the performance of the method on simulated data and conclude with an example application on advice networks in a small business.
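    The EM-based mixture estimation described here can be illustrated with a deliberately simplified Python sketch: each network is vectorised into binary edge indicators and modelled as a k-component mixture of independent Bernoulli edge probabilities. This is a toy stand-in for the paper's richer generalized linear (mixed) models; the function name and every parameter choice below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def em_bernoulli_mixture(X, k, n_iter=100, seed=0):
    """EM for a k-component Bernoulli mixture over vectorised edge
    indicators. X has shape (n_networks, n_edges); each row is the
    upper triangle of one network's adjacency matrix."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                      # mixing weights
    theta = rng.uniform(0.25, 0.75, size=(k, d))  # per-edge probabilities
    for _ in range(n_iter):
        # E-step: posterior responsibilities, computed in log space
        log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
        log_r = np.log(pi) + log_lik
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and edge probabilities
        nk = r.sum(axis=0)
        pi = nk / n
        theta = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, theta, r
```

    Clustering by `r.argmax(axis=1)` then recovers subpopulations of networks that share an edge-probability profile, the simplest of the topological properties the abstract mentions.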

    Compositional data for global monitoring: the case of drinking water and sanitation

    Introduction: At the global level, access to safe drinking water and sanitation has been monitored by the Joint Monitoring Programme (JMP) of WHO and UNICEF. The methods employed are based on the analysis of data from household surveys and linear regression modelling of these results over time. However, there is evidence of non-linearity in the JMP data, and the compositional nature of these data is not taken into consideration. This article seeks to address these two shortcomings in order to produce more accurate estimates. Methods: We employed an isometric log-ratio (ilr) transformation designed for compositional data, and applied linear and non-linear time regressions to both the original and the transformed data. Specifically, different modelling alternatives for non-linear trajectories were analysed, all based on a generalized additive model (GAM). Results and discussion: Non-linear methods such as GAM may be used for modelling non-linear trajectories in the JMP data; this projection method is particularly suited to data-rich countries. Moreover, the ilr transformation of compositional data is conceptually sound and fairly simple to implement. It improves the performance of both linear and non-linear regression models, specifically in the presence of extreme data points, i.e. when coverage rates are near either 0% or 100%.
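    The ilr transformation has a standard closed form. Below is a minimal Python sketch using the common pivot (sequential binary partition) basis; this is not necessarily the exact basis the authors chose, and the three-part water-access example in the comment is an illustrative assumption, not JMP's actual categories.

```python
import numpy as np

def ilr(x):
    """Isometric log-ratio transform of a D-part composition x
    (positive parts summing to 1) into D-1 unconstrained coordinates,
    using the standard pivot (sequential binary partition) basis.
    Example composition: (piped water, other improved, unimproved)."""
    x = np.asarray(x, dtype=float)
    D = x.size
    z = np.empty(D - 1)
    for i in range(1, D):
        gm = np.exp(np.mean(np.log(x[:i])))  # geometric mean of first i parts
        z[i - 1] = np.sqrt(i / (i + 1.0)) * np.log(gm / x[i])
    return z
```

    The resulting coordinates are unconstrained real numbers, so linear or GAM time regressions can be fitted to them without the fitted trajectories escaping the 0-100% coverage range after back-transformation.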

    Do Stockholders Share Risk More Effectively Than Non-stockholders?

    This paper analyzes the extent of risk-sharing among stockholders and among non-stockholders. Wealthy households play a crucial role in many economic problems due to the substantial concentration of asset holdings in the U.S. data. Hence, to evaluate the empirical importance of market incompleteness, it is essential to determine whether idiosyncratic shocks matter for the wealthy, who have access to better insurance opportunities but also face different risks than the average household. We study a model where each period households decide whether to participate in the stock market by paying a fixed cost. Due to this endogenous entry decision, the testable implications of perfect risk-sharing take the form of a sample selection model, which we estimate and test using a semi-parametric GMM estimator proposed by Kyriazidou (2001). Using data from the PSID we strongly reject perfect risk-sharing among stockholders but, perhaps surprisingly, find no evidence against it among non-stockholders. These results appear to be robust to several extensions we considered. The findings indicate that market incompleteness may be more important for the wealthy, and suggest further focus on risk factors that primarily affect this group, such as entrepreneurial income risk. Keywords: perfect risk-sharing; incomplete markets; semiparametric estimation; Generalized Method of Moments; limited stock market participation.

    Estimating the Impact of the Health Insurance Program on the Number of Outpatient Visits in Indonesia

    Background and method: This research aimed to select the best method to predict the effect of a health insurance program on the number of outpatient visits in Indonesia. The analysis was applied to the second round of the Indonesian Family Life Survey data (IFLS2). Results: The author compares the estimation results of six econometric count-data models and selects the best alternative on the basis of several statistical tests. The results confirm that the Generalized Method of Moments (GMM) estimator is best for modelling the number of visits to public outpatient services, whilst the Hurdle Negative Binomial (HNB) model is superior for visits to private ones. The insured are shown to have a higher expected number of outpatient visits than the uninsured (p<1%). Supplier-induced demand was not detected among the insured, although this behaviour was likely to occur where provider competition is relatively high. Conclusions: This study concludes that estimates of health care demand given insurance depend on the empirical specification used in the analysis. Failing to control for the endogeneity of insurance lowers the parameter estimates. This study supports a national health insurance policy as an instrument to increase access to formal health care services. Keywords: health insurance; modelling; demand for health care services
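    The hurdle idea, separating the decision to seek care at all from the number of visits once care is sought, can be sketched with a small simulation. This uses a hurdle Poisson rather than the paper's Hurdle Negative Binomial, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
p_visit, lam = 0.4, 3.0     # hurdle: P(any visit), mean of the positive part

# Stage 1: does the individual visit at all?
visits_any = rng.random(n) < p_visit

# Stage 2: zero-truncated Poisson counts for those who do visit
y = np.zeros(n, dtype=int)
draws = rng.poisson(lam, size=visits_any.sum())
while (draws == 0).any():            # redraw zeros -> zero-truncated counts
    zeros = draws == 0
    draws[zeros] = rng.poisson(lam, size=zeros.sum())
y[visits_any] = draws

# Moment-based recovery of the two stages from the observed counts
p_hat = (y > 0).mean()               # estimates P(any visit)
cond_mean = y[y > 0].mean()          # estimates lam / (1 - exp(-lam))
```

    Because the zero and positive parts are generated by separate processes, a single-index Poisson or NB model is misspecified here, which is one reason model choice mattered so much in the study's comparison of six estimators.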

    ANALYSIS OF THE EFFECT OF INCOME, EDUCATION, INTEREST RATES, AND THE DEMOGRAPHIC AND GEOGRAPHIC PENETRATION OF BANKING ON SME CREDIT (A Case Study of Cities and Regencies in Central Java, 2011-2015)

    Financial inclusion has become a matter of international interest and a national priority. The more inclusive financial access is, the more opportunity people have to improve their economic situation, and financial inclusion has become an alternative way to grow the economy of a society. Credit is one of the financial services that can improve people's economic situation, particularly working-capital credit for SMEs. Besides the level of financial inclusion in a province being measured by how much access people have to credit, credit also serves as a tool to create new employment. This research examines the effects of factors such as people's income, education level, interest rates, and the demographic and geographic penetration of banking on SME credit. The sample was selected by the total sampling method: 35 cities and regencies in Central Java over 2011-2015. Data were analysed by panel data regression with a Fixed Effect Model and Generalized Least Squares. The regression model was checked with classical assumption tests, and the model specification was selected through the Chow and Hausman tests. This research shows that people's income has an insignificant positive effect on SME credit, while education level has a significant negative effect and the interest rate has an insignificant negative effect. On the other hand, the demographic penetration of banking has an insignificant positive effect and the geographic penetration of banking has a significant positive effect.
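    The Fixed Effect Model used above boils down to the within transformation: demean each variable by entity (city/regency) and run OLS on the demeaned data. A minimal sketch with a single regressor and hypothetical data; the study's GLS weighting and multiple regressors are omitted.

```python
import numpy as np

def fixed_effects_ols(y, x, entity):
    """Within (fixed-effect) estimator: demean y and x within each
    entity, then run OLS on the demeaned data. y, x, entity are 1-D
    arrays of equal length; returns the slope estimate."""
    y = np.asarray(y, float)
    x = np.asarray(x, float)
    yd, xd = y.copy(), x.copy()
    for e in np.unique(entity):
        m = entity == e
        yd[m] -= y[m].mean()   # removes the entity's fixed effect exactly
        xd[m] -= x[m].mean()
    return (xd @ yd) / (xd @ xd)
```

    Demeaning removes any time-invariant city/regency characteristic, which is why the fixed-effect specification is robust to unobserved regional heterogeneity in banking penetration.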

    Total Factor Productivity, Demographic Traits and ICT: Empirical Analysis for Asia

    This paper advances a model to explain total factor productivity in Asian countries, most of which are labor-surplus and endowed with substantial human capital. Such promising demographic potential is considered a complementary factor to the use of Information and Communication Technology (ICT): a population with favorable demographic traits and access to ICT yields higher total factor productivity (TFP). We call this the Demo-Tech-TFP model and test it using data for 2000-2010 on 24 Asian countries. Econometric concerns, such as the presence of endogenous and/or predetermined covariates and the small time-series and cross-sectional dimensions of the panel dataset, are tackled with the System Generalized Method of Moments (SYS-GMM). The results show considerable support for the Demo-Tech-TFP hypothesis. Models need to be designed that suit the local demography and the patterns of technological diffusion currently taking place in developing countries.

    Estimation of aggregate modal split models

    In spite of the fact that disaggregate modelling has undergone considerable development over the last twenty years, many studies are still based on aggregate modelling. In France, for example, aggregate models remain in much more common use than disaggregate models, even for modal split, so the estimation of aggregate models is still an important issue. For most French studies it is possible to use behavioural data from household surveys, which are conducted every ten years in most conurbations. These surveys provide data on the socioeconomic characteristics of individuals and the households to which they belong, and on the modal choice for all trips made the day before the survey. The sampling rate is generally 1% of the population, which gives about 50,000 trips for a conurbation of 1 million inhabitants. However, matrices with several hundred rows and columns are frequently used, so we have to fill several modal matrices containing more than 10,000 cells (in the case of a small matrix with only 100 rows) with fewer than 50,000 trips. The matrices will obviously contain a large number of empty cells, and the precision of almost all cells will be very low, so the model cannot be estimated at this level of zoning.
    The usual solution is to aggregate zones, which must satisfy two contradictory objectives:
    - the number of zones must be as small as possible, to increase the number of surveyed trips usable in estimation and hence the accuracy of the O-D matrices for trips on each mode;
    - the zones themselves must be as small as possible, to produce accurate values for explanatory variables such as the generalized cost of each transport mode.
    When zone size increases, it becomes harder to evaluate access and egress times for public transport, and there are several alternative routes with different travel times between each origin and destination zone, so more uncertainty is attached to the generalized cost that represents the quality of service between the two zones. The generally adopted solution is a weighted average of the generalized costs computed from the most disaggregated matrix, but there is no guarantee that this mean is accurate for the origin-destination pair in question. Even with the best compromise, some matrix cells remain empty or insufficiently precise. The common response is to keep only the cells in which the number of surveyed trips exceeds a certain threshold, which means rejecting part of the data. When a fairly large number of zones is used, the origin-destination pairs selected for estimation mainly involve trips within the centre of the conurbation or radial trips between the centre and the suburbs; these are also the pairs for which public transport's share is generally highest. The result is to reduce the variance of the data and therefore the quality of the estimation.
    To cope with this problem we propose a different aggregation process which retains all the trips and uses a more disaggregate zoning system. The principle is simple. We apply it to the model most commonly used for modal split, the logit model: with only two transport modes, the share of each mode follows directly from the difference in utility between the two modes through the logit function. We can therefore aggregate origin-destination pairs whose utility difference is very small, until each group contains enough surveyed trips to ensure sufficient data accuracy. This is justified by the fact that the data used to compute each mode's utility are generally as accurate, or even more accurate, at a more disaggregate level of zoning. The difficulty is that the utility function coefficients must be estimated at the same time as the logit model, so an iterative process is required. The steps are summarised below:
    - select initial values for the utility function coefficients of the two transport modes, for example from a previous study or from a calibration performed according to the classical method described in Section 1.2;
    - compute the utility of each mode from these coefficients, then the utility difference for each O-D pair, at the smallest zoning scale for which sufficiently accurate explanatory variables are available (with very limited zonal aggregation, or even none at all);
    - rank the O-D pairs by increasing utility difference;
    - aggregate the O-D pairs on the basis of closeness of utility difference: take the O-D pair with the smallest difference, then combine it with the next pair in order, continuing until the number of surveyed trips in the grouping exceeds a threshold chosen on the basis of the accuracy required for trip flow estimation; when the threshold is reached, start the next grouping, and so on until every O-D pair has been assigned to a group;
    - for each new class of O-D pairs, compute the values of the explanatory variables in the utility functions as weighted averages of the values for the O-D pairs in the class;
    - re-estimate the utility function coefficients.
    This process is repeated until the utility function coefficients converge. We tested the method on the Lyon conurbation with data from the most recent household travel survey, conducted in 1995/96, running a variety of tests to identify the best application of the method and to check the stability of the results. The method appears always to produce better results than the traditional approach based on zoning aggregation. The paper presents both the methodology and the results obtained with the different aggregation methods; in particular, we analyse how the choice of zoning system affects the estimation results.
    Keywords: aggregate modelling; modal choice; zoning system; urban mobility; conurbation (Lyon, France); estimation method
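    The grouping step above can be sketched in Python. This is a simplified illustration of the sort-and-threshold rule only, not the authors' code; the iterative re-estimation of the utility coefficients around it is omitted.

```python
import numpy as np

def group_od_pairs(util_diff, trips, threshold):
    """Aggregate O-D pairs, taken in order of increasing utility
    difference, into groups whose surveyed-trip totals reach
    `threshold` (the last group may fall short). Returns one group
    label per O-D pair, in the original O-D order."""
    util_diff = np.asarray(util_diff, float)
    trips = np.asarray(trips)
    order = np.argsort(util_diff)        # rank by utility difference
    labels = np.empty(util_diff.size, dtype=int)
    g, total = 0, 0
    for idx in order:
        labels[idx] = g
        total += trips[idx]
        if total >= threshold:           # close this group, start the next
            g, total = g + 1, 0
    return labels
```

    After grouping, the explanatory variables of each class would be computed as trip-weighted averages over its O-D pairs, and the binary logit re-estimated on the grouped shares until the coefficients converge.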

    Meta-analysis of generalized additive models in neuroimaging studies

    Analyzing data from multiple neuroimaging studies has great potential: it increases statistical power, enables detection of effects of smaller magnitude than would be possible when analyzing each study separately, and allows systematic investigation of between-study differences. Restrictions due to privacy or proprietary data, as well as more practical concerns, can make it hard to share neuroimaging datasets, so analyzing all data in a common location may be impractical or impossible. Meta-analytic methods provide a way to overcome this issue by combining aggregated quantities such as model parameters or risk ratios. Most meta-analytic tools focus on parametric statistical models, and methods for meta-analyzing semi-parametric models like generalized additive models have not been well developed. Parametric models are often not appropriate in neuroimaging, where, for instance, age-brain relationships may take forms that are difficult to describe accurately with such models. In this paper we introduce meta-GAM, a method for meta-analysis of generalized additive models which does not require individual participant data, and hence is suitable for increasing statistical power while upholding privacy and other regulatory concerns. We extend previous work by enabling the analysis of multiple model terms as well as multivariate smooth functions. In addition, we show how meta-analytic p-values can be computed for smooth terms. The proposed methods perform well in simulation experiments, and are demonstrated in a real data analysis of hippocampal volume and self-reported sleep quality data from the Lifebrain consortium. We argue that application of meta-GAM is especially beneficial in lifespan neuroscience and imaging genetics. The methods are implemented in an accompanying R package, metagam, which is also demonstrated.
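    The pointwise-combination idea behind this kind of approach, each study evaluates its fitted smooth on a shared grid and the estimates are combined by inverse-variance weighting, can be sketched as follows. This is a generic fixed-effect pointwise meta-analysis in Python, not the metagam R package's API, and the function name is an assumption.

```python
import numpy as np

def meta_smooth(estimates, ses):
    """Fixed-effect (inverse-variance weighted) pointwise meta-analysis
    of per-study smooth-function estimates evaluated on a common grid.
    estimates, ses: arrays of shape (n_studies, n_grid). Returns the
    combined estimate and its standard error at each grid point."""
    est = np.asarray(estimates, float)
    w = 1.0 / np.asarray(ses, float) ** 2        # inverse-variance weights
    combined = (w * est).sum(axis=0) / w.sum(axis=0)
    se = np.sqrt(1.0 / w.sum(axis=0))
    return combined, se
```

    Because only fitted values and standard errors on a grid cross study boundaries, no individual participant data need to be shared, which is the privacy property the abstract emphasises.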