286 research outputs found

    Clustering and Structural Robustness in Causal Diagrams

    Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical approach may become impractical, and the clarity of the representation is lost. Clustering of variables is a natural way to reduce the size of the causal diagram, but it may erroneously change the essential properties of the causal relations if implemented arbitrarily. We define a specific type of cluster, called a transit cluster, that is guaranteed to preserve the identifiability properties of causal effects under certain conditions. We provide a sound and complete algorithm for finding all transit clusters in a given graph and demonstrate how clustering can simplify the identification of causal effects. We also study the inverse problem, where one starts with a clustered graph and looks for extended graphs where the identifiability properties of causal effects remain unchanged. We show that this kind of structural robustness is closely related to transit clusters.

    Sequence Analysis and Related Approaches

    Life course data often consists of multiple parallel sequences, one for each life domain of interest. Multichannel sequence analysis has been used for computing pairwise dissimilarities and finding clusters in this type of multichannel (or multidimensional) sequence data. Describing and visualizing such data is, however, often challenging. We propose an approach for compressing, interpreting, and visualizing the information within multichannel sequences by finding (1) groups of similar trajectories and (2) similar phases within trajectories belonging to the same group. For these tasks we combine multichannel sequence analysis and hidden Markov modelling. We illustrate this approach with an empirical application to life course data, but the proposed approach can be useful in various longitudinal problems.

    Graphical model inference: Sequential Monte Carlo meets deterministic approximations

    Approximate inference in probabilistic graphical models (PGMs) can be grouped into deterministic methods and Monte-Carlo-based methods. The former can often provide accurate and rapid inferences, but are typically associated with biases that are hard to quantify. The latter enjoy asymptotic consistency, but can suffer from high computational costs. In this paper we present a way of bridging the gap between deterministic and stochastic inference. Specifically, we suggest an efficient sequential Monte Carlo (SMC) algorithm for PGMs which can leverage the output from deterministic inference methods. While generally applicable, we show explicitly how this can be done with loopy belief propagation, expectation propagation, and Laplace approximations. The resulting algorithm can be viewed as a post-correction of the biases associated with these methods and, indeed, numerical results show clear improvements over the baseline deterministic methods as well as over "plain" SMC.

    A Bayesian spatio-temporal analysis of markets during the Finnish 1860s famine

    We develop a Bayesian spatio-temporal model to study pre-industrial grain market integration during the Finnish famine of the 1860s. Our model takes into account several problematic features often present when analysing multiple spatially interdependent time series. For example, compared with the error correction methodology commonly applied in econometrics, our approach allows simultaneous modeling of multiple interdependent time series, avoiding the cumbersome statistical testing needed to predetermine the market leader as a point of reference. Furthermore, introducing a flexible spatio-temporal structure enables analysing detailed regional and temporal dynamics of the market mechanisms. Applying the proposed method, we detected spatially asymmetric "price ripples" that spread out from the shock origin. We corroborated the existing literature on the speedier adjustment to emerging price differentials during the famine, but we observed this principally in urban markets. This hastened return to long-run equilibrium means faster and longer travel of price shocks, implying prolonged out-of-equilibrium dynamics, proliferated influence of market shocks, and, importantly, a wider spread of famine conditions.

    Price Optimization Combining Conjoint Data and Purchase History: A Causal Modeling Approach

    Pricing decisions of companies require an understanding of the causal effect of a price change on demand. When real-life pricing experiments are infeasible, data-driven decision-making must be based on alternative data sources such as purchase history (sales data) and conjoint studies, where a group of customers is asked to make imaginary purchases in an artificial setup. We present an approach for price optimization that combines population statistics, purchase history and conjoint data in a systematic way. We build on recent advances in causal inference to identify and quantify the effect of price on the purchase probability at the customer level. The identification task is a transportability problem whose solution requires a parametric assumption on the differences between the conjoint study and real purchases. The causal effect is estimated using Bayesian methods that take into account the uncertainty of the data sources. The pricing decision is made by comparing the estimated posterior distributions of gross profit for different prices. The approach is demonstrated with simulated data resembling the features of real-world data.

    Can visualization alleviate dichotomous thinking? Effects of visual representations on the cliff effect

    Common reporting styles for statistical results in scientific articles, such as p-values and confidence intervals (CI), have been reported to be prone to dichotomous interpretations, especially with respect to the null hypothesis significance testing framework. For example, when the p-value is small enough or the CIs of the mean effects of a studied drug and a placebo are not overlapping, scientists tend to claim significant differences while often disregarding the magnitudes and absolute differences in the effect sizes. This type of reasoning has been shown to be potentially harmful to science. Techniques relying on the visual estimation of the strength of evidence have been recommended to reduce such dichotomous interpretations, but their effectiveness has also been challenged. We ran two experiments on researchers with expertise in statistical analysis to compare several alternative representations of confidence intervals and used Bayesian multilevel models to estimate the effects of the representation styles on differences in researchers' subjective confidence in the results. We also asked the respondents for their opinions and preferences regarding representation styles. Our results suggest that adding visual information to the classic CI representation can decrease the tendency towards dichotomous interpretations, measured as the cliff effect (the sudden drop in confidence around p-value 0.05), compared with classic CI visualization and textual representation of the CI with p-values. All data and analyses are publicly available at https://github.com/helske/statvis.

    Spatio-temporal modeling of co-dynamics of smallpox, measles and pertussis in pre-healthcare Finland

    Infections are known to interact, as previous infections may affect the risk of succumbing to a new infection. The co-dynamics can be mediated by immunosuppression or immunomodulation, shared environmental or climatic drivers, or competition for susceptible hosts. Research and statistical methods in epidemiology often concentrate on large pooled datasets, or high-quality data from cities, leaving rural areas underrepresented in the literature. Data on rural populations are typically sparse and scarce, especially in the case of historical data sources, which may introduce considerable methodological challenges. To overcome many of the obstacles posed by such data, we present a general Bayesian spatio-temporal model for disease co-dynamics. Applying the proposed model to historical (1820-1850) Finnish parish register data, we study the spread of infectious diseases in pre-healthcare Finland. We observe that measles, pertussis and smallpox exhibit positively correlated dynamics such that any new infection increased mortality in all three diseases, indicating possibly general immunosuppressive effects at the population level.

    Comparison of Attention Behaviour Across User Sets through Automatic Identification of Common Areas of Interest

    Eye tracking is used to analyze and compare user behaviour across diverse domains, but long-duration eye tracking experiments across multiple users generate millions of eye gaze samples, making the data analysis process complex. Usually the samples are labelled into Areas of Interest (AoI) or Objects of Interest (OoI), where the AoI approach aims to understand how a user monitors different regions of a scene, while OoI identification uncovers distinct objects in the scene that attract user attention. Using scalable clustering and cluster merging that is not constrained by input parameters, we label AoIs across multiple users in long-duration eye tracking experiments. Using the common AoI labels then allows direct comparison of the users, as well as the use of methods such as hidden Markov models and sequence mining to uncover interesting behaviour across the users, which until now has been prohibitively difficult to achieve.

    In the Same Storm but in Different Boats: COVID-19 and Inequality (Samassa myrskyssä mutta eri veneissä: COVID-19 ja eriarvoisuus)

    The coronavirus crisis is not only a health crisis but also an economic and social one, whose negative effects are, for the most part, greatest for those who are already in the weakest position. In this article we present an interdisciplinary review of the changes that occurred during the coronavirus pandemic and of its inequality-increasing effects on different groups of people across the whole life course and in different life domains. We examine in particular families with children and the services provided to them, learning and studying, working life and income, social services for adults, and research related to health and well-being. Globally, the growth in inequality is considerably greater than in Finland. Here, however, we focus specifically on inequality research that is of interest from Finland's perspective. In light of the research, we also consider what could be done in Finland so that the epidemic and the associated restrictions do not further deepen existing inequality, and how the harms can be reduced in the aftermath of the pandemic. The research shows that economic, health-related and social inequality has increased across life domains. Although the pandemic has also had positive effects, for example on the adoption of digital services, these positive effects have accumulated mainly among the well-off in society. Identifying vulnerable groups of people and their problems is key to minimizing the negative effects of the coronavirus crisis and to managing its aftermath.

    Meso-level contextual patterns of fathers' family leave uptake in Finland

    Family leave uptake by fathers represents one pathway to redress the typically unequal division of early childcare, which has been linked to various family outcomes, including the mother’s employment and children’s schooling. The generous Nordic leave systems are designed around this concept to encourage leave uptake, though the choice to take leave remains an individual one. There is a substantial body of literature on policy and individual-level drivers of leave uptake, but less so for meso-level factors, such as workplace and extended family, despite a possible key role in influencing individual family leave decisions. We used population register data from Finland to examine the demographics of fathers’ family leave uptake in 2007–2016. We found that uptake was highest amongst the employed, and that female-dominated industries and workplaces were associated with fathers taking longer periods of leave, particularly in the later years of the study. We also found possible indicators of role model effects, with long leaves of close family and of colleagues associated with longer leaves for new fathers. Our results suggest that meso-level contexts may be an important mediator in decisions surrounding uptake and length of fathers’ family leaves.