Clustering and Structural Robustness in Causal Diagrams
Graphs are commonly used to represent and visualize causal relations. For a
small number of variables, this approach provides a succinct and clear view of
the scenario at hand. As the number of variables under study increases, the
graphical approach may become impractical, and the clarity of the
representation is lost. Clustering of variables is a natural way to reduce the
size of the causal diagram, but it may erroneously change the essential
properties of the causal relations if implemented arbitrarily. We define a
specific type of cluster, called a transit cluster, that is guaranteed to
preserve the identifiability properties of causal effects under certain
conditions. We provide a sound and complete algorithm for finding all transit
clusters in a given graph and demonstrate how clustering can simplify the
identification of causal effects. We also study the inverse problem, where one
starts with a clustered graph and looks for extended graphs where the
identifiability properties of causal effects remain unchanged. We show that
this kind of structural robustness is closely related to transit clusters.
Sequence Analysis and Related Approaches
Life course data often consists of multiple parallel sequences, one for
each life domain of interest. Multichannel sequence analysis has been
used for computing pairwise dissimilarities and finding clusters in this
type of multichannel (or multidimensional) sequence data. Describing
and visualizing such data is, however, often challenging. We propose an
approach for compressing, interpreting, and visualizing the information
within multichannel sequences by finding (1) groups of similar
trajectories and (2) similar phases within trajectories belonging to the
same group. For these tasks we combine multichannel sequence analysis
and hidden Markov modelling. We illustrate this approach with an
empirical application to life course data, but the proposed approach can
be useful in various longitudinal problems.
Graphical model inference: Sequential Monte Carlo meets deterministic approximations
Approximate inference in probabilistic graphical models (PGMs) can be grouped
into deterministic methods and Monte-Carlo-based methods. The former can often
provide accurate and rapid inferences, but are typically associated with biases
that are hard to quantify. The latter enjoy asymptotic consistency, but can
suffer from high computational costs. In this paper we present a way of
bridging the gap between deterministic and stochastic inference. Specifically,
we suggest an efficient sequential Monte Carlo (SMC) algorithm for PGMs which
can leverage the output from deterministic inference methods. While generally
applicable, we show explicitly how this can be done with loopy belief
propagation, expectation propagation, and Laplace approximations. The resulting
algorithm can be viewed as a post-correction of the biases associated with
these methods and, indeed, numerical results show clear improvements over the
baseline deterministic methods as well as over "plain" SMC.
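The idea of post-correcting a deterministic approximation can be illustrated with a much simpler stand-in than the SMC algorithm described above: self-normalized importance sampling with a Laplace approximation as the proposal. The toy target `log_target`, the variance inflation factor `inflate`, and all function names below are assumptions made for illustration; none of them come from the paper.

```python
import math
import random

def log_target(x):
    # Unnormalized log density of a toy, skewed target (illustrative choice).
    return x - x**4 / 4.0

def laplace_fit():
    # The mode solves 1 - x^3 = 0, so x = 1; the curvature there is -3x^2 = -3,
    # giving a Gaussian approximation with mean 1 and variance 1/3.
    mode = 1.0
    sd = math.sqrt(1.0 / 3.0)
    return mode, sd

def corrected_mean(n_samples, inflate=2.0, seed=0):
    """Estimate the target mean by reweighting draws from the Laplace proposal.

    `inflate` widens the proposal (an assumption made here to keep the
    importance weights well behaved); it is not part of the Laplace fit.
    """
    rng = random.Random(seed)
    mode, sd = laplace_fit()
    prop_sd = inflate * sd
    xs, log_ws = [], []
    for _ in range(n_samples):
        x = rng.gauss(mode, prop_sd)
        # Proposal log density up to a constant; constants cancel on normalization.
        log_q = -0.5 * ((x - mode) / prop_sd) ** 2
        xs.append(x)
        log_ws.append(log_target(x) - log_q)
    # Subtract the maximum log weight before exponentiating for numerical stability.
    m = max(log_ws)
    ws = [math.exp(lw - m) for lw in log_ws]
    total = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total

laplace_mean, _ = laplace_fit()
is_mean = corrected_mean(20000)
```

Because the toy target is skewed to the left of its mode, the Laplace mean overstates the true mean, and the reweighted estimate `is_mean` moves toward the true value. This is only a single-step sketch of the bias-correction idea and does not reproduce the sequential structure of the method.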
A Bayesian spatio-temporal analysis of markets during the Finnish 1860s famine
We develop a Bayesian spatio-temporal model to study pre-industrial grain
market integration during the Finnish famine of the 1860s. Our model takes into
account several problematic features often present when analysing multiple
spatially interdependent time series. For example, compared with the error
correction methodology commonly applied in econometrics, our approach allows
simultaneous modeling of multiple interdependent time series, avoiding the
cumbersome statistical testing needed to predetermine the market leader as a
point of reference. Furthermore, introducing a flexible spatio-temporal
structure enables analysing detailed regional and temporal dynamics of the
market mechanisms. Applying the proposed method, we detected spatially
asymmetric "price ripples" that spread out from the shock origin. We
corroborated the existing literature on the speedier adjustment to emerging
price differentials during the famine, but we observed this principally in
urban markets. This hastened return to long-run equilibrium means faster and
longer travel of price shocks, implying prolonged out-of-equilibrium dynamics,
proliferated influence of market shocks, and, importantly, a wider spread of
famine conditions.
Price Optimization Combining Conjoint Data and Purchase History: A Causal Modeling Approach
Pricing decisions of companies require an understanding of the causal effect
of a price change on the demand. When real-life pricing experiments are
infeasible, data-driven decision-making must be based on alternative data
sources such as purchase history (sales data) and conjoint studies where a
group of customers is asked to make imaginary purchases in an artificial setup.
We present an approach for price optimization that combines population
statistics, purchase history and conjoint data in a systematic way. We build on
the recent advances in causal inference to identify and quantify the effect of
price on the purchase probability at the customer level. The identification
task is a transportability problem whose solution requires a parametric
assumption on the differences between the conjoint study and real purchases.
The causal effect is estimated using Bayesian methods that take into account
the uncertainty of the data sources. The pricing decision is made by comparing
the estimated posterior distributions of gross profit for different prices. The
approach is demonstrated with simulated data resembling the features of
real-world data.
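The final decision step, comparing posterior distributions of gross profit across candidate prices, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the profit formula (customers times purchase probability times margin), the function names, and the Beta draws standing in for a fitted model's posterior are all assumptions.

```python
import random
import statistics

def posterior_profit(price, unit_cost, n_customers, purchase_prob_draws):
    """Gross profit implied by each posterior draw of the purchase probability."""
    return [n_customers * p * (price - unit_cost) for p in purchase_prob_draws]

def choose_price(candidates, unit_cost, n_customers, draws_by_price):
    """Return the candidate with the highest posterior mean gross profit,
    plus the (mean, standard deviation) of profit for every candidate."""
    summary = {}
    for price in candidates:
        profits = posterior_profit(price, unit_cost, n_customers, draws_by_price[price])
        summary[price] = (statistics.mean(profits), statistics.stdev(profits))
    best = max(summary, key=lambda price: summary[price][0])
    return best, summary

# Hypothetical posterior draws of the purchase probability at each price.
rng = random.Random(0)
draws_by_price = {
    10.0: [rng.betavariate(60, 40) for _ in range(2000)],
    12.0: [rng.betavariate(45, 55) for _ in range(2000)],
    14.0: [rng.betavariate(20, 80) for _ in range(2000)],
}
best_price, summary = choose_price(
    [10.0, 12.0, 14.0], unit_cost=6.0, n_customers=1000, draws_by_price=draws_by_price
)
```

Keeping the full set of profit draws per price, rather than only the mean, lets the decision maker also compare the spread of the distributions, for example when a slightly lower expected profit comes with much less uncertainty.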
Can visualization alleviate dichotomous thinking? Effects of visual representations on the cliff effect
Common reporting styles for statistical results in scientific articles,
such as p-values and confidence intervals (CI), have been reported to
be prone to dichotomous interpretations, especially with respect to the
null hypothesis significance testing framework. For example, when the
p-value is small enough or the CIs of the mean effects of a studied drug
and a placebo are not overlapping, scientists tend to claim significant
differences while often disregarding the magnitudes and absolute
differences in the effect sizes. This type of reasoning has been shown
to be potentially harmful to science. Techniques relying on the visual
estimation of the strength of evidence have been recommended to reduce
such dichotomous interpretations but their effectiveness has also been
challenged. We ran two experiments on researchers with expertise in
statistical analysis to compare several alternative representations of
confidence intervals and used Bayesian multilevel models to estimate the
effects of the representation styles on differences in researchers'
subjective confidence in the results. We also asked the respondents for
their opinions and preferences regarding representation styles. Our
results suggest that adding visual information to the classic CI
representation can decrease the tendency towards dichotomous
interpretations, measured as the cliff effect (the sudden drop in
confidence around a p-value of 0.05), compared with the classic CI
visualization and a textual representation of the CI with p-values. All
data and analyses are publicly available at
https://github.com/helske/statvis.
Spatio-temporal modeling of co-dynamics of smallpox, measles and pertussis in pre-healthcare Finland
Infections are known to interact, as previous infections may affect the
risk of succumbing to a new infection. The co-dynamics can be mediated by
immunosuppression or -modulation, shared environmental or climatic drivers, or
competition for susceptible hosts. Research and statistical methods in
epidemiology often concentrate on large pooled datasets or high-quality data
from cities, leaving rural areas underrepresented in the literature. Data
on rural populations are typically sparse and scarce, especially in
the case of historical data sources, which may introduce considerable
methodological challenges. In order to overcome many obstacles due to such
data, we present a general Bayesian spatio-temporal model for disease
co-dynamics. Applying the proposed model to historical (1820-1850) Finnish
parish register data, we study the spread of infectious diseases in
pre-healthcare Finland. We observe that measles, pertussis and smallpox exhibit
positively correlated dynamics such that any new infection increased mortality
in all three diseases, possibly indicating general immunosuppressive effects at
the population level.
Comparison of Attention Behaviour Across User Sets through Automatic Identification of Common Areas of Interest
Eye tracking is used to analyze and compare user behaviour across diverse domains, but long-duration eye tracking experiments across multiple users generate millions of eye gaze samples, making the data analysis process complex. Usually the samples are labelled into Areas of Interest (AoI) or Objects of Interest (OoI), where the AoI approach aims to understand how a user monitors different regions of a scene, while OoI identification uncovers distinct objects in the scene that attract user attention. Using scalable clustering and cluster merging that is not constrained by input parameters, we label AoIs across multiple users in long-duration eye tracking experiments. The common AoI labels then allow direct comparison of the users, as well as the use of methods such as hidden Markov models and sequence mining to uncover interesting behaviour across the users, which until now has been prohibitively difficult to achieve.
In the same storm but in different boats: COVID-19 and inequality
The coronavirus crisis is not only a health crisis but also an economic and social one, and its negative effects are for the most part greatest for those who were already in the weakest position. In this article we present an interdisciplinary review of the changes that took place during the coronavirus pandemic and of its inequality-increasing effects on different groups of people across the whole life course and in different life domains. We examine in particular research on families with children and services for such families, learning and studying, working life and income, social services for adults, and health and well-being. Globally, the growth in inequality is considerably greater than in Finland. Here, however, we focus on inequality research that is of particular interest from the Finnish perspective. In light of this research, we also consider what could be done in Finland so that the epidemic and the related restrictions do not further deepen existing inequality, and how the harms can be reduced in the aftermath of the pandemic. The research shows that economic, health-related, and social inequality has increased in different life domains. Although the coronavirus pandemic has also had positive effects, for example on the adoption of digital services, these positive effects have accumulated mainly among the well-off in society. Identifying vulnerable groups of people and their problems is key to minimizing the negative effects of the coronavirus crisis and to managing its aftermath.
Meso-level contextual patterns of fathers' family leave uptake in Finland
Family leave uptake by fathers represents one pathway to redress the typically unequal division of early childcare, which has been linked to various family outcomes, including the mother's employment and children's schooling. The generous Nordic leave systems are designed around this concept to encourage leave uptake, though the choice to take leave remains an individual one. There is a substantial body of literature on policy and individual-level drivers of leave uptake, but less so for meso-level factors, such as workplace and extended family, despite a possible key role in influencing individual family leave decisions. We used population register data from Finland to examine the demographics of fathers' family leave uptake in 2007–2016. We found that uptake was highest amongst the employed, and that female-dominated industries and workplaces were associated with fathers taking longer periods of leave, particularly in the later years of the study. We also found possible indicators of role model effects, with long leaves of close family and of colleagues associated with longer leaves for new fathers. Our results suggest that meso-level contexts may be an important mediator in decisions surrounding uptake and length of fathers' family leaves.