Income, education, and other poverty-related variables: a journey through Bayesian hierarchical models
A one-size-fits-all policy cannot handle poverty issues well, since each area
has its own unique challenges, while a custom-made policy for each area is
unrealistic because resources are limited and because it ignores dependencies
between the characteristics of different areas. In this work, we propose using
Bayesian hierarchical models, which can potentially explain data on income and
other poverty-related variables within the multi-resolution governing
structure of Thailand. We describe how we designed each model, from simple to
more complex ones, estimated their performance in terms of variable
explanation and complexity, identified the models' drawbacks, and proposed
solutions to those issues through the lens of Bayesian hierarchical models in
order to gain insight from the data.
We found that Bayesian hierarchical models performed better than both
complete-pooling (single policy) and no-pooling (custom-made policy) models.
Additionally, adding the years-of-education variable improved the hierarchical
model's performance in variable explanation. We found that a higher education
level significantly increases household income in all regions of Thailand. The
impact of region on household income almost vanishes when education level or
years of education is considered. Therefore, education might play a mediating
role between region and income. Our work can serve as a guideline for other
countries that require a Bayesian hierarchical approach to model their
variables and gain insight from data.
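The contrast between complete pooling, no pooling, and hierarchical (partial) pooling can be illustrated with a minimal sketch. It is not the authors' model: it assumes a simple normal-normal setup with known within-area and between-area standard deviations (`sigma` and `tau`, both hypothetical) and uses the overall mean as a plug-in prior mean, whereas a full Bayesian hierarchical model would infer these quantities from the data. The function name `pooled_estimates` and the toy numbers are illustrative only.

```python
import numpy as np

def pooled_estimates(groups, sigma, tau):
    """Return (complete, no_pooling, partial) estimates of per-area means.

    groups: list of 1-D arrays, one array of observations per area.
    sigma:  assumed within-area standard deviation (hypothetical, fixed).
    tau:    assumed between-area standard deviation (hypothetical, fixed).
    """
    all_obs = np.concatenate(groups)
    complete = all_obs.mean()                        # one estimate for every area
    no_pool = np.array([g.mean() for g in groups])   # each area on its own
    n = np.array([len(g) for g in groups])

    # Normal-normal posterior mean: a precision-weighted average of the
    # area's own mean and the overall mean (used here as a plug-in prior mean).
    weight = (n / sigma**2) / (n / sigma**2 + 1.0 / tau**2)
    partial = weight * no_pool + (1.0 - weight) * complete
    return complete, no_pool, partial

# Toy incomes for three areas with different sample sizes.
areas = [np.array([10.0, 12.0, 11.0, 13.0]),
         np.array([20.0]),
         np.array([15.0, 17.0])]
complete, no_pool, partial = pooled_estimates(areas, sigma=2.0, tau=2.0)
print(complete, no_pool, partial)
```

Areas with few observations are pulled more strongly toward the overall mean, which is the compromise between a single policy and fully separate policies that the abstract describes.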
Framework for inferring empirical causal graphs from binary data to support multidimensional poverty analysis
Poverty is one of the fundamental issues that mankind faces. To solve poverty
issues, one needs to know how severe they are. The Multidimensional Poverty
Index (MPI) is a well-known approach for measuring the degree of poverty in a
given area. Computing the MPI requires information on MPI indicators, which are
binary variables collected by surveys that represent different aspects of
poverty, such as a lack of education, health, or adequate living conditions.
The impact of MPI indicators on the MPI can be inferred with traditional
regression methods. However, it is not obvious whether resolving one MPI
indicator might resolve or cause more issues in other MPI indicators, and there
is no framework dedicated to inferring empirical causal relations among MPI
indicators.
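To make concrete how binary indicators feed into an MPI value, the following minimal sketch follows the commonly used headcount-times-intensity construction: a household is counted as poor when its weighted deprivation score reaches a cutoff, and the index is the share of poor households multiplied by their average score. The equal weights, the cutoff of 1/3, and the toy survey matrix are assumptions for illustration; the abstract does not state which weights or cutoff the Thailand survey uses.

```python
import numpy as np

def mpi(deprivations, weights, cutoff):
    """Return (headcount, intensity, index) from binary deprivation indicators.

    deprivations: 2-D array, rows are households, columns are indicators (0/1).
    weights:      1-D array of indicator weights that sum to 1 (assumed).
    cutoff:       poverty cutoff on the weighted deprivation score (assumed).
    """
    d = np.asarray(deprivations, dtype=float)
    w = np.asarray(weights, dtype=float)
    scores = d @ w                       # weighted deprivation score per household
    poor = scores >= cutoff              # households identified as poor
    headcount = poor.mean()              # share of poor households
    intensity = scores[poor].mean() if poor.any() else 0.0
    return headcount, intensity, headcount * intensity

# Four households, four equally weighted indicators (toy data).
survey = [[1, 1, 0, 0],
          [0, 0, 0, 0],
          [1, 1, 1, 0],
          [1, 0, 0, 0]]
print(mpi(survey, weights=[0.25] * 4, cutoff=1 / 3))
```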
In this work, we propose a framework to infer causal relations among binary
variables in poverty surveys. Our approach performed better than baseline
methods on simulated datasets with known ground truth, and it correctly found
a causal relation in the Twin births dataset. In a Thailand poverty survey
dataset, the framework found a causal relation between smoking and alcohol
drinking issues. We provide the R CRAN package 'BiCausality', which can be used
with any binary variables beyond the poverty analysis context.
Comment: The accepted version by Heliyon. The latest version of the R package
can be found at https://github.com/DarkEyes/BiCausalit
Variable-lag Granger Causality for Time Series Analysis
Granger causality is a fundamental technique for causal inference in time
series data, commonly used in the social and biological sciences. Typical
operationalizations of Granger causality make a strong assumption that every
time point of the effect time series is influenced by a combination of other
time series with a fixed time delay. However, the assumption of the fixed time
delay does not hold in many applications, such as collective behavior,
financial markets, and many natural phenomena. To address this issue, we
develop variable-lag Granger causality, a generalization of Granger causality
that relaxes the assumption of the fixed time delay and allows causes to
influence effects with arbitrary time delays. In addition, we propose a method
for inferring variable-lag Granger causality relations. We demonstrate our
approach on an application for studying coordinated collective behavior and
show that it performs better than several existing methods in both simulated
and real-world datasets. Our approach can be applied in any domain of time
series analysis.
Comment: This paper will appear in the proceedings of the 2019 IEEE
International Conference on Data Science and Advanced Analytics (DSAA). The R
package is available at https://github.com/DarkEyes/VLTimeSeriesCausalit
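For reference, the classical fixed-lag test that the abstract generalizes can be sketched as a comparison of two least-squares autoregressions: one predicting y from its own past, and one that also includes the past of x. This is a minimal sketch of the standard fixed-lag formulation only, not the variable-lag method proposed in the paper; the function names and the simulated series are illustrative assumptions.

```python
import numpy as np

def lagged_matrix(series, max_lag):
    """Columns hold the series delayed by 1..max_lag, aligned with series[max_lag:]."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )

def granger_f_statistic(x, y, max_lag):
    """F statistic for 'x Granger-causes y' under a fixed maximum lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged_matrix(y, max_lag)])        # past of y only
    unrestricted = np.hstack([restricted, lagged_matrix(x, max_lag)])  # plus past of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / max_lag) / (rss_u / dof)

# Simulated example: y copies x with a fixed delay of 2 steps plus small noise.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
y[2:] = 0.8 * x[:-2]
y += 0.1 * rng.normal(size=300)
print(granger_f_statistic(x, y, max_lag=3), granger_f_statistic(y, x, max_lag=3))
```

A large F statistic in one direction and a small one in the other suggests that x helps predict y but not the reverse. The test relies on a single fixed set of lags for the whole series, which is the assumption that variable-lag Granger causality relaxes.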
mFLICA: An R package for Inferring Leadership of Coordination From Time Series
Leadership is a process in which leaders influence followers to achieve
collective goals. One special case of leadership is the initiation of
coordinated patterns. In this context, leaders are initiators who start
coordinated patterns that everyone follows. Given a set of individuals'
multivariate time series of real numbers, the mFLICA package provides a
framework for R users to infer coordination events within time series, the
initiators and followers of these coordination events, and the dynamics of
group merging and splitting. The mFLICA package also has a visualization
function to make the results of leadership inference more understandable. The
package is available on the Comprehensive R Archive Network (CRAN) at
https://CRAN.R-project.org/package=mFLICA.
Comment: The latest version of the R package can be found at
https://github.com/DarkEyes/mFLIC
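One simple way to make the notion of a follower concrete is a lagged correlation: if one individual's series best matches another's after a positive delay, the first plausibly follows the second. The sketch below is not the mFLICA algorithm or its API, which the abstract does not describe; the function name `best_follow_lag` and the toy signals are hypothetical and only illustrate the idea on univariate series.

```python
import numpy as np

def best_follow_lag(leader, follower, max_lag):
    """Return (lag, corr) maximizing corr(leader[t], follower[t + lag]) over 0..max_lag."""
    leader = np.asarray(leader, dtype=float)
    follower = np.asarray(follower, dtype=float)
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        a = leader[: len(leader) - lag]   # leader values at time t
        b = follower[lag:]                # follower values at time t + lag
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr

# Toy signals: the follower repeats the leader's movement 5 steps later.
base = np.sin(np.linspace(0.0, 20.0, 205))
leader, follower = base[5:], base[:-5]
print(best_follow_lag(leader, follower, max_lag=10))
```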
Inference of Leadership of Coordinated Activity in Time Series
When a group of people decides to move somewhere together, who is the initiator who starts moving and everyone follows? Do the group members follow friends around them or do they prefer to follow specific individuals? These questions are about leadership. Leadership plays a key role in social animals', including humans', decision-making and coalescence in coordinated activities such as hunting, migration, sport, diplomatic negotiation, etc. In these coordinated activities, leadership is a process which organizes interactions among members to make a group achieve collective goals. Understanding initiation of coordinated activities allows scientists to gain more insight into social species' behaviors. However, by using only the data on time series of activities, inferring leadership, as manifested by the initiation of coordinated activities, faces many challenging issues. First, there is no fundamental concept to describe these activities computationally. Second, coordinated activities are dynamic. Third, several different coordinated activities may occur simultaneously among subgroups. To fill these remaining gaps in leadership inference, we formalize several computational leadership problems and propose methodologies to solve them.
We evaluate and demonstrate the performance of the proposed frameworks on both simulated and real-world datasets, such as baboon trajectories, time series of fish movement, and time series of stock market closing prices. The frameworks perform better than non-trivial baselines on both simulated and real-world datasets. Our problem formalization and frameworks enable scientists to analyze coordinated activities and generate scientific hypotheses about collective behaviors that can be tested statistically and in the field.