13 research outputs found

    Income, education, and other poverty-related variables: a journey through Bayesian hierarchical models

    A one-size-fits-all policy cannot handle poverty issues well, since each area has its own challenges, while a custom-made policy for each area is unrealistic given limited resources and ignores the dependencies between the characteristics of different areas. In this work, we propose using Bayesian hierarchical models, which can potentially explain data on income and other poverty-related variables in the multi-resolution governing-structure data of Thailand. We describe how we designed each model, from simple to more complex, estimate their performance in terms of variable explanation and complexity, discuss the models' drawbacks, and propose solutions to these issues through the lens of Bayesian hierarchical models in order to gain insight from data. We found that Bayesian hierarchical models performed better than both complete-pooling (single policy) and no-pooling (custom-made policy) models. Additionally, adding the years-of-education variable improves the hierarchical model's performance in variable explanation. We found that a higher education level significantly increases household income in all regions of Thailand. The effect of region on household income almost vanishes when education level or years of education is considered. Education might therefore play a mediating role between region and income. Our work can serve as a guideline for other countries that need a Bayesian hierarchical approach to model their variables and gain insight from data.

    Framework for inferring empirical causal graphs from binary data to support multidimensional poverty analysis

    Poverty is one of the fundamental issues that mankind faces. To solve poverty issues, one needs to know how severe they are. The Multidimensional Poverty Index (MPI) is a well-known approach for measuring the degree of poverty in a given area. Computing the MPI requires information on MPI indicators, which are binary variables collected by surveys that represent different aspects of poverty, such as lack of education, health, and living conditions. The impact of MPI indicators on the MPI can be inferred using traditional regression methods. However, it is not obvious whether resolving one MPI indicator might resolve or cause issues in other MPI indicators, and there is no framework dedicated to inferring empirical causal relations among MPI indicators. In this work, we propose a framework to infer causal relations among binary variables in poverty surveys. Our approach performed better than baseline methods on simulated datasets with known ground truth, and it correctly found a causal relation in the Twin births dataset. In the Thailand poverty survey dataset, the framework found a causal relation between smoking and alcohol drinking issues. We provide the R CRAN package 'BiCausality', which can be used with any binary variables beyond the poverty analysis context. Comment: The version accepted by Heliyon. The latest version of the R package can be found at https://github.com/DarkEyes/BiCausalit

    Variable-lag Granger Causality for Time Series Analysis

    Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make the strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the fixed-delay assumption does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the fixed-delay assumption and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods on both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis. Comment: This paper will appear in the proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA). The R package is available at https://github.com/DarkEyes/VLTimeSeriesCausalit
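
The fixed-delay assumption mentioned in the abstract above can be illustrated with a minimal sketch of the classical fixed-lag Granger test. This is not the variable-lag method of the paper and not the API of its R package; it only shows the baseline idea of comparing a regression of the effect on its own lags against one that also includes lags of the candidate cause. The function names `lagged_matrix` and `granger_f_stat`, the simulated series, and the choice of `max_lag = 3` are all assumptions made for illustration.

```python
import numpy as np


def lagged_matrix(series, max_lag):
    """Return a matrix whose row for time t holds series[t-1], ..., series[t-max_lag]."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )


def granger_f_stat(cause, effect, max_lag):
    """F-statistic of the classical fixed-lag Granger test (hypothetical helper).

    Restricted model: effect regressed on its own lags.
    Unrestricted model: effect regressed on its own lags plus lags of `cause`.
    """
    target = effect[max_lag:]
    intercept = np.ones((len(target), 1))
    restricted = np.hstack([intercept, lagged_matrix(effect, max_lag)])
    unrestricted = np.hstack([restricted, lagged_matrix(cause, max_lag)])

    def rss(design):
        # Ordinary least squares fit, then residual sum of squares.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / max_lag) / (rss_u / dof)


# Simulated example (assumed data): y follows x with a fixed delay of 2 steps.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
y[2:] = 0.8 * x[:-2]
y += 0.1 * rng.normal(size=500)

f_xy = granger_f_stat(x, y, max_lag=3)  # does x help predict y?
f_yx = granger_f_stat(y, x, max_lag=3)  # does y help predict x?
```

In this simulated setting the statistic for x as a cause of y is far larger than the one for the reverse direction, because the delay is the same at every time point. When the delay varies over time, a fixed-lag regression of this kind can miss the relation, which is the gap the abstract describes.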

    mFLICA: An R package for Inferring Leadership of Coordination From Time Series

    Leadership is a process in which leaders influence followers to achieve collective goals. One special case of leadership is coordinated pattern initiation. In this context, leaders are initiators who start coordinated patterns that everyone follows. Given a set of individual multivariate time series of real numbers, the mFLICA package provides a framework for R users to infer coordination events within the time series, the initiators and followers of these coordination events, and the dynamics of group merging and splitting. The mFLICA package also has a visualization function that makes the results of leadership inference easier to understand. The package is available on the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=mFLICA. Comment: The latest version of the R package can be found at https://github.com/DarkEyes/mFLIC

    Inference of Leadership of Coordinated Activity in Time Series

    When a group of people decides to move somewhere together, who is the initiator who starts moving and whom everyone follows? Do the group members follow friends around them, or do they prefer to follow specific individuals? These questions are about leadership. Leadership plays a key role in the decision-making and coalescence of social animals, including humans, in coordinated activities such as hunting, migration, sport, and diplomatic negotiation. In these coordinated activities, leadership is a process that organizes interactions among members so that a group achieves collective goals. Understanding the initiation of coordinated activities allows scientists to gain more insight into social species' behaviors. However, inferring leadership, as manifested by the initiation of coordinated activities, from time series of activities alone faces several challenges. First, there is no fundamental concept to describe these activities computationally. Second, coordinated activities are dynamic. Third, several different coordinated activities may occur simultaneously among subgroups. To fill these gaps in leadership inference, we formalize several computational leadership problems and propose methodologies to solve them. We evaluate and demonstrate the performance of the proposed frameworks on both simulated and real-world datasets, such as baboon trajectories, time series of fish movement, and time series of stock market closing prices. The frameworks perform better than non-trivial baselines on both simulated and real-world datasets. Our problem formalization and frameworks enable scientists to analyze coordinated activities and generate scientific hypotheses about collective behaviors that can be tested statistically and in the field.