53 research outputs found

    openPDS: Protecting the Privacy of Metadata through SafeAnswers

    The rise of smartphones and web services has made possible the large-scale collection of personal metadata. Information about individuals' locations, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual.

    Improving official statistics in emerging markets using machine learning and mobile phone data

    Mobile phones are one of the fastest growing technologies in the developing world, with global penetration rates reaching 90%. Mobile phone data, also called call detail records (CDR), are generated every time phones are used and are recorded by carriers at scale. CDR have generated groundbreaking insights in public health, official statistics, and logistics. However, the fact that most phones in developing countries are prepaid means that the data lack key information about the user, including gender and other demographic variables. This precludes numerous uses of these data in social science and development economics research. It furthermore severely hinders the development of humanitarian applications, such as the use of mobile phone data to target aid towards the most vulnerable groups during crises. We developed a framework to extract more than 1400 features from standard mobile phone data and used them to predict useful individual characteristics and group estimates. We here present a systematic cross-country study of the applicability of machine learning for dataset augmentation at low cost. We validate our framework by showing how it can be used to reliably predict gender and other information for more than half a million people in two countries. We show that standard machine learning algorithms trained on only 10,000 users are sufficient to predict an individual's gender, using only metadata, with an accuracy ranging from 74.3 to 88.4% in a developed country and from 74.5 to 79.7% in a developing country. This is significantly higher than previous approaches and, once calibrated, gives highly accurate estimates of gender balance in groups. Performance suffers only marginally if we reduce the training size to 5,000, but decreases significantly with smaller training sets. We finally show, using factor analysis, that our indicators capture a large range of behavioral traits, and that the framework can be used to predict other indicators of vulnerability such as age or socio-economic status.
Mobile phone data have great potential for good, and our framework allows these data to be augmented with vulnerability and other information at a fraction of the cost.
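
The abstract above mentions that calibrated predictions give accurate estimates of gender balance in groups, but it does not say how the calibration is done. As one possible illustration only, the sketch below uses the standard "adjusted count" correction, which rescales the share of positive predictions using the classifier's true and false positive rates. The function name, the parameters, and the choice of this technique are assumptions, not the authors' method.

```python
def adjusted_group_share(predictions, tpr, fpr):
    """Estimate the true share of positives in a group (hypothetical helper).

    predictions: iterable of 0/1 predicted labels for the group's members
    tpr, fpr:    true and false positive rates, assumed to have been
                 measured on held-out labeled users
    """
    predictions = list(predictions)
    if not predictions:
        raise ValueError("group is empty")
    if tpr <= fpr:
        raise ValueError("classifier must satisfy tpr > fpr")

    # Raw share of members predicted positive (biased by classifier errors).
    observed = sum(predictions) / len(predictions)

    # Invert: observed = share * tpr + (1 - share) * fpr
    share = (observed - fpr) / (tpr - fpr)

    # Clip to a valid proportion; sampling noise can push it outside [0, 1].
    return min(1.0, max(0.0, share))
```

For example, if 62 of 100 group members are predicted positive by a classifier with a true positive rate of 0.8 and a false positive rate of 0.2, the corrected share is (0.62 - 0.2) / (0.8 - 0.2) = 0.7, rather than the raw 0.62.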

    Change in BMI Accurately Predicted by Social Exposure to Acquaintances

    Research has mostly focused on obesity and not on processes of BMI change more generally, although these may be key factors that lead to obesity. Studies have suggested that obesity is affected by social ties. However, these studies used survey-based data collection techniques that may be biased toward selecting only close friends and relatives. In this study, mobile phone sensing techniques were used to routinely capture social interaction data in an undergraduate dorm. By automating the capture of social interaction data, the limitations of self-reported social exposure data are avoided. This study attempts to understand and develop a model that best describes the change in BMI using social interaction data. We evaluated a cohort of 42 college students in a co-located university dorm, using social interaction data captured automatically via mobile phones together with survey-based health-related information. We determined the most predictive variables for change in BMI using the least absolute shrinkage and selection operator (LASSO) method. The selected variables, together with gender, healthy diet category, and ability to manage stress, were used to build multiple linear regression models that estimate the effect of exposure and individual factors on change in BMI. We identified the best model using the Akaike Information Criterion (AIC) and R². This study found a model that explains 68% (p<0.0001) of the variation in change in BMI. The model combined social interaction data, especially from acquaintances, and personal health-related information to explain change in BMI. This is the first study to take into account both different levels of social interaction and personal health-related information. Social interactions with acquaintances accounted for more than half the variation in change in BMI.
This suggests the importance of not only individual health information but also social interactions with the people we are exposed to, even people we may not consider close friends.
MIT Masdar Program; MIT Media Lab Consortium
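
The model selection step described above (choosing among linear regression models by AIC and R²) can be illustrated with a minimal sketch. It compares an intercept-only model with a one-predictor model using the Gaussian least-squares form AIC = n ln(RSS/n) + 2k, where k counts the fitted coefficients. The data values, variable names, and helper functions are invented for illustration and are not taken from the study, which used multiple predictors selected by LASSO.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_sum_of_squares(y, fitted):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def aic(rss, n, k):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k

# Synthetic toy data (NOT from the study): exposure score vs. change in BMI.
exposure = [0, 1, 2, 3, 4, 5]
bmi_change = [0.1, 0.4, 0.5, 0.9, 1.0, 1.3]
n = len(bmi_change)

# Model A: intercept only (predict the mean for everyone).
mean_change = sum(bmi_change) / n
rss_null = residual_sum_of_squares(bmi_change, [mean_change] * n)

# Model B: intercept plus exposure.
intercept, slope = fit_line(exposure, bmi_change)
rss_line = residual_sum_of_squares(
    bmi_change, [intercept + slope * xi for xi in exposure]
)

aic_null = aic(rss_null, n, k=1)
aic_line = aic(rss_line, n, k=2)
r_squared = 1 - rss_line / rss_null  # share of variation explained by Model B

# The model with the lower AIC is preferred.
best = "exposure model" if aic_line < aic_null else "intercept-only model"
```

The extra coefficient in Model B costs 2 AIC points, so it is preferred only when it reduces the residual error enough to offset that penalty.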

    bandicoot: an open-source Python toolbox to analyze mobile phone metadata

    bandicoot is an open-source Python toolbox to extract more than 1442 features from standard mobile phone metadata. bandicoot makes it easy for machine learning researchers and practitioners to load mobile phone data, to analyze and visualize them, and to extract robust features which can be used for various classification and clustering tasks. Emphasis is put on ease of use, consistency, and documentation. bandicoot has no dependencies and is distributed under the MIT license.

    Behavioral attributes and financial churn prediction

    Customer retention is crucial in a variety of businesses, as acquiring new customers is often more costly than keeping current ones. As a consequence, churn prediction has attracted great attention from both the business and academic worlds. Traditional efforts in the financial domain mainly focus on domain-specific variables such as product ownership or service usage aggregation, without considering the dynamic behavioral patterns of customers' financial transactions. In this paper, we attempt to fill this gap by investigating the spatio-temporal patterns and entropy of choices underlying customers' financial decisions, and their relation to customer churn. Inspired by previous work in the emerging field of computational social science, we built a prediction model based on spatio-temporal and choice behavioral traits using individual transaction records. Our results show that the proposed dynamic behavioral models predict churn decisions significantly better than traditionally considered factors such as demographic-based features, and that this effect remains consistent across multiple data sets and various churn definitions. We further study the relative importance of the various behavioral features in churn prediction, and how the predictive power varies across different demographic groups. More generally, the proposed features can also be applied to churn prediction in other domains where spatio-temporal behavioral data are available.
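
The "entropy of choices" mentioned above can be illustrated with Shannon entropy computed over the categories a customer transacts in. The abstract does not give the paper's exact definition, so the sketch below is an assumption: the function name and the merchant-category labels are hypothetical.

```python
import math
from collections import Counter

def choice_entropy(choices):
    """Shannon entropy (in bits) of a sequence of categorical choices.

    Low entropy: spending concentrated in a few categories.
    High entropy: spending spread evenly across many categories.
    """
    counts = Counter(choices)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )

# Hypothetical transaction histories, one merchant category per purchase.
habitual = ["grocery", "grocery", "grocery", "grocery"]
diverse = ["grocery", "fuel", "travel", "dining"]

habitual_entropy = choice_entropy(habitual)  # single category: 0 bits
diverse_entropy = choice_entropy(diverse)    # four equally used categories: 2 bits
```

A per-customer value like this could then serve as one input feature to a churn classifier alongside spatio-temporal features.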

    Critical review of the book “Gaze in Human–Robot Communication”


    Purchase patterns, socioeconomic status, and political inclination

    This paper analyzes millions of credit card transaction records collected over several months for tens of thousands of individuals from two different countries. The study shows that purchase patterns are strongly correlated with important societal indices such as socioeconomic status and political inclination. The results suggest the possibility of understanding and predicting the evolution of such societal indices from purchase behavioral patterns, potentially at high temporal and spatial resolutions.