2 research outputs found

    IMPUTING SOCIAL DEMOGRAPHIC INFORMATION BASED ON PASSIVELY COLLECTED LOCATION DATA AND MACHINE LEARNING METHODS

    Multiple types of passively collected location data (PCLD) have emerged during the past 20 years, and their capability in travel demand analysis has been studied and demonstrated. Unlike traditional surveys, whose samples are designed efficiently and carefully, PCLD features a non-probabilistic sample of dramatically larger size. However, PCLD contains almost no ground truth for either the human subjects involved or the movements they produce. The imputation of such missing information, including origin and destination, travel mode, and trip purpose, has been evaluated for years. This research intends to advance the utilization of PCLD by imputing social demographic information, which can help to create a panorama of the large volume of travel behaviors observed and to further develop a rational weighting procedure for PCLD. The Conditional Inference Tree model has been employed to address these problems because of its ability to avoid biased variable selection and overfitting.

    What demographic attributes do our digital footprints reveal? A systematic review

    To what extent does our online activity reveal who we are? Recent research has demonstrated that the digital traces left by individuals as they browse and interact with others online may reveal who they are and what their interests may be. In the present paper we report a systematic review that synthesises current evidence on predicting demographic attributes from online digital traces. Studies were included if they met the following criteria: (i) they reported findings where at least one demographic attribute was predicted/inferred from at least one form of digital footprint, (ii) the method of prediction was automated, and (iii) the traces were either visible (e.g. tweets) or non-visible (e.g. clickstreams). We identified 327 studies published up until October 2018. Across these articles, 14 demographic attributes were successfully inferred from digital traces; the most studied included gender, age, location, and political orientation. For each of the demographic attributes identified, we provide a database containing the platforms and digital traces examined, sample sizes, accuracy measures and the classification methods applied. Finally, we discuss the main research trends/findings, methodological approaches and recommend directions for future research.