4 research outputs found
Controlling for Unobserved Confounds in Classification Using Correlational Constraints
As statistical classifiers become integrated into real-world applications, it
is important to consider not only their accuracy but also their robustness to
changes in the data distribution. In this paper, we consider the case where
there is an unobserved confounding variable z that influences both the
features x and the class variable y. When the influence of z changes from
training to testing data, we find that classifier accuracy can degrade
rapidly. In our approach, we assume that we can predict the value of z at
training time with some error. The prediction for z is then fed to Pearl's
back-door adjustment to build our model. Because of the attenuation bias
caused by measurement error in z, standard approaches to controlling for z
are ineffective. In response, we propose a method to properly control for the
influence of z by first estimating its relationship with the class variable
y, then updating predictions for z to match that estimated relationship. By
adjusting the influence of z, we show that we can build a model that exceeds
competing baselines on accuracy as well as on robustness over a range of
confounding relationships.
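As a rough illustration of the back-door adjustment mentioned in this abstract, the sketch below averages the confounder-conditional class probabilities over the marginal distribution of the confounder. Writing z for the confounder, x for the features, and y for the class (symbol names assumed here), it computes the sum over z of p(y | x, z) p(z). The function name and the probability tables are hypothetical and are not taken from the paper, and the sketch does not cover the paper's correction for measurement error in the predicted confounder.

```python
def backdoor_adjust(p_y_given_xz, p_z):
    """Back-door adjusted class distribution for one fixed feature vector x.

    p_y_given_xz[z][y] holds p(y | x, z); p_z[z] holds the marginal p(z).
    Returns a list whose y-th entry is sum_z p(y | x, z) * p(z).
    """
    n_classes = len(p_y_given_xz[0])
    return [
        sum(p_z[z] * p_y_given_xz[z][y] for z in range(len(p_z)))
        for y in range(n_classes)
    ]


# Hypothetical numbers: a binary confounder z and a binary class y.
p_y_given_xz = [
    [0.9, 0.1],  # p(y | x, z = 0)
    [0.2, 0.8],  # p(y | x, z = 1)
]
p_z = [0.5, 0.5]  # marginal p(z), assumed known for this sketch

adjusted = backdoor_adjust(p_y_given_xz, p_z)
```

Because the weights come from the marginal p(z) rather than from p(z | x), the adjusted prediction does not depend on how strongly z and x happen to be correlated in the training data.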
Appropriation of digital tracking tools in an online weight loss community: Individual and shared experiences
Online health communities provide a space where people seek out and provide support for weight loss activities, including tracking. Our study examined the experiences of members of an online community (r/loseit on Reddit.com) who posted about using digital tracking tools for weight loss. A targeted search yielded 379 public posts, which were analyzed using Thematic Analysis. Four themes reflected members’ individual and shared experiences: Tracking as gaining insight, Tracking as a vehicle of control, Confronting challenges in sustaining tracking, and Teaching and learning the skills of tracking. We highlight complex socio-technical processes that members developed around tracking tools and discuss how knowledge of these appropriations can be applied to designing future user-centered tracking tools to support weight loss. We also discuss how the social context of an online health community can shape both the usage of tracking tools and self-regulatory processes for health behaviour change.
Health Misinformation in Search and Social Media
People increasingly rely on the Internet in order to search for and share health-related information. Indeed, searching for and sharing information about medical treatments are among the most frequent uses of online data. While this is a convenient and fast method to collect information, online sources may contain incorrect information that has the potential to cause harm, especially if people believe what they read without further research or professional medical advice.
The goal of this thesis is to address the misinformation problem in two of the most commonly used online services: search engines and social media platforms. We examined how people use these platforms to search for and share health information. To achieve this, we designed controlled laboratory user studies and employed large-scale social media data analysis tools. The solutions proposed in this thesis can be used to build systems that better support people's health-related decisions.
The techniques described in this thesis addressed online searching and social media sharing in the following manner. First, with respect to search engines, we aimed to determine the extent to which people can be influenced by search engine results when trying to learn about the efficacy of various medical treatments. We conducted a controlled laboratory study wherein we biased the search results towards either correct or incorrect information. We then asked participants to determine the efficacy of different medical treatments. Results showed that people were significantly influenced both positively and negatively by search results bias. More importantly, when the subjects were exposed to incorrect information, they made more incorrect decisions than when they had no interaction with the search results.
Following from this work, we extended the study to gain insights into the strategies people use during this decision-making process, via the think-aloud method. We found that, even with verbalization, people were strongly influenced by the search results bias. We also noted that people paid attention to the majority opinion, authoritativeness, and content quality when evaluating online content. Understanding the effects of cognitive biases that can arise during online search is a complex undertaking because of the presence of unconscious biases (such as the effect of search results ranking) that the think-aloud method fails to reveal.
Moving to social media, we first proposed a solution to detect and track misinformation in social media. Using Zika as a case study, we developed a tool for tracking misinformation on Twitter. We collected 13 million tweets regarding the Zika outbreak and tracked rumors outlined by the World Health Organization and the Snopes fact-checking website. We incorporated health professionals, crowdsourcing, and machine learning to capture health-related rumors as well as clarification communications. In this way, we illustrated insights that the proposed tools provide into potentially harmful information on social media, allowing public health researchers and practitioners to respond with targeted and timely action.
Moving on from identifying rumor-bearing tweets, we examined individuals on social media who post questionable health-related information, in particular those promoting cancer treatments that have been shown to be ineffective. Specifically, we studied 4,212 Twitter users who had posted about one of 139 ineffective "treatments" and compared them to a baseline of users generally interested in cancer. Considering features that capture user attributes, writing style, and sentiment, we built a classifier that is able to identify users prone to propagating such misinformation. This classifier achieved an accuracy of over 90%, providing a potential tool for public health officials to identify such individuals for preventive intervention.