76 research outputs found

    Classifying the Evolving Mask Debate: A Transferable Machine Learning Framework

    Anti-maskers represent a community of people who oppose the use of face masks on the grounds that they infringe personal freedoms. This community has thoroughly exploited the convenience and reach of online social media platforms such as Facebook and Twitter to spread discordant information about the ineffectiveness and harm caused by masks in order to persuade people to shun their use. Automatic detection and demotion of anti-mask tweets is thus necessary to limit their damage. This is challenging because the mask dialogue continuously evolves with creative arguments that embed emerging knowledge about the virus, the changing socio-political landscape, and the present policies of public health officers and organizations. Therefore, this paper builds a transferable machine learning framework that can distinguish between anti-mask and pro-mask tweets from longitudinal data collected at four epochs during the pandemic. The framework extracts content, emotional, and engagement features that faithfully capture the patterns relevant to anti-mask rhetoric, but ignores those related to contextual details. It trains two ensemble learners and two neural network architectures using these features. Ensemble classifiers can identify anti-mask tweets with approximately 80% accuracy and F1-score from both individual and combined data sets. The invariant linguistic features extracted by the framework can thus form the basis of automated classifiers that can efficiently separate other types of falsehoods and misinformation from huge volumes of social media data.
    Warnken, J.; Gokhale, SS. (2022). Classifying the Evolving Mask Debate: A Transferable Machine Learning Framework. Journal of Computer-Assisted Linguistic Research. 6:1-18. https://doi.org/10.4995/jclr.2022.17493118

    Understanding Common Perceptions from Online Social Media

    Modern society habitually uses online social media services to publicly share observations, thoughts, opinions, and beliefs at any time and from any location. These geotagged social media posts may provide aggregate insights into people's perceptions on a broad range of topics across a given geographical area beyond what is currently possible through services such as Yelp and Foursquare. This paper develops probabilistic language models to investigate whether collective, topic-based perceptions within a geographical area can be extracted from the content of geotagged Twitter posts. The capability of the methodology is illustrated using tweets from three areas of different sizes. An application of the approach to support power grid restoration following a storm is presented.

    Understanding User Triads on Facebook

    Contemporary approaches that analyze user behavior on online social networks only consider interactions among dyads, which are pairs of directly connected users. A large body of sociological work, however, suggests that mutual connections among users can influence their activities, leading to differences between two- and three-way interactions. This paper explores the dynamics of triads among Facebook users based on the wall posts from the New Orleans regional network. Initially, each connection is categorized as a close friendship or an acquaintance, contingent on the number of wall posts exchanged. Subsequently, the impact of the different types of connections comprising triads on the post volume and inter-post times is examined. The analysis finds that these two properties are influenced by the number of close friendships constituting triads.
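    The two steps described in this abstract (labeling each connection by the number of wall posts exchanged, then grouping triads by how many close friendships they contain) can be illustrated with a minimal sketch. The threshold of 5 posts, the function names, and the toy data are assumptions made for illustration; the abstract does not state the cutoff or the implementation the authors used.

```python
from itertools import combinations


def classify_edges(post_counts, threshold=5):
    """Label each connection by wall posts exchanged.

    `threshold` is a hypothetical cutoff, not a value from the paper.
    """
    return {
        frozenset(pair): ("close" if n >= threshold else "acquaintance")
        for pair, n in post_counts.items()
    }


def triads_by_close_count(labels):
    """Group fully connected triads by their number of close friendships (0 to 3)."""
    users = sorted({u for edge in labels for u in edge})
    groups = {0: [], 1: [], 2: [], 3: []}
    for a, b, c in combinations(users, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        # Only count triads in which all three connections exist.
        if all(e in labels for e in edges):
            k = sum(labels[e] == "close" for e in edges)
            groups[k].append((a, b, c))
    return groups


# Toy data (invented): wall posts exchanged between pairs of users.
post_counts = {
    ("ann", "bob"): 12,
    ("bob", "cat"): 7,
    ("ann", "cat"): 2,
    ("cat", "dan"): 9,
}
labels = classify_edges(post_counts)
groups = triads_by_close_count(labels)
```

    With groups of this kind in hand, properties such as post volume or inter-post times could be compared across triads with zero to three close friendships.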

    Architecture-Based Software Reliability Analysis: Overview and Limitations

    No full text

    Long Range Dependence (LRD) in the Arrival Process of Web Robots

    There is strong evidence to suggest that a significant proportion of traffic on Web servers, across many domains, can be attributed to Web robots. With the advent of the Social Web, widespread use of semantic Web technologies, and development of service-oriented Web applications, it is expected that this proportion will only rise over time. One of the most important distinctions between robots and humans is the pattern with which they request resources from a Web server. In this paper, we examine the arrival process of Web robot requests across Web servers from three diverse domains. We find that, regardless of the domain, Web robot traffic exhibits long range dependence (LRD) similar to human traffic. We discuss why, at least in some cases, LRD in robot traffic may not be generated by heavy-tailed response sizes as in the case of human traffic.

    Detecting Web Robots Using Resource Request Patterns

    A significant proportion of Web traffic is now attributed to Web robots, and this proportion is likely to grow over time. These robots may threaten the security, privacy, functionality, and performance of a Web server due to their unregulated crawling behavior. Therefore, to assess their impact, it must be possible to accurately detect Web robot requests. Contemporary detection approaches, however, may cease to be effective as the behavior of both robots and humans evolves. In this paper, we present a novel detection approach that is based on the contrasts in the resource request patterns of robots and humans. The proposed scheme, which relies on an invariant behavioral difference between humans and robots, builds on the lessons from contemporary approaches. We demonstrate that the proposed approach can accurately detect Web robots and argue that it is expected to remain effective even as they continue their rapid evolution.

    A Classification Framework for Web Robots

    The behavior of modern web robots varies widely when they crawl for different purposes. In this article, we present a framework to classify these web robots from two orthogonal perspectives, namely, their functionality and the types of resources they consume. Applying the classification framework to a year-long access log from the UConn SoE web server, we present trends that point to significant differences in their crawling behavior.