389 research outputs found
Recommended from our members
Verifying baselines for crisis event information classification on Twitter
Social media are rich information sources during and in the aftermath of crisis events such as earthquakes and terrorist attacks. Despite myriad challenges, with the right tools, significant insight can be gained which can assist emergency responders and related applications. However, most extant approaches are incomparable, using bespoke definitions, models, datasets and even evaluation metrics. Furthermore, it is rare that code, trained models, or exhaustive parametrisation details are made openly available. Thus, even confirmation of self-reported performance is problematic; authoritatively determining the state of the art (SOTA) is essentially impossible. Consequently, to begin addressing such endemic ambiguity, this paper seeks to make 3 contributions: 1) the replication and results confirmation of a leading (and generalisable) technique; 2) testing straightforward modifications of the technique likely to improve performance; and 3) the extension of the technique to a novel and complimentary type of crisis-relevant information to demonstrate it’s generalisability
Partisan Asymmetries in Online Political Activity
We examine partisan differences in the behavior, communication patterns and
social interactions of more than 18,000 politically-active Twitter users to
produce evidence that points to changing levels of partisan engagement with the
American online political landscape. Analysis of a network defined by the
communication activity of these users in proximity to the 2010 midterm
congressional elections reveals a highly segregated, well clustered partisan
community structure. Using cluster membership as a high-fidelity (87% accuracy)
proxy for political affiliation, we characterize a wide range of differences in
the behavior, communication and social connectivity of left- and right-leaning
Twitter users. We find that in contrast to the online political dynamics of the
2008 campaign, right-leaning Twitter users exhibit greater levels of political
activity, a more tightly interconnected social structure, and a communication
network topology that facilitates the rapid and broad dissemination of
political information.Comment: 17 pages, 10 figures, 6 table
Fashion Conversation Data on Instagram
The fashion industry is establishing its presence on a number of
visual-centric social media like Instagram. This creates an interesting clash
as fashion brands that have traditionally practiced highly creative and
editorialized image marketing now have to engage with people on the platform
that epitomizes impromptu, realtime conversation. What kinds of fashion images
do brands and individuals share and what are the types of visual features that
attract likes and comments? In this research, we take both quantitative and
qualitative approaches to answer these questions. We analyze visual features of
fashion posts first via manual tagging and then via training on convolutional
neural networks. The classified images were examined across four types of
fashion brands: mega couture, small couture, designers, and high street. We
find that while product-only images make up the majority of fashion
conversation in terms of volume, body snaps and face images that portray
fashion items more naturally tend to receive a larger number of likes and
comments by the audience. Our findings bring insights into building an
automated tool for classifying or generating influential fashion information.
We make our novel dataset of {24,752} labeled images on fashion conversations,
containing visual and textual cues, available for the research community.Comment: 10 pages, 6 figures, This paper will be presented at ICWSM'1
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe
Learning structure and schemas from heterogeneous domains in networked systems: a survey
The rapidly growing amount of available digital documents of various formats and the possibility to access these through internet-based technologies in distributed environments, have led to the necessity to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections make it impossible to manually organize such documents. Additionally, most of the document sexist in an unstructured form and do not follow any schemas. Therefore, research efforts in this direction are being dedicated to automatically infer structure and schemas. This is essential in order to better organize huge collections as well as to effectively and efficiently retrieve documents in heterogeneous domains in networked system. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely, bio-informatics, sensor networks, social networks, P2Psystems, automation and control, transportation and privacy preserving for which we analyze the recent developments on dealing with unstructured data in such domains.Peer ReviewedPostprint (published version
Controversy trend detection in social media
In this research, we focus on the early prediction of whether topics are likely to generate significant controversy (in the form of social media such as comments, blogs, etc.). Controversy trend detection is important to companies, governments, national security agencies, and marketing groups because it can be used to identify which issues the public is having problems with and develop strategies to remedy them. For example, companies can monitor their press release to find out how the public is reacting and to decide if any additional public relations action is required, social media moderators can moderate discussions if the discussions start becoming abusive and getting out of control, and governmental agencies can monitor their public policies and make adjustments to the policies to address any public concerns. An algorithm was developed to predict controversy trends by taking into account sentiment expressed in comments, burstiness of comments, and controversy score. To train and test the algorithm, an annotated corpus was developed consisting of 728 news articles and over 500,000 comments on these articles made by viewers from CNN.com. This study achieved an average F-score of 71.3% across all time spans in detection of controversial versus non-controversial topics. The results suggest that it is possible for early prediction of controversy trends leveraging social media
- …