Pinterest Board Recommendation for Twitter Users
Pinboards on Pinterest are an emerging medium for engaging social media
users, who post images to them on specific topics. Despite their
significance, little previous work has specifically addressed
information discovery based on pinboards. This paper proposes a novel pinboard
recommendation system for Twitter users. To associate content from
the two social media platforms, we propose using multi-label classification to
map a Twitter user's followees to pinboard topics, and visual diversification to
recommend pinboards for the topics that interest the user. A preliminary experiment on a
dataset of 2,000 users validated the proposed system.
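The abstract does not say how the visual diversification step works. As a rough illustration only, the sketch below uses a greedy re-ranking in the style of maximal marginal relevance, which is a swapped-in technique and not necessarily the authors' method. The function names, the `trade_off` parameter, and the toy feature vectors are all hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def diversify(candidates, k, trade_off=0.5):
    """Greedily pick k boards, balancing relevance against visual redundancy.

    candidates: list of (board_id, relevance, visual_feature_vector).
    trade_off:  hypothetical weight; 1.0 means relevance only.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            _, relevance, vec = item
            # Penalize boards that look like something already selected.
            max_sim = max((cosine(vec, s[2]) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * max_sim

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [board_id for board_id, _, _ in selected]


# Toy data (assumed): boards "a" and "b" are visually identical.
boards = [
    ("a", 0.90, [1.0, 0.0]),
    ("b", 0.85, [1.0, 0.0]),
    ("c", 0.60, [0.0, 1.0]),
]
top_two = diversify(boards, k=2)
```

With this toy input the second slot goes to the visually distinct board rather than the near-duplicate, even though the duplicate has higher relevance.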
Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm
Twitter is a popular social network platform where users can interact and
post texts of up to 280 characters called tweets. Hashtags, hyperlinked words
in tweets, have increasingly become crucial for tweet retrieval and search.
Using hashtags for tweet topic classification is challenging because
of context dependence among words, slang, abbreviations, and emoticons in a short
tweet, along with the evolving use of hashtags. Since Twitter generates millions of
tweets daily, tweet analytics is a fundamental big data stream problem that
often requires real-time distributed processing. This paper proposes a
distributed online approach to tweet topic classification with hashtags.
Implemented on Apache Storm, a distributed real-time framework, our approach
incrementally identifies and updates a set of strong predictors in the Naïve
Bayes model for classifying each incoming tweet instance. Preliminary
experiments show promising results with up to 97% accuracy and a 37% increase in
throughput on eight processors.
Comment: IEEE International Conference on Big Data 201
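To make the idea of incrementally updating a Naïve Bayes model concrete, here is a minimal single-process sketch. It is an assumption-laden illustration, not the paper's implementation: it omits Apache Storm entirely, does not reproduce the selection of strong predictors, and the class name `OnlineNaiveBayes`, the smoothing parameter `alpha`, and the toy tweets are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one tweet at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing (assumed)
        self.class_counts = Counter()            # tweets seen per topic
        self.token_counts = defaultdict(Counter) # token counts per topic
        self.vocab = set()

    def update(self, tokens, topic):
        """Fold one labeled tweet into the running counts."""
        self.class_counts[topic] += 1
        for token in tokens:
            self.token_counts[topic][token] += 1
            self.vocab.add(token)

    def predict(self, tokens):
        """Return the most probable topic, or None before any training."""
        total = sum(self.class_counts.values())
        if total == 0:
            return None
        # +1 reserves a smoothing bucket for tokens never seen in training.
        vocab_size = len(self.vocab) + 1
        best_topic, best_score = None, float("-inf")
        for topic, n in self.class_counts.items():
            counts = self.token_counts[topic]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = log(n / total)
            for token in tokens:
                score += log((counts[token] + self.alpha) / denom)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


# Toy stream (assumed): hashtags are kept as ordinary tokens.
model = OnlineNaiveBayes()
model.update(["#nba", "dunk", "game"], "sports")
model.update(["#election", "vote", "poll"], "politics")
guess = model.predict(["#nba", "game"])
```

Because the model is just a set of counters, each incoming tweet can update it in constant time per token, which is the property that makes this family of models a natural fit for stream processing.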
Social Media for Cities, Counties and Communities
Social media (e.g., Twitter, Facebook, Flickr, YouTube) and other tools and services with user-generated content have made a staggering amount of information (and misinformation) available. Some government officials seek to leverage these resources to improve services and communication with citizens, especially during crises and emergencies. Yet the sheer volume of social data streams generates substantial noise that must be filtered. Potential exists to rapidly identify issues of concern for emergency management by detecting meaningful patterns or trends in the stream of messages and information flow. Similarly, monitoring these patterns and themes over time could provide officials with insights into the perceptions and mood of the community that cannot be collected through traditional methods (e.g., phone or mail surveys) because of their substantial costs, especially in light of shrinking government budgets at all levels. We conducted a pilot study in 2010 with government officials in Arlington, Virginia (and to a lesser extent representatives of groups from Alexandria and Fairfax, Virginia) with a view to contributing to a general understanding of the use of social media by government officials as well as community organizations, businesses and the public. We were especially interested in gaining greater insight into social media use in crisis situations (whether severe or fairly routine crises, such as traffic or weather disruptions).
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Machine learning has been a big success story during the AI resurgence. One
particular standout success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there
is increasing recognition of the value of utilizing knowledge whenever it is available or
can be created purposefully. In this paper, we discuss the indispensable role
of knowledge for deeper understanding of content where (i) large amounts of
training data are unavailable, (ii) the objects to be recognized are complex
(e.g., implicit entities and highly subjective content), and (iii) applications
need to use complementary or related data in multiple modalities/media. What
brings us to the cusp of rapid progress is our ability to (a) create relevant
and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP
techniques. Using diverse examples, we seek to foretell unprecedented progress
in our ability for deeper understanding and exploitation of multimodal data and
continued incorporation of knowledge in learning techniques.
Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International
Conference on Web Intelligence (WI). arXiv admin note: substantial text
overlap with arXiv:1610.0770
Slides: Augmenting Lightweight Domain Ontologies with Social Evidence Sources
Recent research shows the potential of utilizing data collected through Web 2.0 applications to capture changes in a domain's terminology. This paper presents an approach to augment corpus-based ontology learning by considering terms from collaborative tagging systems, social networking platforms, and micro-blogging services. The proposed framework collects information on the domain's terminology from domain documents and a seed ontology in a triple store. Data from social sources such as Delicious, Flickr, Technorati and Twitter provide an outside view of the domain and help incorporate external knowledge into the ontology learning process. The neural network technique of spreading activation is used to identify relevant new concepts and to determine their positions in the extended ontology. Evaluating the method with two measures (PMI and expert judgements) demonstrates the significant benefits of social evidence sources for ontology learning.
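Spreading activation, mentioned in the abstract above, can be sketched generically as energy flowing outward from seed terms along weighted links. The version below is a minimal illustration under assumed settings: the `decay`, `threshold`, and `max_steps` parameters, the function name, and the toy term graph are hypothetical and are not taken from the paper.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.01, max_steps=3):
    """Propagate activation from seed nodes through a weighted directed graph.

    graph: {node: {neighbor: link_weight}}
    seeds: {node: initial_activation}
    Returns the accumulated activation per reached node.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed < threshold:
                    continue  # too weak to be worth propagating
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Toy term graph (assumed): weights stand in for co-occurrence strength.
terms = {
    "climate": {"warming": 0.8, "weather": 0.4},
    "warming": {"emissions": 0.5},
}
result = spread_activation(terms, {"climate": 1.0})
ranked = sorted(result, key=result.get, reverse=True)
```

Terms that end up with high activation without being seeds would be the candidates for new concepts in a setup like the one described; how the paper maps activation levels to positions in the ontology is not stated in the abstract.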