SMS: A Framework for Service Discovery by Incorporating Social Media Information
With the explosive growth of services, including Web services, cloud services, APIs and mashups, discovering appropriate services for consumers has become a pressing issue. Traditional service discovery approaches face two main challenges: 1) a single source of description documents limits the effectiveness of discovery because of insufficient semantic information; 2) more factors must be considered as the functional and non-functional requirements of consumers grow. In this paper, we propose a novel framework, called SMS, for effectively discovering appropriate services by incorporating social media information. Specifically, we present different methods to measure four social factors (semantic similarity, popularity, activity, decay factor) collected from Twitter. A Latent Semantic Indexing (LSI) model is applied to mine semantic information about services from the meta-data of the Twitter Lists that contain them. In addition, we model the target query-service matching function as a linear combination of multiple social factors and design a weight learning algorithm to learn an optimal combination of the measured social factors. Comprehensive experiments on a real-world dataset crawled from Twitter demonstrate the effectiveness of the proposed SMS framework against several baseline approaches.
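The linear query-service matching described in this abstract can be sketched as a weighted sum of per-service factor scores. The factor names, example values, and weights below are illustrative placeholders, not values from the paper:

```python
# Sketch of a linear query-service matching function:
# score(s) = sum_i w_i * f_i(s), over social factors f_i.

def match_score(factors, weights):
    """Combine a service's social factors linearly under the given weights."""
    assert set(factors) == set(weights)
    return sum(weights[k] * factors[k] for k in factors)

def rank_services(services, weights):
    """Order candidate services by combined score, best first."""
    return sorted(services,
                  key=lambda s: match_score(s["factors"], weights),
                  reverse=True)

# Hypothetical candidate services with factor scores in [0, 1].
services = [
    {"name": "svcA", "factors": {"semantic": 0.9, "popularity": 0.2,
                                 "activity": 0.5, "decay": 0.8}},
    {"name": "svcB", "factors": {"semantic": 0.4, "popularity": 0.9,
                                 "activity": 0.7, "decay": 0.6}},
]
# In the paper these weights are learned; here they are fixed by hand.
weights = {"semantic": 0.5, "popularity": 0.2, "activity": 0.2, "decay": 0.1}
ranking = rank_services(services, weights)
```

In the actual framework the weight vector would be fit by the proposed learning algorithm rather than set manually.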
Enhancing Geospatial Data: Collecting and Visualising User-Generated Content Through Custom Toolkits and Cloud Computing Workflows
Through this thesis we set the hypothesis that, via the creation of a set of custom toolkits using cloud computing, online user-generated content can be extracted from emerging large-scale data sets, allowing the collection, analysis and visualisation of geospatial data by social scientists. Using a custom-built suite of software, known as the 'BigDataToolkit', we examine the need for and use of cloud computing and custom workflows to open up access to existing online data, as well as setting up processes to enable the collection of new data. We examine the use of the toolkit to collect large amounts of data from various online sources, such as social media Application Programming Interfaces (APIs) and data stores, and to visualise the collected data in real time. Through the execution of these workflows, this thesis presents an implementation of a smart collector framework that automates the collection process to significantly increase the amount of data that can be obtained from standard API endpoints. Through these interconnected methods and distributed collection workflows, the final system is able to collect and visualise a larger amount of data in real time than the single-system data collection processes used within traditional social media analysis. Aimed at researchers without a core understanding of the intricacies of computer science, this thesis provides a methodology to open up new data sources not only to academics but also to wider participants, allowing the collection of user-generated geographic and textual content en masse. A series of case studies is provided, covering applications from a single researcher collecting data through to collection via televised media. These are examined in terms of the tools created and the opportunities opened up, allowing real-time analysis of data collected via the developed toolkit.
Search Bias Quantification: Investigating Political Bias in Social Media and Web Search
Users frequently use Web search systems and online social media to learn about ongoing events and public opinion on personalities. Prior studies have shown that the top-ranked results returned by these search engines can shape user opinion about the topic (e.g., an event or person) being searched. For polarizing topics like politics, where multiple competing perspectives exist, political bias in the top search results can play a significant role in shaping public opinion towards (or away from) certain perspectives. Given the considerable impact that search bias can have on users, we propose a generalizable search bias quantification framework that not only measures the political bias in the ranked list output by a search system but also decouples the bias introduced by its two sources: the input data and the ranking system. We apply our framework to study political bias in searches related to the 2016 US Presidential primaries in Twitter social media search and find that both the input data and the ranking system matter in determining the final search output bias seen by users. Finally, we use the framework to compare the relative bias of two popular search systems, Twitter social media search and Google web search, for queries related to politicians and political events. We end by discussing some potential solutions for signalling the bias in search results to make users more aware of it.
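The decoupling of input bias from ranking bias can be illustrated with a toy decomposition: score each result's political polarity, measure the bias already present in the candidate pool, and attribute the remainder of the rank-weighted output bias to the ranking system. The polarity values and the specific decomposition rule below are simplifying assumptions, not the paper's exact metrics:

```python
# Toy decomposition of search bias: ranking bias = output bias - input bias.
# Per-result polarity scores in [-1, 1] are assumed to be given.

def input_bias(polarities):
    """Bias already present in the candidate pool, before ranking."""
    return sum(polarities) / len(polarities)

def output_bias(polarities, k=3):
    """Rank-weighted bias of the top-k results (weight 1/rank)."""
    top = polarities[:k]
    weights = [1.0 / (i + 1) for i in range(len(top))]
    return sum(w * p for w, p in zip(weights, top)) / sum(weights)

# Hypothetical ranked result list, most-leaning items ranked first.
pool = [0.8, 0.6, -0.2, -0.9, 0.1]
ib = input_bias(pool)            # bias of the data the system started from
ob = output_bias(pool)           # bias the user actually sees at the top
ranking_bias = ob - ib           # bias attributable to the ranker
```

A positive `ranking_bias` here would indicate the ranker amplified one leaning beyond what the input data alone would explain.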
CommuniSense: Crowdsourcing Road Hazards in Nairobi
Nairobi is one of the fastest growing metropolitan cities and a major business and technology powerhouse in Africa. However, Nairobi currently lacks monitoring technologies to obtain reliable data on traffic and road infrastructure conditions. In this paper, we investigate the use of mobile crowdsourcing as a means to gather and document Nairobi's road quality information. We first present the key findings of a city-wide survey on the perception of existing road quality conditions in Nairobi. Based on the survey's findings, we then developed a mobile crowdsourcing application, called CommuniSense, to collect road quality data. The application serves as a tool for users to locate, describe, and photograph road hazards. We tested our application through a two-week field study amongst 30 participants to document various forms of road hazards from different areas in Nairobi. To verify the authenticity of user-contributed reports from our field study, we proposed to use online crowdsourcing via Amazon's Mechanical Turk (MTurk) to check whether submitted reports indeed depict road hazards. We found 92% of user-submitted reports to match the MTurkers' judgements. While our prototype was designed and tested in a specific city, our methodology is applicable to other developing cities. In Proceedings of the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI 2015).
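The MTurk verification step described above can be sketched as a majority vote over crowd judgements per report, followed by computing the fraction of reports the crowd confirms. The label names and example judgements are illustrative, not from the study:

```python
# Sketch of crowd verification: each submitted report receives several
# MTurk judgements; a report counts as verified when the majority label
# says it depicts a road hazard.
from collections import Counter

def majority_vote(judgements):
    """Return the most common label among the crowd judgements."""
    return Counter(judgements).most_common(1)[0][0]

def match_rate(reports):
    """Fraction of reports whose majority judgement is 'hazard'."""
    verified = sum(1 for js in reports if majority_vote(js) == "hazard")
    return verified / len(reports)

# Hypothetical judgements: three MTurk workers per report.
reports = [
    ["hazard", "hazard", "not_hazard"],
    ["hazard", "hazard", "hazard"],
    ["not_hazard", "not_hazard", "hazard"],
    ["hazard", "not_hazard", "hazard"],
]
rate = match_rate(reports)   # 3 of 4 reports verified
```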
Leveraging Social Media and Web of Data for Crisis Response Coordination
There is an ever-increasing number of users on social media (1B+ Facebook users, 500M+ Twitter users) and with ubiquitous mobile access (6B+ mobile phone subscribers) who share their observations and opinions. In addition, the Web of Data and existing knowledge bases keep growing at a rapid pace. In this scenario, we have unprecedented opportunities to improve crisis response by extracting social signals, creating spatio-temporal mappings, performing analytics on social and Web of Data, and supporting a variety of applications. Such applications can help provide situational awareness during an emergency, improve preparedness, and assist during the rebuilding/recovery phase of a disaster. Data mining can provide valuable insights to support emergency responders and other stakeholders during a crisis. However, there are a number of challenges, and existing computing technology may not work in all cases. Therefore, our objective here is to present a characterization of such data mining tasks and the challenges that need further research attention.
Aggregating Content and Network Information to Curate Twitter User Lists
Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important: they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.
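One simple way to aggregate the content, network, and crowdsourced "views" into a single user ranking is a Borda count over the per-view rankings. The abstract does not specify this exact scheme, so treat it as an illustrative aggregator with hypothetical user names:

```python
# Borda-count aggregation of several ranked "views" of candidate users.
# Each view lists users best-first; a user in position p of a length-n
# ranking earns n - p points, and totals decide the combined order.

def borda_aggregate(rankings):
    """Sum positional scores across views and return users best-first."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, user in enumerate(ranking):
            scores[user] = scores.get(user, 0) + (n - pos)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-view rankings for one news story.
content_view = ["alice", "bob", "carol"]
network_view = ["bob", "alice", "carol"]
crowd_view   = ["alice", "carol", "bob"]
combined = borda_aggregate([content_view, network_view, crowd_view])
```

Weighting the views unequally (e.g., trusting the crowdsourced view more for mature stories) would be a natural refinement of this sketch.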
Exploring Societal Computing based on the Example of Privacy
Data privacy when using online systems like Facebook and Amazon has become an increasingly popular topic in the last few years. This thesis will consist of the following four projects that aim to address the issues of privacy and software engineering.
First, only a little is known about how users and developers perceive privacy and which concrete measures would mitigate their privacy concerns. To investigate privacy requirements, we conducted an online survey with closed and open questions and collected 408 valid responses. Our results show that users often reduce privacy to security, with data sharing and data breaches being their biggest concerns. Users are more concerned about the content of their documents and their personal data such as location than about their interaction data. Unlike users, developers clearly prefer technical measures like data anonymization and think that privacy laws and policies are less effective. We also observed interesting differences between people from different geographies. For example, people from Europe are more concerned about data breaches than people from North America. People from Asia/Pacific and Europe believe that content and metadata are more critical for privacy than people from North America. Our results contribute to developing a user-driven privacy framework that is based on empirical evidence in addition to the legal, technical, and commercial perspectives.
Second, a related challenge is to make privacy more understandable in complex systems that may have a variety of user interface options, which may change often. As social network platforms have evolved, the ability for users to control how and with whom information is shared introduces challenges concerning the configuration and comprehension of privacy settings. To address these concerns, our crowdsourced approach simplifies the understanding of privacy settings by using data collected from 512 users over a 17-month period to generate visualizations that allow users to compare their personal settings to an arbitrary subset of individuals of their choosing. To validate our approach we conducted an online survey with closed and open questions and collected 59 valid responses, after which we conducted follow-up interviews with 10 respondents. Our results showed that 70% of respondents found visualizations using crowdsourced data useful for understanding privacy settings, and 80% preferred a crowdsourced tool for configuring their privacy settings over current privacy controls.
Third, as software evolves over time, this might introduce bugs that breach users' privacy. Further, there might be system-wide policy changes that could change users' settings to be more or less private than before. We present a novel technique that can be used by end-users for detecting changes in privacy, i.e., regression testing for privacy. Using a social approach for detecting privacy bugs, we present two prototype tools. Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs. We highlight two interesting case studies on the bugs that were discovered using our tools. To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.
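The idea of regression testing for privacy can be sketched as diffing two snapshots of a user's effective visibility settings, taken before and after a software update, and flagging any setting that silently became more public. The snapshot format and visibility scale below are hypothetical, not the thesis's actual tool:

```python
# Minimal sketch of a privacy regression check: compare settings snapshots
# and flag keys whose visibility widened after an update.

# Assumed ordering: larger value = more public.
VISIBILITY = {"only_me": 0, "friends": 1, "public": 2}

def privacy_regressions(before, after):
    """Return the settings whose visibility widened between snapshots."""
    return [key for key in before
            if VISIBILITY[after.get(key, before[key])]
               > VISIBILITY[before[key]]]

# Hypothetical snapshots around a platform update.
before = {"photos": "friends", "email": "only_me", "posts": "friends"}
after  = {"photos": "public",  "email": "only_me", "posts": "friends"}
flagged = privacy_regressions(before, after)   # "photos" became more public
```

A real tool would take such snapshots automatically (e.g., on every release) and alert the end-user when the flagged list is non-empty.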
Fourth, approaches to addressing these privacy concerns typically require substantial extra computational resources, which might be beneficial where privacy is concerned, but may have a significant negative impact with respect to Green Computing and sustainability, another major societal concern. Spending more computation time results in spending more energy and other resources that make the software system less sustainable. Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable: systems where privacy could be achieved "for free", i.e., without having to spend extra computational effort. We describe how privacy can indeed be achieved for free, as an accidental and beneficial side effect of doing some existing computation, in web applications and online systems that have access to user data. We show the feasibility, sustainability, and utility of our approach and what types of privacy threats it can mitigate.
Finally, we generalize the problem of privacy and its tradeoffs. As Social Computing has increasingly captivated the general public, it has become a popular research area for computer scientists. Social Computing research focuses on online social behavior and on using artifacts derived from it to provide recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regard to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern. We propose a new crosscutting research area, Societal Computing, that focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. We highlight some of the relevant research topics and open problems that we foresee in Societal Computing. We feel that these topics, and Societal Computing in general, need to gain prominence, as they will provide useful avenues of research leading to increasing benefits for society as a whole.