What's Hot in Software Engineering Twitter Space?
Abstract—Twitter is a popular means of disseminating information, and more than 300 million people currently use it actively. Software engineers are no exception; Singer et al. have shown that many developers use Twitter to stay current with recent technological trends. At various points in time, many users post microblogs (i.e., tweets) about the same topic on Twitter. We refer to such a reasonably large set of topically coherent microblogs, posted in the Twitter space at a particular point in time, as an event. In this work, we perform an exploratory study of software-engineering-related events on Twitter. We collect a large set of Twitter messages posted by 79,768 Twitter users over a period of 8 months and filter them by five programming language keywords. We then run a state-of-the-art Twitter event detection algorithm borrowed from the Natural Language Processing (NLP) domain. Next, using the open coding procedure, we manually analyze 1,000 events identified by the NLP tool and create eleven categories of events (10 main categories + “others”). We find that external resource sharing, technical discussion, and software product updates are the “hottest” categories. These findings shed light on hot topics on Twitter that interest many people, and they provide guidance to future Twitter analytics studies that develop automated solutions to help users find fresh, relevant, and interesting pieces of information in the Twitter stream and so keep developers up to date with recent trends.
Enhancing Usability and Explainability of Data Systems
The recent growth of data science has expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in data systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to the democratization of data systems. Furthermore, a proper understanding of data and data-driven systems is necessary for users to trust the function of systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, users deserve a proper explanation of what caused the observed incident. Unfortunately, most existing data systems offer limited usability and limited support for explanations: these systems are usable only by experts with sound technical skills, and even expert users are hindered by the lack of transparency into the systems' inner workings and functions. The aim of my thesis is to bridge the usability gap between nonexpert users and complex data systems, aid all sorts of users, including experts, in understanding data and systems, and provide explanations that help users reason about unexpected outcomes involving data systems. Specifically, my thesis has the following three goals: (1) enhancing the usability of data systems for nonexperts, (2) enabling data understanding that can assist users in a variety of tasks, such as achieving trust in data-driven machine learning, gaining data understanding, and data cleaning, and (3) explaining the causes of unexpected outcomes involving data and data systems.
To enhance usability, we focus on example-driven user intent discovery. We develop systems based on example-driven interactions in two different settings: querying relational databases and personalized document summarization. Towards data understanding, we develop a new data-profiling primitive that can characterize tuples for which a machine-learned model is likely to produce untrustworthy predictions. We also develop an explanation framework to explain the causes of such untrustworthy predictions. Additionally, this new data-profiling primitive enables interactive data cleaning. Finally, we develop two explanation frameworks tailored to provide explanations when debugging data system components, including the data itself. These explanation frameworks focus on explaining the root cause of a concurrent application's intermittent failure and on exposing issues in the data that cause a data-driven system to malfunction.