
    Stack Overflow: A Code Laundering Platform?

    Developers use Question and Answer (Q&A) websites to exchange knowledge and expertise. Stack Overflow is a popular Q&A website where developers discuss coding problems and share code examples. Although all Stack Overflow posts are free to access, code examples on Stack Overflow are governed by the Creative Commons Attribution-ShareAlike 3.0 Unported license, which developers must comply with when reusing code from Stack Overflow or posting code to Stack Overflow. In this paper, we conduct a case study with 399 Android apps to investigate whether developers respect license terms when reusing code from Stack Overflow posts (and the other way around). We found 232 code snippets in 62 Android apps from our dataset that were potentially reused from Stack Overflow, and 1,226 Stack Overflow posts containing code examples that are clones of code released in 68 Android apps, suggesting that developers may have copied the code of these apps to answer Stack Overflow questions. We investigated the licenses of these pieces of code and observed 1,279 cases of potential license violations (related to code posting to Stack Overflow or code reuse from Stack Overflow). This paper aims to raise the awareness of the software engineering community about potential unethical code reuse activities taking place on Q&A websites like Stack Overflow.
    Comment: In proceedings of the 24th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)
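The abstract above matches app code against Stack Overflow snippets but does not name the clone detector used. As a purely illustrative, hypothetical sketch of one common family of techniques, the snippet below computes token-shingle Jaccard similarity between two pieces of code; the function names and the 5-token window size are assumptions for illustration, not the paper's method.

```python
import re

def token_shingles(code, n=5):
    """Split code into identifier/symbol tokens and collect n-token shingles."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", code)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(a, b, n=5):
    """Jaccard similarity of the two snippets' shingle sets (1.0 = identical)."""
    sa, sb = token_shingles(a, n), token_shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

In practice a threshold on this score would flag a snippet pair as a potential clone; dedicated clone detectors add normalization (renaming identifiers, stripping whitespace and comments) on top of this basic idea.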

    Senior Programmers: Characteristics of Elderly Users from Stack Overflow

    In this paper we present the results of research on elderly users of Stack Overflow (a Question and Answer portal for programmers). They have different roles, different main activities, and different habits. They are an important part of the community, as they tend to have higher reputation and like to share their knowledge. This is a great example of a possible way of keeping elderly people active and helpful to society.

    Personalized Approaches to Supporting the Learning Needs of Lifelong Professional Learners

    Advanced learning technology research has begun to take on a complex challenge: supporting lifelong learning. Professional learning is an essential subset of lifelong learning that is more tractable than the full lifelong learning challenge. Professionals do not always have access to professional teachers to provide input to the problems they encounter, so they rely on their peers in an online learning community (OLC) to help meet their learning needs. Supporting professional learners within an OLC is a difficult problem as the learning needs of each learner continuously evolve, often in different ways from other learners. Hence, there is a need to provide personalized support to learners adapted to their individual learning needs. This thesis explores personalized approaches for detecting the unperceived learning needs and meeting the expressed learning needs of learners in an OLC. The experimental test bed for this research is Stack Overflow (SO), an OLC used by software professionals. To date, seven experiments have been carried out mining SO peer-peer interaction data. Knowing that question-answerers play a huge role in meeting the learning needs of the question-askers, the first experiment aimed to detect the learning needs of the answerers. Results from experiment 1 show that reputable answerers themselves demonstrate unperceived learning needs as revealed by a decline in quality answers in SO. Of course, a decline in quality answers could impact the help-seeking experience of question-askers; hence experiment 2 sought to understand the effects of the help-seeking experience of question-askers on their enthusiasm to continuously participate within the OLC. As expected, negative help-seeking experiences of question-askers had a large impact on their propensity to seek further help within the OLC. 
To improve the help-seeking experience of question-askers, it is important to proactively detect the learning needs of the question-answerers before they provide poor quality answers. Thus, in experiment 3 the goal was to predict whether a question-answerer would give a poor answer to a question based on their past peer-peer interactions. Under various assumptions, accuracies ranging from 84.57% to 94.54% were achieved. Next, experiment 4 attempted to detect the unperceived learning needs of question-askers even before they are aware of such needs. Using information about a learner’s interactions over a 5-month period, a prediction was made as to what they would be asking about during the next month, achieving recall and precision values of 0.93 and 0.81. Knowing the learning needs of question-askers early creates an opportunity to predict prospective answerers who could provide timely and quality answers to their question. The goal of experiment 5 was thus to predict the actual answerers for questions based only on information known at the time the question was asked. The success rate was at best 63.15%, which would only be marginally useful to inform a real-life peer recommender system. Thus, experiment 6 explored new measures in predicting the answerers, boosting the success rate to 89.64%. Of course, a peer recommender system would be deemed to be especially useful if it can provide prompt interventions, especially to get answers to questions that would otherwise not be answered quickly. To this end, experiment 7 attempted to predict the question-askers whose questions would be answered late or even remain unanswered, and a success rate of 68.4% was achieved. Results from these experiments suggest that modelling the activities of learners in an OLC is key in providing support to them to meet their learning needs.
Perhaps the most important lesson learned in this research is that lightweight approaches can be developed to help meet the evolving learning needs of professionals, even as knowledge changes within a profession. Metrics based on the experiments above are exactly such lightweight methodologies and could be the basis for useful tools to support professional learners.
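The experiments above report set-based recall and precision (e.g., 0.93 and 0.81 for predicting what a learner will ask about next month). As a generic reminder of how such metrics are computed, here is a minimal sketch; the helper name and the toy topic lists are invented for illustration, not taken from the thesis.

```python
def precision_recall(predicted, actual):
    """Set-based precision and recall of predicted items vs. ground truth."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)                        # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return precision, recall

# Toy example: topics predicted for next month vs. topics actually asked about.
p, r = precision_recall(["python", "pandas", "regex"], ["python", "regex"])
```

Here two of the three predicted topics were correct (precision 2/3), and both topics actually asked about were predicted (recall 1.0).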

    What and How Do People Ask in Russian on Community Question Answering Services?

    In our study we surveyed different approaches to the study of questions in traditional linguistics, question answering (QA), and, more recently, community question answering (CQA). We adapted a functional-semantic classification scheme for CQA data and manually labeled 2,000 questions in Russian originating from the [email protected] CQA service. About half of them are purely conversational and do not aim at obtaining actual information. In the subset of meaningful questions, the major classes are requests for recommendations, or how-questions, and fact-seeking questions. The data demonstrate a variety of interrogative sentences as well as a host of formally non-interrogative expressions with the meaning of questions and requests. The observations can be of interest both for linguistics and for practical applications.

    Ways to Be Worse Off

    Does disability make a person worse off? I argue that the best answer is yes and no, because we can be worse off in two conceptually distinct ways. Disabilities usually make us worse off in one way (typified by facing hassles) but not in the other (typified by facing loneliness). Acknowledging two conceptually distinct ways to be worse off has fundamental implications for philosophical theories of well-being. (This paper was awarded the APA’s Routledge, Taylor & Francis Prize in 2017.)

    Identifying Roles of Software Developers from their Answers on Stack Overflow

    Stack Overflow is the world’s largest community of software developers. Users ask and answer questions on various tagged topics of software development. The set of questions a site user answers is representative of their knowledge base, or “wheelhouse”. It is proposed that clustering users by their wheelhouse yields communities of similar software developers by skill-set. These communities represent the different roles within software development and could be used as the basis to define roles at any point in time in an ever-evolving landscape of software development. A network graph of site users, linked if they answered questions on the same topic, was created. Eight distinct communities were identified using the Louvain method. The modularity of this set of communities was 0.46, indicating the presence of community structure that is unlikely to occur randomly. This partition was validated against the results of previous research that used data from the same time period. By extracting the top 5 tags from each identified community, the harmonic F1-score between the communities and the external dataset was found to be 0.75. It was shown with 95% confidence that the communities identified were not identical to the results of the previous research. Nonetheless, there exists a strong similarity to the previous research. Hence, it was suggested that Stack Overflow data could be used to identify and define roles within software development. Upon applying this method to 2021 data, a previously unknown community of experts in R, C, and Rust was identified. The method used in this research could be applied directly to any of the 177 Stack Exchange sites and could be used to form the basis of job roles for a wide range of industries.
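The pipeline in this abstract (build a graph of co-answering users, partition it with the Louvain method, judge the partition by its modularity of 0.46) rests on Newman modularity, which is simple to compute directly. Below is a minimal sketch on an invented six-user toy graph, not the study's data; production Louvain implementations, such as the `louvain_communities` function in NetworkX, greedily optimize this same quantity.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of an undirected graph under a given partition.
    edges: list of (u, v) pairs; community: dict mapping node -> community id."""
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)        # number of edges inside each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    deg_sum = defaultdict(int)      # total degree attached to each community
    for node, d in degree.items():
        deg_sum[community[node]] += d
    # Q = sum over communities of (intra-edge fraction - expected fraction)
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)

# Toy graph: users linked if they answered questions on the same topic.
edges = [("a", "b"), ("b", "c"), ("a", "c"),    # tight cluster 1
         ("d", "e"), ("e", "f"), ("d", "f"),    # tight cluster 2
         ("c", "d")]                            # single bridge edge
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
```

For this toy partition Q is about 0.36: well above zero, which is the same kind of evidence of non-random community structure that the study's 0.46 provides.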

    Holistic recommender systems for software engineering

    The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component in the modern developer’s daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in the form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a purely textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model, and analyze information contained in development artifacts in a reusable meta-information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework.
The information can thus be reinterpreted from a holistic point of view, preserving its multi-dimensionality and opening the path towards the concept of holistic recommender systems for software engineering.

    A Human-Centric System for Symbolic Reasoning About Code

    While testing and tracing on specific input values are useful starting points for students to understand program behavior, ultimately students need to be able to reason rigorously and logically about the correctness of their code on all inputs without having to run the code. Symbolic reasoning is reasoning abstractly about code using arbitrary symbolic input values, as opposed to specific concrete inputs. The overarching goal of this research is to help students learn symbolic reasoning, beginning with code containing simple assertions as a foundation and proceeding to code involving data abstractions and loop invariants. Toward achieving this goal, this research has employed multiple experiments across five years at three institutions: a large public university, an HBCU (Historically Black College or University), and an HSI (Hispanic-Serving Institution). A total of 862 students participated across all variations of the study. Interactive, online tools can enhance student learning because they can provide targeted help that would be prohibitively expensive without automation. The research experiments employ two such symbolic reasoning tools that had been developed earlier and a newly designed human-centric reasoning system (HCRS). The HCRS is a first step in building a generalized tutor that achieves a level of resolution necessary to identify difficulties and suggest appropriate interventions. The experiments show the value of tools in pinpointing and classifying difficulties in learning symbolic reasoning, as well as in learning design-by-contract assertions and applying them to develop loop invariants for code involving objects. Statistically significant results include the following. Students are able to learn symbolic reasoning with the aid of instruction and an online tool. Motivation improves student perception of and attitude towards symbolic reasoning.
Tool usage improves student performance on symbolic reasoning, their explanations of the larger purpose of code segments, and self-efficacy for all subpopulations.

    A Computational Model of Trust Based on Dynamic Interaction in the Stack Overflow Community

    A member’s reputation in an online community is a quantified representation of their trustworthiness within the community. Reputation is calculated using rules-based algorithms that are primarily tied to the upvotes or downvotes a member receives on posts. The main drawback of this form of reputation calculation is its inability to consider dynamic factors such as a member’s activity (or inactivity) within the community. This research involves the construction of dynamic mathematical models to calculate reputation, and then determines to what extent their results compare with those of rules-based models. It begins with exploratory research into the existing corpus of knowledge, followed by constructive research to build dynamic mathematical models, and then empirical research to determine the effectiveness of the models. Data collected from the Stack Overflow (SO) database is used by the models to calculate rules-based and dynamic member reputations, and statistical correlation testing methods (i.e., Pearson and Spearman) are then used to determine the extent of the relationship between them. Statistically significant results with a moderate relationship size were found from correlation testing between the rules-based and dynamic temporal models. The significance of the research, and its conclusion that dynamic and temporal models can indeed produce results comparable to those of subjective vote-based systems, is important in the context of building trust in online communities. Developing models that determine reputation in online communities based upon member post and comment activity avoids the potential drawbacks associated with vote-based reputation systems.
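The comparison between rules-based and dynamic reputation scores rests on Pearson (linear) and Spearman (rank) correlation. As a self-contained sketch of what those two coefficients measure, here are dependency-free implementations; the toy inputs in the tests are invented, not the study's SO data. In practice one would typically use `scipy.stats.pearsonr` and `scipy.stats.spearmanr`, which also return p-values for the significance testing the abstract mentions.

```python
def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Spearman is the natural choice when the two reputation scales agree on ordering but not on magnitude, which is why studies of this kind commonly report both.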