Enriched by natural language texts, Stack Overflow code snippets are an
invaluable code-centric knowledge base of small units of source code. Besides
being useful for software developers, these annotated snippets can potentially
serve as the basis for automated tools that provide working code solutions to
specific natural language queries.
With the goal of developing automated tools with the Stack Overflow snippets
and surrounding text, this paper investigates the following questions: (1) How
usable are the Stack Overflow code snippets? and (2) When using text search
engines for matching on the natural language questions and answers around the
snippets, what percentage of the top results contain usable code snippets?
A total of 3M code snippets are analyzed across four languages: C\#, Java,
JavaScript, and Python. Python and JavaScript proved to be the languages for
which the most code snippets are usable. Conversely, Java and C\# proved to be
the languages with the lowest usability rate. Further qualitative analysis on
usable Python snippets shows the characteristics of the answers that solve the
original question. Finally, we use Google search to investigate the alignment
of usability and the natural language annotations around code snippets, and
explore how to make snippets in Stack Overflow an adequate base for future
automatic program generation.Comment: 13th IEEE/ACM International Conference on Mining Software
Repositories, 11 page