Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation
Software bugs and failures cost trillions of dollars every year and can even lead to deadly accidents (e.g., the Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making necessary changes to the existing software code. Once an issue report (e.g., a bug report or change request) is assigned to a developer, she chooses a few important keywords from the report as a search query and then attempts to find the exact locations in the software code that need to be repaired or enhanced. As part of this maintenance, developers also often construct ad hoc queries on the fly and attempt to locate reusable code on the Internet that could assist them in bug fixing or feature implementation. Unfortunately, even experienced developers often fail to construct the right search queries. Even when developers come up with ad hoc queries, most of the queries require frequent modification, which costs significant development time and effort. Thus, constructing an appropriate query for localizing software bugs, programming concepts, or reusable code is a major challenge. In this thesis, we address this query construction challenge through six studies and develop a novel, effective code search solution (BugDoctor) that assists developers in localizing the software code of interest (e.g., bugs, concepts, and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures, which were previously overlooked, and (3) by exploiting the crowd knowledge and word semantics derived from the Stack Overflow Q&A site, which were previously untapped.
Our experiments using 5,000+ search queries (bug reports, change requests, and ad hoc queries) suggest that our proposed approach can significantly improve the given queries through automated query reformulation. A comparison with 10+ existing studies on bug localization, concept location, and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches by a significant margin.
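To make the term-weighting baseline mentioned above concrete, here is a minimal TF-IDF keyword selection sketch; the corpus, report text, and smoothed-IDF variant are illustrative assumptions, not the actual implementation of BugDoctor or CodeRank.

```python
import math
from collections import Counter

def tfidf_top_terms(report, corpus, k=5):
    """Rank the terms of one report by TF-IDF against a document corpus."""
    docs = [doc.lower().split() for doc in corpus]
    words = report.lower().split()
    tf = Counter(words)
    n = len(docs)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in docs if term in d)   # document frequency
        idf = math.log((n + 1) / (df + 1)) + 1   # smoothed inverse document frequency
        scores[term] = (count / len(words)) * idf
    return [t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]

corpus = [
    "null pointer exception when parsing config file",
    "ui button alignment broken on resize",
    "crash in parser when config file missing",
]
query = tfidf_top_terms("parser crash on malformed config file", corpus, k=3)
print(query)  # the corpus-rare term "malformed" ranks first
```

Terms that are frequent in the report but rare in the corpus score highest, which is exactly why TF-IDF serves as the customary baseline for keyword selection.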
Automatic query reformulation for code search using crowdsourced knowledge
Ministry of Education, Singapore under its Academic Research Funding Tier
Boosting API Recommendation with Implicit Feedback
Developers often need to use appropriate APIs to program efficiently, but identifying the exact API they need from a vast number of candidates is usually difficult. To ease this burden, a multitude of API recommendation approaches have been proposed. However, most currently available API recommenders do not support the effective integration of users' feedback into the recommendation loop. In this paper, we propose a framework, BRAID (Boosting RecommendAtion with Implicit FeeDback), which leverages learning-to-rank and active learning techniques to boost recommendation performance. By exploiting users' feedback information, we train a learning-to-rank model to re-rank the recommendation results. In addition, we speed up the feedback learning process with active learning. Existing query-based API recommendation approaches can be plugged into BRAID. We select three state-of-the-art API recommendation approaches as baselines to demonstrate the performance enhancement of BRAID, measured by Hit@k (Top-k), MAP, and MRR. Empirical experiments show that, with acceptable overheads, the recommendation performance improves steadily and substantially as the percentage of feedback data increases, compared with the baselines.
Comment: 15 pages, 4 figures
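The feedback loop described above can be sketched as a toy re-ranker that blends a base recommender's scores with accumulated click feedback. This is a hypothetical illustration with an assumed linear blending weight `alpha`, not BRAID's actual learning-to-rank model.

```python
from collections import defaultdict

class FeedbackReranker:
    """Toy re-ranker: blends a base recommender's scores with
    click feedback accumulated from earlier sessions."""

    def __init__(self, alpha=0.7):
        self.alpha = alpha            # weight on the base recommender's score
        self.clicks = defaultdict(int)

    def record_click(self, api):
        self.clicks[api] += 1         # implicit positive feedback

    def rerank(self, scored_apis):
        total = sum(self.clicks.values()) or 1
        def blended(item):
            api, base = item
            feedback = self.clicks[api] / total   # normalized click share
            return self.alpha * base + (1 - self.alpha) * feedback
        return sorted(scored_apis, key=blended, reverse=True)

rr = FeedbackReranker()
rr.record_click("java.nio.file.Files.readAllLines")   # user clicked this result earlier
ranked = rr.rerank([
    ("java.io.BufferedReader.readLine", 0.80),
    ("java.nio.file.Files.readAllLines", 0.75),
])
print(ranked[0][0])  # the clicked API overtakes the higher base score
```

Even this naive blend shows the key property the paper evaluates: as feedback accumulates, the ranking drifts toward what users actually select.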
A Systematic Review of Automated Query Reformulations in Source Code Search
Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within the software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trial and error during code search. Over the years, many studies have attempted to reformulate developers' ad hoc queries to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulation from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, the vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss the best practices and future opportunities to advance the state of research in search query reformulation.
Comment: 81 pages, accepted at TOSEM
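As a concrete illustration of one of the surveyed methodologies, term co-occurrence analysis, the sketch below expands a query term with its most frequent co-occurring corpus terms. The corpus and expansion size `k` are illustrative assumptions, not the behavior of any specific surveyed tool.

```python
from collections import Counter
from itertools import combinations

def expand_query(query, corpus, k=2):
    """Expand each query term with its most frequent co-occurring
    corpus terms (a simple term co-occurrence reformulation)."""
    cooc = Counter()
    for doc in corpus:
        terms = set(doc.lower().split())
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1          # count both directions so lookup
            cooc[(b, a)] += 1          # by the first element is easy
    expanded = list(query)
    for term in query:
        partners = Counter({b: c for (a, b), c in cooc.items() if a == term})
        expanded += [w for w, _ in partners.most_common(k) if w not in expanded]
    return expanded

corpus = [
    "parse json payload",
    "parse json config parser",
    "render html template",
]
expanded = expand_query(["parse"], corpus, k=2)
print(expanded)  # "parse" is joined by terms it co-occurs with
```

The expanded query now contains vocabulary the developer may not have typed, which is the standard remedy these studies apply to the vocabulary mismatch problem.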
Towards Democratizing Data Science with Natural Language Interfaces
Data science has the potential to reshape many sectors of modern society. This potential can be realized to its maximum only when data science becomes democratized, instead of centralized in a small group of expert data scientists. However, with data becoming more massive and heterogeneous, standing in stark contrast to the spreading demand for data science is the growing gap between human users and data: every type of data requires extensive specialized training, either to learn a specific query language or to learn a data analytics software package. Towards the democratization of data science, in this dissertation we systematically investigate a promising research direction, the natural language interface, to bridge the gap between users and data and make it easier for users who are less technically proficient to access the data analytics power needed for on-demand problem solving and decision making. One of the largest obstacles for general users to access data is the proficiency requirement on formal languages (e.g., SQL) that machines use. By automatically parsing natural language commands from users into formal languages, natural language interfaces can thus play a critical role in democratizing data science. However, a pressing question that is largely left unanswered so far is: how do we bootstrap a natural language interface for a new domain? The high cost of data collection and the data-hungry nature of mainstream neural network models significantly limit the wide application of natural language interfaces. The main technical contribution of this dissertation is a systematic framework for bootstrapping natural language interfaces for new domains. Specifically, the proposed framework consists of three complementary methods: (1) collecting data at a low cost via crowdsourcing, (2) leveraging existing NLI data from other domains via transfer learning, and (3) letting a bootstrapped model interact with real users so that it can refine itself over time.
Combining the three methods forms a closed data loop for bootstrapping and refining natural language interfaces for any domain. The methodologies and frameworks developed in this dissertation hence pave the path for building data science platforms that everyone can use to process, query, and analyze their data without extensive specialized training. With such AI-powered platforms, users can stay focused on high-level thinking and decision making, instead of being overwhelmed by low-level implementation and programming details: "Let machines understand human thinking. Don't let humans think like machines."
Big Code Search: A Bibliography
Code search is an essential task in software development. Developers often search the Internet and other code databases for necessary source code snippets to ease their development effort. Code search techniques also help in learning programming, as novice programmers or students can quickly retrieve (hopefully good) examples already used in actual software projects. Given the recurrence of code search activity in software development, there is increasing interest in it within the research community. To improve the code search experience, the research community has suggested many code search tools and techniques. These tools and techniques leverage several different ideas and claim better code search performance. However, it is still challenging to form a comprehensive view of the field, considering that existing studies generally explore narrow and limited subsets of the components used. This study aims to devise a grounded approach to understanding the procedure for code search and to build an operational taxonomy capturing the critical facets of code search techniques. Additionally, we investigate the evaluation methods, benchmarks, and datasets used in the field of code search.
Attitudes, behaviors, and learning outcomes from using classtranscribe, a UDL-featured video-based online learning platform with learnersourced text-searchable captions
This thesis consisted of a series of three studies on students' attitudes, behaviors, and learning outcomes from using ClassTranscribe, a Universal Design for Learning (UDL) featured video-based online learning platform. ClassTranscribe provided accurate accessible transcriptions and captioning plus a custom text-searchable interface to rapidly find relevant video moments from the entire course. Users could edit the machine-generated captions in a crowdsourcing way. The system logged student viewing, searching, and editing behaviors as fine-grained web browser interaction events including full-screen-switching, loss-of-focus, caption searching and editing events, and continued-video-watching events with the latter at 15-second granularity.
In Study I, lecture material for a large-enrollment (N=271) sophomore systems programming class, running 15 weeks in Spring 2019, was delivered solely online using a new video-based web platform, ClassTranscribe. Student learning behaviors and findings from four research questions were presented using individual-level performance and interaction data. First, we reported on learning outcomes from the alternative learning paths that arose from the course's application of Universal Design for Learning principles. Second, final exam performance was equal to or better than that of prior semesters that utilized traditional in-person live lectures. Third, the learning outcomes of low- and high-performing students were analyzed independently by grouping students into four quartiles based on their non-final-exam course performance on programming assignments and quizzes. We introduced and justified an empirically defined qualification threshold of sufficient video minutes viewed for each group. In all quartiles, students who watched an above-threshold number of video minutes improved their in-group final exam performance (ranging from +6% to +14%), with the largest gain for the lowest-performing quartile. The improvement was similar in magnitude for all groups when expressed as a fraction of unrewarded final exam points. Finally, we found that using ClassTranscribe's caption-based video search significantly predicted improvement in final exam scores. Overall, the study presented and evaluated how learner use of online video via ClassTranscribe predicted course performance and positive learning outcomes.
In Study II, we further explored learners' search behavior, which was shown to be correlated with improved final exam scores in the first study. From Fall 2019 to Summer 2020, engineering students used ClassTranscribe in engineering courses to view course videos and search for video content. The tool collected detailed timestamped behavioral data from 1,894 students across 25 engineering courses, including what individual students searched for and when. As the first study showed that using ClassTranscribe caption search significantly predicted improvement in final exam scores in a computer science course, in this study we presented how students used the search functionality based on a more detailed analysis of the log data. The search functionality of ClassTranscribe used the timestamped caption data to find specific video moments both within the current video and across the entire course. The number of search activities per person ranged from zero to 186 events. An in-depth analysis of the students (N=167) who performed 1,022 searches was conducted to gain insight into student search needs and behaviors. Based on the total number of searches performed, students were grouped into “Infrequent Searchers” (< 18 searches) and “Frequent Searchers” (18 to 110 searches) using clustering algorithms. The search queries used by each group were found to follow Zipf’s law and were categorized into STEM-related terms, course logistics, and others. Our study reported on students’ search contexts, behaviors, strategies, and optimizations. Using Universal Design for Learning as a foundation, we discussed the implications for educators, designers, and developers who are interested in providing new learning pathways to support and enhance video-based learning environments.
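A rank-frequency check of the kind underlying the Zipf's-law observation above can be sketched as follows; the query log here is invented for illustration and does not reproduce the study's data.

```python
from collections import Counter

def rank_frequencies(queries):
    """Rank-frequency table for a search log; under Zipf's law the
    frequency of the r-th ranked query is roughly f(1)/r."""
    counts = Counter(queries)
    return [freq for _, freq in counts.most_common()]

# A fabricated log: a few queries dominate, with a long tail.
log = (["malloc"] * 12 + ["segfault"] * 6 + ["pointer"] * 4 +
       ["linker"] * 3 + ["makefile"] * 2)
freqs = rank_frequencies(log)
print(freqs)  # [12, 6, 4, 3, 2] -- close to 12/r for ranks r = 1..5
```

Plotting such frequencies against rank on log-log axes and checking for a slope near -1 is the usual way a Zipfian fit is verified.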
In Study III, we investigated students' attitudes towards learnersourced captioning for lecture videos. We deployed ClassTranscribe in a large (N=387) text retrieval and mining course where 58 learners participated in editing the captions of 89 lecture videos, and each lecture video was edited by two editors sequentially. In the following semester, 18 editors participated in follow-up interviews to discuss their experience of using and editing captions in the class. Our study showed how students use captions to learn, and shed light on students' attitudes, motivations, and strategies in collaborating with other learners to fix captions in a learnersourced way.
A Systematic and Minimalist Approach to Lower Barriers in Visual Data Exploration
With the increasing availability and impact of data in our lives, we need to make quicker, more accurate, and intricate data-driven decisions. We can see and interact with data, and identify relevant features, trends, and outliers through visual data representations. In addition, the outcomes of data analysis reflect our cognitive processes, which are strongly influenced by the design of tools. To support visual and interactive data exploration, this thesis presents a systematic and minimalist approach.
First, I present the Cognitive Exploration Framework, which identifies six distinct cognitive stages and provides a high-level structure for the design, guidelines, and evaluation of analysis tools. Next, to reduce decision-making complexities in creating effective interactive data visualizations, I present a minimal yet expressive model for tabular data using aggregated data summaries and linked selections. I demonstrate its application to common categorical, numerical, temporal, spatial, and set data types. Based on this model, I developed Keshif as an out-of-the-box, web-based tool to bootstrap the data exploration process. I then applied it to 160+ datasets across many domains, aiming to serve journalists, researchers, policy makers, businesses, and those tracking personal data.
Using tools with novel designs and capabilities requires learning and help-seeking by both novices and experts. To provide self-service help for visual data interfaces, I present a data-driven, contextual, in-situ help system, HelpIn, which contrasts with separate and static videos and manuals. Lastly, I present an evaluation of design and graphical perception for dense visualization of sorted numeric data. I contrast non-hierarchical treemaps with two multi-column chart designs, wrapped bars and piled bars. The results support that multi-column charts are perceptually more accurate than treemaps, and that the unconventional piled bars may require more training to read effectively.
This thesis contributes to our understanding of how to create effective data interfaces by systematically focusing on human-facing challenges through minimalist solutions. Future work to extend the power of data analysis to a broader public should continue to evaluate and improve design approaches to address the many remaining cognitive, social, educational, and technical challenges.
Perspectives on Large Language Models for Relevance Judgment
When asked, current large language models (LLMs) like ChatGPT claim that they
can assist us with relevance judgments. Many researchers think this would not
lead to credible IR research. In this perspective paper, we discuss possible
ways for LLMs to assist human experts along with concerns and issues that
arise. We devise a human-machine collaboration spectrum that allows
categorizing different relevance judgment strategies, based on how much the
human relies on the machine. For the extreme point of "fully automated
assessment", we further include a pilot experiment on whether LLM-based
relevance judgments correlate with judgments from trained human assessors. We
conclude the paper by providing two opposing perspectives - for and against the
use of LLMs for automatic relevance judgments - and a compromise perspective,
informed by our analyses of the literature, our preliminary experimental
evidence, and our experience as IR researchers.
We hope to start a constructive discussion within the community to avoid a stalemate during review, where work is damned if it uses LLMs for evaluation and damned if it doesn't.
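One simple way to quantify how well LLM judgments correlate with trained assessors, as in the pilot experiment above, is a chance-corrected agreement statistic such as Cohen's kappa. The labels below are invented for illustration, and the paper itself may rely on different correlation measures.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary relevance label lists."""
    assert len(a) == len(b) and a
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n     # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n              # each rater's P(relevant)
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)         # agreement expected by chance
    return (po - pe) / (1 - pe)

# Hypothetical relevance labels (1 = relevant) from a human assessor and an LLM.
human = [1, 1, 0, 0, 1, 0, 1, 0]
llm   = [1, 1, 0, 0, 1, 0, 0, 0]
print(round(cohens_kappa(human, llm), 3))  # 0.75: substantial agreement
```

Kappa near 1 indicates the "fully automated assessment" end of the spectrum might be credible for that task; kappa near 0 means the LLM agrees with humans no more than chance would predict.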