Human-Centric Tools for Navigating Code
All software failures are fundamentally the fault of humans: the software's design was flawed. The high cost of such failures ultimately requires developers to design, implement, and test fixes, all of which take considerable time and effort and may introduce further failures. As developers work on software maintenance tasks, they must navigate enormous codebases that may comprise millions of lines of code organized across thousands of modules. However, navigating code presents a plethora of problems for developers. In the hopes of addressing these navigation barriers, modern code editors and development environments provide a variety of features to aid navigation; however, they are not without their limitations. Code navigation takes many forms, and in this work I focus on three key types of code navigation in modern software development: navigating the working set, navigating among versions of code, and navigating the code structure. To address the challenges of navigating code, I designed three novel software development tools, one to enhance each type of navigation. First, I designed and implemented Patchworks, a code editor interface to support developers in navigating the working set. Patchworks aims to make these navigations more efficient by providing a fixed grid of open code fragments that developers can quickly navigate. Second, I designed and implemented Yestercode, a code editor extension to support navigating among versions of code. Yestercode does so by providing a comparison view of the current code and a previous version of the same code. Third, I designed and implemented Wandercode, a code editor extension to enable developers to efficiently navigate the structure of their code. Wandercode aims to do so by providing a visualization of the code's call graph overlaid on the code editor.
My approach to designing these tools for more efficient code navigation was a human-centric one; that is, it was based on the needs of actual developers performing real software development tasks. Through user study evaluations, I found that these tools significantly improved developer productivity by reducing developers' time spent navigating and their mental effort during software maintenance tasks.
How Novices Use LLM-Based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment
As Large Language Models (LLMs) gain in popularity, it is important to
understand how novice programmers use them. We present a thematic analysis of
33 learners, aged 10-17, independently learning Python through 45
code-authoring tasks using Codex, an LLM-based code generator. We explore
several questions related to how learners used these code generators and
provide an analysis of the properties of the written prompts and the generated
code. Specifically, we explore (A) the context in which learners use Codex, (B)
what learners are asking from Codex, (C) properties of their prompts in terms
of relation to task description, language, and clarity, and prompt crafting
patterns, (D) the correctness, complexity, and accuracy of the AI-generated
code, and (E) how learners utilize AI-generated code in terms of placement,
verification, and manual modifications. Furthermore, our analysis reveals four
distinct coding approaches when writing code with an AI code generator: AI
Single Prompt, where learners prompted Codex once to generate the entire
solution to a task; AI Step-by-Step, where learners divided the problem into
parts and used Codex to generate each part; Hybrid, where learners wrote some
of the code themselves and used Codex to generate others; and Manual coding,
where learners wrote the code themselves. The AI Single Prompt approach
resulted in the highest correctness scores on code-authoring tasks, but the
lowest correctness scores on subsequent code-modification tasks during
training. Our results provide initial insight into how novice learners use AI
code generators and the challenges and opportunities associated with
integrating them into self-paced learning environments. We conclude with
various signs of over-reliance and self-regulation, as well as opportunities
for curriculum and tool development.
Comment: 12 pages, Peer-Reviewed, Accepted for publication in the proceedings of the 2023 ACM Koli Calling International Conference on Computing Education Research
Conversational Challenges in AI-Powered Data Science: Obstacles, Needs, and Design Opportunities
Large Language Models (LLMs) are being increasingly employed in data science
for tasks like data preprocessing and analytics. However, data scientists
encounter substantial obstacles when conversing with LLM-powered chatbots and
acting on their suggestions and answers. We conducted a mixed-methods study,
including contextual observations, semi-structured interviews (n=14), and a
survey (n=114), to identify these challenges. Our findings highlight key issues
faced by data scientists, including contextual data retrieval, formulating
prompts for complex tasks, adapting generated code to local environments, and
refining prompts iteratively. Based on these insights, we propose actionable
design recommendations, such as data brushing to support context selection, and
inquisitive feedback loops to improve communications with AI-based assistants
in data-science tools.
Comment: 24 pages, 8 figures
CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs
Timely, personalized feedback is essential for students learning programming.
LLM-powered tools like ChatGPT offer instant support, but reveal direct answers
with code, which may hinder deep conceptual engagement. We developed CodeAid,
an LLM-powered programming assistant delivering helpful, technically correct
responses, without revealing code solutions. CodeAid answers conceptual
questions, generates pseudo-code with line-by-line explanations, and annotates
student's incorrect code with fix suggestions. We deployed CodeAid in a
programming class of 700 students for a 12-week semester. A thematic analysis
of 8,000 usages of CodeAid was performed, further enriched by weekly surveys,
and 22 student interviews. We then interviewed eight programming educators to
gain further insights. Our findings reveal four design considerations for
future educational AI assistants: D1) exploiting AI's unique benefits; D2)
simplifying query formulation while promoting cognitive engagement; D3)
avoiding direct responses while encouraging motivated learning; and D4)
maintaining transparency and control for students to assess and steer AI responses.
Comment: CHI 2024 Paper - The paper includes 17 pages, 8 figures, 2 tables, along with a 2-page appendix
Solar Fusion Cross Sections
We review and analyze the available information for nuclear fusion cross
sections that are most important for solar energy generation and solar neutrino
production. We provide best values for the low-energy cross-section factors
and, wherever possible, estimates of the uncertainties. We also describe the
most important experiments and calculations that are required in order to
improve our knowledge of solar fusion rates.Comment: LaTeX file, 48 pages (figures not included). To appear in Rev. Mod.
Phys., 10/98. All authors now listed. Full postscript version with figures
available at http://www.sns.ias.edu/~jnb/Papers/Preprints/nuclearfusion.htm
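The "low-energy cross-section factors" the abstract refers to follow the standard astrophysical S-factor convention used throughout the nuclear astrophysics literature; as a sketch of that convention (the notation below is the conventional one, not taken from this abstract):

$$\sigma(E) \;=\; \frac{S(E)}{E}\,\exp\bigl(-2\pi\eta(E)\bigr), \qquad \eta(E) \;=\; \frac{Z_1 Z_2\, e^2}{\hbar v},$$

where $\eta$ is the Sommerfeld parameter for nuclei of charges $Z_1$ and $Z_2$ at relative velocity $v$. The exponential Coulomb-penetration factor absorbs the steep energy dependence of the cross section, so the S-factor $S(E)$ varies slowly with energy and can be extrapolated from laboratory energies down to solar energies.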
Interface Fluctuations on a Hierarchical Lattice
We consider interface fluctuations on a two-dimensional layered lattice where
the couplings follow a hierarchical sequence. This problem is equivalent to the
diffusion process of a quantum particle in the presence of a one-dimensional
hierarchical potential. According to a modified Harris criterion this type of
perturbation is relevant and one expects anomalous fluctuating behavior. By
transfer-matrix techniques and by an exact renormalization group transformation
we have obtained analytical results for the interface fluctuation exponents,
which are discontinuous at the homogeneous lattice limit.
Comment: 14 pages plain TeX, one figure upon request, Phys Rev E (in print)
Adjusting for unmeasured confounding in nonrandomized longitudinal studies: a methodological review
OBJECTIVE: Motivated by recent calls to use electronic health records for research, we reviewed the application and development of methods for addressing the bias from unmeasured confounding in longitudinal data. DESIGN: Methodological review of existing literature. SETTING: We searched MEDLINE and EMBASE for articles addressing the threat to causal inference from unmeasured confounding in nonrandomised longitudinal health data through quasi-experimental analysis. RESULTS: Among the 121 studies included for review, 84 used instrumental variable analysis (IVA), of which 36 used lagged or historical instruments. Difference-in-differences (DiD) and fixed effects (FE) models were found in 29 studies. Five of these combined IVA with DiD or FE to try to mitigate time-dependent confounding. Other less frequently used methods included prior event rate ratio adjustment, regression discontinuity nested within pre-post studies, propensity score calibration, perturbation analysis and negative control outcomes. CONCLUSIONS: Well-established econometric methods such as DiD and IVA are commonly used to address unmeasured confounding in non-randomised, longitudinal studies, but researchers often fail to take full advantage of available longitudinal information. A range of promising new methods have been developed, but further studies are needed to understand their relative performance in different contexts before they can be recommended for widespread use.
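The difference-in-differences (DiD) approach mentioned in the review can be sketched in a few lines. This is a minimal two-period, two-group illustration with made-up numbers, not an analysis from the review itself:

```python
# Minimal sketch of the two-period, two-group difference-in-differences
# (DiD) estimator: the treatment effect is the change in mean outcome for
# the treated group minus the change for the control group. All data
# below are hypothetical, for illustration only.

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Estimate a treatment effect from group outcome lists via DiD."""
    mean = lambda xs: sum(xs) / len(xs)
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # Subtracting the control group's trend removes unmeasured confounding
    # that affects both groups equally over time (parallel-trends assumption).
    return treated_change - control_change

# Hypothetical outcome measurements before and after an intervention
effect = did_estimate(
    treated_pre=[140, 150, 145],
    treated_post=[130, 138, 133],
    control_pre=[142, 148, 146],
    control_post=[140, 147, 144],
)
print(round(effect, 1))  # -9.7
```

The identifying assumption doing the work here is parallel trends: absent treatment, both groups would have followed the same time trend, which is exactly what makes DiD robust to time-invariant unmeasured confounders but not to time-varying ones, the gap the review notes some studies tried to close by combining DiD with IVA.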
- …