Is Model Attention Aligned with Human Attention? An Empirical Study on Large Language Models for Code Generation
Large Language Models (LLMs) have been shown to be effective for code
generation. Due to the complexity and opacity of LLMs, little is known about
how these models generate code. To deepen our understanding, we investigate
whether LLMs attend to the same parts of a natural language description as
human programmers during code generation. An analysis of five LLMs on a popular
benchmark, HumanEval, revealed a consistent misalignment between LLMs' and
programmers' attention. Furthermore, we found that there is no correlation
between the code generation accuracy of LLMs and their alignment with human
programmers. Through a quantitative experiment and a user study, we confirmed
that, among twelve different attention computation methods, attention computed
by the perturbation-based method is most aligned with human attention and is
consistently favored by human programmers. Our findings highlight the need for
human-aligned LLMs for better interpretability and programmer trust.

Comment: 13 pages, 8 figures, 7 tables
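The abstract does not spell out how perturbation-based attention is computed. As a rough illustration of the general occlusion idea, the sketch below masks each word of the prompt in turn and measures how much a model score drops; the `score_fn` callable and the `[MASK]` token are assumptions for illustration, not the paper's actual implementation.

```python
from typing import Callable, List, Tuple


def perturbation_attention(
    prompt: str,
    score_fn: Callable[[str], float],
    mask_token: str = "[MASK]",
) -> List[Tuple[str, float]]:
    """Occlusion-style importance: mask each prompt word and record the score drop.

    `score_fn` is a hypothetical callable, e.g. the model's log-likelihood of its
    generated code given the (possibly perturbed) prompt.
    """
    words = prompt.split()
    baseline = score_fn(prompt)  # score with the unmodified prompt
    importances = []
    for i, word in enumerate(words):
        perturbed = words[:i] + [mask_token] + words[i + 1:]
        drop = baseline - score_fn(" ".join(perturbed))
        importances.append((word, drop))  # larger drop => word mattered more
    return importances
```

In this reading, the words whose removal most degrades the model's score are taken as the parts of the natural language description the model "attends" to, which can then be compared against the words human programmers mark as important.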