Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and
Problem Solving: Evidence from the Vietnamese National High School Graduation
Examination
This study offers a comprehensive analysis of ChatGPT's mathematical abilities in
answering multiple-choice questions from the Vietnamese National High School
Graduation Examination (VNHSGE) across a range of topics and difficulty levels.
The dataset comprised 250 questions divided into four levels: knowledge (K),
comprehension (C), application (A), and high application (H), and spanned
ten themes covering diverse mathematical concepts. The results show
that ChatGPT's performance varies with difficulty level and
topic. It performed best on Level K questions, with an accuracy rate of
83%, but its accuracy fell as the difficulty rose, dropping as low as
10%. The study also shows that ChatGPT performs well on
questions about exponential and
logarithmic functions, geometric progression, and arithmetic progression. The
study found that ChatGPT had difficulty correctly answering questions on topics
including derivatives and applications, spatial geometry, and Oxyz spatial
calculus. Additionally, this study compared ChatGPT's results with those of
Vietnamese students on the VNHSGE and with its results on other math exams.
ChatGPT performed best on SAT Math, with a success rate of 70%, followed by
VNHSGE mathematics (58.8%). However, its success rates were lower on other
exams, such as AP Statistics, GRE Quantitative, AMC 10, AMC 12, and AP
Calculus BC. These
results suggest that ChatGPT has the potential to be an effective teaching tool
for mathematics, but more work is needed to improve its handling of graphical
data and to address the challenges posed by increasingly difficult
questions.
Comment: 17 pages, 14 images