Evaluation of basic mathematical abilities of neural networks

Abstract

In human cognition, when advanced mathematical abilities reach a certain level, basic numerical skills, such as number sense and elementary calculation, are typically well-developed. In this thesis we investigate whether state-of-the-art artificial neural network models exhibit a similar trend. Indeed, much research has pointed out that large-scale language models (such as ChatGPT) possess exceptional high-level mathematical abilities, but their elementary numeracy skills have often been overlooked. This dissertation focuses on the foundational mathematical abilities of GPT-3.5 (from which ChatGPT was developed), its newest version GPT-4 and six other multi-modal deep learning models. Taking into account the unique characteristics of different neural network models, standardized tests and self-developed tasks were employed to explore the mathematical abilities of these eight models. The findings indicate that GPT-3.5 and GPT-4 are indeed able to exhibit complex mathematical competencies, though basic numeracy skills are not always fully developed (especially in GPT-3.5). In contrast, the six multi-modal models still need to make progress in improving their numerosity perception and number sense to unlock more advanced mathematical abilities.
