Explaining Math Word Problem Solvers
Automated math word problem solvers based on neural networks have
successfully managed to obtain 70-80% accuracy in solving arithmetic word
problems. However, it has been shown that these solvers may rely on superficial
patterns to obtain their equations. In order to determine what information math
word problem solvers use to generate solutions, we remove parts of the input
and measure the model's performance on the perturbed dataset. Our results show
that the model is not sensitive to the removal of many words from the input and
can still manage to find a correct answer when given a nonsense question. This
indicates that automatic solvers do not follow the semantic logic of math word
problems, and may be overfitting to the presence of specific words.
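The perturbation procedure described in this abstract (remove parts of the input, then re-measure accuracy) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: `drop_non_numeric_words`, `accuracy`, `toy_solver`, and the two-item `DATASET` are all hypothetical names and data invented here. The toy solver deliberately relies on a superficial pattern (it adds every integer it sees), so its accuracy is unaffected when most of the surrounding words are deleted.

```python
import random
from typing import Callable, List, Optional, Tuple


def drop_non_numeric_words(question: str, fraction: float, rng: random.Random) -> str:
    """Remove about `fraction` of the non-numeric tokens, keeping numbers and word order."""
    tokens = question.split()
    candidates = [i for i, t in enumerate(tokens) if not t.isdigit()]
    n_drop = int(len(candidates) * fraction)
    dropped = set(rng.sample(candidates, n_drop))
    return " ".join(t for i, t in enumerate(tokens) if i not in dropped)


def accuracy(
    solver: Callable[[str], int],
    dataset: List[Tuple[str, int]],
    perturb: Optional[Callable[[str], str]] = None,
) -> float:
    """Fraction of problems answered correctly, optionally after perturbing each question."""
    correct = 0
    for question, answer in dataset:
        text = perturb(question) if perturb is not None else question
        if solver(text) == answer:
            correct += 1
    return correct / len(dataset)


def toy_solver(question: str) -> int:
    """Hypothetical stand-in for a neural solver: sums all integers, ignores every other word."""
    return sum(int(t) for t in question.split() if t.isdigit())


# Hypothetical two-item dataset, invented for illustration only.
DATASET = [
    ("Tom has 3 apples and buys 4 more . How many apples does he have ?", 7),
    ("A shelf holds 12 books and 5 more are added . How many books are there ?", 17),
]

if __name__ == "__main__":
    rng = random.Random(0)
    baseline = accuracy(toy_solver, DATASET)
    perturbed = accuracy(
        toy_solver, DATASET, lambda q: drop_non_numeric_words(q, 0.8, rng)
    )
    print(f"original accuracy:  {baseline:.2f}")
    print(f"perturbed accuracy: {perturbed:.2f}")
```

A solver that actually followed the semantics of the question would be expected to lose accuracy as more words are removed; a flat curve like the one this toy solver produces is the kind of insensitivity the abstract reports.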
A Survey of Deep Learning for Mathematical Reasoning
Mathematical reasoning is a fundamental aspect of human intelligence and is
applicable in various fields, including science, engineering, finance, and
everyday life. The development of artificial intelligence (AI) systems capable
of solving math problems and proving theorems has garnered significant interest
in the fields of machine learning and natural language processing. For example,
mathematics serves as a testbed for aspects of reasoning that are challenging
for powerful deep learning models, driving new algorithmic and modeling
advances. On the other hand, recent advances in large-scale neural language
models have opened up new benchmarks and opportunities to use deep learning for
mathematical reasoning. In this survey paper, we review the key tasks,
datasets, and methods at the intersection of mathematical reasoning and deep
learning over the past decade. We also evaluate existing benchmarks and
methods, and discuss future research directions in this domain.
Comment: Accepted to ACL 2023. The repository is available at
https://github.com/lupantech/dl4mat
Mathematics, word problems, common sense, and artificial intelligence
The paper discusses the capacities and limitations of current artificial
intelligence (AI) technology to solve word problems that combine elementary
knowledge with commonsense reasoning. No existing AI systems can solve these
reliably. We review three approaches that have been developed, using AI natural
language technology: outputting the answer directly, outputting a computer
program that solves the problem, and outputting a formalized representation
that can be input to an automated theorem verifier. We review some benchmarks
that have been developed to evaluate these systems and some experimental
studies. We discuss the limitations of the existing technology at solving these
kinds of problems. We argue that it is not clear whether these kinds of
limitations will be important in developing AI technology for pure mathematical
research, but that they will be important in applications of mathematics, and
may well be important in developing programs capable of reading and
understanding mathematical content written by humans.
Exploring Culturally Responsive Equitable Problem-Solving Pedagogy: Theorizing, Developing & Teaching
Achievement gaps in mathematics between Black middle and high school students and their white peers exist in part because of access, but also because Black learners’ brilliance is not recognized. Finding ways to help students, especially Black students, become successful mathematical problem solvers was a driving force behind this research. The purpose of this research is to explore ideas of how to improve Black students’ opportunities to engage in effective mathematical problem solving to improve their mathematics understanding and achievement. This study introduces the Culturally Responsive Equitable Problem Solving (CREPS) pedagogy situated at the intersections of a conceptual framework comprised of three pedagogies - Gay’s (2002) Culturally Responsive Pedagogy, Aguirre, Mayfield-Ingram, and Martin’s (2013) Equity-Based Mathematics Practices, and Schroeder and Lester’s (1989) Teaching through Problem-Solving.
This dissertation study is guided by several research questions and reported through three separate, but related essays: (a) Theorizing a Culturally Responsive Equitable Problem-Solving Pedagogy; (b) Developing Culturally Responsive Equitable Problem-Solving Pedagogical Knowledge; and (c) Cases of Exemplary Instances of Teaching Culturally Responsive Equitable Problem-Solving Pedagogy. These three essays introduce and explain the tenets of CREPS pedagogy, examine secondary mathematics teachers’ culturally responsive teaching readiness and their professional learning of CREPS through CREPS pedagogy, and identify instances of exemplary CREPS pedagogical teaching during lesson study (i.e., collaborative planning and iterative teaching of a CREPS lesson), respectively. There were several findings from this research. The most salient findings were the three CREPS pedagogical moves: (a) development of deep mathematics understanding; (b) acknowledgement of students’ backgrounds; and (c) employment of equitable pedagogical practices. Several ideas for future research are shared related to refining and testing the CREPS pedagogy, developing teachers’ CREPS pedagogical knowledge, and teaching experiments for enacting CREPS pedagogy.
Lemur: Harmonizing Natural Language and Code for Language Agents
We introduce Lemur and Lemur-Chat, openly accessible language models
optimized for both natural language and coding capabilities to serve as the
backbone of versatile language agents. The evolution from language chat models
to functional language agents demands that models not only master human
interaction, reasoning, and planning but also ensure grounding in the relevant
environments. This calls for a harmonious blend of language and coding
capabilities in the models. Lemur and Lemur-Chat are proposed to address this
necessity, demonstrating balanced proficiencies in both domains, unlike
existing open-source models that tend to specialize in either. Through
meticulous pre-training using a code-intensive corpus and instruction
fine-tuning on text and code data, our models achieve state-of-the-art averaged
performance across diverse text and coding benchmarks among open-source models.
Comprehensive experiments demonstrate Lemur's superiority over existing
open-source models and its proficiency across various agent tasks involving
human communication, tool usage, and interaction under fully- and partially-
observable environments. The harmonization between natural and programming
languages enables Lemur-Chat to significantly narrow the gap with proprietary
models on agent abilities, providing key insights into developing advanced
open-source agents adept at reasoning, planning, and operating seamlessly
across environments. https://github.com/OpenLemur/Lemu
A Survey of Large Language Models
Language is essentially a complex, intricate system of human expressions
governed by grammatical rules. It poses a significant challenge to develop
capable AI algorithms for comprehending and grasping a language. As a major
approach, language modeling has been widely studied for language understanding
and generation in the past two decades, evolving from statistical language
models to neural language models. Recently, pre-trained language models (PLMs)
have been proposed by pre-training Transformer models over large-scale corpora,
showing strong capabilities in solving various NLP tasks. Since researchers
have found that model scaling can lead to performance improvement, they further
study the scaling effect by increasing the model size to an even larger size.
Interestingly, when the parameter scale exceeds a certain level, these enlarged
language models not only achieve a significant performance improvement but also
show some special abilities that are not present in small-scale language
models. To distinguish models by parameter scale, the research
community has coined the term large language models (LLM) for the PLMs of
significant size. Recently, the research on LLMs has been largely advanced by
both academia and industry, and one remarkable advance is the launch of ChatGPT,
which has attracted widespread attention from society. The technical evolution
of LLMs has been making an important impact on the entire AI community, which
would revolutionize the way we develop and use AI algorithms. In this
survey, we review the recent advances of LLMs by introducing the background,
key findings, and mainstream techniques. In particular, we focus on four major
aspects of LLMs, namely pre-training, adaptation tuning, utilization, and
capacity evaluation. Besides, we also summarize the available resources for
developing LLMs and discuss the remaining issues for future directions.
Comment: ongoing work; 51 pages
Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models
Creating learning models that can exhibit sophisticated reasoning skills is
one of the greatest challenges in deep learning research, and mathematics is
rapidly becoming one of the target domains for assessing scientific progress in
this direction. In the past few years there has been an explosion of neural
network architectures, data sets, and benchmarks specifically designed to
tackle mathematical problems, reporting notable success in disparate fields
such as automated theorem proving, numerical integration, and discovery of new
conjectures or matrix multiplication algorithms. However, despite these
impressive achievements it is still unclear whether deep learning models
possess an elementary understanding of quantities and symbolic numbers. In this
survey we critically examine the recent literature, concluding that even
state-of-the-art architectures often fall short when probed with relatively
simple tasks designed to test basic numerical and arithmetic knowledge.