Progress in AI Planning Research and Applications
Planning has made significant progress since its inception in the 1970s, both in the efficiency and sophistication of its algorithms and representations and in its potential for application to real problems. In this paper we sketch the foundations of planning as a sub-field of Artificial Intelligence and the history of its development over the past three decades. We then discuss some of the recent achievements within the field and provide experimental data demonstrating the progress that has been made in the application of general planners to realistic and complex problems. The paper concludes by identifying some of the open issues that remain as important challenges for future research in planning.
Italian Crossword Generator: Enhancing Education through Interactive Word Puzzles
Educational crosswords offer numerous benefits for students, including
increased engagement, improved understanding, critical thinking, and memory
retention. Creating high-quality educational crosswords can be challenging, but
recent advances in natural language processing and machine learning have made
it possible to use language models to generate nice wordplays. The exploitation
of cutting-edge language models like GPT3-DaVinci, GPT3-Curie, GPT3-Babbage,
GPT3-Ada, and BERT-uncased has led to the development of a comprehensive system
for generating and verifying crossword clues. A large dataset of clue-answer
pairs was compiled to fine-tune the models in a supervised manner to generate
original and challenging clues from a given keyword. On the other hand, for
generating crossword clues from a given text, Zero/Few-shot learning techniques
were used to extract clues from the input text, adding variety and creativity
to the puzzles. We employed the fine-tuned model to generate data and labeled
the acceptability of clue-answer parts with human supervision. To ensure
quality, we developed a classifier by fine-tuning existing language models on
the labeled dataset. Conversely, to assess the quality of clues generated from
the given text using zero/few-shot learning, we employed a zero-shot learning
approach. The results of the evaluation
have been very promising, demonstrating the effectiveness of the approach in
creating high-standard educational crosswords that offer students engaging and
rewarding learning experiences.
Comment: Accepted Paper for CLiC-it 2023 - 9th Italian Conference on
Computational Linguistics
ArabIcros: AI-Powered Arabic Crossword Puzzle Generation for Educational Applications
This paper presents the first Arabic crossword puzzle generator driven by
advanced AI technology. Leveraging cutting-edge large language models including
GPT4, GPT3-Davinci, GPT3-Curie, GPT3-Babbage, GPT3-Ada, and BERT, the system
generates distinctive and challenging clues. Based on a dataset comprising over
50,000 clue-answer pairs, the generator employs fine-tuning, few/zero-shot
learning strategies, and rigorous quality-checking protocols to enforce the
generation of high-quality clue-answer pairs. Importantly, educational
crosswords contribute to enhancing memory, expanding vocabulary, and promoting
problem-solving skills, thereby augmenting the learning experience through a
fun and engaging approach, reshaping the landscape of traditional learning
methods. The overall system can be exploited as a powerful educational tool
that amalgamates AI and innovative learning techniques, heralding a
transformative era for Arabic crossword puzzles and the intersection of
technology and education.
Comment: Accepted Paper for ArabicNLP 2023 - The First Arabic Natural Language
Processing Conference - Co-located with EMNLP 2023 in Singapore
Language Models Can Teach Themselves to Program Better
Recent Language Models (LMs) achieve breakthrough performance in code
generation when trained on human-authored problems, even solving some
competitive-programming problems. Self-play has proven useful in games such as
Go, and thus it is natural to ask whether LMs can generate their own
instructive programming problems to improve their performance. We show that it
is possible for an LM to synthesize programming problems and solutions, which
are filtered for correctness by a Python interpreter. The LM's performance is
then seen to improve when it is fine-tuned on its own synthetic problems and
verified solutions; thus the model 'improves itself' using the Python
interpreter. Problems are specified formally as programming puzzles [Schuster
et al., 2021], a code-based problem format where solutions can easily be
verified for correctness by execution. In experiments on publicly-available
LMs, test accuracy more than doubles. This work demonstrates the potential for
code LMs, with an interpreter, to generate instructive problems and improve
their own performance.
Comment: 22 pages, 14 figures
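The execution-based filtering step described in this abstract can be illustrated with a minimal sketch. It assumes a puzzle is a Python function `f` that returns `True` for a valid answer and a candidate solution is a zero-argument function `g`; the helper names `is_verified` and `filter_verified` and the sample puzzle are hypothetical and are not taken from the paper's code.

```python
from typing import List, Tuple


def is_verified(puzzle_src: str, solution_src: str) -> bool:
    """Return True only if running the candidate solution satisfies the puzzle.

    Assumption: puzzle_src defines f(x) -> bool and solution_src defines g().
    Note: exec runs arbitrary code with no sandbox or timeout, so a real
    pipeline would need isolation; this sketch omits it.
    """
    env: dict = {}
    try:
        exec(puzzle_src, env)    # defines f
        exec(solution_src, env)  # defines g
        return env["f"](env["g"]()) is True
    except Exception:
        # Any syntax or runtime error counts as an unverified pair.
        return False


def filter_verified(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the (puzzle, solution) pairs that pass execution checking."""
    return [(p, s) for p, s in pairs if is_verified(p, s)]


# Hypothetical sample puzzle: find a negative integer whose square is 144.
puzzle = "def f(x: int) -> bool:\n    return x * x == 144 and x < 0\n"
candidates = [
    (puzzle, "def g():\n    return -12\n"),    # correct
    (puzzle, "def g():\n    return 12\n"),     # wrong answer
    (puzzle, "def g():\n    return 1 / 0\n"),  # raises at run time
]
verified = filter_verified(candidates)
print(len(verified))
```

Only pairs that survive this check would be kept as fine-tuning data in such a setup; the interpreter acts as the correctness filter, so no human labeling is needed for this step.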
App creation in schools for different curricula subjects - lesson learned
The next generation of jobs will be characterized by an increased demand for
people with computational and problem solving skills. In Austria, computer
science topics are underrepresented in school curricula; hence, teaching time for
these topics is limited. From primary through secondary school, only a few
opportunities exist for young students to explore programming. Furthermore,
today's teachers are rarely trained in computer science, which impairs their
potential to motivate students in these courses. Within the "No One Left
Behind" (NOLB) project, teachers were supported to guide and assist their
students in their learning processes by constructing ideas through game making.
Thus, students created games that referred to different subject areas by using
the programming tool Pocket Code, an app developed at Graz University of
Technology (TU-Graz). This tool helps students to take control of their own
education, becoming more engaged, interested, and empowered as a result. To
ensure an optimal integration of the app in diverse subjects the different
backgrounds (technical and non-technical) of teachers must be considered as
well. First, teachers were supported to use Pocket Code in the different
subjects in school within the feasibility study of the project. Observed
challenges and difficulties using the app have been gathered. Second, we
conducted interviews with teachers and students to underpin our onsite
observations. As a result, it was possible to validate Pocket Code's potential
to be used in a diverse range of subjects. Third, we focused especially on
those teachers who were not technically trained to provide them with a
framework for Pocket Code units, e.g., with the help of structured lesson plans
and predefined templates.
Comment: 10 pages, 5 tables EduLearn 201
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.
Comment: 29 pages, 5 figures, research proposal
Knowledge Based Systems: A Critical Survey of Major Concepts, Issues, and Techniques
This Working Paper Series entry presents a detailed survey of knowledge based systems. After being in a relatively dormant state for many years, only recently is Artificial Intelligence (AI) - that branch of computer science that attempts to have machines emulate intelligent behavior - accomplishing practical results. Most of these results can be attributed to the design and use of Knowledge-Based Systems, KBSs (or expert systems) - problem solving computer programs that can reach a level of performance comparable to that of a human expert in some specialized problem domain. These systems can act as consultants for tasks such as medical diagnosis, military threat analysis, and project risk assessment. They possess knowledge that enables them to make intelligent decisions. They are, however, not meant to replace the human specialists in any particular domain. A critical survey of recent work in interactive KBSs is reported. A case study (MYCIN) of a KBS, a list of existing KBSs, and an introduction to the Japanese Fifth Generation Computer Project are provided as appendices. Finally, an extensive set of KBS-related references is provided at the end of the report.
LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles
With the continuous evolution and refinement of LLMs, they are endowed with
impressive logical reasoning or vertical thinking capabilities. But can they
think out of the box? Do they possess proficient lateral thinking abilities?
Following the setup of Lateral Thinking Puzzles, we propose a novel evaluation
benchmark, LatEval, which assesses the model's lateral thinking within an
interactive framework. In our benchmark, we challenge LLMs on two aspects: the
quality of questions posed by the model and the model's capability to integrate
information for problem-solving. We find that nearly all LLMs struggle with
employing lateral thinking during interactions. For example, even the most
advanced model, GPT-4, shows an advantage to some extent, yet still
maintains a noticeable gap when compared to humans. This evaluation benchmark
provides LLMs with a highly challenging and distinctive task that is crucial to
an effective AI assistant.
Comment: Work in progress