EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning
Large language models primarily rely on in-context learning to execute tasks.
We introduce EchoPrompt, a simple yet effective approach to prompt the model to
rephrase its queries before answering them. EchoPrompt is inspired by
self-questioning, a cognitive strategy humans use to vocalize queries before
providing answers, thereby reducing misconceptions. Experimental results
demonstrate that EchoPrompt leads to substantial improvements in both zero-shot
and few-shot in-context learning with standard and chain-of-thought prompting
on four families of causal language models. These improvements are observed
across various numerical reasoning (GSM8K, SVAMP, MultiArith, SingleOp),
reading comprehension (DROP, SQuAD), and logical reasoning (Shuffled Objects,
Date Understanding, Coin Flipping) tasks. On average, EchoPrompt improves the
Zero-shot-CoT performance of code-davinci-002 by 5% in numerical tasks and 13%
in reading comprehension tasks. We investigate the effectiveness of EchoPrompt
through ablation studies, which reveal the significance of both original and
rephrased queries for EchoPrompt's efficacy. Our empirical results show that
EchoPrompt is an effective technique that can easily augment in-context
learning for better performance.
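To illustrate the idea, the sketch below builds a zero-shot EchoPrompt-style prompt that instructs the model to restate the query before reasoning. The trigger phrase, the build_echoprompt helper, and the generic complete callable are illustrative assumptions, not the paper's verbatim setup.

def build_echoprompt(query):
    """Ask the model to first restate the query, then reason step by step."""
    return (
        "Q: " + query + "\n"
        "A: Let's repeat the question and also think step by step.\n"
    )

def answer(query, complete):
    """`complete` is any text-completion callable, e.g. a thin wrapper around an LLM API."""
    return complete(build_echoprompt(query))

if __name__ == "__main__":
    # Stand-in completion function so the sketch runs without an API key.
    demo_complete = lambda prompt: prompt + "The question asks for average speed: 60 / 1.5 = 40 mph."
    print(answer("A train travels 60 miles in 1.5 hours. What is its average speed?", demo_complete))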
Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors
Large language models (LLMs) are being applied as actors for sequential
decision making tasks in domains such as robotics and games, utilizing their
general world knowledge and planning abilities. However, previous work does
little to explore what environment state information is provided to LLM actors
via language. Exhaustively describing high-dimensional states can impair
performance and raise inference costs for LLM actors. Previous LLM actors avoid
the issue by relying on hand-engineered, task-specific protocols to determine
which features to communicate about a state and which to leave out. In this
work, we propose Brief Language INputs for DEcision-making Responses (BLINDER),
a method for automatically selecting concise state descriptions by learning a
value function for task-conditioned state descriptions. We evaluate BLINDER on
the challenging video game NetHack and a robotic manipulation task. Our method
improves task success rate, reduces input size and compute costs, and
generalizes between LLM actors.
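To make the selection step concrete, here is a minimal sketch of how a learned value function over task-conditioned state descriptions could be used to greedily pick a concise subset of state facts. The select_description and value_fn names, the greedy loop, and the toy value function are illustrative assumptions, not the paper's training procedure.

from typing import Callable, List

def select_description(
    candidate_facts: List[str],                    # all available state features, verbalized
    task: str,                                     # task instruction conditioning the value
    value_fn: Callable[[str, List[str]], float],   # learned value of a description set
    max_facts: int = 5,
) -> List[str]:
    """Greedily add the fact that most increases the estimated value of the description."""
    selected: List[str] = []
    remaining = list(candidate_facts)
    while remaining and len(selected) < max_facts:
        best = max(remaining, key=lambda f: value_fn(task, selected + [f]))
        if value_fn(task, selected + [best]) <= value_fn(task, selected):
            break  # no remaining candidate improves the description further
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    # Toy value function: prefer facts that mention the task's key object.
    toy_value = lambda task, facts: sum(1.0 for f in facts if "key" in f)
    facts = ["the agent is in a corridor", "a key lies to the east", "the walls are grey"]
    print(select_description(facts, "pick up the key", toy_value, max_facts=2))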
Turtle-like Geometry Learning: How Humans and Machines Differ in Learning Turtle Geometry
While object recognition is one of the prevalent affordances of humans' perceptual systems, even human infants can prioritize a place system, which is used when navigating, over the object recognition system. This ability, combined with active learning strategies, can make humans fast learners of Turtle Geometry, a notion introduced about four decades ago. We contrast humans' performance and learning strategies with those of large visual language models (LVLMs) and, as we show, LVLMs fall short of humans in solving Turtle Geometry tasks. We outline different characteristics of human-like learning in the domain of Turtle Geometry that are fundamentally unparalleled in state-of-the-art deep neural networks and can inform future research directions in the field of artificial intelligence.
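For readers unfamiliar with the domain, the following hypothetical example shows the kind of Logo-style program that Turtle Geometry tasks revolve around: predicting the figure traced by repeated forward and turn commands. It is an illustration only, not an item from the study.

import math

def trace(commands):
    """Return the points visited by a turtle starting at the origin, facing east."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for op, arg in commands:
        if op == "forward":
            x += arg * math.cos(math.radians(heading))
            y += arg * math.sin(math.radians(heading))
            points.append((round(x, 6), round(y, 6)))
        elif op == "turn":
            heading = (heading + arg) % 360
    return points

# Four forward-100 / turn-90 steps trace a square and return to the origin.
square = [("forward", 100), ("turn", 90)] * 4
print(trace(square))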
A Theoretically Grounded Benchmark for Evaluating Machine Commonsense
Programming machines with commonsense reasoning (CSR) abilities is a
longstanding challenge in the Artificial Intelligence community. Current CSR
benchmarks use multiple-choice (and in relatively fewer cases, generative)
question-answering instances to evaluate machine commonsense. Recent progress
in transformer-based language representation models suggests that considerable
progress has been made on existing benchmarks. However, although tens of CSR
benchmarks currently exist and their number is growing, it is not evident that
the full suite of commonsense capabilities has been systematically evaluated.
Furthermore, there are doubts about whether language models are 'fitting' to a
benchmark dataset's training partition by picking up on subtle statistical
features that are normatively irrelevant (at least for CSR) to achieve good performance
on the testing partition. To address these challenges, we propose a benchmark
called Theoretically-Grounded Commonsense Reasoning (TG-CSR) that is also based
on discriminative question answering, but with questions designed to evaluate
diverse aspects of commonsense, such as space, time, and world states. TG-CSR
is based on a subset of commonsense categories first proposed as a viable
theory of commonsense by Gordon and Hobbs. The benchmark is also designed to be
few-shot (and in the future, zero-shot), with only a few training and
validation examples provided. This report discusses the structure and
construction of the benchmark. Preliminary results suggest that the benchmark
is challenging even for advanced language representation models designed for
discriminative CSR question answering tasks.
Benchmark access and leaderboard: https://codalab.lisn.upsaclay.fr/competitions/3080
Benchmark website: https://usc-isi-i2.github.io/TGCSR
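As a rough illustration of the few-shot discriminative format such a benchmark targets, the sketch below assembles a multiple-choice prompt from labeled examples and an unlabeled query. The questions and the format_instance helper are invented placeholders, not TG-CSR items.

def format_instance(question, choices, answer=None):
    """Render one multiple-choice instance; leave the answer blank for the query item."""
    lines = ["Question: " + question]
    lines += ["  (" + chr(ord("A") + i) + ") " + c for i, c in enumerate(choices)]
    lines.append(("Answer: " + answer) if answer else "Answer:")
    return "\n".join(lines)

few_shot = [
    ("Where is a cup most likely to be right after it is filled with coffee?",
     ["in the freezer", "on the kitchen counter", "under the bed"], "B"),
]
query = ("If it is noon now, what time will it be in three hours?",
         ["9 a.m.", "3 p.m.", "midnight"])

prompt = "\n\n".join(
    [format_instance(q, c, a) for q, c, a in few_shot] + [format_instance(*query)]
)
print(prompt)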