Empowering Multi-step Reasoning across Languages via Tree-of-Thoughts
Chain-of-Thought (CoT) prompting empowers the reasoning abilities of Large
Language Models (LLMs), prompting them to solve complex reasoning tasks
step by step. However, despite the success of CoT methods, the ability to
deliver multi-step reasoning remains largely limited to English due to the
imbalance in the distribution of pre-training data, leaving other languages
behind.
In this work, we propose a Cross-lingual multi-step reasoning approach that
aims to align reasoning processes across different languages. In particular,
our method uses a Self-consistent Cross-lingual prompting mechanism, inspired
by the Tree-of-Thoughts approach, to deliver multi-step reasoning paths in
different languages that converge, step by step, on the final solution. Our
experimental evaluations show that our method significantly outperforms
existing prompting methods, reducing the number of interactions and achieving
state-of-the-art performance.
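As a rough illustration of the mechanism described above, the following Python sketch prompts a model for one reasoning path per language and aggregates the final answers by majority vote; `query_llm`, the prompt template, and the language list are hypothetical stand-ins, not the paper's actual implementation.

```python
# A minimal sketch of self-consistent cross-lingual prompting.
from collections import Counter

LANGUAGES = ["English", "Italian", "French", "German"]

PROMPT_TEMPLATE = (
    "Solve the following problem step by step, reasoning in {lang}. "
    "End with 'Answer: <value>'.\n\nProblem: {problem}"
)

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an API request)."""
    return "Step 1: ... Step 2: ... Answer: 42"

def extract_answer(completion: str) -> str:
    # Take whatever follows the final 'Answer:' marker.
    return completion.rsplit("Answer:", 1)[-1].strip()

def cross_lingual_self_consistency(problem: str) -> str:
    # One reasoning path per language; majority vote over final answers.
    answers = []
    for lang in LANGUAGES:
        completion = query_llm(PROMPT_TEMPLATE.format(lang=lang, problem=problem))
        answers.append(extract_answer(completion))
    return Counter(answers).most_common(1)[0][0]

print(cross_lingual_self_consistency("If 3 apples cost 6 euros, what do 7 cost?"))
```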
HANS, are you clever? Clever Hans Effect Analysis of Neural Systems
Instruction-tuned Large Language Models (It-LLMs) have exhibited outstanding
abilities to reason about the cognitive states, intentions, and reactions of
the people involved in an interaction, helping humans guide and comprehend
day-to-day social exchanges. Accordingly, several multiple-choice question
(MCQ) benchmarks have been proposed to build solid assessments of the models'
abilities. However, earlier work has demonstrated the presence of an inherent
"order bias" in It-LLMs, which poses a challenge to appropriate evaluation. In
this paper, we investigate It-LLMs' resilience to a series of probing tests
across four MCQ benchmarks. Introducing adversarial examples, we show a
significant performance gap, mainly when the order of the choices is varied,
revealing a selection bias and calling the models' reasoning abilities into
question. Observing a correlation, driven by positional bias, between first
positions and model choices, we hypothesize the presence of structural
heuristics in the It-LLMs' decision-making process, a hypothesis strengthened
by including representative examples in few-shot scenarios. Finally, using the
Chain-of-Thought (CoT) technique, we elicit the models to reason before
answering, mitigating the bias and obtaining more robust models.
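The order-sensitivity probe can be pictured as follows: present every permutation of an MCQ item's options and check whether the model keeps choosing the same content regardless of position. The sketch below is a minimal illustration under that assumption; `query_llm` and the toy item are placeholders, not the benchmarks or models used in the paper.

```python
# A minimal sketch of an order-bias probe for MCQ answering.
import itertools

def query_llm(prompt: str) -> str:
    """Placeholder for a real It-LLM call; returns a letter choice."""
    return "A"  # a positionally biased model always picks the first slot

def probe_order_bias(question: str, options: list[str]) -> float:
    # Present every permutation of the options and record which *content*
    # the model selects; a consistent model picks the same content each time.
    chosen = []
    for perm in itertools.permutations(options):
        letters = "ABCD"[: len(perm)]
        body = "\n".join(f"{l}. {o}" for l, o in zip(letters, perm))
        reply = query_llm(f"{question}\n{body}\nAnswer with a single letter.")
        idx = letters.index(reply.strip()[0])
        chosen.append(perm[idx])
    # Fraction of permutations agreeing with the most frequent choice.
    top = max(set(chosen), key=chosen.count)
    return chosen.count(top) / len(chosen)

consistency = probe_order_bias(
    "Which planet is largest?", ["Jupiter", "Mars", "Venus"]
)
print(f"choice consistency across orderings: {consistency:.2f}")
```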
Animate, or inanimate, that is the question for large language models
The cognitive essence of humans is deeply intertwined with the concept of animacy, which plays an essential role in shaping their memory, vision, and multi-layered language understanding. Although animacy surfaces in language via nuanced constraints on verbs and adjectives, it is also learned and refined through extralinguistic information. Accordingly, we assume that LLMs' limited abilities to process animacy in natural language stem from the fact that these models are trained exclusively on text. Hence the question this paper aims to answer: can LLMs, in their digital wisdom, process animacy in a way similar to humans? We propose a systematic analysis via prompting approaches. In particular, we probe different LLMs by prompting them with animate, inanimate, usual, and strange contexts. Results reveal that, although LLMs have been trained predominantly on textual data, they exhibit human-like behavior when faced with typical animate and inanimate entities, in line with earlier studies. They can also adapt to unconventional situations by recognizing atypical entities as animate, without needing the unspoken cognitive triggers humans rely on to process animacy.
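A minimal sketch of this kind of probing, with hypothetical entity and context lists, might look as follows; the paper's actual prompts and scoring are more elaborate.

```python
# A toy animacy probe: pair animate/inanimate entities with usual and
# strange contexts, then ask the model about the subject's animacy.
def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return "yes"

ENTITIES = {"animate": ["the dog", "the teacher"],
            "inanimate": ["the rock", "the lamp"]}
CONTEXTS = {"usual": "{e} was sleeping on the sofa.",
            "strange": "{e} apologized and walked away."}

for animacy, entities in ENTITIES.items():
    for ctx_kind, template in CONTEXTS.items():
        for e in entities:
            sent = template.format(e=e.capitalize())
            reply = query_llm(f"{sent}\nIs the subject of this sentence alive? yes/no")
            print(f"{animacy:9s} {ctx_kind:7s} {e:12s} -> {reply}")
```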
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
A surge in the popularity of transformer-based Language Models (such as
GPT (Brown et al., 2020) and PaLM (Chowdhery et al., 2022)) has opened the
doors to new Machine Learning applications, particularly in Natural Language
Processing, where pre-training on large text corpora is essential to achieving
remarkable results in downstream tasks. However, these Language Models seem to
carry inherent biases toward certain demographics reflected in their training
data. While research has attempted to mitigate this problem, existing methods
either fail to remove bias altogether, degrade performance, or are expensive.
This paper examines the bias produced by promising Language Models when
varying parameters and pre-training data. Finally, we propose a de-biasing
technique that produces robust, de-biased models that maintain performance on
downstream tasks.
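One common way to surface such demographic bias, sketched below under the assumption of a generic sentence-scoring function (e.g. a language model's log-likelihood), is to fill a stereotype template with different group terms and compare the scores; this is an illustration, not the measurement or de-biasing method proposed in the paper.

```python
# A template-based bias probe: score the same stereotype sentence with
# different demographic terms and compare deviations from the mean.
TEMPLATE = "{group} people are good at science."
GROUPS = ["young", "old", "rich", "poor"]

def sentence_score(sentence: str) -> float:
    """Placeholder for an LM log-likelihood or sentiment score."""
    return float(len(sentence))  # stand-in value only

scores = {g: sentence_score(TEMPLATE.format(group=g)) for g in GROUPS}
baseline = sum(scores.values()) / len(scores)
for g, s in scores.items():
    # Large deviations from the mean suggest the model associates the
    # stereotype more strongly with some groups than others.
    print(f"{g:6s} deviation from mean score: {s - baseline:+.2f}")
```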
Knowing Knowledge: Epistemological Study of Knowledge in Transformers
Statistical learners are leading towards auto-epistemic logic, but is this the right way to progress in artificial intelligence (AI)? Ways of discovering AI engage both the senses and the intellect. The structure of symbols (the operations by which the intellectual solution is realized) and the search for strategic reference points evoke essential issues in the analysis of AI. Studying how knowledge can be represented through methods of theoretical generalization and empirical observation is only the latest step in a long process of evolution. In this paper, we try to trace the origin of knowledge and how modern artificial minds have inherited it.
Shedding Light on the Dark Web: Authorship Attribution in Radical Forums
Online users tend to hide their real identities by adopting different names on the Internet. On Facebook or LinkedIn, for example, people usually appear under their real names, whereas on other websites, such as forums, they often use nicknames to protect their identities. Aliases are adopted when users want to preserve their anonymity, which can be a challenge for law enforcement trying to identify users who frequently change nicknames. In unmonitored contexts, such as the dark web, users expect strong identity protection; free from oversight, they may create parallel social networks where they can engage in potentially malicious activities that pose security threats. In this paper, we address the need to recognize people who anonymize themselves behind nicknames, i.e., the authorship attribution (AA) task, in the challenging context of the dark web: specifically, an English-language Islamic forum dedicated to discussions of issues related to the Islamic world and Islam, in which members of radical Islamic groups are present. We provide an extensive analysis by testing models based on transformers, stylistic features, and syntactic features. Our experiments show that models analyzing syntax and style outperform pre-trained universal language models.
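As a concrete, if simplified, illustration of a style-based AA baseline, the sketch below trains a character n-gram classifier on a toy corpus; the transformer and syntactic models tested in the paper are not reproduced here, and the posts and aliases are invented.

```python
# A stylometric authorship-attribution baseline: char n-gram TF-IDF + SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus: forum posts labeled with (anonymized) author aliases.
posts = [
    "i reckon the admins will ban him soon enough...",
    "i reckon they wont, mods never act fast here...",
    "In my considered opinion, the moderators shall intervene.",
    "In my considered opinion, such conduct merits removal.",
]
authors = ["user_a", "user_a", "user_b", "user_b"]

# Character 2-4-grams capture spelling, punctuation, and other style habits
# that survive nickname changes better than topical words do.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(),
)
clf.fit(posts, authors)
print(clf.predict(["i reckon this thread is done..."]))  # likely ['user_a']
```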
CryptoNet: Using Auto-Regressive Multi-Layer Artificial Neural Networks to Predict Financial Time Series
When analyzing a financial asset, it is essential to study the trend of its time series and to examine its evolution and activity over time in order to statistically analyze its possible future behavior. Both retail and institutional investors base their trading strategies on these analyses. Among the most widely used techniques for studying financial time series is analyzing their dynamic structure with auto-regressive models, simple moving average models (SMA), and mixed auto-regressive moving average models (ARMA). Unfortunately, these techniques do not always provide appreciable results, either statistically or in terms of the Risk-Reward Ratio (RRR); above all, each system has its pros and cons. In this paper, we present CryptoNet, a system based on time series trend extraction that exploits the vast potential of artificial intelligence (AI) and machine learning (ML). Specifically, we focused on trend extraction by developing an artificial neural network, trained and tested on two famous crypto-currencies: Bitcoin and Ether. CryptoNet's learning algorithm improved on the classic linear regression model by up to 31% in MAE (mean absolute error). Results from this work should encourage the adoption of machine learning techniques in sectors classically reluctant to embrace non-standard approaches.
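To make the auto-regressive setup concrete, the sketch below fits both a linear regression and a small multi-layer network on lagged windows of a synthetic price series and compares their MAE; the data, the architecture, and any numbers it prints are illustrative only and do not reproduce the paper's 31% figure.

```python
# An auto-regressive baseline comparison on a synthetic price series.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
price = np.cumsum(rng.normal(0, 1, 600)) + 100  # random-walk "price"

def lagged(series: np.ndarray, n_lags: int = 10):
    # Predict the next value from the previous `n_lags` values.
    X = np.stack([series[i : i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

X, y = lagged(price)
split = int(0.8 * len(X))
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

for name, model in [
    ("linear regression", LinearRegression()),
    ("auto-regressive MLP", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0),
    )),
]:
    model.fit(X_tr, y_tr)
    print(f"{name:20s} MAE: {mean_absolute_error(y_te, model.predict(X_te)):.3f}")
```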
Dis-Cover AI Minds to Preserve Human Knowledge
Modern AI technologies make use of statistical learners that lead to self-empiricist logic and, unlike human minds, rely on learned non-symbolic representations. Nevertheless, this may not be the right way to progress in AI. The structure of symbols (the operations by which the intellectual solution is realized) and the search for strategic reference points evoke important issues in the analysis of AI. Studying how knowledge can be represented through methods of theoretical generalization and empirical observation is only the latest step in a long process of evolution. For many years, humans, viewing language as innate, developed symbolic theories. Everything seems to have leapt forward with the advent of Machine Learning. In this paper, after a long analysis of the history of the rule-based and learning-based visions, we investigate syntax as a possible meeting point between the different learning theories. Finally, we propose a new vision of knowledge in AI models based on a combination of rules, learning, and human knowledge.