A Note on the Unconditional Bias of the Nadaraya-Watson Regression Estimator
In this note we investigate the order of the unconditional bias of the Nadaraya-Watson (Nadaraya, 1964; Watson, 1964) estimator for multivariate regression. Surprisingly, previous attempts at establishing this result are either imprecise and technically deficient, or of limited use given the assumptions imposed (see, inter alia, Glad (1998), Mack and Müller (1988), Pagan and Ullah (1999), and Scott (2015)). The results are also often conflicting (see, inter alia, Choi et al. (2000), Chu and Marron (1991), Collomb (1981), and Glad (1998)). Unfortunately, our result here is also incomplete, but we highlight the issues and suggest further ideas to resolve them.
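For reference (this is the textbook form, not notation taken from the note itself): given i.i.d. observations $(X_i, Y_i)$, a kernel $K$, and a bandwidth $h$, the Nadaraya-Watson estimator of $m(x) = E[Y \mid X = x]$ is

```latex
\hat{m}_h(x) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) Y_i}
                    {\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)}
```

In the multivariate case, $x, X_i \in \mathbb{R}^d$ and $K$ is typically a product kernel or uses a bandwidth matrix $H$ in place of the scalar $h$.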
Active Keyword Selection to Track Evolving Topics on Twitter
How can we study social interactions on evolving topics at a mass scale? Over
the past decade, researchers from diverse fields such as economics, political
science, and public health have often done this by querying Twitter's public
API endpoints with hand-picked topical keywords to search or stream
discussions. However, despite the API's accessibility, it remains difficult to
select and update keywords to collect high-quality data relevant to topics of
interest. In this paper, we propose an active learning method for rapidly
refining query keywords to increase both the yielded topic relevance and
dataset size. We leverage a large open-source COVID-19 Twitter dataset to
illustrate the applicability of our method in tracking Tweets around the key
sub-topics of Vaccine, Mask, and Lockdown. Our experiments show that our method
achieves an average topic-related keyword recall 2x higher than baselines. We
open-source our code along with a web interface for keyword selection to make
data collection from Twitter more systematic for researchers.
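The abstract gives no implementation detail, but the active-learning loop it describes can be sketched roughly as follows; collect_tweets and is_relevant are hypothetical stand-ins for the API query and the relevance oracle (a human label or a classifier), not the authors' code:

```python
# Rough sketch of one keyword-refinement round (hypothetical helpers).
from collections import Counter

def refine_keywords(seed_keywords, collect_tweets, is_relevant, top_k=5):
    """One active-learning round: collect tweets with the current keyword
    set, label them, and promote terms that skew toward relevant tweets."""
    relevant, irrelevant = Counter(), Counter()
    for text in collect_tweets(seed_keywords):      # query the Twitter API
        counts = relevant if is_relevant(text) else irrelevant
        counts.update(word.lower() for word in text.split())
    # Score candidate keywords by how strongly they co-occur with relevance.
    scores = {
        word: relevant[word] / (relevant[word] + irrelevant[word] + 1)
        for word in relevant if word not in seed_keywords
    }
    top_new = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return set(seed_keywords) | set(top_new)
```

Iterating this round grows both the keyword set and the collected dataset, which is the yield/relevance trade-off the paper measures.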
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Misinformation poses a critical societal challenge, and current approaches
have yet to produce an effective solution. We propose focusing on
generalization, soft classification, and leveraging recent large language
models to create more practical tools in contexts where perfect predictions
remain unattainable. We begin by demonstrating that GPT-4 and other language
models can outperform existing methods in the literature. Next, we explore
their generalization, revealing that GPT-4 and RoBERTa-large exhibit critical
differences in failure modes, which offer potential for significant performance
improvements. Finally, we show that these models can be employed in soft
classification frameworks to better quantify uncertainty. We find that models
with inferior hard classification results can achieve superior soft
classification performance. Overall, this research lays groundwork for future
tools that can drive real-world progress on misinformation.
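As a minimal illustration of what soft classification buys here (the numbers below are invented, not the paper's results): keep the model's probability instead of a thresholded label, and score calibration with a proper scoring rule such as the Brier score.

```python
# Soft classification sketch: evaluate elicited probabilities directly
# with the Brier score (mean squared error against 0/1 ground truth).
def brier_score(probs, labels):
    """Lower is better; rewards calibrated uncertainty over bold guesses."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

probs = [0.9, 0.2, 0.6]   # hypothetical P(misinformation) per claim
labels = [1, 0, 1]        # ground truth: 1 = misinformation
print(f"Brier score: {brier_score(probs, labels):.3f}")   # 0.070
```

Under such a metric a model with worse hard accuracy can still score better if its probabilities are well calibrated, which is the distinction the abstract draws.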
Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation
Large Language Models have emerged as prime candidates to tackle
misinformation mitigation. However, existing approaches struggle with
hallucinations and overconfident predictions. We propose an uncertainty
quantification framework that leverages both direct confidence elicitation and
sample-based consistency methods to provide better calibration for NLP
misinformation mitigation solutions. We first investigate the calibration of
sample-based consistency methods that exploit distinct features of consistency
across sample sizes and stochastic levels. Next, we evaluate the performance
and distributional shift of a robust numeric verbalization prompt across
single- vs. two-step confidence elicitation procedures. We also compare the
performance
of the same prompt with different versions of GPT and different numerical
scales. Finally, we combine the sample-based consistency and verbalized methods
to propose a hybrid framework that yields a better uncertainty estimation for
GPT models. Overall, our work proposes novel uncertainty quantification methods
that will improve the reliability of Large Language Models in misinformation
mitigation applications.
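A minimal sketch of the hybrid idea, assuming two hypothetical model calls: generate(prompt), which samples one answer under stochastic decoding, and elicit_confidence(prompt), which asks the model to verbalize a confidence in [0, 1]:

```python
# Hybrid uncertainty sketch: blend sample-based consistency with a
# verbalized confidence (both model calls are hypothetical stand-ins).
from collections import Counter

def sample_consistency(generate, prompt, n=10):
    """Return the modal answer and the fraction of samples agreeing with it."""
    answers = [generate(prompt) for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n

def hybrid_confidence(generate, elicit_confidence, prompt, weight=0.5):
    answer, consistency = sample_consistency(generate, prompt)
    verbalized = elicit_confidence(prompt)   # model states a number in [0, 1]
    return answer, weight * consistency + (1 - weight) * verbalized
```

The weighting here is a simple linear blend for illustration; the paper's actual combination rule may differ.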
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation
Recent large language models (LLMs) have been shown to be effective for
misinformation detection. However, the choice of LLMs for experiments varies
widely, leading to uncertain conclusions. In particular, GPT-4 is known to be
strong in this domain, but it is closed source, potentially expensive, and can
show instability between different versions. Meanwhile, alternative LLMs have
given mixed results. In this work, we show that Zephyr-7b presents a
consistently viable alternative, overcoming key limitations of commonly used
approaches like Llama-2 and GPT-3.5. This provides the research community with
a solid open-source option and shows open-source models are gradually catching
up on this task. We then highlight how GPT-3.5 exhibits unstable performance,
such that this very widely used model could provide misleading results in
misinformation detection. Finally, we validate new tools including approaches
to structured output and the latest version of GPT-4 (Turbo), showing they do
not compromise performance, thus unlocking them for future research and
potentially enabling more complex pipelines for misinformation mitigation.
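The structured-output approach mentioned above amounts to constraining the model to a schema and validating the reply before it enters a pipeline; the prompt wording and label set below are illustrative, not the paper's:

```python
# Structured-output sketch: demand JSON and validate it, so downstream
# code never has to parse free-form text (schema is illustrative).
import json

PROMPT = ('Classify the claim as misinformation. Respond only with JSON: '
          '{"label": "misinfo" | "not_misinfo", "confidence": <0-1>}')

def parse_verdict(raw_response):
    """Raise if the model's reply violates the expected schema."""
    data = json.loads(raw_response)
    assert data["label"] in {"misinfo", "not_misinfo"}
    assert 0.0 <= float(data["confidence"]) <= 1.0
    return data

print(parse_verdict('{"label": "not_misinfo", "confidence": 0.85}'))
```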
Open, Closed, or Small Language Models for Text Classification?
Recent advancements in large language models have demonstrated remarkable
capabilities across various NLP tasks. But many questions remain, including
whether open-source models match closed ones, why these models excel or
struggle with certain tasks, and what types of practical procedures can improve
performance. We address these questions in the context of classification by
evaluating three classes of models using eight datasets across three distinct
tasks: named entity recognition, political party prediction, and misinformation
detection. While larger LLMs often lead to improved performance, open-source
models can rival their closed-source counterparts by fine-tuning. Moreover,
supervised smaller models, like RoBERTa, can achieve similar or even greater
performance in many datasets compared to generative LLMs. On the other hand,
closed models maintain an advantage in hard tasks that demand the most
generalizability. This study underscores the importance of model selection
based on task requirements.
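For concreteness, fine-tuning a supervised smaller model like RoBERTa for such a classification task looks roughly like the following Hugging Face Transformers sketch; the two-example dataset and the hyperparameters are placeholders, not the paper's experimental setup:

```python
# Minimal sketch of fine-tuning RoBERTa for text classification.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

# Placeholder data: any dataset with "text" and "label" columns works.
train = Dataset.from_dict({
    "text": ["the earth is flat", "water boils at 100 C at sea level"],
    "label": [1, 0],   # 1 = misinformation
}).map(lambda b: tokenizer(b["text"], truncation=True, padding="max_length"),
       batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train,
)
trainer.train()
```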
Quantifying learning-style adaptation in effectiveness of LLM teaching
This preliminary study aims to investigate whether AI, when prompted based on individual learning styles, can effectively improve comprehension and learning experiences in educational settings. It involves tailoring LLMs' baseline prompts and comparing the results of a control group receiving standard content with those of an experimental group receiving learning style-tailored content. Preliminary results suggest that GPT-4 can generate responses aligned with various learning styles, indicating the potential for enhanced engagement and comprehension. However, these results also reveal challenges, including the model's tendency toward sycophantic behavior and variability in responses. Our findings suggest that a more sophisticated approach is required for integrating AI into education (AIEd) to improve educational outcomes.
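The tailoring step can be pictured as prepending a style-specific instruction to a shared baseline prompt; the style descriptions below are illustrative, not the study's actual prompts:

```python
# Sketch of learning-style prompt tailoring (illustrative prompts only).
STYLE_PREFIXES = {
    "visual": "Explain with described diagrams and spatial analogies.",
    "verbal": "Explain in clear, step-by-step written prose.",
    "kinesthetic": "Explain through activities the learner can try.",
}

def tailor_prompt(baseline_prompt, style=None):
    """Control group gets the bare baseline; others get a style prefix."""
    prefix = STYLE_PREFIXES.get(style, "")
    return f"{prefix}\n{baseline_prompt}".strip()

print(tailor_prompt("Teach the concept of recursion.", "visual"))
```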
Adversarial Policies Beat Superhuman Go AIs
We attack the state-of-the-art Go-playing AI system KataGo by training
adversarial policies against it, achieving a >97% win rate against KataGo
running at superhuman settings. Our adversaries do not win by playing Go well.
Instead, they trick KataGo into making serious blunders. Our attack transfers
zero-shot to other superhuman Go-playing AIs, and is comprehensible to the
extent that human experts can implement it without algorithmic assistance to
consistently beat superhuman AIs. The core vulnerability uncovered by our
attack persists even in KataGo agents adversarially trained to defend against
our attack. Our results demonstrate that even superhuman AI systems may harbor
surprising failure modes. Example games are available at https://goattack.far.ai/.
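Conceptually, the attack trains only the adversary while the victim stays frozen. A skeletal sketch of that loop follows; env, attacker, and victim are hypothetical stand-ins, and the paper's actual KataGo setup is far more involved:

```python
# Adversarial-policy training sketch: only the attacker learns; the
# superhuman victim is frozen (all objects are hypothetical stand-ins).
def play_episode(env, attacker, victim):
    """Alternate moves until the game ends; return +1 if the attacker wins."""
    state, attackers_turn = env.reset(), True
    while not env.done(state):
        agent = attacker if attackers_turn else victim
        state = env.step(state, agent.act(state))
        attackers_turn = not attackers_turn
    return env.score(state)   # +1 attacker win, -1 attacker loss

def train_attacker(env, attacker, victim, episodes=10_000):
    for _ in range(episodes):
        outcome = play_episode(env, attacker, victim)
        attacker.update(outcome)   # e.g. a policy-gradient step; victim never updates
```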