
16 research outputs found

Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment

Author: Goldstein Daniel G.
Hofman Jake M.
Rothschild David M.
Spatharioti Sofia Eleni
Publication date: 07/07/2023

Recent advances in the development of large language models are rapidly changing how online applications function. LLM-based search tools, for instance, offer a natural language interface that can accommodate complex queries and provide detailed, direct responses. At the same time, there have been concerns about the veracity of the information provided by LLM-based tools due to potential mistakes or fabrications that can arise in algorithmically generated text. In a set of online experiments we investigate how LLM-based search changes people's behavior relative to traditional search, and what can be done to mitigate overreliance on LLM-based output. Participants in our experiments were asked to solve a series of decision tasks that involved researching and comparing different products, and were randomly assigned to do so with either an LLM-based search tool or a traditional search engine. In our first experiment, we find that participants using the LLM-based tool were able to complete their tasks more quickly, using fewer but more complex queries than those who used traditional search. Moreover, these participants reported a more satisfying experience with the LLM-based search tool. When the information presented by the LLM was reliable, participants using the tool made decisions with a comparable level of accuracy to those using traditional search; however, we observed overreliance on incorrect information when the LLM erred. Our second experiment further investigated this issue by randomly assigning some users to see a simple color-coded highlighting scheme to alert them to potentially incorrect or misleading information in the LLM responses. Overall, we find that this confidence-based highlighting substantially increases the rate at which users spot incorrect information, improving the accuracy of their overall decisions while leaving most other measures unaffected.

arXiv.org e-Print Archive

Recommended from our members

A 680,000-person megastudy of nudges to encourage vaccination in pharmacies

Author: Bogard Jonathan E.
Brody Ilana
Chabris Christopher F.
Chang Edward
Chapman Gretchen B.
Dannals Jennifer E.
Gandhi Linnea
Goldstein Noah J.
Goren Amir
Graci Heather N.
Gromet Dena M.
Hershfield Hal
Hirsch Alex
Ho Hung
Kay Joseph S.
Lee Timothy W.
Ludwig Jens
Milkman Katherine L.
Mullainathan Sendhil
Patel Mitesh S.
Rothschild Jake
Publication date: 09/11/2023

Encouraging vaccination is a pressing policy problem. To assess whether text-based reminders can encourage pharmacy vaccination and what kinds of messages work best, we conducted a megastudy. We randomly assigned 689,693 Walmart pharmacy patients to receive one of 22 different text reminders using a variety of different behavioral science principles to nudge flu vaccination or to a business-as-usual control condition that received no messages. We found that the reminder texts that we tested increased pharmacy vaccination rates by an average of 2.0 percentage points, or 6.8%, over a 3-mo follow-up period. The most-effective messages reminded patients that a flu shot was waiting for them and delivered reminders on multiple days. The top-performing intervention included two texts delivered 3 d apart and communicated to patients that a vaccine was “waiting for you.” Neither experts nor lay people anticipated that this would be the best-performing treatment, underscoring the value of simultaneously testing many different nudges in a highly powered megastudy.

Knowledge UChicago

Online and Social Media Data As an Imperfect Continuous Panel Survey.

Author: David Rothschild
Emre Kıcıman
Fernando Diaz
Jake M Hofman
Michael Gamon
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

There is a large body of research on utilizing online activity as a survey of political opinion to predict real world election outcomes. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with the comprehensive search history of a large panel of Internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation are non-stationary and difficult to predict over time. In addition, the nature of user contributions varies substantially around important events. Furthermore, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. We provide a framework, built around considering this data as an imperfect continuous panel survey, for addressing these issues so that meaningful insight about public interest and opinion can be reliably extracted from online and social media data.

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Decathlon performance development

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)
Publication date: 01/01/2017

Title: Decathlon performance development. Objectives: The aim of this thesis was to evaluate performance development in combined events during the period 2009-2016, based on the results of world competitions and the variability of the point structure of the individual decathlon. For this purpose, data from world competitions were used and then compared in terms of the winning performances and those of the other athletes. Methods: This thesis used comparison and analysis of the best performances against the average performance values of the five best decathletes. Results: The results show that performance in the monitored period did not change significantly. As in previous years, athletes with outstanding sprinting and jumping results dominate combined events, without obvious point losses in the throwing disciplines. The development of performance was significantly influenced by Ashton Eaton and Trey Hardee, whose performances exceeded 8700 points. Among the individual performance developments of American decathletes, Ashton Eaton confirmed the high efficiency of his best-performance progress during the period studied. The differences in point scoring between the winners and the other athletes were similar, indicating a high degree of evenness.

National Repository of Grey Literature

The Francis Crick Institute

Top terms co-occurring with “Obama” on Twitter, from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: Term rows are arranged by time series clustering.

The Francis Crick Institute

Percent of Twitter discussion about the presidential candidates conducted by males from October 1, 2012 through November 6, 2012 (Election Day), compiled daily.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: Each line combines any text that contains the terms Obama, Romney, or both on either of the two mediums. The top line is the same line as the right line from Fig 2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0145406#pone.0145406.g002).

The Francis Crick Institute

Percent of Twitter discussion about the presidential candidates conducted by geographical division from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: Each line combines any text that contains the terms Obama, Romney, or both on either of the two mediums.

The Francis Crick Institute

Percent of overall search, Twitter, and print media discussion about Obama, Romney, and both from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: Each line sums up the total discussion about all three of the categories separately on any given day and divides by the total discussion.

The Francis Crick Institute

Median days since the last election-related tweet by the same person for any given tweet on that day.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: October 3 is actually infinity in that the median tweet was created by a person who did not tweet about the election in our sample.

The Francis Crick Institute

Percent of Twitter discussion about Obama, Romney, and both that contains a URL from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.

Author: David Rothschild (843634)
Emre Kıcıman (843633)
Fernando Diaz (5666392)
Jake M. Hofman (269642)
Michael Gamon (843632)

Note: Each line shows the percentage of tweets with a URL about all three of the categories separately on any given day.

The Francis Crick Institute