Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment
Recent advances in the development of large language models are rapidly
changing how online applications function. LLM-based search tools, for
instance, offer a natural language interface that can accommodate complex
queries and provide detailed, direct responses. At the same time, there have
been concerns about the veracity of the information provided by LLM-based tools
due to potential mistakes or fabrications that can arise in algorithmically
generated text. In a set of online experiments we investigate how LLM-based
search changes people's behavior relative to traditional search, and what can
be done to mitigate overreliance on LLM-based output. Participants in our
experiments were asked to solve a series of decision tasks that involved
researching and comparing different products, and were randomly assigned to do
so with either an LLM-based search tool or a traditional search engine. In our
first experiment, we find that participants using the LLM-based tool were able
to complete their tasks more quickly, using fewer but more complex queries than
those who used traditional search. Moreover, these participants reported a more
satisfying experience with the LLM-based search tool. When the information
presented by the LLM was reliable, participants using the tool made decisions
with a comparable level of accuracy to those using traditional search; however,
we observed overreliance on incorrect information when the LLM erred. Our
second experiment further investigated this issue by randomly assigning some
users to see a simple color-coded highlighting scheme to alert them to
potentially incorrect or misleading information in the LLM responses. Overall,
we find that this confidence-based highlighting substantially increases the
rate at which users spot incorrect information, improving the accuracy of their
overall decisions while leaving most other measures unaffected.
A 680,000-person megastudy of nudges to encourage vaccination in pharmacies
Encouraging vaccination is a pressing policy problem. To assess whether text-based reminders can encourage pharmacy vaccination and what kinds of messages work best, we conducted a megastudy. We randomly assigned 689,693 Walmart pharmacy patients to receive one of 22 different text reminders, drawing on a variety of behavioral science principles to nudge flu vaccination, or to a business-as-usual control condition that received no messages. We found that the reminder texts we tested increased pharmacy vaccination rates by an average of 2.0 percentage points, or 6.8%, over a 3-month follow-up period. The most effective messages reminded patients that a flu shot was waiting for them and delivered reminders on multiple days. The top-performing intervention included two texts delivered 3 days apart and communicated to patients that a vaccine was “waiting for you.” Neither experts nor lay people anticipated that this would be the best-performing treatment, underscoring the value of simultaneously testing many different nudges in a highly powered megastudy.
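As a rough back-of-the-envelope reading of the two effect sizes reported above, the 2.0 percentage-point lift and the 6.8% relative increase together imply a control-group vaccination rate of roughly 29% over the follow-up period; the short Python sketch below works through that arithmetic (the implied baseline is derived here from the abstract's figures and is not a number reported in it).

```python
# Back-of-the-envelope check relating the two effect sizes quoted in the abstract.
# The implied baseline rate is derived for illustration; it is not a reported figure.

lift_pp = 2.0           # average increase in vaccination, in percentage points
relative_lift = 0.068   # the same increase expressed as a relative lift (6.8%)

# If a 2.0 pp lift corresponds to a 6.8% relative increase, the implied
# control-group rate over the 3-month follow-up is lift / relative lift.
baseline_pct = lift_pp / relative_lift      # ~29.4%
treated_pct = baseline_pct + lift_pp        # ~31.4% on average across reminder arms

print(f"implied control rate  ≈ {baseline_pct:.1f}%")
print(f"implied reminder rate ≈ {treated_pct:.1f}%")
```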
Online and Social Media Data As an Imperfect Continuous Panel Survey.
There is a large body of research on utilizing online activity as a survey of political opinion to predict real-world election outcomes. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with the comprehensive search history of a large panel of Internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation are non-stationary and difficult to predict over time. In addition, the nature of user contributions varies substantially around important events. Furthermore, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. We provide a framework, built around considering this data as an imperfect continuous panel survey, for addressing these issues so that meaningful insight about public interest and opinion can be reliably extracted from online and social media data.
Decathlon performance development
Objectives: The aim of this thesis was to evaluate performance development in the combined events over the period 2009-2016, based on the results of world competitions and the variability of the point structure of the individual decathlon. For this purpose, data from world competitions were used and compared in terms of the winning performances and those of the other athletes. Methods: The thesis uses comparison and analysis of the best performances against the average performance values of the five best decathletes. Results: The results of this work show that performance development over the monitored period did not change significantly. As in previous years, athletes with outstanding sprinting and jumping results dominate the combined events, without obvious point losses in the throwing disciplines. Performance development was significantly influenced by Ashton Eaton and Trey Hardee, whose performances exceeded 8700 points. Among the individual performance trajectories of American decathletes, Ashton Eaton's confirmed the high efficiency of his progression toward his best performances during the period studied. The differences in point scoring between the winners and the other athletes were similar, indicating a high degree of evenness.
Top terms co-occurring with “Obama” on Twitter, from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.
Note: Term rows are arranged by time series clustering.
Percent of Twitter discussion about the presidential candidates conducted by males from October 1, 2012 through November 6, 2012 (Election Day), compiled daily.
Note: Each line combines any text that contains the terms Obama, Romney, or both on either of the two media. The top line is the same as the right line in Fig 2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0145406#pone.0145406.g002).
Percent of Twitter discussion about the presidential candidates conducted by geographical division from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.
Note: Each line combines any text that contains the terms Obama, Romney, or both on either of the two media.
Percent of overall search, Twitter, and print media discussion about Obama, Romney, and both, from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.
Note: Each line sums the total discussion about each of the three categories separately on any given day and divides by the total discussion.
Median, over all tweets on a given day, of the number of days since the same person's last election-related tweet.
Note: October 3 is actually infinity, in that the median tweet was created by a person who had not previously tweeted about the election in our sample.
Percent of Twitter discussion about Obama, Romney, and both that contains a URL, from August 1, 2012 through November 6, 2012 (Election Day), compiled daily.
Note: Each line shows the percentage of tweets containing a URL for each of the three categories separately on any given day.