You can't see what you can't see: Experimental evidence for how much relevant information may be missed due to Google's Web search personalisation
The influence of Web search personalisation on professional knowledge work is
an understudied area. Here we investigate how public sector officials
self-assess their dependency on the Google Web search engine, whether they are
aware of the potential impact of algorithmic biases on their ability to
retrieve all relevant information, and how much relevant information may
actually be missed due to Web search personalisation. We find that the majority
of participants in our experimental study are neither aware that there is a
potential problem nor do they have a strategy to mitigate the risk of missing
relevant information when performing online searches. Most significantly, we
provide empirical evidence that up to 20% of relevant information may be missed
due to Web search personalisation. This work has significant implications for
Web research by public sector professionals, who should be provided with
training about the potential algorithmic biases that may affect their judgments
and decision making, as well as clear guidelines on how to minimise the risk of
missing relevant information.
Comment: paper submitted to the 11th Intl. Conf. on Social Informatics;
revision corrects error in interpretation of parameter Psi/p in RBO resulting
from discrepancy between the documentation of the implementation in R
(https://rdrr.io/bioc/gespeR/man/rbo.html) and the original definition
(https://dl.acm.org/citation.cfm?id=1852106) as per 20/05/201
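The revision note above turns on the parameter p of rank-biased overlap (RBO). As a minimal sketch, assuming the original definition in which p is the persistence parameter (smaller p puts more weight on the top ranks), a truncated RBO can be computed as below. The function name `rbo_truncated` and its arguments are hypothetical illustrations; this is neither the gespeR implementation nor the code used in the paper.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9, depth=None):
    """Truncated rank-biased overlap of two duplicate-free rankings.

    Computes (1 - p) * sum_{d=1..k} p**(d - 1) * A_d, where A_d is the
    fraction of items shared by the top-d prefixes of both rankings.
    Stopping at depth k under-estimates the full (infinite-depth) RBO.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    # Assumption: default to the shorter list so every prefix is fully defined.
    k = min(len(ranking_a), len(ranking_b)) if depth is None else depth
    score = 0.0
    for d in range(1, k + 1):
        overlap = len(set(ranking_a[:d]) & set(ranking_b[:d]))
        agreement = overlap / d          # A_d: agreement at depth d
        score += p ** (d - 1) * agreement
    return (1 - p) * score


if __name__ == "__main__":
    # Hypothetical result lists from two differently personalised searches.
    personalised = ["url1", "url2", "url3", "url4"]
    baseline = ["url2", "url1", "url5", "url3"]
    print(rbo_truncated(personalised, baseline, p=0.9))
```

Because the weights p**(d - 1) decay geometrically, confusing p with its complement reverses how strongly the top of the ranking dominates the score, which is the kind of misreading the revision note describes.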
Platforms, the First Amendment and Online Speech: Regulating the Filters
In recent years, online platforms have given rise to multiple discussions about what their role is, what their role should be, and whether they should be regulated. The complex nature of these private entities makes it very challenging to place them in a single descriptive category with existing rules. In today’s information environment, social media platforms have become a platform press by providing hosting as well as navigation and delivery of public expression, much of which is done through machine learning algorithms. This article argues that there is a subset of algorithms that social media platforms use to filter public expression, which can be regulated without constitutional objections. A distinction is drawn between algorithms that curate speech for hosting purposes and those that curate for navigation purposes, and it is argued that content navigation algorithms, because of their function, deserve separate constitutional treatment. By analyzing the platforms’ functions independently from one another, this paper constructs a doctrinal and normative framework that can be used to navigate some of the complexity.
The First Amendment makes it problematic to interfere with how platforms decide what to host because algorithms that implement content moderation policies perform functions analogous to an editorial role when deciding whether content should be censored or allowed on the platform. Content navigation algorithms, on the other hand, do not face the same doctrinal challenges; they operate outside of the public discourse as mere information conduits and are thus not subject to core First Amendment doctrine. Their function is to facilitate the flow of information to an audience, which in turn participates in public discourse; if they have any constitutional status, it is derived from the value they provide to their audience as a delivery mechanism of information.
This article asserts that we should regulate content navigation algorithms to an extent. They undermine the notion of autonomous choice in the selection and consumption of content, and their role in today’s information environment is not aligned with a functioning marketplace of ideas and the prerequisites for citizens in a democratic society to perform their civic duties. The paper concludes that any regulation directed at content navigation algorithms should be subject to a lower standard of scrutiny, similar to the standard for commercial speech.
Quantifying Biases in Online Information Exposure
Our consumption of online information is mediated by filtering, ranking, and
recommendation algorithms that introduce unintentional biases as they attempt
to deliver relevant and engaging content. It has been suggested that our
reliance on online technologies such as search engines and social media may
limit exposure to diverse points of view and make us vulnerable to manipulation
by disinformation. In this paper, we mine a massive dataset of Web traffic to
quantify two kinds of bias: (i) homogeneity bias, which is the tendency to
consume content from a narrow set of information sources, and (ii) popularity
bias, which is the selective exposure to content from top sites. Our analysis
reveals different bias levels across several widely used Web platforms. Search
exposes users to a diverse set of sources, while social media traffic tends to
exhibit high popularity and homogeneity bias. When we focus our analysis on
traffic to news sites, we find higher levels of popularity bias, with smaller
differences across applications. Overall, our results quantify the extent to
which our choices of online systems confine us inside "social bubbles."
Comment: 25 pages, 10 figures, to appear in the Journal of the Association for
Information Science and Technology (JASIST)
Scraping the Social? Issues in live social research
What makes scraping methodologically interesting for social and cultural research? This paper seeks to contribute to debates about digital social research by exploring how a ‘medium-specific’ technique for online data capture may be rendered analytically productive for social research. As a device that is currently being imported into social research, scraping has the capacity to re-structure social research in at least two ways. Firstly, as a technique that is not native to social research, scraping risks introducing ‘alien’ methodological assumptions into social research (such as a preoccupation with freshness). Secondly, to scrape is to risk importing into our inquiry categories that are prevalent in the social practices enabled by the media: scraping makes available already formatted data for social research. Scraped data, and online social data more generally, tend to come with ‘external’ analytics already built-in. This circumstance is often approached as a ‘problem’ with online data capture, but we propose it may be turned into a virtue, insofar as data formats that have currency in the areas under scrutiny may serve as a source of social data themselves. Scraping, we propose, makes it possible to render traffic between the object and process of social research analytically productive. It enables a form of ‘real-time’ social research, in which the formats and life cycles of online data may lend structure to the analytic objects and findings of social research. By way of a conclusion, we demonstrate this point in an exercise of online issue profiling, and more particularly, by relying on Twitter to profile the issue of ‘austerity’. Here we distinguish between two forms of real-time research, those dedicated to monitoring live content (which terms are current?) and those concerned with analysing the liveliness of issues (which topics are happening?).
Auditing News Curation Systems: A Case Study Examining Algorithmic and Editorial Logic in Apple News
This work presents an audit study of Apple News as a sociotechnical news
curation system that exercises gatekeeping power in the media. We examine the
mechanisms behind Apple News as well as the content presented in the app,
outlining the social, political, and economic implications of both aspects. We
focus on the Trending Stories section, which is algorithmically curated, and
the Top Stories section, which is human-curated. Results from a crowdsourced
audit showed minimal content personalization in the Trending Stories section,
and a sock-puppet audit showed no location-based content adaptation. Finally,
we perform an extended two-month data collection to compare the human-curated
Top Stories section with the algorithmically curated Trending Stories section.
Within these two sections, human curation outperformed algorithmic curation in
several measures of source diversity, concentration, and evenness. Furthermore,
algorithmic curation featured more "soft news" about celebrities and
entertainment, while editorial curation featured more news about policy and
international events. To our knowledge, this study provides the first
data-backed characterization of Apple News in the United States.
Comment: Preprint, to appear in Proceedings of the Fourteenth International
AAAI Conference on Web and Social Media (ICWSM 2020)
Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines.
Comment: published in WWW'1
Platform Advocacy and the Threat to Deliberative Democracy
Businesses have long tried to influence political outcomes, but today, there is a new and potent form of corporate political power—Platform Advocacy. Internet-based platforms, such as Facebook, Google, and Uber, mobilize their user bases through direct solicitation of support and the more troubling exploitation of irrational behavior. Platform Advocacy helps platforms push policy agendas that create favorable legal environments for themselves, thereby strengthening their own dominance in the marketplace. This new form of advocacy will have radical effects on deliberative democracy.
In the age of constant digital noise and uncertainty, it is more important than ever to detect and analyze new forms of political power. This Article will contribute to our understanding of one such new form and provide a way forward to ensure the exceptional power of platforms does not improperly influence consumers and, by extension, lawmakers.
My friends, editors, algorithms, and I: Examining audience attitudes to news selection
Prompted by the ongoing development of content personalization by social networks and mainstream news brands, and recent debates about balancing algorithmic and editorial selection, this study explores what audiences think about news selection mechanisms and why. Analysing data from a 26-country survey (N=53,314), we report the extent to which audiences believe story selection by editors and story selection by algorithms are good ways to get news online and, using multi-level models, explore the relationships that exist between individuals’ characteristics and those beliefs. The results show that, collectively, audiences believe algorithmic selection guided by a user’s past consumption behaviour is a better way to get news than editorial curation. There are, however, significant variations in these beliefs at the individual level. Age, trust in news, concerns about privacy, mobile news access, paying for news, and six other variables had effects. Our results are partly in line with current general theory on algorithmic appreciation, but diverge in our findings on the relative appreciation of algorithms and experts, and in how the appreciation of algorithms can differ according to the data that drive them. We believe this divergence is partly due to our study’s focus on news, showing algorithmic appreciation has context-specific characteristics.