Recent legislation requires AI platforms to provide APIs for regulators to
assess their compliance with the law. Research has nevertheless shown that
platforms can manipulate their API answers through fairwashing. Given this
threat to reliable auditing, this paper studies the benefits of jointly using
platform scraping and APIs. In this setup, we elaborate on the use of
scraping to detect manipulated answers: since fairwashing only manipulates API
answers, comparing them with scraped data may reveal a manipulation. To abstract the wide
range of specific API-scrap situations, we introduce a notion of proxy that
captures the consistency an auditor might expect between the two data sources. If
the auditor has a good proxy of this consistency, then she can easily detect
manipulation and even bypass the API to conduct her audit. On the other hand,
without a good proxy, relying on the API is necessary, and the auditor cannot
defend against fairwashing.
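
As a rough illustration of the detection idea above, the following minimal Python sketch compares API answers with scraped answers on the queries both sources cover, and flags a possible manipulation when the disagreement exceeds what the auditor would expect. The function names, the dictionary-based data layout, and the use of a single scalar `tolerance` as a stand-in for the proxy are all assumptions made for this example; they are not the paper's formal definitions.

```python
from typing import Hashable, Mapping


def disagreement_rate(
    api_answers: Mapping[Hashable, object],
    scraped_answers: Mapping[Hashable, object],
) -> float:
    """Fraction of shared queries on which the two sources disagree."""
    # Only queries observed through both channels can be compared.
    shared = api_answers.keys() & scraped_answers.keys()
    if not shared:
        raise ValueError("no query is covered by both the API and the scrape")
    mismatches = sum(1 for q in shared if api_answers[q] != scraped_answers[q])
    return mismatches / len(shared)


def flag_manipulation(
    api_answers: Mapping[Hashable, object],
    scraped_answers: Mapping[Hashable, object],
    tolerance: float,
) -> bool:
    """Flag when disagreement exceeds the expected level.

    `tolerance` is a hypothetical scalar standing in for the proxy: the
    disagreement an auditor would accept between honest API answers and
    scraped data.
    """
    return disagreement_rate(api_answers, scraped_answers) > tolerance
```

With a tight tolerance (a good proxy), even modest manipulation of API answers is flagged; with a loose one, the same discrepancy passes unnoticed, which mirrors the dichotomy described above.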
We then simulate practical scenarios in which the auditor relies mostly on
the API to conveniently conduct the audit task, while preserving her chances
of detecting a potential manipulation. To highlight the tension between the audit
task and the API fairwashing detection task, we identify Pareto-optimal
strategies in a practical audit scenario.
We believe this research sets the stage for reliable audits in practical and
manipulation-prone setups.

Comment: 18 pages, 7 figures