The rapid growth of platforms for customizing Large Language
Models (LLMs), such as OpenAI’s GPTs, has raised new privacy
and security concerns, particularly related to the exposure of user
data via third-party API integrations in LLM apps. To assess privacy
risks and data practices, we conducted a large-scale analysis of
OpenAI’s GPTs ecosystem. Through the analysis of 5,286 GPTs
and the 44,102 parameters they use through API calls to external services, we systematically investigated the types of user data
collected, as well as the completeness and discrepancies between
actual data flows and GPTs' stated privacy policies. Our results highlight that approximately 35% of API parameters enable the sharing
of sensitive or personally identifiable information, yet only 15%
of corresponding privacy policies provide complete disclosure. By
quantifying these discrepancies, our study exposes critical privacy
risks and underscores the need for stronger oversight and support
tools in LLM-based application development. Furthermore, we uncover widespread problematic practices among GPT creators, such
as missing or inaccurate privacy policies and a misunderstanding
of their privacy responsibilities. Building on these insights, we
propose design recommendations that include actionable measurements to improve transparency and informed consent, enhance
creator responsibility, and strengthen regulation.

This research was partially funded by INCIBE's strategic SPRINT (Seguridad y Privacidad en Sistemas con Inteligencia Artificial) C063/23 project with funds from the EU-NextGenerationEU through
the Spanish government’s Plan de Recuperación, Transformación
y Resiliencia; by the Spanish Government via project PID2023-
151536OB-I0; and by the Generalitat Valenciana via project PROMETEO
CIPROM/2023/23.