Personal Data Flows and Privacy Policy Traceability in Third-party LLM Apps in the GPT Ecosystem

Abstract

The rapid growth of platforms for customizing Large Language Models (LLMs), such as OpenAI’s GPTs, has raised new privacy and security concerns, particularly related to the exposure of user data via third-party API integrations in LLM apps. To assess privacy risks and data practices, we conducted a large-scale analysis of OpenAI’s GPTs ecosystem. Through the analysis of 5,286 GPTs and the 44,102 parameters they use in API calls to external services, we systematically investigated the types of user data collected, the completeness of GPTs’ stated privacy policies, and the discrepancies between those policies and actual data flows. Our results highlight that approximately 35% of API parameters enable the sharing of sensitive or personally identifiable information, yet only 15% of the corresponding privacy policies provide complete disclosure. By quantifying these discrepancies, our study exposes critical privacy risks and underscores the need for stronger oversight and support tools in LLM-based application development. Furthermore, we uncover widespread problematic practices among GPT creators, such as missing or inaccurate privacy policies and a misunderstanding of their privacy responsibilities. Building on these insights, we propose design recommendations that include actionable measurements to improve transparency and informed consent, enhance creator responsibility, and strengthen regulation.

This research was partially funded by INCIBE’s strategic SPRINT (Seguridad y Privacidad en Sistemas con Inteligencia Artificial) C063/23 project, with funds from the EU-NextGenerationEU through the Spanish government’s Plan de Recuperación, Transformación y Resiliencia; by the Spanish Government via project PID2023-151536OB-I0; and by the Generalitat Valenciana via project PROMETEO CIPROM/2023/23.

Peer reviewed

Last time updated on 30/12/2025

This paper was published in Digital.CSIC.

Licence: info:eu-repo/semantics/openAccess