150 research outputs found

    Practical anonymization for data streams: z-anonymity and relation with k-anonymity

    Get PDF
    With the advent of big data and the emergence of data markets, preserving individuals’ privacy has become of utmost importance. The classical response to this need is anonymization, i.e., sanitizing the information that, directly or indirectly, can allow users’ re-identification. Among the various approaches, -anonymity provides a simple and easy-to-understand protection. However, -anonymity is challenging to achieve in a continuous stream of data and scales poorly when the number of attributes becomes high. In this paper, we study a novel anonymization property called -anonymity that we explicitly design to deal with data streams, i.e., where the decision to publish a given attribute (atomic information) is made in real time. The idea at the base of -anonymity is to release such attribute about a user only if at least other users have exposed the same attribute in a past time window. Depending on the value of , the output stream results -anonymized with a certain probability. To this end, we present a probabilistic model to map the -anonymity into the -anonymity property. The model is not only helpful in studying the -anonymity property, but also general enough to evaluate the probability of achieving -anonymity in data streams, resulting in a generic contribution

    On the Robustness of Topics API to a Re-Identification Attack

    Full text link
    Web tracking through third-party cookies is considered a threat to users' privacy and is supposed to be abandoned in the near future. Recently, Google proposed the Topics API framework as a privacy-friendly alternative for behavioural advertising. Using this approach, the browser builds a user profile based on navigation history, which advertisers can access. The Topics API has the possibility of becoming the new standard for behavioural advertising, thus it is necessary to fully understand its operation and find possible limitations. This paper evaluates the robustness of the Topics API to a re-identification attack where an attacker reconstructs the user profile by accumulating user's exposed topics over time to later re-identify the same user on a different website. Using real traffic traces and realistic population models, we find that the Topics API mitigates but cannot prevent re-identification to take place, as there is a sizeable chance that a user's profile is unique within a website's audience. Consequently, the probability of correct re-identification can reach 15-17%, considering a pool of 1,000 users. We offer the code and data we use in this work to stimulate further studies and the tuning of the Topic API parameters.Comment: Privacy Enhancing Technologies Symposium (PETS) 202

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volum

    Understanding and measuring privacy violations in Android apps

    Get PDF
    Increasing data collection and tracking of consumers by today’s online services is becoming a major problem for individuals’ rights. It raises a serious question about whether such data collection can be legally justified under legislation around the globe. Unfortunately, the community lacks insight into such violations in the mobile ecosystem. In this dissertation, we approach these problems by presenting a line of work that provides a comprehensive understanding of privacy violations in Android apps in the wild and automatically measures such violations at scale. First, we build an automated tool that detects unexpected data access based on user perception when interacting with the apps’ user interface. Subsequently, we perform a large-scale study on Android apps to understand how prevalent violations of GDPR’s explicit consent requirement are in the wild. Finally, until now, no study has systematically analyzed the currently implemented consent notices and whether they conform to GDPR in mobile apps. Therefore, we propose a mostly automated and scalable approach to identify the current practices of implemented consent notices. We then develop an automatic tool that detects data sent out to the Internet with different consent conditions. Our result shows the urgent need for more transparent user interface designs to better inform users of data access and call for new tools to support app developers in this endeavor.Die zunehmende Datenerfassung und Verfolgung von Konsumenten durch die heutigen Online-Dienste wird zu einem großen Problem für individuelle Rechte. Es wirft eine ernsthafte Frage auf, ob eine solche Datenerfassung nach der weltweiten Gesetzgebung juristisch begründet werden kann. Leider hat die Gemeinschaft keinen Einblick in diese Verstöße im mobilen Ökosystem. In dieser Dissertation nähern wir uns diesen Problemen, indem wir eine Arbeitslinie vorstellen, die ein umfassendes Verständnis von Datenschutzverletzungen in Android- Apps in der Praxis bietet und solche Verstöße automatisch misst. Zunächst entwickeln wir ein automatisiertes Tool, das unvorhergesehene Datenzugriffe basierend auf der Nutzung der Benutzeroberfläche von Apps erkennt. Danach führen wir eine umfangreiche Studie zu Android-Apps durch, um zu verstehen, wie häufig Verstöße gegen die ausdrückliche Zustimmung der GDPR vorkommen. Schließlich hat bis jetzt keine Studie systematisch die gegenwärtig implementierten Zustimmungen und deren Übereinstimmung mit der GDPR in mobilen Apps analysiert. Daher schlagen wir einen meist automatisierten und skalierbaren Ansatz vor, um die aktuellen Praktiken von Zustimmungen zu identifizieren. Danach entwickeln wir ein Tool, das Daten erkennt, die mit unterschiedlichen Zustimmungsbedingungen ins Internet gesendet werden. Unser Ergebnis zeigt den dringenden Bedarf an einer transparenteren Gestaltung von Benutzeroberflächen, um die Nutzer besser über den Datenzugriff zu informieren, und wir fordern neue Tools, die App-Entwickler bei diesem Unterfangen unterstützen. ii

    Detection and Mitigation of Steganographic Malware

    Get PDF
    A new attack trend concerns the use of some form of steganography and information hiding to make malware stealthier and able to elude many standard security mechanisms. Therefore, this Thesis addresses the detection and the mitigation of this class of threats. In particular, it considers malware implementing covert communications within network traffic or cloaking malicious payloads within digital images. The first research contribution of this Thesis is in the detection of network covert channels. Unfortunately, the literature on the topic lacks of real traffic traces or attack samples to perform precise tests or security assessments. Thus, a propaedeutic research activity has been devoted to develop two ad-hoc tools. The first allows to create covert channels targeting the IPv6 protocol by eavesdropping flows, whereas the second allows to embed secret data within arbitrary traffic traces that can be replayed to perform investigations in realistic conditions. This Thesis then starts with a security assessment concerning the impact of hidden network communications in production-quality scenarios. Results have been obtained by considering channels cloaking data in the most popular protocols (e.g., TLS, IPv4/v6, and ICMPv4/v6) and showcased that de-facto standard intrusion detection systems and firewalls (i.e., Snort, Suricata, and Zeek) are unable to spot this class of hazards. Since malware can conceal information (e.g., commands and configuration files) in almost every protocol, traffic feature or network element, configuring or adapting pre-existent security solutions could be not straightforward. Moreover, inspecting multiple protocols, fields or conversations at the same time could lead to performance issues. Thus, a major effort has been devoted to develop a suite based on the extended Berkeley Packet Filter (eBPF) to gain visibility over different network protocols/components and to efficiently collect various performance indicators or statistics by using a unique technology. This part of research allowed to spot the presence of network covert channels targeting the header of the IPv6 protocol or the inter-packet time of generic network conversations. In addition, the approach based on eBPF turned out to be very flexible and also allowed to reveal hidden data transfers between two processes co-located within the same host. Another important contribution of this part of the Thesis concerns the deployment of the suite in realistic scenarios and its comparison with other similar tools. Specifically, a thorough performance evaluation demonstrated that eBPF can be used to inspect traffic and reveal the presence of covert communications also when in the presence of high loads, e.g., it can sustain rates up to 3 Gbit/s with commodity hardware. To further address the problem of revealing network covert channels in realistic environments, this Thesis also investigates malware targeting traffic generated by Internet of Things devices. In this case, an incremental ensemble of autoencoders has been considered to face the ''unknown'' location of the hidden data generated by a threat covertly exchanging commands towards a remote attacker. The second research contribution of this Thesis is in the detection of malicious payloads hidden within digital images. In fact, the majority of real-world malware exploits hiding methods based on Least Significant Bit steganography and some of its variants, such as the Invoke-PSImage mechanism. Therefore, a relevant amount of research has been done to detect the presence of hidden data and classify the payload (e.g., malicious PowerShell scripts or PHP fragments). To this aim, mechanisms leveraging Deep Neural Networks (DNNs) proved to be flexible and effective since they can learn by combining raw low-level data and can be updated or retrained to consider unseen payloads or images with different features. To take into account realistic threat models, this Thesis studies malware targeting different types of images (i.e., favicons and icons) and various payloads (e.g., URLs and Ethereum addresses, as well as webshells). Obtained results showcased that DNNs can be considered a valid tool for spotting the presence of hidden contents since their detection accuracy is always above 90% also when facing ''elusion'' mechanisms such as basic obfuscation techniques or alternative encoding schemes. Lastly, when detection or classification are not possible (e.g., due to resource constraints), approaches enforcing ''sanitization'' can be applied. Thus, this Thesis also considers autoencoders able to disrupt hidden malicious contents without degrading the quality of the image

    Modeling of Advanced Threat Actors: Characterization, Categorization and Detection

    Full text link
    Tesis por compendio[ES] La información y los sistemas que la tratan son un activo a proteger para personas, organizaciones e incluso países enteros. Nuestra dependencia en las tecnologías de la información es cada día mayor, por lo que su seguridad es clave para nuestro bienestar. Los beneficios que estas tecnologías nos proporcionan son incuestionables, pero su uso también introduce riesgos que ligados a nuestra creciente dependencia de las mismas es necesario mitigar. Los actores hostiles avanzados se categorizan principalmente en grupos criminales que buscan un beneficio económico y en países cuyo objetivo es obtener superioridad en ámbitos estratégicos como el comercial o el militar. Estos actores explotan las tecnologías, y en particular el ciberespacio, para lograr sus objetivos. La presente tesis doctoral realiza aportaciones significativas a la caracterización de los actores hostiles avanzados y a la detección de sus actividades. El análisis de sus características es básico no sólo para conocer a estos actores y sus operaciones, sino para facilitar el despliegue de contramedidas que incrementen nuestra seguridad. La detección de dichas operaciones es el primer paso necesario para neutralizarlas, y por tanto para minimizar su impacto. En el ámbito de la caracterización, este trabajo profundiza en el análisis de las tácticas y técnicas de los actores. Dicho análisis siempre es necesario para una correcta detección de las actividades hostiles en el ciberespacio, pero en el caso de los actores avanzados, desde grupos criminales hasta estados, es obligatorio: sus actividades son sigilosas, ya que el éxito de las mismas se basa, en la mayor parte de casos, en no ser detectados por la víctima. En el ámbito de la detección, este trabajo identifica y justifica los requisitos clave para poder establecer una capacidad adecuada frente a los actores hostiles avanzados. Adicionalmente, proporciona las tácticas que deben ser implementadas en los Centros de Operaciones de Seguridad para optimizar sus capacidades de detección y respuesta. Debemos destacar que estas tácticas, estructuradas en forma de kill-chain, permiten no sólo dicha optimización, sino también una aproximación homogénea y estructurada común para todos los centros defensivos. En mi opinión, una de las bases de mi trabajo debe ser la aplicabilidad de los resultados. Por este motivo, el análisis de tácticas y técnicas de los actores de la amenaza está alineado con el principal marco de trabajo público para dicho análisis, MITRE ATT&CK. Los resultados y propuestas de esta investigación pueden ser directamente incluidos en dicho marco, mejorando así la caracterización de los actores hostiles y de sus actividades en el ciberespacio. Adicionalmente, las propuestas para mejorar la detección de dichas actividades son de aplicación directa tanto en los Centros de Operaciones de Seguridad actuales como en las tecnologías de detección más comunes en la industria. De esta forma, este trabajo mejora de forma significativa las capacidades de análisis y detección actuales, y por tanto mejora a su vez la neutralización de operaciones hostiles. Estas capacidades incrementan la seguridad global de todo tipo de organizaciones y, en definitiva, de nuestra sociedad.[CA] La informació i els sistemas que la tracten són un actiu a protegir per a persones, organitzacions i fins i tot països sencers. La nostra dependència en les tecnologies de la informació es cada dia major, i per aixó la nostra seguretat és clau per al nostre benestar. Els beneficis que aquestes tecnologies ens proporcionen són inqüestionables, però el seu ús també introdueix riscos que, lligats a la nostra creixent dependència de les mateixes és necessari mitigar. Els actors hostils avançats es categoritzen principalment en grups criminals que busquen un benefici econòmic i en països el objectiu dels quals és obtindre superioritat en àmbits estratègics, com ara el comercial o el militar. Aquests actors exploten les tecnologies, i en particular el ciberespai, per a aconseguir els seus objectius. La present tesi doctoral realitza aportacions significatives a la caracterització dels actors hostils avançats i a la detecció de les seves activitats. L'anàlisi de les seves característiques és bàsic no solament per a conéixer a aquests actors i les seves operacions, sinó per a facilitar el desplegament de contramesures que incrementen la nostra seguretat. La detección de aquestes operacions és el primer pas necessari per a netralitzar-les, i per tant, per a minimitzar el seu impacte. En l'àmbit de la caracterització, aquest treball aprofundeix en l'anàlisi de lestàctiques i tècniques dels actors. Aquesta anàlisi sempre és necessària per a una correcta detecció de les activitats hostils en el ciberespai, però en el cas dels actors avançats, des de grups criminals fins a estats, és obligatòria: les seves activitats són sigiloses, ja que l'éxit de les mateixes es basa, en la major part de casos, en no ser detectats per la víctima. En l'àmbit de la detecció, aquest treball identifica i justifica els requisits clau per a poder establir una capacitat adequada front als actors hostils avançats. Adicionalment, proporciona les tàctiques que han de ser implementades en els Centres d'Operacions de Seguretat per a optimitzar les seves capacitats de detecció i resposta. Hem de destacar que aquestes tàctiques, estructurades en forma de kill-chain, permiteixen no només aquesta optimització, sinò tambié una aproximació homogènia i estructurada comú per a tots els centres defensius. En la meva opinio, una de les bases del meu treball ha de ser l'aplicabilitat dels resultats. Per això, l'anàlisi de táctiques i tècniques dels actors de l'amenaça està alineada amb el principal marc públic de treball per a aquesta anàlisi, MITRE ATT&CK. Els resultats i propostes d'aquesta investigació poden ser directament inclosos en aquest marc, millorant així la caracterització dels actors hostils i les seves activitats en el ciberespai. Addicionalment, les propostes per a millorar la detecció d'aquestes activitats són d'aplicació directa tant als Centres d'Operacions de Seguretat actuals com en les tecnologies de detecció més comuns de la industria. D'aquesta forma, aquest treball millora de forma significativa les capacitats d'anàlisi i detecció actuals, i per tant millora alhora la neutralització d'operacions hostils. Aquestes capacitats incrementen la seguretat global de tot tipus d'organitzacions i, en definitiva, de la nostra societat.[EN] Information and its related technologies are a critical asset to protect for people, organizations and even whole countries. Our dependency on information technologies increases every day, so their security is a key issue for our wellness. The benefits that information technologies provide are questionless, but their usage also presents risks that, linked to our growing dependency on technologies, we must mitigate. Advanced threat actors are mainly categorized in criminal gangs, with an economic goal, and countries, whose goal is to gain superiority in strategic affairs such as commercial or military ones. These actors exploit technologies, particularly cyberspace, to achieve their goals. This PhD Thesis significantly contributes to advanced threat actors' categorization and to the detection of their hostile activities. The analysis of their features is a must not only to know better these actors and their operations, but also to ease the deployment of countermeasures that increase our security. The detection of these operations is a mandatory first step to neutralize them, so to minimize their impact. Regarding characterization, this work delves into the analysis of advanced threat actors' tactics and techniques. This analysis is always required for an accurate detection of hostile activities in cyberspace, but in the particular case of advances threat actors, from criminal gangs to nation-states, it is mandatory: their activities are stealthy, as their success in most cases relies on not being detected by the target. Regarding detection, this work identifies and justifies the key requirements to establish an accurate response capability to face advanced threat actors. In addition, this work defines the tactics to be deployed in Security Operations Centers to optimize their detection and response capabilities. It is important to highlight that these tactics, with a kill-chain arrangement, allow not only this optimization, but particularly a homogeneous and structured approach, common to all defensive centers. In my opinion, one of the main bases of my work must be the applicability of its results. For this reason, the analysis of threat actors' tactics and techniques is aligned with the main public framework for this analysis, MITRE ATT&CK. The results and proposals from this research can be directly included in this framework, improving the threat actors' characterization, as well as their cyberspace activities' one. In addition, the proposals to improve these activities' detection are directly applicable both in current Security Operations Centers and in common industry technologies. In this way, I consider that this work significantly improves current analysis and detection capabilities, and at the same time it improves hostile operations' neutralization. These capabilities increase global security for all kind of organizations and, definitely, for our whole society.Villalón Huerta, A. (2023). Modeling of Advanced Threat Actors: Characterization, Categorization and Detection [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/193855Compendi

    “And all the pieces matter...” Hybrid Testing Methods for Android App's Privacy Analysis

    Get PDF
    Smartphones have become inherent to the every day life of billions of people worldwide, and they are used to perform activities such as gaming, interacting with our peers or working. While extremely useful, smartphone apps also have drawbacks, as they can affect the security and privacy of users. Android devices hold a lot of personal data from users, including their social circles (e.g., contacts), usage patterns (e.g., app usage and visited websites) and their physical location. Like in most software products, Android apps often include third-party code (Software Development Kits or SDKs) to include functionality in the app without the need to develop it in-house. Android apps and third-party components embedded in them are often interested in accessing such data, as the online ecosystem is dominated by data-driven business models and revenue streams like advertising. The research community has developed many methods and techniques for analyzing the privacy and security risks of mobile apps, mostly relying on two techniques: static code analysis and dynamic runtime analysis. Static analysis analyzes the code and other resources of an app to detect potential app behaviors. While this makes static analysis easier to scale, it has other drawbacks such as missing app behaviors when developers obfuscate the app’s code to avoid scrutiny. Furthermore, since static analysis only shows potential app behavior, this needs to be confirmed as it can also report false positives due to dead or legacy code. Dynamic analysis analyzes the apps at runtime to provide actual evidence of their behavior. However, these techniques are harder to scale as they need to be run on an instrumented device to collect runtime data. Similarly, there is a need to stimulate the app, simulating real inputs to examine as many code-paths as possible. While there are some automatic techniques to generate synthetic inputs, they have been shown to be insufficient. In this thesis, we explore the benefits of combining static and dynamic analysis techniques to complement each other and reduce their limitations. While most previous work has often relied on using these techniques in isolation, we combine their strengths in different and novel ways that allow us to further study different privacy issues on the Android ecosystem. Namely, we demonstrate the potential of combining these complementary methods to study three inter-related issues: • A regulatory analysis of parental control apps. We use a novel methodology that relies on easy-to-scale static analysis techniques to pin-point potential privacy issues and violations of current legislation by Android apps and their embedded SDKs. We rely on the results from our static analysis to inform the way in which we manually exercise the apps, maximizing our ability to obtain real evidence of these misbehaviors. We study 46 publicly available apps and find instances of data collection and sharing without consent and insecure network transmissions containing personal data. We also see that these apps fail to properly disclose these practices in their privacy policy. • A security analysis of the unauthorized access to permission-protected data without user consent. We use a novel technique that combines the strengths of static and dynamic analysis, by first comparing the data sent by applications at runtime with the permissions granted to each app in order to find instances of potential unauthorized access to permission protected data. Once we have discovered the apps that are accessing personal data without permission, we statically analyze their code in order to discover covert- and side-channels used by apps and SDKs to circumvent the permission system. This methodology allows us to discover apps using the MAC address as a surrogate for location data, two SDKs using the external storage as a covert-channel to share unique identifiers and an app using picture metadata to gain unauthorized access to location data. • A novel SDK detection methodology that relies on obtaining signals observed both in the app’s code and static resources and during its runtime behavior. Then, we rely on a tree structure together with a confidence based system to accurately detect SDK presence without the need of any a priory knowledge and with the ability to discern whether a given SDK is part of legacy or dead code. We prove that this novel methodology can discover third-party SDKs with more accuracy than state-of-the-art tools both on a set of purpose-built ground-truth apps and on a dataset of 5k publicly available apps. With these three case studies, we are able to highlight the benefits of combining static and dynamic analysis techniques for the study of the privacy and security guarantees and risks of Android apps and third-party SDKs. The use of these techniques in isolation would not have allowed us to deeply investigate these privacy issues, as we would lack the ability to provide real evidence of potential breaches of legislation, to pin-point the specific way in which apps are leveraging cover and side channels to break Android’s permission system or we would be unable to adapt to an ever-changing ecosystem of Android third-party companies.The works presented in this thesis were partially funded within the framework of the following projects and grants: • European Union’s Horizon 2020 Innovation Action program (Grant Agreement No. 786741, SMOOTH Project and Grant Agreement No. 101021377, TRUST AWARE Project). • Spanish Government ODIO NºPID2019-111429RB-C21/PID2019-111429RBC22. • The Spanish Data Protection Agency (AEPD) • AppCensus Inc.This work has been supported by IMDEA Networks InstitutePrograma de Doctorado en Ingeniería Telemática por la Universidad Carlos III de MadridPresidente: Srdjan Matic.- Secretario: Guillermo Suárez-Tangil.- Vocal: Ben Stoc
    corecore