    Change of Purpose - The effects of the Purpose Limitation Principle in the General Data Protection Regulation on Big Data Profiling

    Over the past few years, many companies have started to adopt Big Data technologies. Big Data refers to methods and technologies that allow the collection and analysis of huge amounts of all kinds of data, mainly in digital form. Big Data can be used, for example, to create profiles of online shopping users in order to target ads. I call this Big Data Profiling. Facebook and Google, for example, are able to estimate attributes such as gender, age and interests from data provided by their users. This can be worrisome for many users, who feel that their privacy is infringed when, for example, Big Data Profiling companies are able to send them advertisements that are scarily relevant. Big Data Profiling relies on a vast amount of collected data. Often, at the time of collection, it is not clear how exactly this data will be used and analyzed. The new possibilities of Big Data Profiling have led companies to collect as much data as possible and only later figure out how to extract value from it. This model can be described as “collect before select”, since the data is first collected and then “mined” for correlations that can be used to profile users. In this thesis I analyze whether this form of collection and usage of Personal Data is legal under the General Data Protection Regulation (GDPR), which becomes applicable in the European Union on 25 May 2018. While many of the provisions of the GDPR have existed in the Data Protection Directive (DPD) since 1995, they have been reinforced and extended in the GDPR. One of the main principles of the GDPR is that of Purpose Limitation. While the principle already exists under the DPD in a very similar fashion, it is likely to be enforced more strictly under the GDPR, since the GDPR is directly applicable in member states instead of having to be implemented in national law. The enforcement mechanisms, such as sanctions, have also been significantly strengthened. 
The Purpose Limitation Principle requires the data controller (such as companies processing Personal Data, like Facebook and Google) to have a specified purpose for and during the collection of Personal Data. Further, the Personal Data cannot be processed beyond this purpose after it has been collected. This seems to run contrary to Big Data Profiling, which regularly looks for purposes only after the Personal Data has been collected. However, I have identified three potential ways in which the “collect before select” model could still be possible under the GDPR. The first possibility is the anonymization of Personal Data. If data can be effectively anonymized, it will fall outside the scope of the GDPR because it will no longer contain Personal Data. The controller is then free to analyze the data for any purpose, including creating models that could be used to profile other users. However, I found that Big Data methods can often reidentify Personal Data that has previously been anonymized. In such cases even purportedly anonymized data may still fall under the scope of the GDPR. If, on the other hand, enough Personal Data is removed to make reidentification impossible, the value of the data for large parts of the business world is likely destroyed. The second possibility is collecting Personal Data for a specified purpose that is defined so widely that it covers all potential future use cases. If a controller can collect Personal Data for a vague purpose, such as “marketing”, the controller will have a lot of flexibility in using the data while still being covered by the initial purpose. However, I found that the GDPR requires data controllers to have a purpose for the data collection that is specific enough for the data subject to determine exactly which kinds of processing the controller will undertake. Having a non-existent or overly vague purpose is not sufficient under the GDPR. 
Companies that collect data with no purpose, or only a vaguely defined one, and then try to find a specific purpose for the collected data later will therefore have to stop this practice. The third possibility can be used if the controller wants to re-use Personal Data for further purposes after having initially collected it for a specified purpose in compliance with the GDPR. In this case, the GDPR offers certain possibilities for further processing this data outside of the initial purpose. The GDPR allows this, for example, if the data subject has given consent to the new purpose. However, I found that Big Data Profiling companies often come up with new purposes later by “letting the data speak”, that is, by analyzing the data itself to find new purposes. Before performing an analysis, the company often does not even know how the processing will be done later. In that case, it is impossible to request the data subject’s specific consent, which is required under the GDPR. Even without the data subject’s consent, however, there are other possibilities for further processing data under the GDPR, such as determining whether the new processing is compatible with the initial purpose. My thesis examines some of those possibilities for a change of purpose under Big Data Profiling. My conclusion is that the GDPR is likely to have a drastic, limiting impact on Big Data Profiling as we know it. Personal Data cannot be collected without a purpose or with a vague purpose. Even Personal Data that was collected for a specific purpose cannot be re-used for another purpose except in very few circumstances. Time will tell how the courts interpret the GDPR and decide different situations, how companies will adapt to those decisions, and whether the legislator will react to this reality.

    Using artificial intelligence to increase access to justice

    Artificial intelligence is one of the most thriving and exciting areas in research and industry. Recently, approaches using deep learning have led to a number of breakthroughs in a range of areas, including computer vision, machine translation, image recognition and generation, and text understanding and generation (such as GPT-4). In this thesis, I investigate if and how artificial intelligence (AI) can be used to improve access to justice and access to legal information for laypeople, i.e. people without legal training. The average citizen is often overwhelmed and helpless when dealing with legal problems. They may struggle to understand how the law applies to their situation, and further have trouble using the judicial system to resolve their issue, even if they are aware of their rights. This results in their problems going unresolved or prevents them from benefiting from opportunities available to them. For this reason, I developed and implemented the “JusticeBot” methodology, which uses AI to support laypeople with their legal issues. 
The resulting tools use a hybrid rule-based and case-based reasoning approach to ask a user relevant questions, analyze their legal situation, and provide them with suitable legal information and similar previous cases related to their particular legal problem. Users can draw on this information to negotiate a mutually agreeable solution, or to navigate the arduous legal process. Thus, JusticeBot is an augmented intelligence tool, enhancing the user’s knowledge level to help them solve their legal problems. I describe the overall methodology and how I implemented it in software tools, e.g. the “JusticeCreator”, an interface to create and update JusticeBot tools. I also elaborate on a deployed JusticeBot tool in the area of landlord-tenant disputes, which is accessible to the public at https://justicebot.ca. This tool has been used over 17,000 times, and 86% of users responding to a survey report that they would recommend the system to others. I believe that JusticeBot can contribute to helping individuals resolve their legal problems, as well as increasing trust in and identification with legal institutions on a societal level, by improving access to justice and access to legal information for the average citizen.

    Using Large Language Models to Support Thematic Analysis in Empirical Legal Studies

    Thematic analysis and other variants of inductive coding are widely used qualitative analytic methods within empirical legal studies (ELS). We propose a novel framework facilitating effective collaboration of a legal expert with a large language model (LLM) for generating initial codes (phase 2 of thematic analysis), searching for themes (phase 3), and classifying the data in terms of the themes (to kick-start phase 4). We employed the framework for an analysis of a dataset (n=785) of facts descriptions from criminal court opinions regarding thefts. The goal of the analysis was to discover classes of typical thefts. Our results show that the LLM, namely OpenAI's GPT-4, generated reasonable initial codes, and it was capable of improving the quality of the codes based on expert feedback. They also suggest that the model performed well in zero-shot classification of facts descriptions in terms of the themes. Finally, the themes autonomously discovered by the LLM appear to map fairly well to the themes arrived at by legal experts. These findings can be leveraged by legal researchers to guide their decisions in integrating LLMs into their thematic analyses, as well as other inductive coding projects. Comment: 10 pages, 5 figures, 3 tables.

    Explaining Legal Concepts with Augmented Large Language Models (GPT-4)

    Interpreting the meaning of open-textured legal terms is a key task of legal professionals. An important source for this interpretation is how the term was applied in previous court cases. In this paper, we evaluate the performance of GPT-4 in generating factually accurate, clear and relevant explanations of terms in legislation. We compare the performance of a baseline setup, where GPT-4 is directly asked to explain a legal term, to an augmented approach, where a legal information retrieval module is used to provide relevant context to the model, in the form of sentences from case law. We found that the direct application of GPT-4 yields explanations that appear to be of very high quality on their surface. However, detailed analysis uncovered limitations in terms of the factual accuracy of the explanations. Further, we found that the augmentation leads to improved quality, and appears to eliminate the issue of hallucination, where models invent incorrect statements. These findings open the door to the building of systems that can autonomously retrieve relevant sentences from case law and condense them into a useful explanation for legal scholars, educators or practicing lawyers alike.
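
The difference between the baseline and the augmented setup can be illustrated with a minimal Python sketch. It is not the authors' system: the word-overlap ranking is a toy stand-in for the legal information retrieval module, the prompt wording and the function names (`retrieve`, `baseline_prompt`, `augmented_prompt`) are hypothetical, the example sentences are invented, and no call to GPT-4 is made.

```python
import re
from typing import List


def tokenize(text: str) -> set:
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(term: str, sentences: List[str], k: int = 2) -> List[str]:
    """Toy retrieval: keep up to k sentences sharing the most words with the term."""
    query = tokenize(term)
    scored = [(len(query & tokenize(s)), i, s) for i, s in enumerate(sentences)]
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))
    return [s for score, _, s in ranked[:k] if score > 0]


def baseline_prompt(term: str, provision: str) -> str:
    """Baseline setup: the model is asked directly to explain the term."""
    return f"Explain the term '{term}' as used in: {provision}"


def augmented_prompt(term: str, provision: str, sentences: List[str], k: int = 2) -> str:
    """Augmented setup: retrieved case-law sentences are added as context."""
    context = "\n".join(f"- {s}" for s in retrieve(term, sentences, k))
    return (
        f"{baseline_prompt(term, provision)}\n"
        f"Base the explanation only on these case-law sentences:\n{context}"
    )


# Invented example data, for illustration only.
case_law = [
    "The court held that reasonable care depends on the circumstances.",
    "The appeal was dismissed with costs.",
    "A duty of care arises between neighbours.",
]
print(augmented_prompt("reasonable care", "a hypothetical provision", case_law))
```

The resulting string would then be sent to the language model; grounding the prompt in retrieved sentences is what the abstract credits with reducing hallucinated statements.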

    How to treat software in the intellectual property framework

    The software industry is today one of the most important industries, and almost all devices we use contain some piece of software. As software has both a literal expression and an inventive aspect, it has been unclear how it should be protected as intellectual property. Software can be covered both by patents, which protect the idea behind the code, and by copyright, which protects the expression of this idea. A copyrighted program can be examined and the ideas can be reimplemented in a competitor's own “words”, thereby circumventing the copyright restriction. Therefore, many have argued for the use of patents. Patents grant the inventor an exclusive right to use the invention for 20 years, if certain conditions are met (utility, novelty, inventive step). This gives the inventor a head start in commercializing the invention and hopefully allows the investment costs to be recouped. The patent also acts as a knowledge dissemination tool. After the patent has been granted, the claim describing precisely how to implement the invention is made public. However, patents carry the risk of impeding innovation in the industry by granting too large a monopoly to one entity, preventing competition and continued development in that area. In the United States, patent protection of software has gone through several periods. In the 70s and 80s, three cases established a very narrow patent eligibility of software. It was only patentable together with a specific machine or a process that transformed matter into another form. In the 90s, case law opened the door to the patenting of basically all software, as long as it was useful. This led to the granting of hundreds of thousands of software patents. Recently, courts have gone back to not treating software as an invention, only allowing patenting if the implementation is inventive. Views in legal doctrine on how to treat software vary widely. 
Some argue that software is fundamentally different from hardware and should therefore not be patent eligible. They also argue that software patents have harmed the software industry. Others argue that software should be patent eligible in order to increase research and innovation. Yet others argue that a new kind of protection should be introduced for software. I find that software is not fundamentally different from other inventions, and that it should therefore be patent eligible, as long as it meets the requirements for patentability. Many of the court cases should have been decided by applying these criteria instead of excluding software as such. The problems in practice also seem to stem from a misapplication of the inventive step and novelty tests. However, the patent term should be decreased for software patents. The current term reflects an innovation pace unsuitable for the software industry. This would solve many of the practical problems of software patenting.

    The Phi Angle: a theoretical essay on the sense of presence, human factors and performance in virtual reality

    The question of the relationship between the sense of presence and performance in virtual reality is fundamental for anyone wishing to use the tool methodologically. Indeed, if the sense of presence can modify performance per se, then individual factors affecting the human–computer interaction might have repercussions on performance, despite being unrelated to it. After a discussion on the sense of presence and the particularities it provokes, this work studies the psychophysiology of virtual reality. This in virtuo experience is understood according to a constitutive and reciprocal relationship with the subject's cognitive profile, made up of all the human, contextual, and motivational factors impacting the processing of immersion. The role and importance of performance in virtual reality is described in this framework in such a way as to be studied methodologically. The presence–performance relationship is discussed based on previous works and analyzed in terms of attentional resources. Finally, the degree of ecological validity of the performance is described as the factor modulating the relationship between the sense of presence and performance (the Phi Angle). Limitations, applications, and test hypotheses of the model are presented. This work not only aims to help explain the conceptualization of virtual reality, but also to improve its methodological framework.

    Boundary Layer Separation and Reattachment Detection on Airfoils by Thermal Flow Sensors

    A sensor concept for detection of boundary layer separation (flow separation, stall) and reattachment on airfoils is introduced in this paper. Boundary layer separation and reattachment are fluid-mechanical phenomena characterized by the vanishing and even the reversal of the flow velocity along the surface. The flow sensor used in this work is able to measure the flow velocity in terms of direction and magnitude at the sensor’s position and is therefore expected to detect those specific flow conditions. To test this, an array of thermal flow sensors has been integrated (flush-mounted) on an airfoil and placed in a wind tunnel for measurement. Sensor signals have been recorded at different wind speeds and angles of attack for different positions on the airfoil. The sensors used here are based on the change of temperature distribution on a membrane (calorimetric principle). Thermopiles are used as temperature sensors in this approach, offering a baseline-free sensor signal, which is favorable for measurements at zero flow. Measurement results show clear separation points (zero flow) and even negative flow values (back flow) for all sensor positions. In addition to standard silicon-based flow sensors, a polymer-based flexible approach has been tested, showing similar results.
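
How a signed, baseline-free flow signal can indicate separation is sketched below in Python, under stated assumptions. This is not the authors' processing chain: the signal units, the `zero_band` threshold, the function names and the sample readings are all hypothetical. The sketch only illustrates the idea from the abstract that zero flow marks a separation point and negative values mark back flow.

```python
from typing import Optional, Sequence


def classify_flow(signal: float, zero_band: float = 0.05) -> str:
    """Label one signed sensor reading (hypothetical units and threshold)."""
    if signal > zero_band:
        return "attached"   # flow in the main stream direction
    if signal < -zero_band:
        return "backflow"   # reversed flow inside the separated region
    return "zero"           # close to a separation or reattachment point


def find_zero_crossing(xs: Sequence[float], signals: Sequence[float]) -> Optional[float]:
    """Return the x value where the signal first changes from positive to
    non-positive, using linear interpolation, or None if it never does."""
    for i in range(1, len(signals)):
        prev, cur = signals[i - 1], signals[i]
        if prev > 0 and cur <= 0:
            frac = prev / (prev - cur)
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
    return None


# Invented sweep of one sensor over the angle of attack (degrees).
angles = [0.0, 4.0, 8.0, 12.0, 16.0]
readings = [1.0, 0.8, 0.4, -0.4, -0.9]
print(find_zero_crossing(angles, readings))
print([classify_flow(r) for r in readings])
```

Here `xs` may hold angles of attack for a single sensor, as in the example, or chordwise positions of the sensor array at a fixed angle; the zero crossing then estimates the separation angle or the separation location respectively.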

    Disentangling the effects of trait and state worry on error-related brain activity: Results from a randomized controlled trial using worry manipulations

    Enhanced amplitudes of the error-related negativity (ERN) have been suggested to be a transdiagnostic neural risk marker for internalizing psychopathology. Previous studies propose worry to be an underlying mechanism driving the association between enhanced ERN and anxiety. The present preregistered study focused on disentangling possible effects of trait and state worry on the ERN by utilizing a cross-sectional observational design and a longitudinal randomized controlled experimental design. To this end, we examined the ERN of n = 90 students during a flanker task (T0), who were then randomly assigned to one of three groups (worry induction, worry reduction, passive control group). Following the intervention, participants performed another flanker task (T1) to determine potential alterations of their ERN. Manipulation checks revealed that, compared to the control group, state worry increased in the induction group but also in the reduction group. ERN amplitudes did not vary as a function of state worry. An association of trait worry with larger ERN amplitudes was only observed in females. Furthermore, we found larger ERN amplitudes in participants with a current or lifetime diagnosis of internalizing disorders. In summary, our findings suggest that the ERN seems to be insensitive to variations in state worry, but that an elevated ERN is associated with the trait-like tendency to worry and internalizing psychopathology, which is consistent with the notion that the ERN likely represents a trait-like neural risk associated with anxiety.