3,092 research outputs found

    Undergraduate Catalog of Studies, 2023-2024

    Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers

    With the advent of modern pre-trained Transformers, text preprocessing has started to be neglected and is not specifically addressed in recent NLP literature. However, from both a linguistic and a computer science point of view, we believe that even when using modern Transformers, text preprocessing can significantly affect the performance of a classification model. Through this study, we investigate and compare how preprocessing affects the Text Classification (TC) performance of modern and traditional classification models. We report and discuss the preprocessing techniques found in the literature and their most recent variants or applications to TC tasks in different domains. To assess how much preprocessing affects classification performance, we apply the three most referenced preprocessing techniques (alone or in combination) to four publicly available datasets from different domains. Then, nine machine learning models, including modern Transformers, receive the preprocessed text as input. The results show that an educated choice of text preprocessing strategy should be based on the task as well as on the model considered. Outcomes in this survey show that choosing the best preprocessing technique in place of the worst can significantly improve classification accuracy (up to 25%, as in the case of XLNet on the IMDB dataset). In some cases, with a suitable preprocessing strategy, even a simple Naïve Bayes classifier outperformed the best performing Transformer (by 2% in accuracy). We found that preprocessing has a marked impact on TC performance for both Transformers and traditional models.
Our main findings are: (1) even on modern pre-trained language models, preprocessing can affect performance, depending on the datasets and on the preprocessing technique or combination of techniques used; (2) in some cases, with a proper preprocessing strategy, simple models can outperform Transformers on TC tasks; (3) similar classes of models exhibit similar levels of sensitivity to text preprocessing.

    Rights on news: expanding copyright on the internet

    Defence date: 18 February 2020. Examining Board: Prof. Giovanni Sartor, EUI (Supervisor); Prof. Pier Luigi Parcu, EUI; Prof. Lionel Bently, University of Cambridge; Prof. Christophe Geiger, University of Strasbourg. The internet and digital technologies have irreversibly changed the way we find and consume news. Legacy news organisations, publishers of newspapers, have moved to the internet. In the online news environment, however, they are no longer the exclusive suppliers of news. New digital intermediaries have emerged, search engines and news aggregators in particular. They select and display links and fragments of press publishers’ content as a part of their services, without seeking the news organisations’ prior consent. To shield themselves from exploitation by digital intermediaries, press publishers have begun to seek legal protection, and called for the introduction of a new right under the umbrella of copyright and related rights. Following these calls, the press publishers’ right was introduced into the EU copyright framework by the Directive on Copyright in the Digital Single Market in 2019.

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Undergraduate Catalog of Studies, 2022-2023

    The politics of content prioritisation online: governing prominence and discoverability on digital media platforms

    This thesis examines the governing systems and industry practices shaping online content prioritisation processes on digital media platforms. Content prioritisation, and the relative prominence and discoverability of content, are investigated through a critical institutional lens as digital decision guidance processes that shape online choice architecture and influence users’ access to content online. This thesis thus shows how prioritisation is never neutral or static and cannot be explained solely by political economic or neoclassical economics approaches. Rather, prioritisation is dynamically shaped by the institutional environment and by the clash between existing media governance systems and those emerging for platform governance. As prioritisation processes influence how audiovisual media services are accessed online, posing questions about the public interest in such forms of intermediation is key. In that context, this research asks how content prioritisation is governed on digital media platforms, and what the elements of a public interest framework for these practices might be. To address these questions, I use a within case study comparative research design focused on the United Kingdom, collecting data by means of semi-structured interviews and document analysis. Through a thematic analysis, I then investigate how institutional arrangements influence both organisational strategies and interests, as well as the relationships among industry and policy actors involved, namely, platform organisations, pay-TV operators, technology manufacturers, content providers including public service media, and regulators. The results provide insights into the ‘black box’ of content prioritisation across three interconnected dimensions: technical, market, and regulatory. In each dimension, a battle between industry and policy actors emerges to influence prioritisation online. 
As the UK Government and regulator intend to develop new prominence rules, the dispute takes on a normative dimension and gives rise to contested visions of which audiovisual services should be prioritised for final users, and which private- and public-interest-driven criteria are (or should be) used to determine that. Finally, the analysis shows why it is crucial to reflect on how the public interest is interpreted and operationalised as new prominence regulatory regimes emerge with a variety of sometimes contradictory implications for media pluralism, diversity and audience freedom of choice. The thesis therefore indicates the need for new institutional arrangements and a public interest-driven framework for prioritisation on digital media platforms. Such a framework conceives of public interest content standards as an institutional imperative for media and platform organisations and prompts regulators to develop new online content regulation that is appropriate to changing forms of digital intermediation and emerging audiovisual market conditions. While the empirical focus is on the UK, the implications of the research findings are also considered in the light of developments in the European Union and Council of Europe initiatives that bear on the future discoverability of public interest media services and related prominence regimes.

    Second-Person Surveillance: Politics of User Implication in Digital Documentaries

    This dissertation analyzes digital documentaries that utilize second-person address and roleplay to make users feel implicated in contemporary refugee crises, mass incarceration in the U.S., and state and corporate surveillances. Digital documentaries are seemingly more interactive and participatory than linear film and video documentary as they are comprised of a variety of auditory, visual, and written media, utilize networked technologies, and turn the documentary audience into a documentary user. I draw on scholarship from documentary, game, new media, and surveillance studies to analyze how second-person address in digital documentaries is configured through user positioning and direct address within the works themselves, in how organizations and creators frame their productions, and in how users and players respond in reviews, discussion forums, and Let’s Plays. I build on Michael Rothberg’s theorization of the implicated subject to explore how these digital documentaries bring the user into complicated relationality with national and international crises. Visually and experientially implying that users bear responsibility to the subjects and subject matter, these works can, on the one hand, replicate modes of liberal empathy for suffering, distant “others” and, on the other, simulate one’s own surveillant modes of observation or behavior to mirror it back to users and open up one’s offline thoughts and actions as a site of critique. This dissertation charts how second-person address shapes and limits the political potentialities of documentary projects and connects them to a lineage of direct address from educational and propaganda films, museum exhibits, and serious games. 
By centralizing the user’s individual experience, the interventions that second-person digital documentaries can make into social discourse change from public, institution-based education to more privatized forms of sentimental education geared toward personal edification and self-realization. Unless tied to larger initiatives or movements, I argue that digital documentaries reaffirm a neoliberal politics of individual self-regulation and governance instead of public education or collective, social intervention. Chapter one focuses on 360-degree virtual reality (VR) documentaries that utilize the feeling of presence to position users as if among refugees and as witnesses to refugee experiences in camps outside of Europe and various dwellings in European cities. My analysis of Clouds Over Sidra (Gabo Arora and Chris Milk 2015) and The Displaced (Imraan Ismail and Ben C. Solomon 2015) shows how these VR documentaries utilize observational realism to make believable and immersive their representations of already empathetic refugees. The empathetic refugee is often young, vulnerable, depoliticized and dehistoricized and is a well-known trope in other forms of humanitarian media that continues into VR documentaries. Forced to Flee (Zahra Rasool 2017), I am Rohingya (Zahra Rasool 2017), So Leben Flüchtlinge in Berlin (Berliner Morgenpost 2017), and Limbo: A Virtual Experience of Waiting for Asylum (Shehani Fernando 2017) disrupt easy immersions into realistic-looking VR experiences of stereotyped representations and user identifications and, instead, can reflect back the user’s political inaction and surveillant modes of looking. Chapter two analyzes web- and social media messenger-based documentaries that position users as outsiders to U.S. mass incarceration.
Users are noir-style co-investigators into the crime of the prison-industrial complex in Fremont County, Colorado in Prison Valley: The Prison Industry (David Dufresne and Philippe Brault 2009) and co-riders on a bus transporting prison inmates’ loved ones for visitations to correctional facilities in Upstate New York in A Temporary Contact (Nirit Peled and Sara Kolster 2017). Both projects construct an experience of carceral constraint for users to reinscribe seeming “outside” places, people, and experiences as within the continuation of the racialized and classed politics of state control through mass incarceration. These projects utilize interfaces that create a tension between replicating an exploitative hierarchy between non-incarcerated users and those subject to mass incarceration while also de-immersing users in these experiences to mirror back the user’s supposed distance from this mode of state regulation. Chapter three investigates a type of digital game I term dataveillance simulation games, which position users as surveillance agents in ambiguously dystopian nation-states and force users to use their own critical thinking and judgment to construct the criminality of state-sanctioned surveillance targets. Project Perfect Citizen (Bad Cop Studios 2016), Orwell: Keeping an Eye on You (Osmotic Studios 2016), and Papers, Please (Lucas Pope 2013) all create a dual empathy: players empathize with bureaucratic surveillance agents while empathizing with surveillance targets whose emails, text messages, documents, and social media profiles reveal them to be “normal” people. I argue that while these games show criminality to be a construct, they also utilize a racialized fear of the loss of one’s individual privacy to make players feel like they too could be surveillance targets. Chapter four examines personalized digital documentaries that turn users and their data into the subject matter. 
Do Not Track (Brett Gaylor 2015), A Week with Wanda (Joe Derry Hall 2019), Stealing Ur Feelings (Noah Levenson 2019), Alfred Premium (Joël Ronez, Pierre Corbinais, and Émilie F. Grenier 2019), How They Watch You (Nick Briz 2021), and Fairly Intelligent™ (A.M. Darke 2021) track, monitor, and confront users with their own online behavior to reflect back a corporate surveillance that collects, analyzes, and exploits user data for profit. These digital documentaries utilize emotional fear- and humor-based appeals to persuade users that these technologies are controlling them, shaping their desires and needs, and dehumanizing them through algorithmic surveillance.