
    Automatic fake news detection on Twitter

    Nowadays, information is easily accessible online, from articles by reliable news agencies to reports from independent reporters, to extreme views published by unknown individuals. Moreover, social media platforms are becoming increasingly important in everyday life, where users can obtain the latest news and updates, share links to any information they want to spread, and post their own opinions. Such information may create difficulties for information consumers as they try to distinguish fake news from genuine news. Indeed, users may not be aware that the information they encounter is false, and may not have the time or energy to fact-check all the claims and information they encounter online. With the amount of information created and shared daily, it is also not feasible for journalists to manually fact-check every published news article, sentence or tweet. Therefore, an automatic fact-checking system that identifies check-worthy claims and tweets, and then fact-checks these identified check-worthy claims and tweets, can help inform the public about fake news circulating online. Existing fake news detection systems mostly rely on the computational power of machine learning models to automatically identify fake news. Some researchers have focused on extracting the semantic and contextual meaning from news articles, statements, and tweets. These methods aim to identify fake news by analysing the differences in writing style between fake news and factual news. Other researchers have investigated using social network information to detect fake news accurately. These methods aim to distinguish fake news from factual news based on the spreading pattern of the news and statistical information about the users who engage with the propagated news.
In this thesis, we propose a novel end-to-end fake news detection framework that leverages both the textual features and the social network features that can be extracted from news, tweets, and their engaging users. Specifically, our proposed end-to-end framework is able to process a Twitter feed, identify check-worthy tweets and sentences using textual features and embedded entity features, and fact-check the claims using previously unexplored information, such as existing fake news collections and user network embeddings. Our ultimate aim is to rank tweets and claims based on their check-worthiness, so as to focus the available computational power on fact-checking the tweets and claims that are important and potentially fake. In particular, we leverage existing fake news collections to identify recurring fake news, while we explore Twitter users’ engagement with the check-worthy news to identify fake news that is spreading on Twitter. To identify fake news effectively, we first propose the fake news detection framework (FNDF), which consists of a check-worthiness identification phase and a fact-checking phase. These two phases are divided into three tasks: Phase 1, Task 1: check-worthiness identification; Phase 2, Task 2: recurring fake news identification; and Phase 2, Task 3: social network structure-assisted fake news detection. We conduct experiments on two large publicly available datasets, namely the MM-COVID and stance detection (SD) datasets. The experimental results show that our proposed framework, FNDF, can indeed identify fake news more effectively than existing SOTA models, with significant F1 score increases of 23.2% and 4.0% on the two tested datasets, respectively.
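
Read as a system, the framework just described is essentially a two-phase pipeline: rank incoming tweets by check-worthiness, then pass only the top-ranked items to the two fact-checking tasks. The following minimal sketch illustrates that control flow only; the Tweet class, the model interfaces, the score fusion and the top_k cut-off are hypothetical placeholders of mine, not the implementation evaluated in the thesis.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Tweet:
    text: str
    user_id: str

def fndf_pipeline(tweets: List[Tweet],
                  check_worthiness_model,     # Phase 1, Task 1 (hypothetical interface)
                  recurring_fake_news_model,  # Phase 2, Task 2 (hypothetical interface)
                  network_assisted_model,     # Phase 2, Task 3 (hypothetical interface)
                  top_k: int = 100) -> List[dict]:
    """Illustrative control flow for the two-phase framework sketched in the abstract."""
    # Phase 1: rank tweets by check-worthiness and keep only the top-k, so the
    # available computational power is spent on the most important claims.
    ranked = sorted(tweets, key=lambda t: check_worthiness_model.score(t.text), reverse=True)
    check_worthy = ranked[:top_k]

    results = []
    for tweet in check_worthy:
        # Phase 2, Task 2: match the claim against an existing fake news collection.
        recurring_score = recurring_fake_news_model.score(tweet.text)
        # Phase 2, Task 3: score the claim from embeddings of the users who engage with it.
        network_score = network_assisted_model.score(tweet)
        # Fuse the two fact-checking signals (a simple average, as a placeholder).
        results.append({"tweet": tweet.text,
                        "fake_probability": (recurring_score + network_score) / 2})
    return results

# Minimal duck-typed stand-ins so the pipeline can be exercised end to end.
class ConstantModel:
    def __init__(self, value: float):
        self.value = value
    def score(self, _):
        return self.value

demo = fndf_pipeline([Tweet("Claim A", "u1"), Tweet("Claim B", "u2")],
                     ConstantModel(0.9), ConstantModel(0.7), ConstantModel(0.4), top_k=1)
print(demo)
```
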
To identify the check-worthy tweets and claims effectively, we incorporate embedded entities with language representations to form a vector representation of a given text, and use it to identify whether the text is check-worthy or not. We conduct experiments using three publicly available datasets, namely the CLEF 2019 and 2020 CheckThat! Lab check-worthy sentence detection datasets, and the CLEF 2021 CheckThat! Lab check-worthy tweet detection dataset. The experimental results show that combining entity representations with language model representations enhances the language model’s performance in identifying check-worthy tweets and sentences. Specifically, combining embedded entities with the language model results in an increase of as much as 177.6% in MAP on ranking check-worthy tweets, and a 92.9% increase on ranking check-worthy sentences. Moreover, we conduct an ablation study on the proposed end-to-end framework, FNDF, and show that including a model for identifying check-worthy tweets and claims in our end-to-end framework can significantly increase the F1 score by as much as 14.7%, compared to not including this model in our framework.
To identify recurring fake news effectively, we propose an ensemble of BM25 scores and the BERT language model. Experiments are conducted on two datasets, namely the WSDM Cup 2019 Fake News Challenge dataset and the MM-COVID dataset. The experimental results show that enriching the BERT language model with BM25 scores helps the BERT model identify fake news significantly more accurately, by 4.4%. Moreover, the ablation study on the end-to-end fake news detection framework, FNDF, shows that including the recurring fake news identification model in our proposed framework results in a significant increase in F1 score of as much as 15.5%, compared to not including this task in our framework.
To leverage the user network structure in detecting fake news, we first obtain user embeddings from unsupervised user network embeddings based on the users’ friendship or follower connections on Twitter. Next, we use the embeddings of the users who engaged with the news to represent a check-worthy tweet/claim, and thus predict whether it is fake news. Our results show that using user network embeddings to represent check-worthy tweets/sentences significantly outperforms the SOTA model, which uses language models to represent the tweets/sentences and complex networks requiring handcrafted features, by 12.0% in terms of the F1 score. Furthermore, including the user network-assisted fake news detection model in our end-to-end framework, FNDF, significantly increases the F1 score by as much as 29.3%.
Overall, this thesis shows that an end-to-end fake news detection framework, FNDF, which identifies check-worthy tweets and claims and then fact-checks them, by identifying recurring fake news and leveraging the social network users’ connections, can effectively identify fake news online.
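
One way to picture the BM25/BERT ensemble mentioned above is as retrieval plus matching against an existing fake news collection: BM25 proposes the most lexically similar known fake claim, and a BERT cross-encoder scores how strongly the incoming claim matches it. The sketch below illustrates that idea under assumptions of mine; the rank_bm25 and Hugging Face transformers libraries, the bert-base-uncased checkpoint, the toy collection and the weighted score fusion are illustrative choices, not the exact ensemble used in the thesis.

```python
from rank_bm25 import BM25Okapi
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# A small collection of previously fact-checked fake claims (invented examples).
fake_news_collection = [
    "Drinking hot water cures the virus within a day.",
    "The vaccine contains a tracking microchip.",
    "5G towers spread the disease through radio waves.",
]

# BM25 over the tokenised collection proposes lexically similar known fake claims.
bm25 = BM25Okapi([doc.lower().split() for doc in fake_news_collection])

# A BERT cross-encoder scores how strongly the claim matches the retrieved item.
# (In practice this classification head would be fine-tuned on claim/fake-news pairs.)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def recurring_fake_news_score(claim: str, alpha: float = 0.5) -> float:
    """Blend a squashed BM25 score with a BERT matching probability (illustrative fusion)."""
    bm25_scores = bm25.get_scores(claim.lower().split())
    best_idx = int(bm25_scores.argmax())
    bm25_raw = float(bm25_scores[best_idx])
    bm25_score = bm25_raw / (1.0 + bm25_raw)  # crude squashing into [0, 1)

    inputs = tokenizer(claim, fake_news_collection[best_idx],
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    bert_prob = torch.softmax(logits, dim=-1)[0, 1].item()

    return alpha * bm25_score + (1 - alpha) * bert_prob

print(recurring_fake_news_score("Hot water can cure the virus in one day"))
```
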

    Acts of killing, acts of meaning: an application of corpus pattern analysis to language of animal-killing

    We are currently witnessing unprecedented levels of ecological destruction and violence visited upon nonhumans. Study of the more-than-human world is now being enthusiastically taken up across a range of disciplines, in what has been called the ‘scholarly animal turn’. This thesis brings together concerns of Critical Animal Studies – along with related threads of posthumanism and new materialist thinking – and Corpus Linguistics, specifically Corpus Pattern Analysis (CPA), to produce a data-driven, lexicocentric study of the discourse of animal-killing. CPA, which has been employed predominantly in corpus lexicography, provides a robust and empirically well-founded basis for the analysis of verbs. Verbs are chosen as they act as the pivot of a clause; analysing them also uncovers their arguments – in this case, participants in material-discursive ‘killing’ events. This project analyses 15 ‘killing’ verbs using CPA as a basis, in what I term a corpus-lexicographical discourse analysis. The data is sampled from an animal-themed corpus of around 9 million words of contemporary British English, and the British National Corpus is used for reference. The findings are both methodological and substantive. CPA is found to be a reliable empirical starting point for discourse analysis, and the lexicographical practice of establishing linguistic ‘norms’ is critical to the identification of anomalous uses. The thesis presents evidence of anthropocentrism inherent in the English lexicon, and demonstrates several ways in which distance is created between participants of ‘killing’ constructions. The analysis also reveals specific ways that verbs can obfuscate, deontologise and deindividualise their arguments. The recommendations, for discourse analysts, include the adoption of CPA and a critical analysis of its resulting patterns in order to demonstrate the precise mechanisms by which verb use can either oppress or empower individuals. Social justice advocates are also alerted to potentially harmful language that might undermine their cause.

    The Pollinating Mesh: The Ecological Thought in Indigenous Australian Speculative Fiction

    This thesis studies how the mesh, or the idea of interconnectedness among all beings, humans and nonhumans, pollinates Indigenous Australian speculative fiction, and how the aesthetics of these texts warrants their reading as sites of these enmeshments. It aims to put this literature in the context of Indigenous cosmologies, epistemologies, ontologies, or metaphysics to establish how these underpin Indigenous literature and frame its reading. To attend to the global pertinence of both the texts under study and the ecological thought as the main conceptual framework, the thesis engages Object Oriented Ontology and adjacent theories of the ontological turn alongside trans-national Indigenous critical thought. Thus, analysing Alexis Wright’s The Swan Book (2013), Ambelin Kwaymullina’s The Interrogation of Ashala Wolf (2010), The Disappearance of Ember Crow (2013), and The Foretelling of Georgie Spider (2015), as well as Kim Scott’s Benang: From the Heart (1999), allows me to establish that the ecological thought thematically informs them in diverse but interlinked ways. The ecological thought establishes enmeshments among all beings in what I posit as the aesthetics and poetics of the uncanny, to capture Alexis Wright’s writing as leading us to think ecologically about the enmeshments of all beings in irreducible ways. All beings’ enmeshments attune us to seeking and finding our kin among all beings, and I explore this in Ambelin Kwaymullina’s trilogy. In Kwaymullina’s work, I argue that all beings’ enmeshments see Indigenous survivance as aesthetically coalescing with Indigenous dreams, which are speculatively manoeuvred and explored as the interface between the real and the surreal, the material and the spiritual, to enact all beings’ enmeshments. The texts thus enact speculative worlds of enmeshments wherein humans, nonhumans, the organic, the synthetic (AI), the alive, the dead, the undead, the spiritual, and the nonliving depend on and become with one another for life, survival and survivance. Kwaymullina’s trilogy ultimately enacts a community of beings mediated by thinking about interconnectedness as becoming with and part of one another. The implication of such a way of thinking brings us to rethink what it means to (not) be, and the hauntings of identity, from the perspective of Indigenous ecological thinking, which my intervention pursues in the reading of Kim Scott’s Benang: From the Heart. The core of my intervention on Benang establishes it as a wellspring of onto-epistemic affordances for understanding being and identity as fluid, floating, permeable, leaking, never rigid or definitive. My reading stages how all beings’ enmeshments enhance the protagonist Harley’s regeneration of his effaced Aboriginal identity as an ontological and identity transformation through blood memory, listening to and reading about his family stories, and encountering and becoming with Country and his Aboriginal culture in its material and spiritual aspects. The thesis ends by interrogating my own speaking position as a postcolonial African reader-critic, and what it means for an African reader to engage with Indigenous cosmologies and epistemologies and to meet these literary texts and their philosophical underpinnings.
I establish similarities between both worlds and argue that such African texts as Daniel Fagunwa’s Forest of a Thousand Daemons, Tutuola’s The Palm-Wine Drinkard and My Life in the Bush of Ghosts, and Okri’s The Famished Road trilogy epitomise worlds that equally register aesthetics and poetics similar to those in Indigenous Australian literature.

    A real time urban sustainability assessment framework for the smart city paradigm

    Cities have proven to be a great source of concern regarding their impact on the world's environment and ecosystems. The objective, in a context where environmental concerns are growing rapidly, is no longer simply to develop liveable cities but to develop sustainable and responsive cities. This study investigates the currently available urban sustainability assessment (USA) schemes and outlines the main issues that the field is facing. After an extensive literature review, the author advocates for a more user-centred and transparent scheme that would dynamically capture sustainability insights about urban areas during their operation. The methodological approach enabled the construction of solid expertise on urban sustainability indicators, on the essential role of the smart city and the Internet of Things in determining and assessing key performance indicators in real time, and on the technical and organisational challenges that such a solution would encounter. Key domains that could support real-time urban sustainability assessment have been studied, such as sensing networks, remote sensing and GIS technologies, BIM technologies, statistical databases and open governmental data platforms, crowdsourcing, and data mining. Additionally, the use of semantic web technologies has been investigated as a means to deal with the heterogeneity of sources with diverse data structures and with their interoperability. A USA ontology has been designed, integrating existing ontologies such as SSN, ifcOWL, CityGML and GeoSPARQL. A web application back-end has then been built around this ontology. The application backbone is an Ontology-Based Data Access (OBDA) layer in which a relational database is mapped to the USA ontology, enabling sensor data to be linked to information about the urban environment. Overall, this study has contributed to the body of knowledge by introducing an OBDA approach to support real-time urban sustainability assessment leveraging sensor networks. It addresses both technical and organisational challenges that the smart systems domain is facing, and is believed to be a valuable approach in the upcoming smart city paradigm. The solution proposed to tackle the research questions still faces some limitations, such as the limited validation of the USA scheme, the limited intelligence of the OBDA layer, the improvable conversion of BIM and CityGML models to RDF, and the lack of a user interface. Future work should be carried out to overcome these limitations and to provide stakeholders with a high-end service.
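
To make the OBDA idea more concrete, the sketch below builds a toy RDF graph in which sensor readings are described using a stand-in for the USA ontology and then queried with SPARQL; in a full OBDA deployment the triples would instead be virtual views over the relational database. The namespace and the class and property names (usa:KeyPerformanceIndicator, usa:hasValue, and so on) are hypothetical placeholders, not the ontology designed in this study; only the rdflib API used here is real.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

# Hypothetical namespace standing in for the USA ontology described in the abstract.
USA = Namespace("http://example.org/usa#")

g = Graph()
g.bind("usa", USA)

# In a real OBDA setup these triples would be virtual views over a relational
# database of sensor readings; here they are asserted directly for illustration.
g.add((USA.airQuality_station42, RDF.type, USA.KeyPerformanceIndicator))
g.add((USA.airQuality_station42, USA.indicatorName, Literal("PM2.5 concentration")))
g.add((USA.airQuality_station42, USA.hasValue, Literal(18.4, datatype=XSD.double)))
g.add((USA.airQuality_station42, USA.observedBy, USA.sensor_42))

# SPARQL query retrieving each sustainability KPI together with its current value.
query = """
PREFIX usa: <http://example.org/usa#>
SELECT ?indicator ?name ?value
WHERE {
    ?indicator a usa:KeyPerformanceIndicator ;
               usa:indicatorName ?name ;
               usa:hasValue ?value .
}
"""

for indicator, name, value in g.query(query):
    print(f"{name}: {value}")
```
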

    Intelligence artificielle: Les défis actuels et l'action d'Inria - Livre blanc Inria

    Livre blanc Inria N°01. International audience. Inria white papers look at major current challenges in informatics and mathematics and show actions conducted by our project-teams to address these challenges. This document is the first produced by the Strategic Technology Monitoring & Prospective Studies Unit. Thanks to a reactive observation system, this unit plays a lead role in supporting Inria to develop its strategic and scientific orientations. It also enables the institute to anticipate the impact of digital sciences on all social and economic domains. It has been coordinated by Bertrand Braunschweig with contributions from 45 researchers from Inria and from our partners. Special thanks to Peter Sturm for his precise and complete review, and to the STIP service of the Saclay – Île-de-France centre for the final correction of the French version.

    Organising knowledge in the age of the semantic web: a study of the commensurability of ontologies

    This study is directed towards the problem of conceptual translation across different data management systems and formats, with a particular focus on those used in the emerging world of the Semantic Web. Increasingly, organisations have sought to connect information sources and services within and beyond their enterprise boundaries, building upon existing Internet facilities to offer improved research, planning, reporting and management capabilities. The Semantic Web is an ambitious response to this growing demand, offering a standards-based platform for sharing, linking and reasoning with information. The imagined result, a globalised knowledge network formed out of mutually referring data structures termed "ontologies", would make possible new kinds of queries, inferences and amalgamations of information. Such a network, though, is premised upon large numbers of manually drawn links between these ontologies. In practice, establishing these links is a complex translation task requiring considerable time and expertise; invariably, as ontologies and other structured information sources are published, many useful connections are neglected. To combat this, in recent years substantial research has been invested into "ontology matching" - the exploration of algorithmic approaches for automatically translating or aligning ontologies. These approaches, which exploit the explicit semantic properties of individual concepts, have registered impressive precision and recall results against human-engineered translations. However, they are unable to make use of background cultural information about the overall systems in which those concepts are housed - how those systems are used, for what purpose they were designed, what methodological or theoretical principles underlined their construction, and so on. The present study investigates whether paying attention to these sociological dimensions of electronic knowledge systems could supplement algorithmic approaches in some circumstances. Specifically, it asks whether a holistic notion of commensurability can be useful when aligning or translating between such systems.
The first half of the study introduces the problem, surveys the literature, and outlines the general approach. It then proposes both a theoretical foundation and a practical framework for assessing the commensurability of ontologies and other knowledge systems. Chapter 1 outlines the Semantic Web, ontologies and the problem of conceptual translation, and poses the key research questions. Conceptual translation can be treated as, by turns, a social, philosophical, linguistic or technological problem; Chapter 2 surveys a correspondingly wide range of literature and approaches. The methods employed by the study are described in Chapter 3. Chapter 4 critically examines theories of conceptual schemes and commensurability, while Chapter 5 describes the framework itself, comprising a series of specific dimensions, a broad methodological approach, and a means for generating both qualitative and quantitative assessments.
The second half of the study then explores the notion of commensurability through several empirical frames. Chapters 6 to 8 apply the framework to a series of case studies. Chapter 6 presents a brief history of knowledge systems, and compares two of these systems - relational databases and Semantic Web ontologies. Chapter 7, in turn, compares several "upper-level" ontologies - reusable schematisations of abstract concepts like Time and Space. Chapter 8 reviews a recent, widely publicised controversy over the standardisation of document formats. This analysis in particular shows how the opaque, dry world of technical specifications can reveal the complex network of social dynamics, interests and beliefs which coordinate and motivate them. Collectively, these studies demonstrate that the framework is useful in making evident the assumptions which motivate the design of different knowledge systems and, further, in assessing the commensurability of those systems. Chapter 9 then presents a further empirical study; here, the framework is implemented as a software system and pilot tested among a small cohort of researchers. Finally, Chapter 10 summarises the argumentative trajectory of the study as a whole - that, broadly, an elaborated notion of commensurability can tease out important and salient features of translation inscrutable to purely algorithmic methods - and suggests some possibilities for further work.
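
For readers unfamiliar with the "ontology matching" approaches the study positions itself against, the sketch below aligns concepts from two toy ontologies purely by lexical similarity of their labels: the kind of explicit, concept-level signal algorithmic matchers exploit, and which says nothing about how the systems are used, why they were designed, or what principles underlie them. The concept labels and the threshold are invented for illustration.

```python
from difflib import SequenceMatcher
from itertools import product

# Two toy ontologies, represented only by their concept labels (invented examples).
ontology_a = ["Person", "PostalAddress", "Organisation", "BirthDate"]
ontology_b = ["Agent", "Address", "Organization", "DateOfBirth"]

def label_similarity(a: str, b: str) -> float:
    """Lexical similarity of two concept labels, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def align(source: list[str], target: list[str], threshold: float = 0.6) -> list[tuple]:
    """Propose concept correspondences whose label similarity exceeds a threshold."""
    matches = []
    for s, t in product(source, target):
        score = label_similarity(s, t)
        if score >= threshold:
            matches.append((s, t, round(score, 2)))
    return sorted(matches, key=lambda m: m[2], reverse=True)

for source_concept, target_concept, score in align(ontology_a, ontology_b):
    print(f"{source_concept}  <->  {target_concept}   (similarity {score})")
```
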
