
    Methods for improving entity linking and exploiting social media messages across crises

    Entity Linking (EL) is the task of automatically identifying entity mentions in texts and resolving them to a corresponding entity in a reference knowledge base (KB). A large number of tools are available for different types of documents and domains; however, the entity linking literature has shown that the quality of a tool varies across corpora and depends on specific characteristics of the corpus it is applied to. Moreover, the lack of precision on particularly ambiguous mentions often spoils the usefulness of automated disambiguation results in real-world applications. In the first part of this thesis I explore an approximation of the difficulty of linking entity mentions and frame it as a supervised classification task. Classifying difficult-to-disambiguate entity mentions can facilitate identifying critical cases as part of a semi-automated system, while detecting latent corpus characteristics that affect entity linking performance. Moreover, despite the large number of entity linking tools that have been proposed in recent years, some tools work better on short mentions while others perform better when there is more contextual information. To this end, I proposed a solution that exploits results from distinct entity linking tools on the same corpus, leveraging their individual strengths on a per-mention basis. The proposed solution proved effective and outperformed the individual entity linking systems employed in a series of experiments. An important component in the majority of entity linking tools is the probability that a mention links to one entity in a reference knowledge base, and this probability is usually computed over a static snapshot of a reference KB. However, an entity’s popularity is temporally sensitive and may change due to short-term events. Moreover, these changes might then be reflected in a KB, so EL tools can produce different results for a given mention at different times. 
I investigated the change in prior probability over time and the overall disambiguation performance using different KBs from different time periods. The second part of this thesis is mainly concerned with short texts. Social media has become an integral part of modern society. Twitter, for instance, is one of the most popular social media platforms around the world, enabling people to share their opinions and post short messages about any subject on a daily basis. First, I presented an approach to identifying informative messages during catastrophic events using deep learning techniques. Automatically detecting informative messages posted by users during major events can enable professionals involved in crisis management to better estimate damages using only relevant information posted on social media channels, as well as to act immediately. Moreover, I also performed an analysis of Twitter messages posted during the Covid-19 pandemic. Initially I collected 4 million tweets posted in Portuguese since the beginning of the pandemic and provided an analysis of the debate around the pandemic. I used topic modeling, sentiment analysis and hashtag recommendation techniques to provide insights into the online discussion of the Covid-19 pandemic.

    Recherche d'Information Sociale et Recommandation: Etat d'art et travaux futurs

    The explosion of web 2.0 and social networks has created an enormous and rewarding source of information that has motivated researchers in different fields to exploit it. Our work revolves around the issue of accessing and identifying social information and using it to build a user profile enriched with a social dimension, which is then exploited in a process of personalization and recommendation. We study several approaches to Social IR (Information Retrieval), distinguished by the type of social information incorporated. We also study various social recommendation approaches classified by the type of recommendation. We then present a study of techniques for modeling the social dimension of the user profile, followed by a critical discussion. Finally, we propose our social recommendation approach integrating an advanced social user profile model.

    Expecting space: an enactive and active inference approach to transitions


    A Phenomenological Investigation into the Formation of Primary Delusion

    In this thesis, I seek to provide a systematic phenomenological account of the formation of the delusion characteristic of schizophrenia, i.e., primary delusion. Although there has been a strong phenomenological research tradition that identifies altered basic self experience and mood experience as the precursor experiences that underpin the formation of primary delusion, comparatively few investigations have been carried out with respect to their underlying affective dimension. In this thesis, I employ Husserl’s phenomenology to clarify the nature of the altered affective experience present in the early stage of schizophrenia. To be precise, I focus on the kind of experience wherein a person experiences a pervasive ‘attraction’ or ‘pull’ coming from different temporal modes of experience (past, present and future) and from every insignificant detail of one’s familiar surroundings. I term this kind of experience ‘affective dysregulation experience’. By carefully demonstrating how such an experience could globally alter the way one experiences time, oneself, and the world, I aim to provide an affect-centred phenomenological account that can coherently chart the development of primary delusion from its identified precursor experiences. In developing this affect-centred account, I critically assess and refine the predominant phenomenological accounts of primary delusion formation and further chart a possible way toward a mutual commerce between phenomenologically oriented research and neurobiological research into delusion formation. This thesis is organised into two parts. The first part consists of three chapters. Chapter 1 and Chapter 2 clarify, respectively, the theoretical and the methodological orientation of the current research. Chapter 3 addresses the enduring challenge in providing a phenomenological account of primary delusions: the challenge that primary delusion is, in principle, un-understandable. 
The second half of this thesis critically assesses the predominant contemporary phenomenological account and proposes an affective centred account regarding self-fragmentation (Ch.4), delusional mood (Ch.5), and primary delusion (Ch.6)

    An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping

    Diabetic eye disease is a major cause of blindness worldwide. The ability to monitor relevant clinical trajectories and detect lapses in care is critical to managing the disease and preventing blindness. Alas, much of the information necessary to support these goals is found only in the free text of the electronic medical record. To fill this information gap, we introduce a system for extracting evidence from clinical text of 19 clinical concepts related to diabetic eye disease and inferring relevant attributes for each. In developing this ophthalmology phenotyping system, we are also afforded a unique opportunity to evaluate the effectiveness of clinical language models at adapting to new clinical domains. Across multiple training paradigms, we find that BERT language models pretrained on out-of-distribution clinical data offer no significant improvement over BERT language models pretrained on non-clinical data for our domain. Our study tempers recent claims that language models pretrained on clinical data are necessary for clinical NLP tasks and highlights the importance of not treating clinical language data as a single homogeneous domain. Comment: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 24 pages

    Semantic Annotation and Search: Bridging the Gap between Text, Knowledge and Language

    In recent years, the ever-increasing quantities of entities in large knowledge bases on the Web, such as DBpedia, Freebase and YAGO, pose new challenges but at the same time open up new opportunities for intelligent information access. These knowledge bases (KBs) have become valuable resources in many research areas, such as natural language processing (NLP) and information retrieval (IR). Recently, almost every major commercial Web search engine has incorporated entities into their search process, including Google’s Knowledge Graph, Yahoo!’s Web of Objects and Microsoft’s Satori Graph/Bing Snapshots. The goal is to bridge the semantic gap between natural language text and formalized knowledge. Within the context of globalization, multilingual and cross-lingual access to information has emerged as an issue of major interest. Nowadays, more and more people from different countries are connecting to the Internet, in particular the Web, and many users can understand more than one language. While the diversity of languages on the Web has been growing, for most people there is still very little content in their native language. As a consequence of the ability to understand more than one language, users are also interested in Web content in other languages than their mother tongue. There is an impending need for technologies that can help in overcoming the language barrier for multilingual and cross-lingual information access. In this thesis, we face the overall research question of how to allow for semantic-aware and cross-lingual processing of Web documents and user queries by leveraging knowledge bases. 
With the goal of addressing this complex problem, we provide the following solutions: (1) semantic annotation for addressing the semantic gap between Web documents and knowledge; (2) semantic search for coping with the semantic gap between keyword queries and knowledge; (3) the exploitation of cross-lingual semantics for overcoming the language barrier between natural language expressions (i.e., keyword queries and Web documents) and knowledge for enabling cross-lingual semantic annotation and search. We evaluated these solutions and the results showed advances beyond the state-of-the-art. In addition, we implemented a framework of cross-lingual semantic annotation and search, which has been widely used for cross-lingual processing of media content in the context of our research projects

    Long-term Information Preservation and Access

    An unprecedented amount of information encompassing almost every facet of human activity across the world is generated daily in the form of zeros and ones, and that is often the only form in which such information is recorded. A good fraction of this information needs to be preserved for periods of time ranging from a few years to centuries. Consequently, the problem of preserving digital information over the long term has attracted the attention of many organizations, including libraries, government agencies, scientific communities, and individual researchers. In this dissertation, we address three issues that are critical to ensuring long-term information preservation and access. The first concerns the core requirement of how to guarantee the integrity of preserved contents. Digital information is in general very fragile because of the many ways errors can be introduced, such as hardware and media degradation, hardware and software malfunction, operational errors, security breaches, and malicious alterations. To address this problem, we develop a new approach based on efficient and rigorous cryptographic techniques, which will guarantee the integrity of preserved contents with extremely high probability even in the presence of malicious attacks. Our prototype implementation of this approach has been deployed and actively used in the past years in several organizations, including the San Diego Supercomputer Center, the Chronopolis Consortium, North Carolina State University, and more recently the Government Printing Office. Second, we consider another crucial component in any preservation system: searching and locating information. The ever-growing size of a long-term archive and the temporality of each preserved item introduce a new set of challenges to providing fast retrieval of content based on a temporal query. The widely used cataloguing scheme has serious scalability problems. 
The standard full-text search approach has serious limitations since it does not deal appropriately with the temporal dimension and, in particular, is incapable of performing relevancy scoring according to the temporal context. To address these problems, we introduce two types of indexing schemes: a location indexing scheme and a full-text search indexing scheme. Our location indexing scheme provides optimal operations for inserting and locating a specific version of a preserved item given an item ID and a time point, and our full-text search indexing scheme efficiently handles the scalability problem while supporting relevancy scoring within the temporal context. Finally, we address the problem of organizing inter-related data so that future accesses and data exploration can be performed quickly. In particular, we consider web contents, where we combine a link-analysis scheme with a graph partitioning scheme to place more closely related contents in the same standard web archive container. We conduct experiments that simulate random browsing of preserved contents, and show that our data organization scheme greatly reduces the number of containers that need to be accessed in a random browsing session. Our schemes have been tested against real-world data of significant scale, and validated through extensive empirical evaluations.

    On the Subject of Autism: Lacan, First-Person Writing, and Research

    In his essay, Don’t Mourn for Us, Jim Sinclair describes autism as a “way of being.” He maintains there is “no normal child hidden behind the autism” and that “it colors every experience, every sensation, perception, thought, emotion, and encounter, every aspect of existence.” In an attempt to appreciate the depth of Sinclair’s statements, this thesis approaches autism as a “way of being” through the psychoanalytic theory of Jacques Lacan. By applying Lacan’s conceptual framework to first-person writing and scientific research, I lay an interdisciplinary foundation for the case I make. Although this project requires significant conceptual scaffolding across different epistemological systems, I consider how Lacanian theory possesses a unique capacity to conceive of autism as a way of being and to open new ways of approaching the source material. Implicitly, Sinclair asks that we consider the question of what it means “to be” – autistic, neurotypical, or otherwise. I approach this from the premise that an individual exists as a thinking being, or a “subject.” Because psychoanalysis is concerned with the constitutive role of the unconscious in structuring consciousness, this thesis invests substantial space in consideration of how the Lacanian subject is oriented around a fundamental lack. To this end, I return frequently to Lacan’s concept of objet a, understood as a representative of the subject’s lack in the perceptual realm that is itself lacking. Further, Lacan’s unique interpretation of Freud consists in placing language as the ultimate mediating structure of subjectivity; it both generates lack and establishes a system for mitigating it. 
One’s way of being is always a way of being in language. Given the predominant roles of language and social communication impairments in the DSM-V diagnostic criteria for autism, a main goal of this project is to consider how an autistic way of being entails a unique structuration of lack. Autism and psychoanalysis share a history that extends back to the origins of the diagnosis. I explore this history with a focus on how different psychoanalytic theories conceptualize the autistic subject and to what extent they honor or undermine Sinclair’s position. Contemporary Lacanian thinkers of autism do both. Unique to Lacan’s structural approach, the concept of the Other is inclusive of a radical alterity, yet also the system of language, the body, and certain aspects of the maternal and paternal functions. The subject is unthinkable apart from the Other. I suggest an autistic way of being is discernible in the autistic subject’s relation to each aspect of the Other. I find support for this claim in recent sensorimotor research. Referred to loosely as the movement perspective, this research suggests that differences in how autistic individuals move and perceive others are a “unifying characteristic” of autism. Importantly, the movement perspective is proactively inclusive of first-person knowledge. Read through Lacan’s conceptual framework, movement differences address the underlying mechanism of the autistic subject’s relation to the Other, and thus its way of being. Most fundamentally, this thesis is a work of theory that attempts to articulate something universal about being a subject, without simultaneously eliding what is unique about being an autistic subject.

    Why is virtual reality interesting for philosophers?

    This article explores promising points of contact between philosophy and the expanding field of virtual reality research. Aiming at an interdisciplinary audience, it proposes a series of new research targets by presenting a range of concrete examples characterized by high theoretical relevance and heuristic fecundity. Among these examples are conscious experience itself, “Bayesian” and social VR, amnestic re-embodiment, merging human-controlled avatars and virtual agents, virtual ego-dissolution, controlling the reality/virtuality continuum, the confluence of VR and artificial intelligence (AI) as well as of VR and functional magnetic resonance imaging (fMRI), VR-based social hallucinations and the emergence of a virtual Lebenswelt, religious faith and practical phenomenology. Hopefully, these examples can serve as first proposals for intensified future interaction and mark out some potential new directions for research