70 research outputs found

    Using graph distances for named-entity linking

    Entity linking is a natural-language-processing task that consists of identifying strings of text that refer to a particular item in some reference knowledge base. When the knowledge base is Wikipedia, the problem is also referred to as wikification (in this case, items are Wikipedia articles). Entity linking conceptually comprises several phases: identifying the portions of text that may refer to an entity (sometimes called "entity detection"), determining a set of concepts (candidates) from the knowledge base that may match each such portion, and choosing one candidate from each set. This last step, known as candidate selection, is the phase on which this paper focuses. One instance of candidate selection can be formalized as an optimization problem on the underlying concept graph, where the quantity to be optimized is the average distance between the selected items. Inspired by this application, we define a new graph problem that is a natural variant of the Maximum Capacity Representative Set. We prove that our problem is NP-hard for general graphs; we propose several heuristics that optimize similar but easier objective functions; and we show experimentally how these approaches perform against some baselines on a real-world dataset. Finally, in the appendix, we show an exact linear-time algorithm that works under more restrictive assumptions.
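    The candidate-selection objective described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm or one of their heuristics: it is an exhaustive search over a toy undirected graph, and it assumes the goal is to minimize the average pairwise shortest-path distance. The graph, the node names, and the functions `bfs_distances` and `select_candidates` are hypothetical.

```python
from collections import deque
from itertools import combinations, product


def bfs_distances(graph, source):
    """Unweighted shortest-path distances from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def select_candidates(graph, candidate_sets):
    """Choose one candidate per mention, minimizing average pairwise distance.

    Exhaustive search: the cost grows exponentially with the number of
    mentions, so this is only usable on toy inputs.
    """
    nodes = {c for candidates in candidate_sets for c in candidates}
    dist = {n: bfs_distances(graph, n) for n in nodes}
    best, best_cost = None, float("inf")
    for choice in product(*candidate_sets):
        pairs = list(combinations(choice, 2))
        if pairs:
            # Unreachable pairs count as infinitely distant.
            total = sum(dist[a].get(b, float("inf")) for a, b in pairs)
            cost = total / len(pairs)
        else:
            cost = 0.0
        if best is None or cost < best_cost:
            best, best_cost = choice, cost
    return best, best_cost


# Hypothetical concept graph (undirected, stored as adjacency lists).
graph = {
    "Paris_(city)": ["France", "Eiffel_Tower"],
    "France": ["Paris_(city)", "Eiffel_Tower"],
    "Eiffel_Tower": ["Paris_(city)", "France"],
    "Paris_Hilton": ["Hilton_Hotels"],
    "Hilton_Hotels": ["Paris_Hilton"],
}
# One candidate set per detected mention.
candidate_sets = [
    ["Paris_(city)", "Paris_Hilton"],
    ["France"],
    ["Eiffel_Tower"],
]
best, cost = select_candidates(graph, candidate_sets)
print(best, cost)
```

    Because the abstract states that the problem is NP-hard on general graphs, a brute-force search like this one only serves to make the objective concrete.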

    Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages

    Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well studied in English SA systems but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research.
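    A counterfactual test of the kind this abstract describes can be sketched as follows. This is a generic illustration, not the authors' corpus or code: the templates, the word lists, the deliberately biased `toy_score` function, and the helper `counterfactual_gaps` are all hypothetical. The idea is to score sentence pairs that differ only in a demographic term and to inspect the score differences.

```python
POSITIVE = {"great", "good"}
NEGATIVE = {"terrible", "bad"}


def toy_score(text):
    """Hypothetical lexicon scorer with an artificial bias against 'she'."""
    tokens = text.lower().split()
    score = float(sum(t in POSITIVE for t in tokens)
                  - sum(t in NEGATIVE for t in tokens))
    if "she" in tokens:
        score -= 0.5  # injected bias, so that the test has something to find
    return score


def counterfactual_gaps(score, templates, group_a, group_b):
    """Score differences for sentence pairs that differ only in one term."""
    gaps = []
    for template in templates:
        for a, b in zip(group_a, group_b):
            gaps.append(score(template.format(person=a))
                        - score(template.format(person=b)))
    return gaps


templates = [
    "{person} is a great engineer",
    "{person} made a terrible mistake",
]
gaps = counterfactual_gaps(toy_score, templates, ["he"], ["she"])
mean_gap = sum(gaps) / len(gaps)
print(gaps, mean_gap)
```

    An unbiased scorer would give a gap of zero for every pair, so a consistently nonzero gap indicates that the swapped term alone changes the prediction.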

    A comparative performance evaluation of different implementations of the SOAP protocol

    This paper presents a performance evaluation of the SOAP [1] protocol across two different implementations: Java (Axis2) [2] and Erlang. The comparison was carried out using several testbeds with input and output data of different sizes. More concretely, we developed three different web services representing typical scenarios likely to be found in real environments. The evaluation is two-fold: we measured both the number of requests per second answered by each server (throughput) and the response to a common server workload that mixes stress and stand-by phases. The Erlang [3] functional programming language claims to be specifically designed for distributed, reliable, soft real-time concurrent systems. Moreover, its built-in lightweight process management and ease of replication within distributed environments make Erlang an appealing choice for service-oriented architectures (SOAs) [4]. We compared this new approach with the well-known Apache Axis2 project, which is widely used in the Web Services field by the Java community. This work allows us to conclude that the Erlang server is more suitable when the computational cost of the web service is low, whereas the Axis2 server is more efficient as the service workload increases.

    IntentsKB: A Knowledge Base of Entity-Oriented Search Intents

    We address the problem of constructing a knowledge base of entity-oriented search intents. Search intents are defined at the level of entity types, each comprising a high-level intent category (property, website, service, or other) along with a cluster of query terms used to express that intent. These machine-readable statements can be leveraged in various applications, e.g., for generating entity cards or query recommendations. By structuring service-oriented search intents, we take one step towards making entities actionable. The main contribution of this paper is a pipeline of components we develop to construct a knowledge base of entity intents. We evaluate performance both component-wise and end-to-end, and demonstrate that our approach is able to generate high-quality data.
    Comment: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM'18), 2018. 4 pages, 2 figures.

    Information extraction from multimedia web documents: an open-source platform and testbed

    The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents and, unlike earlier platforms, exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.

    Liver X Receptor Activation with an Intranasal Polymer Therapeutic Prevents Cognitive Decline without Altering Lipid Levels

    The progressive accumulation of amyloid-beta (Aβ) in specific areas of the brain is a common prelude to late-onset Alzheimer's disease (AD). Although activation of liver X receptors (LXR) with agonists decreases Aβ levels and ameliorates contextual memory deficits, concomitant hypercholesterolemia/hypertriglyceridemia limits their clinical application. DMHCA (N,N-dimethyl-3β-hydroxycholenamide) is an LXR partial agonist that induces the expression of apolipoprotein E (chiefly responsible for Aβ drainage from the brain) without increasing cholesterol/triglyceride levels, yet it shows no activity in vivo because of its low solubility and inability to cross the blood-brain barrier. Herein, we describe a polymer therapeutic for the delivery of DMHCA. The covalent incorporation of DMHCA into a PEG-dendritic scaffold via carboxylate esters produces an amphiphilic copolymer that efficiently self-assembles into nanometric micelles, which exert a biological effect in primary cultures of the central nervous system (CNS) and in experimental animals via the intranasal route. After the CNS biodistribution and effective doses of DMHCA micelles were determined in nontransgenic mice, a transgenic AD-like mouse model of cerebral amyloidosis was treated with the micelles for 21 days. The benefits of the treatment included prevention of memory deterioration and a significant reduction of hippocampal Aβ oligomers without affecting plasma lipid levels. These results represent a proof of principle for further clinical development of DMHCA delivery systems.
    Affiliations:
    Navas Guimaraes, Maria Eugenia: Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Juan; Argentina. Universidad Catolica de Cuyo. Facultad de Ciencias Medicas. Instituto de Investigacion En Ciencias Biomedicas; Argentina
    Lopez Blanco, Roi: Universidad de Santiago de Compostela; Spain
    Correa, Juan: Universidad de Santiago de Compostela; Spain
    Fernandez Villamarin, Marcos: Universidad de Santiago de Compostela; Spain
    Bistue Millon, Maria Beatriz: Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Juan; Argentina. Universidad Catolica de Cuyo. Facultad de Ciencias Medicas. Instituto de Investigacion En Ciencias Biomedicas; Argentina
    Martino Adami, Pamela Victoria: Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
    Morelli, Laura: Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
    Kumar, Vijay: University of Colorado; United States
    Wempe, Michael F.: University of Colorado; United States
    Cuello, A. C.: McGill University; Canada
    Fernandez Megia, Eduardo: Universidad de Santiago de Compostela; Spain
    Bruno, Martin: Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Juan; Argentina. Universidad Catolica de Cuyo. Facultad de Ciencias Medicas. Instituto de Investigacion En Ciencias Biomedicas; Argentina