2 research outputs found

    Global Entity Ranking Across Multiple Languages

    We present work on building a global long-tailed ranking of entities across multiple languages using Wikipedia and Freebase knowledge bases. We identify multiple features and build a model to rank entities using a ground-truth dataset of more than 10 thousand labels. The final system ranks 27 million entities with 75% precision and 48% F1 score. We provide performance evaluation and empirical evidence of the quality of ranking across languages, and open the final ranked lists for future research. Comment: 2 Pages, 1 Figure, 2 Tables, WWW2017 Companion, WWW 2017 Companio

    Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media

    In this paper, we describe the Lithium Natural Language Processing (NLP) system, a resource-constrained, high-throughput and language-agnostic system for information extraction from noisy user-generated text on social media. Lithium NLP extracts a rich set of information including entities, topics, hashtags and sentiment from text. We discuss several real-world applications of the system currently incorporated in Lithium products. We also compare our system with existing commercial and academic NLP systems in terms of performance, information extracted and languages supported. We show that Lithium NLP is on par with, and in some cases outperforms, state-of-the-art commercial NLP systems. Comment: 9 pages, 6 figures, 2 tables, EMNLP 2017 Workshop on Noisy User Generated Text WNUT 201