Global Entity Ranking Across Multiple Languages
We present work on building a global long-tailed ranking of entities across
multiple languages using Wikipedia and Freebase knowledge bases. We identify
multiple features and build a model to rank entities using a ground-truth
dataset of more than 10 thousand labels. The final system ranks 27 million
entities with 75% precision and 48% F1 score. We provide performance evaluation
and empirical evidence of the quality of ranking across languages, and open the
final ranked lists for future research.
Comment: 2 pages, 1 figure, 2 tables, WWW 2017 Companion
Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media
In this paper, we describe the Lithium Natural Language Processing (NLP)
system, a resource-constrained, high-throughput, and language-agnostic system
for information extraction from noisy user-generated text on social media.
Lithium NLP extracts a rich set of information including entities, topics,
hashtags and sentiment from text. We discuss several real world applications of
the system currently incorporated in Lithium products. We also compare our
system with existing commercial and academic NLP systems in terms of
performance, information extracted and languages supported. We show that
Lithium NLP is on par with, and in some cases outperforms, state-of-the-art
commercial NLP systems.
Comment: 9 pages, 6 figures, 2 tables, EMNLP 2017 Workshop on Noisy User
Generated Text WNUT 201