thesis

NAMED ENTITY RECOGNITION AND CLASSIFICATION FOR NATURAL LANGUAGE INPUTS AT SCALE

Abstract

Natural language processing (NLP) is a technique by which computers can analyze, understand, and derive meaning from human language. Phrases in a body of natural text that represent names, such as those of persons, organizations, or locations, are referred to as named entities. Identifying and categorizing these named entities remains a challenging task that has been the subject of research for many years. In this project, we build a supervised-learning-based classifier that performs named entity recognition and classification (NERC) on input text, and we implement it as part of a chatbot application. The implementation is then scaled out to handle very high-velocity concurrent inputs and deployed on two clusters of different sizes. We evaluate performance for various input loads and configurations and compare the observations to determine an optimal environment.
