
    A Historical Gazetteer of American Summer Camps

    Gazetteers (dictionaries of place names, their classifications, and locations) are fundamental to GIS. Historical gazetteers especially are an important resource for aggregating knowledge about places across time, and allow for types of data analysis possible only at scale. With the renewed interest in gazetteers as tools for the digital humanities, there has been a rise in domain-specific gazetteers. One sphere that has yet to develop a historical gazetteer is the organized camping industry. Organized camping, termed thus to distinguish it from the less structured and formalized forms of family camping or backpacking, originated in the late 19th century in the United States and has since spread across the globe. The available primary source material (annual directories and guidebooks dating back to the 1920s) particularly lends itself to the creation of a gazetteer of summer camps in the US. To make the creation of such a gazetteer possible, this project developed a text mining program to turn early editions of the Porter Sargent Handbook of Summer Camps (the most comprehensive camp directories) into the foundations of a gazetteer. Once expanded and enriched, this geodatabase will serve as a resource for the American Camp Association (ACA), the industry’s primary professional organization.
    Master of Science in Information Science

    Enhancing the Performance of Telugu Named Entity Recognition Using Gazetteer Features

    Named entity recognition (NER) is a fundamental step for many natural language processing tasks and hence enhancing the performance of NER models is always appreciated. With limited resources being available, NER for South-East Asian languages like Telugu is quite a challenging problem. This paper attempts to improve the NER performance for Telugu using gazetteer-related features, which are automatically generated using Wikipedia pages. We make use of these gazetteer features along with other well-known features like contextual, word-level, and corpus features to build NER models. NER models are developed using three well-known classifiers: conditional random field (CRF), support vector machine (SVM), and margin infused relaxed algorithm (MIRA). The gazetteer features are shown to improve the performance, and the MIRA-based NER model fared better than its counterparts SVM and CRF.