2 research outputs found

    Automated search for named entities in unannotated Ukrainian-language texts

    The paper describes an algorithm, designed and implemented by the authors, for finding named entities in Ukrainian-language texts. The accompanying software tools make it possible to mark named entities and the relations between them in a graphical mode. The utility is implemented as a web application. Using this toolkit, the authors built a corpus of 122 texts annotated with named entities. The annotated entity types are persons, organizations, and geographical objects. The corpus contains 2,731 named entities.

    New approach for Arabic named entity recognition on social media based on feature selection using genetic algorithm

    Many features can be extracted from the massive volume of data of different types now available on social media. The growing demand for multimedia applications has been an essential factor in this regard, particularly in the case of text data. Using the full feature set for each task can be time-consuming and can also hurt performance, and the large number of features makes it challenging to find a subset that is useful for a given task. In this paper, we employ a feature selection approach based on a genetic algorithm to identify an optimized feature set. The best combination of the selected features is then used to identify and classify Arabic named entities (NEs) with a support vector machine (SVM). Experimental results show that our system reaches state-of-the-art performance for Arabic NER on social media and significantly outperforms previous systems.
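The abstract does not give implementation details, so the following is only a minimal sketch of genetic-algorithm feature selection in general, not the authors' method. Each candidate is a bitmask over the features, and a fitness function scores the mask. In the setting described above, the fitness would plausibly be the validation score of a classifier trained on the selected features; here a toy fitness stands in for it. All names (`ga_select`, `toy_fitness`, `USEFUL`) and all parameter values are hypothetical.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a feature bitmask that maximizes `fitness` (illustrative only)."""
    rng = random.Random(seed)
    # Each individual is a bitmask: mask[i] == 1 means feature i is kept.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Pick two random individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy stand-in for a classifier's validation score (an assumption):
# features 0, 3 and 5 are "useful", every other kept feature costs a little.
USEFUL = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


best_mask = ga_select(n_features=8, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

In a real pipeline the call to `toy_fitness` would be replaced by training and evaluating the classifier on the columns selected by the mask, which is far more expensive, so caching fitness values per mask is a common design choice.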