4 research outputs found

    NormAPI: An API for normalizing Filipino shortcut texts

    No full text
    © 2014 IEEE. As the number of Internet and mobile phone users grows, texting and chatting have become popular means of communication. The extensive use of cellphones and the Internet has led to the creation of a new language, in which words are transformed and shortened in various styles. Shortcut texting is used in informal venues such as SMS, online chat rooms, forums, and posts in social networks. Huge amounts of data originating from these informal sources can be utilized for various tasks in machine learning and data analytics. Because these data may be written in shortcut forms, text normalization is a necessary preprocessing stage: it transforms informal texts back to their original, more understandable forms before NLP tasks such as information extraction, data mining, text summarization, opinion classification, and even bilingual translation can be fully achieved. This paper is about NormAPI, an API for normalizing Filipino shortcut texts. NormAPI is primarily intended to be used as a preprocessing system that corrects informalities in shortcut texts before they are handed off for complete data processing.

    NormAPI: An API for normalizing Filipino shortcut texts

    No full text
    As the number of Internet and mobile phone users grows, texting and chatting have become popular means of communication. The extensive use of cellphones and the Internet has led to the creation of a new language, in which words are transformed and shortened in various styles. Shortcut texting is used all over the world, and in recent years numerous researchers have created normalization systems for different languages that transform shortcut texts back into their original forms. This research designed techniques and developed NormAPI, a system that normalizes Filipino shortcut texts. Focused on the modern Filipino language, which includes code-switching, the system primarily contributes to Natural Language Processing (NLP) research as a preprocessing system that corrects informalities in shortcut texts before they are handed off for complete data processing. Its functionality includes four normalization variations, namely Dictionary Substitution Approach (DSA), Statistical Machine Translation (SMT), SMT after DSA, and SMT before DSA, with BLEU scores of 0.68384, 0.79650, 0.75634, and 0.80750, respectively. Additionally, users can set the dictionary, generate language models, obtain BLEU scores, and use other options according to their preferences.
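
    The abstract above names a Dictionary Substitution Approach (DSA) but gives no implementation details. The following is only a minimal sketch of what token-level dictionary substitution could look like, not NormAPI's actual code. The names `SHORTCUT_DICT` and `normalize`, the tokenizing regular expression, and the toy dictionary entries are all assumptions made for illustration.

```python
import re

# Hypothetical toy dictionary mapping shortcut forms to full forms.
# These entries are illustrative assumptions, not taken from NormAPI.
SHORTCUT_DICT = {
    "d2": "dito",
    "aq": "ako",
    "nmn": "naman",
    "u": "you",
}

# Split text into word tokens, punctuation runs, and whitespace runs,
# so the original spacing and punctuation can be reproduced exactly.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+|\s+")


def normalize(text, dictionary=SHORTCUT_DICT):
    """Replace each token found in the dictionary; leave all others as-is."""
    out = []
    for tok in TOKEN_RE.findall(text):
        # Lookup is case-insensitive; unknown tokens pass through unchanged.
        out.append(dictionary.get(tok.lower(), tok))
    return "".join(out)


if __name__ == "__main__":
    print(normalize("d2 nmn aq!"))
```

    A plain lookup like this cannot handle shortcuts missing from the dictionary or forms whose expansion depends on context, which may be one reason the abstract also reports SMT-based variants.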

    Laterality and fish welfare - A review

    No full text