4,968 research outputs found

    Domain Classification for Marathi Blog Articles using Deep Learning

    With the exponential growth of online content, particularly blog articles, effective techniques for automatically categorizing articles into relevant domains have become increasingly important. Natural language processing (NLP), machine learning (ML), and deep learning (DL) are helping to produce solutions to this challenge. The proposed system uses an NLP- and DL-based methodology with a long short-term memory (LSTM) classifier for domain classification and compares it with existing multiclass classification techniques. The LSTM model achieves accuracy of around 94% and 91% on two different data sets, a Marathi news article data set and a financial article data set. The proposed model is compared with several other models, including naïve Bayes (NB), XGBoost, support vector machine (SVM), and random forest (RF). The best result is achieved by the combination of data set and the LSTM deep learning algorithm.

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, and it spans the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies such as swarm intelligence, as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI for solving a variety of applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies yields a strong and reliable prediction tool for handling real-life applications.

    A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends

    As more and more Arabic text emerges on the Internet, extracting important information from it becomes especially useful. As a fundamental technology, named entity recognition (NER) serves as the core component of information extraction, while also playing a critical role in many other natural language processing (NLP) systems, such as question answering and knowledge graph building. In this paper, we provide a comprehensive review of the development of Arabic NER, especially the recent advances in deep learning and pre-trained language models. Specifically, we first introduce the background of Arabic NER, including the characteristics of Arabic and existing resources for Arabic NER. Then, we systematically review the development of Arabic NER methods. Traditional Arabic NER systems focus on feature engineering and designing domain-specific rules. In recent years, deep learning methods have achieved significant progress by representing texts via continuous vector representations. With the growth of pre-trained language models, Arabic NER yields better performance. Finally, we summarize the methodological gap between Arabic NER and NER methods for other languages, which helps outline future directions for Arabic NER. Comment: Accepted by IEEE TKD

    A Survey of Arabic Text Classification Models

    A huge amount of Arabic text is available online, and it needs to be organized. As a result, there are many natural language processing (NLP) applications concerned with text organization. One of them is text classification (TC), which helps in dealing with unorganized text by classifying it into suitable classes or labels. This paper is a survey of Arabic text classification. It also presents a comparison of different methods for classifying Arabic texts, where Arabic text is considered complex because of its vocabulary. Arabic is one of the richest languages in the world and has many linguistic bases. Research on Arabic language processing is scarce compared with English. As a result, these problems pose challenges for the classification and organization of Arabic text. Text classification helps users access documents or information that have already been classified into one or more classes or categories. In addition, document classification helps search engines reduce the number of documents to consider, making it easier to search and to match documents against queries.

    Concept-Based Automatic Amharic Document Categorization

    Along with the continuously growing volume of information resources, there is growing interest in better solutions for finding, filtering, and organizing these resources. Automatic text categorization can play an important role in a wide variety of flexible, dynamic, and personalized information management tasks. The aim of this research is to use concepts as a way of improving the categorization process for Amharic documents. In recent years, ontology-based document categorization methods have been introduced to solve the problem of document classification. Previous work on keyword-based document categorization overlooks the semantic relationships between words. To resolve this problem, this research proposes a framework that automatically categorizes Amharic documents into predefined categories using concepts. The research shows that using concepts in an Amharic document categorizer results in 92.9% accuracy.

    Detecting Phishing Websites Using Associative Classification

    Phishing is a criminal technique that employs both social engineering and technical subterfuge to steal consumers' personal identity data and financial account credentials. A phishing website aims to steal the victim's personal information when the victim visits a fake webpage that looks like the genuine page of a legitimate bank or company and that asks the victim to enter personal information such as a username, account number, password, or credit card number. The main goal of this paper is to investigate the potential use of automated data mining techniques for detecting phishing websites, a complex problem, in order to protect users from being deceived or hacked through the theft of their personal information and passwords, which can lead to catastrophic consequences. Experiments on phishing data sets have been conducted using different common associative classification algorithms (MCAR and CBA) and traditional learning approaches, with classification accuracy as the point of comparison. The results show that the MCAR and CBA algorithms outperformed SVM and algorithms. Keywords: Phishing Websites, Data Mining, Associative Classification, Machine Learning
