22 research outputs found

    Sentimental classification analysis of polarity multi-view textual data using data mining techniques

    The data and information available in most community environments are complex in nature. Sentiment data resources may consist of textual data collected from multiple information sources with different representations, usually handled by different analytical models. Resources with these characteristics form multi-view polarity textual data. However, creating knowledge from this type of sentiment textual data requires considerable analytical effort and capability. In particular, data mining practices can provide exceptional results in handling textual data formats. Moreover, when the textual data exist in multi-view or unstructured formats, hybrid and integrated analysis with text data mining algorithms is vital to obtaining helpful results. The objective of this research is to enhance knowledge discovery from sentiment multi-view textual data, which can be considered an unstructured data format, by classifying polarity documents into two categories of useful information. This paper discusses a proposed framework of integrated data mining algorithms, which applies the X-means algorithm for clustering and the HotSpot algorithm for association rules. The analysis results show improved accuracy in classifying sentiment multi-view textual data into two categories when the proposed framework is applied to an online polarity user-review dataset on given topics.

    Text Mining for Pest and Disease Identification on Rice Farming with Interactive Text Messaging

    To overcome pests and diseases in rice farming, farmers rely on information and knowledge from agricultural experts for decision making. The problem is that experts are not always available when farmers need them, and the cost is quite high. Eliminating pests and diseases is hard for farmers to do on their own, since they lack knowledge about the pest types that attack rice fields. The objective of this study is to build a knowledge-based system that identifies pests and diseases interactively from the information farmers report through SMS. The system gives farmers a convenient way to report pest and disease problems in natural language. The text mining method performs tokenizing, filtering and Porter stemming to extract the important information from an SMS message. The Jaccard Similarity Coefficient (JSC) is used to calculate the similarity of each pest and disease to the symptoms that farmers send through SMS. The corpus database used in this study consists of 28,526 root words, 1,309 stop words and a 180-word list. The pest and disease reference database was obtained from the Ministry of Agriculture and Fisheries (MAF) of Timor-Leste. The experimental results show that the system identifies symptoms from the detected keywords with an accuracy of 81%, and identifies pests and diseases with an accuracy of 86%.
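
    The matching step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the stop-word list, the two database entries and the function names are all hypothetical, and Porter stemming is left out for brevity.

```python
import re

# Hypothetical stop-word list and symptom database, for illustration only.
STOP_WORDS = {"the", "of", "and", "are", "on", "my", "is", "in", "with", "have"}

SYMPTOMS = {
    "stem borer": "dead heart in young plants and white empty panicles",
    "brown spot": "brown oval spots on leaves and discoloured grains",
}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard Similarity Coefficient: |A & B| / |A | B| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(message):
    """Score every database entry against the message, best match first."""
    query = tokenize(message)
    scores = {name: jaccard(query, tokenize(desc)) for name, desc in SYMPTOMS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

    Calling `rank("brown spots on the leaves")` puts the hypothetical "brown spot" entry first, because three of its six keywords overlap with the message and none of the "stem borer" keywords do.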

    Clustering the Verses of the Holy Qur'an using K-Means Algorithm

    The Holy Qur’an is a basic guide to life for Muslims. The depth of the sea of knowledge in the Holy Qur’an makes it attractive to many researchers for exploration with automated applications. This article presents practical text mining work as an initial step, exploring some structural phenomena of the Qur’an by clustering its verses. The K-means algorithm was applied in a clustering experiment within a text mining framework. The study used a data corpus of 6236 verses in total, with both unstemmed and stemmed words, and established three clusters.
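
    As a rough sketch of what K-means does on text, the following clusters term-count vectors with Lloyd's algorithm. It is an illustration under stated assumptions: the whitespace tokenizer, the choice of the first k documents as initial centroids and all function names are hypothetical simplifications, not the setup used in the study.

```python
from collections import Counter


def vectorize(docs):
    """Turn each document into a term-count vector over the shared vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    counts = [Counter(d.lower().split()) for d in docs]
    return [[c[w] for w in vocab] for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm; the first k vectors seed the centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

    Running the same function once on unstemmed and once on stemmed tokens would give the two clusterings the abstract compares; a real experiment would also need a better initialisation than taking the first k documents.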

    Text mining of articles in an issue of the journal „Economics and Computer Science" dedicated on the DIMBI project

    The purpose of this article is to use business intelligence techniques to analyse the articles in one issue (Volume 2, Issue 5) of the journal "Economics and Computer Science". Since business intelligence methods are many, the research is limited to text mining methods. The research aim is to find terminology that is common to all articles in one issue of the journal. Since the journal has published several thematic issues, one research question is to find ontologies in each thematic issue. RapidMiner is used as the software tool for the text mining techniques and finds the most frequently used terms. The terms are then classified thematically by hand into three main groups: educational, research and software. The proposed methodology may be used by other authors for surveys of other thematic content.
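
    The search for terminology common to all articles can be approximated in a few lines. This is a minimal sketch that stands in for the RapidMiner workflow rather than reproducing it; the regular-expression tokenizer and the function names are assumptions made for illustration.

```python
import re
from collections import Counter


def term_counts(text):
    """Count lowercase alphabetic tokens in one article."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def common_terms(articles, top_n=10):
    """Terms that occur in every article, ranked by total frequency."""
    if not articles:
        return []
    per_article = [term_counts(a) for a in articles]
    # Keep only terms present in each article's vocabulary.
    shared = set(per_article[0])
    for counts in per_article[1:]:
        shared &= set(counts)
    # Sum the per-article counts to rank the shared terms.
    total = sum(per_article, Counter())
    ranked = sorted(((t, total[t]) for t in shared), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

    The manual thematic grouping that the abstract describes would then be applied to the returned list.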

    SME credit application, a text classification approach

    Project Work presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM. During the SME credit application process, a credit expert gives a specific recommendation to the credit commercial advisor. This recommendation can be classified as positive, negative or partial. This project aims to construct a text classifier that assigns the recommendation text to one of these categories. To achieve this, two models are tested using the state-of-the-art BERT architecture proposed by Google in 2019. The first model uses the single-sentence BERT classification model as proposed by Google. The second model uses the SBERT architecture, in which the BERT embedding model is fine-tuned for the specific task, a max-pooling layer is added to extract a fixed-size vector for the whole document, and the result is fed to a fully connected network. Results show that the second approach performed better on accuracy, precision and recall. Despite limitations in computational capacity, the limited number of tagged examples and BERT's maximum sequence length, the model is a good first approach to the problem.
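
    The max-pooling step mentioned for the second model can be illustrated without any deep-learning library. The sketch below assumes the token embeddings have already been produced by some encoder; the function name, the plain-list representation and the attention-mask convention (1 for real tokens, 0 for padding) are assumptions for illustration, not details taken from the project.

```python
def max_pool(token_embeddings, attention_mask):
    """Element-wise maximum over the real (unmasked) tokens of one document.

    token_embeddings: list of equal-length float vectors, one per token.
    attention_mask:   list of 0/1 flags, 1 for real tokens and 0 for padding.
    Returns one vector whose length equals the embedding size, regardless of
    how many tokens the document has.
    """
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("document has no unmasked tokens")
    return [max(column) for column in zip(*kept)]
```

    Because the output size depends only on the embedding dimension, documents of different lengths can all feed the same fully connected classifier.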

    PERHITUNGAN ANALISIS SENTIMEN BERBASIS KOMPARASI ALGORITMA NAIVE BAYES DAN K-NEAREST NEIGHBOUR BERBASIS PARTICLE SWARM OPTIMIZATION PADA KOMENTAR INSIDEN PEMBALAP MOTOGP 2015

    There are many media for getting news about MotoGP riders, such as TV, radio, newspapers, magazines and websites. Of these, websites are the most flexible: they can be accessed from any place connected to the Internet, the information they provide is up to date, and everyone can comment on related articles. Information spreads very fast and, accompanied by freedom of speech, gives rise to various opinions, either negative or positive. Among the most frequently used classification techniques are Naive Bayes and k-Nearest Neighbour (KNN). The Naive Bayes classifier is a simple classifier that applies Bayes' theorem with a strong independence assumption. The k-Nearest Neighbour (KNN) classification algorithm predicts the category of a test sample from the K training samples nearest to it, choosing the category with the largest probability. This study therefore combines these classifiers with a feature selection method, namely Particle Swarm Optimization, in order to improve the accuracy of Naive Bayes and k-Nearest Neighbour. The Naive Bayes algorithm based on Particle Swarm Optimization reached an accuracy of 82.67%, and k-Nearest Neighbour based on Particle Swarm Optimization reached an accuracy of 71.33%. It can be concluded that applying optimization can improve accuracy. The Naive Bayes model based on Particle Swarm Optimization can provide a solution for classifying public-opinion reviews of news about the MotoGP rider incident more accurately and optimally. For the k-Nearest Neighbour model based on Particle Swarm Optimization, accuracy decreases. Keywords: Media, Classification, Naive Bayes, k-Nearest Neighbour, Particle Swarm Optimization, Text Mining
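
    The k-Nearest Neighbour rule summarised in this abstract can be sketched as a majority vote among the closest training samples. This is a minimal illustration only: the feature vectors, the squared Euclidean distance and the function name are assumptions, and the Particle Swarm Optimization feature selection used in the study is not shown.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train: list of (feature_vector, label) pairs.
    query: feature vector of the sample to classify.
    """
    def dist(a, b):
        # Squared Euclidean distance; the ordering matches true Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

    In a sentiment setting the feature vectors would typically be term weights for each comment, and a feature selection step would decide which terms are kept before the distances are computed.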

    PENINGKATAN OPTIMASI SENTIMEN DALAM PELAKSANAAN PROSES PEMILIHAN PRESIDEN BERDASARKAN OPINI PUBLIK DENGAN MENGGUNAKAN ALGORITMA NAÏVE BAYES DAN PARTICLE SWARM OPTIMIZATION

    Increasingly advanced IT plays a growing role in the presidential election process. During the 2014 presidential election, many people used uneducational language that was inappropriate to voice in public. Pros and cons were hotly debated, and people poured them out on the internet. This happened because, while the 2014 presidential election was being hotly discussed, the public split into camps behind the two candidates. Society could not adjust well to the development of IT. Naive Bayes is widely used for classification problems in data mining and machine learning because of its simplicity and impressive classification accuracy. The Naive Bayes classifier has been shown to be very effective at solving large-scale text categorization problems with high accuracy. Despite these strengths, the method has a drawback: its assumption that features are independent is difficult to satisfy. Particle Swarm Optimization (PSO) is an evolutionary computation technique that is able to produce a globally optimal solution in the search space through the interaction of individuals in a swarm of particles. PSO is widely used to solve optimization problems as well as feature selection. The Naive Bayes algorithm achieved an accuracy of 63.85% and an AUC of 0.523, while Naive Bayes with Particle Swarm Optimization achieved an accuracy of 71.15% and an AUC of 0.600. It can be concluded that applying optimization improves accuracy from 63.85% to 71.15%. The Naive Bayes model with Particle Swarm Optimization can provide a solution for classifying public-opinion reviews of election news more accurately and optimally. Keywords: Public Opinion, Classification, Naive Bayes, Particle Swarm Optimization, Text Mining
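
    The independence assumption discussed in this abstract is easiest to see in code: each word contributes its own log-likelihood term, and the terms are simply added. The following is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing; the function names and data layout are assumptions for illustration, and the PSO feature selection step is not included.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """samples: list of (tokens, label). Returns (class_counts, word_counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        n_words = sum(word_counts[label].values())
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        for t in tokens:
            # Independence assumption: per-word log-likelihoods simply add up.
            # Add-one smoothing keeps unseen words from zeroing the probability.
            score += math.log((word_counts[label][t] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

    A feature selection method such as PSO would act before training, by choosing which tokens are passed to `train_nb` and `predict_nb` at all.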

    New Horizons for Airlines: Consumers’ Adoption of Metaverse - A Qualitative and Quantitative research

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Driven Marketing, specialization in Digital Marketing and Analytics. Metaverse technology is increasingly relevant in this digital and connected world, and airlines should decide on their strategy and purpose for engaging with consumers in this new dimension. Because the metaverse is a new technology, airlines must also understand the factors behind its acceptance. Though technology acceptance has been widely investigated, research on the acceptance of metaverse technology is still scarce. This research intends to provide an empirical study of technology acceptance for an airline metaverse and to contribute findings that reveal new opportunities to engage with consumers and passengers. Following the literature review, we based our work on a Technology Acceptance Model (TAM) framework, which proved the most suitable for the technology adoption dimension. The research combines qualitative and quantitative analysis, and our findings reveal the vision from an airline perspective and the adoption factors of potential users. The qualitative analysis was based on 3 semi-structured interviews with airline experts, followed by text mining and data analysis via IRAMUTEQ software. The quantitative analysis was based on a structured questionnaire using a convenience sampling technique. A total of 118 replies were collected and analyzed via SmartPLS4 software. The outcomes are insightful and reveal that Gamification and Perceived Consumer Experience have positive and relevant effects on the intention to use an airline's metaverse. Management contributions, future studies and academic insights are also presented in the final section of the research.

    Technology in the 21st Century: New Challenges and Opportunities

    Although big data, big data analytics (BDA) and business intelligence have attracted growing attention of both academics and practitioners, a lack of clarity persists about how BDA has been applied in business and management domains. In reflecting on Professor Ayre's contributions, we want to extend his ideas on technological change by incorporating the discourses around big data, BDA and business intelligence. With this in mind, we integrate the burgeoning but disjointed streams of research on big data, BDA and business intelligence to develop unified frameworks. Our review takes on both technical and managerial perspectives to explore the complex nature of big data, techniques in big data analytics and utilisation of big data in business and management community. The advanced analytics techniques appear pivotal in bridging big data and business intelligence. The study of advanced analytics techniques and their applications in big data analytics led to identification of promising avenues for future research