16 research outputs found

    Comparison of Machine Learning Algorithms Using WEKA and Sci-Kit Learn in Classifying Online Shopper Intention

    Online stores are growing rapidly, supported by faster and better internet infrastructure, and this growth intensifies competition in the field. Online stores therefore need a website or application that can measure and classify consumers' spending intentions, so that consumers keep their attention on the items offered and eventually make purchases. Online shoppers' intentions can be classified with several algorithms, such as Naïve Bayes, Multi-Layer Perceptron, Support Vector Machine, Random Forest and J48 Decision Trees. Here the algorithms are compared in two tools, WEKA and Sci-Kit Learn, using F1-score, accuracy, Kappa statistic and mean absolute error. The test results for the Support Vector Machine algorithm differ between WEKA and Sci-Kit Learn. Based on this research, Random Forest is the most appropriate algorithm for classifying online shoppers' intentions.
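
The four measures named in this abstract can be computed from a pair of label lists. The following is a minimal sketch in plain Python for the binary case, not the authors' code; the function name `evaluate_binary` is hypothetical. It assumes hard 0/1 predictions, in which case the mean absolute error equals the error rate; a tool that computes it from predicted class probabilities may report a different value.

```python
from collections import Counter


def evaluate_binary(y_true, y_pred, positive=1):
    """Return accuracy, F1, Cohen's kappa and MAE for 0/1 label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts for the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Assumption: hard 0/1 labels, so MAE is the fraction of wrong predictions.
    mae = sum(abs(t - p) for t, p in pairs) / n

    return {"accuracy": accuracy, "f1": f1, "kappa": kappa, "mae": mae}
```

Kappa is worth reporting alongside accuracy when one class dominates, because it discounts the agreement a classifier would reach by chance.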

    An XGBoost Algorithm for Predicting Purchasing Behaviour on E-Commerce Platforms

    To improve the prediction of consumer purchasing behaviour on e-commerce platforms, this paper proposes a new prediction method. The study introduces the basic principles of the XGBoost algorithm, analyses the historical data of an e-commerce platform, pre-processes the original data and constructs a consumer purchase prediction model based on XGBoost. The traditional random forest algorithm serves as a comparison, and K-fold cross-validation is used together with performance indicators such as accuracy, precision, recall and F1-score to evaluate the model's classification performance. The importance of the features in the results was examined through visual analysis. The results indicate that using XGBoost to predict the purchasing behaviour of e-commerce platform consumers improves performance and yields better predictions. This study provides a reference for improving the accuracy of predicting e-commerce platform consumers' purchasing behaviour and has practical significance for the efficient operation of e-commerce platforms.
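
The K-fold cross-validation mentioned in this abstract splits the data into K parts and uses each part once as the test set. The following is a minimal, model-agnostic sketch of the index splitting only, assuming unshuffled, contiguous folds; the helper name `k_fold_indices` is hypothetical, and the sketch does not reproduce the paper's XGBoost or random forest models.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    fold_sizes = [base + (1 if i < extra else 0) for i in range(k)]

    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

Each model under comparison would be trained on `train` and scored on `test` for every fold, and the per-fold scores averaged. In practice the samples are usually shuffled first, which this sketch leaves out.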

    Customer purchase behavior prediction in E-commerce: a conceptual framework and research agenda

    Digital retailers are experiencing an increasing number of online transactions from their consumers, a consequence of the convenience of buying goods via E-commerce platforms. Such interactions compose complex behavioral patterns that can be analyzed through predictive analytics to help businesses understand consumer needs. Despite the abundance of big data and of tools to analyze it, a systematic review of the literature is missing. This paper therefore presents a systematic literature review of recent research on customer purchase prediction in the E-commerce context. The main contributions are a novel analytical framework and a research agenda for the field. The framework identifies three main tasks: the prediction of customer intents, buying sessions, and purchase decisions. These are followed by the predictive methodologies employed for them and are analyzed from three perspectives. Finally, the research agenda sets out major open issues for further research on online purchase behavior prediction.

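    Prediksi Indeks Harga Konsumen Komoditas Makanan Berbasis Cloud Computing Menggunakan Multilayer Perceptron (Cloud Computing-Based Prediction of the Consumer Price Index for Food Commodities Using a Multilayer Perceptron)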

    Prediction is an area of data mining concerned with finding patterns in a data set that point to future outcomes. Economic prediction dominates this area because it is one of the parameters of a country's development. The Consumer Price Index describes the level of consumption of goods and services in a society and can serve as a reference for the inflation rate. Most previous studies predicting the Consumer Price Index used only the index values themselves as both input and output. This study builds a forecasting model with multivariable input, namely the daily prices of 28 staple goods, to forecast the Consumer Price Index in the city of Surabaya for the period 2014 to 2018, with the entire prediction model built in the Amazon Cloud Services environment. The prediction system uses the Multilayer Perceptron algorithm with varying architectures in the number of neurons, epochs, and hidden layers. In testing, the best accuracy, with an RMSE of 3.380, was produced by a configuration of 2 hidden layers, each with 10 neurons, and 1000 epochs.

    An Enhanced Spectral Clustering Algorithm with S-Distance

    This work is partially supported by the project "Prediction of diseases through computer assisted diagnosis system using images captured by minimally-invasive and non-invasive modalities", Computer Science and Engineering, PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur India (under ID: SPARCMHRD-231). This work is also partially supported by the project "Smart Solutions in Ubiquitous Computing Environments", Grant Agency of Excellence, University of Hradec Kralove, Faculty of Informatics and Management, Czech Republic (under ID: UHK-FIM-GE-2204/2021); project at Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876 and the Fundamental Research Grant Scheme (FRGS) Vot5F073 supported by the Ministry of Education Malaysia for the completion of the research.

    Calculating and monitoring customer churn metrics is important for companies to retain customers and increase profit. In this study, a churn prediction framework is developed using modified spectral clustering (SC). The similarity measure plays a decisive role in clustering when predicting churn accurately from industrial data. The linear Euclidean distance in traditional SC is replaced by the non-linear S-distance (Sd), which is derived from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic, eight UCI and two industrial databases, and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also run on these 15 databases. The empirical results show that the proposed algorithm outperforms the three existing algorithms in terms of Jaccard index, F-score, recall, precision and accuracy. Finally, the significance of the clustering results is tested with Wilcoxon's signed-rank test, Wilcoxon's rank-sum test, and sign tests. The comparative study shows that the results of the proposed algorithm are promising, especially for clusters of arbitrary shape.

    Estimation of automobile sales with artificial neural network using data of online consumer engagement and search engine

    Estimating sales volume affects many departments of an enterprise, from raw material purchasing to setting advertising expenses. Sales forecasting based on consumer data requires a large amount of data, but much consumer data is personal, which makes its use problematic. Data from social networks and search engines, which influence the purchasing decision process, are valuable and also accessible to everyone. Businesses could gain an advantage by estimating their sales figures accurately without using consumers' personal data. This study aims to estimate the sales volume of an automobile brand using Online Consumer Engagement and search engine data, which are publicly available and influential in consumers' purchasing decisions. A total of 2267 posts created between 2012 and 2017 on the business's Facebook brand page, where Online Consumer Engagement is intense, together with their like, comment and share data, were retrieved with the Facebook Graph API. Search engine data were obtained from Google Trends, and sales data from the Automotive Distributors Association website. The data were normalized with the Min-Max method and analyzed with feedforward artificial neural networks using the Bayesian Regularization backpropagation method. With the Facebook brand page and search engine data, sales volume was estimated successfully, with 74% correlation and a 1% mean error. In addition, detailed data covering 6 years were organized and presented as descriptive information. The sales estimate was made without using consumers' personal information. The study contributes to the sector and to the academic literature by relying on real data and applying artificial neural networks in the business field.
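
The Min-Max normalization named in this abstract rescales each variable linearly to the range 0 to 1. The following is a minimal sketch, assuming one numeric series at a time; the function name `min_max_normalize` and the choice to map a constant series to zeros are assumptions for illustration, not details from the study.

```python
def min_max_normalize(values):
    """Rescale a sequence of numbers linearly so min -> 0.0 and max -> 1.0."""
    if not values:
        raise ValueError("values must not be empty")
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # Assumption: a constant series carries no scale, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]
```

Rescaling inputs such as post counts and search interest to a common range keeps variables with large raw magnitudes from dominating neural network training.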