179 research outputs found

    Data Mining Using RFM Analysis

    Service-Oriented Data Mining

    Cluster analysis for physical oceanographic data and oceanographic surveys in Turkish seas

    Cluster analysis is a useful data mining method for obtaining detailed information on the physical state of the ocean. The primary objective of this study is to develop a new spatio-temporal density-based algorithm for clustering physical oceanographic data. The study extends regular spatial cluster analysis to handle spatial data at different epochs. It also presents a sensitivity analysis that identifies how the new algorithm responds to variations in input parameter values and boundary conditions. To demonstrate the new algorithm, the paper presents two oceanographic applications that cluster sea-surface temperature (SST) and sea-surface height residual (SSH) data, which record satellite observations of the Turkish seas. It also evaluates and justifies the clustering results using a cluster validation technique.
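
The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of the general idea of spatio-temporal density-based clustering, not the study's method. It assumes a DBSCAN-style procedure in which two observations are neighbors only when they are close both in space and in time. The names `st_dbscan`, `eps_space`, `eps_time`, and `min_pts` are hypothetical.

```python
from math import hypot


def st_neighbors(points, i, eps_space, eps_time):
    """Indices of points within eps_space (spatial) and eps_time (temporal) of point i."""
    x, y, t = points[i]
    return [
        j for j, (xj, yj, tj) in enumerate(points)
        if hypot(x - xj, y - yj) <= eps_space and abs(t - tj) <= eps_time
    ]


def st_dbscan(points, eps_space, eps_time, min_pts):
    """Label each (x, y, t) point with a cluster id; -1 marks noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbors = st_neighbors(points, i, eps_space, eps_time)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster_id
            j_neighbors = st_neighbors(points, j, eps_space, eps_time)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Made-up (x, y, epoch) observations: two dense groups and one point far away in time.
observations = [
    (0.0, 0.0, 0), (0.1, 0.0, 0), (0.0, 0.1, 1),
    (5.0, 5.0, 0), (5.1, 5.0, 0), (5.0, 5.1, 1),
    (0.0, 0.0, 50),
]
labels = st_dbscan(observations, eps_space=0.5, eps_time=2, min_pts=3)
```

The last observation shares a location with the first group but lies 50 epochs away, so it is labeled as noise. A purely spatial clustering would have merged it into that group.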

    Hierarchical Clustering Ensemble with Different Linkage Methods

    Clustering ensembles have become a preferred technique in recent years because they deliver high clustering performance. This study proposes a new approach named Linkage-based Hierarchical Clustering Ensemble (BHKT, from the Turkish name Bağlantı-tabanlı Hiyerarşik Kümeleme Topluluğu). In the proposed approach, the ensemble members perform hierarchical clustering using different linkage methods and then produce a joint decision by majority voting. The linkage methods used in the study are single linkage, complete linkage, average linkage, centroid linkage, Ward's method, the neighbor-joining method, and adjusted complete linkage. The study also examines hierarchical clustering ensembles of different sizes and compares them with one another. In the experiments, the hierarchical clustering ensembles were applied to 8 different datasets and achieved better results than a single clustering algorithm.
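
As a rough illustration of the ensemble idea, the sketch below runs agglomerative clustering with three of the listed linkage methods and combines the runs by a pairwise majority vote. The combination step is an assumption: the abstract does not say how the votes are aligned across members, so this sketch joins two points whenever more than half of the members place them in the same cluster. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (lists of point indices)."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    if method == "single":
        return min(pairwise)
    if method == "complete":
        return max(pairwise)
    return sum(pairwise) / len(pairwise)  # average linkage


def agglomerative(points, k, method):
    """Merge the two closest clusters until k remain; return one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(clusters[p[0]], clusters[p[1]], points, method),
        )
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


def ensemble_labels(points, k, methods=("single", "complete", "average")):
    """Join points that a strict majority of ensemble members cluster together."""
    runs = [agglomerative(points, k, m) for m in methods]
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        votes = sum(run[i] == run[j] for run in runs)
        if votes * 2 > len(runs):
            parent[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
consensus = ensemble_labels(points, k=2)
```

With this pairwise vote the consensus may contain a different number of clusters than `k` when the members disagree strongly, which is one reason other consensus functions are also used in practice.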

    Data Mining in Banking Sector Using Weighted Decision Jungle Method

    Classification, one of the most popular data mining techniques, has been used in the banking sector for purposes such as customer churn prediction, credit approval, fraud detection, bank failure estimation, and bank telemarketing prediction. However, traditional classification algorithms do not take the class distribution into account, which results in poor performance on imbalanced banking data. To solve this problem, this paper proposes an approach that improves the decision jungle (DJ) method with a class-based weighting mechanism. Experiments conducted on 17 real-world bank datasets show that the proposed approach outperforms the decision jungle method when handling imbalanced banking data.
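
The abstract does not specify the weighting formula, so the sketch below only illustrates what a class-based weighting can look like. It assumes weights inversely proportional to class frequency, `n / (k * n_c)`, and applies them to the vote at a single leaf node rather than to a full decision jungle. The function names and the churn data are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n / (k * n_c): rare classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_leaf_prediction(leaf_labels, weights):
    """Predict the class with the largest total weight among the samples in a leaf."""
    totals = Counter()
    for label in leaf_labels:
        totals[label] += weights[label]
    return max(totals, key=totals.get)


# Hypothetical imbalanced training labels: 90 customers stay, 10 churn.
training_labels = ["stay"] * 90 + ["churn"] * 10
weights = balanced_class_weights(training_labels)

# A leaf holding 6 "stay" and 2 "churn" samples.
leaf = ["stay"] * 6 + ["churn"] * 2
unweighted = Counter(leaf).most_common(1)[0][0]
weighted = weighted_leaf_prediction(leaf, weights)
```

Here the unweighted majority at the leaf is `"stay"`, while the weighted vote returns `"churn"`, because each churn sample counts nine times as much as a stay sample under these weights.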

    Ensemble Methods in Environmental Data Mining

    Environmental data mining is the nontrivial process of identifying valid, novel, and potentially useful patterns in data from the environmental sciences. This chapter proposes ensemble methods for environmental data mining that combine the outputs of multiple classification models to obtain better results than an individual model could produce. The study focuses on several ensemble strategies in addition to the standard single classifiers commonly used in the literature, such as decision tree, naive Bayes, support vector machine, and k-nearest neighbor (KNN). This is the first study that compares four ensemble strategies for environmental data mining: (i) bagging, (ii) bagging combined with random feature subset selection (the random forest algorithm), (iii) boosting (the AdaBoost algorithm), and (iv) voting of different algorithms. In the experimental studies, the ensemble methods are tested on real-world environmental datasets covering subjects such as air, ecology, rainfall, and soil.
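
Of the four strategies, voting of different algorithms (iv) is the simplest to sketch. The example below is an illustration only: the three base classifiers (1-NN, 3-NN, and nearest centroid) and the two-feature air-quality samples are assumptions chosen to keep the code short, and they are not the models or data used in the chapter.

```python
from collections import Counter
from math import dist


def k_nearest(train, k):
    """Return a classifier that votes among the k closest training rows."""
    def predict(x):
        nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def nearest_centroid(train):
    """Return a classifier that picks the class whose mean vector is closest."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }

    def predict(x):
        return min(centroids, key=lambda label: dist(centroids[label], x))
    return predict


def vote(models, x):
    """Majority vote over the predictions of heterogeneous models."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


# Made-up samples: (feature vector, air-quality label).
train = [
    ((1.0, 1.0), "good"), ((1.2, 0.8), "good"), ((0.9, 1.1), "good"),
    ((5.0, 5.0), "poor"), ((5.2, 4.9), "poor"), ((4.8, 5.1), "poor"),
]
models = [k_nearest(train, 1), k_nearest(train, 3), nearest_centroid(train)]
prediction_a = vote(models, (1.1, 1.0))
prediction_b = vote(models, (5.1, 5.0))
```

Bagging and boosting differ from this in how the members are built: they train the same algorithm on resampled or reweighted data, whereas voting combines different algorithms trained on the same data.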

    A Gradual Approach for Multimodel Journey Planning: A Case Study in Izmir, Turkey

    Planning a journey by integrating route and timetable information from the diverse sources of transportation agencies, such as bus, ferry, and train, can be complicated. A user-friendly, informative journey planning system can simplify planning by helping travelers make better use of public transportation. In this study, we present the service-oriented, multimodel Intelligent Journey Planning System, which we developed to assist travelers in journey planning. We selected Izmir, Turkey, as the pilot city for this system. The multicriteria problem is one of the well-known problems in transportation networks. Our study proposes a gradual path-finding algorithm that solves this problem by considering transfer count and travel time. The algorithm draws on techniques from efficient algorithms, including the round-based public transit optimized router, transit node routing, and contraction hierarchies, on the transportation graph. We employ Dijkstra’s algorithm after the first stage of the path-finding algorithm, applying stage-specific rules to reduce the search space and runtime. The experimental results show that our path-finding algorithm takes 0.63 seconds of processing time on average, which is acceptable for the user experience.
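
To make the two criteria concrete, here is a minimal sketch that is much simpler than the gradual algorithm described above. It is plain Dijkstra over (stop, line) states with a lexicographic cost, fewest transfers first and shortest travel time second, and it ignores timetables and waiting times. The function name, stops, lines, and durations are all hypothetical.

```python
import heapq


def plan_journey(edges, source, target):
    """Return (transfers, minutes) for the best journey, or None if unreachable.

    edges: iterable of (from_stop, to_stop, line, minutes).
    """
    graph = {}
    for from_stop, to_stop, line, minutes in edges:
        graph.setdefault(from_stop, []).append((to_stop, line, minutes))

    # Heap entries: (transfers, minutes, stop, line used to arrive).
    # The empty string marks "no line boarded yet".
    heap = [(0, 0, source, "")]
    settled = set()
    while heap:
        transfers, minutes, stop, line = heapq.heappop(heap)
        if stop == target:
            return transfers, minutes
        if (stop, line) in settled:
            continue
        settled.add((stop, line))
        for next_stop, next_line, cost in graph.get(stop, []):
            # Changing lines counts as a transfer; the first boarding does not.
            extra = 1 if line and next_line != line else 0
            heapq.heappush(
                heap, (transfers + extra, minutes + cost, next_stop, next_line)
            )
    return None


# Made-up network: staying on the bus is slower but needs no transfer.
edges = [
    ("A", "B", "bus", 10),
    ("B", "C", "bus", 25),
    ("B", "C", "ferry", 10),
]
best = plan_journey(edges, "A", "C")
```

On this network the result is 0 transfers and 35 minutes, even though switching to the ferry would take only 20 minutes, because transfers are minimized first. Swapping the first two tuple fields would prioritize travel time instead.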

    Multitask-based association rule mining

    Recently, there has been growing interest in association rule mining (ARM) in various fields. However, standard ARM algorithms fail to discover rules for multitask problems because they do not consider task-oriented investigation and therefore ignore the correlation among the tasks. To address this, the paper proposes a novel algorithm, named multitask association rule miner (MTARM), that jointly discovers rules by considering multiple tasks. The paper also introduces two novel concepts: single-task rule and multiple-task rule. In the first phase of the proposed approach, highly frequent local rules (single-task rules) are explored for each task separately; these local rules are then combined to produce the global result (multitask rules) using a majority voting mechanism. Experiments were conducted on four real-world multitask learning datasets. The experimental results indicate that the proposed MTARM approach discovers more information than traditional ARM algorithms by jointly considering the relationships among multiple tasks.
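
The two phases can be sketched as follows. This is an illustration under simplifying assumptions, not the MTARM implementation: rules are limited to one item on each side, local rules are filtered by plain support and confidence thresholds, and a rule becomes a multitask rule when it appears in more than half of the tasks. The function names and the transactions are made up.

```python
from collections import Counter
from itertools import permutations


def single_task_rules(transactions, min_support, min_confidence):
    """Phase 1: mine rules (a -> b) within one task's transactions (a list of sets)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = set()
    for a, b in permutations(items, 2):
        count_a = sum(a in t for t in transactions)
        count_ab = sum(a in t and b in t for t in transactions)
        support = count_ab / n
        confidence = count_ab / count_a
        if support >= min_support and confidence >= min_confidence:
            rules.add((a, b))
    return rules


def multitask_rules(tasks, min_support, min_confidence):
    """Phase 2: keep the rules found in a strict majority of tasks."""
    per_task = [single_task_rules(t, min_support, min_confidence) for t in tasks]
    votes = Counter(rule for rules in per_task for rule in rules)
    return {rule for rule, count in votes.items() if count * 2 > len(tasks)}


# Three hypothetical tasks, each with four transactions.
tasks = [
    [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"eggs"}],
    [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"milk"}, {"eggs"}],
    [{"eggs", "milk"}, {"eggs", "milk"}, {"bread"}, {"bread"}],
]
global_rules = multitask_rules(tasks, min_support=0.5, min_confidence=0.6)
```

The first two tasks both yield `bread -> milk` and `milk -> bread`, so those rules win the vote. The `eggs`/`milk` rules appear only in the third task and are dropped.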