4 research outputs found

    Finding patterns in student and medical office data using rough sets

    Data were obtained from King Khaled General Hospital in Saudi Arabia. In this project, I try to discover patterns in these data using the algorithms implemented in an experimental tool called Rough Set Graphic User Interface (RSGUI). Several algorithms are available in RSGUI, each based on rough set theory. My objective is to find short, meaningful predictive rules. The first step is to find a minimum set of attributes that fully characterizes the data. Some of the rules generated from this minimum set will be obvious, and therefore uninteresting; others will be surprising, and therefore interesting. The usual measures of rule strength, such as rule length, certainty, and coverage, were considered. In addition, a measure of rule interestingness was developed based on questionnaires administered to human subjects. The RSGUI Java code contained bugs. One algorithm in particular, the Inductive Learning Algorithm (ILA), missed some cases; these were resolved in ILA2, but the fix had not been carried over to RSGUI. I fixed the ILA issue in RSGUI, so ILA in RSGUI now runs well and gives good results for all cases encountered in the hospital administration and student records data. (Master's thesis)
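    The certainty and coverage measures mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard rough-set definitions for a rule "if C then D": certainty is the share of records matching C that also match D, and coverage is the share of records matching D that also match C. The records, attribute names, and the helper names `matches` and `rule_measures` are hypothetical and are not taken from RSGUI or from the thesis data.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches(record: Record, pattern: Record) -> bool:
    """True when the record has every attribute value listed in the pattern."""
    return all(record.get(attr) == value for attr, value in pattern.items())


def rule_measures(records: List[Record], condition: Record, decision: Record) -> Dict[str, float]:
    """Length, certainty and coverage of the rule 'if condition then decision'."""
    n_condition = sum(1 for r in records if matches(r, condition))
    n_decision = sum(1 for r in records if matches(r, decision))
    n_both = sum(1 for r in records if matches(r, condition) and matches(r, decision))
    return {
        "length": len(condition),  # number of attribute tests in the rule
        "certainty": n_both / n_condition if n_condition else 0.0,
        "coverage": n_both / n_decision if n_decision else 0.0,
    }


# Hypothetical toy records; attribute names and values are invented for illustration.
records = [
    {"dept": "cardiology", "visit": "followup", "outcome": "admitted"},
    {"dept": "cardiology", "visit": "followup", "outcome": "admitted"},
    {"dept": "cardiology", "visit": "new", "outcome": "discharged"},
    {"dept": "dental", "visit": "new", "outcome": "discharged"},
    {"dept": "cardiology", "visit": "followup", "outcome": "discharged"},
    {"dept": "dental", "visit": "followup", "outcome": "discharged"},
]

measures = rule_measures(
    records,
    condition={"dept": "cardiology", "visit": "followup"},
    decision={"outcome": "admitted"},
)
print(measures)
```

    In this toy table the rule matches three records on its condition, two of which are admitted, and it accounts for both admitted records, so a short rule can have full coverage while its certainty stays below one.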

    İmalat sanayinde veri madenciliği destekli tedarikçi seçimi uygulaması (A data mining supported supplier selection application in the manufacturing industry)

    Opened to full-text access pursuant to the "Law Amending the Higher Education Law and Certain Laws and Decree-Laws" published in Official Gazette No. 30352 of 06.03.2018 and the "Directive on the Collection, Organization, and Opening to Access of Graduate Theses in Electronic Form" of 18.06.2018. Data, the basic building block of knowledge, has gained further importance with the recent emergence of the concept of data mining. Interest and investment in data mining have reached large sums worldwide as well as in Turkey. Data mining is widely used around the world in retail, e-commerce, banking, insurance, telecommunications, health, and education. In Turkey it has recently come into use, especially in retail, banking, insurance, and e-government. The use of data mining in manufacturing is not yet widespread; one reason that can be cited is that different techniques are used in this field. Recently, however, using data mining techniques together with MRP and ERP systems has begun to give good results, and some approaches even present data mining as part of the ERP system. This study examines in detail the definition of data mining, its areas of application, and its models and algorithms. The application stage uses real data from a company operating in the manufacturing sector. In the first stage, the data were organized into a data set, which was then analyzed with a suitable model. SPSS Clementine 9.0 was used for the analysis. The results were tested with statistical methods, and meaningful results that bear on the company's relations with its suppliers were obtained. In the final stage, the model was given a dynamic structure that can be used both by the company whose data were used and by similar companies. By showing that data mining can also be used successfully in manufacturing, outside its common areas of application, this study stands apart from others and offers a new perspective to researchers who wish to work in this field.

    Improved rule discovery performance on uncertainty

    In this paper we describe an improved version of a novel rule induction algorithm, ILA. We first outline the basic algorithm and then show how it is enhanced with a new evaluation metric that handles uncertainty in a given data set. In addition to inducing rules faster than the original, the improved algorithm contributes a new metric that lets users express their preferences through a penalty factor. We use this penalty factor to tackle over-fitting bias, which is inherent in a great many inductive algorithms. We compare the improved algorithm, ILA-2, with a variety of induction algorithms, including ID3, OC1, C4.5, CN2, and ILA. According to our preliminary experimental work, the algorithm appears comparable to well-known algorithms such as CN2 and C4.5 in terms of accuracy and size.
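    The abstract does not give the evaluation metric itself, so the following is only a sketch of how a penalty factor could trade generality against purity when choosing a rule condition. It assumes a score of the form "examples of the target class covered, minus the penalty factor times examples of other classes covered"; that form, the `Candidate` class, and the counts are illustrative assumptions, not the formula from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate rule condition with its coverage counts."""
    name: str
    covered_target: int  # examples of the target class the condition matches
    covered_other: int   # examples of other classes the condition also matches


def penalized_score(candidate: Candidate, penalty: float) -> float:
    # Assumed scoring form: reward correct coverage, penalize wrong coverage.
    return candidate.covered_target - penalty * candidate.covered_other


def best_candidate(candidates: List[Candidate], penalty: float) -> Candidate:
    """Pick the candidate with the highest penalized score."""
    return max(candidates, key=lambda c: penalized_score(c, penalty))


# Invented counts: one general but slightly impure condition, one pure but narrower.
candidates = [
    Candidate("general", covered_target=10, covered_other=1),
    Candidate("pure", covered_target=6, covered_other=0),
]

low_penalty_choice = best_candidate(candidates, penalty=1)
high_penalty_choice = best_candidate(candidates, penalty=5)
print(low_penalty_choice.name, high_penalty_choice.name)
```

    Under this assumed score, a small penalty favors the general condition that tolerates one misclassified example, while a large penalty favors the pure one, which is the kind of user-controlled preference the abstract describes.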