
    A survey of kernel and spectral methods for clustering

    Clustering algorithms are a useful tool for exploring data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem in which an appropriate objective function has to be optimized. Because these two seemingly different approaches have been proven to share the same mathematical foundation, an explicit proof that the two paradigms have the same objective is reported. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
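To make the kernel K-means idea mentioned in this abstract concrete, the following is a minimal sketch and not the survey's own formulation. It assumes a Gaussian (RBF) kernel and an initial labelling supplied by the caller; the names `rbf_kernel`, `kernel_kmeans` and the parameter `gamma` are illustrative choices. Distances to cluster centres are computed in feature space using kernel values only, so the centres are never formed explicitly.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_kmeans(points, labels, n_clusters, gamma=1.0, max_iter=100):
    """Refine an initial labelling with kernel K-means and return the final labels."""
    n = len(points)
    # Kernel (Gram) matrix: K[i][j] = k(points[i], points[j]).
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Squared norm of each cluster centre in feature space.
        centre_norm = [
            sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                # ||phi(x_i) - centre_c||^2 expressed through kernel values.
                dist = (K[i][i]
                        - 2.0 * sum(K[i][j] for j in m) / len(m)
                        + centre_norm[c])
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels
```

As with ordinary K-means, the result depends on the initial labelling and on the kernel width, so in practice several restarts are usually compared.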

    Multiple instance fuzzy inference.

    A novel fuzzy learning framework that employs fuzzy inference to solve the problem of multiple instance learning (MIL) is presented. The framework introduces a new class of fuzzy inference systems called Multiple Instance Fuzzy Inference Systems (MI-FIS). Fuzzy inference is a powerful modeling framework that can effectively handle computing with knowledge uncertainty and measurement imprecision. Fuzzy inference performs a non-linear mapping from an input space to an output space by deriving conclusions from a set of fuzzy if-then rules and known facts. Rules can be identified from expert knowledge or learned from data. In multiple instance problems, the training data is ambiguously labeled: instances are grouped into bags, and the labels of bags are known but not those of individual instances. MIL deals with learning a classifier at the bag level. Over the years, many solutions to this problem have been proposed. However, no MIL formulation employing fuzzy inference exists in the literature. In this dissertation, we introduce multiple instance fuzzy logic that enables fuzzy reasoning with bags of instances. Accordingly, different multiple instance fuzzy inference styles are proposed. The Multiple Instance Mamdani style fuzzy inference (MI-Mamdani) extends the standard Mamdani style inference to compute with multiple instances. The Multiple Instance Sugeno style fuzzy inference (MI-Sugeno) is an extension of the standard Sugeno style inference to handle reasoning with multiple instances. In addition to the MI-FIS inference styles, one of the main contributions of this work is an adaptive neuro-fuzzy architecture designed to handle bags of instances as input and capable of learning from ambiguously labeled data. The proposed architecture, called Multiple Instance-ANFIS (MI-ANFIS), extends the standard Adaptive Neuro Fuzzy Inference System (ANFIS). We also propose different methods to identify and learn fuzzy if-then rules in the context of MIL. In particular, a novel learning algorithm for MI-ANFIS is derived. The learning is achieved by using the backpropagation algorithm to identify the premise parameters and consequent parameters of the network. The proposed framework is tested and validated using synthetic and benchmark datasets suitable for MIL problems. Additionally, we apply the proposed multiple instance inference to the problem of region-based image categorization, as well as to fusing the output of multiple discrimination algorithms for the purpose of landmine detection using Ground Penetrating Radar.
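For readers unfamiliar with the standard Sugeno style inference that this abstract builds on, here is a minimal sketch of the single-instance case only. It does not reproduce MI-Sugeno or MI-ANFIS, whose details the abstract does not give. The sketch assumes Gaussian membership functions, product as the fuzzy AND, and constant (zero-order) rule consequents; the names `gaussian_mf` and `sugeno_infer` and the rule layout are illustrative.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set with the given center and width."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_infer(x, rules):
    """Zero-order Sugeno inference for one input vector.

    rules is a list of (premises, consequent) pairs, where premises holds one
    (center, sigma) pair per input dimension and consequent is a constant output.
    """
    strengths = []
    for premises, _ in rules:
        # Firing strength: product of the memberships over all input dimensions.
        w = 1.0
        for xi, (center, sigma) in zip(x, premises):
            w *= gaussian_mf(xi, center, sigma)
        strengths.append(w)
    total = sum(strengths)
    if total == 0.0:
        return 0.0  # no rule fires; fall back to a neutral output
    # Output: firing-strength-weighted average of the rule consequents.
    return sum(w * out for w, (_, out) in zip(strengths, rules)) / total


# Two hypothetical rules on a single input: "x is LOW -> 0" and "x is HIGH -> 1".
rules = [([(0.0, 1.0)], 0.0), ([(1.0, 1.0)], 1.0)]
midpoint_output = sugeno_infer([0.5], rules)
```

In an ANFIS-style network the centers, widths and consequents used here would be the premise and consequent parameters that training adjusts.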

    On the role of pre and post-processing in environmental data mining

    The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data often contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework for preparing data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved. The role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems is discussed.

    Koneoppimiskehys petrokemianteollisuuden sovelluksille (A machine learning framework for petrochemical industry applications)

    Machine learning has many potentially useful applications in the process industry, for example in process monitoring and control. Continuously accumulating process data and recent developments in software and hardware that enable more advanced machine learning are fulfilling the prerequisites for developing and deploying machine learning applications that are integrated with process automation and that improve existing functionalities or even implement artificial intelligence. In this master's thesis, a framework is designed and implemented at a proof-of-concept level to enable easy acquisition of process data for use with modern machine learning libraries, and also to enable scalable online deployment of the trained models. The literature part of the thesis concentrates on the current state of, and approaches to, digital advisory systems for process operators, as a potential application to be developed on the machine learning framework. The literature study shows that the approaches behind process operators' decision support tools have shifted from rule-based and knowledge-based methods to machine learning. However, no standard methods can be identified, and most of the use cases are quite application-specific. In the developed machine learning framework, both commercial software and open source components with permissive licenses are used. Data is acquired over OPC UA and then processed in Python, which is currently almost the de facto standard language in data analytics. A microservice architecture with containerization is used in the online deployment, and in a qualitative evaluation it proved to be a versatile and functional solution.

    A Review of Classification Problems and Algorithms in Renewable Energy Applications

    Classification problems and their corresponding solving approaches constitute one of the fields of machine learning. The application of classification schemes in Renewable Energy (RE) has gained significant attention in the last few years, contributing to the deployment, management and optimization of RE systems. The main objective of this paper is to review the most important classification algorithms applied to RE problems, including both classical and novel algorithms. The paper also provides a comprehensive literature review and discussion on different classification techniques in specific RE problems, including wind speed/power prediction, fault diagnosis in RE systems, power quality disturbance classification and other applications in alternative RE systems. In this way, the paper describes classification techniques and metrics applied to RE problems, thus being useful both for researchers dealing with this kind of problem and for practitioners of the field

    Identification of Alphanumeric Pattern Using Android

    “Identification of Alphanumeric Pattern Using Android” is a smartphone app for the Android platform that combines Optical Character Recognition (OCR) with the identification of alphanumeric patterns; after processing, the data is stored on a server. This paper presents the design of an app, built with the Android SDK, that identifies alphanumeric patterns using an optical character reader technique on Android-based smartphones. The camera captures the document image, and OCR then converts that image into text (binarization of the captured data) according to an alphanumeric (alphabetic and numeric characters) database; the resulting data is stored on the server. DOI: 10.17762/ijritcc2321-8169.160414