10 research outputs found

    Ensemble learning for ranking interesting attributes

    Machine learning knowledge representations, such as decision trees, are often incomprehensible to humans. They can also contain errors specific to the representation type and the data used to generate them. By combining larger, less comprehensible decision trees, it is possible to obtain an ensemble that is more accurate than the best individual tree. The thesis examines an ensemble learning technique and presents a unique knowledge elicitation technique which produces an ordered ranking of attributes by their importance in leading to more desirable classifications. The technique compares full branches of decision trees, finding the set difference of shared attributes. The combination of this information from all ensemble members is used to build an importance table which allows attributes to be ranked ordinally and by relative magnitude. A case study utilizing this method is discussed, and its results are presented and summarized.
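The abstract does not give the details of the branch comparison, so the following is only a minimal sketch under stated assumptions: each tree is represented as a list of branches, each branch is a dictionary of attribute tests plus a class label, and an attribute scores a point whenever a branch leading to the desirable class and a branch leading to another class both test it but with different values. The names `differing_attributes` and `importance_table`, the loan data and the scoring rule are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import product


def differing_attributes(good_branch, bad_branch):
    """Attributes tested on both branches whose tested values differ (assumed comparison rule)."""
    shared = good_branch.keys() & bad_branch.keys()
    return {attr for attr in shared if good_branch[attr] != bad_branch[attr]}


def importance_table(ensemble, desirable):
    """Tally, over every tree in the ensemble, how often each attribute separates
    a desirable branch from an undesirable one."""
    table = Counter()
    for tree in ensemble:
        good = [conds for conds, label in tree if label == desirable]
        bad = [conds for conds, label in tree if label != desirable]
        for g, b in product(good, bad):
            table.update(differing_attributes(g, b))
    return table


# Hypothetical two-tree ensemble: each branch is (attribute tests, class label).
ensemble = [
    [({"income": "high", "debt": "low"}, "approve"),
     ({"income": "high", "debt": "high"}, "reject"),
     ({"income": "low"}, "reject")],
    [({"debt": "low", "age": "adult"}, "approve"),
     ({"debt": "high"}, "reject")],
]

table = importance_table(ensemble, desirable="approve")
ranking = table.most_common()  # ordinal ranking plus relative magnitude (the counts)
print(ranking)
```

Under these assumptions the counts give both an ordinal ranking and a relative magnitude; a real implementation would follow whatever comparison and weighting the thesis actually defines.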

    ZACCAR: sistema de conhecimento para apoio à gestão do relacionamento com clientes [ZACCAR: a knowledge system to support customer relationship management]

    Doctoral thesis in Information Systems and Technologies (specialization in Information Systems Engineering and Management).
    This thesis presents a tool for managing, integrating and consolidating the Customers' Behaviour Knowledge (CBK) obtained by applying data mining tools to an organisation's transactional databases. Data mining tools automate the detection of customer behaviour patterns in those databases, in a process called Knowledge Discovery in Databases (KDD). These patterns may be passed to organisational agents and used in marketing campaigns and other activities in the organisation. However, the CBK is not usually treated in a way that allows analysis of why it arose or how it evolves, or its consolidation with other existing CBK. We consider that this situation can be improved by introducing a new concept, the management of CBK, which leads to a new organisational activity: taking care of the CBK. The management of CBK is understood as the confrontation of this knowledge with other knowledge already existing in the organisation, resolving potential conflicts, updating it and adding pertinent explanations for the observed temporal evolution.
    The main contributions of this work are:
    - the presentation of the concept "the management of the Customers' Behaviour Knowledge", which leads to a new organisational task, "to take care of the Customers' Behaviour Knowledge";
    - the establishment of a structure for the CBK and for recording it;
    - the design and exploration of a knowledge system to support the management of the CBK, the ZACCAR system (Zelar pela Aquisição do Conhecimento dos Clientes, sua Actualização e Registo; roughly, taking care of the acquisition of customer knowledge, its updating and recording), whose main objective is to make the new task viable through: the collection and standardisation of the behaviour patterns obtained with a data mining tool; the confrontation of these patterns with the existing CBK, updating it; the validation and documentation of the updated knowledge by the organisational knowledge manager; and the integration of the updated and completed knowledge into a knowledge base that becomes an integral part of the organisational knowledge;
    - the process of consolidating the knowledge discovered in databases, resolving problems of interpretation, integration and conflict.
    In pursuing these objectives, a detailed analysis was made of the practice of CRM (Customer Relationship Management) and its relation to organisational knowledge, as well as of the CBK, with emphasis on how this knowledge is treated. ZACCAR can be considered an innovative system, since it allows organisations to have a knowledge base of the CBK that is updated semi-automatically, that also records the evolution of customer behaviour patterns, and that forms an integral part of the organisational knowledge.
    A prototype of ZACCAR was developed using existing technology. To demonstrate its feasibility, two case studies were conducted; they showed that the system has interesting potential that could prove very useful in any company where it is deployed, either as an independent system or integrated into other, broader enterprise systems.
    Project partially funded by a PRODEP II grant, measure 5, action 5.2, call no. 1/96, Doctorates.

    Data mining using neural networks

    Data mining is about the search for relationships and global patterns in large databases that are increasing in size. Data mining is beneficial for anyone who has a huge amount of data, for example customer and business data, or transaction, marketing, financial, manufacturing and web data. The results of data mining are also referred to as knowledge in the form of rules, regularities and constraints. Rule mining is one of the popular data mining methods, since rules provide concise statements of potentially important information that end users can easily understand, as well as actionable patterns. Rule mining has received a good deal of attention and enthusiasm from data mining researchers because it is capable of solving many data mining problems such as classification, association, customer profiling, summarization, segmentation and many others. This thesis makes several contributions by proposing rule mining methods using genetic algorithms and neural networks. The thesis first proposes rule mining methods using a genetic algorithm. These methods are based on an integrated framework yet are capable of mining three major classes of rules. Moreover, the rule mining processes in these methods are controlled by tuning two data mining measures, support and confidence. The thesis shows how to build predictive data mining models using the rules produced by the proposed methods. Another key contribution of the thesis is the proposal of rule mining methods using supervised neural networks. The thesis mathematically analyses the Widrow-Hoff learning algorithm of a single-layered neural network, which results in a foundation for rule mining algorithms using single-layered neural networks. On the basis of the proposed theorems, three rule mining algorithms using single-layered neural networks are proposed for the three major classes of rules. The thesis also looks at the problem of rule mining where user guidance is absent, and proposes a guided rule mining system to overcome this problem. It extends this work further by comparing the performance of the algorithm used in the proposed guided rule mining system with the Apriori data mining algorithm. Finally, the thesis studies the Kohonen self-organizing map as an unsupervised neural network for rule mining algorithms. Two approaches are adopted, depending on how the self-organizing map is applied in the rule mining model. In the first approach, the self-organizing map is used for clustering, which provides class information to the rule mining process. In the second approach, automated rule mining takes the place of trained neurons as it grows in a hierarchical structure.
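The two measures named in the abstract, support and confidence, have standard definitions for a rule "antecedent => consequent" over a set of transactions. The sketch below only illustrates those definitions; the function names and the toy transactions are illustrative assumptions and do not reproduce the thesis's genetic-algorithm or neural-network methods.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread transactions
print(f"support={s:.2f}, confidence={c:.2f}")
```

A rule miner typically keeps only rules whose support and confidence exceed user-chosen thresholds, which is one way tuning these two measures can control the mining process.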

    Ontologie-basierte Monosemierung - Bestimmung von Referenzen im SemanticWeb [Ontology-based disambiguation: determining references in the Semantic Web]

    The present work deals with this topic, in particular with the problem of ambiguity that arises when natural-language information is merged with the knowledge represented by ontologies.

    Ontologie-basierte Monosemierung [Ontology-based disambiguation]

    Ontologies require an unambiguous identification of the elements they describe. Once natural language is integrated, the issue of ambiguity also enters the formally ordered representation. An unambiguous search based on natural language therefore appears impossible at first. The focus of this work is on solving the problem of ambiguity that arises when natural-language information is merged with the knowledge represented by ontologies.

    An Integrated Knowledge Discovery and Data Mining Process Model

    Enterprise decision making is continuously transforming in the wake of ever-increasing amounts of data. Organizations are collecting massive amounts of data in their quest for knowledge nuggets in the form of novel, interesting, understandable patterns that underlie these data. The search for knowledge is a multi-step process comprising various phases, including development of domain (business) understanding, data understanding, data preparation, modeling, evaluation and, ultimately, the deployment of the discovered knowledge. These phases are represented in the form of Knowledge Discovery and Data Mining (KDDM) process models that are meant to provide explicit support for executing the complex and iterative knowledge discovery process. A review of existing KDDM process models reveals that they have certain limitations (fragmented design, only a checklist-type description of tasks, lack of support for executing tasks, especially those of the business understanding phase, etc.) which are likely to affect the efficiency and effectiveness with which KDDM projects are currently carried out. This dissertation addresses the identified limitations of existing KDDM process models through an improved model, named the Integrated Knowledge Discovery and Data Mining (IKDDM) Process Model, which presents an integrated view of the KDDM process and provides explicit support for executing each of the tasks outlined in the model. We also evaluate the effectiveness and efficiency offered by the IKDDM model against CRISP-DM, a leading KDDM process model, in aiding data mining users to execute various tasks of the KDDM process. Results of statistical tests indicate that the IKDDM model outperforms CRISP-DM in terms of efficiency and effectiveness; the IKDDM model also outperforms CRISP-DM in terms of the quality of the process model itself.