214,607 research outputs found

    ADVANCES IN KNOWLEDGE DISCOVERY IN DATABASES

    The Knowledge Discovery in Databases and Data Mining field proposes the development of methods and techniques for assigning useful meanings to data stored in databases. It gathers research from many fields of study, such as machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, data visualization and grids. While Data Mining represents a set of specific algorithms for finding useful meanings in stored data, Knowledge Discovery in Databases represents the overall process of finding knowledge and includes Data Mining as one step among others, such as selection, pre-processing, transformation and interpretation of mined data. This paper aims to point out the most important steps that have been made in the Knowledge Discovery in Databases field of study and to show how the overall discovery process can be improved in the future.

    Combining expert knowledge and databases for risk management

    Correctness, transparency and effectiveness are the principal attributes of knowledge derived from databases. In current data mining research there is a focus on efficiency improvement of algorithms for knowledge discovery. However, important limitations of data mining can only be resolved by the integration of knowledge of experts in the field, encoded in some accessible way, with knowledge derived from patterns in the database. In this paper we will in particular discuss methods for combining expert knowledge and knowledge derived from transaction databases. The framework proposed is applicable to a wide variety of risk management problems. We will illustrate the method in a case study on fraud discovery in an insurance company.
    Keywords: risk management; data mining; knowledge discovery; knowledge based systems

    Mining geo-referenced databases: a way to improve decision-making

    Knowledge discovery in databases is a process that aims at the discovery of associations within data sets. The analysis of geo-referenced data demands a particular approach in this process. This chapter presents a new approach to the process of knowledge discovery, in which qualitative geographic identifiers give the positional aspects of geographic data. Those identifiers are manipulated using qualitative reasoning principles, which allows for the inference of new spatial relations required for the data mining step of the knowledge discovery process. The efficacy and usefulness of the implemented system, PADRÃO, have been tested with a bank dataset. The results obtained support the claim that traditional knowledge discovery systems, developed for relational databases and not having semantic knowledge linked to spatial data, can be used in the process of knowledge discovery in geo-referenced databases, since some of this semantic knowledge and the principles of qualitative spatial reasoning are available as spatial domain knowledge.

    Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.

    Product design and manufacturing process improvement using association rules

    Modern manufacturing systems equipped with computerized data logging systems collect large volumes of data in real time. The data may contain valuable information for operation and control strategies as well as providing knowledge of normal and abnormal operational patterns. Knowledge discovery in databases can be applied to these data to unearth hidden, unknown, representable, and ultimately useful knowledge. Data mining offers tools for discovery of patterns, associations, changes, anomalies, rules, and statistically significant structures and events in data. Extraction of previously unknown, meaningful information from manufacturing databases provides knowledge that may benefit many application areas within the enterprise, for example improving design or fine tuning production processes. This paper examines the application of association rules to manufacturing databases to extract useful information about a manufacturing system's capabilities and its constraints. The quality of each identified rule is tested and, from numerous rules, only those that are statistically very strong and contain substantial design information are selected. The final set of extracted rules contains very interesting information relating to the geometry of the product and also indicates where limitations exist for improvement of the manufacturing processes involved in the production of complex geometric shapes
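
    The rule-selection idea in this abstract can be illustrated with a minimal sketch. It assumes plain support and confidence thresholds as a stand-in for the paper's own rule-quality test, which the abstract does not specify. The records, attribute names and thresholds below are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical process log: each record is a set of attribute=value items.
RECORDS = [
    frozenset({"radius=small", "tool=A", "defect=yes"}),
    frozenset({"radius=small", "tool=A", "defect=yes"}),
    frozenset({"radius=small", "tool=B", "defect=no"}),
    frozenset({"radius=large", "tool=A", "defect=no"}),
    frozenset({"radius=large", "tool=B", "defect=no"}),
]


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for r in records if itemset <= r) / len(records)


def mine_rules(records, min_support=0.3, min_confidence=0.9, max_size=3):
    """Brute-force rule mining: enumerate itemsets, keep strong rules.

    Returns tuples (antecedent, consequent, support, confidence).
    Suitable only for tiny examples; it is exponential in the item count.
    """
    items = sorted(set().union(*records))
    rules = []
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, records)
            if s == 0 or s < min_support:
                continue  # infrequent itemset: no rule is derived from it
            for k in range(1, size):
                for antecedent in combinations(candidate, k):
                    # confidence = support(A and C) / support(A)
                    conf = s / support(antecedent, records)
                    if conf >= min_confidence:
                        consequent = tuple(
                            i for i in candidate if i not in antecedent
                        )
                        rules.append((antecedent, consequent, s, conf))
    return rules


if __name__ == "__main__":
    for antecedent, consequent, s, conf in mine_rules(RECORDS):
        print(antecedent, "->", consequent, f"support={s:.2f} conf={conf:.2f}")
```

    On this toy log the rule (radius=small, tool=A) -> (defect=yes) passes both thresholds, while radius=small -> tool=A is rejected because its confidence is only 2/3.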

    A BELIEF-DRIVEN DISCOVERY FRAMEWORK BASED ON DATA MONITORING AND TRIGGERING

    A new knowledge-discovery framework, called Data Monitoring and Discovery Triggering (DMDT), is defined, where the user specifies monitors that "watch" for significant changes to the data and changes to the user-defined system of beliefs. Once these changes are detected, knowledge discovery processes, in the form of data mining queries, are triggered. The proposed framework is the result of an observation, made in the previous work of the authors, that when changes to the user-defined beliefs occur, there are interesting patterns in the data. In this paper, we present an approach for finding these interesting patterns using data monitoring and belief-driven discovery techniques. Our approach is especially useful in those applications where data changes rapidly with time, as in some of the On-Line Transaction Processing (OLTP) systems. The proposed approach integrates active databases, data mining queries and subjective measures of interestingness based on user-defined systems of beliefs in a novel and synergetic way to yield a new type of data mining system. Information Systems Working Papers Series
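
    The monitor-and-trigger idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the DMDT design itself: a belief is reduced here to an expected rate, and the class name, its fields and the tolerance test are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BeliefMonitor:
    """Hypothetical monitor: a belief is modelled as an expected rate."""

    name: str
    believed_rate: float              # rate the user expects to see
    tolerance: float                  # allowed deviation before triggering
    on_trigger: Callable[[str, float], None]  # stands in for a mining query
    hits: int = 0
    total: int = 0

    def observe(self, matches: bool) -> None:
        """Record one new data item and whether it agrees with the belief."""
        self.total += 1
        self.hits += int(matches)

    def check(self) -> bool:
        """Trigger discovery if the observed rate contradicts the belief."""
        if self.total == 0:
            return False
        observed = self.hits / self.total
        if abs(observed - self.believed_rate) > self.tolerance:
            self.on_trigger(self.name, observed)
            return True
        return False


if __name__ == "__main__":
    triggered = []
    monitor = BeliefMonitor(
        name="weekend_orders_are_small",
        believed_rate=0.9,
        tolerance=0.1,
        on_trigger=lambda name, rate: triggered.append((name, rate)),
    )
    for matches in (True, True, False, False):
        monitor.observe(matches)
    monitor.check()
    print(triggered)
```

    In a fuller system the callback would launch a data mining query over the changed data; here it only records which belief was contradicted and by how much.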

    Computing iceberg concept lattices with Titanic

    We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: Computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data.
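
    To make the terms concrete, the sketch below computes the closed itemsets (concept intents) whose support reaches a threshold, which is the content of an iceberg concept lattice. It is a naive enumeration over all attribute subsets, not the level-wise TITANIC algorithm, and it uses support as the weight function. The toy context and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it has.
CONTEXT = {
    "t1": frozenset("abc"),
    "t2": frozenset("ab"),
    "t3": frozenset("ac"),
    "t4": frozenset("a"),
}


def extent(itemset, context):
    """Objects that have every attribute of the itemset."""
    return frozenset(o for o, attrs in context.items() if itemset <= attrs)


def closure(itemset, context):
    """Largest attribute set shared by all objects in the itemset's extent."""
    objs = extent(itemset, context)
    if not objs:
        # No object matches: the closure is the full attribute set.
        return frozenset().union(*context.values())
    return frozenset.intersection(*(context[o] for o in objs))


def iceberg_intents(context, min_support):
    """Map each closed itemset with support >= min_support to its support.

    Naive: closes every attribute subset, so it is exponential in the
    number of attributes and only meant for small examples.
    """
    attributes = sorted(frozenset().union(*context.values()))
    n_objects = len(context)
    intents = {}
    for size in range(len(attributes) + 1):
        for combo in combinations(attributes, size):
            closed = closure(frozenset(combo), context)
            supp = len(extent(closed, context)) / n_objects
            if supp >= min_support:
                intents[closed] = supp
    return intents


if __name__ == "__main__":
    for intent, supp in sorted(
        iceberg_intents(CONTEXT, 0.5).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(intent), supp)
```

    With a minimum support of 0.5 the toy context yields three intents: {a} with support 1.0, and {a, b} and {a, c} with support 0.5 each. The full intent {a, b, c} falls below the threshold and is cut off, which is the "iceberg" effect.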

    Enterprise knowledge portals: two projects in the United States Department of the Navy

    Two projects in the US Department of the Navy to develop enterprise portals for facilitating knowledge discovery and dissemination are discussed. The authors describe efforts within a global organization to capitalize on current knowledge management concepts and technologies for knowledge access and sharing in order to provide users with more personalized, responsive and integrated information systems. The Next Generation Library supports knowledge management and networking objectives, as well as providing high-quality content access at the desktop. The Naval Postgraduate School Knowledge Portal, still under development, is designed to link internal administrative databases with current message traffic and external scholarly information resources