34 research outputs found
Discovery of data dependencies in relational databases
Knowledge discovery in databases is not only the nontrivial extraction of implicit, previously unknown and potentially useful information from databases. We argue that, in contrast to machine learning, knowledge discovery in databases should be applied to real-world databases. Since real-world databases are known to be very large, they raise access problems. Real-world databases can therefore only be accessed through database management systems, and the number of accesses has to be reduced to a minimum. Considering this property, we are forced to use, for example, the standard set-oriented interfaces of relational database management systems in order to apply methods of knowledge discovery in databases. We present a system for discovering data dependencies, which is built upon a set-oriented interface. The main effort has gone into the discovery of value restrictions, unary inclusion dependencies and functional dependencies in relational databases. The system also embodies an inference relation to minimize database access.
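As a rough illustration of what testing such dependencies through a set-oriented interface can look like, the sketch below checks each dependency with a single SQL query, so that the database does the scanning and only one small result travels back. It is a minimal sketch and not the system described in the abstract: the use of SQLite, the function names, and the demo tables `employee` and `department` are all assumptions made for the example, and the inference relation mentioned in the abstract is not modeled.

```python
import sqlite3


def value_range(conn, table, column):
    """Value restriction: return (min, max) of a column with one query."""
    return conn.execute(
        f"SELECT MIN({column}), MAX({column}) FROM {table}"
    ).fetchone()


def holds_functional_dependency(conn, table, lhs, rhs):
    """Check lhs -> rhs: no lhs value may map to more than one rhs value.

    NULLs in rhs are ignored by COUNT(DISTINCT ...).
    """
    query = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {lhs} FROM {table} "
        f"GROUP BY {lhs} HAVING COUNT(DISTINCT {rhs}) > 1)"
    )
    (violations,) = conn.execute(query).fetchone()
    return violations == 0


def holds_unary_inclusion(conn, table_a, col_a, table_b, col_b):
    """Check table_a.col_a ⊆ table_b.col_b, ignoring NULLs on both sides."""
    query = (
        f"SELECT COUNT(*) FROM {table_a} "
        f"WHERE {col_a} IS NOT NULL AND {col_a} NOT IN "
        f"(SELECT {col_b} FROM {table_b} WHERE {col_b} IS NOT NULL)"
    )
    (missing,) = conn.execute(query).fetchone()
    return missing == 0


def build_demo_db():
    """Create a small in-memory database (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER, dept TEXT, dept_city TEXT)")
    conn.execute("CREATE TABLE department (name TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "A", "Bonn"), (2, "A", "Bonn"), (3, "B", "Dortmund")],
    )
    conn.executemany(
        "INSERT INTO department VALUES (?)", [("A",), ("B",), ("C",)]
    )
    return conn
```

On the demo data, `dept -> dept_city` holds and `employee.dept` is included in `department.name`, while the reverse inclusion fails because department `C` has no employees. Table and column names are interpolated directly into the SQL text, so the sketch assumes they come from the trusted database schema rather than from user input.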
Combining statistical learning with a knowledge-based approach
The paper describes a case study in combining different methods for acquiring medical knowledge. Given a huge amount of noisy, high-dimensional numerical time series data describing patients in intensive care, the support vector machine is used to learn when and how to change the dose of which drug. Given medical knowledge about and expertise in clinical decision making, a first-order logic knowledge base about the effects of therapeutic interventions has been built. As a preprocessing mechanism it uses another statistical method. The integration of numerical and knowledge-based procedures eases the task of validation in two ways. On the one hand, the knowledge base is validated with respect to past patients' records. On the other hand, medical interventions that are recommended by learning results are justified by the knowledge base.
Learning First Order Rules in Intensive Care Monitoring
This paper describes a study on learning first-order rules from noisy, real-world, numerical time series data describing patients in an intensive care unit. Given the specific states a patient is in, the learned rules predict doctors' interventions to restabilize the patient. As a data preparation and abstraction method, statistical phase state models are used. They transform the numerical signals, given on a minute-by-minute basis, into sequences of time intervals describing level changes. These new predicates are then used by the relational learner Rdt/Db. 1 Introduction Today's Clinical Information Systems (CIS) can provide the health care professional with the complete Electronic Patient Record (EPR) at the point of care. This data may include vital signs (e.g. heart rate, blood pressure), fluid intake and output, medications, as well as plans of care, doctor's orders, and entire clinical pathways. These CIS are very complex database systems that comprise between several hund..
A Multistrategy Approach to Relational Knowledge Discovery in Databases
When learning from very large databases, the reduction of complexity is of the highest importance. Two extremes of making knowledge discovery in databases (KDD) feasible have been put forward. One extreme is to choose a very simple hypothesis language and so be capable of very fast learning on real-world databases. The opposite extreme is to select a small data set and be capable of learning very expressive (first-order logic) hypotheses. A multistrategy approach makes it possible to combine most of the advantages and exclude most of the disadvantages. Simpler learning algorithms detect hierarchies that are used to structure the hypothesis space for a more complex learning algorithm. The better structured the hypothesis space is, the better learning can prune away uninteresting or losing hypotheses, and the faster it becomes. We have combined inductive logic programming (ILP) directly with a relational database. The ILP algorithm is controlled in a model-driven way by the user and in a ..
Direct Access of an ILP Algorithm to a Database Management System
When learning from very large databases, the reduction of complexity is of the highest importance. Two extremes of making knowledge discovery in databases (KDD) feasible have been put forward. One extreme is to choose a very simple hypothesis language and so be capable of very fast learning on real-world databases. The opposite extreme is to select a small data set and be capable of learning very expressive (first-order logic) hypotheses. We have combined inductive logic programming (ILP) directly with a relational database. The tool exploits a declarative specification of the syntactic form of hypotheses. We indicate the impact that different mappings from the learner's representation to that of the database have on the complexity of learning. We demonstrate how background knowledge can be structured and integrated into our learning framework. We conclude by discussing results from first tests. 1 Introduction Knowledge discovery in databases (KDD) is an application challenging machine ..
Discovery of Data Dependencies in Relational Databases
Since real-world databases are known to be very large, they raise access problems. Real-world databases can therefore only be accessed through database management systems, and the number of accesses has to be reduced to a minimum. Considering this property, we are forced to use the standard set-oriented interfaces of relational database management systems. We present a system for discovering data dependencies, which is built upon a set-oriented interface. The main effort has gone into the discovery of domain restrictions, unary inclusion dependencies and functional dependencies in relational databases. The system also embodies an inference relation to minimize database access. 1 Introduction Data dependencies are the most common type of semantic constraints in relational databases which determine the database design. Despite the advent of highly automated tools, database design still consists basically of two types of activities: first, reasoning about data types and data dependencies and..
Combining statistical learning with a knowledge-based approach -- A case study in intensive care monitoring
The paper describes a case study in combining different methods for acquiring medical knowledge. Given a huge amount of noisy, high-dimensional numerical time series data describing patients in intensive care, the support vector machine is used to learn when and how to change the dose of which drug. Given medical knowledge about and expertise in clinical decision making, a first-order logic knowledge base about the effects of therapeutic interventions has been built. As a preprocessing mechanism it uses another statistical method. The integration of numerical and knowledge-based procedures eases the task of validation in two ways. On the one hand, the knowledge base is validated with respect to past patients' records. On the other hand, medical interventions that are recommended by learning results are justified by the knowledge base. 1 Introduction In this paper, we want to present a challenging application of machine learning. The learning methods we use are already well known and theoret..
Ontology-Based Skills Management: Goals, Opportunities and Challenges
Establishing electronically accessible repositories of people's capabilities, experiences, and key knowledge areas is key in setting up Enterprise Knowledge Management. A skills repository can be used, for example, for finding people, staffing, skills gap analysis, and professional development. The ontology-based skills management system developed at Swiss Life uses RDF Schema for storing ontologies. Its query interface is based on a combined RQL and HTML query engine.
Knowledge Discovery and Knowledge Validation in Intensive Care
Operational protocols are a valuable means for quality control. However, developing operational protocols is a highly complex and costly task. We present an integrated approach, involving both intelligent data analysis and knowledge acquisition from experts, that supports the development of operational protocols. The aim is to ensure high quality standards for the protocol through empirical validation during development, as well as lower development cost through the use of machine learning and statistical techniques. We demonstrate our approach of integrating expert knowledge with data-driven techniques based on our effort to develop an operational protocol for the hemodynamic system. (Morik et al.; to appear in "Artificial Intelligence in Medicine", thematic issue on Knowledge-Based Information Management in Intensive Care and Anaesthesia.) Key words: operational protocols, online-monitoring, time series a..