18 research outputs found
The investigation of the Bayesian rough set model
Abstract: The original Rough Set model is concerned primarily with algebraic properties of approximately defined sets. The Variable Precision Rough Set (VPRS) model extends basic rough set theory to incorporate probabilistic information. The article presents a non-parametric modification of the VPRS model, called the Bayesian Rough Set (BRS) model, in which the set approximations are defined using the prior probability as a reference. Mathematical properties of BRS are investigated. It is shown that the quality of BRS models can be evaluated using a probabilistic gain function, which is suitable for identifying and eliminating redundant attributes.
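A minimal sketch of the BRS idea described above, on a made-up toy decision table (attribute names and data are illustrative, not from the paper): an equivalence class of objects is assigned to the positive region of a target set when the conditional probability of the set within that class exceeds the prior probability, to the negative region when it falls below the prior, and to the boundary otherwise.

```python
from collections import defaultdict

def brs_regions(objects, condition, target):
    """Partition equivalence classes (induced by the condition attributes)
    into BRS positive/negative/boundary regions, using the prior P(X)
    of the target set as the reference probability."""
    prior = sum(target(o) for o in objects) / len(objects)
    classes = defaultdict(list)
    for o in objects:
        classes[condition(o)].append(o)
    regions = {"positive": [], "negative": [], "boundary": []}
    for key, members in classes.items():
        p = sum(target(o) for o in members) / len(members)  # P(X | E)
        if p > prior:
            regions["positive"].append(key)
        elif p < prior:
            regions["negative"].append(key)
        else:
            regions["boundary"].append(key)
    return regions

# Toy decision table: (temperature, humidity, failure?)
data = [("high", "wet", 1), ("high", "wet", 1), ("high", "dry", 0),
        ("low", "wet", 0), ("low", "dry", 0), ("low", "dry", 1)]
regions = brs_regions(data, condition=lambda o: o[:2], target=lambda o: o[2])
print(regions)
```

Here the prior is P(failure) = 0.5, so classes whose failure rate matches the prior carry no probabilistic information and land in the boundary.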
Credibility coefficients based on frequent sets
Credibility coefficients are heuristic measures applied to the objects of an information system. They were introduced to assess the similarity of objects with respect to other data in information systems or decision tables. By applying knowledge discovery methods it is possible to derive rules and dependencies from data. However, the knowledge obtained can be corrupted or incomplete due to improper data, so the importance of identifying such exceptions cannot be overestimated. It is assumed that the majority of the data is correct and only a minor part may be improper; the credibility coefficient of an object should indicate to which group that object probably belongs. The main focus of the paper is an algorithm for calculating credibility coefficients. The algorithm is based on frequent sets, which are produced during data analysis based on rough set theory. Some background on rough set theory is supplied to allow the credibility coefficient formulas to be expressed. The implementation and applications of credibility coefficients are presented, along with a discussion of practical results in identifying improper data.
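The paper's exact coefficient formulas are not reproduced in this abstract, so the following is only a loose illustrative sketch of the general idea: mine frequent attribute-value sets from the table, then score each object by the support-weighted fraction of frequent sets it matches, so that objects matching few frequent patterns (candidates for improper data) receive low scores. The scoring function here is hypothetical, not the paper's definition.

```python
from itertools import combinations
from collections import Counter

def frequent_sets(rows, min_support):
    """Mine frequent attribute-value sets by brute force (toy scale only)."""
    counts = Counter()
    for row in rows:
        items = frozenset(row.items())
        for k in range(1, len(items) + 1):
            for sub in combinations(sorted(items), k):
                counts[frozenset(sub)] += 1
    n = len(rows)
    return {fs: c / n for fs, c in counts.items() if c / n >= min_support}

def credibility(row, freq):
    """Hypothetical credibility score: support-weighted coverage of the
    frequent sets matched by this object (illustrative formula only)."""
    items = frozenset(row.items())
    matched = sum(s for fs, s in freq.items() if fs <= items)
    total = sum(freq.values())
    return matched / total if total else 0.0

rows = [{"a": 1, "b": 0}, {"a": 1, "b": 0}, {"a": 1, "b": 1}, {"a": 0, "b": 0}]
freq = frequent_sets(rows, min_support=0.5)
scores = [credibility(r, freq) for r in rows]
```

Objects that deviate from the dominant patterns (the last two rows) score lower than the typical ones, which is the behaviour the abstract ascribes to credibility coefficients.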
KTDA: emerging patterns based data analysis system
Emerging patterns are a kind of relationship discovered in databases containing a decision attribute. They represent the contrasting characteristics of individual decision classes. This form of knowledge can be useful for experts and has been successfully employed in the field of classification. In this paper we present the KTDA system, which discovers emerging patterns and applies them for classification purposes. The system can also identify improper data by means of data credibility analysis, a new approach to assessing data typicality.
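In the emerging-patterns literature, the contrast between decision classes is commonly captured by the growth rate of an itemset: the ratio of its support in one class to its support in the other, with patterns exceeding a growth threshold deemed "emerging". A brute-force sketch of that standard notion (illustrative only; this is not the KTDA implementation, and the transactions are made up):

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def emerging_patterns(class_a, class_b, min_growth=2.0):
    """Itemsets whose support grows by at least min_growth from class_a
    to class_b (growth rate = supp_b / supp_a; infinite when supp_a == 0)."""
    items = set().union(*class_a, *class_b)
    patterns = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(sorted(items), k):
            s = frozenset(combo)
            sa, sb = support(s, class_a), support(s, class_b)
            if sb == 0:
                continue
            growth = float("inf") if sa == 0 else sb / sa
            if growth >= min_growth:
                patterns[s] = growth
    return patterns

# Toy transactions for two decision classes
healthy = [frozenset("ab"), frozenset("ac"), frozenset("bc")]
faulty  = [frozenset("abd"), frozenset("bd"), frozenset("ad")]
eps = emerging_patterns(healthy, faulty)
```

Patterns involving item "d" never occur in the first class, so their growth rate is infinite: they characterise the second class exclusively, which is exactly the contrast information a classifier built on emerging patterns exploits.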
Detection of heat flux failures in building using a soft computing diagnostic system
The detection of insulation failures in buildings could potentially conserve energy supplies and improve future designs. Improvements to thermal insulation in buildings include the development of models to assess fabric gain (heat flux through the exterior walls of the building) and heating processes. Thermal insulation standards are now contractual obligations in new buildings, but the energy efficiency of buildings constructed prior to these regulations has yet to be determined; the main assumption is that it will be based on heat flux and conductivity measurements. Diagnostic systems to detect thermal insulation failures should recognize anomalous situations in a building relating to insulation, heating and ventilation. This highly relevant issue in the construction sector today is approached through a novel intelligent procedure that can be programmed according to local building and heating system regulations and the specific features of a given climate zone. It is based on the following phases. Firstly, the dynamic thermal performance of different variables is specifically modeled. Secondly, an exploratory projection pursuit method called Cooperative Maximum-Likelihood Hebbian Learning extracts the relevant features. Finally, a supervised neural model and identification techniques constitute the model for diagnosing thermal insulation failures in buildings due to heat flux through exterior walls, using the relevant features of the data set. The reliability of the proposed method is validated with real data sets from several Spanish cities in wintertime.
CRIS-IR 2006
The recognition of entities and their relationships in document collections is an important step towards the discovery of latent knowledge, as well as towards supporting knowledge management applications. The challenge lies in how to extract and correlate entities in order to answer key knowledge management questions, such as: who works with whom, on which projects, with which customers, and on what research areas. The present work proposes a knowledge mining approach supported by information retrieval and text mining tasks, whose core is the correlation of textual elements through the LRD (Latent Relation Discovery) method. Our experiments show that LRD outperforms other correlation methods. We also present an application demonstrating the approach in knowledge management scenarios.
Fundação para a Ciência e a Tecnologia (FCT)
Denmark's Electronic Research Library