An overview of decision table literature.
The present report contains an overview of the literature on decision tables since its origin. The goal is to analyze the dissemination of decision tables across different areas of knowledge, countries and languages, highlighting those that show the most interest in decision table use. The first part describes the scope of the overview. Next, the classification results by topic are explained. An abstract and some keywords, normally provided by the authors, are included for each reference. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. Other topics examined are the theoretical or practical character of each document, as well as its country of origin and language. Finally, the main body of the paper consists of the ordered list of publications with abstract, classification and comments.
An overview of decision table literature 1982-1995.
This report gives an overview of the literature on decision tables over the past 15 years. As far as possible, an author-supplied abstract, a number of keywords and a classification are provided for each reference. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the overview, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.
AN INFORMATION THEORETIC APPROACH TO THE CONSTRUCTION OF EFFICIENT DECISION TREES
This paper treats the problem of constructing efficient decision trees. Construction of optimal decision trees is an NP-complete problem and, therefore, a heuristic approach to the design of efficient decision trees is considered. The approach is based on information theoretic concepts, and the proposed algorithm provides a simple procedure for the construction of near-optimal decision trees.
Character Selection During Interactive Taxonomic Identification: “Best Characters”
Software interfaces for interactive multiple-entry taxonomic identification (polyclaves) sometimes provide a “best character” or “separation” coefficient, to guide the user to choose a character that could most effectively reduce the number of identification steps required. The coefficient could be particularly helpful when difficult or expensive tasks are needed for forensic identification, and in very large databases, uses that appear likely to increase in importance. Several current systems also provide tools to develop taxonomies or single-entry identification keys, with a variety of coefficients that are appropriate to that purpose. For the identification task, however, information theory neatly applies, and provides the most appropriate coefficient. To our knowledge, Delta-Intkey is the only currently available system that uses a coefficient related to information theory, and it is currently being reimplemented, which may allow for improvement. We describe two improvements to the algorithm used by Delta-Intkey. The first improves transparency as the number of remaining taxa decreases, by normalizing the range of the coefficient to [0,1]. The second concerns numeric ranges, which require consistent treatment of sub-intervals and their end-points. A stand-alone Bestchar program for categorical data is provided, in the Python and R languages. The source code is freely available and dedicated to the Public Domain.
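As a rough illustration of an information-theoretic coefficient normalized to [0,1], the sketch below scores a categorical character by the Shannon entropy of its states across the remaining taxa, divided by the maximum possible entropy. This is not the formula used by Delta-Intkey or Bestchar, which the abstract does not give; the function names `separation_coefficient` and `best_character` and the dictionary layout are assumptions made for the example.

```python
import math
from collections import Counter

def separation_coefficient(states):
    """Normalized entropy of one character's states over the remaining taxa.

    states: one categorical state per remaining taxon (hypothetical layout).
    Returns 0 when all taxa share a state and 1 when every taxon differs.
    """
    n = len(states)
    if n < 2:
        return 0.0  # nothing left to separate
    counts = Counter(states)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)  # log2(n) is the maximum entropy for n taxa

def best_character(matrix):
    """matrix: dict mapping character name -> list of states, one per taxon."""
    return max(matrix, key=lambda ch: separation_coefficient(matrix[ch]))
```

With four remaining taxa, a character that splits them two against two scores 0.5 here, while one that isolates a single taxon scores lower, so the even split would be suggested first.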
The advantages and cost effectiveness of database improvement methods
Relational databases have proved inadequate for supporting new classes of
applications, and as a consequence, a number of new approaches have been taken
(Blaha 1998), (Harrington 2000). The most salient alternatives are denormalisation
and conversion to an object-oriented database (Douglas 1997). Denormalisation
can provide better performance but has deficiencies with respect to
data modelling. Object-oriented databases can provide increased performance
efficiency but without the deficiencies in data modelling (Blaha 2000).
Although there have been various benchmark tests reported, none of these
tests have compared normalised, object-oriented and de-normalised databases.
This research shows that a non-normalised database for data containing type
code complexity would be normalised in the process of conversion to an
object-oriented database. This helps to correct badly organised data and so gives the
performance benefits of de-normalisation while improving data modelling.
The costs of conversion from relational databases to object-oriented databases
were also examined. Costs were based on published benchmark tests, a
benchmark carried out during this study and case studies. The benchmark tests
were based on an engineering database benchmark. Engineering problems such as
computer-aided design and manufacturing have much to gain from conversion to
object-oriented databases. Costs were calculated for coding and development, and
also for operation. It was found that conversion to an object-oriented database was
not usually cost effective as many of the performance benefits could be achieved
by the far cheaper process of de-normalisation, or by using the performance
improving facilities provided by many relational database systems such as
indexing or partitioning or by simply upgrading the system hardware.
It is concluded therefore that while object-oriented databases are a better
alternative for databases built from scratch, the conversion of a legacy relational
database to an object-oriented database is not necessarily cost effective.
APPLICATION OF INFORMATION THEORY TO THE CONSTRUCTION OF EFFICIENT DECISION TREES
This paper treats the problem of converting decision tables to decision trees. In most cases, the construction of optimal decision trees is an NP-complete problem and, therefore, a heuristic approach to this problem is necessary. In our heuristic approach, we apply information theoretic concepts to construct efficient decision trees for decision tables which may include “don’t-care” entries. In contrast to most of the existing heuristic algorithms, our algorithm is systematic and has a sound theoretical justification. The algorithm has low design complexity and yet provides near-optimal decision trees.
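The general idea of entropy-guided conversion can be sketched as follows. This is a generic ID3-style heuristic, not the algorithm of the paper, whose details the abstract does not give. The rule layout, the `"-"` marker for don't-care entries, and the function names are all assumptions made for the example; a rule with a don't-care entry is simply copied into both branches.

```python
import math
from collections import Counter

DONT_CARE = "-"  # assumed marker for a don't-care entry

def entropy(rules):
    """Shannon entropy of the actions among a non-empty list of rules."""
    counts = Counter(action for _, action in rules)
    total = len(rules)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split(rules, i):
    """Partition rules on condition i; don't-care rules go to both branches."""
    branches = {"Y": [], "N": []}
    for conds, action in rules:
        for value in ("Y", "N"):
            if conds[i] in (value, DONT_CARE):
                branches[value].append((conds, action))
    return branches

def build_tree(rules, unused):
    """Greedily test the condition that leaves the least action entropy."""
    actions = {action for _, action in rules}
    if len(actions) <= 1 or not unused:
        # Leaf: the single action, None for an empty branch, or an
        # arbitrary pick if the table is ambiguous.
        return sorted(actions)[0] if actions else None

    def remaining_entropy(i):
        parts = split(rules, i)
        total = sum(len(p) for p in parts.values())
        return sum(len(p) / total * entropy(p) for p in parts.values() if p)

    best = min(unused, key=remaining_entropy)
    rest = [i for i in unused if i != best]
    return {best: {v: build_tree(p, rest) for v, p in split(rules, best).items()}}
```

For a table with rules `(("Y", "-"), "A1")`, `(("N", "Y"), "A2")` and `(("N", "N"), "A3")`, calling `build_tree(rules, [0, 1])` tests condition 0 first, so action `A1` is reached after a single test.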