BlogForever D2.4: Weblog spider prototype and associated methodology
The purpose of this document is to present the evaluation of different solutions for capturing blogs and the established methodology, and to describe the developed blog spider prototype.
Decision support methods in diabetic patient management by insulin administration neural network vs. induction methods for knowledge classification
Diabetes mellitus is now recognised as a major worldwide
public health problem. At present, about 100
million people are registered as diabetic patients. Many
clinical, social and economic problems occur as a
consequence of insulin-dependent diabetes. Treatment
attempts to prevent or delay complications by applying
"optimal" glycaemic control. Therefore, there is a
continuous need for effective monitoring of the patient.
Given the popularity of decision tree learning
algorithms as well as neural networks for knowledge
classification which is further used for decision
support, this paper examines their relative merits by
applying one algorithm from each family on a medical
problem; that of recommending a particular diabetes
regime. For the purposes of this study, OC1, a
descendant of Quinlan's ID3 algorithm, was chosen as
the decision tree learning algorithm, and a
generating-shrinking algorithm for learning arbitrary
classifications as the neural network algorithm. These
systems were trained on 646 cases derived from two
countries in Europe and were tested on 100 cases
which were different from the original 646 cases.
Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm
This paper introduces ICET, a new algorithm for cost-sensitive
classification. ICET uses a genetic algorithm to evolve a population of biases
for a decision tree induction algorithm. The fitness function of the genetic
algorithm is the average cost of classification when using the decision tree,
including both the costs of tests (features, measurements) and the costs of
classification errors. ICET is compared here with three other algorithms for
cost-sensitive classification - EG2, CS-ID3, and IDX - and also with C4.5,
which classifies without regard to cost. The five algorithms are evaluated
empirically on five real-world medical datasets. Three sets of experiments are
performed. The first set examines the baseline performance of the five
algorithms on the five datasets and establishes that ICET performs
significantly better than its competitors. The second set tests the robustness
of ICET under a variety of conditions and shows that ICET maintains its
advantage. The third set looks at ICET's search in bias space and discovers a
way to improve the search. Comment: See http://www.jair.org/ for any accompanying file.
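The fitness measure described in this abstract, the average cost of classification including both test costs and error costs, can be illustrated with a minimal sketch. This is not the ICET implementation: the `Leaf` and `Node` classes, the single flat `error_cost`, and the rule that a test is paid for only once per case are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    label: str  # class predicted at this leaf


@dataclass
class Node:
    feature: str      # test (measurement) performed at this node
    threshold: float  # go left when value <= threshold, else right
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def case_cost(tree: Tree, case: Dict[str, float], true_label: str,
              test_costs: Dict[str, float], error_cost: float) -> float:
    """Cost of classifying one case: tests used plus any error penalty."""
    paid = set()  # assumption: each test is charged once per case
    total = 0.0
    node = tree
    while isinstance(node, Node):
        if node.feature not in paid:
            total += test_costs[node.feature]
            paid.add(node.feature)
        node = node.left if case[node.feature] <= node.threshold else node.right
    if node.label != true_label:
        total += error_cost  # assumption: one flat misclassification cost
    return total


def average_cost(tree: Tree, cases: List[Dict[str, float]], labels: List[str],
                 test_costs: Dict[str, float], error_cost: float) -> float:
    """Average classification cost over a set of labelled cases."""
    costs = [case_cost(tree, c, y, test_costs, error_cost)
             for c, y in zip(cases, labels)]
    return sum(costs) / len(costs)
```

In a genetic search of the kind the abstract describes, a value like the one returned by `average_cost` would be minimized across candidate biases; lower cost means a fitter individual.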
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.
A traffic classification method using machine learning algorithm
Applying concepts of attack investigation in the IT industry, this work designs
a traffic classification method that uses data mining techniques together with machine
learning algorithms to classify normal and malicious traffic. This classification will
help in learning about the unknown attacks faced by the IT industry. The notion of traffic classification
is not new; plenty of work has been done to classify network traffic for
heterogeneous applications. Existing techniques (payload-based, port-based,
and statistical) have their own pros and cons, which are discussed later in this
work, but classification using machine learning techniques is still an open field to explore and has provided very promising results so far.