Hierarchical classification for multiple, distributed web databases
The proliferation of online information resources increases the importance of effective and efficient distributed searching. Our research aims to provide an alternative hierarchical categorization and search capability based on a Bayesian network learning algorithm. Our proposed approach, which is grounded in automatic textual analysis of the subject content of online web databases, attempts to address the database selection problem by first classifying web databases into a hierarchy of topic categories. The experimental results reported demonstrate that such a classification approach not only effectively reduces the class search space, but also helps to significantly improve classification accuracy.
EEF: Exponentially Embedded Families with Class-Specific Features for Classification
In this letter, we present a novel exponentially embedded families (EEF)
based classification method, in which the probability density function (PDF) on
raw data is estimated from the PDF on features. With the PDF construction, we
show that class-specific features can be used in the proposed classification
method, instead of a common feature subset for all classes as used in
conventional approaches. We apply the proposed EEF classifier to text
categorization as a case study and derive an optimal Bayesian classification
rule with class-specific feature selection based on the Information Gain (IG)
score. The promising performance on real-life data sets demonstrates the
effectiveness of the proposed approach and indicates its wide potential
applications.
Comment: 9 pages, 3 figures, to be published in IEEE Signal Processing Letter.
IEEE Signal Processing Letter, 201
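The Information Gain score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a one-vs-rest reading of "class-specific" (each term is scored against a single target class) and a toy four-document corpus; the function names and data are hypothetical and are not taken from the paper, and the EEF classifier itself is not reproduced here.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, term, target):
    """IG of the presence of `term` about membership in class `target`.

    Assumption: one-vs-rest scoring, i.e. the class variable is binary
    ("is the document in `target` or not").
    """
    with_term = [0, 0]     # [not target, target] counts for docs containing term
    without_term = [0, 0]  # same counts for docs lacking term
    for doc, label in zip(docs, labels):
        bucket = with_term if term in doc else without_term
        bucket[int(label == target)] += 1
    n = len(docs)
    prior = [a + b for a, b in zip(with_term, without_term)]
    conditional = sum(
        sum(bucket) / n * entropy(bucket)
        for bucket in (with_term, without_term)
        if sum(bucket) > 0
    )
    return entropy(prior) - conditional

# Hypothetical toy corpus: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "team"}, {"vote", "law"}]
labels = ["sport", "sport", "politics", "politics"]

ig_goal = information_gain(docs, labels, "goal", "sport")
ig_team = information_gain(docs, labels, "team", "sport")
print(ig_goal, ig_team)
```

In this toy corpus "goal" separates the sport documents perfectly while "team" appears equally in both classes, so a per-class ranking by this score would keep the former and drop the latter.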
Attribute Interactions in Medical Data Analysis
There is much empirical evidence for the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes conditional independence between attributes given the class. In classification, such classifiers sum up the pieces of class-related evidence from individual attributes, independently of other attributes. Performance, however, deteriorates significantly when the “interactions” between attributes become critical. We propose an approach to handling attribute interactions within the framework of “voting” classifiers, such as NBC. We propose an operational test for detecting interactions in learning data and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes; it may lead to improved classifier performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty.
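The "summing of evidence" described in the abstract above, and the way it breaks down under attribute interactions, can be sketched on an XOR-style toy problem. This is an illustrative example only: the data, the function names, and the remedy of joining the two attributes into one are assumptions made here, not the detection test or learning procedure proposed by the authors.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing `alpha`."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, value_counts, values, alpha

def log_scores(model, row):
    """Per-class score: log prior plus a sum of per-attribute log evidence."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            score += math.log((count + alpha) / (n_label + alpha * len(values[i])))
        scores[label] = score
    return scores

# XOR-style data: the class depends only on the combination of both attributes.
xor_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [0, 1, 1, 0]

# Each attribute on its own carries no evidence, so the classes tie.
separate = log_scores(train_nb(xor_rows, xor_labels), (0, 0))

# Assumed remedy for illustration: treat the interacting pair as one attribute.
joined_rows = [(row,) for row in xor_rows]
joined = log_scores(train_nb(joined_rows, xor_labels), ((0, 0),))

print(separate, joined)
```

With the attributes kept separate, both classes receive identical scores for every input; once the pair is treated as a single attribute, the correct class scores higher.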
Dynamic Control of Explore/Exploit Trade-Off In Bayesian Optimization
Bayesian optimization offers the possibility of optimizing black-box
operations not accessible through traditional techniques. The success of
Bayesian optimization methods such as Expected Improvement (EI) is
significantly affected by the degree of trade-off between exploration and
exploitation. Too much exploration can lead to inefficient optimization
protocols, whilst too much exploitation leaves the protocol open to strong
initial biases, and a high chance of getting stuck in a local minimum.
Typically, a constant margin is used to control this trade-off, which results
in yet another hyper-parameter to be optimized. We propose contextual
improvement as a simple, yet effective heuristic to counter this, achieving a
one-shot optimization strategy. Our proposed heuristic can be swiftly
calculated and improves both the speed and robustness of discovery of optimal
solutions. We demonstrate its effectiveness on both synthetic and real-world
problems and explore the unaccounted-for uncertainty in the pre-determination
of search hyperparameters controlling the explore-exploit trade-off.
Comment: Accepted for publication in the proceedings of 2018 Computing
Conferenc
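The role of the constant margin in Expected Improvement, as described in the abstract above, can be sketched as follows. This shows the standard closed-form EI for minimization under a Gaussian posterior, with the margin written as `xi`; the candidate values are hypothetical, and the paper's contextual improvement heuristic is not implemented here.

```python
import math

def expected_improvement(mu, sigma, f_best, xi=0.01):
    """Closed-form EI for minimization under a Gaussian posterior N(mu, sigma^2).

    `f_best` is the lowest objective value observed so far and `xi` is the
    constant margin: larger values favour exploration.
    """
    gap = f_best - mu - xi
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(gap, 0.0)
    z = gap / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return gap * cdf + sigma * pdf

# Hypothetical candidates with f_best = 1.0:
# a confident point slightly better than the incumbent (exploitation) and
# an uncertain point with a worse predicted mean (exploration).
f_best = 1.0
exploit = {"mu": 0.9, "sigma": 0.05}
explore = {"mu": 1.3, "sigma": 0.5}

for xi in (0.0, 0.1):
    ei_exploit = expected_improvement(f_best=f_best, xi=xi, **exploit)
    ei_explore = expected_improvement(f_best=f_best, xi=xi, **explore)
    print(xi, ei_exploit, ei_explore)
```

With `xi = 0.0` the confident candidate has the higher EI, while with `xi = 0.1` the uncertain candidate overtakes it, which is why the choice of margin behaves like an extra hyper-parameter.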