One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers, because
the class boundary must be defined using only knowledge of the positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data, the
algorithms used and the application domains. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and presenting
our vision for future research. Comment: 24 pages + 11 pages of references, 8 figures
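As an illustration of the one-class setting described in this abstract (not a method taken from the paper), the following minimal Python sketch fits a decision boundary from positive samples only: a centroid plus a distance radius. The function names, the `quantile` parameter and the sample points are all hypothetical.

```python
from math import dist, fsum


def fit_one_class(positives, quantile=0.95):
    """Fit a centroid and a distance radius using positive samples only."""
    n = len(positives)
    dims = len(positives[0])
    # Centroid of the positive class, computed per dimension.
    centroid = tuple(fsum(p[i] for p in positives) / n for i in range(dims))
    # Sorted distances of the training positives from the centroid.
    distances = sorted(dist(p, centroid) for p in positives)
    # Radius enclosing roughly `quantile` of the training positives.
    index = min(n - 1, int(quantile * n))
    return centroid, distances[index]


def is_positive(model, point):
    """Accept a point if it lies within the fitted radius of the centroid."""
    centroid, radius = model
    return dist(point, centroid) <= radius


# Hypothetical positive-class samples; no negative examples are used.
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)]
model = fit_one_class(train)
print(is_positive(model, (1.05, 0.95)))  # a point near the training data
print(is_positive(model, (3.0, 3.0)))    # a point far from the training data
```

Anything outside the radius is treated as an outlier, which is the simplest form of the boundary-only-from-positives idea; practical OCC methods use richer boundaries.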
Mining learning preferences in web-based instruction: Holists vs. Serialists
Web-based instruction programs are used by learners with diverse knowledge, skills and needs. These differences determine their preferences for the design of Web-based instruction programs and ultimately influence learners' success in using them. Cognitive style has been found to significantly affect learners' preferences for Web-based instruction programs. However, the majority of previous studies focus on Field Dependence/Independence. Pask's Holist/Serialist dimension has conceptual links with Field Dependence/Independence but it is left mostly unstudied. Therefore, this study focuses on identifying how this dimension of cognitive style affects learner preferences for Web-based instruction programs. A data mining approach is used to illustrate the difference in preferences between Holists and Serialists. The findings show that there are clear differences in regard to content presentation and navigation support. A set of design features was then produced to help designers incorporate cognitive styles into the development of Web-based instruction programs to ensure that they can accommodate learners' different preferences. This work is partially funded by National Science Council, Taiwan, ROC (NSC 98-2511-S-008-012-MY3; NSC 99-2511-S-008-003-MY2; NSC 99-2631-S-008-001)
Naive Bayes vs. Decision Trees vs. Neural Networks in the Classification of Training Web Pages
Web classification has been attempted through many different technologies. In this study we concentrate on the comparison of Neural Networks (NN), Naïve Bayes (NB) and Decision Tree (DT) classifiers for the automatic analysis and classification of attribute data from training course web pages. We introduce an enhanced NB classifier and run the same data sample through the DT and NN classifiers to determine the success rate of our classifier in the training courses domain. This research shows that our enhanced NB classifier not only outperforms the traditional NB classifier, but also performs as well as, if not better than, some more popular rival techniques. This paper also shows that, overall, our NB classifier is the best choice for the training courses domain, achieving an impressive F-Measure value of over 97%, despite it being trained with fewer samples than any of the classification systems we have encountered.
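For readers unfamiliar with the F-Measure reported in this abstract, here is a minimal sketch of the standard definition as the weighted harmonic mean of precision and recall. The function name, the `beta` parameter and the counts in the usage line are hypothetical and are not the paper's data.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from true positives, false positives and false negatives."""
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # beta = 1 gives the usual F1 score (plain harmonic mean).
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical confusion counts, for illustration only.
print(f_measure(tp=97, fp=3, fn=3))
```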
On the Ground Validation of Online Diagnosis with Twitter and Medical Records
Social media has been considered as a data source for tracking disease.
However, most analyses are based on models that prioritize strong correlation
with population-level disease rates over determining whether or not specific
individual users are actually sick. Taking a different approach, we develop a
novel system for social-media based disease detection at the individual level
using a sample of professionally diagnosed individuals. Specifically, we
develop a system for making an accurate influenza diagnosis based on an
individual's publicly available Twitter data. We find that about half (17/35 =
48.57%) of the users in our sample that were sick explicitly discuss their
disease on Twitter. By developing a meta classifier that combines text
analysis, anomaly detection, and social network analysis, we are able to
diagnose an individual with greater than 99% accuracy even if she does not
discuss her health. Comment: Presented at WWW2014. WWW'14 Companion, April 7-11, 2014, Seoul,
Korea