18,336 research outputs found
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey
A Hybrid Web Recommendation System based on the Improved Association Rule Mining Algorithm
As the growing interest of web recommendation systems those are applied to
deliver customized data for their users, we started working on this system.
Generally the recommendation systems are divided into two major categories such
as collaborative recommendation system and content based recommendation system.
In case of collaborative recommen-dation systems, these try to seek out users
who share same tastes that of given user as well as recommends the websites
according to the liking given user. Whereas the content based recommendation
systems tries to recommend web sites similar to those web sites the user has
liked. In the recent research we found that the efficient technique based on
asso-ciation rule mining algorithm is proposed in order to solve the problem of
web page recommendation. Major problem of the same is that the web pages are
given equal importance. Here the importance of pages changes according to the
fre-quency of visiting the web page as well as amount of time user spends on
that page. Also recommendation of newly added web pages or the pages those are
not yet visited by users are not included in the recommendation set. To
over-come this problem, we have used the web usage log in the adaptive
association rule based web mining where the asso-ciation rules were applied to
personalization. This algorithm was purely based on the Apriori data mining
algorithm in order to generate the association rules. However this method also
suffers from some unavoidable drawbacks. In this paper we are presenting and
investigating the new approach based on weighted Association Rule Mining
Algorithm and text mining. This is improved algorithm which adds semantic
knowledge to the results, has more efficiency and hence gives better quality
and performances as compared to existing approaches.Comment: 9 pages, 7 figures, 2 table
Application of Weighted Particle Swarm Optimization in Association Rule Mining
Determination of the threshold values of support and confidence, affect the quality of association rule mining up to a great extent. Focus of my study is to apply weighted PSO for evaluating threshold values for support and confidence. The particle swarm optimization algorithm first searches for the optimum fitness value of each particle and then finds corresponding support and confidence as minimal threshold values after the data are transformed into binary values. The proposed method is verified by applying the Food Mart 2000 database of Microsoft SQL Server 2000. I am expecting that the particle swarm optimization algorithm will suggest suitable threshold values and obtain quality rules as per the previous works [1]
- …