1,930 research outputs found

    Clustering outdoor soundscapes using fuzzy ants

    Get PDF
    A classification algorithm for environmental sound recordings or "soundscapes" is outlined. An ant clustering approach is proposed, in which the behavior of the ants is governed by fuzzy rules. These rules are optimized by a genetic algorithm specially designed in order to achieve the optimal set of homogeneous clusters. Soundscape similarity is expressed as fuzzy resemblance of the shape of the sound pressure level histogram, the frequency spectrum and the spectrum of temporal fluctuations. These represent the loudness, the spectral and the temporal content of the soundscapes. Compared to traditional clustering methods, the advantages of this approach are that no a priori information is needed, such as the desired number of clusters, and that a flexible set of soundscape measures can be used. The clustering algorithm was applied to a set of 1116 acoustic measurements in 16 urban parks of Stockholm. The resulting clusters were validated against visitor's perceptual measurements of soundscape quality

    Fuzzy Ants Clustering for Web People Search

    Get PDF
    A search engine query for a person’s name often brings up web pages corresponding to several people who share the same name. The Web People Search (WePS) problem involves organizing such search results for an ambiguous name query in meaningful clusters, that group together all web pages corresponding to one single individual. A particularly challenging aspect of this task is that it is in general not known beforehand how many clusters to expect. In this paper we therefore propose the use of a Fuzzy Ants clustering algorithm that does not rely on prior knowledge of the number of clusters that need to be found in the data. An evaluation on benchmark data sets from SemEval’s WePS1 and WePS2 competitions shows that the resulting system is competitive with the agglomerative clustering Agnes algorithm. This is particularly interesting as the latter involves manual setting of a similarity threshold (or estimating the number of clusters in advance) while the former does not

    Cluster Optimization for Improved Web Usage Mining

    Get PDF
    Now days, World Wide Web (WWW) has become rich and most powerful source of information. Conversely, it has become tricky and critical task to retrieve actual information due to its continuous expansion in dimensions. Web Usage Mining is a step-wise technique of extracting useful access patterns of the user from web. Web personalization makes use of web usage mining techniques, for knowledge acquisition process done by analyzing the user navigational patterns. The web page personalization involves clustering of different web pages having similar navigation patterns for an individual. Since cluster size expands due to the frequent access, optimization or shrinking the size of clusters becomes a chief consideration. This paper proposes a tactic of cluster optimization based on concept of swarm intelligence techniques. Later on based on the recognition of user access patterns, clustering is implemented using neural fuzzy approach i.e. NEF Class algorithm and cluster optimization is implemented using Ant Nest Mate Approach

    An Ant Colony Optimization Based Feature Selection for Web Page Classification

    Get PDF
    The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines’ performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods

    An ant colony optimization algorithm-based classification for the diagnosis of primary headaches using a website questionnaire expert system

    Get PDF
    The purpose of this research was to evaluate the classification accuracy of the ant colony optimization algorithm for the diagnosis of primary headaches using a website questionnaire expert system that was completed by patients. This cross-sectional study was conducted in 850 headache patients who randomly applied to hospital from three cities in Turkey with the assistance of a neurologist in each city. The patients filled in a detailed web-based headache questionnaire. Finally, neurologists' diagnosis results were compared with the classification results of an ant colony optimization-based classification algorithm. The ant colony algorithm for diagnosis classified patients with 96.9412% overall accuracy. Diagnosis accuracies of migraine, tension-type, and cluster headaches were 98.2%, 92.4%, and 98.2% respectively. The ant colony optimization-based algorithm has a successful classification potential on headache diagnosis. On the other hand, headache diagnosis using a website-based algorithm will be useful for neurologists in order to gather quick and precise results as well as tracking patients for their headache symptoms and medication usage by using electronic records from the Internet

    Swarm intelligence for clustering dynamic data sets for web usage mining and personalization.

    Get PDF
    Swarm Intelligence (SI) techniques were inspired by bee swarms, ant colonies, and most recently, bird flocks. Flock-based Swarm Intelligence (FSI) has several unique features, namely decentralized control, collaborative learning, high exploration ability, and inspiration from dynamic social behavior. Thus FSI offers a natural choice for modeling dynamic social data and solving problems in such domains. One particular case of dynamic social data is online/web usage data which is rich in information about user activities, interests and choices. This natural analogy between SI and social behavior is the main motivation for the topic of investigation in this dissertation, with a focus on Flock based systems which have not been well investigated for this purpose. More specifically, we investigate the use of flock-based SI to solve two related and challenging problems by developing algorithms that form critical building blocks of intelligent personalized websites, namely, (i) providing a better understanding of the online users and their activities or interests, for example using clustering techniques that can discover the groups that are hidden within the data; and (ii) reducing information overload by providing guidance to the users on websites and services, typically by using web personalization techniques, such as recommender systems. Recommender systems aim to recommend items that will be potentially liked by a user. To support a better understanding of the online user activities, we developed clustering algorithms that address two challenges of mining online usage data: the need for scalability to large data and the need to adapt cluster sing to dynamic data sets. To address the scalability challenge, we developed new clustering algorithms using a hybridization of traditional Flock-based clustering with faster K-Means based partitional clustering algorithms. We tested our algorithms on synthetic data, real VCI Machine Learning repository benchmark data, and a data set consisting of real Web user sessions. Having linear complexity with respect to the number of data records, the resulting algorithms are considerably faster than traditional Flock-based clustering (which has quadratic complexity). Moreover, our experiments demonstrate that scalability was gained without sacrificing quality. To address the challenge of adapting to dynamic data, we developed a dynamic clustering algorithm that can handle the following dynamic properties of online usage data: (1) New data records can be added at any time (example: a new user is added on the site); (2) Existing data records can be removed at any time. For example, an existing user of the site, who no longer subscribes to a service, or who is terminated because of violating policies; (3) New parts of existing records can arrive at any time or old parts of the existing data record can change. The user\u27s record can change as a result of additional activity such as purchasing new products, returning a product, rating new products, or modifying the existing rating of a product. We tested our dynamic clustering algorithm on synthetic dynamic data, and on a data set consisting of real online user ratings for movies. Our algorithm was shown to handle the dynamic nature of data without sacrificing quality compared to a traditional Flock-based clustering algorithm that is re-run from scratch with each change in the data. To support reducing online information overload, we developed a Flock-based recommender system to predict the interests of users, in particular focusing on collaborative filtering or social recommender systems. Our Flock-based recommender algorithm (FlockRecom) iteratively adjusts the position and speed of dynamic flocks of agents, such that each agent represents a user, on a visualization panel. Then it generates the top-n recommendations for a user based on the ratings of the users that are represented by its neighboring agents. Our recommendation system was tested on a real data set consisting of online user ratings for a set of jokes, and compared to traditional user-based Collaborative Filtering (CF). Our results demonstrated that our recommender system starts performing at the same level of quality as traditional CF, and then, with more iterations for exploration, surpasses CF\u27s recommendation quality, in terms of precision and recall. Another unique advantage of our recommendation system compared to traditional CF is its ability to generate more variety or diversity in the set of recommended items. Our contributions advance the state of the art in Flock-based 81 for clustering and making predictions in dynamic Web usage data, and therefore have an impact on improving the quality of online services

    Combining rough and fuzzy sets for feature selection

    Get PDF
    corecore