21 research outputs found

    Swarm intelligence for clustering dynamic data sets for web usage mining and personalization.

    Swarm Intelligence (SI) techniques were inspired by bee swarms, ant colonies, and most recently, bird flocks. Flock-based Swarm Intelligence (FSI) has several unique features, namely decentralized control, collaborative learning, high exploration ability, and inspiration from dynamic social behavior. Thus FSI offers a natural choice for modeling dynamic social data and solving problems in such domains. One particular case of dynamic social data is online/web usage data, which is rich in information about user activities, interests and choices. This natural analogy between SI and social behavior is the main motivation for the topic of investigation in this dissertation, with a focus on flock-based systems, which have not been well investigated for this purpose. More specifically, we investigate the use of flock-based SI to solve two related and challenging problems by developing algorithms that form critical building blocks of intelligent personalized websites, namely, (i) providing a better understanding of online users and their activities or interests, for example using clustering techniques that can discover the groups hidden within the data; and (ii) reducing information overload by providing guidance to users on websites and services, typically by using web personalization techniques such as recommender systems. Recommender systems aim to recommend items that a user is likely to like. To support a better understanding of online user activities, we developed clustering algorithms that address two challenges of mining online usage data: the need for scalability to large data and the need to adapt clustering to dynamic data sets. To address the scalability challenge, we developed new clustering algorithms using a hybridization of traditional Flock-based clustering with faster K-Means based partitional clustering algorithms. 
We tested our algorithms on synthetic data, real UCI Machine Learning repository benchmark data, and a data set consisting of real Web user sessions. Having linear complexity with respect to the number of data records, the resulting algorithms are considerably faster than traditional Flock-based clustering (which has quadratic complexity). Moreover, our experiments demonstrate that scalability was gained without sacrificing quality. To address the challenge of adapting to dynamic data, we developed a dynamic clustering algorithm that can handle the following dynamic properties of online usage data: (1) new data records can be added at any time (example: a new user is added on the site); (2) existing data records can be removed at any time (example: an existing user of the site who no longer subscribes to a service, or whose account is terminated for violating policies); (3) new parts of existing records can arrive at any time, or old parts of an existing data record can change (example: the user's record changes as a result of additional activity such as purchasing new products, returning a product, rating new products, or modifying the existing rating of a product). We tested our dynamic clustering algorithm on synthetic dynamic data and on a data set consisting of real online user ratings for movies. Our algorithm was shown to handle the dynamic nature of data without sacrificing quality compared to a traditional Flock-based clustering algorithm that is re-run from scratch with each change in the data. To support reducing online information overload, we developed a Flock-based recommender system to predict the interests of users, in particular focusing on collaborative filtering or social recommender systems. Our Flock-based recommender algorithm (FlockRecom) iteratively adjusts the position and speed of dynamic flocks of agents, such that each agent represents a user, on a visualization panel. 
Then it generates the top-n recommendations for a user based on the ratings of the users that are represented by its neighboring agents. Our recommendation system was tested on a real data set consisting of online user ratings for a set of jokes, and compared to traditional user-based Collaborative Filtering (CF). Our results demonstrated that our recommender system starts performing at the same level of quality as traditional CF, and then, with more iterations for exploration, surpasses CF's recommendation quality in terms of precision and recall. Another unique advantage of our recommendation system compared to traditional CF is its ability to generate more variety or diversity in the set of recommended items. Our contributions advance the state of the art in Flock-based SI for clustering and making predictions in dynamic Web usage data, and therefore have an impact on improving the quality of online services.
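
As a rough illustration of the neighbor-based recommendation step and the precision/recall evaluation mentioned above, the sketch below takes 2-D agent positions (such as those an agent might occupy on the visualization panel after the flocking iterations) and a user-item rating table, then scores unrated items by the mean rating given by the k nearest agents. The names `top_n_from_neighbors` and `precision_recall`, the parameter `k`, and the use of a plain mean are assumptions made for illustration; the abstract does not state how FlockRecom aggregates neighbor ratings, and the flocking update itself is not shown.

```python
import math
from collections import defaultdict


def top_n_from_neighbors(positions, ratings, user, k=2, n=3):
    """Recommend up to n unrated items for `user` from its k nearest agents.

    positions: dict user -> (x, y) agent position (hypothetical layout)
    ratings:   dict user -> dict item -> numeric rating
    """
    ux, uy = positions[user]
    # Rank the other agents by Euclidean distance on the panel.
    others = sorted(
        (u for u in positions if u != user),
        key=lambda u: math.hypot(positions[u][0] - ux, positions[u][1] - uy),
    )
    neighbors = others[:k]

    # Collect neighbor ratings for items the target user has not rated.
    seen = ratings.get(user, {})
    collected = defaultdict(list)
    for nb in neighbors:
        for item, score in ratings.get(nb, {}).items():
            if item not in seen:
                collected[item].append(score)

    # Score each candidate by mean neighbor rating; break ties by item name.
    scored = sorted(
        ((sum(v) / len(v), item) for item, v in collected.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [item for _, item in scored[:n]]


def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against a relevant set."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In this simplified form the only difference from user-based CF is where the neighborhood comes from: here it is read off spatial proximity between agents rather than computed directly from rating similarity.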

    Semantic Selection of Internet Sources through SWRL Enabled OWL Ontologies

    This research examines the problem of Information Overload (IO) and gives an overview of various attempts to resolve it. Furthermore, it argues that instead of fighting IO, it is advisable to start learning how to live with it. It is unlikely that in the modern information age, where users are both producers and consumers of information, the amount of data and information generated will decrease. Moreover, when managing IO, users are confined to the algorithms and policies of commercial Search Engines and Recommender Systems (RSs), which create results that also add to IO. This research calls for a change in thinking: giving greater power to users when addressing the relevance and accuracy of internet searches, which helps in managing IO. However powerful search engines are, they do not process enough semantics at the moment when search queries are formulated. This research proposes a semantic selection of internet sources through SWRL-enabled OWL ontologies. The research focuses on SWT and its stack because they (a) secure the semantic interpretation of the environments where internet searches take place and (b) guarantee reasoning that results in the selection of suitable internet sources at a particular moment of an internet search. Therefore, it is important to model the behaviour of users through OWL concepts and reason upon them in order to address IO when searching the internet. Thus, user behaviour is itemized through user preferences, perceptions and expectations of internet searches. The approach proposed in this research is a Software Engineering (SE) solution which provides computations based on the semantics of the environment stored in the ontological model.

    A Usability Approach to Improving the User Experience in Web Directories

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London

    A usability approach to improving the user experience in web directories

    PhD thesis. Web directories are hierarchically organised website collections that offer users subject-based access to the Web. They played a significant part in navigating the Web in the past, but their role has been weakened in recent years by their cumbersome, expanding collections. This thesis presents a unified framework combining the advantages of personalisation and redefined directory search for improving the usability of Web directories. The thesis begins with an examination of classification schemes that identifies the rigidity of hierarchical classifications and their suitability for Web directories in contrast to faceted classifications. This leads on to an Ontological Sketch Modelling (OSM) case study which identifies the misfits affecting user navigation in Web directories from known rigidity issues. The thesis continues with a review of personalisation techniques and a discussion of the user search model of Web directories, following the directions of improvement suggested by the case study. A proposed user-centred framework to improve the usability of Web directories, consisting of an individual content-based personalisation model and a redefined search model, is then implemented as D-Persona and D-Search respectively. The remainder of the thesis is concerned with a usability test of D-Persona and D-Search aimed at measuring the efficiency, effectiveness and user satisfaction of the solution. This involves an experimental design, test results and discussions for the comparative user study. First, this thesis extracts a formal definition of the rigidity of hierarchies from their characteristics and justifies why hierarchies are still better suited than facets for organising Web directories. Second, it identifies misfits causing poor usability in Web directories based on the discovered rigidity of hierarchies. Third, it proposes a solution to tackle the misfits and improve the usability of Web directories, which has been shown experimentally to be successful.

    Usability in digitalen Kooperationsnetzwerken. Nutzertests und Logfile-Analyse als kombinierte Methode [Usability in Digital Cooperation Networks: User Tests and Logfile Analysis as a Combined Method]

    Usability is a key factor when developing new applications. The interaction between the users and the application should be efficient, effective and engaging. Furthermore, good usability includes high error tolerance and good learnability. Different methods allow usability to be measured throughout the development process. All methods have in common that their steps, such as planning, conducting and evaluating, are rather time-consuming. When end users are included as subjects, usability tests are employed. Because of the time required, usually ten or fewer tests are conducted. The thesis addresses this limitation by combining usability tests and logfile analysis. The empirical work is twofold. First, usability tests within a learning management system (LMS) are logged in the background. These logfiles are assigned to severe usability problems. Second, the paths of the severe usability problems are combined with logfile data from a real-world LMS that runs the same application. The real-world logfiles cover a period of about 300 days with 133 active users. Prior to the combination, both data sets were converted into a similar format. Because the procedure is new, the definite similarity value had to be specified by descriptive statistics and visual inspections. The final combination makes it possible to determine the severity of usability problems on the basis of real-world usage data. The proposed method offers a more precise overview of the occurrence of the usability problems found, independent of the test situation. This thesis provides additional value to the fields of (Web) Data Mining, Usability and Human-Computer Interaction (HCI). It also offers additional knowledge to the field of software development, quantitative and qualitative research, as well as computer-supported cooperative work (CSCW) and learning management systems (LMS).

    Personalized Recommendations Based On Users’ Information-Centered Social Networks

    The overwhelming amount of information available today makes it difficult for users to find useful information, and recommendation technologies emerged as the solution to this information glut problem. Among the several streams of related research, one important evolution in technology is to generate recommendations based on users’ own social networks. The idea of taking advantage of users’ social networks as a foundation for their personalized recommendations evolved from an Internet trend that is too important to neglect: the explosive growth of online social networks. In spite of the widely available and diversified assortment of online social networks, most recent social network-based recommendations have concentrated on limited kinds of online sociality (i.e., trust-based networks and online friendships). Thus, this study sought to demonstrate that social network-based recommendations can be extended to more diverse and less focused social networks. The online social networks considered in this dissertation include: 1) a watching network, 2) a group membership, and 3) an academic collaboration network. Specifically, this dissertation aims to check the value of users’ various online social connections as information sources and to explore how to include them as a foundation for personalized recommendations. In our results, users in online social networks shared similar interests with their social partners. An in-depth analysis of the shared interests indicated that online social networks have significant value as a useful information source. Through the recommendations generated from the preferences of social connections, the feasibility of users’ social connections as a useful information source was also investigated comprehensively. The social network-based recommendations produced suggestions as good as, or sometimes better than, traditional collaborative filtering recommendations. 
Social network-based recommendations were also a good solution for the cold-start user problem. Therefore, for cold-start users to receive reasonably good recommendations, it is more effective to be socially associated with other users than to collect a few more items. To conclude, this study demonstrates the viability of multiple social networks as a means of gathering useful information and addresses how different, novel kinds of social networks can improve upon conventional personalization technology.

    Establishing User Requirements for a Recommender System in an Online Union Catalogue: an Investigation of WorldCat.org

    This project, undertaken in collaboration with OCLC, aimed to investigate the potential role of recommendations within WorldCat, the publicly accessible union catalogue of libraries participating in the OCLC global cooperative. The goal of the project was a set of conceptual design guidelines for a WorldCat.org recommender system, based on a comprehensive understanding of the system's users and their needs. Taking a mixed-methods approach, the investigation consisted of four phases. Phase one consisted of twenty-one focus groups with key user groups held in three locations: the UK, the US, and Australia and New Zealand. Phase two consisted of a pop-up survey implemented on WorldCat.org, which gathered 2,918 responses. Phase three was an analysis of two months of WorldCat.org transaction log data, consisting of over 15,000,000 sessions. Phase four was a lab-based user study investigating and comparing the use of WorldCat.org with Amazon. Findings from each strand were integrated, and the key themes to emerge from the research are discussed. Different methods of classifying the WorldCat.org user population are presented, along with a taxonomy of work- and search-tasks. Key perspectives on the utility of a recommender system are considered, along with a reflection on how the information search behaviour exhibited by users interacting with recommendations while undertaking typical catalogue tasks can be interpreted. Based on the enriched perspective of the system, and the role of recommendation in the catalogue, a series of conceptual design specifications are presented for the development of a WorldCat.org recommender system.

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, chosen number of classes, and quality of the resulting clusters, which has implications for any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.
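
To make the "n-gram mutual information" criterion concrete, the sketch below computes the average mutual information between the classes of adjacent tokens for a fixed word-to-class assignment, which is the quantity that Brown clustering tries to preserve when merging classes. The function name `class_bigram_mutual_information` and the dictionary-based inputs are assumptions for illustration; the sketch covers only the objective, not the greedy merging procedure or the theoretical utility model described in the abstract.

```python
import math
from collections import Counter


def class_bigram_mutual_information(tokens, word_to_class):
    """Average mutual information (in bits) between adjacent word classes."""
    classes = [word_to_class[t] for t in tokens]
    pairs = list(zip(classes, classes[1:]))
    if not pairs:
        return 0.0

    total = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(c1 for c1, _ in pairs)    # class in first position
    right_counts = Counter(c2 for _, c2 in pairs)   # class in second position

    mi = 0.0
    for (c1, c2), count in pair_counts.items():
        p_joint = count / total
        p_left = left_counts[c1] / total
        p_right = right_counts[c2] / total
        # Contribution p(c1, c2) * log2( p(c1, c2) / (p(c1) * p(c2)) ).
        mi += p_joint * math.log2(p_joint / (p_left * p_right))
    return mi
```

Collapsing every word into a single class drives this value to zero, while more classes can only retain the same or more information; that trade-off against the number of classes is one reason the class count is worth tuning rather than leaving at a default.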