162,633 research outputs found
The Design and Implementation of Collaborative Filtering in Data Mining
Data mining is the process of discovering explicit knowledge from large amounts of data stored in database, data warehouse or other repositories. There have been many studies about models of data mining such as association rule, sequential pattern and so on. Collaborative filtering is one of data mining models. In this paper, we propose two approaches to solving the mining process of collaborative filtering. Finally, collaborative filtering mining is applied to Knowledge Management system
Centric Model Assessment for Collaborative Data Mining
Data mining is the task of discovering interesting patterns from large amounts of data. There are many data mining tasks, such as classification, clustering, association rule mining and sequential pattern mining. Sequential pattern mining finds sets of data items that occur together frequently in some sequences. Collaborative data mining refers to a data mining setting where different groups are geographically dispersed but work together on the same problem in a collaborative way. Such a setting requires adequate software support. Group work is widespread in education. The goal is to enable the groups and their facilitators to see relevant as pects of the groups operation and provide feedbacks if these are more likely to be associated with positive or negative outcomes and where the problems are. We explore how useful mirror information can be extracted via a theory-driven approach and a range of clustering and sequential pattern mining. In this paper we describe an experiment with a simple implementation of such a collaborative data mining environment. The experiment brings to light several problems, one of which is related to model assessment. We discuss several possible solutions. This discussion can contribute to a better understanding of how collaborative data mining is best organized
Graph-RAT programming environment
Graph-RAT is a new programming environment specializing in relational data mining. It incorporates a number of different techniques into a single framework for data collection, data cleaning, propositionalization, and analysis. The language is functional where algorithms are executed over arbitrary sub-graphs of the data. Analytical results can be conducted using collaborative filtering or machine learning techniques. The example algorithms are under BSD license
A Hybrid Web Recommendation System based on the Improved Association Rule Mining Algorithm
As the growing interest of web recommendation systems those are applied to
deliver customized data for their users, we started working on this system.
Generally the recommendation systems are divided into two major categories such
as collaborative recommendation system and content based recommendation system.
In case of collaborative recommen-dation systems, these try to seek out users
who share same tastes that of given user as well as recommends the websites
according to the liking given user. Whereas the content based recommendation
systems tries to recommend web sites similar to those web sites the user has
liked. In the recent research we found that the efficient technique based on
asso-ciation rule mining algorithm is proposed in order to solve the problem of
web page recommendation. Major problem of the same is that the web pages are
given equal importance. Here the importance of pages changes according to the
fre-quency of visiting the web page as well as amount of time user spends on
that page. Also recommendation of newly added web pages or the pages those are
not yet visited by users are not included in the recommendation set. To
over-come this problem, we have used the web usage log in the adaptive
association rule based web mining where the asso-ciation rules were applied to
personalization. This algorithm was purely based on the Apriori data mining
algorithm in order to generate the association rules. However this method also
suffers from some unavoidable drawbacks. In this paper we are presenting and
investigating the new approach based on weighted Association Rule Mining
Algorithm and text mining. This is improved algorithm which adds semantic
knowledge to the results, has more efficiency and hence gives better quality
and performances as compared to existing approaches.Comment: 9 pages, 7 figures, 2 table
An improved switching hybrid recommender system using naive Bayes classifier and collaborative filtering
Recommender Systems apply machine learning and data mining techniques for filtering unseen information and can predict whether a user would like a given resource. To date a number of recommendation algorithms have been proposed, where collaborative filtering and content-based filtering are the two most famous and adopted recommendation techniques. Collaborative filtering recommender systems recommend items by identifying other users with similar taste and use their opinions for recommendation; whereas content-based recommender systems recommend items based on the content information of the items. These systems suffer from scalability, data sparsity, over specialization, and cold-start problems resulting in poor quality recommendations and reduced coverage. Hybrid recommender systems combine individual systems to avoid certain aforementioned limitations of these systems. In this paper, we proposed a unique switching hybrid recommendation approach by combining a Naive Bayes classification approach with the collaborative filtering. Experimental results on two different data sets, show that the proposed algorithm is scalable and provide better performance – in terms of accuracy and coverage – than other algorithms while at the same time eliminates some recorded problems with the recommender systems
Innovative Platform for Designing Hybrid Collaborative & Context-Aware Data Mining Scenarios
The process of knowledge discovery involves nowadays a major number of
techniques. Context-Aware Data Mining (CADM) and Collaborative Data Mining
(CDM) are some of the recent ones. the current research proposes a new hybrid
and efficient tool to design prediction models called Scenarios
Platform-Collaborative & Context-Aware Data Mining (SP-CCADM). Both CADM and
CDM approaches are included in the new platform in a flexible manner; SP-CCADM
allows the setting and testing of multiple configurable scenarios related to
data mining at once. The introduced platform was successfully tested and
validated on real life scenarios, providing better results than each standalone
technique-CADM and CDM. Nevertheless, SP-CCADM was validated with various
machine learning algorithms-k-Nearest Neighbour (k-NN), Deep Learning (DL),
Gradient Boosted Trees (GBT) and Decision Trees (DT). SP-CCADM makes a step
forward when confronting complex data, properly approaching data contexts and
collaboration between data. Numerical experiments and statistics illustrate in
detail the potential of the proposed platform.Comment: 15 figure
A Data Mining Toolbox for Collaborative Writing Processes
Collaborative writing (CW) is an essential skill in academia and industry. Providing support during the process of CW can be useful not only for achieving better quality documents, but also for improving the CW skills of the writers. In order to properly support collaborative writing, it is essential to understand how ideas and concepts are developed during the writing process, which consists of a series of steps of writing activities. These steps can be considered as sequence patterns comprising both time events and the semantics of the changes made during those steps. Two techniques can be combined to examine those patterns: process mining, which focuses on extracting process-related knowledge from event logs recorded by an information system; and semantic analysis, which focuses on extracting knowledge about what the student wrote or edited. This thesis contributes (i) techniques to automatically extract process models of collaborative writing processes and (ii) visualisations to describe aspects of collaborative writing. These two techniques form a data mining toolbox for collaborative writing by using process mining, probabilistic graphical models, and text mining. First, I created a framework, WriteProc, for investigating collaborative writing processes, integrated with the existing cloud computing writing tools in Google Docs. Secondly, I created new heuristic to extract the semantic nature of text edits that occur in the document revisions and automatically identify the corresponding writing activities. Thirdly, based on sequences of writing activities, I propose methods to discover the writing process models and transitional state diagrams using a process mining algorithm, Heuristics Miner, and Hidden Markov Models, respectively. Finally, I designed three types of visualisations and made contributions to their underlying techniques for analysing writing processes. All components of the toolbox are validated against annotated writing activities of real documents and a synthetic dataset. I also illustrate how the automatically discovered process models and visualisations are used in the process analysis with real documents written by groups of graduate students. I discuss how the analyses can be used to gain further insight into how students work and create their collaborative documents
- …