10 research outputs found
Book Recommendation Based on Library Loan Records and Bibliographic Information
In order to show the effectiveness of using (a) library loan records and (b) information about book contents as a basis for book recommendations, we entered various data into a support vector machine (SVM), used it to recommend books to subjects, and asked them for evaluations of the recommendations that were given. The data that we used were (1) confidence and support with an association rule that was based on the loan records, (2) similarities between book titles, (3) matches/mismatches between the Nippon Decimal Classification (NDC) categories of the books, and (4) similarities between the outlines of the books in the BOOK Database. The subjects were 32 students who belonged to T University. The books that we recommended and the loan records that we used were obtained from the T University Library. The results showed that the combinations of (1), (2), (3) and (1), (2) were rated more favorably by the subjects than the other combinations. However, the books that were recommended by Amazon were rated even more favorably by the subjects. This is a topic for further research.
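As an illustration of feature (1), the support and confidence of an association rule "borrowers of book A also borrow book B" could be computed from loan records roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `rule_metrics`, the data layout (borrower id mapped to a set of book ids) and the sample records are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def rule_metrics(loans):
    """Return {(a, b): (support, confidence)} for every ordered book pair.

    `loans` maps a borrower id to the set of book ids they borrowed.
    support(a -> b)    = share of borrowers who borrowed both a and b
    confidence(a -> b) = share of borrowers of a who also borrowed b
    """
    n = len(loans)
    single = Counter()  # how many borrowers took each book
    pair = Counter()    # how many borrowers took each unordered pair
    for books in loans.values():
        single.update(books)
        pair.update(combinations(sorted(books), 2))

    metrics = {}
    for (a, b), together in pair.items():
        metrics[(a, b)] = (together / n, together / single[a])
        metrics[(b, a)] = (together / n, together / single[b])
    return metrics


# Hypothetical loan records, for illustration only.
loans = {
    "u1": {"B1", "B2"},
    "u2": {"B1", "B2", "B3"},
    "u3": {"B1"},
    "u4": {"B3"},
}
metrics = rule_metrics(loans)
support, confidence = metrics[("B1", "B2")]
```

In a setup like the one the abstract describes, the two numbers for a candidate pair would be two of the inputs to the classifier, alongside the title, NDC and outline features.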
A Conceptual Data Mining Model (DMM) used in Selective Dissemination of Information (SDI): a case study of Strathmore University library
Rationale - The process of locating and acquiring relevant information from libraries is getting more complicated due to the vast amount of information resources one has to plough through. To serve users purposefully, an academic library should provide users with tools and services that lessen the task of searching for information.
Design - The research proposed a two-phase data mining approach based on analysing the access behaviour of users. In the first phase, the Ant Colony Clustering Algorithm was used as the data mining method and separated users into several clusters depending on their access records. The clusters were in the form of course groupings. Users who had similar interests and behaviour were collected in the same cluster. In the second phase, the user records in the same cluster were analysed further. The second phase relied on association, which was used to discover the relationships between users and information resources, users' interests and their information access behaviour.
Findings - It was ascertained that although users were able to locate and retrieve the information they needed, it was not up to the degree of satisfaction they expected. Furthermore, it took them some time to acquire the information. Using data mining together with selective dissemination of information would enable users to access relevant information promptly, thus saving time and other resources.
Practical implications - The mining of user data within library databases would facilitate a better understanding of user needs and requirements leading to the development and delivery of specialised and more fulfilling services.
Originality - The proposed DMM is original, as it is one of a kind in suggesting the integration of SDI with data mining in libraries.
Comparative study of apriori-variant algorithms
The Big Data era is currently generating tremendous amounts of data in various fields such as finance, social media, transportation and medicine. Handling and processing this "big data" demands powerful data mining methods and analysis tools that can turn data into useful knowledge. One data mining method is frequent itemset mining, which has been implemented in real-world applications such as identifying buying patterns in grocery shopping and online customers' behavior. Apriori is a classical algorithm in frequent itemset mining that is able to discover a large number of itemsets that meet a certain threshold value. However, the algorithm suffers from a scanning time problem while generating candidates of frequent itemsets. This study presents a comparison of several Apriori-variant algorithms and examines their scanning time. We performed experiments using several sets of different transactional data. The results show that the improved Apriori algorithm manages to produce itemsets faster than the original Apriori algorithm.
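For readers unfamiliar with the baseline, the classical Apriori procedure can be sketched as below. This is a textbook-style illustration, not code from the study and not one of the improved variants it compares. The absolute-count threshold `min_support` and the sample transactions are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    `min_support` is an absolute count threshold (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Scan step: one full pass over the data per level. This repeated
        # scanning is the cost that Apriori variants try to reduce.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = apriori(transactions, min_support=3)
```

Each pass of the `while` loop rescans every transaction, which is why the number of levels and the size of the candidate set dominate the running time on large data.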
Apply Text Mining Analytics to Virtual Reference Services: A Case Study on the Email Q & A Service at an Academic Health Sciences Library
Academic libraries receive and reply to numerous patron emails via their virtual reference services, such as Ask a Librarian. This paper presents a text mining approach to analyzing one year of email records accumulated from the Ask-a-Librarian service by the Health Sciences Library (HSL) at the University of North Carolina at Chapel Hill. This study will help HSL improve their email service by revealing key topics from user questions and the characteristics of user information seeking behavior. Master of Science in Information Science
Facilitating resource allocation decision through bibliomining: the case of UTM's library
Libraries have developed vastly, and demands from users, institutions and international organizations, together with technological advancement, have changed library planning and decision making in many ways, including budgeting, human resources and infrastructure allocation. This research (a) examines the characteristics of data from data reservoirs regarding user/patron information and circulation information; (b) explores the patterns and trends among these data reservoirs using data mining analysis of about 957,224 borrowing history records, 31,052 registered readers and 139,195 book title and author records from the Universiti Teknologi Malaysia library from 2008 to 2010; and (c) studies how the constructed patterns and trends generate informed decisions on resource allocation for the circulation function, using cluster analysis, frequency statistics, averages and aggregates, and a market basket analysis algorithm. This thesis highlights the findings of research using a data mining methodology (CRISP-DM) to explore the potential of the bibliographic data of an academic library. With nearly 1 million records of collection in various formats, the Library of Universiti Teknologi Malaysia was chosen as the case study for the research. The data mining technique was adopted to explore the relationships among statistically patterned and clustered bibliographic data. Bibliomining tools can visualize how libraries manage their costs, staff activity, customer service, user needs, marketing, popular books, circulation, reference transactions, quality of collection, educational programs, etc. Similar data mining techniques are suggested for use in different library settings, and even in enterprises, to make more effective use of organizational resources.
Optimizing E-Management Using Web Data Mining
Today, one of the biggest challenges that E-management systems face is the explosive growth of operating data and the use of this data to enhance services. Web usage mining has emerged as an important technique for providing useful management information from users' Web data. One area where such information is needed is Web-based academic digital libraries. A digital library (D-library) is an information resource system that stores resources in digital format and provides access to users through the network. Academic libraries offer a huge amount of information resources; these resources overwhelm students and make it difficult for them to access relevant information. Proposed solutions to alleviate this issue emphasize the need to build Web recommender systems that offer each student a list of resources that they would be interested in. Collaborative filtering is the most successful technique used to offer recommendations to users. It provides recommendations according to user relevance feedback that tells the system their preferences. Most recent work on D-library recommender systems uses explicit feedback.
Explicit feedback requires students to rate resources, which makes the recommendation process unrealistic because few students are willing to state their interests explicitly. Thus, collaborative filtering suffers from the "data sparsity" problem. In response, the study proposed a Web usage mining framework to alleviate the sparsity problem. The framework incorporates a clustering technique and usage data in the recommendation process. Students perform different actions on the D-library; in this study five actions are identified: printing, downloading, bookmarking, reading, and viewing the abstract. These actions provide the system with large quantities of implicit feedback data. The proposed framework also utilizes a clustering data mining approach to reduce the sparsity problem. Furthermore, generating recommendations based on clusters produces better results because students belonging to the same cluster usually have similar interests.
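To make the idea of implicit feedback and sparsity concrete, the following sketch turns logged actions into a student-by-resource rating table and measures how empty it is. It is an illustration only: the abstract names the five actions but gives no weights, so `ACTION_WEIGHTS`, the "keep the strongest action" rule and the sample events are all assumptions.

```python
# Hypothetical weights: the abstract lists the five actions but does not say
# how strongly each one signals interest.
ACTION_WEIGHTS = {
    "print": 5,
    "download": 4,
    "bookmark": 3,
    "read": 2,
    "view_abstract": 1,
}


def implicit_ratings(events):
    """Collapse (student, resource, action) events into implicit ratings,
    keeping the strongest action seen for each (student, resource) pair."""
    ratings = {}
    for student, resource, action in events:
        key = (student, resource)
        ratings[key] = max(ratings.get(key, 0), ACTION_WEIGHTS[action])
    return ratings


def sparsity(ratings, n_students, n_resources):
    """Fraction of the student-by-resource matrix that has no rating."""
    return 1 - len(ratings) / (n_students * n_resources)


# Hypothetical usage log: 3 students, 4 resources.
events = [
    ("s1", "r1", "view_abstract"),
    ("s1", "r1", "download"),
    ("s1", "r2", "read"),
    ("s2", "r3", "print"),
    ("s3", "r4", "bookmark"),
]
ratings = implicit_ratings(events)
level = sparsity(ratings, n_students=3, n_resources=4)
```

Because every logged action contributes a rating, a matrix built this way is usually less sparse than one built only from ratings that students enter by hand.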
The proposed framework is divided into two main components: off-line and online. The off-line component comprises two stages: data pre-processing and the derivation of student clusters. The online component comprises two stages: building the student's profile and generating recommendations. The second stage consists of three steps. In the first step, the target student profile is assigned to the closest cluster profile using the cosine similarity measure. In the second step, the Pearson correlation coefficient is used to select the students most similar to the target student from the chosen cluster to serve as a source of prediction. Finally, a top list of resources is presented. Using the Book-Crossing dataset, the effectiveness of the proposed framework was evaluated based on sparsity level and, for accuracy, Mean Absolute Error (MAE). The proposed framework reduced the sparsity level by between 0.07% and 26.71% in the sub-matrices: the sparsity level was between 99.79% and 78.81% using the proposed framework, compared with 99.86% for the original matrix before applying it. The experimental results indicated that the proposed framework performs as much as 13.12% better than clustering with only explicit feedback data, and 21.14% better than the standard K Nearest Neighbours method. The overall results show that the proposed framework can alleviate the sparsity problem, thereby improving the accuracy of the recommendations.
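The online recommendation steps and the accuracy measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: rating vectors use 0 for "no rating", Pearson correlation is computed over co-rated resources only, and the cluster profiles, member vectors and target profile are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def pearson(u, v):
    """Pearson correlation over co-rated resources (0 means unrated)."""
    common = [(a, b) for a, b in zip(u, v) if a and b]
    if len(common) < 2:
        return 0.0
    mu = sum(a for a, _ in common) / len(common)
    mv = sum(b for _, b in common) / len(common)
    num = sum((a - mu) * (b - mv) for a, b in common)
    den = math.sqrt(
        sum((a - mu) ** 2 for a, _ in common)
        * sum((b - mv) ** 2 for _, b in common)
    )
    return num / den if den else 0.0


def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical data: two cluster profiles and the members of cluster "c1".
cluster_profiles = {"c1": [5, 4, 0, 0], "c2": [0, 0, 4, 5]}
cluster_members = {"c1": {"s1": [5, 4, 0, 2], "s2": [2, 5, 1, 3]}}
target = [4, 5, 0, 1]

# Step 1: assign the target student to the closest cluster profile.
closest = max(cluster_profiles, key=lambda c: cosine(target, cluster_profiles[c]))

# Step 2: rank that cluster's members by Pearson correlation with the target.
members = cluster_members[closest]
neighbours = sorted(members, key=lambda s: pearson(target, members[s]), reverse=True)
```

Restricting the neighbour search to one cluster keeps the Pearson step small, which is the practical benefit of clustering before recommending.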
Smart library model based on big data technologies
The subject of this doctoral dissertation research is the development of a smart library model based on big data technologies and services. The central research problem discussed in the thesis is the development of big data infrastructure and smart library services that enable intelligent searches and recommendations from the library content. A particular focus of the thesis is an examination of the possibility of integrating the developed model into a smart educational environment in order to improve the quality of the educational process.
The thesis presents a model of the smart library as an integral part of the educational system that would improve the quality and comprehensiveness of learning resources and increase the motivation of its users through content-aware recommendations. The model described in the thesis considers the possibilities of applying a big data system for the collection, analysis, processing and visualization of data from multiple sources, and the integration of data into the smart library. The goal of developing a smart library is to improve the library's business processes and to offer users innovative methods for searching and using content.
The thesis discusses the perspective of implementing a big data solution for smart libraries as part of a continuous learning process, with the aim of improving the results of library operations by integrating traditional systems with big data technology. In addition to the above system components, the model includes the infrastructure and integration of a recommender system for collaborative filtering that incorporates multiple sources of diverse data with big data technologies.
The model was evaluated by testing and measuring the relevant performance parameters that influence the efficiency of the proposed model.
Establishing User Requirements for a Recommender System in an Online Union Catalogue: an Investigation of WorldCat.org
This project, undertaken in collaboration with OCLC, aimed to investigate the potential role of recommendations within WorldCat, the publicly accessible union catalogue of libraries participating in the OCLC global cooperative. The goal of the project was a set of conceptual design guidelines for a WorldCat.org recommender system, based on a comprehensive understanding of the system's users and their needs.
Taking a mixed-methods approach, the investigation consisted of four phases. Phase one consisted of twenty-one focus groups with key user groups held in three locations: the UK, the US, and Australia and New Zealand. Phase two consisted of a pop-up survey implemented on WorldCat.org, which gathered 2,918 responses. Phase three was an analysis of two months of WorldCat.org transaction log data, consisting of over 15,000,000 sessions. Phase four was a lab-based user study investigating and comparing the use of WorldCat.org with Amazon.
Findings from each strand were integrated, and the key themes to emerge from the research are discussed. Different methods of classifying the WorldCat.org user population are presented, along with a taxonomy of work- and search-tasks. Key perspectives on the utility of a recommender system are considered, along with a reflection on how the information search behaviour exhibited by users interacting with recommendations while undertaking typical catalogue tasks can be interpreted.
Based on the enriched perspective of the system, and the role of recommendation in the catalogue, a series of conceptual design specifications are presented for the development of a WorldCat.org recommender system.