39 research outputs found

    Optimizing E-Management Using Web Data Mining

    Get PDF
    Today, one of the biggest challenges that E-management systems face is the explosive growth of operating data and to use this data to enhance services. Web usage mining has emerged as an important technique to provide useful management information from user's Web data. One of the areas where such information is needed is the Web-based academic digital libraries. A digital library (D-library) is an information resource system to store resources in digital format and provide access to users through the network. Academic libraries offer a huge amount of information resources, these information resources overwhelm students and makes it difficult for them to access to relevant information. Proposed solutions to alleviate this issue emphasize the need to build Web recommender systems that make it possible to offer each student with a list of resources that they would be interested in. Collaborative filtering is the most successful technique used to offer recommendations to users. Collaborative filtering provides recommendations according to the user relevance feedback that tells the system their preferences. Most recent work on D-library recommender systems uses explicit feedback. Explicit feedback requires students to rate resources which make the recommendation process not realistic because few students are willing to provide their interests explicitly. Thus, collaborative filtering suffers from “data sparsity” problem. In response to this problem, the study proposed a Web usage mining framework to alleviate the sparsity problem. The framework incorporates clustering mining technique and usage data in the recommendation process. Students perform different actions on D-library, in this study five different actions are identified, including printing, downloading, bookmarking, reading, and viewing the abstract. These actions provide the system with large quantities of implicit feedback data. The proposed framework also utilizes clustering data mining approach to reduce the sparsity problem. Furthermore, generating recommendations based on clusters produce better results because students belonging to the same cluster usually have similar interests. The proposed framework is divided into two main components: off-line and online components. The off-line component is comprised of two stages: data pre-processing and the derivation of student clusters. The online component is comprised of two stages: building student's profile and generating recommendations. The second stage consists of three steps, in the first step the target student profile is classified to the closest cluster profile using the cosine similarity measure. In the second phase, the Pearson correlation coefficient method is used to select the most similar students to the target student from the chosen cluster to serve as a source of prediction. Finally, a top-list of resources is presented. Using the Book-Crossing dataset the effectiveness of the proposed framework was evaluated based on sparsity level, and Mean Absolute Error (MAE) regarding accuracy. The proposed framework reduced the sparsity level between (0.07% and 26.71%) in the sub-matrices, whereas the sparsity level is between 99.79% and 78.81% using the proposed framework, and 99.86% (for the original matrix) before applying the proposed framework. The experimental results indicated that by using the proposed framework the performance is as much as 13.12% better than clustering-only explicit feedback data, and 21.14% better than the standard K Nearest Neighbours method. The overall results show that the proposed framework can alleviate the Sparsity problem resulting in improving the accuracy of the recommendations

    Citation recommendation: approaches and datasets

    Get PDF
    Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction to automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Get PDF
    Accessing up-to-date and quality scientific literature is a critical preliminary step in any research activity. Identifying relevant scholarly literature for the extents of a given task or application is, however a complex and time consuming activity. Despite the large number of tools developed over the years to support scholars in their literature surveying activity, such as Google Scholar, Microsoft Academic search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows research trends and directions. State of the art systems, in fact, either do not allow exploratory search activity, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, which are both critical to researchers. To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and therefore monitor the state of the art. Building such a system is however a complex task that implies tackling non trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of Automatic Curator System and present its fundamental components.openDottorato di ricerca in InformaticaopenDe Nart, Dari

    Citation Recommendation: Approaches and Datasets

    Get PDF
    Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction into automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods, and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles.Comment: to be published in the International Journal on Digital Librarie

    Combination of web usage, content and structure information for diverse web mining applications in the tourism context and the context of users with disabilities

    Get PDF
    188 p.This PhD focuses on the application of machine learning techniques for behaviourmodelling in different types of websites. Using data mining techniques two aspects whichare problematic and difficult to solve have been addressed: getting the system todynamically adapt to possible changes of user preferences, and to try to extract theinformation necessary to ensure the adaptation in a transparent manner for the users,without infringing on their privacy. The work in question combines information of differentnature such as usage information, content information and website structure and usesappropriate web mining techniques to extract as much knowledge as possible from thewebsites. The extracted knowledge is used for different purposes such as adaptingwebsites to the users through proposals of interesting links, so that the users can get therelevant information more easily and comfortably; for discovering interests or needs ofusers accessing the website and to inform the service providers about it; or detectingproblems during navigation.Systems have been successfully generated for two completely different fields: thefield of tourism, working with the website of bidasoa turismo (www.bidasoaturismo.com)and, the field of disabled people, working with discapnet website (www.discapnet.com)from ONCE/Tecnosite foundation

    Research in the Archival Multiverse

    Get PDF
    Over the past 15 years, the field of archival studies around the world has experienced unprecedented growth within the academy and within the profession, and archival studies graduate education programs today have among the highest enrolments in any information field. During the same period, there has also been unparalleled expansion and innovation in the diversity of methods and theories being applied in archival scholarship. Global in scope, Research in the Archival Multiverse compiles critical and reflective essays across a wide range of emerging research areas and interests in archival studies; it aims to provide current and future archival academics with a text addressing possible methods and theoretical frameworks that have been and might be used in archival scholarship and research

    Geographic information extraction from texts

    Get PDF
    A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although large progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss the recent advances, new ideas, and concepts but also identify research gaps in geographic information extraction

    Research in the Archival Multiverse

    Get PDF
    Over the past 15 years, the field of archival studies around the world has experienced unprecedented growth within the academy and within the profession, and archival studies graduate education programs today have among the highest enrolments in any information field. During the same period, there has also been unparalleled expansion and innovation in the diversity of methods and theories being applied in archival scholarship. Global in scope, Research in the Archival Multiverse compiles critical and reflective essays across a wide range of emerging research areas and interests in archival studies; it aims to provide current and future archival academics with a text addressing possible methods and theoretical frameworks that have been and might be used in archival scholarship and research

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Get PDF
    Nowadays, open science is a hot topic on all levels and also is one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may a great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institute

    Computation in Complex Networks

    Get PDF
    Complex networks are one of the most challenging research focuses of disciplines, including physics, mathematics, biology, medicine, engineering, and computer science, among others. The interest in complex networks is increasingly growing, due to their ability to model several daily life systems, such as technology networks, the Internet, and communication, chemical, neural, social, political and financial networks. The Special Issue “Computation in Complex Networks" of Entropy offers a multidisciplinary view on how some complex systems behave, providing a collection of original and high-quality papers within the research fields of: • Community detection • Complex network modelling • Complex network analysis • Node classification • Information spreading and control • Network robustness • Social networks • Network medicin
    corecore