
    Atıf Klasiklerinin Etkisinin ve İlgililik Sıralamalarının Pennant Diyagramları ile Analizi (Analysis of the Impact and Relevance Rankings of Citation Classics Using Pennant Diagrams)

    Citation indexes are authoritative resources for measuring the contribution of scientists and scientific publications to the literature. Many studies in information retrieval are based on research aiming to develop retrieval algorithms. These studies tend to receive citations from different fields because of the interdisciplinary nature of information retrieval. Therefore, it is important to analyze the so-called “citation classics” retrospectively to find out their impact on other fields. Yet this is not easy to do using citation indexes, especially for relatively old papers, as traditional citation analysis tends not to reveal the full impact of a work on other studies in its own time and in the periods that follow. To see the big picture, it is important to study the contribution of these studies to other disciplines as well. In this study, the impact of Maron and Kuhns’ citation classic on “probabilistic retrieval”, published in 1960, was visualized using pennant diagrams, which were developed on the basis of relevance theory, information retrieval and bibliometrics. We hypothesized that “the interdisciplinary relations that are unobservable with traditional citation analysis can be revealed using the pennant diagrams method”. To test the hypothesis, works that cited Maron and Kuhns’ study between 1960 and 2015 were downloaded with their references (a total of 4,176 unique works), and graphs were prepared using macros written in MS Excel. Of the 4,176 works, 90 were selected using convenience sampling to create static and interactive pennant diagrams for further analysis. Another important output of this study is the relevance rankings. 
As an alternative to the relevance rankings based on the similarity of references already used in citation indexes, relevance rankings were created using pennant diagrams, which take into account not only the items that cited the core (seed) paper but also the citations to those citing items. The relevance rankings based on reference similarity and those based on pennant diagrams were then compared. The findings support the hypothesis in that pennant diagrams show which papers the core paper on the probabilistic model influenced or was influenced by, directly or indirectly. Relevance ranking based on pennant diagrams revealed the impact of the core paper on the information retrieval field as well as on other disciplines. Furthermore, it identified relations between these somewhat disconnected fields, and between authors, works, and journals, that cannot be readily identified using traditional citation analysis. Relevance rankings using pennant diagrams appear to have been more successful than the relevance rankings based on reference similarity. This is the first study in Turkey to use pennant diagrams for relevance rankings. The data used in the graphs and relevance rankings (the frequencies of total citations and co-citations) are available through citation indexes. Thus, alternative relevance rankings based on pennant diagrams can be offered to users. Pennant diagrams can help researchers track the relevant literature more easily as well as identify how a core work influences other works in a specific field or in other fields.
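
The abstract states that the diagrams and rankings are built from two frequencies taken from citation indexes: each item's co-citation count with the seed paper and its total citation count. It does not give the formula used. The following Python sketch therefore assumes a common tf-idf style weighting, with the logged co-citation count on one axis and the logged inverse of the total citation count on the other. The function names, the `collection_size` parameter, and all the numbers are hypothetical and are not taken from the study.

```python
import math


def pennant_coordinates(cocitations, total_citations, collection_size):
    """Return assumed (x, y) pennant coordinates for one item.

    x = 1 + log10(co-citations with the seed)          (tf-like axis)
    y = log10(collection_size / total citation count)  (idf-like axis)
    This weighting is an assumption; the abstract does not state it.
    """
    if cocitations < 1:
        raise ValueError("item must be co-cited with the seed at least once")
    if total_citations < cocitations:
        raise ValueError("total citations cannot be below co-citations")
    if collection_size < total_citations:
        raise ValueError("collection size cannot be below total citations")
    x = 1 + math.log10(cocitations)
    y = math.log10(collection_size / total_citations)
    return x, y


def rank_by_pennant(items, collection_size):
    """Rank items by the product of their two coordinates, highest first.

    `items` maps an item label to (co-citations with seed, total citations).
    """
    scored = []
    for label, (cocitations, total_citations) in items.items():
        x, y = pennant_coordinates(cocitations, total_citations, collection_size)
        scored.append((label, x, y, x * y))
    return sorted(scored, key=lambda row: row[3], reverse=True)


# Hypothetical counts, invented purely for illustration.
example_items = {
    "A": (100, 1_000),    # often co-cited with the seed, moderately cited overall
    "B": (10, 10),        # rarely cited, but always together with the seed
    "C": (100, 100_000),  # often co-cited, but cited very widely in general
}

for label, x, y, score in rank_by_pennant(example_items, collection_size=1_000_000):
    print(f"{label}: x={x:.2f} y={y:.2f} score={score:.2f}")
```

Under these assumptions, an item that is cited almost only alongside the seed ranks above one that is co-cited more often but is also cited very widely, which is the kind of specificity a ranking based on reference similarity alone does not capture.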

    A bi-directional unified Model for information retrieval

    Relevance matching between two information objects such as a document and query or a user and product (e.g. movie) is an important problem in information retrieval systems. The most common and most successful way to approach this problem is by probabilistically modelling the relevance between information objects, and computing their relevance matching as the probability of relevance. The objective of a probabilistic relevance retrieval model is to compute the probability of relevance between a given information object pair using all the available information about the individual objects (e.g., document and query), the existing relevance information on both objects and all the information available on other information objects (other documents, queries in the collection and the relevance information on them). The probabilistic retrieval models developed to date are not capable of utilising all available information due to the lack of a unified theory for relevance modelling. More than three decades ago, the notion of simultaneously utilising the relevance information about individual user needs and individual documents to come to a retrieval decision was formalised as the problem of a unified relevance model for Information Retrieval (IR). Since the inception of the unified model, a number of unsuccessful attempts have been made to develop a formal probabilistic relevance model to solve the problem. This thesis provides a new theory and a probabilistic relevance framework that not only solves the problem of the original unified relevance model but also provides the capability to utilise any available information about the information objects in computing the probability of relevance. In this thesis, we consider information matching between two objects (e.g. documents and queries) to be bi-directional preference matching and the relevance between them is thus established and estimated on top of the bi-directional relationship. 
A key benefit of this bi-directional approach is that the resulting probabilistic bi-directional unified model not only solves the original problem of a unified model in information retrieval but also has the ability to incorporate all of the available information on the information objects (documents and queries) into a single model while computing the probability of relevance. Theoretically, we demonstrate the effectiveness of applying our single framework by deriving relevance ranking functions for popular retrieval scenarios such as collaborative filtering (recommendation), group recommendation and ad-hoc retrieval. In the past, relevance matching in each of these retrieval scenarios was approached with a different solution/framework, partly due to the kind of information available to the retrieval system for computing the probability of relevance. However, the underlying problem of information matching is the same in all scenarios, and a solution to the problem of a unified model should be applicable to all of them. One of the interesting aspects of our new theory and model, when applied to a collaborative filtering scenario, is that it computes the probability of relevance between a given user and a given item without applying any dimensionality reduction technique or computing an explicit similarity between users/items, in contrast to state-of-the-art collaborative filtering/recommender models (e.g. Matrix Factorisation methods, neighbourhood-based methods). This property allows the retrieval model to represent users and items independently with their own features, rather than forcing it to use a common feature space (e.g., common hidden factor-features between a user-item pair of objects or a common vocabulary space between a document-query pair of objects). 
The effectiveness of this theoretical framework is demonstrated in various real-world applications by experimenting on datasets in collaborative filtering, group recommendation and ad-hoc retrieval tasks. For collaborative filtering and group recommendation, the model convincingly outperforms various state-of-the-art recommender models (or frameworks). For ad-hoc retrieval, the model also outperforms the state-of-the-art information retrieval models when it is restricted to using the same information used by the other models. The bi-directional unified model allows both search and personalisation/recommender (or collaborative filtering) systems to be built from a single model, which has not been possible before with the existing probabilistic relevance models. Finally, our theory and its framework have been adopted by some large companies in gaming, venture-capital matching, retail and media, and deployed on their web systems to match their customers, often in the tens of millions, with relevant content.