159 research outputs found

    Do Open Access Articles Have a Greater Research Impact?

    Get PDF
    While many authors believe that their work has a greater research impact if it is freely available, studies to demonstrate that impact are few. This study looks at articles in four disciplines at varying stages of adoption of open access—philosophy, political science, electrical and electronic engineering and mathematics—to see if they have a greater impact, as measured by citations in the ISI Web of Science database, if their authors make them freely available on the Internet. The finding is that, across all four disciplines, freely available articles do have a greater research impact. Shedding light on this category of open access reveals that scholars in diverse disciplines are both adopting open access practices and being rewarded for it

    The Most Influential Paper Gerard Salton Never Wrote

    Get PDF
    Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often cited overview paper titled ???A Vector Space Model for Information Retrieval??? (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which were overviews of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specifi c computations. Citations to the phantom paper refl ect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was fi rst proposed as an IR model.published or submitted for publicatio

    Do Open Access Articles Have a Greater Research Impact?

    Get PDF
    While many authors believe that their work has a greater research impact if it is freely available, studies to demonstrate that impact are few. This study looks at articles in four disciplines at varying stages of adoption of open access—philosophy, political science, electrical and electronic engineering and mathematics—to see if they have a greater impact, as measured by citations in the ISI Web of Science database, if their authors make them freely available on the Internet. The finding is that, across all four disciplines, freely available articles do have a greater research impact. Shedding light on this category of open access reveals that scholars in diverse disciplines are both adopting open access practices and being rewarded for it

    Social impact retrieval: measuring author inïŹ‚uence on information retrieval

    Get PDF
    The increased presence of technologies collectively referred to as Web 2.0 mean the entire process of new media production and dissemination has moved away from an authorcentric approach. Casual web users and browsers are increasingly able to play a more active role in the information creation process. This means that the traditional ways in which information sources may be validated and scored must adapt accordingly. In this thesis we propose a new way in which to look at a user's contributions to the network in which they are present, using these interactions to provide a measure of authority and centrality to the user. This measure is then used to attribute an query-independent interest score to each of the contributions the author makes, enabling us to provide other users with relevant information which has been of greatest interest to a community of like-minded users. This is done through the development of two algorithms; AuthorRank and MessageRank. We present two real-world user experiments which focussed around multimedia annotation and browsing systems that we built; these systems were novel in themselves, bringing together video and text browsing, as well as free-text annotation. Using these systems as examples of real-world applications for our approaches, we then look at a larger-scale experiment based on the author and citation networks of a ten year period of the ACM SIGIR conference on information retrieval between 1997-2007. We use the citation context of SIGIR publications as a proxy for annotations, constructing large social networks between authors. Against these networks we show the eïŹ€ectiveness of incorporating user generated content, or annotations, to improve information retrieval

    Argumentative zoning information extraction from scientific text

    Get PDF
    Let me tell you, writing a thesis is not always a barrel of laughs—and strange things can happen, too. For example, at the height of my thesis paranoia, I had a re-current dream in which my cat Amy gave me detailed advice on how to restructure the thesis chapters, which was awfully nice of her. But I also had a lot of human help throughout this time, whether things were going fine or beserk. Most of all, I want to thank Marc Moens: I could not have had a better or more knowledgable supervisor. He always took time for me, however busy he might have been, reading chapters thoroughly in two days. He both had the calmness of mind to give me lots of freedom in research, and the right judgement to guide me away, tactfully but determinedly, from the occasional catastrophe or other waiting along the way. He was great fun to work with and also became a good friend. My work has profitted from the interdisciplinary, interactive and enlightened atmosphere at the Human Communication Centre and the Centre for Cognitive Science (which is now called something else). The Language Technology Group was a great place to work in, as my research was grounded in practical applications develope

    The Role of Document Structure and Citation Analysis in Literature Information Retrieval

    Get PDF
    Literature Information Retrieval (IR) is the task of searching relevant publications given a particular information need expressed as a set of queries. With the staggering growth of scientific literature, it is critical to design effective retrieval solutions to facilitate efficient access to them. We hypothesize that particular genre specific characteristics of scientific literature such as metadata and citations are potentially helpful for enhancing scientific literature search. We conducted systematic and extensive IR experiments on open information retrieval test collections to investigate their roles in enhancing literature information retrieval effectiveness. This thesis consists of three major parts of studies. First, we examined the role of document structure in literature search through comprehensive studies on the retrieval effectiveness of a set of structure-aware retrieval models on ad hoc scientific literature search tasks. Second, under the language modeling retrieval framework, we studied exploiting citation and co-citation analysis results as sources of evidence for enhancing literature search. Specifically, we examined relevant document distribution patterns over partitioned clusters of document citation and co-citation graphs; we examined seven ways of modeling document prior probabilities of being relevant based on document citation and co-citation analysis; we studied the effectiveness of boosting retrieved documents with scores of their neighborhood documents in terms co-citation counts, co-citation similarities and Howard White's pennant scores. Third, we combined both structured retrieval features and citation related features in developing machine learned retrieval models for literatures search and assessed the effectiveness of learning to rank algorithms and various literature-specific features. Our major findings are as follows. State-of-the-art structure-ware retrieval models though reportedly perform well in known item finding tasks do not significantly outperform non-fielded baseline retrieval models in ad hoc literature information retrieval. Though relevant document distributions over citation and co-citation network graph partitions reveal favorable pattern, citation and co-citation analysis results on the current iSearch test collection only modestly improve retrieval effectiveness. However, priors derived from co-citation analysis outperform that derived from citation analysis, and pennant score for document expansion outperforms raw co-citation count or cosine similarity of co-citation counts. Our learning to rank experiments show that in a heterogeneous collection setting, citation related features can significantly outperform baselines.Ph.D., Information Studies -- Drexel University, 201

    NLP Driven Models for Automatically Generating Survey Articles for Scientific Topics.

    Full text link
    This thesis presents new methods that use natural language processing (NLP) driven models for summarizing research in scientific fields. Given a topic query in the form of a text string, we present methods for finding research articles relevant to the topic as well as summarization algorithms that use lexical and discourse information present in the text of these articles to generate coherent and readable extractive summaries of past research on the topic. In addition to summarizing prior research, good survey articles should also forecast future trends. With this motivation, we present work on forecasting future impact of scientific publications using NLP driven features.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113407/1/rahuljha_1.pd

    The Janus Faced Scholar:a Festschrift in honour of Peter Ingwersen

    Get PDF

    Evaluation of the Effectiveness of Cosine Similarity in Predicting Relevance between Paired Citing and Cited Sentences.

    Get PDF
    Citation analysis has a long history in Information Science. We examined the potential of cosine similarity to predict relevance between citing sentences and the articles they cite. An expert evaluated 22,697 pairs of cited and citing sentences, and marked 544 as relevant to one another. Cosine similarity gave 8386 of these pairs a similarity score over zero, which included 339 relevant pairs. (4% precision, 65% recall). Under 0.01% of each cited article was relevant to the citing sentence, making precise retrieval challenging. We performed a detailed error analysis. Cosine similarity performance was reduced by insufficient window size, affixes, hyphenation, acronyms and abbreviations. The following preprocessing steps would improve retrieval performance: using a stemming algorithm that accounts for prefixes, expanding the window of comparison from sentences to paragraphs, identifying synonyms and expanding abbreviations. Further investigation of the possibilities of cosine similarity is necessary, but such investigation is worth pursuit
    • 

    corecore