
    On inverted index compression for search engine efficiency

    Efficient access to the inverted index data structure is a key aspect for a search engine to achieve fast response times to users’ queries. While the performance of an information retrieval (IR) system can be enhanced through the compression of its posting lists, there is little recent work in the literature that thoroughly compares and analyses the performance of modern integer compression schemes across different types of posting information (document ids, frequencies, positions). In this paper, we experiment with different modern integer compression algorithms, integrating these into a modern IR system. Through comprehensive experiments conducted on two large, widely used document corpora and large query sets, our results show the benefit of compression for different types of posting information to the space- and time-efficiency of the search engine. Overall, we find that the simple Frame of Reference compression scheme results in the best query response times for all types of posting information. Moreover, we observe that the frequency and position posting information in Web corpora that have large volumes of anchor text are more challenging to compress, yet compression is beneficial in reducing average query response times.
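
    As an illustration of the Frame of Reference idea named in the abstract above, the following is a minimal sketch, not the paper's implementation. It assumes document ids are first turned into gaps, then each block stores its minimum as a reference and packs the offsets from that reference with a fixed bit width. The function names, the block size, and the use of a Python integer as the packed bit buffer are all assumptions made for illustration.

```python
from typing import List, Tuple

# One encoded block: (reference, bit_width, count, packed_bits).
# This layout is a hypothetical choice for the sketch.
Block = Tuple[int, int, int, int]


def to_gaps(doc_ids: List[int]) -> List[int]:
    """Turn a sorted list of document ids into gaps between neighbours."""
    gaps, prev = [], 0
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def from_gaps(gaps: List[int]) -> List[int]:
    """Rebuild document ids from gaps by a running sum."""
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def for_encode(values: List[int], block_size: int = 128) -> List[Block]:
    """Encode values block by block against each block's minimum."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        reference = min(chunk)
        offsets = [v - reference for v in chunk]
        # Every offset in the block is stored with the same number of bits.
        width = max(offsets).bit_length()
        packed = 0
        for i, offset in enumerate(offsets):
            packed |= offset << (i * width)
        blocks.append((reference, width, len(chunk), packed))
    return blocks


def for_decode(blocks: List[Block]) -> List[int]:
    """Unpack fixed-width offsets and add the reference back."""
    values = []
    for reference, width, count, packed in blocks:
        mask = (1 << width) - 1
        for i in range(count):
            values.append(reference + ((packed >> (i * width)) & mask))
    return values


doc_ids = [3, 7, 8, 15, 1000, 1001, 1005]
encoded = for_encode(to_gaps(doc_ids), block_size=4)
decoded = from_gaps(for_decode(encoded))
```

    In this sketch a single large gap only widens the block that contains it, which is one reason small blocks are commonly used; the block size of 4 here is chosen purely to keep the example short.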

    Index ordering by query-independent measures

    Conventional approaches to information retrieval search through all applicable entries in an inverted file for a particular collection in order to find those documents with the highest scores. For particularly large collections this may be extremely time consuming. A solution to this problem is to search only a limited part of the collection at query time, in order to speed up the retrieval process. In doing this we can also limit the loss in retrieval efficacy (in terms of accuracy of results). The way we achieve this is to first identify the most “important” documents within the collection, and sort documents within inverted file lists in order of this “importance”. In this way we limit the amount of information to be searched at query time by eliminating documents of lesser importance, which not only makes the search more efficient, but also limits loss in retrieval accuracy. Our experiments, carried out on the TREC Terabyte collection, report significant savings, in terms of number of postings examined, without significant loss of effectiveness when based on several measures of importance used in isolation, and in combination. Our results point to several ways in which the computation cost of searching large collections of documents can be significantly reduced.
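
    The approach described in the abstract above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' system: the `importance` scores, the raw term-frequency scoring, and the `max_postings` cut-off are all hypothetical stand-ins for whichever query-independent measures and ranking function the paper actually uses.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Posting = Tuple[int, int]  # (doc_id, term_frequency)


def build_index(docs: Dict[int, List[str]],
                importance: Dict[int, float]) -> Dict[str, List[Posting]]:
    """Build posting lists sorted by descending document importance."""
    index: Dict[str, List[Posting]] = defaultdict(list)
    for doc_id, terms in docs.items():
        for term in set(terms):
            index[term].append((doc_id, terms.count(term)))
    for postings in index.values():
        # Most important documents first; ties broken by doc id.
        postings.sort(key=lambda p: (-importance[p[0]], p[0]))
    return dict(index)


def search(index: Dict[str, List[Posting]], query_terms: List[str],
           max_postings: int, top_k: int = 10):
    """Score documents using only the first max_postings of each list."""
    scores: Dict[int, float] = defaultdict(float)
    examined = 0
    for term in query_terms:
        for doc_id, tf in index.get(term, [])[:max_postings]:
            scores[doc_id] += tf  # assumed scoring: sum of term frequencies
            examined += 1
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k], examined


docs = {1: ["a", "b"], 2: ["a"], 3: ["a", "a", "b"]}
importance = {1: 0.1, 2: 0.9, 3: 0.5}  # hypothetical importance scores
index = build_index(docs, importance)
results, examined = search(index, ["a", "b"], max_postings=2)
```

    Because each list is ordered by importance, truncating it drops the least important documents first, which is the trade-off between postings examined and retrieval accuracy that the abstract describes.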

    Getting Help from Course Management Software to Teach a Large-enrollment Introductory Geology Class

    This article deals with utilizing the Internet as a medium for empowering learning and course management of on-campus classes using the enterprise-wide software system WebCT, which has proven to be very useful in managing a large introductory geology class. The author presents the results of her students' learning experience with WebCT. The article also provides a simple tutorial on how to create an Internet-enhanced course in less than a day using WebCT and with no prior knowledge of HTML language or FTP procedures. Educational levels: Graduate or professional

    Keeping Up To Date with IP News Services and Blogs: Drowning in a Sea Of Sameness?

    It seems like so many IP-related websites you visit invite you to join their free email list to keep you up to date. Sources span a wide spectrum including governmental organizations, non-governmental organizations, educational institutions, consulting services, law firms, commercial publishers and more. These sources span the spectrum from free, to low-fee, to premium pricing. With all of this information overload and so many choices, how do you differentiate and choose news sources? The goals of this article are twofold. Goal one is to present a survey of types and categories of IP news tools available to IP researchers. Since these tools change with time, goal two is to present strategies and approaches to consider when assembling your portfolio of news sources. I use the term researcher to include anyone looking for news, including lawyers, paraprofessionals, academics, students, corporate searchers and more. Some of this material may be yesterday's news for some and breaking news for others. My hope is that you will find value added in some tools and strategies. Before I present the survey of tools, I want to propose some initial general strategies that might be helpful to apply as the detail of the tools unfolds.

    Web 2.0 technologies for learning: the current landscape – opportunities, challenges and tensions: supplementary materials

    These supplementary materials accompany the report ‘Web 2.0 technologies for learning: the current landscape – opportunities, challenges and tensions’, which is the first report from research commissioned by Becta into Web 2.0 technologies for learning at Key Stages 3 and 4. This report describes findings from the commissioned literature review of the then-current landscape concerning learner use of Web 2.0 technologies and the implications for teachers, schools, local authorities and policy makers.

    TLAD 2010 Proceedings: 8th international workshop on teaching, learning and assessment of databases (TLAD)

    This is the eighth in the series of highly successful international workshops on the Teaching, Learning and Assessment of Databases (TLAD 2010), which once again is held as a workshop of BNCOD 2010 - the 27th International Information Systems Conference. TLAD 2010 is held on the 28th June at the beautiful Dudhope Castle at the Abertay University, just before BNCOD, and hopes to be just as successful as its predecessors. The teaching of databases is central to all Computing Science, Software Engineering, Information Systems and Information Technology courses, and this year, the workshop aims to continue the tradition of bringing together both database teachers and researchers, in order to share good learning, teaching and assessment practice and experience, and further the growing community amongst database academics. As well as attracting academics from the UK community, the workshop has also been successful in attracting academics from the wider international community, through serving on the programme committee, and attending and presenting papers. This year, the workshop includes an invited talk given by Richard Cooper (of the University of Glasgow) who will present a discussion and some results from the Database Disciplinary Commons which was held in the UK over the academic year. Due to the healthy number of high quality submissions this year, the workshop will also present seven peer reviewed papers, and six refereed poster papers. Of the seven presented papers, three will be presented as full papers and four as short papers. These papers and posters cover a number of themes, including: approaches to teaching databases, e.g. group centered and problem based learning; use of novel case studies, e.g. forensics and XML data; techniques and approaches for improving teaching and student learning processes; assessment techniques, e.g. peer review; methods for improving students' abilities to develop database queries and develop E-R diagrams; and e-learning platforms for supporting teaching and learning.