295,093 research outputs found

    Content-Based Weak Supervision for Ad-Hoc Re-Ranking

    Full text link
    One challenge with neural ranking is the need for a large amount of manually-labeled relevance judgments for training. In contrast with prior work, we examine the use of weak supervision sources for training that yield pseudo query-document pairs that already exhibit relevance (e.g., newswire headline-content pairs and encyclopedic heading-paragraph pairs). We also propose filtering techniques to eliminate training samples that are too far out of domain using two techniques: a heuristic-based approach and novel supervised filter that re-purposes a neural ranker. Using several leading neural ranking architectures and multiple weak supervision datasets, we show that these sources of training pairs are effective on their own (outperforming prior weak supervision techniques), and that filtering can further improve performance.Comment: SIGIR 2019 (short paper

    Don’t be scared: JISC TechDis - Making documents more accessible!

    Get PDF
    An article from JISC TechDis explaining how to use the Styles and Formatting toolbar from Microsoft to make documents more accessible

    Maskil(im) and Rabbim: from Daniel to Qumran

    Get PDF

    WWW Programming using computational logic systems (and the PiLLoW/Ciao library)

    Get PDF
    We discuss from a practical point of view a number of issues involved in writing Internet and WWW applications using LP/CLP systems. We describe Pd_l_oW, a public-domain Internet and WWW programming library for LP/CLP systems which we argüe significantly simplifies the process of writing such applications. Pd_l_oW provides facilities for generating HTML structured documents, producing HTML forms, writing form handlers, accessing and parsing WWW documents, and accessing code posted at HTTP addresses. We also describe the architecture of some application classes, using a high-level model of client-server interaction, active modules. We then propose an architecture for automatic LP/CLP code downloading for local execution, using generic browsers. Finally, we also provide an overview of related work on the topic. The PiLLoW library has been developed in the context of the &- Prolog and CIAO systems, but it has been adapted to a number of popular LP/CLP systems, supporting most of its functionality

    Stamp Duties (Miscellaneous) Amendment Act, 1995, No. 72

    Get PDF

    Annual accountability returns 2011

    Get PDF

    Transcript Conventions and Examples for Audio and Video Stories

    Get PDF
    This document was developed to guide members of the research team as they prepared transcripts of audio and video stories that were collected as part of the Launching through the Surf: The Dory Fleet of Pacific City project. The document includes information on general formatting conventions, stylistic issues, and examples for reference

    Fuzzy Content Mining for Targeted Advertisement

    Get PDF
    Content-targeted advertising system is becoming an increasingly important part of the funding source of free web services. Highly efficient content analysis is the pivotal key of such a system. This project aims to establish a content analysis engine involving fuzzy logic that is able to automatically analyze real user-posted Web documents such as blog entries. Based on the analysis result, the system matches and retrieves the most appropriate Web advertisements. The focus and complexity is on how to better estimate and acquire the keywords that represent a given Web document. Fuzzy Web mining concept will be applied to synthetically consider multiple factors of Web content. A Fuzzy Ranking System is established based on certain fuzzy (and some crisp) rules, fuzzy sets, and membership functions to get the best candidate keywords. Once it is has obtained the keywords, the system will retrieve corresponding advertisements from certain providers through Web services as matched advertisements, similarly to retrieving a products list from Amazon.com. In 87% of the cases, the results of this system can match the accuracy of the Google Adwords system. Furthermore, this expandable system will also be a solid base for further research and development on this topic
    corecore