325 research outputs found

    Plagiarism Detection in arXiv

    We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14-year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger. Comment: Sixth International Conference on Data Mining (ICDM'06), Dec 200

    Scalable Winner Determination in Advertising Auctions

    Internet search results are a growing and highly profitable advertising platform. Search providers auction advertising slots to advertisers on their search result pages. Due to the high volume of searches and the users' low tolerance for search result latency, it is imperative to resolve these auctions fast. Current approaches restrict the expressiveness of bids in order to achieve fast winner determination, which is the problem of allocating slots to advertisers so as to maximize the expected revenue given the advertisers' bids. The goal of our work is to permit more expressive bidding, thus allowing advertisers to achieve complex advertising goals, while still providing fast and scalable techniques for winner determination. We also discuss the application of our framework to advertising in massively multiplayer online games.

    Special Section on the International Conference on Data Engineering 2015

    The papers in this special section were presented at the 31st International Conference on Data Engineering that was held in Seoul, Korea, on April 13-17, 2015.