325 research outputs found
Plagiarism Detection in arXiv
We describe a large-scale application of methods for finding plagiarism in
research document collections. The methods are applied to a collection of
284,834 documents collected by arXiv.org over a 14-year period, covering a few
different research disciplines. The methodology efficiently detects a variety
of problematic author behaviors, and heuristics are developed to reduce the
number of false positives. The methods are also efficient enough to implement
as a real-time submission screen for a collection many times larger.
Comment: Sixth International Conference on Data Mining (ICDM'06), Dec 200
Scalable Winner Determination in Advertising Auctions
Internet search results are a growing and highly profitable advertising platform.
Search providers auction advertising slots to advertisers on their search result pages.
Due to the high volume of searches and the users' low tolerance for search result latency, it is imperative to resolve these auctions fast.
Current approaches restrict the expressiveness of bids in order to achieve fast winner determination, which is the problem of allocating slots to advertisers so as to maximize the expected revenue given the advertisers' bids.
The goal of our work is to permit more expressive bidding, thus allowing advertisers to achieve complex advertising goals, while still providing fast and scalable techniques for winner determination. We also discuss the application of our framework to advertising in massively multiplayer online games.
Special Section on the International Conference on Data Engineering 2015
The papers in this special section were presented at the 31st International Conference on Data Engineering, held in Seoul, Korea, on April 13-17, 2015.
- …