Search CORE

1,012 research outputs found

Ranking News-Quality Multimedia

Author: Arapakis Ioannis
Hasler David
Michael
Tang Z
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/10/2018
Field of study

News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while being pressed to also find the most recent photos of live events. Recently, it became common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the amount of images to be considered and filtered through is now too much to be handled by a person. To aid the news editor in this process, we propose a framework designed to deliver high-quality, news-press type photos to the user. The framework, composed of two parts, is based on a ranking algorithm tuned to rank professional media highly and a visual SPAM detection module designed to filter-out low-quality media. The core ranking algorithm is leveraged by aesthetic, social and deep-learning semantic features. Evaluation showed that the proposed framework is effective at finding high-quality photos (true-positive rate) achieving a retrieval MAP of 64.5% and a classification precision of 70%.Comment: To appear in ICMR'1

arXiv.org e-Print Archive

Crossref

Automated Crowdturfing Attacks and Defenses in Online Review Systems

Author: Arisoy Ebru
Fei Geli
Kakhki Arash Molavi
Kim Gyuwan
Lee Kyumin
Lee Kyumin
Li Fangtao
Maas Andrew L.
Maxwell Harper F.
Mukherjee Arjun
Sutskever Ilya
Publication venue
Publication date: 07/09/2017
Field of study

Malicious crowdsourcing forums are gaining traction as sources of spreading misinformation online, but are limited by the costs of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep learning language models (Recurrent Neural Networks or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can control rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect. Using Yelp reviews as an example platform, we show how a two phased review generation and customization attack can produce reviews that are indistinguishable by state-of-the-art statistical detectors. We conduct a survey-based user study to show these reviews not only evade human detection, but also score high on "usefulness" metrics by users. Finally, we develop novel automated defenses against these attacks, by leveraging the lossy transformation introduced by the RNN training and generation cycle. We consider countermeasures against our mechanisms, show that they produce unattractive cost-benefit tradeoffs for attackers, and that they can be further curtailed by simple constraints imposed by online service providers

arXiv.org e-Print Archive

Crossref

Evolution of a Web-Scale Near Duplicate Image Detection System

Author: Gusev Andrey
Xu Jiajing
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/09/2022
Field of study

Detecting near duplicate images is fundamental to the content ecosystem of photo sharing web applications. However, such a task is challenging when involving a web-scale image corpus containing billions of images. In this paper, we present an efficient system for detecting near duplicate images across 8 billion images. Our system consists of three stages: candidate generation, candidate selection, and clustering. We also demonstrate that this system can be used to greatly improve the quality of recommendations and search results across a number of real-world applications. In addition, we include the evolution of the system over the course of six years, bringing out experiences and lessons on how new systems are designed to accommodate organic content growth as well as the latest technology. Finally, we are releasing a human-labeled dataset of ~53,000 pairs of images introduced in this paper

arXiv.org e-Print Archive

On Robust Image Spam Filtering via Comprehensive Visual Modeling

Author: CHENG Zhiyong
DENG Robert H.,
NIE Liqiang
SHEN Jialie
YAN Shuicheng
Publication venue: 'Elsevier BV'
Publication date: 01/10/2015
Field of study

Institutional Knowledge at Singapore Management University