33,849 research outputs found
Identification versus CBCD: a comparison of different evaluation techniques
Fingerprint techniques have a significant advantage in respect of watermarking: a fingerprint can be extracted in each moment of the lifetime of a multimedia content. This aspect is fundamental to solve the problem of copy detection mainly because many copies can be available in huge amount of data in circulation and because each copy can be attacked in several ways (compression, re-encoding, text-overlay, etc.). In this paper the problem of copy detection is studied and tested from two different point of views: content based and identification approaches. The results show that the proposed system is quite robust to some copy modifications and most of all show that the overall results depend on the evaluation method used for testing
Bilkent University Multimedia Database Group at TRECVID 2008
Bilkent University Multimedia Database Group (BILMDG) participated in two tasks at TRECVID 2008: content-based copy detection (CBCD) and high-level feature extraction (FE). Mostly MPEG-7 [1] visual features, which are also used as low-level features in our MPEG-7 compliant video database management system, are extracted for these tasks. This paper discusses our approaches in each task
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective.
The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long-Short Term Memory (LSTM) cells, and encoder-decoder
network to segment out manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating encoder and LSTM network. Finally,
decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With predicted mask
provided by final layer (softmax) of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets
Deep Multimodal Image-Repurposing Detection
Nefarious actors on social media and other platforms often spread rumors and
falsehoods through images whose metadata (e.g., captions) have been modified to
provide visual substantiation of the rumor/falsehood. This type of modification
is referred to as image repurposing, in which often an unmanipulated image is
published along with incorrect or manipulated metadata to serve the actor's
ulterior motives. We present the Multimodal Entity Image Repurposing (MEIR)
dataset, a substantially challenging dataset over that which has been
previously available to support research into image repurposing detection. The
new dataset includes location, person, and organization manipulations on
real-world data sourced from Flickr. We also present a novel, end-to-end, deep
multimodal learning model for assessing the integrity of an image by combining
information extracted from the image with related information from a knowledge
base. The proposed method is compared against state-of-the-art techniques on
existing datasets as well as MEIR, where it outperforms existing methods across
the board, with AUC improvement up to 0.23.Comment: To be published at ACM Multimeda 2018 (orals
- …