Analysis of smartphone model identification using digital images
This paper focuses on smartphone model identification using image features. A total of 64 image features - broadly categorized into colour features, wavelet features and image quality features - are extracted from high-resolution smartphone images. A binary support vector machine (SVM) extended to the multiclass setting is used as the classifier. Experimental results based on 1,800 images captured with 10 different smartphone/tablet devices are promising in correctly identifying the source smartphone model. Image quality metrics and wavelet features are shown to contain the most useful device/model information compared to colour features. However, compared to colour features, quality and wavelet features are highly sensitive to simple image modifications. The combined set of colour, quality and wavelet features achieves the best overall identification accuracy.
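As a rough illustration of the classifier design described above, a binary SVM can be extended to the multiclass case with a one-vs-rest reduction: each known model gets its own "this model vs. the rest" classifier, and the most confident one wins. The sketch below shows only that reduction step; the device names and decision scores are hypothetical, and the feature extraction and SVM training are not reproduced.

```python
# Minimal one-vs-rest sketch: pick the model whose binary classifier
# ("this model vs. rest") produced the highest decision score.
def one_vs_rest_predict(binary_scores):
    """binary_scores: dict mapping model name -> binary decision score."""
    return max(binary_scores, key=binary_scores.get)

# Hypothetical decision scores for one test image.
scores = {"PhoneA": -0.4, "PhoneB": 1.3, "TabletC": 0.2}
print(one_vs_rest_predict(scores))  # -> PhoneB
```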
Unconventional TV Detection using Mobile Devices
Recent studies show that the TV viewing experience is changing, giving rise
to trends like "multi-screen viewing" and "connected viewers". These
trends describe TV viewers that use mobile devices (e.g. tablets and smart
phones) while watching TV. In this paper, we exploit the context information
available from the ubiquitous mobile devices to detect the presence of TVs and
track the media being viewed. Our approach leverages the array of sensors
available in modern mobile devices, e.g. cameras and microphones, to detect the
location of TV sets, their state (ON or OFF), and the channels they are
currently tuned to. We present the feasibility of the proposed sensing
technique using our implementation on Android phones with different realistic
scenarios. Our results show that in a controlled environment a detection
accuracy of 0.978 F-measure could be achieved.
Comment: 4 pages, 14 figures
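The F-measure used to report detection accuracy is the harmonic mean of precision and recall. A minimal sketch, with hypothetical precision and recall values chosen only for illustration:

```python
def f_measure(precision, recall):
    # F1 score: harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Hypothetical precision/recall of a TV-state detector.
print(round(f_measure(0.975, 0.981), 3))  # -> 0.978
```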
AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis
Recently, sound recognition has been used to identify sounds such as "car" and
"river". However, sounds have nuances that may be better described by
adjective-noun pairs such as "slow car" and verb-noun pairs such as "flying
insects", which are underexplored. Therefore, in this work we investigate the
relation between audio content and both adjective-noun pairs and verb-noun
pairs. Due to the lack of datasets with these kinds of annotations, we
collected and processed the AudioPairBank corpus consisting of a combined total
of 1,123 pairs and over 33,000 audio files. One contribution is the previously
unavailable documentation of the challenges and implications of collecting
audio recordings with this type of label. A second contribution is to show
the degree of correlation between the audio content and the labels through
sound recognition experiments, which yielded results of 70% accuracy, hence
also providing a performance benchmark. The results and study in this paper
encourage further exploration of the nuances in audio and are meant to
complement similar research performed on images and text in multimedia
analysis.
Comment: This paper is a revised version of "AudioSentibank: Large-scale
Semantic Ontology of Acoustic Concepts for Audio Content Analysis"
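The reported 70% accuracy benchmark is a simple correct-over-total ratio across predicted tag pairs. A minimal sketch of that evaluation, with a hypothetical handful of adjective-noun / verb-noun pair labels:

```python
def pair_accuracy(predictions, labels):
    # Fraction of audio clips whose predicted tag pair matches the annotation.
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical predictions and ground-truth pairs for 10 clips.
preds = ["slow_car", "flying_insects", "slow_car", "fast_river", "flying_insects",
         "slow_car", "fast_river", "slow_car", "flying_insects", "fast_river"]
truth = ["slow_car", "flying_insects", "fast_river", "fast_river", "slow_car",
         "slow_car", "fast_river", "slow_car", "flying_insects", "slow_car"]
print(pair_accuracy(preds, truth))  # -> 0.7
```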
An In-Depth Study on Open-Set Camera Model Identification
Camera model identification refers to the problem of linking a picture to the
camera model used to shoot it. As this might be an enabling factor in different
forensic applications to single out possible suspects (e.g., detecting the
author of child abuse or terrorist propaganda material), many accurate camera
model attribution methods have been developed in the literature. One of their
main drawbacks, however, is the typical closed-set assumption of the problem.
This means that an investigated photograph is always assigned to one camera
model within a set of known ones present during investigation, i.e., training
time, and the fact that the picture can come from a completely unrelated camera
model during actual testing is usually ignored. Under realistic conditions, it
is not possible to assume that every picture under analysis belongs to one of
the available camera models. To deal with this issue, in this paper, we present
the first in-depth study on the possibility of solving the camera model
identification problem in open-set scenarios. Given a photograph, we aim at
detecting whether it comes from one of the known camera models of interest or
from an unknown one. We compare different feature extraction algorithms and
classifiers specially targeting open-set recognition. We also evaluate possible
open-set training protocols that can be applied along with any open-set
classifier, observing that a simple one of those alternatives obtains the best results.
Thorough testing on independent datasets shows that it is possible to pair a
recently proposed convolutional neural network, used as feature extractor,
with a properly trained open-set classifier to solve the open-set camera model
attribution problem even on small-scale image patches, improving over
state-of-the-art solutions.
Comment: Published through the IEEE Access journal
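The core open-set step described above is deciding whether a photograph belongs to any known model at all. One common way to sketch this (not necessarily the classifier the paper trains) is to threshold the best in-set confidence and reject low-confidence samples as "unknown". Names, scores and the threshold below are hypothetical:

```python
def open_set_predict(scores, threshold):
    """scores: dict camera model -> classifier confidence.
    Returns the best-matching known model, or 'unknown' when no
    classifier is confident enough (the open-set rejection step)."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"

# A patch that clearly matches a known model...
print(open_set_predict({"CamA": 0.91, "CamB": 0.05, "CamC": 0.04}, 0.6))  # -> CamA
# ...and one from an unrelated device, rejected as unknown.
print(open_set_predict({"CamA": 0.40, "CamB": 0.35, "CamC": 0.25}, 0.6))  # -> unknown
```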
Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network
The first step in identifying fruits on trees is to develop garden robots for different purposes
such as fruit harvesting and site-specific spraying. Due to the natural conditions of fruit
orchards and the unevenness of the various objects throughout them, working under controlled
conditions is very difficult. As a result, these operations should be performed in natural conditions, both
in light and in the background. Due to the dependency of other garden robot operations on the
fruit identification stage, this step must be performed precisely. Therefore, the purpose of this
paper was to design an identification algorithm in orchard conditions using a combination of video
processing and majority voting based on different hybrid artificial neural networks. The different
steps of designing this algorithm were: (1) Recording video of different plum orchards at different
light intensities; (2) converting the recorded videos into their frames; (3) extracting different color
properties from pixels; (4) selecting effective properties from color extraction properties using
hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority
voting based on three classifiers of artificial neural network-bees algorithm (ANN-BA), artificial
neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly
algorithm (ANN-FA). The most effective features selected by the hybrid ANN-HS were the third
channel in hue saturation lightness (HSL) color space, the second channel in lightness chroma hue
(LCH) color space, the first channel in L*a*b* color space, and the first channel in hue saturation
intensity (HSI). The results showed that the accuracy of the majority voting method in the best execution
and in 500 executions was 98.01% and 97.20%, respectively. Based on different performance evaluation
criteria of the classifiers, it was found that the majority voting method had the highest performance.
Funded by the European Union (EU) under the Erasmus+ project
"Fostering Internationalization in Agricultural Engineering in Iran and Russia" [FARmER], grant
number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JP
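Step (5) above combines the three hybrid classifiers by majority voting: each classifier casts one label per pixel or frame, and the most frequent label wins. A minimal sketch (the labels are hypothetical; the ANN-BA/ANN-BBO/ANN-FA models themselves are not reproduced):

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one label per classifier, e.g. from ANN-BA, ANN-BBO, ANN-FA.
    return Counter(predictions).most_common(1)[0][0]

# Two of the three classifiers label the region as fruit.
print(majority_vote(["plum", "plum", "background"]))  # -> plum
```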
Multimodal Visual Concept Learning with Weakly Supervised Techniques
Despite the availability of a huge amount of video data accompanied by
descriptive texts, it is not always easy to exploit the information contained
in natural language in order to automatically recognize video concepts. Towards
this goal, in this paper we use textual cues as means of supervision,
introducing two weakly supervised techniques that extend the Multiple Instance
Learning (MIL) framework: the Fuzzy Sets Multiple Instance Learning (FSMIL) and
the Probabilistic Labels Multiple Instance Learning (PLMIL). The former encodes
the spatio-temporal imprecision of the linguistic descriptions with Fuzzy Sets,
while the latter models different interpretations of each description's
semantics with Probabilistic Labels, both formulated through a convex
optimization algorithm. In addition, we provide a novel technique to extract
weak labels in the presence of complex semantics, which consists of semantic
similarity computations. We evaluate our methods on two distinct problems,
namely face and action recognition, in the challenging and realistic setting of
movies accompanied by their screenplays, contained in the COGNIMUSE database.
We show that, on both tasks, our method considerably outperforms a
state-of-the-art weakly supervised approach, as well as other baselines.
Comment: CVPR 201
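The Multiple Instance Learning framework that both proposed techniques extend rests on the standard MIL assumption: a bag of instances (e.g. all face tracks inside the scene a screenplay line refers to) is positive if at least one instance is positive. A minimal sketch of that assumption, with hypothetical instance scores and threshold:

```python
def bag_label(instance_scores, threshold=0.5):
    # Standard MIL assumption: a bag is positive if at least one of its
    # instances exceeds the decision threshold.
    return any(s >= threshold for s in instance_scores)

print(bag_label([0.1, 0.2, 0.8]))  # -> True  (one confident instance)
print(bag_label([0.1, 0.2, 0.3]))  # -> False (no instance qualifies)
```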
Touchalytics: On the Applicability of Touchscreen Input as a Behavioral Biometric for Continuous Authentication
We investigate whether a classifier can continuously authenticate users based
on the way they interact with the touchscreen of a smart phone. We propose a
set of 30 behavioral touch features that can be extracted from raw touchscreen
logs and demonstrate that different users populate distinct subspaces of this
feature space. In a systematic experiment designed to test how this behavioral
pattern exhibits consistency over time, we collected touch data from users
interacting with a smart phone using basic navigation maneuvers, i.e., up-down
and left-right scrolling. We propose a classification framework that learns the
touch behavior of a user during an enrollment phase and is able to accept or
reject the current user by monitoring interaction with the touch screen. The
classifier achieves a median equal error rate of 0% for intra-session
authentication, 2%-3% for inter-session authentication and below 4% when the
authentication test was carried out one week after the enrollment phase. While
our experimental findings disqualify this method as a standalone authentication
mechanism for long-term authentication, it could be implemented as a means to
extend screen-lock time or as a part of a multi-modal biometric authentication
system.
Comment: to appear in IEEE Transactions on Information Forensics & Security;
download data from http://www.mariofrank.net/touchalytics
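The equal error rate (EER) reported above is the operating point where the false accept rate (FAR) and false reject rate (FRR) coincide. A minimal brute-force sketch, sweeping candidate thresholds over hypothetical genuine and impostor scores:

```python
def far_frr(genuine, impostor, threshold):
    # Scores at or above the threshold are accepted.
    frr = sum(g < threshold for g in genuine) / len(genuine)    # false rejects
    far = sum(i >= threshold for i in impostor) / len(impostor) # false accepts
    return far, frr

def equal_error_rate(genuine, impostor):
    # Sweep every observed score as a threshold and return the mean of
    # FAR and FRR at the point where they are closest (approximate EER).
    best = min((abs(f - r), (f + r) / 2)
               for t in sorted(genuine + impostor)
               for f, r in [far_frr(genuine, impostor, t)])
    return best[1]

# Hypothetical, well-separated genuine and impostor match scores.
genuine = [0.9, 0.8, 0.85, 0.7, 0.95]
impostor = [0.2, 0.3, 0.4, 0.1, 0.6]
print(equal_error_rate(genuine, impostor))  # -> 0.0
```

Because the two hypothetical score distributions do not overlap, a threshold exists with zero false accepts and zero false rejects, mirroring the 0% intra-session EER reported in the abstract.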