AXES at TRECVID 2012: KIS, INS, and MED
The AXES project participated in the interactive instance search task (INS), the known-item search task (KIS), and the multimedia event detection task (MED) for TRECVid 2012. As in our TRECVid 2011 system, we used nearly identical search systems and user interfaces for both INS and KIS. Our interactive INS and KIS systems focused this year on using classifiers trained at query time with positive examples collected from external search engines. Participants in our KIS experiments were media professionals from the BBC; our INS experiments were carried out by students and researchers at Dublin City University. We performed comparatively well in both experiments. Our best KIS run found 13 of the 25 topics, and our best INS runs outperformed all other submitted runs in terms of P@100. For MED, the system presented was based on a minimal number of low-level descriptors, which we chose to be as large as computationally feasible. These descriptors are aggregated to produce high-dimensional video-level signatures, which are used to train a set of linear classifiers. Our MED system achieved the second-best score of all submitted runs in the main track, and the best score in the ad-hoc track, suggesting that a simple system based on state-of-the-art low-level descriptors can give relatively high performance. This paper describes in detail our KIS, INS, and MED systems and the results and findings of our experiments.
SVS-JOIN : efficient spatial visual similarity join for geo-multimedia
In the big data era, massive amounts of geo-tagged multimedia data have been generated and collected by smart devices equipped with mobile communication and position sensor modules. This trend places higher demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than on geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to find geo-image pairs that are similar in both geo-location and visual content. First, we define SVS-JOIN and present the geographical and visual similarity measures. Inspired by approaches to textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm with visual similarity. We also develop an extension named SVS-JOIN G, which uses a spatial grid strategy to improve search efficiency. To further speed up the search, we design a novel approach called SVS-JOIN Q, which employs a quadtree and a global inverted index. Comprehensive experiments are conducted on two geo-image datasets, and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently.
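As a rough illustration of the idea summarized above, the following sketch joins geo-images that are both spatially close and visually similar, using a uniform grid to limit the candidate pairs. It is not the SVS-JOIN B, G, or Q algorithm from the paper: the Euclidean distance threshold, the Jaccard similarity over sets of visual-word ids, and every name in the code (`jaccard`, `grid_similarity_join`, `max_dist`, `min_visual_sim`) are assumptions made only for this example.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def grid_similarity_join(images, max_dist, min_visual_sim):
    """Return sorted id pairs that are close in space and similar in content.

    images: list of (image_id, x, y, visual_words), where visual_words is a
    set of hypothetical visual-word ids and image ids are unique and orderable.
    """
    # Bucket each image into a square cell whose side equals max_dist, so a
    # qualifying partner can only be in the same cell or one of 8 neighbours.
    grid = defaultdict(list)
    for img in images:
        _, x, y, _ = img
        grid[(floor(x / max_dist), floor(y / max_dist))].append(img)

    result = set()
    for (cx, cy), bucket in grid.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            # .get() avoids creating empty cells while iterating the grid.
            neighbours = grid.get((cx + dx, cy + dy), ())
            for a in bucket:
                for b in neighbours:
                    if a[0] >= b[0]:
                        continue  # consider each unordered pair only once
                    close = hypot(a[1] - b[1], a[2] - b[2]) <= max_dist
                    if close and jaccard(a[3], b[3]) >= min_visual_sim:
                        result.add((a[0], b[0]))
    return sorted(result)


images = [
    ("a", 0.0, 0.0, {1, 2, 3}),
    ("b", 0.5, 0.5, {2, 3, 4}),  # near "a", shares 2 of 4 visual words
    ("c", 0.6, 0.4, {7, 8}),     # near "a" and "b", but visually unrelated
    ("d", 5.0, 5.0, {1, 2, 3}),  # visually identical to "a", but far away
]
pairs = grid_similarity_join(images, max_dist=1.0, min_visual_sim=0.5)
print(pairs)
```

Only the pair that satisfies both thresholds is reported; the grid simply avoids comparing images whose cells are not adjacent.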
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Adversarial Detection of Flash Malware: Limitations and Open Issues
During the past four years, Flash malware has become one of the most
insidious threats to detect, with almost 600 critical vulnerabilities targeting
Adobe Flash disclosed in the wild. Research has shown that machine learning can
be successfully used to detect Flash malware by leveraging static analysis to
extract information from the structure of the file or its bytecode. However,
the robustness of Flash malware detectors against well-crafted evasion attempts
- also known as adversarial examples - has never been investigated. In this
paper, we propose a security evaluation of a novel, representative Flash
detector that embeds a combination of the prominent, static features employed
by state-of-the-art tools. In particular, we discuss how to craft adversarial
Flash malware examples, showing that it suffices to manipulate the
corresponding source malware samples slightly to evade detection. We then
empirically demonstrate that popular defense techniques proposed to mitigate
evasion attempts, including re-training on adversarial examples, may not always
be sufficient to ensure robustness. We argue that this occurs when the feature
vectors extracted from adversarial examples become indistinguishable from those
of benign data, meaning that the given feature representation is intrinsically
vulnerable. In this respect, we are the first to formally define and
quantitatively characterize this vulnerability, highlighting when an attack can
be countered by solely improving the security of the learning algorithm, or
when it requires also considering additional features. We conclude the paper by
suggesting alternative research directions to improve the security of
learning-based Flash malware detectors.
Making Laplacians commute
In this paper, we construct multimodal spectral geometry by finding a pair of
closest commuting operators (CCO) to a given pair of Laplacians. The CCOs are
jointly diagonalizable and hence have the same eigenbasis. Our construction
naturally extends classical data analysis tools based on spectral geometry,
such as diffusion maps and spectral clustering. We provide several synthetic
and real examples of applications in dimensionality reduction, shape analysis,
and clustering, demonstrating that our method better captures the inherent
structure of multi-modal data.
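One way to read the closest-commuting-operators idea is as a constrained optimization problem. The formulation below is a sketch under stated assumptions, not necessarily the paper's exact objective: the Frobenius norm as the measure of closeness, the symmetry of the operators, and the symbols $L_1, L_2, \tilde{L}_1, \tilde{L}_2, U, \Lambda_i$ are all chosen here for illustration.

```latex
\min_{\tilde{L}_1,\,\tilde{L}_2}\;
  \|\tilde{L}_1 - L_1\|_F^2 + \|\tilde{L}_2 - L_2\|_F^2
\quad \text{subject to} \quad
\tilde{L}_1 \tilde{L}_2 = \tilde{L}_2 \tilde{L}_1 .

% If \tilde{L}_1 and \tilde{L}_2 are symmetric and commute, a single
% orthogonal matrix U diagonalizes both of them:
U^{\top} \tilde{L}_i U = \Lambda_i , \qquad i = 1, 2 ,
\quad \text{with } \Lambda_i \text{ diagonal.}
```

The shared eigenbasis $U$ is what would let tools such as diffusion maps or spectral clustering operate on both modalities at once.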