
    The problems and challenges of managing crowd sourced audio-visual evidence

    A number of recent incidents, such as the Stanley Cup Riots, the uprisings in the Middle East and the London riots, have demonstrated the value of crowd-sourced audio-visual evidence, wherein citizens submit footage captured on mobile phones and other devices to help governmental institutions, responder agencies and law enforcement authorities confirm the authenticity of incidents and, in the case of criminal activity, identify perpetrators. Such evidence can present a significant logistical challenge to investigators, particularly because of the potential volume of data gathered through such mechanisms and the added problems of time-lining disparate sources of evidence and subsequently investigating the incident(s). In this paper we explore this problem and, in particular, outline the pressure points for an investigator. We identify and explore several specific problems: the secure receipt of the evidence; imaging, tagging and then time-lining it; and identifying duplicate and near-duplicate items of audio-visual evidence.
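    The abstract leaves the detection technique open; one widely used building block for flagging near-duplicate footage is a perceptual hash of frames. The sketch below is illustrative only, not the paper's method: it assumes Pillow for image loading, and `frame_paths` is a hypothetical list of already-extracted frame images.

    ```python
    # Illustrative near-duplicate detection via an average hash (aHash).
    from PIL import Image

    def average_hash(path, size=8):
        """Downscale to size x size grayscale; bit i is 1 if pixel i > mean."""
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        return sum(1 << i for i, p in enumerate(pixels) if p > mean)

    def hamming(a, b):
        return bin(a ^ b).count("1")

    def near_duplicates(frame_paths, threshold=5):
        """Pairs of frames whose hashes differ in at most `threshold` bits."""
        hashes = [(p, average_hash(p)) for p in frame_paths]
        return [(p1, p2)
                for i, (p1, h1) in enumerate(hashes)
                for p2, h2 in hashes[i + 1:]
                if hamming(h1, h2) <= threshold]
    ```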

    Uncovering source code reuse in large-scale academic environments

    The advent of the Internet has caused an increase in content reuse, including reuse of source code. The purpose of this research is to uncover potential cases of source code reuse in large-scale environments. A good example is academia, where massive courses are taught to students who must demonstrate that they have acquired the knowledge. The need to detect content reuse in quasi-real time encourages the development of automatic systems, such as the one described in this paper for source code reuse detection. Our approach is based on the comparison of programs at the character level. It is able to find potential cases of reuse across a huge number of assignments, and it achieved better results than JPlag, the most widely used online system for finding similarities among multiple sets of source code. The most common obfuscation operations we found were changes in identifier names, comments and indentation. Flores Sáez, E.; Barrón Cedeño, L.A.; Moreno Boronat, L.A.; Rosso, P. (2015). Uncovering source code reuse in large-scale academic environments. Computer Applications in Engineering Education, 23(3):383–390. doi:10.1002/cae.21608
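    As a rough illustration of character-level comparison (the paper's actual algorithm differs in its details), a character n-gram profile compared with cosine similarity can flag suspiciously similar submissions; the file names below are hypothetical.

    ```python
    # Illustrative character-level similarity between two programs.
    from collections import Counter
    from math import sqrt

    def char_ngrams(source, n=3):
        """Character n-gram frequency profile of a source file."""
        # Strip all whitespace so indentation changes (a common
        # obfuscation, per the paper) do not affect the profile.
        text = "".join(source.split())
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    def cosine_similarity(a, b):
        dot = sum(a[g] * b[g] for g in set(a) & set(b))
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # Pairs scoring near 1.0 are candidate reuse cases for manual review.
    score = cosine_similarity(char_ngrams(open("assignment1.py").read()),
                              char_ngrams(open("assignment2.py").read()))
    ```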

    Learning deep embeddings by learning to rank

    We study the problem of embedding high-dimensional visual data into low-dimensional vector representations. This is an important component in many computer vision applications involving nearest neighbor retrieval, as embedding techniques not only perform dimensionality reduction, but can also capture task-specific semantic similarities. In this thesis, we use deep neural networks to learn vector embeddings, and develop a gradient-based optimization framework that is capable of optimizing ranking-based retrieval performance metrics, such as the widely used Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG). Our framework is applied in three applications. First, we study Supervised Hashing, which is concerned with learning compact binary vector embeddings for fast retrieval, and propose two novel solutions. The first solution optimizes Mutual Information as a surrogate ranking objective, while the other directly optimizes AP and NDCG, based on the discovery of their closed-form expressions for discrete Hamming distances. These optimization problems are NP-hard; we therefore derive continuous relaxations to enable gradient-based optimization with neural networks. Our solutions establish the state of the art on several image retrieval benchmarks. Next, we learn deep neural networks to extract Local Feature Descriptors from image patches. Local features are used universally in low-level computer vision tasks that involve sparse feature matching, such as image registration and 3D reconstruction, and their matching is a nearest neighbor retrieval problem. We leverage our AP optimization technique to learn both binary and real-valued descriptors for local image patches. Compared to competing approaches, our solution eliminates complex heuristics and performs more accurately in the tasks of patch verification, patch retrieval, and image matching. Lastly, we tackle Deep Metric Learning, the general problem of learning real-valued vector embeddings using deep neural networks. We propose a learning-to-rank solution that optimizes a novel quantization-based approximation of AP. For downstream tasks such as retrieval and clustering, we demonstrate promising results on standard benchmarks, especially in the few-shot learning scenario, where the number of labeled examples per class is limited.
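    To illustrate the kind of relaxation the thesis refers to, the hard rank indicator inside AP can be replaced by a sigmoid so the metric admits gradients. The sketch below is a generic smooth-AP surrogate in PyTorch, not the thesis's exact formulation (which, for example, exploits closed-form expressions for discrete Hamming distances).

    ```python
    # Generic differentiable Average Precision surrogate (illustrative).
    import torch

    def soft_average_precision(scores, labels, temperature=0.01):
        """scores: (N,) retrieval scores; labels: (N,) binary relevance (float)."""
        # diff[j, i] = s_j - s_i; sigmoid(diff / t) approximates 1[s_j > s_i]
        diff = scores.unsqueeze(0) - scores.unsqueeze(1)
        soft_gt = torch.sigmoid(diff / temperature)
        # Soft rank of item i, and soft count of relevant items ranked at or
        # above it (the diagonal terms, sigmoid(0) = 0.5, are removed).
        rank = 1.0 + soft_gt.sum(dim=0) - soft_gt.diag()
        rel_rank = 1.0 + (soft_gt * labels.unsqueeze(1)).sum(dim=0) - labels * soft_gt.diag()
        # AP = mean, over relevant items, of precision at their (soft) rank.
        return ((rel_rank / rank) * labels).sum() / labels.sum().clamp(min=1)

    # Training would minimize 1 - soft_average_precision(scores, labels).
    ```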

    Deep Supervised Hashing using Symmetric Relative Entropy

    By virtue of their simplicity and efficiency, hashing algorithms have achieved significant success in large-scale approximate nearest neighbor search. Recently, many deep neural network based hashing methods have been proposed to improve search accuracy by simultaneously learning both the feature representation and the binary hash functions. Most deep hashing methods depend on supervised semantic label information to preserve the distance or similarity between local structures, which unfortunately ignores the global distribution of the learned hash codes. We propose a novel deep supervised hashing method that aims to minimize the information loss generated during the embedding process. Specifically, the information loss is measured by the Jensen-Shannon divergence, to ensure that compact hash codes have a distribution similar to that of the original images. Experimental results show that our method outperforms current state-of-the-art approaches on two benchmark datasets.
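    For concreteness, the Jensen-Shannon divergence named in the abstract is the symmetrized KL divergence to the mixture of the two distributions. The sketch below assumes `p` and `q` are already-normalized discrete distributions (e.g., empirical distributions of hash codes and of original image features); it is illustrative, not the authors' implementation.

    ```python
    # Jensen-Shannon divergence between two discrete distributions.
    import torch

    def js_divergence(p, q, eps=1e-12):
        """p, q: 1-D tensors of probabilities, each summing to 1."""
        m = 0.5 * (p + q)  # mixture distribution
        kl_pm = (p * (p.clamp(min=eps) / m.clamp(min=eps)).log()).sum()
        kl_qm = (q * (q.clamp(min=eps) / m.clamp(min=eps)).log()).sum()
        return 0.5 * (kl_pm + kl_qm)  # symmetric, bounded by log(2)
    ```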

    Deep Hashing for Image Similarity Search

    Hashing for similarity search is one of the most widely used methods for solving the approximate nearest neighbor search problem. In this method, one first maps data items from a real-valued high-dimensional space to a suitable low-dimensional binary code space and then performs the approximate nearest neighbor search in this code space instead. This is beneficial because the search in the code space can be solved more efficiently in terms of runtime complexity and storage consumption. For this method to succeed, it is necessary that similar data items be mapped to binary code words with small Hamming distance. For real-world data such as images, one usually proceeds as follows. For each data item, a pre-processing algorithm removes noise and insignificant information and extracts important discriminating information to generate a feature vector that captures the important semantic content. Next, a vector hash function maps this real-valued feature vector to a binary code word. It is also possible to use the raw feature vectors afterwards to further process the search result candidates produced by the binary hash codes. In this dissertation we focus on the following: first, developing a learning-based counterpart to the MinHash hashing algorithm; second, presenting a new unsupervised hashing method, UmapHash, which maps the neighborhood relations of data items from the feature vector space to the binary hash code space; and finally, an application of the aforementioned hashing methods to rapid face image recognition.
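    Since the dissertation develops a learned counterpart to MinHash, a sketch of the classic algorithm may help: each of several random hash functions maps a set to its minimum hash value, and the fraction of matching minima across two signatures estimates the Jaccard similarity of the underlying sets.

    ```python
    # Classic MinHash (illustrative; the dissertation learns its counterpart).
    import random

    PRIME = (1 << 61) - 1  # a Mersenne prime larger than the item hashes used

    def make_minhash(num_hashes=128, seed=0):
        rng = random.Random(seed)
        params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
                  for _ in range(num_hashes)]
        def signature(items):  # items: a non-empty set of hashable elements
            keys = [hash(x) % PRIME for x in items]  # hash() is salted per process
            return [min((a * k + b) % PRIME for k in keys) for a, b in params]
        return signature

    sig = make_minhash()
    s1, s2 = sig({"a", "b", "c"}), sig({"a", "b", "d"})
    # Fraction of agreeing minima ~ Jaccard similarity (true value: 2/4 = 0.5).
    est = sum(x == y for x, y in zip(s1, s2)) / len(s1)
    ```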