Efficient Data Analytics on Augmented Similarity Triplets
Many machine learning methods (classification, clustering, etc.) start with a
known kernel that provides a similarity or distance measure between two objects.
Recent work has extended this to situations where the information about objects
is limited to comparisons of distances between three objects (triplets). Humans
find the comparison task much easier than the estimation of absolute
similarities, so this kind of data can be easily obtained using crowd-sourcing.
In this work, we give an efficient method for augmenting triplet data by
utilizing additional implicit information inferred from the existing data.
Triplet augmentation improves the quality of kernel-based and kernel-free data
analytics tasks. Second, we propose a novel set of algorithms for common
supervised and unsupervised machine learning tasks based on triplets. These
methods work directly with triplets, avoiding kernel evaluations. Experimental
evaluation on real and synthetic datasets shows that our methods are more
accurate than the current best-known techniques.
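The abstract does not spell out the paper's augmentation rules; one natural way to infer implicit comparisons, sketched below purely as an assumption, is to close the triplet set under transitivity over a shared anchor, reading a triplet (a, b, c) as d(a, b) < d(a, c):

```python
def augment_triplets(triplets):
    """Close a set of triplets under transitivity over a shared anchor.

    Each triplet (a, b, c) encodes the comparison d(a, b) < d(a, c).
    If (a, b, c) and (a, c, d) are both known, then d(a, b) < d(a, d)
    follows, yielding the implicit triplet (a, b, d).
    """
    augmented = set(triplets)
    changed = True
    while changed:
        changed = False
        # Index per anchor a: map b -> set of c with d(a, b) < d(a, c).
        by_anchor = {}
        for a, b, c in augmented:
            by_anchor.setdefault(a, {}).setdefault(b, set()).add(c)
        for a, closer in by_anchor.items():
            for b, farther in closer.items():
                for c in list(farther):
                    for d in closer.get(c, ()):
                        t = (a, b, d)
                        if t not in augmented:
                            augmented.add(t)
                            changed = True
    return augmented
```

For example, from (0, 1, 2) and (0, 2, 3) the rule derives the implicit triplet (0, 1, 3) without any extra crowd queries.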
HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation
Recently, crowdsourcing has emerged as an effective paradigm for
human-powered large-scale problem solving in various domains. However, the task
requester usually has a limited budget, so it is desirable to have
a policy that allocates the budget wisely to achieve better quality. In this
paper, we study the principle of information maximization for active sampling
strategies in the framework of HodgeRank, an approach based on Hodge
Decomposition of pairwise ranking data with multiple workers. The principle
exhibits two scenarios of active sampling: Fisher information maximization that
leads to unsupervised sampling based on a sequential maximization of graph
algebraic connectivity without considering labels; and Bayesian information
maximization that selects samples with the largest information gain from prior
to posterior, which gives a supervised sampling involving the labels collected.
Experiments show that the proposed methods boost the sampling efficiency as
compared to traditional sampling schemes and are thus valuable to practical
crowdsourcing experiments.
Comment: Accepted by AAAI201
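The unsupervised (Fisher-information) scenario can be made concrete with a minimal sketch: greedily pick the unsampled pair whose addition to the comparison graph most increases the algebraic connectivity, i.e. the second-smallest eigenvalue of the graph Laplacian. The function below is an illustrative simplification (single worker, unweighted edges, pairs stored as (i, j) with i < j), not the paper's exact procedure:

```python
import numpy as np

def next_pair(n, sampled_pairs):
    """Greedy unsupervised sampling: among unsampled pairs (i, j), return
    the one whose addition maximizes the algebraic connectivity
    (second-smallest Laplacian eigenvalue) of the comparison graph."""
    # Build the Laplacian of the already-sampled comparison graph.
    L = np.zeros((n, n))
    for i, j in sampled_pairs:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    best, best_conn = None, -np.inf
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in sampled_pairs:
                continue
            L2 = L.copy()
            L2[i, i] += 1; L2[j, j] += 1
            L2[i, j] -= 1; L2[j, i] -= 1
            conn = np.linalg.eigvalsh(L2)[1]  # second-smallest eigenvalue
            if conn > best_conn:
                best, best_conn = (i, j), conn
    return best
```

With three items and the path 0-1-2 already sampled, the policy selects (0, 2), closing the triangle and maximizing connectivity.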
Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review
With the large chunks of social media data being created daily and the
parallel rise of realistic multimedia tampering methods, detecting and
localising tampering in images and videos has become essential. This survey
focusses on approaches for tampering detection in multimedia data using deep
learning models. Specifically, it presents a detailed analysis of benchmark
datasets for malicious manipulation detection that are publicly available. It
also offers a comprehensive list of tampering clues and commonly used deep
learning architectures. Next, it discusses the current state-of-the-art
tampering detection methods, categorizing them into meaningful types such as
deepfake detection methods, splice tampering detection methods, copy-move
tampering detection methods, etc., and discussing their strengths and
weaknesses. Top results achieved on benchmark datasets, comparison of deep
learning approaches against traditional methods and critical insights from the
recent tampering detection methods are also discussed. Lastly, the research
gaps, future directions, and conclusions are discussed to provide an in-depth
understanding of the tampering detection research arena.
Person Re-identification with Deep Learning
In this work, we survey the state of the art of person re-identification and introduce the basics of the deep learning method for implementing this task. Moreover, we propose a new structure for this task.
The core content of our work is to optimize the model that is composed of a pre-trained network to distinguish images of different people with representative features. The experiments are conducted on three public person re-identification datasets and evaluated with the mean Average Precision (mAP) and Cumulative Matching Characteristic (CMC) metrics.
We take the BNNeck structure proposed by Luo et al. [25] as the baseline model. It adopts several training tricks, such as a mini-batch strategy for loading images, data augmentation to improve the model's robustness, a dynamic learning rate, label-smoothing regularization, and L2 regularization, to reach remarkable performance. Inspired by this, we propose a novel structure named SplitReID that trains the model in separate feature embedding spaces with multiple losses; it outperforms the BNNeck structure and achieves competitive performance on three datasets. Additionally, the SplitReID structure is computationally lightweight, requiring fewer parameters for training and inference than the BNNeck structure.
With deep learning, person re-identification can achieve outstanding performance without high-resolution images or fixed pedestrian viewing angles. It therefore holds immeasurable promise in practical applications, especially in the security field, even though some challenges, such as occlusion, remain to be overcome.
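Of the training tricks listed above, label-smoothing regularization is simple to make concrete. The sketch below uses the standard uniform-spread form with a hypothetical eps; it is not taken from this work's code:

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Label-smoothing regularization: replace each one-hot identity
    target with (1 - eps) on the true class plus eps / num_classes spread
    uniformly, discouraging over-confident identity predictions."""
    onehot = np.eye(num_classes)[labels]
    return (1.0 - eps) * onehot + eps / num_classes
```

For a 4-identity problem with eps = 0.1, the true class receives probability 0.9 + 0.1/4 = 0.925 and each other class 0.025, so each target row still sums to 1.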
Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here combines
benefits from variable selection and compressive sampling to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown to be closely related to that obtained from an
ℓ0-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data.
Comment: 30 pages, submitted to IEEE Transactions on Signal Processing
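A scheme of the kind described can be sketched by alternating a truncated SVD for the low-rank component with entrywise soft-thresholding for the sparse outlier matrix. The objective, rank constraint, and parameter choices below are illustrative assumptions, not the paper's exact estimator:

```python
import numpy as np

def robust_pca(X, rank, lam, n_iter=50):
    """Illustrative outlier-aware PCA via alternating minimization of
        0.5 * ||X - L - S||_F^2 + lam * ||S||_1,
    with L constrained to the given rank.  Soft-thresholding the residual
    yields S (the sparse outlier matrix); a truncated SVD of X - S yields
    L.  Tuning lam controls the sparsity of the outlier matrix."""
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # Low-rank step: best rank-r approximation of the cleaned data.
        U, sv, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * sv[:rank]) @ Vt[:rank]
        # Sparse step: entrywise soft-thresholding of the residual.
        R = X - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return L, S
```

On a rank-1 matrix with a single large corrupted entry, the recovered S concentrates on that entry, which is exactly the outlier-identification behavior the regularization path exploits.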