9 research outputs found

    An Improved Adaptive Filtering Technique to Remove High Density Salt-and-Pepper Noise using Multiple Last Processed Pixels

    Get PDF
    This paper presents an efficient algorithm that removes high-density salt-and-pepper noise from corrupted digital images. The technique differentiates between corrupted and uncorrupted pixels and performs the filtering process only on the corrupted ones. The proposed algorithm computes the median only among the noise-free neighbors in the processing window and replaces the corrupted center pixel with that median value. Adaptive behavior is achieved by expanding the processing window based on the neighboring noise-free pixels. In cases of high-density corruption, where no noise-free neighbor is found within the maximum window size, the algorithm takes the last processed pixels into account. While most existing filtering techniques use only one last processed pixel after reaching the maximum window size, the proposed algorithm considers multiple last processed pixels rather than a single one, so that a more accurate decision can be made when replacing the corrupted pixel.
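
    A minimal sketch of the described procedure, assuming pixel values 0 and 255 mark salt-and-pepper corruption; the function name and the three-pixel fallback window are illustrative simplifications, not the authors' implementation:

        import numpy as np

        def adaptive_median_filter(img, max_window=7):
            """Replace corrupted pixels (0 or 255) with the median of noise-free
            neighbors in a window grown up to max_window; if none exist, fall
            back to the median of multiple last processed pixels."""
            out = img.astype(np.float64)
            h, w = img.shape
            noisy = (img == 0) | (img == 255)
            for y in range(h):
                for x in range(w):
                    if not noisy[y, x]:
                        continue
                    replaced = False
                    for k in range(1, max_window // 2 + 1):  # grow the window
                        y0, y1 = max(0, y - k), min(h, y + k + 1)
                        x0, x1 = max(0, x - k), min(w, x + k + 1)
                        clean = img[y0:y1, x0:x1][~noisy[y0:y1, x0:x1]]
                        if clean.size:
                            out[y, x] = np.median(clean)
                            replaced = True
                            break
                    if not replaced and x > 0:
                        # multiple last processed pixels: median of up to the
                        # three preceding (already filtered) pixels in this row
                        out[y, x] = np.median(out[y, max(0, x - 3):x])
            return out.astype(img.dtype)

        # demo: 64x64 image with roughly 80% salt-and-pepper corruption
        rng = np.random.default_rng(0)
        img = rng.integers(1, 255, size=(64, 64), dtype=np.uint8)
        mask = rng.random((64, 64)) < 0.8
        img[mask] = rng.choice(np.array([0, 255], dtype=np.uint8), size=int(mask.sum()))
        denoised = adaptive_median_filter(img)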

    Supervised Spectral Subspace Clustering for Visual Dictionary Creation in the Context of Image Classification

    No full text
    When building a traditional Bag of Visual Words (BOW) for image classification, the K-means algorithm is usually applied to a large set of high-dimensional local descriptors to build the visual dictionary. However, it is very likely that only a sub-part of the descriptor space of each visual word is truly relevant for finding a good visual vocabulary. We propose a novel framework for creating the visual dictionary based on a spectral subspace clustering method instead of the traditional K-means algorithm. A strategy for adding supervised information during the subspace clustering process is formulated to obtain more discriminative visual words. Experimental results on a real-world image dataset show that the proposed framework for dictionary creation improves classification accuracy compared to a traditionally built BOW.
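
    For orientation, a rough sketch of the dictionary-creation step, contrasting the usual K-means codebook with a plain spectral-clustering alternative; the paper's supervised subspace-clustering refinement is omitted here, and all names and dimensions are illustrative:

        import numpy as np
        from sklearn.cluster import KMeans, SpectralClustering

        rng = np.random.default_rng(0)
        descriptors = rng.normal(size=(500, 64))  # stand-in for local descriptors
        n_words = 10

        # traditional BOW dictionary: K-means centroids are the visual words
        kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(descriptors)
        words_kmeans = kmeans.cluster_centers_

        # spectral alternative: cluster a nearest-neighbor similarity graph,
        # then take per-cluster means as visual words
        spectral = SpectralClustering(n_clusters=n_words, affinity="nearest_neighbors",
                                      n_neighbors=10, random_state=0).fit(descriptors)
        words = np.stack([descriptors[spectral.labels_ == c].mean(axis=0)
                          for c in range(n_words)])

        # encode an image as a histogram of nearest visual words
        image_desc = rng.normal(size=(80, 64))
        nearest = np.argmin(((image_desc[:, None, :] - words[None]) ** 2).sum(-1), axis=1)
        bow_hist = np.bincount(nearest, minlength=n_words) / len(nearest)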

    Variational Fair Clustering

    Full text link
    We propose a general variational framework for fair clustering, which integrates an original Kullback-Leibler (KL) fairness term with a large class of clustering objectives, including prototype- and graph-based ones. Fundamentally different from the existing combinatorial and spectral solutions, our variational multi-term approach enables control of the trade-off levels between the fairness and clustering objectives. We derive a general tight upper bound based on a concave-convex decomposition of our fairness term, its Lipschitz-gradient property, and Pinsker's inequality. This tight upper bound can be jointly optimized with various clustering objectives while yielding a scalable solution with a convergence guarantee. Interestingly, at each iteration, it performs an independent update for each assignment variable; therefore, it can be easily distributed for large-scale datasets. This scalability is important, as it enables exploring different trade-off levels between the fairness and clustering objectives. Unlike spectral relaxation, our formulation does not require computing an eigenvalue decomposition. We report comprehensive evaluations and comparisons with state-of-the-art methods over various fair-clustering benchmarks, which show that our variational formulation can yield highly competitive solutions in terms of fairness and clustering objectives. Accepted for publication at AAAI 2021. Code is available at: https://github.com/imtiazziko/Variational-Fair-Clustering
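
    A toy sketch of the variational idea under illustrative names: a soft K-means objective augmented with a KL-motivated fairness term, minimized by independent per-point soft-assignment updates. The paper derives a tighter concave-convex bound; the gradient term below is a simplified stand-in:

        import numpy as np

        def fair_soft_kmeans(X, groups, K, lam=2.0, iters=50):
            n = X.shape[0]
            J = int(groups.max()) + 1
            target = np.bincount(groups, minlength=J) / n  # dataset proportions
            S = np.random.default_rng(0).dirichlet(np.ones(K), n)  # soft assignments
            for _ in range(iters):
                C = (S.T @ X) / S.sum(0)[:, None]             # prototype update
                d = ((X[:, None, :] - C[None]) ** 2).sum(-1)  # clustering costs
                # demographic proportions inside each cluster
                mass = S.sum(0) + 1e-12
                P = np.stack([S[groups == j].sum(0) for j in range(J)]) / mass
                # simplified fairness gradient: attract each point to clusters
                # where its group is under-represented relative to the target
                grad = -target[groups][:, None] / (P[groups] + 1e-12)
                logits = -(d + lam * grad)     # independent per-point updates
                S = np.exp(logits - logits.max(1, keepdims=True))
                S /= S.sum(1, keepdims=True)
            return S, C

        rng = np.random.default_rng(1)
        X = rng.normal(size=(200, 2))
        groups = rng.integers(0, 2, size=200)  # two demographic groups
        S, C = fair_soft_kmeans(X, groups, K=3)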

    Mutual Information-based Generalized Category Discovery

    Full text link
    We introduce an information-maximization approach for the Generalized Category Discovery (GCD) problem. Specifically, we explore a parametric family of loss functions evaluating the mutual information between the features and the labels, and automatically find the one that maximizes predictive performance. Furthermore, we introduce the Elbow Maximum Centroid-Shift (EMaCS) technique, which estimates the number of classes in the unlabeled set. We report comprehensive experiments, which show that our mutual information-based approach (MIB) is both versatile and highly competitive under various GCD scenarios. The gap between the proposed approach and the existing methods is significant, all the more so when dealing with fine-grained classification problems. Our code: https://github.com/fchiaroni/Mutual-Information-Based-GCD
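
    A minimal sketch of a mutual-information loss of the kind described, where an illustrative parameter alpha indexes the family of losses; this is an assumption for exposition, not the authors' exact objective:

        import torch

        def mutual_information_loss(logits, alpha=1.0, eps=1e-8):
            # negative of I(X; Y) ≈ H(marginal prediction) - alpha * E[H(p(y|x))];
            # alpha indexes the parametric family of losses
            p = torch.softmax(logits, dim=1)
            p_bar = p.mean(dim=0)
            h_marginal = -(p_bar * (p_bar + eps).log()).sum()
            h_conditional = -(p * (p + eps).log()).sum(dim=1).mean()
            return -(h_marginal - alpha * h_conditional)

        logits = torch.randn(32, 10, requires_grad=True)  # 32 samples, 10 classes
        loss = mutual_information_loss(logits, alpha=0.7)
        loss.backward()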

    Transductive Information Maximization For Few-Shot Learning

    No full text
    We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. Furthermore, we propose a new alternating-direction solver for our mutual-information loss, which substantially speeds up transductive-inference convergence over gradient-based optimization while yielding similar accuracy. TIM inference is modular: it can be used on top of any base-training feature extractor. Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM outperforms state-of-the-art methods significantly across various datasets and networks, when used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings between 2% and 5% improvement in accuracy over the best-performing method, not only on all the well-established few-shot benchmarks but also in more challenging scenarios with domain shifts and larger numbers of classes.
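
    A compact sketch of the TIM objective under illustrative names: cross-entropy on the support set plus a mutual-information term on the query predictions, optimized over a linear classifier on frozen features with plain gradient descent (the paper's alternating-direction solver is the faster alternative):

        import torch
        import torch.nn.functional as F

        def tim_loss(W, z_s, y_s, z_q, lam=0.1, eps=1e-8):
            ce = F.cross_entropy(z_s @ W.T, y_s)        # supervision on support
            p_q = torch.softmax(z_q @ W.T, dim=1)       # query soft predictions
            p_bar = p_q.mean(0)
            h_marg = -(p_bar * (p_bar + eps).log()).sum()
            h_cond = -(p_q * (p_q + eps).log()).sum(1).mean()
            return ce - lam * (h_marg - h_cond)         # maximize query MI

        # 5-way 1-shot toy task with 64-d frozen features
        z_s, y_s = torch.randn(5, 64), torch.arange(5)
        z_q = torch.randn(75, 64)
        W = torch.zeros(5, 64, requires_grad=True)
        opt = torch.optim.Adam([W], lr=1e-2)
        for _ in range(100):
            opt.zero_grad()
            tim_loss(W, z_s, y_s, z_q).backward()
            opt.step()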

    Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

    No full text
    We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performance, an aspect often overlooked in the literature in favor of the meta-learning paradigm. We introduce a transductive inference for a given query image, leveraging the statistics of its unlabeled pixels, by optimizing a new loss containing three complementary terms: i) the cross-entropy on the labeled support pixels; ii) the Shannon entropy of the posteriors on the unlabeled query-image pixels; and iii) a global KL-divergence regularizer based on the proportion of the predicted foreground. As our inference uses a simple linear classifier on the extracted features, its computational load is comparable to inductive inference and it can be used on top of any base training. Foregoing episodic training and using only standard cross-entropy training on the base classes, our inference yields competitive performance on standard benchmarks in the 1-shot scenario. As the number of available shots increases, the performance gap widens: on PASCAL-5i, our method brings about 5% and 6% improvements over the state of the art in the 5- and 10-shot scenarios, respectively. Furthermore, we introduce a new setting that includes domain shifts, where the base and novel classes are drawn from different datasets. Our method achieves the best performance in this more realistic setting. Our code is freely available online: https://github.com/mboudiaf/RePRI-for-Few-Shot-Segmentation
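
    A sketch of the three-term transductive loss described above, over per-pixel foreground probabilities; the names and the fixed target foreground proportion are illustrative assumptions (the paper estimates this proportion rather than fixing it):

        import torch
        import torch.nn.functional as F

        def transductive_seg_loss(p_s, y_s, p_q, target_fg=0.3, eps=1e-8):
            """p_s, p_q: flattened foreground probabilities for support / query
            pixels; y_s: binary support labels; target_fg: assumed foreground
            proportion."""
            # i) cross-entropy on the labeled support pixels
            ce = F.binary_cross_entropy(p_s, y_s.float())
            # ii) Shannon entropy of posteriors on unlabeled query pixels
            ent = -(p_q * (p_q + eps).log()
                    + (1 - p_q) * (1 - p_q + eps).log()).mean()
            # iii) KL between target and predicted foreground proportion
            pi = p_q.mean()
            kl = (target_fg * (target_fg / (pi + eps)).log()
                  + (1 - target_fg) * ((1 - target_fg) / (1 - pi + eps)).log())
            return ce + ent + kl

        # toy example: one 32x32 support mask and one query image
        p_s = torch.rand(32 * 32)
        y_s = (torch.rand(32 * 32) > 0.7).long()
        p_q = torch.rand(32 * 32)
        loss = transductive_seg_loss(p_s, y_s, p_q)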

    A Unifying Mutual Information View of Metric Learning: Cross-Entropy vs. Pairwise Losses

    No full text
    Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensuring the privacy of individuals as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, motivated by novel information-theoretical bounds. We introduce a novel training objective for simultaneously training a predictor over target variables of interest (the regular labels) while preventing an intermediate representation from being predictive of the private labels. The architecture is based on three sub-networks: one going from input to representation, one from representation to predicted regular labels, and one from representation to predicted private labels. The training procedure aims at learning representations that preserve the relevant information (about the regular labels) while dismissing information about the private labels, which correspond to the identity of a person. We demonstrate the success of this approach on two distinct classification-versus-anonymization tasks (handwritten digits and sentiment analysis).
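
    A schematic of the three-sub-network setup described above, with an alternating training step; the uniform-prediction penalty used here to dismiss private information is one possible illustrative choice, not necessarily the authors', and all dimensions and names are assumptions:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        # three sub-networks: input -> representation, representation ->
        # regular labels, representation -> private labels
        encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 64))
        regular_head = nn.Linear(64, 10)    # e.g. digit class
        private_head = nn.Linear(64, 100)   # e.g. writer identity

        opt_main = torch.optim.Adam(
            list(encoder.parameters()) + list(regular_head.parameters()), lr=1e-3)
        opt_priv = torch.optim.Adam(private_head.parameters(), lr=1e-3)

        def train_step(x, y_regular, y_private, lam=1.0):
            # 1) fit the private head to predict identity from the (frozen)
            #    representation
            opt_priv.zero_grad()
            z = encoder(x).detach()
            F.cross_entropy(private_head(z), y_private).backward()
            opt_priv.step()
            # 2) update encoder + regular head: predict regular labels while
            #    pushing private predictions toward uniform, i.e. dismissing
            #    identity information from the representation
            opt_main.zero_grad()
            z = encoder(x)
            task = F.cross_entropy(regular_head(z), y_regular)
            anon = -torch.log_softmax(private_head(z), dim=1).mean()
            (task + lam * anon).backward()
            opt_main.step()

        x = torch.randn(16, 784)
        train_step(x, torch.randint(0, 10, (16,)), torch.randint(0, 100, (16,)))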