Multi Focus Image Fusion with variable size windows
In this paper we present the Linear Image Combination Algorithm with Variable Windows (CLI-VV) for the fusion of multi-focus images. Unlike the CLI-S algorithm presented in a previous work, the CLI-VV algorithm automatically determines the optimal window size at each pixel for segmenting the regions with the highest sharpness. We also present a generalized CLI-VV algorithm, called Variable Windows Multi-focus Fusion (FM-VV), for fusing sets of more than two multi-focus images. The CLI-VV algorithm was tested on 21 pairs of synthetic images and 29 pairs of real multi-focus images, and the FM-VV algorithm on 5 trios of multi-focus images. All tests achieved competitive accuracy, with execution times lower than those reported in the literature. Calderon, F.; Garnica-Carrillo, A.; Flores, J. J. (2018). Fusión de Imágenes Multi-Foco con Ventanas Variables. Revista Iberoamericana de Automática e Informática Industrial 15(3):262-276.
https://doi.org/10.4995/riai.2017.8852
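The abstract above describes selecting, per pixel, the window size that best separates in-focus regions, then combining the sources. As a rough illustration of the general idea (not the authors' CLI-VV algorithm: the variance-based focus measure, the candidate window sizes, and all function names are assumptions), a two-image fusion sketch could look like:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance(img, size):
    """Local variance over a size x size window (a common focus measure)."""
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img * img, size)
    return mean_sq - mean * mean

def fuse_two_focus(img_a, img_b, sizes=(3, 5, 9)):
    """Fuse two multi-focus images: at each pixel, keep the source whose
    neighborhood is sharper; the window size that maximizes the sharpness
    gap is chosen per pixel (a stand-in for the variable-window idea)."""
    best_gap = np.full(img_a.shape, -np.inf)
    mask = np.zeros(img_a.shape, dtype=bool)
    for s in sizes:
        va, vb = local_variance(img_a, s), local_variance(img_b, s)
        gap = np.abs(va - vb)
        upd = gap > best_gap            # pixels where this window separates better
        mask[upd] = (va > vb)[upd]      # True: take the pixel from img_a
        best_gap[upd] = gap[upd]
    return np.where(mask, img_a, img_b)
```

Selecting the window per pixel, rather than globally, is what lets small in-focus details survive next to large defocused regions.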
Effective Image Retrieval via Multilinear Multi-index Fusion
Multi-index fusion has demonstrated impressive performance in retrieval tasks
by integrating different visual representations in a unified framework.
However, previous works mainly propagate similarities via the neighborhood
structure, ignoring the high-order information among different visual
representations. In this paper, we propose a new multi-index fusion scheme for
image retrieval. By formulating this procedure as a multilinear based
optimization problem, the complementary information hidden in different indexes
can be explored more thoroughly. Specifically, we first build our multiple indexes
from various visual representations. Then a so-called index-specific functional
matrix, which aims to propagate similarities, is introduced for updating the
original index. The functional matrices are then optimized in a unified tensor
space to achieve a refinement, such that relevant images can be pushed
closer together. The optimization problem can be efficiently solved by the
augmented Lagrangian method with a theoretical convergence guarantee. Unlike the
traditional multi-index fusion scheme, our approach embeds the multi-index
subspace structure into the new indexes with a sparse constraint, so it incurs
little additional memory consumption in the online query stage. Experimental
evaluation on three benchmark datasets reveals that the proposed approach
achieves the state-of-the-art performance, i.e., N-score 3.94 on UKBench, mAP
94.1% on Holiday, and 62.39% on Market-1501. Comment: 12 pages
Scale-Invariant Structure Saliency Selection for Fast Image Fusion
In this paper, we present a fast yet effective method for pixel-level
scale-invariant image fusion in spatial domain based on the scale-space theory.
Specifically, we propose a scale-invariant structure saliency selection scheme
based on the difference-of-Gaussian (DoG) pyramid of images to build the
weights or activity map. Due to the scale-invariant structure saliency
selection, our method keeps both the details of small objects and the
structural integrity of large objects in images. In addition, our method is
very efficient: it involves no complex operations and is easy to implement,
so it can be used for fast fusion of high-resolution images.
Experimental results demonstrate that the proposed method yields competitive
or even better results compared to state-of-the-art image fusion methods, both in terms
of visual quality and objective evaluation metrics. Furthermore, the proposed
method is very fast and can be used to fuse the high resolution images in
real-time. Code is available at https://github.com/yiqingmy/Fusion
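The DoG-based activity-map idea can be sketched in a few lines (this is an illustration of the general approach, not the released code; the scale set, the 1.6-sigma ratio, and the max-over-scales rule are assumptions):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_saliency(img, sigmas=(1.0, 2.0, 4.0)):
    """Structure saliency as the maximum absolute difference-of-Gaussians
    response across scales, giving a scale-aware activity map."""
    sal = np.zeros_like(img)
    for s in sigmas:
        dog = gaussian_filter(img, s) - gaussian_filter(img, 1.6 * s)
        sal = np.maximum(sal, np.abs(dog))
    return sal

def fuse(img_a, img_b):
    """Pixel-wise fusion: take each pixel from the source with the larger
    DoG saliency (the sharper, more structured source wins locally)."""
    return np.where(dog_saliency(img_a) >= dog_saliency(img_b), img_a, img_b)
```

Because the saliency takes a maximum over scales, fine texture and large smooth structures can both drive the selection, which is the scale-invariance property the abstract emphasizes.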
Boosting in Image Quality Assessment
In this paper, we analyze the effect of boosting in image quality assessment
through multi-method fusion. Existing multi-method studies focus on proposing a
single quality estimator. On the contrary, we investigate the generalizability
of multi-method fusion as a framework. In addition to support vector machines
that are commonly used in the multi-method fusion, we propose using neural
networks in the boosting. To span different types of image quality assessment
algorithms, we use quality estimators based on fidelity, perceptually-extended
fidelity, structural similarity, spectral similarity, color, and learning. In
the experiments, we perform k-fold cross validation using the LIVE, the
multiply distorted LIVE, and the TID 2013 databases, and the performance of
image quality assessment algorithms is measured via accuracy-, linearity-, and
ranking-based metrics. Based on the experiments, we show that boosting methods
generally improve the performance of image quality assessment and the level of
improvement depends on the type of the boosting algorithm. Our experimental
results also indicate that boosting the worst performing quality estimator with
two or more additional methods leads to statistically significant performance
enhancements independent of the boosting technique and neural network-based
boosting outperforms support vector machine-based boosting when two or more
methods are fused. Comment: Paper: 6 pages, 5 tables, 1 figure; Presentation: 16 slides
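As a simplified stand-in for the SVR/NN regressors used in such multi-method fusion, a plain least-squares combiner of several quality estimators' scores against subjective scores can illustrate the setup (all names and the linear model are assumptions, not the paper's method):

```python
import numpy as np

def fit_fusion_weights(scores, mos):
    """Least-squares score fusion: scores is (n_images, n_methods),
    mos is (n_images,) subjective quality scores. Returns per-method
    weights plus a bias term."""
    X = np.column_stack([scores, np.ones(len(scores))])  # append bias column
    w, *_ = np.linalg.lstsq(X, mos, rcond=None)
    return w

def fused_quality(scores, w):
    """Apply the learned weights to produce a fused quality estimate."""
    X = np.column_stack([scores, np.ones(len(scores))])
    return X @ w
```

Even this linear combiner typically correlates better with subjective scores than any single estimator, which is the basic premise that boosting with nonlinear regressors then builds on.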
Orientation Driven Bag of Appearances for Person Re-identification
Person re-identification (re-id) consists of associating individuals across a
camera network, which is valuable for intelligent video surveillance and has
drawn wide attention. Although person re-identification research is making
progress, it still faces some challenges such as varying poses, illumination
and viewpoints. For feature representation in re-identification, existing works
usually use low-level descriptors which do not take full advantage of body
structure information, resulting in low representation ability. To solve this
problem, this paper proposes the mid-level
body-structure based feature representation (BSFR) which introduces body
structure pyramid for codebook learning and feature pooling in the vertical
direction of human body. Besides, varying viewpoints in the horizontal
direction of the human body usually cause a data-missing problem, i.e., the
appearances obtained in different orientations of the identical person could
vary significantly. To address this problem, the orientation driven bag of
appearances (ODBoA) is proposed to utilize person orientation information
extracted by an orientation estimation technique. To properly evaluate the proposed
approach, we introduce a new re-identification dataset (Market-1203) based on
the Market-1501 dataset and propose a new re-identification dataset (PKU-Reid).
Both datasets contain multiple images captured in different body orientations
for each person. Experimental results on three public datasets and two proposed
datasets demonstrate the superiority of the proposed approach, indicating the
effectiveness of body structure and orientation information for improving
re-identification performance.Comment: 13 pages, 15 figures, 3 tables, submitted to IEEE Transactions on
Circuits and Systems for Video Technology
CGGAN: A Context Guided Generative Adversarial Network For Single Image Dehazing
Image haze removal is highly desired for the application of computer vision.
This paper proposes a novel Context Guided Generative Adversarial Network
(CGGAN) for single image dehazing, in which a novel encoder-decoder is
employed as the generator. It consists of a feature-extraction-net, a
context-extraction-net, and a fusion-net in sequence. The feature-extraction-net
acts as an encoder and is used for extracting haze features. The
context-extraction-net is a multi-scale parallel pyramid decoder used
for extracting the deep features of the encoder and generating a coarse
dehazing image. The fusion-net is a decoder used for obtaining the final
haze-free image. To obtain better results, multi-scale information
obtained during the decoding process of the context extraction decoder is used
for guiding the fusion decoder. By introducing an extra coarse decoder to the
original encoder-decoder, the CGGAN can make better use of the deep feature
information extracted by the encoder. To ensure our CGGAN works effectively for
different haze scenarios, different loss functions are employed for the two
decoders. Experimental results show the advantage and effectiveness of our
proposed CGGAN: evident improvements over existing state-of-the-art methods
are obtained. Comment: 12 pages, 7 figures, 3 tables
A novel hybrid score level and decision level fusion scheme for cancelable multi-biometric verification
In spite of the benefits of biometric-based authentication systems, several
concerns remain: the sensitivity of biometric data to outliers, low
performance due to intra-class variations, and privacy invasion caused by
information leakage. To address these issues, we propose a hybrid
fusion framework where only the protected modalities are combined to fulfill
the requirement of secrecy and performance improvement. This paper presents a
method to integrate cancelable modalities utilizing mean-closure weighting
(MCW) score level and Dempster-Shafer (DS) theory based decision level fusion
for iris and fingerprint to mitigate the limitations in the individual score or
decision fusion mechanisms. The proposed hybrid fusion scheme incorporates the
similarity scores from different matchers corresponding to each protected
modality. The individual scores obtained from different matchers for each
modality are combined using MCW score fusion method. The MCW technique achieves
the optimal weight for each matcher involved in the score computation. Further,
DS theory is applied to the induced scores to output the final decision. The
rigorous experimental evaluations on three virtual databases indicate that the
proposed hybrid fusion framework outperforms the individual component-level
fusion methods (score-level and decision-level fusion). As a result,
we achieve (48%,66%), (72%,86%) and (49%,38%) of performance improvement over
unimodal cancelable iris and unimodal cancelable fingerprint verification
systems for Virtual_A, Virtual_B and Virtual_C databases, respectively. Also,
the proposed method is robust enough to the variability of scores and outliers
satisfying the requirement of secure authentication.
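The decision-level stage relies on Dempster-Shafer combination. A minimal sketch for a two-class frame {genuine, impostor} with an explicit uncertainty mass "theta" (the representation and mass values are illustrative assumptions, not the paper's exact formulation):

```python
def ds_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over the frame
    {genuine, impostor}; 'theta' holds the mass assigned to uncertainty.
    Conflicting mass (one matcher says genuine, the other impostor) is
    discarded and the rest renormalized."""
    conflict = m1["genuine"] * m2["impostor"] + m1["impostor"] * m2["genuine"]
    k = 1.0 - conflict  # normalization constant
    return {
        "genuine": (m1["genuine"] * m2["genuine"]
                    + m1["genuine"] * m2["theta"]
                    + m1["theta"] * m2["genuine"]) / k,
        "impostor": (m1["impostor"] * m2["impostor"]
                     + m1["impostor"] * m2["theta"]
                     + m1["theta"] * m2["impostor"]) / k,
        "theta": (m1["theta"] * m2["theta"]) / k,
    }
```

When both modalities lean toward "genuine", the combined belief in "genuine" exceeds either input mass, which is the evidence-reinforcement behavior decision-level fusion exploits.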
Hybrid Distortion Aggregated Visual Comfort Assessment for Stereoscopic Image Retargeting
Visual comfort is a crucial factor in 3D media services. Few research
efforts have been carried out in this area, especially for 3D content
retargeting, which may introduce more complicated visual distortions. In this
paper, we propose a Hybrid Distortion Aggregated Visual Comfort Assessment
(HDA-VCA) scheme for stereoscopic retargeted images (SRI), considering
aggregation of hybrid distortions including structure distortion, information
loss, binocular incongruity and semantic distortion. Specifically, a Local-SSIM
feature is proposed to reflect the local structural distortion of SRI, and
information loss is represented by Dual Natural Scene Statistics (D-NSS)
feature extracted from the binocular summation and difference channels.
Regarding binocular incongruity, visual comfort zone, window violation,
binocular rivalry, and accommodation-vergence conflict of human visual system
(HVS) are evaluated. Finally, the semantic distortion is represented by the
correlation distance of paired feature maps extracted from original
stereoscopic image and its retargeted image by using trained deep neural
network. We validate the effectiveness of HDA-VCA on published Stereoscopic
Image Retargeting Database (SIRD) and two stereoscopic image databases IEEE-SA
and NBU 3D-VCA. The results demonstrate HDA-VCA's superior performance in
handling hybrid distortions compared to state-of-the-art VCA schemes. Comment: 13 pages, 11 figures, 4 tables
Modality-specific Cross-modal Similarity Measurement with Recurrent Attention Network
Nowadays, cross-modal retrieval plays an indispensable role in flexibly
finding information across different modalities of data. Effectively measuring the
similarity between different modalities of data is the key to cross-modal
retrieval. Different modalities such as image and text have imbalanced and
complementary relationships, which contain unequal amounts of information when
describing the same semantics. For example, images often contain more details
that cannot be demonstrated by textual descriptions and vice versa. Existing
works based on Deep Neural Network (DNN) mostly construct one common space for
different modalities to find the latent alignments between them, which lose
their exclusive modality-specific characteristics. Different from the existing
works, we propose modality-specific cross-modal similarity measurement (MCSM)
approach by constructing independent semantic space for each modality, which
adopts end-to-end framework to directly generate modality-specific cross-modal
similarity without an explicit common representation. For each semantic space,
modality-specific characteristics within one modality are fully exploited by a
recurrent attention network, while data from the other modality are projected
into this space via an attention-based joint embedding; the learned attention
weights guide fine-grained cross-modal correlation learning, which can capture
the imbalanced and complementary relationships between different modalities.
Finally, the complementarity between the semantic
spaces for different modalities is explored by adaptive fusion of the
modality-specific cross-modal similarities to perform cross-modal retrieval.
Experiments on the widely-used Wikipedia and Pascal Sentence datasets as well
as our constructed large-scale XMediaNet dataset verify the effectiveness of
our proposed approach, outperforming 9 state-of-the-art methods. Comment: 13 pages, submitted to IEEE Transactions on Image Processing
Multi-feature Fusion for Image Retrieval Using Constrained Dominant Sets
Aggregating different image features for image retrieval has recently shown
its effectiveness. Yet the question of how to amplify the impact of the
best features for a specific query image remains an open computer vision
problem. In this paper, we propose a computationally
efficient approach to fuse several hand-crafted and deep features, based on the
probabilistic distribution of a given membership score of a constrained cluster
in an unsupervised manner. First, we introduce an incremental nearest neighbor
(NN) selection method, whereby we dynamically select k-NN to the query. We then
build several graphs from the obtained NN sets and employ constrained dominant
sets (CDS) on each graph G to assign edge weights which consider the intrinsic
manifold structure of the graph, and detect false matches to the query.
Finally, we compute a feature positive-impact weight (PIW) based on the
dispersion of the feature's membership-score vector; to this end, we
exploit the entropy of the cluster membership-score distribution. In addition,
the final NN set bypasses a heuristic voting scheme. Experiments on several
retrieval benchmark datasets show that our method can improve the
state-of-the-art results.
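The entropy-based PIW idea can be sketched directly (the normalization and the exact weight formula below are assumptions for illustration, not the paper's definition):

```python
import numpy as np

def positive_impact_weight(membership):
    """Entropy-based positive-impact weight for one feature: a peaked
    cluster membership-score distribution (low entropy, confident matches)
    earns a high weight; a dispersed one (high entropy) a low weight."""
    p = np.asarray(membership, dtype=float)
    p = p / p.sum()                       # normalize to a distribution
    h = -np.sum(p * np.log(p + 1e-12))    # Shannon entropy (epsilon avoids log 0)
    h_max = np.log(len(p))                # entropy of the uniform distribution
    return 1.0 - h / h_max                # 1 = fully peaked, 0 = fully dispersed
```

A feature whose constrained-cluster membership scores concentrate on a few neighbors thus gets more say in the fused ranking than one whose scores are spread uniformly.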