29 research outputs found
CentralNet: a Multilayer Approach for Multimodal Fusion
This paper proposes a novel multimodal fusion approach, aiming to produce
best possible decisions by integrating information coming from multiple media.
While most of the past multimodal approaches either work by projecting the
features of different modalities into the same space, or by coordinating the
representations of each modality through the use of constraints, our approach
borrows from both visions. More specifically, assuming each modality can be
processed by a separated deep convolutional network, allowing to take decisions
independently from each modality, we introduce a central network linking the
modality specific networks. This central network not only provides a common
feature embedding but also regularizes the modality specific networks through
the use of multi-task learning. The proposed approach is validated on 4
different computer vision tasks on which it consistently improves the accuracy
of existing multimodal fusion approaches
Accessibility-based reranking in multimedia search engines
Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or using a log of previous searches to provide personalized results, while not considering the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one from the Pareto-optimal solutions. The proposed approach has been tested with two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended for other types of personalization context
Investigating non-classical correlations between decision fused multi-modal documents
Correlation has been widely used to facilitate various information retrieval methods such as query expansion, relevance feedback, document clustering, and multi-modal fusion. Especially, correlation and independence are important issues when fusing different modalities that influence a multi-modal information retrieval process. The basic idea of correlation is that an observable can help predict or enhance another observable. In quantum mechanics, quantum correlation, called entanglement, is a sort of correlation between the observables measured in atomic-size particles when these particles are not necessarily collected in ensembles. In this paper, we examine a multimodal fusion scenario that might be similar to that encountered in physics by firstly measuring two observables (i.e., text-based relevance and image-based relevance) of a multi-modal document without counting on an ensemble of multi-modal documents already labeled in terms of these two variables. Then, we investigate the existence of non-classical correlations between pairs of multi-modal documents. Despite there are some basic differences between entanglement and classical correlation encountered in the macroscopic world, we investigate the existence of this kind of non-classical correlation through the Bell inequality violation. Here, we experimentally test several novel association methods in a small-scale experiment. However, in the current experiment we did not find any violation of the Bell inequality. Finally, we present a series of interesting discussions, which may provide theoretical and empirical insights and inspirations for future development of this direction
iCLAP: Shape Recognition by Combining Proprioception and Touch Sensing
The work presented in this paper was partially supported by the Engineering and Physical Sciences Council (EPSRC) Grant (Ref: EP/N020421/1) and the King’s-China Scholarship Council Ph.D. scholarship
SecureCSearch: Secure Searching in PDF over Untrusted Cloud Servers
© 2019 IEEE. The usage of cloud for data storage has become ubiquitous. To prevent data leakage and hacks, it is common to encrypt the data (e.g. PDF files) before sending it to a cloud. However, this limits the search for specific files containing certain keywords over an encrypted cloud data. The traditional method is to take down all files from a cloud, store them locally, decrypt and then search over them, defeating the purpose of using a cloud. In this paper, we propose a method, called SecureCSearch, to perform keyword search operations on the encrypted PDF files over cloud in an efficient manner. The proposed method makes use of Shamir's Secret Sharing scheme in a novel way to create encrypted shares of the PDF file and the keyword to search. We show that the proposed method maintains the security of the data and incurs minimal computation cost
Secret sharing approach for securing cloud-based pre-classification volume ray-casting
With the evolution in cloud computing, cloud-based volume rendering, which outsources data rendering tasks to cloud datacenters, is attracting interest. Although this new rendering technique has many advantages, allowing third-party access to potentially sensitive volume data raises security and privacy concerns. In this paper, we address these concerns for cloud-based pre-classification volume ray-casting by using Shamir’s (k, n) secret sharing and its variant (l, k, n) ramp secret sharing, which are homomorphic to addition and scalar multiplication operations, to hide color information of volume data/images in datacenters. To address the incompatibility issue of the modular prime operation used in secret sharing technique with the floating point operations of ray-casting, we consider excluding modular prime operation from secret sharing or converting the floating number operations of ray-casting to fixed point operations – the earlier technique degrades security and the later degrades image quality. Both these techniques, however, result in significant data overhead. To lessen the overhead at the cost of high security, we propose a modified ramp secret sharing scheme that uses the three color components in one secret sharing polynomial and replaces the shares in floating point with smaller integers
A design methodology for selecting and placement of sensors in multimedia surveillance systems
This paper addresses the problem of how to select the optimal number of sensors and how to determine their placement in a given monitored area for multimedia surveillance systems. We propose to solve this problem by obtaining a novel performance metric in terms of a probability measure for accomplishing the task as a function of set of sensors and their placement. This measure is then used to find the optimal set. The same measure can be used to analyze the degradation in system 's performance with respect to the failure of various sensors. We also build a surveillance system using the optimal set of sensors obtained based on the proposed design methodology. Experimental results show the effectiveness of the proposed design methodology in selecting the optimal set of sensors and their placement
Learning A Priori Constrained Weighted Majority Votes
The published version is available here: http://link.springer.com/article/10.1007/s10994-014-5462-zInternational audienceWeighted majority votes allow one to combine the output of several classifiers or voters. MinCq is a recent algorithm for optimizing the weight of each voter based on the minimization of a theoretical bound over the risk of the vote with elegant PAC-Bayesian generalization guarantees. However, while it has demonstrated good performance when combining weak classifiers, MinCq cannot make use of the useful a priori knowledge that one may have when using a mixture of weak and strong voters. In this paper, we propose P-MinCq, an extension of MinCq that can incorporate such knowledge in the form of a constraint over the distribution of the weights, along with general proofs of convergence that stand in the sample compression setting for data-dependent voters. The approach is applied to a vote of k-NN classifiers with a specific modeling of the voters' performance. P-MinCq significantly outperforms the classic k-NN classifier, a symmetric NN and MinCq using the same voters. We show that it is also competitive with LMNN, a popular metric learning algorithm, and that combining both approaches further reduces the error