39 research outputs found

    Shaping Visual Representations with Attributes for Few-Shot Recognition

    Few-shot recognition aims to recognize novel categories under low-data regimes. Some recent few-shot recognition methods introduce an auxiliary semantic modality, i.e., category attribute information, into representation learning, which enhances feature discrimination and improves recognition performance. Most of these existing methods only consider the attribute information of the support set while ignoring the query set, resulting in a potential loss of performance. In this letter, we propose a novel attribute-shaped learning (ASL) framework, which can jointly perform query attribute generation and discriminative visual representation learning for few-shot recognition. Specifically, a visual-attribute predictor (VAP) is constructed to predict the attributes of queries. By leveraging the attribute information, an attribute-visual attention module (AVAM) is designed, which can adaptively utilize attributes and visual representations to learn more discriminative features. Under the guidance of the attribute modality, our method can learn enhanced semantic-aware representations for classification. Experiments demonstrate that our method achieves competitive results on the CUB and SUN benchmarks. Our source code is available at: \url{https://github.com/chenhaoxing/ASL}.
    Comment: accepted by IEEE Signal Process. Lett.
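    A hedged PyTorch sketch of the two components described above (not the authors' released code; the module names, feature dimension, and 312-dim attribute size are illustrative assumptions): a visual-attribute predictor that regresses attribute vectors from pooled visual features, and an attribute-visual attention module that uses attributes to gate visual channels.

        import torch
        import torch.nn as nn

        class VisualAttributePredictor(nn.Module):
            # Predict category attributes (e.g. 312-dim CUB attribute vectors) from pooled visual features.
            def __init__(self, feat_dim=640, attr_dim=312):
                super().__init__()
                self.mlp = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                         nn.Linear(feat_dim, attr_dim))

            def forward(self, visual_feat):          # (B, feat_dim)
                return self.mlp(visual_feat)         # (B, attr_dim)

        class AttributeVisualAttention(nn.Module):
            # Use (predicted or ground-truth) attributes to reweight visual channels.
            def __init__(self, feat_dim=640, attr_dim=312):
                super().__init__()
                self.gate = nn.Sequential(nn.Linear(attr_dim, feat_dim), nn.Sigmoid())

            def forward(self, visual_feat, attributes):
                return visual_feat * self.gate(attributes)   # channel-wise attention

    In such a pipeline, query attributes would come from the predictor (they are unavailable at test time), support attributes from annotations, and the shaped features would feed a metric-based few-shot classifier such as prototype matching.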

    Sparse Spatial Transformers for Few-Shot Learning

    Learning from limited data is challenging since the scarcity of data leads to poor generalization of the trained model. The classical global pooled representation is likely to lose useful local information. Recently, many few-shot learning methods address this challenge by using deep descriptors and learning a pixel-level metric. However, using deep descriptors as feature representations may lose the contextual information of the image, and most of these methods deal with each class in the support set independently, which cannot sufficiently utilize discriminative information and task-specific embeddings. In this paper, we propose a novel Transformer-based neural network architecture called Sparse Spatial Transformers (SSFormers), which can find task-relevant features and suppress task-irrelevant features. Specifically, we first divide each input image into several image patches of different sizes to obtain dense local features. These features retain contextual information while expressing local information. Then, a sparse spatial transformer layer is proposed to find spatial correspondence between the query image and the entire support set, selecting task-relevant image patches and suppressing task-irrelevant ones. Finally, we propose an image patch matching module for calculating the distance between dense local representations, thus determining which category in the support set the query image belongs to. Extensive experiments on popular few-shot learning benchmarks show that our method achieves state-of-the-art performance.
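    A rough Python sketch of the patch-matching step described above (not the official SSFormers code; it assumes dense local descriptors have already been extracted by a backbone and simply scores a query against each support class by cosine matching of patches):

        import torch
        import torch.nn.functional as F

        def class_scores(query_patches, support_patches_per_class):
            # query_patches: (P, D) local descriptors of one query image.
            # support_patches_per_class: list of (P_c, D) tensors, one per class
            # (all support patches of that class concatenated).
            q = F.normalize(query_patches, dim=-1)
            scores = []
            for s in support_patches_per_class:
                s = F.normalize(s, dim=-1)
                sim = q @ s.t()                              # (P, P_c) patch-to-patch similarities
                scores.append(sim.max(dim=1).values.mean())  # best support match per query patch
            return torch.stack(scores)                       # higher score = more similar class

    The sparse spatial transformer layer itself (task-relevant patch selection) is omitted; the sketch only illustrates how dense local representations can replace a single pooled vector in the final comparison.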

    Cryptanalysis of the convex hull click human identification protocol

    Recently, a convex hull-based human identification protocol was proposed by Sobrado and Birget, whose steps can be performed by humans without additional aid. The main part of the protocol involves the user mentally forming the convex hull of the secret icons in a set of graphical icons and then clicking randomly within this convex hull. While some rudimentary security issues of this protocol have been discussed, a comprehensive security analysis has been lacking. In this paper, we analyze the security of this convex hull-based protocol. In particular, we show two probabilistic attacks that reveal the user's secret after the observation of only a handful of authentication sessions. These attacks can be efficiently implemented, as their time and space complexities are considerably less than those of a brute-force attack. We show that while the first attack can be mitigated through appropriately chosen values of the system parameters, the second attack succeeds with non-negligible probability even with large system parameter values that cross the threshold of usability.
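    The geometric step the protocol relies on, and that the observed clicks leak information about, is membership of a click point in the convex hull of the secret icons. A minimal Python sketch of that membership test (this is only the test, not either of the attacks in the paper):

        import numpy as np
        from scipy.spatial import Delaunay

        def inside_convex_hull(click, secret_icon_positions):
            # True if `click` (x, y) lies in the convex hull of the secret icon positions.
            hull = Delaunay(np.asarray(secret_icon_positions, dtype=float))
            return hull.find_simplex(np.asarray(click, dtype=float)) >= 0

        # Example: three secret icons form a triangle; a click at an interior point passes.
        icons = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
        print(inside_convex_hull((2.0, 1.0), icons))   # True
        print(inside_convex_hull((5.0, 5.0), icons))   # False

    Since every observed click must lie in the hull of the secret icons shown in that session, each session constrains the candidate icon subsets, which is essentially the information the probabilistic attacks accumulate.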

    Joint Projection Learning and Tensor Decomposition Based Incomplete Multi-view Clustering

    Incomplete multi-view clustering (IMVC) has received increasing attention since, in reality, some views of samples are often incomplete. Most existing methods learn similarity subgraphs from the original incomplete multi-view data and seek complete graphs by exploring the incomplete subgraphs of each view for spectral clustering. However, the graphs constructed on the original high-dimensional data may be suboptimal due to feature redundancy and noise. Besides, previous methods generally ignore the graph noise caused by inter-class and intra-class structure variation during the transformation from incomplete graphs to complete graphs. To address these problems, we propose a novel Joint Projection Learning and Tensor Decomposition Based method (JPLTD) for IMVC. Specifically, to alleviate the influence of redundant features and noise in high-dimensional data, JPLTD introduces an orthogonal projection matrix to project the high-dimensional features into a lower-dimensional space for compact feature learning. Meanwhile, based on the lower-dimensional space, the similarity graphs corresponding to instances of different views are learned, and JPLTD stacks these graphs into a third-order low-rank tensor to explore the high-order correlations across different views. We further consider the graph noise of the projected data caused by missing samples and use a tensor-decomposition-based graph filter for robust clustering. JPLTD decomposes the original tensor into an intrinsic tensor and a sparse tensor, where the intrinsic tensor models the true data similarities. An effective optimization algorithm is adopted to solve the JPLTD model. Comprehensive experiments on several benchmark datasets demonstrate that JPLTD outperforms state-of-the-art methods. The code of JPLTD is available at https://github.com/weilvNJU/JPLTD.
    Comment: IEEE Transactions on Neural Networks and Learning Systems, 202
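    As a loose illustration of the "intrinsic plus sparse" split mentioned above (not the JPLTD algorithm, which uses a tensor low-rank model with a dedicated optimizer), the following numpy sketch stacks per-view graphs into a third-order array and separates a low-rank part from a sparse residual via a truncated SVD of an unfolding plus soft-thresholding; the function name, rank, and threshold are made up:

        import numpy as np

        def split_intrinsic_sparse(graph_tensor, rank=5, sparse_thresh=0.05):
            # graph_tensor: (n, n, V) stack of per-view similarity graphs.
            n, _, V = graph_tensor.shape
            unfolded = graph_tensor.reshape(n, n * V)            # mode-1 unfolding
            U, s, Vt = np.linalg.svd(unfolded, full_matrices=False)
            low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]      # intrinsic (true-similarity) part
            residual = unfolded - low_rank
            sparse = np.sign(residual) * np.maximum(np.abs(residual) - sparse_thresh, 0.0)
            return low_rank.reshape(n, n, V), sparse.reshape(n, n, V)

    Spectral clustering would then run on the intrinsic part, while the sparse part absorbs graph noise introduced by missing samples.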

    10-Formyl-2,4,6,8,12-penta­nitro-2,4,6,8,10,12-hexa­azatetra­cyclo­[5.5.0.03,11.05,9]dodeca­ne

    The title compound, C7H7N11O11 (PNMFIW), is a caged heterocycle substituted with five nitro groups and one formyl group. It is related to the hexaazaisowurtzitane family of high-density, high-energy polycyclic cage compounds. Four nitro groups are appended to the four N atoms of the two five-membered rings, while a nitro group and a formyl group are attached to the two N atoms of the six-membered ring.

    Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval

    Cross-modal retrieval (CMR) has been extensively applied in various domains, such as multimedia search engines and recommendation systems. Most existing CMR methods focus on image-to-text retrieval, whereas audio-to-text retrieval, a less explored domain, poses a great challenge due to the difficulty of uncovering discriminative features from audio clips and texts. Existing studies are restricted in the following two ways: 1) most researchers utilize contrastive learning to construct a common subspace where similarities among data can be measured; however, they consider only cross-modal transformation, neglecting intra-modal separability, and the temperature parameter is not adaptively adjusted along with semantic guidance, which degrades performance; 2) these methods do not take latent representation reconstruction into account, which is essential for semantic alignment. This paper introduces a novel audio-text oriented CMR approach, termed Contrastive Latent Space Reconstruction Learning (CLSR). CLSR improves contrastive representation learning by taking intra-modal separability into account and adopting an adaptive temperature control strategy. Moreover, latent representation reconstruction modules are embedded into the CMR framework, which improves modal interaction. Experiments in comparison with state-of-the-art methods on two audio-text datasets validate the superiority of CLSR.
    Comment: Accepted by the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2023)
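    A hedged sketch of the contrastive common-subspace baseline such methods build on: a symmetric InfoNCE loss between paired audio and text embeddings with a learnable temperature. CLSR's adaptive, semantics-guided temperature control, intra-modal terms, and reconstruction modules are not reproduced here; the function name and arguments are illustrative only.

        import torch
        import torch.nn.functional as F

        def audio_text_contrastive_loss(audio_emb, text_emb, log_temp):
            # audio_emb, text_emb: (B, D) embeddings of paired clips/captions.
            # log_temp: learnable scalar tensor (log of the temperature).
            a = F.normalize(audio_emb, dim=-1)
            t = F.normalize(text_emb, dim=-1)
            logits = (a @ t.t()) / log_temp.exp()            # (B, B) similarity matrix
            targets = torch.arange(a.size(0), device=a.device)
            return 0.5 * (F.cross_entropy(logits, targets) +
                          F.cross_entropy(logits.t(), targets))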

    10-Formyl-2,4,6,8,12-penta­nitro-2,4,6,8,10,12-hexa­azatetra­cyclo­[5.5.0.05,9.03,11]dodecane acetone solvate

    The title compound, C7H7N11O11·C3H6O, consisting of one molecule of 10-formyl-2,4,6,8,12-pentanitro-2,4,6,8,10,12-hexaazatetracyclo[5.5.0.05,9.03,11]dodecane (pentanitromonoformylhexaazaisowurtzitane, PNMFIW) and one acetone solvent molecule, is a member of the caged hexaazaisowurtzitane family. PNMFIW has a cage structure constructed from one six-membered and two five-membered rings which are linked by a C—C bond, thus creating two seven-membered rings. In the PNMFIW molecule, one formyl group is bonded to an N heteroatom of the six-membered ring, and five nitro groups are appended to the other five N heteroatoms of the caged structure. The acetone solvent molecule is arranged beside a five-membered plane of PNMFIW, with an O atom and an H atom close (with respect to the sum of the van der Waals radii) to neighbouring nitro O atoms [O⋯O = 2.957 (3) and 2.852 (3) Å; O⋯H = 2.692 (2), 2.526 (3) and 2.432 (3) Å].

    4,10-Diformyl-2,6,8,12-tetra­nitro-2,4,6,8,10,12-hexa­azatetra­cyclo­[5.5.0.05,9.03,11]dodeca­ne

    The title compound, TNDFIW (C8H8N10O10), is a caged heterocycle substituted with four nitro groups and two formyl groups. It is related to the hexaazaisowurtzitane family of high-density, high-energy polycyclic cage compounds. The four nitro groups are appended to the four N atoms of the two five-membered rings, while the two formyl groups are attached to the two N atoms of the six-membered ring, which adopts a boat conformation. The compound has a cage structure constructed from one six-membered and two five-membered rings which are closed by a C—C bond, thus creating two seven-membered rings. There are a number of close intermolecular contacts [O⋯O = 2.827 (5), 2.853 (4) and 2.891 (5) Å; O⋯N = 2.746 (2) and 2.895 (2) Å]. The calculated density of TNDFIW is 1.891 Mg m−3.

    Simplified Revocable Hierarchical Identity-Based Encryption from Lattices

    As an extension of identity-based encryption (IBE), revocable hierarchical IBE (RHIBE) supports both key revocation and key delegation simultaneously, two functionalities that are important for cryptographic use in practice. Recently, in PKC 2019, Katsumata et al. constructed the first lattice-based RHIBE scheme with decryption key exposure resistance (DKER); before their work, such constructions were all based on bilinear or multilinear maps. In this paper, we simplify the construction of the RHIBE scheme with DKER provided by Katsumata et al. With our new treatment of the identity spaces and the time period space, there is only one short trapdoor basis in the master secret key and in the secret key of each identity. In addition, we claim that some items in the keys can also be removed due to the DKER setting. Our first RHIBE scheme, in the standard model, is presented as a result of the above simplification. Furthermore, based on the technique of lattice basis delegation in fixed dimension, we construct our second RHIBE scheme in the random oracle model. It has much shorter items in keys and ciphertexts than previous schemes, and it also achieves adaptive-identity security under the learning with errors (LWE) assumption.