
    Why KDAC? A general activation function for knowledge discovery

    Deep learning-oriented named entity recognition (DNER) has gradually become the paradigm of knowledge discovery, greatly advancing domain intelligence. However, the activation functions currently used in DNER fail to address vanishing gradients, the absence of negative outputs, or the existence of non-differentiable points, which can impede knowledge exploration through the omission and incomplete representation of latent semantics. To break through this dilemma, we present a novel activation function termed KDAC. In detail, KDAC is an aggregation function with multiple conversion modes. The backbone of the activation region is an interaction between exponential and linear components, and both ends extend through adaptive linear divergence, which overcomes vanishing gradients and the absence of negative outputs. Crucially, non-differentiable points are detected and eliminated by an approximate smoothing algorithm. KDAC has a series of desirable properties, including nonlinearity, a stable near-linear transformation and derivative, and a dynamic form. We perform experiments with a BERT-BiLSTM-CNN-CRF model on six benchmark datasets covering different domain knowledge: Weibo, Clinical, E-commerce, Resume, HAZOP and People's Daily. The evaluation results show that KDAC is effective and provides a more generalized activation that improves the performance of DNER. We hope that KDAC can be adopted as a promising activation function in the construction of knowledge.
    Comment: Accepted by Neurocomputing
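    The abstract specifies KDAC's ingredients but not its closed form. As a rough illustration only, here is a minimal PyTorch sketch of an activation with the stated properties: an exponential-linear interaction in the central region, adaptive linear tails that avoid vanishing gradients and permit negative outputs, and a smooth blend standing in for the paper's approximate smoothing of non-differentiable points. The formula and the parameters alpha and k are assumptions, not the published definition.

```python
import torch
import torch.nn as nn

class KDACLike(nn.Module):
    """Hypothetical KDAC-style activation; the exact published formula differs."""

    def __init__(self, alpha: float = 1.0, k: float = 0.1):
        super().__init__()
        # Learnable gate sharpness and tail slope (assumed parameterization).
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.k = nn.Parameter(torch.tensor(k))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Central region: interaction between an exponential and a linear term;
        # smooth and able to produce negative outputs.
        core = x * torch.exp(-x.pow(2))
        # Tails: adaptive linear divergence keeps gradients alive far from zero.
        tails = self.k * x
        # Sigmoid blend instead of a hard piecewise switch, so the function is
        # differentiable everywhere.
        gate = torch.sigmoid(self.alpha * x.abs())
        return gate * tails + (1.0 - gate) * core
```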

    CityDreamer: Compositional Generative Model of Unbounded 3D Cities

    In recent years, extensive research has focused on 3D natural scene generation, but the domain of 3D city generation has not received as much exploration. This is due to the greater challenges posed by 3D city generation, mainly because humans are more sensitive to structural distortions in urban environments. Additionally, generating 3D cities is more complex than generating 3D natural scenes, since buildings, as objects of the same class, exhibit a wider range of appearances compared to the relatively consistent appearance of objects like trees in natural scenes. To address these challenges, we propose CityDreamer, a compositional generative model designed specifically for unbounded 3D cities, which separates the generation of building instances from that of other background objects, such as roads, green lands, and water areas, into distinct modules. Furthermore, we construct two datasets, OSM and GoogleEarth, containing a vast amount of real-world city imagery to enhance the realism of the generated 3D cities in both their layouts and appearances. Through extensive experiments, CityDreamer has proven its superiority over state-of-the-art methods in generating a wide range of lifelike 3D cities.
    Comment: Project page: https://haozhexie.com/project/city-dreame
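    As a rough illustration of the compositional idea described above (building instances generated separately from background objects, then combined), here is a minimal compositing sketch. The interfaces and the simple alpha compositing are assumptions for illustration; the abstract does not describe CityDreamer's actual renderer.

```python
import torch

def compose_city(background_rgb: torch.Tensor,
                 building_rgbs: list,
                 building_masks: list) -> torch.Tensor:
    """background_rgb: (3, H, W); each building rgb: (3, H, W), mask: (1, H, W)."""
    canvas = background_rgb.clone()
    for rgb, mask in zip(building_rgbs, building_masks):
        # Composite each separately generated building instance over the
        # background objects (roads, green lands, water areas).
        canvas = mask * rgb + (1.0 - mask) * canvas
    return canvas
```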

    Fingerprint Presentation Attack Detector Using Global-Local Model

    The vulnerability of automated fingerprint recognition systems (AFRSs) to presentation attacks (PAs) has spurred the vigorous development of PA detection (PAD) technology. However, PAD methods have been limited by information loss and poor generalization, leading to degraded performance on new PA materials and fingerprint sensors. This paper therefore proposes a global-local model-based PAD (RTK-PAD) method to overcome those limitations to some extent. The proposed method consists of three modules: 1) a global module; 2) a local module; and 3) a rethinking module. The cut-out-based global module predicts a global spoofness score from non-local features of the entire fingerprint image, while the texture in-painting-based local module predicts a local spoofness score from fingerprint patches. The two modules are not independent but are connected through the proposed rethinking module, which localizes two discriminative patches for the local module based on the global spoofness score. Finally, the fused spoofness score, obtained by averaging the global and local spoofness scores, is used for PAD. Our experimental results on LivDet 2017 show that the proposed RTK-PAD achieves an average classification error (ACE) of 2.28% and a true detection rate (TDR) of 91.19% when the false detection rate (FDR) equals 1.0%, significantly outperforming the state-of-the-art methods by ~10% in terms of TDR (91.19% versus 80.74%).
    Comment: This paper was accepted by IEEE Transactions on Cybernetics. The current version includes minor revisions to the introduction and related work.
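    Since the abstract spells out the fusion step (an average of the global and local spoofness scores), the three-module pipeline can be written down compactly. The callables and the patch-selection interface below are assumptions for illustration, not the authors' code.

```python
import torch

def rtk_pad_score(global_module, local_module, rethink_patches, image):
    """Fused spoofness score as described: average of global and local scores.

    global_module / local_module: callables returning a spoofness score in [0, 1];
    rethink_patches: callable that localizes two discriminative patches from
    the image given the global score (interface assumed).
    """
    g = global_module(image)                 # score from the whole fingerprint
    patches = rethink_patches(image, g)      # two discriminative patches
    l = torch.stack([local_module(p) for p in patches]).mean()  # local score
    return 0.5 * (g + l)                     # fusion by averaging
```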

    Long-Range Feature Propagating for Natural Image Matting

    Natural image matting estimates the alpha values of the unknown regions in a trimap. Recent deep learning based methods propagate alpha values from the known regions to the unknown regions according to the similarity between them. However, we find that more than 50% of the pixels in the unknown regions cannot be correlated with pixels in the known regions because of the small effective receptive fields of common convolutional neural networks, which leads to inaccurate estimation when pixels in the unknown regions cannot be inferred from pixels within the receptive field alone. To solve this problem, we propose the Long-Range Feature Propagating Network (LFPNet), which learns long-range context features outside the receptive field for alpha matte estimation. Specifically, we first design a propagating module that extracts context features from the downsampled image. Then, we present Center-Surround Pyramid Pooling (CSPP), which explicitly propagates the context features from the surrounding context image patch to the inner center image patch. Finally, we use a matting module that takes the image, the trimap and the context features to estimate the alpha matte. Experimental results demonstrate that the proposed method performs favorably against the state-of-the-art methods on the AlphaMatting and Adobe Image Matting datasets.
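    To make the center-surround idea concrete, here is a hedged sketch: pool the surrounding context features at several pyramid scales, resize them to the center patch, and concatenate. This illustrates the stated idea only; it is not LFPNet's published CSPP implementation, and the scale set is an assumption.

```python
import torch
import torch.nn.functional as F

def center_surround_pool(surround_feat: torch.Tensor,
                         center_size: tuple,
                         scales=(1, 2, 4)) -> torch.Tensor:
    """surround_feat: (B, C, H, W); returns (B, C * len(scales), h, w)."""
    pooled = []
    for s in scales:
        # Pool the surround context to an s x s grid, then broadcast it
        # back up to the center patch resolution.
        p = F.adaptive_avg_pool2d(surround_feat, (s, s))
        pooled.append(F.interpolate(p, size=center_size, mode="bilinear",
                                    align_corners=False))
    return torch.cat(pooled, dim=1)
```

Pooling at several grid sizes gives the center patch access to context well beyond the network's receptive field, which is the stated motivation for CSPP.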

    Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing

    Biometric systems are vulnerable to Presentation Attacks (PAs) performed using various Presentation Attack Instruments (PAIs). Even though there are numerous Presentation Attack Detection (PAD) techniques based on both deep learning and hand-crafted features, the generalization of PAD to unknown PAIs remains a challenging problem. In this work, we empirically show that the initialization of the PAD model is a crucial factor for generalization, which is rarely discussed in the community. Based on this observation, we propose a self-supervised learning-based method, denoted DF-DM. Specifically, DF-DM is based on a global-local view coupled with De-Folding and De-Mixing to derive a task-specific representation for PAD. During De-Folding, the proposed technique learns region-specific features that represent samples in a local pattern by explicitly minimizing a generative loss, while De-Mixing drives the detector to obtain instance-specific features with global information for a more comprehensive representation by minimizing an interpolation-based consistency loss. Extensive experimental results show that the proposed method achieves significant improvements for both face and fingerprint PAD on more complicated and hybrid datasets when compared with state-of-the-art methods. When trained on CASIA-FASD and Idiap Replay-Attack, the proposed method achieves an 18.60% Equal Error Rate (EER) on OULU-NPU and MSU-MFSD, exceeding the baseline performance by 9.54%. The source code of the proposed technique is available at https://github.com/kongzhecn/dfdm.
    Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
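    A minimal sketch of an interpolation-based consistency objective in the spirit of De-Mixing: features of a mixed input should stay close to the same mix of the individual features. The exact loss in the paper may differ; the fixed mixing coefficient and the MSE choice here are assumptions.

```python
import torch
import torch.nn.functional as F

def demixing_consistency(encoder, x1: torch.Tensor, x2: torch.Tensor,
                         lam: float = 0.5) -> torch.Tensor:
    """encoder: callable mapping an input batch to a feature tensor."""
    mixed = lam * x1 + (1.0 - lam) * x2            # interpolated input
    z_mixed = encoder(mixed)
    # Target: the same interpolation applied in feature space.
    z_target = lam * encoder(x1) + (1.0 - lam) * encoder(x2)
    return F.mse_loss(z_mixed, z_target.detach())  # consistency penalty
```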

    Multi-view 3D Face Reconstruction Based on Flame

    3D face reconstruction currently has broad application prospects in various fields, but research on it is still in a developmental stage. In this paper, we aim to achieve better 3D face reconstruction quality by combining a multi-view training framework with the parametric face model FLAME, and we propose MFNet (Multi-view FLAME Network), a multi-view training and testing model. We build a self-supervised training framework that implements constraints such as a multi-view optical flow loss and a face landmark loss, and thereby obtain the complete MFNet. We propose novel implementations of the multi-view optical flow loss and the covisible mask. We test our model on the AFLW and FaceScape datasets, and also take pictures of our own faces to reconstruct them in 3D while simulating real scenarios as closely as possible, achieving good results. Our work mainly addresses the problem of combining parametric face models with multi-view 3D face reconstruction and explores a FLAME-based multi-view training and testing framework, contributing to the field of 3D face reconstruction.
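    As a rough illustration of a covisibility-masked multi-view loss in the spirit described above: warp one view into another and penalize photometric differences only where the covisible mask indicates both views see the surface. The L1 choice, tensor shapes, and normalization are assumptions; the paper's actual optical flow loss is not specified in the abstract.

```python
import torch

def masked_multiview_loss(img_a: torch.Tensor,
                          img_b_warped_to_a: torch.Tensor,
                          covisible_mask: torch.Tensor) -> torch.Tensor:
    """img tensors: (B, C, H, W); covisible_mask: (B, 1, H, W),
    1 where both views see the face surface."""
    diff = (img_a - img_b_warped_to_a).abs() * covisible_mask
    # Average over covisible pixels only (scaled by channel count).
    denom = (covisible_mask.sum() * img_a.shape[1]).clamp(min=1.0)
    return diff.sum() / denom
```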