25 research outputs found

    Unbalanced data processing using oversampling: machine Learning

    Deep learning (DL) algorithms currently show good results on problems that share characteristics such as large amounts of data and high dimensionality. However, one of the main open challenges is the classification of high-dimensional databases with very few samples and severe class imbalance. Biomedical gene expression microarray databases exhibit exactly these characteristics: class imbalance, few samples, and high dimensionality. Class imbalance arises when the set of samples belonging to one class is much larger than the set of samples of the other class or classes, and it has been identified as one of the main challenges for algorithms applied in the context of Big Data. The objective of this research is the study of gene expression databases using conventional undersampling and oversampling methods for class balancing, such as RUS, ROS and SMOTE. The databases were additionally modified by increasing their imbalance and, in another case, by generating artificial noise.
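    As a minimal sketch of the balancing methods named in the abstract (RUS, ROS and SMOTE), the snippet below applies them with the imbalanced-learn library to a synthetic high-dimensional, imbalanced dataset. The dataset, sizes and parameters are illustrative assumptions, not the gene expression microarray data or settings used in the study.

```python
# Minimal sketch: class rebalancing with RUS, ROS and SMOTE via imbalanced-learn.
# The synthetic dataset and ratios below are placeholders, not the study's data.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import RandomUnderSampler
from imblearn.over_sampling import RandomOverSampler, SMOTE

# High-dimensional, few-sample, imbalanced toy data (about 95% vs 5%).
X, y = make_classification(n_samples=200, n_features=2000, n_informative=50,
                           weights=[0.95, 0.05], random_state=0)
print("original:", Counter(y))

for name, sampler in [("RUS", RandomUnderSampler(random_state=0)),
                      ("ROS", RandomOverSampler(random_state=0)),
                      ("SMOTE", SMOTE(k_neighbors=3, random_state=0))]:
    X_res, y_res = sampler.fit_resample(X, y)
    print(name, Counter(y_res))
```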

    DeepSign: Deep On-Line Signature Verification

    Deep learning has become a remarkable technology in recent years, overcoming traditional handcrafted approaches and even humans in many different tasks. However, in some tasks, such as the verification of handwritten signatures, the amount of publicly available data is scarce, which makes it difficult to test the real limits of deep learning. In addition to the lack of public data, it is not easy to evaluate the improvements of newly proposed approaches, as different databases and experimental protocols are usually considered. The main contributions of this study are: i) we provide an in-depth analysis of state-of-the-art deep learning approaches for on-line signature verification; ii) we present and describe the new DeepSignDB on-line handwritten signature biometric public database; iii) we propose a standard experimental protocol and benchmark to be used by the research community in order to perform a fair comparison of novel approaches with the state of the art; and iv) we adapt and evaluate our recent deep learning approach named Time-Aligned Recurrent Neural Networks (TA-RNNs) for the task of on-line handwritten signature verification. This approach combines the potential of Dynamic Time Warping and Recurrent Neural Networks to train systems that are more robust against forgeries. Our proposed TA-RNN system outperforms the state of the art, achieving results below 2.0% EER when considering skilled forgery impostors and just one training signature per user.
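    To illustrate the Dynamic Time Warping component that TA-RNNs build on, the sketch below aligns and scores two variable-length on-line signature sequences with a plain NumPy DTW implementation. The function name, feature layout and random data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the DTW step underlying TA-RNNs: measuring the alignment cost
# between two time-sampled signature sequences before a recurrent network compares them.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a, b: (T, d) arrays of time-sampled features (e.g. x, y, pressure)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])

# Two signatures of different lengths (placeholder x, y, pressure samples).
sig_enrolled = np.random.rand(180, 3)
sig_test = np.random.rand(230, 3)
print("DTW distance:", dtw_distance(sig_enrolled, sig_test))
```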

    GANprintR: Improved Fakes and Evaluation of the State of the Art in Face Manipulation Detection

    © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    The availability of large-scale facial databases, together with the remarkable progress of deep learning technologies, in particular Generative Adversarial Networks (GANs), has led to the generation of extremely realistic fake facial content, raising obvious concerns about the potential for misuse. Such concerns have fostered research on manipulation detection methods that, contrary to humans, have already achieved astonishing results in various scenarios. In this study, we focus on the synthesis of entire facial images, which is a specific type of facial manipulation. The main contributions of this study are four-fold: i) a novel strategy to remove GAN 'fingerprints' from synthetic fake images based on autoencoders is described, in order to spoof facial manipulation detection systems while keeping the visual quality of the resulting images; ii) an in-depth analysis of the recent literature in facial manipulation detection; iii) a complete experimental assessment of this type of facial manipulation, considering the state-of-the-art fake detection systems (based on holistic deep networks, steganalysis, and local artifacts), highlighting how challenging this task is in unconstrained scenarios; and finally iv) we announce a novel public database, named iFakeFaceDB, resulting from the application of our proposed GAN-fingerprint Removal approach (GANprintR) to already very realistic synthetic fake images. The results obtained in our empirical evaluation show that additional efforts are required to develop robust facial manipulation detection systems against unseen conditions and spoofing techniques, such as the one proposed in this study.

    This work has been supported by projects: PRIMA (H2020-MSCA-ITN-2019-860315), TRESPASS-ETN (H2020-MSCA-ITN2019-860813), BIBECA (RTI2018-101248-B-I00 MINECO/FEDER), BioGuard (Ayudas Fundación BBVA a Equipos de Investigación Científica 2017), Accenture, by NOVA LINCS (UIDB/04516/2020) with the financial support of FCT - Fundação para a Ciência e a Tecnologia, through national funds, and by FCT/MCTES through national funds and co-funded by the EU under the project UIDB/EEA/50008/202.
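    The sketch below illustrates the autoencoder idea behind GANprintR: reconstructing a synthetic face through a small encoder-decoder so that high-frequency GAN "fingerprints" are smoothed away while the visible content is preserved. The PyTorch architecture, layer sizes and training objective are assumptions for illustration, not the network used by the authors.

```python
# Minimal sketch of GAN-fingerprint removal with a shallow convolutional autoencoder.
import torch
import torch.nn as nn

class FingerprintRemovalAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        # The bottleneck discards high-frequency detail, including GAN fingerprints.
        return self.decoder(self.encoder(x))

model = FingerprintRemovalAE()
fake_faces = torch.rand(4, 3, 128, 128)             # batch of synthetic images in [0, 1]
cleaned = model(fake_faces)                          # reconstructed, "fingerprint-free" images
loss = nn.functional.mse_loss(cleaned, fake_faces)   # reconstruction objective for training
print(cleaned.shape, float(loss))
```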

    BioTouchPass: Handwritten Passwords for Touchscreen Biometrics

    This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    This work enhances traditional authentication systems based on Personal Identification Numbers (PIN) and One-Time Passwords (OTP) through the incorporation of biometric information as a second level of user authentication. In our proposed approach, users draw each digit of the password on the touchscreen of the device instead of typing them as usual. A complete analysis of our proposed biometric system is carried out regarding the discriminative power of each handwritten digit and the robustness when increasing the length of the password and the number of enrolment samples. The new e-BioDigit database, which comprises on-line handwritten digits from 0 to 9, has been acquired using the finger as input on a mobile device. This database is used in the experiments reported in this work and is available, together with benchmark results, on GitHub. Finally, we discuss specific details for the deployment of our proposed approach on current PIN and OTP systems, achieving Equal Error Rates (EERs) of ca. 4.0% when the attacker knows the password. These results encourage the deployment of our proposed approach in comparison to traditional PIN and OTP systems, where the attack would have a 100% success rate under the same impostor scenario.

    This work has been supported by projects: BIBECA (MINECO), Bio-Guard (Ayudas Fundación BBVA a Equipos de Investigación Científica 2017) and by UAM-CecaBank. Ruben Tolosana is supported by an FPU Fellowship from the Spanish MEC.
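    As a minimal sketch of how an Equal Error Rate such as the ~4.0% reported above is computed from verification scores, the snippet below derives the EER from genuine and impostor score distributions with scikit-learn. The score distributions are synthetic placeholders, not the e-BioDigit benchmark scores.

```python
# Minimal sketch: computing an Equal Error Rate (EER) from comparison scores.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
genuine_scores = rng.normal(0.7, 0.1, 500)    # user compared against their own enrolment
impostor_scores = rng.normal(0.4, 0.1, 500)   # attacker who knows the password imitates the digits

labels = np.concatenate([np.ones_like(genuine_scores), np.zeros_like(impostor_scores)])
scores = np.concatenate([genuine_scores, impostor_scores])

fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
eer = fpr[np.nanargmin(np.abs(fnr - fpr))]    # operating point where FAR == FRR
print(f"EER = {eer:.2%}")
```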

    Parallel border tracking in binary images for multicore computers

    Border tracking in binary images is an important operation in many computer vision applications. The problem consists in finding the borders in a 2D binary image (where every pixel is either 0 or 1). There are several algorithms available for this problem, but most of them are sequential. In a previous paper, a parallel border tracking algorithm was proposed. That algorithm was designed to run on Graphics Processing Units (GPUs) and was based on the sequential algorithm known as the Suzuki algorithm. In this paper, we adapt the previously proposed GPU algorithm so that it can be executed on multicore computers. The resulting algorithm is evaluated against its GPU counterpart. The results show that the performance of the GPU algorithm worsens (or it even fails) for very large images or images with many borders. The proposed multicore algorithm, on the other hand, can efficiently cope with large images.

    Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has been partially supported by the Spanish Ministry of Science, Innovation, and Universities, jointly with the European Union, through Grants RTI2018-098085-BC41, PID2021-125736OB-I00 and PID2020-113656RB-C22 (MCIN/AEI/10.13039/501100011033/, "ERDF A way of making Europe"). The GVA has also partially supported this research through project PROMETEO/2019/109.

    García Mollá, V. M.; Alonso-Jordá, P. (2023). Parallel border tracking in binary images for multicore computers. The Journal of Supercomputing, 79:9915-9931. https://doi.org/10.1007/s11227-023-05052-2
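    For context, the sketch below runs sequential Suzuki-style border tracking with OpenCV's findContours and then naively distributes a batch of binary images across worker processes. This is only an illustration of the operation; the paper's contribution, parallelizing the tracking within a single large image on a multicore CPU, is a different and more involved algorithm, and the images here are random placeholders.

```python
# Minimal sketch: sequential border tracking per image (OpenCV's implementation of
# the Suzuki algorithm) plus naive multicore parallelism over a batch of images.
from concurrent.futures import ProcessPoolExecutor

import cv2
import numpy as np

def track_borders(binary_img: np.ndarray):
    """Return the list of borders (contours) of a 0/1 binary image."""
    contours, _ = cv2.findContours(binary_img.astype(np.uint8),
                                   cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    return contours

if __name__ == "__main__":
    # Batch of random binary images used only as placeholders.
    images = [(np.random.rand(1024, 1024) > 0.5).astype(np.uint8) for _ in range(8)]
    with ProcessPoolExecutor() as pool:
        all_borders = list(pool.map(track_borders, images))
    print([len(b) for b in all_borders])
```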

    SELM: Siamese extreme learning machine with application to face biometrics

    This version of the article has been accepted for publication, after peer review (when applicable), and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s00521-022-07100-z

    Extreme learning machine (ELM) is a powerful classification method and is very competitive among existing classification methods. It is very fast to train. Nevertheless, it cannot perform face verification tasks properly, because face verification requires comparing the facial images of two individuals simultaneously and deciding whether the two faces belong to the same person. The ELM structure was not designed to be fed two input data streams simultaneously. Thus, in two-input scenarios, ELM methods are typically applied using concatenated inputs. However, this setup consumes twice the computational resources, and it is not optimized for recognition tasks where learning a separable distance metric is critical. For these reasons, we propose and develop a Siamese extreme learning machine (SELM). SELM is designed to be fed with two data streams in parallel. It utilizes a dual-stream Siamese condition in an extra Siamese layer to transform the data before passing it to the hidden layer. Moreover, we propose a Gender-Ethnicity-dependent triplet feature trained exclusively on specific demographic groups, which enables learning and extracting useful facial features for each group. Experiments were conducted to evaluate and compare the performances of SELM, ELM, and a deep convolutional neural network (DCNN). The experimental results showed that the proposed feature could perform correct classification at 97.87% accuracy and 99.45% area under the curve (AUC), and that using SELM in conjunction with the proposed feature provided 98.31% accuracy and 99.72% AUC. SELM thus outperformed the well-known DCNN and ELM methods.

    This work was supported by the Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang, and by projects BIBECA (RTI2018-101248-B-I00 MINECO/FEDER) and BBforTAI (PID2021-127641OB-I00 MICINN/FEDER).
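    The sketch below shows the basic ELM building block referred to in the abstract: a random, untrained hidden layer followed by output weights solved in closed form with a pseudoinverse. It is a plain NumPy illustration under assumed toy data; the Siamese layer and the Gender-Ethnicity-dependent triplet features of SELM are not reproduced here.

```python
# Minimal sketch of a basic Extreme Learning Machine (ELM).
import numpy as np

class ELM:
    def __init__(self, n_hidden: int = 256, seed: int = 0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X: np.ndarray, y: np.ndarray):
        n_features = X.shape[1]
        self.W = self.rng.normal(size=(n_features, self.n_hidden))  # random input weights
        self.b = self.rng.normal(size=self.n_hidden)                 # random biases
        H = np.tanh(X @ self.W + self.b)                             # hidden-layer activations
        self.beta = np.linalg.pinv(H) @ y                            # closed-form output weights
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return np.tanh(X @ self.W + self.b) @ self.beta

# Toy verification-style targets (+1 same person, -1 different person) on placeholder embeddings.
X = np.random.rand(200, 128)
y = np.where(np.random.rand(200) > 0.5, 1.0, -1.0)
scores = ELM(n_hidden=256).fit(X, y).predict(X)
print("train accuracy:", np.mean(np.sign(scores) == y))
```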