6,995 research outputs found

    Iris classification based on sparse representations using on-line dictionary learning for large-scale de-duplication applications

    De-duplication of biometrics does not scale when a unique identity must be created for every person and the number of people to be enrolled in the biometric system runs into billions. In this paper, we propose an iris classification based on sparse representation of log-Gabor wavelet features using on-line dictionary learning (ODL) for large-scale de-duplication applications. Iris classes based on iris fiber structures, namely stream, flower, jewel and shaker, are used for faster retrieval of identities. Also, an iris adjudication process is illustrated by comparing the matched iris-pair images side by side to make the decision on the identification score using color coding. Iris classification and adjudication are included in the iris de-duplication architecture to speed up the identification process and to reduce identification errors. The efficacy of the proposed classification approach is demonstrated on the standard iris database, UPOL.

    Multi-modal gated recurrent units for image description

    Using a natural language sentence to describe the content of an image is a challenging but very important task. It is challenging because a description must not only capture the objects contained in the image and the relationships among them, but also be relevant and grammatically correct. In this paper, we propose a multi-modal embedding model based on gated recurrent units (GRU) which can generate variable-length descriptions for a given image. In the training step, we apply a convolutional neural network (CNN) to extract the image feature. The feature is then fed into the multi-modal GRU together with the corresponding sentence representations, and the multi-modal GRU learns the inter-modal relations between image and sentence. In the testing step, when an image is fed into our multi-modal GRU model, a sentence which describes the image content is generated. The experimental results demonstrate that our multi-modal GRU model obtains state-of-the-art performance on the Flickr8K, Flickr30K and MS COCO datasets.
    Comment: 25 pages, 7 figures, 6 tables, magazin

    Towards Odor-Sensitive Mobile Robots

    J. Monroy, J. Gonzalez-Jimenez, "Towards Odor-Sensitive Mobile Robots", Electronic Nose Technologies and Advances in Machine Olfaction, IGI Global, pp. 244–263, 2018, doi:10.4018/978-1-5225-3862-2.ch012. Preprint version, with permission of the publisher.
    Out of all the components of a mobile robot, its sensorial system is undoubtedly among the most critical ones when operating in real environments. Until now, these sensorial systems have mostly relied on range sensors (laser scanner, sonar, active triangulation) and cameras. While electronic noses have barely been employed, they can provide complementary sensory information that is vital for some applications, as it is for humans. This chapter analyzes the motivation for providing a robot with gas-sensing capabilities and also reviews some of the hurdles that are preventing smell from achieving the importance of other sensing modalities in robotics. The achievements made so far are reviewed to illustrate the current status of the three main fields within robotic olfaction: the classification of volatile substances, the spatial estimation of the gas dispersion from sparse measurements, and the localization of the gas source within a known environment.

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments on two well-known datasets (Weizmann, MuHAVi) yield a remarkable improvement in classification accuracy. © 2011 IEEE
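    The robustness claim in the abstract above can be illustrated with a minimal sketch that compares the log-density a standard Gaussian and a Student's t-distribution assign to an outlying point. This is not the paper's HMM; the function names, the degrees of freedom (nu = 3) and the test points are assumptions chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, nu=3.0, mu=0.0, sigma=1.0):
    """Log-density of a location-scale Student's t with nu degrees of freedom.

    nu=3 is an illustrative assumption, not a value taken from the paper.
    """
    z = (x - mu) / sigma
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
    )
    return log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)


if __name__ == "__main__":
    # A typical point (0), a moderate deviation (2) and a gross outlier (8).
    for x in (0.0, 2.0, 8.0):
        print(f"x={x:4.1f}  gaussian={gaussian_logpdf(x):8.3f}  "
              f"student_t={student_t_logpdf(x):8.3f}")
```

    The Gaussian log-density falls quadratically with the distance from the mean, whereas the t log-density falls only logarithmically, so a single outlier contributes a much smaller penalty to the likelihood under the t model.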

    Modeling Taxi Drivers' Behaviour for the Next Destination Prediction

    In this paper, we study how to model taxi drivers' behaviour and geographical information for an interesting and challenging task: the next destination prediction in a taxi journey. Predicting the next location is a well-studied problem in human mobility, which finds several applications in real-world scenarios, from optimizing the efficiency of electronic dispatching systems to predicting and reducing traffic jams. This task is normally modeled as a multiclass classification problem, where the goal is to select, among a set of already known locations, the next taxi destination. We present a Recurrent Neural Network (RNN) approach that models the taxi drivers' behaviour and encodes the semantics of visited locations by using geographical information from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to predict the exact coordinates of the next destination, overcoming the limitation of producing as output only a limited set of locations seen during the training phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge 2015 dataset, based on the city of Porto, obtaining better results than the competition winner whilst using less information, and on Manhattan and San Francisco datasets.
    Comment: preprint version of a paper submitted to IEEE Transactions on Intelligent Transportation System