
    Addressing missing values in kernel-based multimodal biometric fusion using neutral point substitution

    In multimodal biometric information fusion, it is common to encounter missing modalities for which matching cannot be performed; at the match score level, this means that scores will be missing. We address the multimodal fusion problem involving missing modalities (scores) using support vector machines with the Neutral Point Substitution (NPS) method. The approach starts by processing each modality using a kernel. When a modality is missing, it is substituted at the kernel level by one that is unbiased with regard to the classification, called a neutral point. Critically, unlike conventional missing-data substitution methods, explicit calculation of the neutral points can be omitted, since they are implicitly incorporated within the SVM training framework. Experiments based on the publicly available Biosecure DS2 multimodal (score) data set show that the SVM-NPS approach achieves very good generalization performance compared to sum-rule fusion, especially under severe missing-modality conditions.
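
    A minimal sketch of the idea in the linear-kernel, score-level case is given below. The synthetic scores, the helper neutral_score and its explicit computation are illustrative assumptions; the paper itself notes that neutral points can be incorporated implicitly within SVM training rather than computed explicitly.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic genuine (label 1) and impostor (label 0) score pairs for two modalities.
genuine  = rng.normal(loc=[2.0, 2.5], scale=0.6, size=(200, 2))
impostor = rng.normal(loc=[0.0, 0.0], scale=0.6, size=(200, 2))
X = np.vstack([genuine, impostor])
y = np.hstack([np.ones(200), np.zeros(200)])

# Fusion SVM trained on complete score vectors.
fusion = SVC(kernel="linear").fit(X, y)

def neutral_score(scores, labels):
    """A score on the per-modality decision boundary, i.e. one that neither
    supports nor opposes acceptance (computed explicitly here for clarity)."""
    svm = SVC(kernel="linear").fit(scores.reshape(-1, 1), labels)
    w, b = svm.coef_[0, 0], svm.intercept_[0]
    return -b / w

n2 = neutral_score(X[:, 1], y)

# Probe with modality 2 missing: substitute the neutral score and fuse as usual.
probe = np.array([[1.8, n2]])
print("fused decision value:", fusion.decision_function(probe)[0])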

    Bellman Functions on Trees for Segmentation, Generalized Smoothing, Matching and Multi-Alignment in Massive Data Sets

    A massive data set is considered as a set of experimentally acquired values of a number of variables, each of which is associated with the respective node of an undirected adjacency graph that predefines the fixed structure of the data set. The class of data analysis problems under consideration is outlined by the assumption that the ultimate aim of processing can be represented as a transformation of the original data array into a secondary array of the same structure, but with node variables of, generally speaking, a different nature, i.e. different ranges. Such a generalized problem is stated as the formal problem of optimization (minimization or maximization) of a real-valued objective function of all the node variables. The objective function is assumed to consist of additive constituents of one or two arguments, respectively node and edge functions. The former carry the data-dependent information on the sought-for values of the secondary variables, whereas the latter ones are mean..
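
    A minimal sketch of the dynamic-programming scheme such objective functions admit, under the assumptions of a chain-structured adjacency graph (a special case of a tree), quadratic node and edge functions, and node variables discretized onto a finite grid; the function name smooth_chain and all parameter values are illustrative, not the authors' implementation.

import numpy as np

def smooth_chain(y, grid, lam):
    """Minimize sum_t (x_t - y_t)^2 + lam * sum_t (x_t - x_{t-1})^2 over a grid."""
    T, K = len(y), len(grid)
    node = (grid[None, :] - y[:, None]) ** 2            # node functions V_t(x_t)
    edge = lam * (grid[:, None] - grid[None, :]) ** 2   # edge functions U(x_{t-1}, x_t)

    bellman = node[0].copy()                            # B_1(x_1) = V_1(x_1)
    argmin = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = bellman[:, None] + edge                 # B_{t-1}(x_{t-1}) + U(x_{t-1}, x_t)
        argmin[t] = total.argmin(axis=0)
        bellman = node[t] + total.min(axis=0)           # B_t(x_t)

    # Backtrack the minimizing sequence of grid indices.
    idx = np.empty(T, dtype=int)
    idx[-1] = bellman.argmin()
    for t in range(T - 1, 0, -1):
        idx[t - 1] = argmin[t, idx[t]]
    return grid[idx]

y = np.sin(np.linspace(0, 3, 60)) + 0.3 * np.random.default_rng(1).normal(size=60)
x_smooth = smooth_chain(y, grid=np.linspace(-2, 2, 101), lam=5.0)

    The same two-pass structure (an upward pass accumulating Bellman functions and a backward pass recovering the minimizer) carries over from chains to arbitrary trees by processing the nodes from the leaves towards a chosen root.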

    Optimization Algorithms for Separable Functions With Tree-Like Adjacency of Variables and Their Application to the Analysis of Massive Data Sets

    A massive data set is considered as a set of experimentally acquired values of a number of variables, each of which is associated with the respective node of an undirected adjacency graph that predefines the fixed structure of the data set. The class of data analysis problems under consideration is outlined by the assumption that the ultimate aim of processing can be represented as a transformation of the original data array into a secondary array of the same structure, but with node variables of, generally speaking, a different nature, i.e. different ranges. Such a generalized problem is stated as the formal problem of optimization (minimization or maximization) of a real-valued objective function of all the node variables. The objective function is assumed to consist of additive constituents of one or two arguments, respectively node and edge functions. The former carry the data-dependent information on the sought-for values of the secondary variables, whereas the latter ones are mean..
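
    A minimal sketch of the tree-structured case under the assumption of discrete node variables: each node's Bellman function is assembled from those of its children, so the global minimum of the separable objective is found in a single upward pass over the tree. The data structures and the toy costs below are illustrative, not the authors' implementation.

import numpy as np

def tree_min(children, node_cost, edge_cost, root=0):
    """Minimize sum_i V_i(x_i) + sum_(i,j) U_ij(x_i, x_j) over a rooted tree.

    children[i]       -- list of child nodes of node i
    node_cost[i]      -- length-K array of V_i over the admissible values of x_i
    edge_cost[(i, j)] -- K x K array of U_ij(x_i, x_j) for the edge from i to child j
    """
    def bellman(i):
        B = node_cost[i].astype(float)
        for j in children[i]:
            # For every value of x_i, minimize U_ij(x_i, x_j) + B_j(x_j) over x_j.
            B = B + (edge_cost[(i, j)] + bellman(j)[None, :]).min(axis=1)
        return B
    return bellman(root).min()

# Toy example: a root with two leaves and K = 3 admissible values per node.
K = 3
children = {0: [1, 2], 1: [], 2: []}
node_cost = {i: np.random.default_rng(i).random(K) for i in range(3)}
edge_cost = {(0, j): np.abs(np.subtract.outer(np.arange(K), np.arange(K)))
             for j in (1, 2)}
print("minimal value of the objective:", tree_min(children, node_cost, edge_cost))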

    Massive Data Set Analysis in Seismic Explorations for Oil and Gas in Crystalline Basement Interval

    On the basis of the optimization-based approach to the analysis of massive ordered data sets, a new method is proposed for computer-aided interpretation of seismic exploration data from the so-called crystalline basement of the Earth's crust, which underlies the relatively thin sedimentary cover that has, until now, been almost the exclusive object of seismic exploration. The seismic exploration data sets, seismic sections and cubes, are two- and three-dimensional data arrays, respectively, analyzed in the course of prospecting for oil and gas reserves with the purpose of studying the structure of the underground rock mass. The seismic data sets consist of synchronous records of reflected seismic signals registered by a large number of geophones (seismic sensors) placed along a straight line or at the nodes of a rectangular lattice on the Earth's surface. The source of the initial seismic pulse is usually a series of explosions, responses to which are averaged i..
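
    Purely as an illustration of how such a data array might be organized (the acquisition geometry, the sampling and the simple averaging of repeated shot records below are assumptions, not details taken from the paper):

import numpy as np

# Assumed acquisition geometry: 8 explosions recorded by a 12 x 20 lattice of
# geophones, each trace sampled at 2000 time points.
n_shots, nx, ny, n_samples = 8, 12, 20, 2000
rng = np.random.default_rng(0)

# One record per explosion: surface position x time sample.
shots = rng.normal(size=(n_shots, nx, ny, n_samples))

# Averaging the responses to the series of explosions gives a three-dimensional
# seismic cube indexed by the two surface coordinates and the travel time;
# a single geophone line would give a two-dimensional seismic section instead.
cube = shots.mean(axis=0)                 # shape: (nx, ny, n_samples)
section = cube[:, 0, :]                   # one line of the lattice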

    Multi-class Classification in Big Data

    The paper suggests an on-line multi-class classifier with sublinear computational complexity in the number of training objects. The proposed approach is based on combining two-class probabilistic classifiers. Pairwise coupling is a popular multi-class classification method that combines the comparisons for all pairs of classes. Unfortunately, pairwise coupling often suffers from an incompatibility: in some regions of the input space the class probabilities do not sum to one. In this paper we propose an optimal approximation of the probabilities at each point of the object space. The paper also proposes a new probabilistic interpretation of the Support Vector Machine for obtaining class probabilities: we show how the SVM can be viewed as a maximum likelihood estimate of a class of probabilistic models. As a computational method for big data we use stochastic gradient descent, minimizing the primal SVM objective directly. Unfortunately, the hinge loss of the true SVM classifier does not allow the SGD procedure to determine the classifier bias. In this paper we propose a piecewise-quadratic loss that overcomes this obstacle and provides a way to obtain the bias from the SGD procedure.
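
    A minimal sketch of the general idea rather than the authors' exact formulation: stochastic gradient descent on a regularized primal linear-SVM objective with a piecewise-quadratic (Huberized) hinge loss, whose quadratic segment yields a smooth gradient so the bias term can be updated jointly with the weights. The smoothing width h, the learning rate and the toy data are assumptions.

import numpy as np

def dloss(margin, h=0.5):
    """Derivative of the piecewise-quadratic hinge with respect to the margin."""
    if margin >= 1.0:
        return 0.0                       # flat segment: correctly classified with margin
    if margin >= 1.0 - h:
        return -(1.0 - margin) / h       # quadratic segment near the margin
    return -1.0                          # linear segment for badly misclassified points

def sgd_svm(X, y, lam=1e-3, lr=0.05, epochs=20, h=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            g = dloss(y[i] * (X[i] @ w + b), h)
            w -= lr * (lam * w + g * y[i] * X[i])
            b -= lr * g * y[i]           # the bias is updated by the same SGD step
    return w, b

# Toy two-class data with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 1.0, (200, 2)), rng.normal(-1.0, 1.0, (200, 2))])
y = np.hstack([np.ones(200), -np.ones(200)])
w, b = sgd_svm(X, y)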