
    A New Measure for Analyzing and Fusing Sequences of Objects

    This work addresses the combinatorial data analysis problem of seriation, used for data visualization and exploratory analysis. Seriation re-sequences the data so that more similar samples or objects appear closer together, whereas dissimilar ones are further apart. Despite the large number of algorithms for such re-sequencing, there has been no systematic way to analyze the resulting sequences, compare them, or fuse them into a single unifying one. We propose a new positional proximity measure that evaluates the similarity of two arbitrary sequences based on their agreement on the pairwise positional information of the sequenced objects. Furthermore, we present various statistical properties of this measure, as well as its normalized version modeled as an instance of the generalized correlation coefficient. Based on this measure, we define a new procedure for consensus seriation that fuses multiple arbitrary sequences through a quadratic assignment problem formulation and an efficient way of approximating its solution. We also derive theoretical links with other permutation distance functions and present their associated combinatorial optimization forms for consensus tasks. The utility of the proposed contributions is demonstrated through the comparison and fusion of multiple seriation algorithms we have implemented, using many real-world datasets from different application domains.
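    The abstract does not give the exact form of the positional proximity measure. As an illustrative sketch only, a generalized-correlation-style agreement score between two orderings of the same objects can be computed from their signed pairwise position gaps (the function name `positional_agreement` and its formula are assumptions, not the paper's measure):

    ```python
    from itertools import combinations

    def positional_agreement(seq_a, seq_b):
        """Illustrative pairwise-positional similarity between two orderings
        of the same objects (hypothetical; not the paper's exact measure).

        For every object pair (x, y), take the signed position gap in each
        sequence and correlate the two sets of gaps -- an instance of the
        generalized correlation coefficient family.
        """
        pos_a = {obj: i for i, obj in enumerate(seq_a)}
        pos_b = {obj: i for i, obj in enumerate(seq_b)}
        assert pos_a.keys() == pos_b.keys(), "sequences must order the same objects"

        gaps_a, gaps_b = [], []
        for x, y in combinations(seq_a, 2):
            gaps_a.append(pos_a[x] - pos_a[y])
            gaps_b.append(pos_b[x] - pos_b[y])

        # Pearson correlation of the pairwise position gaps.
        n = len(gaps_a)
        mean_a = sum(gaps_a) / n
        mean_b = sum(gaps_b) / n
        cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(gaps_a, gaps_b))
        sd_a = sum((a - mean_a) ** 2 for a in gaps_a) ** 0.5
        sd_b = sum((b - mean_b) ** 2 for b in gaps_b) ** 0.5
        return cov / (sd_a * sd_b)
    ```

    Identical sequences score 1.0 and exact reversals score -1.0, matching the intuition that a reversed seriation preserves all pairwise proximities while flipping every direction.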

    Robust unattended and stolen object detection by fusing simple algorithms

    J. C. San Miguel and J. M. Martínez, "Robust unattended and stolen object detection by fusing simple algorithms", in IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance (AVSS '08), 2008, pp. 18-25.
    In this paper, a new approach for detecting unattended or stolen objects in surveillance video is proposed. It is based on the fusion of evidence provided by three simple detectors. As a first step, the moving regions in the scene are detected and tracked. Then, these regions are classified as static or dynamic objects and as human or nonhuman objects. Finally, objects detected as static and nonhuman are analyzed by each detector. Data from these detectors are fused to select the best detection hypotheses. Experimental results show that the fusion-based approach increases detection reliability compared to the individual detectors and performs well across multiple scenarios while operating in real time.
    This work is supported by Cátedra Infoglobal-UAM for "Nuevas Tecnologías de video aplicadas a la seguridad", by the Spanish Government (TEC2007-65400 SemanticVideo), by the Comunidad de Madrid (S-050/TIC-0223 ProMultiDis-CM), by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
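    The abstract does not specify the fusion rule itself. As a minimal sketch under assumed inputs (hypothetical per-detector confidence scores and a made-up threshold), evidence from the three detectors could be averaged per hypothesis and the best hypothesis kept only if it clears the threshold:

    ```python
    def fuse_detections(scores, threshold=0.5):
        """Minimal evidence-fusion sketch (hypothetical; the paper's actual
        fusion scheme is not described in the abstract).

        `scores` maps each hypothesis ('unattended' or 'stolen') to the
        confidence values reported by the three simple detectors.  The
        fused score is the mean; the best hypothesis is returned only if
        its fused score reaches the threshold, else None.
        """
        fused = {hyp: sum(vals) / len(vals) for hyp, vals in scores.items()}
        best = max(fused, key=fused.get)
        return best if fused[best] >= threshold else None

    # Example: three detectors strongly agree the object is unattended.
    result = fuse_detections({"unattended": [0.9, 0.8, 0.7],
                              "stolen": [0.2, 0.1, 0.3]})
    ```

    Averaging is only one possible combination rule; weighted voting or Bayesian combination would fit the same interface.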

    Probabilistic Clustering of Sequences: Inferring new bacterial regulons by comparative genomics

    Genome-wide comparisons between enteric bacteria yield large sets of conserved putative regulatory sites on a gene-by-gene basis that need to be clustered into regulons. Using the assumption that regulatory sites can be represented as samples from weight matrices, we derive a unique probability distribution for assignments of sites into clusters. Our algorithm, 'PROCSE' (probabilistic clustering of sequences), uses Monte-Carlo sampling of this distribution to partition and align thousands of short DNA sequences into clusters. The algorithm internally determines the number of clusters from the data and assigns significance to the resulting clusters. We place theoretical limits on the ability of any algorithm to correctly cluster sequences drawn from weight matrices (WMs) when these WMs are unknown. Our analysis suggests that the set of all putative sites for a single genome (e.g. E. coli) is largely inadequate for clustering. When sites from different genomes are combined and all the homologous sites from the various species are used as a block, clustering becomes feasible. We predict 50-100 new regulons as well as many new members of existing regulons, potentially doubling the number of known regulatory sites in E. coli.
    Comment: 27 pages including 9 figures and 3 tables
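    The weight-matrix assumption can be illustrated with the standard construction it refers to: scoring a short DNA site by its log-likelihood under a position weight matrix. This helper and the example matrix are illustrative only, not part of PROCSE itself:

    ```python
    import math

    def wm_log_likelihood(seq, wm):
        """Log-likelihood of a DNA site under a weight matrix (WM).

        `wm` is a list of per-position base-probability dicts -- the model
        the clustering assumes regulatory sites are sampled from
        (illustrative helper; the paper derives the full clustering
        distribution from this assumption).
        """
        assert len(seq) == len(wm), "site length must match WM length"
        return sum(math.log(wm[i][base]) for i, base in enumerate(seq))

    # Example WM for a 3-bp site (made-up probabilities).
    example_wm = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
        {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    ]
    ```

    Sites consistent with a WM score higher under it than under unrelated WMs, which is the signal any clustering of sites into regulons must exploit.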