    Visual and computational analysis of structure-activity relationships in high-throughput screening data

    Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets.

    Topology-preserving perceptual segmentation using the Combinatorial Pyramid

    Scene understanding and other high-level visual tasks usually rely on segmenting the captured images in order to work with a more efficient mid-level representation. Although this segmentation stage usually considers topological constraints on the set of obtained regions (e.g., their internal connectivity), the importance of preserving the topological relationships among regions is typically not taken into account. Unlike other similar approaches, this paper presents a bottom-up approach for perceptual segmentation of natural images which preserves the topology of the image. The segmentation algorithm consists of two consecutive stages: first, the input image is partitioned into a set of blobs of uniform colour (pre-segmentation stage); then, using a more complex distance which integrates edge and region descriptors, these blobs are hierarchically merged (perceptual grouping). Both stages are addressed using the Combinatorial Pyramid, a hierarchical structure which can correctly encode relationships among image regions at upper levels. The performance of the proposed approach has been initially evaluated with respect to ground-truth segmentation data using the Berkeley Segmentation Dataset and Benchmark. Although additional descriptors must be added to deal with small regions and textured surfaces, experimental results reveal that the proposed perceptual grouping provides satisfactory scores.

    Indexability, concentration, and VC theory

    Degrading performance of indexing schemes for exact similarity search in high dimensions has long been linked to histograms of distributions of distances and other 1-Lipschitz functions getting concentrated. We discuss this observation in the framework of the phenomenon of concentration of measure on the structures of high dimension and the Vapnik-Chervonenkis theory of statistical learning. Comment: 17 pages, final submission to J. Discrete Algorithms (an expanded, improved and corrected version of the SISAP'2010 invited paper, this e-print, v3
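    The concentration of distances mentioned in this abstract can be illustrated with a small numerical experiment. The sketch below is not taken from the paper: the function name `relative_spread`, the uniform sampling in the unit cube, and the sample size are all assumptions made for illustration. It measures the ratio of standard deviation to mean of Euclidean distances from one query point to random points, which tends to shrink as the dimension grows.

```python
import math
import random


def relative_spread(dim, n_points=200, seed=0):
    """Std/mean of Euclidean distances from one query point to
    n_points uniform random points in the unit cube [0, 1]^dim.

    Hypothetical helper for illustration only; not from the paper.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    # A smaller ratio means the distance histogram is more concentrated,
    # so distances discriminate less between near and far points.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_spread(dim), 4))
```

    When the ratio is small, almost all points lie at nearly the same distance from the query, which gives one intuition for why distance-based pruning in an index becomes less effective.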

    Generating Second Order (Co)homological Information within AT-Model Context

    In this paper we design a new family of relations between (co)homology classes, working with coefficients in a field and starting from an AT-model (Algebraic Topological Model) AT(C) of a finite cell complex C. These relations are induced by elementary relations of type “to be in the (co)boundary of” between cells. This high-order connectivity information is embedded into a graph-based representation model, called Second Order AT-Region-Incidence Graph (or AT-RIG) of C. This graph, having as nodes the different homology classes of C, is, in turn, computed from two generalized abstract cell complexes, called primal and dual AT-segmentations of C. The respective cells of these two complexes are connected regions (sets of cells) of the original cell complex C, which are specified by the integral operator of AT(C). In this work in progress, we successfully use this model (a) in experiments for discriminating topologically different 3D digital objects having the same Euler characteristic and (b) in designing a parallel algorithm for computing potentially significant (co)homological information of 3D digital objects. Funding: Ministerio de Economía y Competitividad MTM2016-81030-P; Ministerio de Economía y Competitividad TEC2012-37868-C04-0

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but now principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
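    As one small illustration of a systematic statistical rule for flagging anomalous observations, the sketch below applies Tukey's interquartile-range fences. This choice is an assumption for illustration: the abstract does not name the techniques the survey covers, and the function name `iqr_outliers` and the multiplier `k=1.5` are hypothetical defaults, not taken from the paper.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences
    [Q1 - k*IQR, Q3 + k*IQR].

    Illustrative helper only; not a method attributed to the survey.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    readings = [10, 11, 12, 11, 10, 12, 11, 95]
    print(iqr_outliers(readings))
```

    A rule like this only flags candidates. Whether a flagged value is an error to remove or a genuine deviation to keep is a separate, domain-specific decision.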

    A Diagram Is Worth A Dozen Images

    Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural images has been extensively studied in computer vision, while diagram understanding has received little attention. In this paper, we study the problem of diagram interpretation and reasoning, the challenging task of identifying the structure of a diagram and the semantics of its constituents and their relationships. We introduce Diagram Parse Graphs (DPG) as our representation to model the structure of diagrams. We define syntactic parsing of diagrams as learning to infer DPGs for diagrams and study semantic interpretation and reasoning of diagrams in the context of diagram question answering. We devise an LSTM-based method for syntactic parsing of diagrams and introduce a DPG-based attention model for diagram question answering. We compile a new dataset of diagrams with exhaustive annotations of constituents and relationships for over 5,000 diagrams and 15,000 questions and answers. Our results show the significance of our models for syntactic parsing and question answering in diagrams using DPGs.

    Flow-based Influence Graph Visual Summarization

    Visually mining a large influence graph is appealing yet challenging. People are amazed by pictures of newscasting graphs on Twitter and engaged by hidden citation networks in academia, yet are often troubled by the poor readability of the underlying visualization. Existing summarization methods enhance the graph visualization with blocked views, but have an adverse effect on the latent influence structure. How can we visually summarize a large graph to maximize influence flows? In particular, how can we illustrate the impact of an individual node through the summarization? Can we maintain the appealing graph metaphor while preserving both the overall influence pattern and fine readability? To answer these questions, we first formally define the influence graph summarization problem. Second, we propose an end-to-end framework to solve the new problem. Our method can not only highlight the flow-based influence patterns in the visual summarization, but also inherently support rich graph attributes. Last, we present a theoretical analysis and report our experimental results. Both lines of evidence demonstrate that our framework can effectively approximate the proposed influence graph summarization objective while outperforming previous methods in a typical scenario of visually mining academic citation networks. Comment: to appear in IEEE International Conference on Data Mining (ICDM), Shen Zhen, China, December 201