3,196 research outputs found

    Application of Approximate Pattern Matching in Two Dimensional Spaces to Grid Layout for Biochemical Network Maps

    Get PDF
    Background For visualizing large-scale biochemical network maps, it is important to calculate the coordinates of molecular nodes quickly and to enhance the understanding or traceability of them. The grid layout is effective in drawing compact, orderly, balanced network maps with node label spaces, but existing grid layout algorithms often require a high computational cost because they have to consider complicated positional constraints through the entire optimization process. Results We propose a hybrid grid layout algorithm that consists of a non-grid, fast layout (preprocessor) algorithm and an approximate pattern matching algorithm that distributes the resultant preprocessed nodes on square grid points. To demonstrate the feasibility of the hybrid layout algorithm, it is characterized in terms of the calculation time, numbers of edge-edge and node-edge crossings, relative edge lengths, and F-measures. The proposed algorithm achieves outstanding performances compared with other existing grid layouts. Conclusions Use of an approximate pattern matching algorithm quickly redistributes the laid-out nodes by fast, non-grid algorithms on the square grid points, while preserving the topological relationships among the nodes. The proposed algorithm is a novel use of the pattern matching, thereby providing a breakthrough for grid layout. This application program can be freely downloaded from http://www.cadlive.jp/hybridlayout/hybridlayout.html

    Clustering for Data Reduction: A Divide and Conquer Approach

    Get PDF
    We consider the problem of reducing a potentially very large dataset to a subset of representative prototypes. Rather than searching over the entire space of prototypes, we first roughly divide the data into balanced clusters using bisecting k-means and spectral cuts, and then find the prototypes for each cluster by affinity propagation. We apply our algorithm to text data, where we perform an order of magnitude faster than simply looking for prototypes on the entire dataset. Furthermore, our "divide and conquer" approach actually performs more accurately on datasets which are well bisected, as the greedy decisions of affinity propagation are confined to classes of already similar items

    Space Partitioning Schemes and Algorithms for Generating Regular and Spiral Treemaps

    Full text link
    Treemaps have been widely applied to the visualization of hierarchical data. A treemap takes a weighted tree and visualizes its leaves in a nested planar geometric shape, with sub-regions partitioned such that each sub-region has an area proportional to the weight of its associated leaf nodes. Efficiently generating visually appealing treemaps that also satisfy other quality criteria is an interesting problem that has been tackled from many directions. We present an optimization model and five new algorithms for this problem, including two divide and conquer approaches and three spiral treemap algorithms. Our optimization model is able to generate superior treemaps that could serve as a benchmark for comparing the quality of more computationally efficient algorithms. Our divide and conquer and spiral algorithms either improve the performance of their existing counterparts with respect to aspect ratio and stability or perform competitively. Our spiral algorithms also expand their applicability to a wider range of input scenarios. Four of these algorithms are computationally efficient as well with quasilinear running times and the last algorithm achieves a cubic running time. A full version of this paper with all appendices, data, and source codes is available at \anonymizeOSF{\OSFSupplementText}

    Meshing Deforming Spacetime for Visualization and Analysis

    Full text link
    We introduce a novel paradigm that simplifies the visualization and analysis of data that have a spatially/temporally varying frame of reference. The primary application driver is tokamak fusion plasma, where science variables (e.g., density and temperature) are interpolated in a complex magnetic field-line-following coordinate system. We also see a similar challenge in rotational fluid mechanics, cosmology, and Lagrangian ocean analysis; such physics implies a deforming spacetime and induces high complexity in volume rendering, isosurfacing, and feature tracking, among various visualization tasks. Without loss of generality, this paper proposes an algorithm to build a simplicial complex -- a tetrahedral mesh, for the deforming 3D spacetime derived from two 2D triangular meshes representing consecutive timesteps. Without introducing new nodes, the resulting mesh fills the gap between 2D meshes with tetrahedral cells while satisfying given constraints on how nodes connect between the two input meshes. In the algorithm we first divide the spacetime into smaller partitions to reduce complexity based on the input geometries and constraints. We then independently search for a feasible tessellation of each partition taking nonconvexity into consideration. We demonstrate multiple use cases for a simplified visualization analysis scheme with a synthetic case and fusion plasma applications

    Eigenvector Synchronization, Graph Rigidity and the Molecule Problem

    Full text link
    The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology. In this paper, we extend on previous work and propose the 3D-ASAP algorithm, for the graph realization problem in R3\mathbb{R}^3, given a sparse and noisy set of distance measurements. 3D-ASAP is a divide and conquer, non-incremental and non-iterative algorithm, which integrates local distance information into a global structure determination. Our approach starts with identifying, for every node, a subgraph of its 1-hop neighborhood graph, which can be accurately embedded in its own coordinate system. In the noise-free case, the computed coordinates of the sensors in each patch must agree with their global positioning up to some unknown rigid motion, that is, up to translation, rotation and possibly reflection. In other words, to every patch there corresponds an element of the Euclidean group Euc(3) of rigid transformations in R3\mathbb{R}^3, and the goal is to estimate the group elements that will properly align all the patches in a globally consistent way. Furthermore, 3D-ASAP successfully incorporates information specific to the molecule problem in structural biology, in particular information on known substructures and their orientation. In addition, we also propose 3D-SP-ASAP, a faster version of 3D-ASAP, which uses a spectral partitioning algorithm as a preprocessing step for dividing the initial graph into smaller subgraphs. Our extensive numerical simulations show that 3D-ASAP and 3D-SP-ASAP are very robust to high levels of noise in the measured distances and to sparse connectivity in the measurement graph, and compare favorably to similar state-of-the art localization algorithms.Comment: 49 pages, 8 figure

    Adaptive content mapping for internet navigation

    Get PDF
    The Internet as the biggest human library ever assembled keeps on growing. Although all kinds of information carriers (e.g. audio/video/hybrid file formats) are available, text based documents dominate. It is estimated that about 80% of all information worldwide stored electronically exists in (or can be converted into) text form. More and more, all kinds of documents are generated by means of a text processing system and are therefore available electronically. Nowadays, many printed journals are also published online and may even discontinue to appear in print form tomorrow. This development has many convincing advantages: the documents are both available faster (cf. prepress services) and cheaper, they can be searched more easily, the physical storage only needs a fraction of the space previously necessary and the medium will not age. For most people, fast and easy access is the most interesting feature of the new age; computer-aided search for specific documents or Web pages becomes the basic tool for information-oriented work. But this tool has problems. The current keyword based search machines available on the Internet are not really appropriate for such a task; either there are (way) too many documents matching the specified keywords are presented or none at all. The problem lies in the fact that it is often very difficult to choose appropriate terms describing the desired topic in the first place. This contribution discusses the current state-of-the-art techniques in content-based searching (along with common visualization/browsing approaches) and proposes a particular adaptive solution for intuitive Internet document navigation, which not only enables the user to provide full texts instead of manually selected keywords (if available), but also allows him/her to explore the whole database

    Configurable Strategies for Work-stealing

    Full text link
    Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual task execution order is typically determined by the underlying task storage data structure, and cannot be changed. There are thus possibilities for optimizing task parallel executions by providing information on specific tasks and their preferred execution order to the scheduling system. We introduce scheduling strategies to enable applications to dynamically provide hints to the task-scheduling system on the nature of specific tasks. Scheduling strategies can be used to independently control both local task execution order as well as steal order. In contrast to conventional scheduling policies that are normally global in scope, strategies allow the scheduler to apply optimizations on individual tasks. This flexibility greatly improves composability as it allows the scheduler to apply different, specific scheduling choices for different parts of applications simultaneously. We present a number of benchmarks that highlight diverse, beneficial effects that can be achieved with scheduling strategies. Some benchmarks (branch-and-bound, single-source shortest path) show that prioritization of tasks can reduce the total amount of work compared to standard work-stealing execution order. For other benchmarks (triangle strip generation) qualitatively better results can be achieved in shorter time. Other optimizations, such as dynamic merging of tasks or stealing of half the work, instead of half the tasks, are also shown to improve performance. Composability is demonstrated by examples that combine different strategies, both within the same kernel (prefix sum) as well as when scheduling multiple kernels (prefix sum and unbalanced tree search)

    Integrated multiple sequence alignment

    Get PDF
    Sammeth M. Integrated multiple sequence alignment. Bielefeld (Germany): Bielefeld University; 2005.The thesis presents enhancements for automated and manual multiple sequence alignment: existing alignment algorithms are made more easily accessible and new algorithms are designed for difficult cases. Firstly, we introduce the QAlign framework, a graphical user interface for multiple sequence alignment. It comprises several state-of-the-art algorithms and supports their parameters by convenient dialogs. An alignment viewer with guided editing functionality can also highlight or print regions of the alignment. Also phylogenetic features are provided, e.g., distance-based tree reconstruction methods, corrections for multiple substitutions and a tree viewer. The modular concept and the platform-independent implementation guarantee an easy extensibility. Further, we develop a constrained version of the divide-and-conquer alignment such that it can be restricted by anchors found earlier with local alignments. It can be shown that this method shares attributes of both, local and global aligners, in the quality of results as well as in the computation time. We further modify the local alignment step to work on bipartite (or even multipartite) sets for sequences where repeats overshadow valuable sequence information. In the end a technique is established that can accurately align sequences containing eventually repeated motifs. Finally, another algorithm is presented that allows to compare tandem repeat sequences by aligning them with respect to their possible repeat histories. We describe an evolutionary model including tandem duplications and excisions, and give an exact algorithm to compare two sequences under this model
    corecore