
    Fast parallel algorithms for the unit cost editing distance between trees

    1. Problem. Ordered labeled trees are trees whose nodes are labeled and in which the left-to-right order among siblings is significant. We consider the distance between two trees to be the minimum number of edit operations (insert, delete, and modify) necessary to transform one tree into another. We present three algorithms to find the distance. The first algorithm is a simple dynamic programming algorithm based on a postorder traversal whose complexity improves upon the best previously published algorithm, due to Tai (T79 in JACM). The second and third algorithms are parallel algorithms based on the application of suffix trees to the comparison problem. The cost of executing these algorithms is a monotonically increasing function of the distance between the two trees. Results. Let trees T1 and T2 have numbers of levels L1 and L2 respectively. Let k be the actual distance between T1 and T2. Let N be min(|T1|, |T2|). The asymptotic running times (assuming a concurrent-read concurrent-write parallel random access machine) are listed by algorithm, time, and processors. Tai: |T1| × |T2| × L1^2 × L2^2. Alg 1: |T1| × |T2| × L1 × L…
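
The unit-cost distance defined in this abstract can be illustrated with a short sketch. This is not the postorder dynamic program or the parallel algorithms the abstract announces; it is a plain memoized recursion on the rightmost roots of two forests, and it is only practical for small trees. The tree encoding `(label, children)` and the names `forest_size`, `forest_dist`, and `tree_edit_distance` are assumptions made for illustration.

```python
from functools import lru_cache

# A tree is (label, children), where children is a tuple of trees.
# A forest is a tuple of trees. Tuples keep everything hashable for caching.

def forest_size(forest):
    """Number of nodes in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1:
        return forest_size(f2)          # insert every node of f2
    if not f2:
        return forest_size(f1)          # delete every node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        # delete the rightmost root of f1; its children take its place
        forest_dist(f1[:-1] + kids1, f2) + 1,
        # insert the rightmost root of f2
        forest_dist(f1, f2[:-1] + kids2) + 1,
        # map the two rightmost roots onto each other (modify if labels differ)
        forest_dist(kids1, kids2) + forest_dist(f1[:-1], f2[:-1]) + (label1 != label2),
    )

def tree_edit_distance(t1, t2):
    return forest_dist((t1,), (t2,))

# Example: swapping two leaf siblings costs two operations in an ordered tree.
left = ("a", (("b", ()), ("c", ())))
right = ("a", (("c", ()), ("b", ())))
print(tree_edit_distance(left, right))
```

Because sibling order is significant, the swapped pair cannot be matched for free, which is exactly the property that separates the ordered problem from the unordered one.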

    A clique-based method for the edit distance between unordered trees and its application to analysis of glycan structures

    [Background] Measuring similarities between tree-structured data is important for the analysis of RNA secondary structures, phylogenetic trees, glycan structures, and vascular trees. The edit distance is one of the most widely used measures for comparing tree-structured data. However, computing the edit distance for rooted unordered trees is known to be NP-hard. Furthermore, there is almost no available software tool that can compute the exact edit distance for unordered trees. [Results] In this paper, we present a practical method for computing the edit distance between rooted unordered trees. In this method, the edit distance problem for unordered trees is transformed into the maximum clique problem, and then efficient solvers for the maximum clique problem are applied. We applied the proposed method to similar structure search for glycan structures. The result suggests that our proposed method can efficiently compute the edit distance for moderate-size unordered trees. It also suggests that the accuracy of the proposed method is comparable to that of the edit distance for ordered trees and of an existing method for glycan search. [Conclusions] The proposed method is simple but useful for computing the edit distance between unordered trees. The object code is available upon request.

    A reliable order-statistics-based approximate nearest neighbor search algorithm

    We propose a new algorithm for fast approximate nearest neighbor search based on the properties of ordered vectors. Data vectors are classified based on the index and sign of their largest components, thereby partitioning the space into a number of cones centered at the origin. The query is itself classified, and the search starts from the selected cone and proceeds to neighboring ones. Overall, the proposed algorithm corresponds to locality-sensitive hashing in the space of directions, with hashing based on the order of components. Thanks to the statistical features emerging through ordering, it deals very well with the challenging case of unstructured data, and is a valuable building block for more complex techniques dealing with structured data. Experiments on both simulated and real-world data show that the proposed algorithm provides state-of-the-art performance.
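
The cone partitioning described in this abstract can be sketched roughly as follows. The sketch assumes that "largest" means largest in magnitude and hashes on a single component, whereas the abstract speaks of several largest components. It also searches only the query's own cone and does not proceed to neighboring cones. The names `cone_key`, `build_index`, and `search` are hypothetical.

```python
def cone_key(vector):
    """Hash a vector by the index and sign of its largest-magnitude component."""
    i = max(range(len(vector)), key=lambda j: abs(vector[j]))
    return (i, 1 if vector[i] >= 0 else -1)

def build_index(vectors):
    """Group vector positions by cone key."""
    index = {}
    for position, vector in enumerate(vectors):
        index.setdefault(cone_key(vector), []).append(position)
    return index

def search(query, vectors, index):
    """Return the position of the closest vector inside the query's cone, or None."""
    candidates = index.get(cone_key(query), [])
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(query, vectors[p])),
    )

vectors = [(1.0, 0.1), (-0.9, 0.2), (0.1, 2.0)]
index = build_index(vectors)
print(search((0.8, 0.0), vectors, index))
```

Returning `None` for an empty cone makes the approximation visible: a fuller version would fall back to adjacent cones, as the abstract describes.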

    Bayesian graph edit distance

    This paper describes a novel framework for comparing and matching corrupted relational graphs. The paper develops the idea of edit distance originally introduced for graph matching by Sanfeliu and Fu [1]. We show how the Levenshtein distance can be used to model the probability distribution for structural errors in the graph-matching problem. This probability distribution is used to locate matches using MAP label updates. We compare the resulting graph-matching algorithm with that recently reported by Wilson and Hancock. The use of edit distance offers an elegant alternative to the exhaustive compilation of label dictionaries. Moreover, the method is polynomial rather than exponential in its worst-case complexity. We support our approach with an experimental study on synthetic data and illustrate its effectiveness on an uncalibrated stereo correspondence problem. This demonstrates experimentally that the gain in efficiency is not at the expense of quality of match.
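
For reference, the Levenshtein distance named in this abstract can be computed with the standard two-row dynamic program sketched below. It works on plain sequences; how the paper turns graph neighborhoods into sequences and converts distances into probabilities for the MAP updates is not shown here. The function name `levenshtein` is an illustrative choice.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))      # distance from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        current = [i]                       # distance to the empty prefix of b
        for j, item_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # delete item_a
                current[j - 1] + 1,                     # insert item_b
                previous[j - 1] + (item_a != item_b),   # substitute or match
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))
```

The running time is proportional to the product of the two sequence lengths, which is consistent with the abstract's point that the approach is polynomial rather than exponential.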

    Computerized Analysis of Magnetic Resonance Images to Study Cerebral Anatomy in Developing Neonates

    The study of cerebral anatomy in developing neonates is of great importance for the understanding of brain development during the early period of life. This dissertation therefore focuses on three challenges in the modelling of cerebral anatomy in neonates during brain development. The methods that have been developed all use Magnetic Resonance Images (MRI) as source data. To facilitate study of vascular development in the neonatal period, a set of image analysis algorithms is developed to automatically extract and model cerebral vessel trees. The whole process consists of cerebral vessel tracking from automatically placed seed points, vessel tree generation, and vasculature registration and matching. These algorithms have been tested on clinical Time-of-Flight (TOF) MR angiographic datasets. To facilitate study of the neonatal cortex, a complete cerebral cortex segmentation and reconstruction pipeline has been developed. Segmentation of the neonatal cortex is not effectively done by existing algorithms designed for the adult brain because the contrast between grey and white matter is reversed. This causes pixels containing tissue mixtures to be incorrectly labelled by conventional methods. The neonatal cortical segmentation method that has been developed is based on a novel expectation-maximization (EM) method with explicit correction for mislabelled partial volume voxels. Based on the resulting cortical segmentation, an implicit surface evolution technique is adopted for the reconstruction of the cortex in neonates. The performance of the method is investigated by performing a detailed landmark study. To facilitate study of cortical development, a cortical surface registration algorithm for aligning the cortical surface is developed. The method first inflates extracted cortical surfaces and then performs a non-rigid surface registration using free-form deformations (FFDs) to remove residual misalignment. Validation experiments using data labelled by an expert observer demonstrate that the method can capture local changes and follow the growth of specific sulci.

    Identifying Real Estate Opportunities using Machine Learning

    The real estate market is exposed to many fluctuations in prices because of existing correlations with many variables, some of which cannot be controlled or might even be unknown. Housing prices can increase rapidly (or in some cases, also drop very fast), yet the numerous online listings where houses are sold or rented are not likely to be updated that often. In some cases, individuals interested in selling a house (or apartment) might include it in some online listing and forget about updating the price. In other cases, some individuals might deliberately set a price below the market price in order to sell the home faster, for various reasons. In this paper, we aim at developing a machine learning application that identifies opportunities in the real estate market in real time, i.e., houses that are listed with a price substantially below the market price. This program can be useful for investors interested in the housing market. We have focused on a use case considering real estate assets located in the Salamanca district in Madrid (Spain) and listed in the most relevant Spanish online site for home sales and rentals. The application is formally implemented as a regression problem that tries to estimate the market price of a house given features retrieved from public online listings. For building this application, we have performed a feature engineering stage in order to discover relevant features that allow for attaining a high predictive performance. Several machine learning algorithms have been tested, including regression trees, k-nearest neighbors, support vector machines and neural networks, identifying advantages and handicaps of each of them. Comment: 24 pages, 13 figures, 5 tables
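
The idea of flagging listings priced well below an estimated market price can be sketched with a small k-nearest-neighbors regressor, one of the model families the abstract mentions. Everything specific here is an assumption for illustration: the single feature (floor area), the toy prices, the 0.8 threshold, and the names `knn_estimate` and `is_opportunity`. None of it comes from the paper.

```python
def knn_estimate(features, listings, k=3):
    """Estimate a price as the mean price of the k most similar listings."""
    nearest = sorted(
        listings,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], features)),
    )[:k]
    return sum(price for _, price in nearest) / len(nearest)

def is_opportunity(features, asking_price, listings, k=3, threshold=0.8):
    """Flag a listing whose asking price is below threshold times the estimate."""
    return asking_price < threshold * knn_estimate(features, listings, k)

# Hypothetical comparable listings: ((floor area in square metres,), price)
listings = [((50,), 200.0), ((60,), 240.0), ((70,), 280.0), ((200,), 900.0)]

print(knn_estimate((60,), listings))
print(is_opportunity((60,), 150.0, listings))
```

With real data the features would need scaling before distances are compared, since a raw Euclidean distance lets the feature with the largest numeric range dominate.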

    Single-picture reconstruction and rendering of trees for plausible vegetation synthesis

    State-of-the-art approaches for tree reconstruction either put limiting constraints on the input side (requiring multiple photographs, a scanned point cloud or intensive user input) or provide a representation only suitable for front views of the tree. In this paper we present a complete pipeline for synthesizing and rendering detailed trees from a single photograph with minimal user effort. Since the overall shape and appearance of each tree is recovered from a single photograph of the tree crown, artists can benefit from georeferenced images to populate landscapes with native tree species. A key element of our approach is a compact representation of dense tree crowns through a radial distance map. Our first contribution is an automatic algorithm for generating such representations from a single exemplar image of a tree. We create a rough estimate of the crown shape by solving a thin-plate energy minimization problem, and then add detail through a simplified shape-from-shading approach. The use of seamless texture synthesis results in an image-based representation that can be rendered from arbitrary view directions at different levels of detail. Distant trees benefit from an output-sensitive algorithm inspired by relief mapping. For close-up trees we use a billboard cloud where leaflets are distributed inside the crown shape through a space colonization algorithm. In both cases our representation ensures efficient preservation of the crown shape. Major benefits of our approach include: it recovers the overall shape from a single tree image, involves no tree modeling knowledge and minimal authoring effort, and the associated image-based representation is easy to compress and thus suitable for network streaming. Peer Reviewed. Postprint (author's final draft).