24,474 research outputs found

    Wrapper Maintenance: A Machine Learning Approach

    The proliferation of online information sources has led to an increased use of wrappers for extracting data from Web sources. While most previous research has focused on quick and efficient generation of wrappers, the development of tools for wrapper maintenance has received less attention. This is an important research problem because Web sources often change in ways that prevent the wrappers from extracting data correctly. We present an efficient algorithm that learns structural information about data from positive examples alone. We describe how this information can be used for two wrapper maintenance applications: wrapper verification and reinduction. The wrapper verification system detects when a wrapper is not extracting correct data, usually because the Web source has changed its format. The reinduction algorithm automatically recovers from changes in the Web source by identifying data on Web pages so that a new wrapper may be generated for this source. To validate our approach, we monitored 27 wrappers over a period of a year. The verification algorithm correctly discovered 35 of the 37 wrapper changes, and made 16 mistakes, resulting in precision of 0.73 and recall of 0.95. We validated the reinduction algorithm on ten Web sources. We were able to successfully reinduce the wrappers, obtaining precision and recall values of 0.90 and 0.80 on the data extraction task.

    A Tabu Search Based Approach for Graph Layout

    This paper describes an automated tabu search based method for drawing general graph layouts with straight lines. To our knowledge, this is the first time tabu methods have been applied to graph drawing. We formulated the task as a multi-criteria optimization problem with a number of metrics which are used in a weighted fitness function to measure the aesthetic quality of the graph layout. The main goal of this work is to speed up the graph layout process without sacrificing layout quality. To achieve this, we use a tabu search based method that goes through a predefined number of iterations to minimize the value of the fitness function. Tabu search always chooses the best solution in the neighbourhood. This may lead to cycling, so a tabu list is used to store moves that are not permitted, meaning that the algorithm does not choose previous solutions for a set period of time. We evaluate the method according to the time spent to draw a graph and the quality of the drawn graphs. We give experimental results on random graphs and we provide statistical evidence that our method outperforms a fast search-based drawing method (hill climbing) in execution time while it produces comparably good graph layouts. We also demonstrate the method on real world graph datasets to show that we can reproduce similar results in a real world setting.
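    The loop described in this abstract (fixed iteration count, best neighbour chosen each step, tabu list against cycling) might be sketched roughly as follows. This is only an illustration, not the authors' implementation: the two-term `fitness`, its weights, the unit-step grid neighbourhood, and the names `tabu_layout`, `tenure`, and `ideal` are all assumptions introduced here.

```python
import math
from collections import deque

# Hypothetical neighbourhood: move one vertex by one grid step in 8 directions.
STEPS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def fitness(pos, edges, ideal=2.0):
    """Toy weighted fitness (assumed metrics, lower is better)."""
    # Metric 1: squared deviation of each edge length from an ideal length.
    edge_term = sum((math.dist(pos[u], pos[v]) - ideal) ** 2 for u, v in edges)
    # Metric 2: number of vertex pairs placed closer than one unit.
    nodes = sorted(pos)
    crowd_term = sum(
        1
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
        if math.dist(pos[a], pos[b]) < 1.0
    )
    return 1.0 * edge_term + 5.0 * crowd_term  # assumed weights


def tabu_layout(pos, edges, iterations=200, tenure=10):
    """Run a fixed number of tabu iterations; return (best layout, best score)."""
    current = dict(pos)
    best, best_score = dict(current), fitness(current, edges)
    # Tabu list: (vertex, position) pairs the search recently moved away from.
    tabu = deque(maxlen=tenure)
    for _ in range(iterations):
        candidates = []
        for v, (x, y) in current.items():
            for dx, dy in STEPS:
                target = (x + dx, y + dy)
                if (v, target) in tabu:
                    continue  # moving back here is forbidden for now
                trial = dict(current)
                trial[v] = target
                candidates.append((fitness(trial, edges), v, target))
        if not candidates:
            break
        # Take the best neighbour even if it is worse than the current layout.
        score, v, target = min(candidates)
        tabu.append((v, current[v]))
        current[v] = target
        if score < best_score:
            best, best_score = dict(current), score
    return best, best_score
```

    Calling `tabu_layout({0: (0, 0), 1: (0, 0), 2: (0, 0)}, [(0, 1), (1, 2)])` starts from three stacked vertices and returns the best layout seen. Accepting the best neighbour even when it is worse is what distinguishes this loop from plain hill climbing, and the bounded `deque` is one simple way to make tabu entries expire after a set period.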

    An Interactive Tool to Explore and Improve the Ply Number of Drawings

    Given a straight-line drawing $\Gamma$ of a graph $G=(V,E)$, for every vertex $v$ the ply disk $D_v$ is defined as a disk centered at $v$ where the radius of the disk is half the length of the longest edge incident to $v$. The ply number of a given drawing is defined as the maximum number of overlapping disks at some point in $\mathbb{R}^2$. Here we present a tool to explore and evaluate the ply number for graphs with instant visual feedback for the user. We evaluate our methods in comparison to an existing ply computation by De Luca et al. [WALCOM'17]. We are able to reduce the computation time from seconds to milliseconds for given drawings and thereby contribute to further research on the ply topic by providing an efficient tool to examine graphs extensively by user interaction as well as some automatic features to reduce the ply number. Comment: Appears in the Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (GD 2017).
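    The ply disk definition above can be made concrete with a small sketch. It is not the tool described in the abstract: the function names are hypothetical, disks are assumed to be open (the abstract does not say), and `ply_lower_bound` only probes vertex positions, so it gives a lower bound on the ply number rather than the exact value.

```python
import math


def ply_radii(pos, edges):
    """Radius of each ply disk: half the longest edge incident to the vertex."""
    longest = {v: 0.0 for v in pos}  # isolated vertices keep radius 0
    for u, v in edges:
        length = math.dist(pos[u], pos[v])
        longest[u] = max(longest[u], length)
        longest[v] = max(longest[v], length)
    return {v: length / 2.0 for v, length in longest.items()}


def depth_at(point, pos, radii):
    """Number of (assumed open) ply disks that contain the given point."""
    return sum(1 for v in pos if math.dist(point, pos[v]) < radii[v])


def ply_lower_bound(pos, edges):
    """Largest disk depth found at any vertex position (a lower bound only)."""
    radii = ply_radii(pos, edges)
    return max((depth_at(p, pos, radii) for p in pos.values()), default=0)


# Small star: vertex 0 in the centre, one long edge and two short ones.
pos = {0: (0, 0), 1: (4, 0), 2: (0, 1), 3: (0, -1)}
edges = [(0, 1), (0, 2), (0, 3)]
radii = ply_radii(pos, edges)
```

    In this example the long edge gives vertices 0 and 1 a radius of 2, so the disk of vertex 0 also covers the two short-edge leaves. An exact computation would have to examine the whole arrangement of disks, not just the vertex positions.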

    Technical Evaluation: VIRCON Task 12 Report

    Applying semantic web technologies to knowledge sharing in aerospace engineering

    This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy documents via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale.

    Expansion of CMOS array design techniques

    The important features of the multiport (double entry) automatic placement and routing programs for standard cells are described. Measured performance and predicted performance were compared for seven CMOS/SOS array types and hybrids designed with the high speed CMOS/SOS cell family. The CMOS/SOS standard cell data sheets are listed and described.