Wrapper Maintenance: A Machine Learning Approach
The proliferation of online information sources has led to an increased use
of wrappers for extracting data from Web sources. While most of the previous
research has focused on quick and efficient generation of wrappers, the
development of tools for wrapper maintenance has received less attention. This
is an important research problem because Web sources often change in ways that
prevent the wrappers from extracting data correctly. We present an efficient
algorithm that learns structural information about data from positive examples
alone. We describe how this information can be used for two wrapper maintenance
applications: wrapper verification and reinduction. The wrapper verification
system detects when a wrapper is not extracting correct data, usually because
the Web source has changed its format. The reinduction algorithm automatically
recovers from changes in the Web source by identifying data on Web pages so
that a new wrapper may be generated for this source. To validate our approach,
we monitored 27 wrappers over a period of a year. The verification algorithm
correctly discovered 35 of the 37 wrapper changes, and made 16 mistakes,
resulting in precision of 0.73 and recall of 0.95. We validated the reinduction
algorithm on ten Web sources. We were able to successfully reinduce the
wrappers, obtaining precision and recall values of 0.90 and 0.80 on the data
extraction task.
A Tabu Search Based Approach for Graph Layout
This paper describes an automated tabu search based method for drawing general graph layouts with straight lines. To our knowledge, this is the first time tabu methods have been applied to graph drawing. We formulated the task as a multi-criteria optimization problem with a number of
metrics which are used in a weighted fitness function to measure the aesthetic
quality of the graph layout. The main goal of this work is to speed up the graph
layout process without sacrificing layout quality. To achieve this, we use a tabu
search based method that goes through a predefined number of iterations to minimize
the value of the fitness function. Tabu search always chooses the best solution in
the neighbourhood. This may lead to cycling, so a tabu list is used to store moves
that are not permitted, meaning that the algorithm does not choose previous
solutions for a set period of time. We evaluate the method according to the time
spent to draw a graph and the quality of the drawn graphs. We report experimental
results on random graphs and provide statistical evidence that our
method outperforms a fast search-based drawing method (hill climbing) in execution
time while producing comparably good graph layouts. We also demonstrate the method
on real-world graph datasets to show that we can reproduce similar results in a
real-world setting.
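The tabu loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two criteria in `fitness`, their weights, the eight-direction neighbourhood, the step size, and the tabu tenure are all assumptions made for the example.

```python
import itertools
import math
import random
from collections import deque

# Hypothetical neighbourhood: shift one vertex by one step in 8 directions.
DIRECTIONS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def fitness(pos, edges, ideal=1.0, w_edge=1.0, w_node=1.0):
    """Weighted sum of two illustrative criteria (assumed); lower is better."""
    # Criterion 1: edges should be close to an ideal length.
    edge_term = sum((math.dist(pos[u], pos[v]) - ideal) ** 2 for u, v in edges)
    # Criterion 2: vertices should not sit on top of each other.
    node_term = sum(1.0 / (math.dist(pos[a], pos[b]) ** 2 + 1e-9)
                    for a, b in itertools.combinations(pos, 2))
    return w_edge * edge_term + w_node * node_term


def tabu_layout(nodes, edges, iterations=200, tenure=7, step=0.25, seed=0):
    rng = random.Random(seed)
    pos = {n: (round(rng.uniform(0, 3), 6), round(rng.uniform(0, 3), 6)) for n in nodes}
    best_pos, best_fit = dict(pos), fitness(pos, edges)
    # Tabu list: (vertex, position) pairs that were recently vacated.
    tabu = deque(maxlen=tenure)
    for _ in range(iterations):
        candidates = []
        for n in nodes:
            x, y = pos[n]
            for dx, dy in DIRECTIONS:
                new = (round(x + dx * step, 6), round(y + dy * step, 6))
                if (n, new) in tabu:
                    continue  # moving back to a recent position is forbidden
                trial = dict(pos)
                trial[n] = new
                candidates.append((fitness(trial, edges), n, new))
        if not candidates:
            break
        # Always take the best non-tabu neighbour, even if it is worse.
        fit, n, new = min(candidates, key=lambda c: c[0])
        tabu.append((n, pos[n]))
        pos[n] = new
        if fit < best_fit:
            best_pos, best_fit = dict(pos), fit
    return best_pos, best_fit
```

Calling `tabu_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])` returns the best layout seen and its fitness. Because the loop accepts the best neighbour even when it is worse than the current layout, the best-so-far layout is tracked separately.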
An Interactive Tool to Explore and Improve the Ply Number of Drawings
Given a straight-line drawing Γ of a graph G = (V, E), for every vertex v
the ply disk D_v is defined as a disk centered at v where the radius of
the disk is half the length of the longest edge incident to v. The ply number
of a given drawing is defined as the maximum number of overlapping disks at
some point in R^2. Here we present a tool to explore and evaluate
the ply number for graphs with instant visual feedback for the user. We
evaluate our methods in comparison to an existing ply computation by De Luca et
al. [WALCOM'17]. We are able to reduce the computation time from seconds to
milliseconds for given drawings and thereby contribute to further research on
the ply topic by providing an efficient tool to examine graphs extensively by
user interaction as well as some automatic features to reduce the ply number.
Comment: Appears in the Proceedings of the 25th International Symposium on
Graph Drawing and Network Visualization (GD 2017)
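The ply disk and ply number definitions in this abstract can be made concrete with a small brute-force sketch. It is not the tool's algorithm or that of De Luca et al. It assumes closed disks (touching disks count as overlapping, up to a tolerance `eps`) and relies on the observation that a deepest point can be found among disk centres and pairwise circle intersection points. All function names are hypothetical.

```python
import itertools
import math


def ply_radii(pos, edges):
    """Radius of each ply disk: half the longest incident edge (0 if isolated)."""
    radii = {v: 0.0 for v in pos}
    for u, v in edges:
        half = math.dist(pos[u], pos[v]) / 2
        radii[u] = max(radii[u], half)
        radii[v] = max(radii[v], half)
    return radii


def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles (empty if disjoint, nested, or concentric)."""
    d = math.dist(c1, c2)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)  # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))   # half the chord length
    mx = c1[0] + a * (c2[0] - c1[0]) / d
    my = c1[1] + a * (c2[1] - c1[1]) / d
    ox = h * (c2[1] - c1[1]) / d
    oy = h * (c2[0] - c1[0]) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]


def ply_number(pos, edges, eps=1e-9):
    """Maximum number of closed ply disks covering a single point (brute force)."""
    radii = ply_radii(pos, edges)
    disks = [(pos[v], radii[v]) for v in pos]
    # Candidate points: every disk centre plus every pairwise boundary crossing.
    candidates = [c for c, _ in disks]
    for (c1, r1), (c2, r2) in itertools.combinations(disks, 2):
        candidates.extend(circle_intersections(c1, r1, c2, r2))
    return max(sum(1 for c, r in disks if math.dist(p, c) <= r + eps)
               for p in candidates)
```

This checks every candidate point against every disk, so it is cubic in the number of vertices and only suitable for small drawings. It is meant to illustrate the definition, not to match the millisecond timings reported above.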
Applying semantic web technologies to knowledge sharing in aerospace engineering
This paper details an integrated methodology to optimise knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses ontologies as a central modelling strategy for the capture of knowledge, either from legacy documents via automated means or directly in systems interfacing with knowledge workers via user-defined, web-based forms. The domain ontologies used for knowledge capture also guide the retrieval of the knowledge extracted from the data, using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale.
Expansion of CMOS array design techniques
The important features of the multiport (double entry) automatic placement and routing programs for standard cells are described. Measured performance and predicted performance were compared for seven CMOS/SOS array types and hybrids designed with the high speed CMOS/SOS cell family. The CMOS/SOS standard cell data sheets are listed and described.