15,761 research outputs found
Transitioning Applications to Semantic Web Services: An Automated Formal Approach
Semantic Web Services have been recognized as a promising technology that exhibits huge commercial potential, and attract significant attention from both industry and the research community. Despite expectations being high, the industrial take-up of Semantic Web Service technologies has been slower than expected. One of the main reasons is that many systems have been developed without considering the potential of the web in integrating services and sharing resources. Without a systematic methodology and proper tool support, the migration from legacy systems to Semantic Web Service-based systems can be a very tedious and expensive process, which carries a definite risk of failure. There is an urgent need to provide strategies which allow the migration of legacy systems to Semantic Web Services platforms, and also tools to support such a strategy. In this paper we propose a methodology for transitioning these applications to Semantic Web Services by taking the advantage of rigorous mathematical methods. Our methodology allows users to migrate their applications to Semantic Web Services platform automatically or semi-automatically
Which one is better: presentation-based or content-based math search?
Mathematical content is a valuable information source and retrieving this
content has become an important issue. This paper compares two searching
strategies for math expressions: presentation-based and content-based
approaches. Presentation-based search uses state-of-the-art math search system
while content-based search uses semantic enrichment of math expressions to
convert math expressions into their content forms and searching is done using
these content-based expressions. By considering the meaning of math
expressions, the quality of search system is improved over presentation-based
systems
Engineering polymer informatics: Towards the computer-aided design of polymers
The computer-aided design of polymers is one of the holy grails of modern chemical
informatics and of significant interest for a number of communities in polymer
science. The paper outlines a vision for the in silico design of polymers and presents
an information model for polymers based on modern semantic web technologies, thus
laying the foundations for achieving the vision
Topic Map Generation Using Text Mining
Starting from text corpus analysis with linguistic and statistical analysis algorithms, an infrastructure for text mining is described which uses collocation analysis as a central tool. This text mining method may be applied to different domains as well as languages. Some examples taken form large reference databases motivate the applicability to knowledge management using declarative standards of information structuring and description. The ISO/IEC Topic Map standard is introduced as a candidate for rich metadata description of information resources and it is shown how text mining can be used for automatic topic map generation
Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context
Mathematical formulae represent complex semantic information in a concise
form. Especially in Science, Technology, Engineering, and Mathematics,
mathematical formulae are crucial to communicate information, e.g., in
scientific papers, and to perform computations using computer algebra systems.
Enabling computers to access the information encoded in mathematical formulae
requires machine-readable formats that can represent both the presentation and
content, i.e., the semantics, of formulae. Exchanging such information between
systems additionally requires conversion methods for mathematical
representation formats. We analyze how the semantic enrichment of formulae
improves the format conversion process and show that considering the textual
context of formulae reduces the error rate of such conversions. Our main
contributions are: (1) providing an openly available benchmark dataset for the
mathematical format conversion task consisting of a newly created test
collection, an extensive, manually curated gold standard and task-specific
evaluation metrics; (2) performing a quantitative evaluation of
state-of-the-art tools for mathematical format conversions; (3) presenting a
new approach that considers the textual context of formulae to reduce the error
rate for mathematical format conversions. Our benchmark dataset facilitates
future research on mathematical format conversions as well as research on many
problems in mathematical information retrieval. Because we annotated and linked
all components of formulae, e.g., identifiers, operators and other entities, to
Wikidata entries, the gold standard can, for instance, be used to train methods
for formula concept discovery and recognition. Such methods can then be applied
to improve mathematical information retrieval systems, e.g., for semantic
formula search, recommendation of mathematical content, or detection of
mathematical plagiarism.Comment: 10 pages, 4 figure
Web Content Extraction - a Meta-Analysis of its Past and Thoughts on its Future
In this paper, we present a meta-analysis of several Web content extraction
algorithms, and make recommendations for the future of content extraction on
the Web. First, we find that nearly all Web content extractors do not consider
a very large, and growing, portion of modern Web pages. Second, it is well
understood that wrapper induction extractors tend to break as the Web changes;
heuristic/feature engineering extractors were thought to be immune to a Web
site's evolution, but we find that this is not the case: heuristic content
extractor performance also tends to degrade over time due to the evolution of
Web site forms and practices. We conclude with recommendations for future work
that address these and other findings.Comment: Accepted for publication in SIGKDD Exploration
- ā¦