    Web Content Mining for Information on Information Scientists

    This paper presents a search system for information on scientists, implemented prototypically for the field of information science using Web Content Mining techniques. The sources used in the implemented approach are online publication services and the personal homepages of scientists. The system contains wrappers for querying the publication services and extracting information from their result pages, as well as methods for information extraction from homepages that are based on heuristics concerning the structure and composition of the pages. Moreover, a specialised search technique for finding the personal homepages of information scientists was developed.
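
    A minimal sketch of the kind of heuristic homepage extraction described above, in Python with requests and BeautifulSoup. The specific heuristics and field names are assumptions for illustration, not the paper's actual rules.

        import re

        import requests
        from bs4 import BeautifulSoup

        def extract_scientist_info(url: str) -> dict:
            """Apply simple structural heuristics to a personal homepage."""
            html = requests.get(url, timeout=10).text
            soup = BeautifulSoup(html, "html.parser")
            info = {"url": url, "name": None, "email": None}

            # Heuristic 1 (assumed): the first <h1>, or failing that the page
            # title, usually carries the scientist's name.
            heading = soup.find("h1")
            if heading and heading.get_text(strip=True):
                info["name"] = heading.get_text(strip=True)
            elif soup.title:
                info["name"] = soup.title.get_text(strip=True)

            # Heuristic 2 (assumed): mailto: links expose the contact address.
            mail_link = soup.find("a", href=re.compile(r"^mailto:", re.I))
            if mail_link:
                info["email"] = mail_link["href"].split(":", 1)[1]

            return info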

    Automated information extraction from web APIs documentation

    A fundamental characteristic of Web APIs is that, de facto, providers hardly follow any standard practices when implementing, publishing, and documenting their APIs. As a consequence, the discovery and use of these services by third parties is significantly hampered. In order to achieve further automation in exploiting Web APIs, we present an approach for automatically extracting relevant technical information from the Web pages documenting them. In particular, we have devised two algorithms that automatically extract technical details such as operation names, operation descriptions, and URI templates from the documentation of Web APIs adopting either RPC or RESTful interfaces. The algorithms, which exploit advanced DOM processing as well as state-of-the-art Information Extraction and Natural Language Processing techniques, have been evaluated against a detailed dataset, exhibiting high precision and recall (around 90% for both REST and RPC APIs) and outperforming state-of-the-art information extraction algorithms.
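
    One of the extraction steps named above can be illustrated compactly: pulling candidate URI templates out of the DOM of a documentation page. The regex and the choice of <code>/<pre> tags are assumptions for this sketch, not the authors' actual algorithms.

        import re

        from bs4 import BeautifulSoup

        # Candidate URI templates such as /users/{userId}/photos (pattern assumed).
        URI_TEMPLATE = re.compile(r"/[\w\-/.]*\{\w+\}[\w\-/.{}]*")

        def extract_uri_templates(html: str) -> list:
            """Collect candidate URI templates from <code>/<pre> blocks,
            where API docs typically show example requests."""
            soup = BeautifulSoup(html, "html.parser")
            candidates = set()
            for block in soup.find_all(["code", "pre"]):
                candidates.update(URI_TEMPLATE.findall(block.get_text()))
            return sorted(candidates)

        print(extract_uri_templates("<pre>GET /users/{userId}/photos</pre>"))
        # ['/users/{userId}/photos']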

    Implementing Service Oriented Architecture for Data Mining

    With the growth of Web technology, data on the Internet has become increasingly large and complex, and not all of it is useful to users or constitutes actionable knowledge. Web data mining addresses this problem: it can extract potentially useful information and knowledge from the incomplete, noisy, ambiguous, and application-specific data found on the WWW. It is an emerging commercial information/data mining technology whose main characteristic is to extract key data for business decision making from business databases through the use of extraction, conversion, analysis, and other transformation models. A web service is an object or component deployed on the web that realises a distributed application platform through a series of protocols. The Web Service platform provides a set of standard type systems, rules, techniques, and Internet service-oriented applications that allow different platforms, programming languages, and types of systems to communicate and interoperate. This paper gives a practical application of web services for data mining: we build a data mining model based on Web services, demonstrated with prototypes of a dynamic web-service-based data mining system, and going forward it is possible to implement new data mining solutions, for example for security configuration, on top of it. DOI: 10.17762/ijritcc2321-8169.15079
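
    The service-oriented pattern the paper builds on can be sketched as a data mining operation exposed over HTTP, so that clients on any platform or in any language can call it. Flask and scikit-learn here are assumed stand-ins, not the stack the paper prescribes.

        from flask import Flask, jsonify, request
        from sklearn.cluster import KMeans

        app = Flask(__name__)

        @app.route("/mine/cluster", methods=["POST"])
        def cluster():
            # Expected JSON payload (assumed schema): {"points": [[x, y], ...], "k": 2}
            payload = request.get_json()
            model = KMeans(n_clusters=payload.get("k", 2), n_init=10)
            labels = model.fit_predict(payload["points"])
            return jsonify({"labels": labels.tolist()})

        if __name__ == "__main__":
            app.run(port=8080)

    Any client that can POST JSON then receives cluster labels back, which is the cross-platform interoperability the abstract emphasises.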

    Dynamic Fusion of Web Data

    Mashups exemplify a workflow-like approach to dynamically integrating data and services from multiple web sources. Such integration workflows can build on existing services for web search, entity search, database querying, and information extraction, and thus complement other data integration approaches. A key challenge is the efficient execution of integration workflows and their query and matching steps at runtime. We relate mashup data integration to other approaches, list major challenges, and outline the features of a first prototype design.
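
    A toy version of the runtime matching step in such an integration workflow; the records and similarity threshold are invented for illustration, whereas a real mashup would compose web search, entity search, and extraction services.

        from difflib import SequenceMatcher

        def match_entities(source_a, source_b, threshold=0.5):
            """Runtime matching step: fuzzily join records from two web sources."""
            matches = []
            for a in source_a:
                for b in source_b:
                    score = SequenceMatcher(None, a["name"].lower(),
                                            b["name"].lower()).ratio()
                    if score >= threshold:
                        matches.append({**a, **b, "score": round(score, 2)})
            return matches

        # Records as they might arrive from a web search and an entity store.
        search_results = [{"name": "Marie Curie", "affiliation": "Sorbonne"}]
        entity_store = [{"name": "M. Curie", "field": "physics"}]
        print(match_entities(search_results, entity_store))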

    Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

    Background: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The need for automated support is exemplified by the difficulty of considering all relevant facts contained in the millions of documents available from PubMed. The Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from the literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit control over the modeling and extraction processes, we seek a methodology that gives the experimenter control over these critical processes. Results: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.
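
    A minimal sketch of the kind of semantic model such a workflow can produce, using rdflib: a putative relation stored as RDF, reified so it can be linked to its textual evidence. The namespace, property names, and the triple itself are placeholders, not the paper's proto-ontologies or extracted facts.

        from rdflib import RDF, Graph, Literal, Namespace, URIRef

        EX = Namespace("http://example.org/biomodel#")  # placeholder namespace
        g = Graph()
        g.bind("ex", EX)

        # One extracted relation, reified so it can carry provenance.
        # ProteinA/ProteinB are illustrative; no biological claim intended.
        relation = EX["relation1"]
        g.add((relation, RDF.type, EX.PutativeRelation))
        g.add((relation, EX.hasSubject, EX.ProteinA))
        g.add((relation, EX.hasPredicate, EX.regulates))
        g.add((relation, EX.hasObject, EX.ProteinB))

        # Link the relation to the evidence it was mined from.
        g.add((relation, EX.evidenceText,
               Literal("sentence from which the relation was extracted")))
        g.add((relation, EX.sourceDocument,
               URIRef("https://pubmed.ncbi.nlm.nih.gov/")))

        print(g.serialize(format="turtle"))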

    Integrating information to bootstrap information extraction from web sites

    In this paper we propose a methodology for learning to extract domain-specific information from large repositories (e.g. the Web) with minimum user intervention. Learning is seeded by integrating information from structured sources (e.g. databases and digital libraries). The retrieved information is then used to bootstrap learning for simple Information Extraction (IE) methodologies, which in turn produce more annotation to train more complex IE engines. All the corpora for training the IE engines are produced automatically by integrating information from different sources such as available corpora and services (e.g. databases, digital libraries, etc.). User intervention is limited to providing an initial URL and adding information missed by the different modules when the computation has finished. The information added or deleted by the user can then be reused for further training, thereby improving recall and/or precision. We are currently applying this methodology to mining the web sites of Computer Science departments.
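
    A condensed sketch of the bootstrapping loop described above: names taken from a structured source (here a hard-coded stand-in for a database or digital library) seed simple pattern-based annotation, and the annotated sentences become training data for a more complex IE engine. All names and sentences are invented for illustration.

        import re

        # Seed entities; in the methodology these come from a DB or digital library.
        seed_names = {"Alice Smith", "Bob Jones"}

        corpus = [
            "Alice Smith is a lecturer in the department.",
            "The module is taught by Bob Jones.",
        ]

        def annotate(sentences, names):
            """Produce (sentence, spans) pairs by matching seed names verbatim."""
            training_data = []
            for sent in sentences:
                spans = [(m.start(), m.end(), "PERSON")
                         for name in names
                         for m in re.finditer(re.escape(name), sent)]
                if spans:
                    training_data.append((sent, spans))
            return training_data

        # These automatically produced annotations would now train an IE engine,
        # whose output seeds the next, more precise round.
        for sent, spans in annotate(corpus, seed_names):
            print(sent, spans)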