Realization of Semantic Atom Blog

Author: Khuba Sidheshwar A.
Patel Dhiren R.
Publication date: 01/12/2009

A web blog is used as a collaborative platform to publish and share information. The information accumulated in a blog intrinsically contains knowledge, and the knowledge shared by a community of people has an intangible value proposition. A blog can be viewed as a multimedia information resource available on the Internet, in which information in the form of text, image, audio and video builds up exponentially. The multimedia information contained in an Atom blog lacks the capability that software processes require in order to access, process and reuse Atom blog content over the Internet. This shortcoming is addressed by exploring OWL knowledge modeling, semantic annotation and semantic categorization techniques in the Atom blogosphere. By adopting these techniques, futuristic Atom blogs can be created and deployed over the Internet.

arXiv.org e-Print Archive

Interprétation vague des contraintes structurelles pour la RI dans des corpus de documents XML - Évaluation d'une méthode approchée de RI structurée

Author: Marteau Pierre-François
Ménier Gildas
Popovici Eugen
Publication venue: 'Lavoisier'
Publication date: 30/06/2008

We propose specific data structures designed for the indexing and retrieval of information elements in heterogeneous XML databases. The indexing scheme is well suited to managing various contextual searches, expressed either at a structural level or at an information content level. The approximate search mechanisms are based on a modified Levenshtein edit distance and information fusion heuristics. The implementation described highlights the mixing of structured information, presented as field/value instances, with free text elements. The retrieval performance of the proposed approach is evaluated within the INEX 2005 evaluation campaign. The evaluation results rank the proposed approach among the best evaluated XML IR systems for the VVCAS task. Comment: 26 pages, ISBN 978-2-7462-1969-

arXiv.org e-Print Archive

A random forest system combination approach for error detection in digital dictionaries

Author: Bloodgood Michael
Doermann David
Rodrigues Paul
Ye Peng
Zajic David
Publication date: 30/10/2014

When digitizing a print bilingual dictionary, whether via optical character recognition or manual entry, it is inevitable that errors are introduced into the resulting electronic version. We investigate automating the detection of errors in an XML representation of a digitized print dictionary using a hybrid approach that combines rule-based, feature-based, and language model-based methods. We investigate combining methods and show that using random forests is a promising approach. We find that, in isolation, unsupervised methods rival the performance of supervised methods. Random forests typically require training data, so we investigate how random forests can be applied to combine individual base methods that are themselves unsupervised without requiring large amounts of training data. Experiments show that a relatively small amount of data is sufficient and can potentially be further reduced through specific selection criteria. Comment: 9 pages, 7 figures, 10 tables; appeared in Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, April 201

arXiv.org e-Print Archive

CiteSeerX

Web Service Discovery in a Semantically Extended UDDI Registry: the Case of FUSION

Author: A Alazcib
A Bouras
D Martin
E Christensen
J Cardoso
J Colgrave
JD Bruijn
L Li
M Paolucci
R Akkiraju
R Lara
U Keller
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Service-oriented computing is being adopted at an unprecedented rate, making the effectiveness of automated service discovery an increasingly important challenge. UDDI has emerged as a de facto industry standard and a fundamental building block within SOA infrastructures. Nevertheless, conventional UDDI registries lack the means to provide unambiguous, semantically rich representations of Web service capabilities, and the logic inference power required to facilitate automated service discovery. To overcome this important limitation, a number of approaches have been proposed for augmenting Web service discovery with semantics. This paper discusses the benefits of semantically extending Web service descriptions and UDDI registries, and presents an overview of the approach put forward in project FUSION for semantically enhanced publication and discovery of services based on SAWSDL.

Data mining and fusion

Author: Addis M. J.
Choi F.
Taylor S. J.
Upstill C.
Watkins E. R.
Publication venue: s.n.
Publication date: 01/04/2006

Southampton (e-Prints Soton)