BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pinsent E.
Stepanyan K.
Publication date: 25/10/2013

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

ZENODO

Developing information architecture through records management classification techniques

Author: Milne Christopher
Publication date: 23/06/2009

Purpose – This work aims to draw attention to information retrieval philosophies and techniques allied to the records management profession, advocating wider professional consideration of a functional approach to information management, in this instance in the development of information architecture.
Design/methodology/approach – The paper draws on a hypothesis originally presented by the author: that records management techniques, traditionally applied to develop business classification schemes, offer an additional solution for organising information resources and services (within a university intranet) where earlier approaches, notably subject- and administrative-based arrangements, were found to be lacking. The hypothesis was tested via work-based action learning and is presented here as an extended case study. The paper also draws on evidence submitted to the Joint Information Systems Committee in support of Abertay University's application for consideration for the JISC award for innovation in records and information management.
Findings – The original hypothesis has been tested in the workplace. Information retrieval techniques allied to records management (functional classification) were the main influence in the development of pre- and post-coordinate information retrieval systems to support a wider information architecture, where the subject approach was found to be lacking. Their use within the workplace has since been extended.
Originality/value – The paper advocates that the development of information retrieval as a discipline should include a wider consideration of functional classification, as this alternative to the subject approach is largely ignored in mainstream IR works.

Abertay Research Portal

Unifying an Introduction to Artificial Intelligence Course through Machine Learning Laboratory Experiences

Author: Coleman Susan
Georgiopoulos Michael
Markov Zdravko
Neller Todd W.
Russell Ingrid
Publication venue: The Cupola: Scholarship at Gettysburg College
Publication date: 01/01/2005

This paper presents work on a collaborative project, funded by the National Science Foundation, that uses machine learning as a unifying theme for teaching the fundamental concepts typically covered in introductory Artificial Intelligence courses. The project involves the development of an adaptable framework for the presentation of core AI topics. This is accomplished through the development, implementation, and testing of a suite of adaptable, hands-on laboratory projects that can be closely integrated into the AI course. Through the design and implementation of learning systems that enhance commonly deployed applications, our model acknowledges that intelligent systems are best taught through their application to challenging problems. The goals of the project are to (1) enhance the student learning experience in the AI course, (2) increase student interest and motivation to learn AI by providing a framework for the presentation of the major AI topics that emphasizes the strong connection between AI and computer science and engineering, and (3) highlight the bridge that machine learning provides between AI technology and modern software engineering.

Gettysburg College

Automated user modeling for personalized digital libraries

Author: E. Frias-Martinez
G. Magoulas
S. Chen
R. Macredie
Publication venue: Elsevier BV
Publication date: 01/01/2006

Digital libraries (DL) have become one of the most common ways of accessing any kind of digitized information. Given this key role, users welcome any improvement to the services they receive from digital libraries. One way to improve digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users' information needs: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.

CiteSeerX

Crossref

Birkbeck Institutional Research Online

Brunel University Research Archive

Conceptual Linking: Ontology-based Open Hypermedia

Author: Bechhofer Sean
Carr Leslie
Goble Carole
Hall Wendy
Publication date: 01/01/2001

This paper describes the attempts of the COHSE project to define and deploy a Conceptual Open Hypermedia Service. The service consists of:
• an ontological reasoning service, which is used to represent a sophisticated conceptual model of document terms and their relationships; and
• a Web-based open hypermedia link service that can offer a range of different link-providing facilities in a scalable and non-intrusive fashion.
These are integrated to form a conceptual hypermedia system that enables documents to be linked via metadata describing their contents, and hence improves the consistency and breadth of linking of WWW documents at retrieval time (as readers browse the documents) and authoring time (as authors create the documents).

Southampton (e-Prints Soton)

CiteSeerX

The University of Manchester - Institutional Repository

BlogForever D3.2: Interoperability Prospects

Author: Banos V.
Berninger L.
Kalb H.
Kim Y.
Kopidaki S.
Lazaridou P.
Pinsent E.
Ross S.
Publication date: 25/10/2013

This report evaluates the interoperability prospects of the BlogForever platform. To this end, it reviews existing interoperability models; conducts a Delphi study to identify crucial aspects of interoperability for web archives and digital libraries; reviews technical interoperability standards and protocols for their relevance to BlogForever; proposes a simple approach for considering interoperability in specific usage scenarios; and presents a tangible approach to developing a succession plan that would allow a reliable transfer of content from the current digital archive to other digital repositories.

ZENODO

Variation of word frequencies across genre classification tasks

Author: Kim Y.
Ross S.
Publication venue: GEIE-ERCIM
Publication date: 01/01/2007

This paper examines automated genre classification of text documents and its role in enabling the effective management of digital documents by digital libraries and other repositories. Genre classification, which narrows down the possible structure of a document, is a valuable step in realising the general automatic extraction of semantic metadata essential to the efficient management and use of digital objects. In this report, we analyse word frequencies in different genre classes in an effort to understand the distinction between independent classification tasks. In particular, we examine automated experiments on thirty-one genre classes to determine the relationship between word frequency metrics and their significance for classification in varying environments.

Enlighten