9,568 research outputs found
Meeting of the MINDS: an information retrieval research agenda
Since its inception in the late 1950s, the field of Information Retrieval (IR) has developed tools that help people find, organize, and analyze information. The key early influences on the field are well known. Among them are H. P. Luhn's pioneering work, the development of the vector space retrieval model by Salton and his students, Cleverdon's development of the Cranfield experimental methodology, Spärck Jones' development of idf, and a series of probabilistic retrieval models by Robertson and Croft. Until the development of the World Wide Web (Web), IR was of greatest interest to professional information analysts such as librarians, intelligence analysts, the legal community, and the pharmaceutical industry.
Scenarios and research issues for a network of information
This paper describes ideas and items of work within the framework of the EU-funded 4WARD project. We present scenarios for which the current host-centric approach to information storage and retrieval is ill-suited, and explain how a new networking paradigm emerges by adopting the information-centric network architecture approach, which we call Network of Information (NetInf). NetInf capitalizes on a proposed identifier/locator split and allows users to create, distribute, and retrieve information using a common infrastructure without tying data to particular hosts. NetInf introduces the concepts of information and data objects. Data objects correspond to the particular bits and bytes of a digital object, such as a text file or a specific encoding of a song or a video. Information objects can be used to identify other objects irrespective of their particular digital representation. After discussing the benefits of such an indirection, we consider the impact of NetInf with respect to naming and governance in the Future Internet. Finally, we provide an outlook on the research scope of NetInf along with items for future work.
TiFi: Taxonomy Induction for Fictional Domains [Extended version]
Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
Minding our ps and qs: Issues of property, provenance, quantity and quality in institutional repositories
The development of institutional repositories has opened the path to the mass availability of peer-reviewed scholarly information and the extension of information democracy to the academic domain. A secondary space of free-to-all documents has begun to parallel the hitherto-closed world of journal publishing and many publishers have consented to the inclusion of copyrighted documents in digital repositories, although frequently specifying that a version other than the formally-published one be used. This paper will conceptually examine the complex interplay of rights, permissions and versions between publishers and repositories, focussing on the New Zealand situation and the challenges faced by university repositories in recruiting high-quality peer-reviewed documents for the open access domain. A brief statistical snapshot of the appearance of material from significant publishers in repositories will be used to gauge the progress that has been made towards broadening information availability. The paper will also look at the importance of harvesting and dissemination, in particular the role of Google Scholar in bringing research information within reach of ordinary internet users. The importance of accuracy, authority, provenance and transparency in the presentation of research-based information and the important role that librarians can and should play in optimising the open research discovery experience will be emphasised.
Knowledge Representation with Ontologies: The Present and Future
Recently, we have seen an explosion of interest in ontologies as
artifacts to represent human knowledge and as critical components in
knowledge management, the semantic Web, business-to-business
applications, and several other application areas. Various research
communities commonly assume that ontologies are the appropriate modeling
structure for representing knowledge. However, little discussion has
occurred regarding the actual range of knowledge an ontology can
successfully represent.
Taxonomies for Development
{Excerpt} Organizations spend millions of dollars on management systems without commensurate investments in the categorization needed to organize the information they rest on. Taxonomy work is strategic work: it enables efficient and interoperable retrieval and sharing of data, information, and knowledge by building needs and natural workflows in intuitive structures.
Bible readers think that taxonomy is the world’s oldest profession. Whatever the case, the word is now synonymous with any hierarchical system of classification that orders domains of inquiry into groups and signifies natural relationships among these. (A taxonomic scheme is often depicted as a “tree” and individual taxonomic units as “branches” in the tree.) Almost anything can be classified according to some taxonomic scheme. Resulting catalogs provide conceptual frameworks for miscellaneous purposes, including knowledge identification, creation, storage, sharing, and use, as well as related decision making.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have an impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges.
A usability approach to improving the user experience in web directories
PhD
Web directories are hierarchically organised website collections that offer users subject-based
access to the Web. They played a significant part in navigating the Web in the past
but their role has been weakened in recent years due to their cumbersome expanding
collections. This thesis presents a unified framework combining the advantages of
personalisation and redefined directory search for improving the usability of Web
directories.
The thesis begins with an examination of classification schemes that identifies the
rigidity of hierarchical classifications and their suitability for Web directories in contrast
to faceted classifications. This leads on to an Ontological Sketch Modelling (OSM) case
study which identifies the misfits affecting user navigation in Web directories from
known rigidity issues. The thesis continues with a review of personalisation techniques
and a discussion of the user search model of Web directories following the suggested
directions of improvement from the case study. A proposed user-centred framework to
improve the usability of Web directories which consists of an individual content-based
personalisation model and a redefined search model is then implemented as D-Persona
and D-Search respectively. The remainder of the thesis is concerned with a usability test
of D-Persona and D-Search aimed at discovering the efficiency, effectiveness and user
satisfaction of the solution. This involves an experimental design, test results and
discussions for the comparative user study.
First, this thesis extracts a formal definition of the rigidity of hierarchies from their
characteristics and justifies why hierarchies are still better suited than facets in
organising Web directories. Second, it identifies misfits causing poor usability in Web
directories based on the discovered rigidity of hierarchies. Third, it proposes a solution
to tackle the misfits and improve the usability of Web directories, which has been
experimentally shown to be successful.