Search CORE

3,236 research outputs found

Retrieving with good sense

Author: Sanderson M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2000

Although always present in text, word sense ambiguity only recently came to be regarded as a potentially solvable problem for information retrieval. The growth of interest in word senses resulted from new directions taken in disambiguation research. This paper first outlines this research and surveys the resulting efforts in information retrieval. Although the majority of attempts to improve retrieval effectiveness were unsuccessful, much was learnt from the research. Most notably a notion of under what circumstance disambiguation may prove of use to retrieval

CiteSeerX

White Rose Research Online

The SIMBAD astronomical database

Author: Bonnarel Francois
Borde Suzanne
Dubois Pascal
Egret Daniel
Genova Francoise
Jasniewicz Gerard
Laloe Suzanne
Lesteven Soizick
Monier Richard
Ochsenbein Francois
Wenger Marc
Publication venue: 'EDP Sciences'
Publication date: 04/02/2000

Simbad is the reference database for identification and bibliography of astronomical objects. It contains identifications, 'basic data', bibliography, and selected observational measurements for several million astronomical objects. Simbad is developed and maintained by CDS, Strasbourg. Building the database contents is achieved with the help of several contributing institutes. Scanning the bibliography is the result of the collaboration of CDS with bibliographers in Observatoire de Paris (DASGAL), Institut d'Astrophysique de Paris, and Observatoire de Bordeaux. When selecting catalogues and tables for inclusion, priority is given to optimal multi-wavelength coverage of the database, and to support of research developments linked to large projects. In parallel, the systematic scanning of the bibliography reflects the diversity and general trends of astronomical research. A WWW interface to Simbad is available at: http://simbad.u-strasbg.fr/Simbad
Comment: 14 pages, 5 Postscript figures; to be published in A&A

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

HAL-INSU

HAL-OBSPM

CERN Document Server

Virtual WWW Documents: a Concept to Explicit the Structure of WWW Sites

Author: Beigbeder Michel
Bich-Liên Doan
Publication venue: British Computer Society
Publication date: 19/04/1999

http://www.emse.fr/~beigbeder/PUBLIS/1999-BCS-IRSG-p185-doan-v1.pdf
This paper presents a new concept, the virtual WWW document (VWD): a set of WWW pages representing a logical information space, generally dealing with one particular domain. The VWD is described using metadata in XML syntax and is accessed through a metadata.class file stored at the root level of WWW sites. We suggest how the VWD can improve information retrieval on the WWW and reduce the network load generated by robots. We describe a prototype implemented in Java, within an application in the environmental domain. The exchange of such metadata takes place in a flexible architecture based on two kinds of robots, generalists and specialists, that collect and organize this metadata in order to locate resources on the WWW. They contribute to the overall auto-organizing information process by exchanging their indices, thereby forwarding their knowledge to each other

Crossref

HAL-EMSE

Recommended from our members

Classification design : understanding the decisions between theory and consequence

Author: Bullard Julia Amber
Publication date: 05/02/2018

Classification systems are systems of terms and term relationships intended to sort and gather like concepts and documents. These systems are ubiquitous as the substrate of our interactions with library collections, retail websites, and bureaucracies. Through their design and impact, classification systems share with other technologies an unavoidable though often ignored relationship to politics, power, and authority (Fleischmann & Wallace, 2007). Despite concern among scholars that classification systems embody values and bias, there is little work examining how these qualities are built into a classification system. Specifically, we do not adequately understand classification construction, in which classification designers make decisions by applying classification theory to the specific context of a project (Park, 2008). If systems embody values— particularly values that might either cause harm (Berman, 1971) or provide an additional means of communicating the creator’s position (Feinberg, 2007)— we must understand how and when the system takes on these qualities. This dissertation bridges critical classification theory with design-oriented classification theory. Where critical classification theory is concerned with the outcomes of classification system design, design-oriented classification theory is concerned with the correct processes by which to build a classification system. To connect the consequences of classification system design to designers’ methods and intentions, I use the research lens of infrastructure studies, particularly infrastructural inversion (Star & Ruhleder, 1996) or making visible the work behind infrastructures such as classification systems. Accordingly, my research focuses on designers’ decisions and rethinks our assumptions regarding the factors that classification designers consider in making their design decisions. 
I adopted an ethnographic approach to the study of classification design that would make visible design decisions and designers’ consideration of factors. Using this approach, I studied the daily design work of volunteer classification designers who maintain a curated folksonomy. Using the grounded theory method (Strauss & Corbin, 1998), I analyzed the designers’ decisions. My analysis identified the implications of the designers’ convergences and divergences from established classification methods for the character of the system and for the connection between classification theory and classification methods. I show how the factors—and the prioritization of factors—that these designers considered in making their decisions were consistent with the values and needs of the community. Therefore, I argue that classification designers have an important role in creating the values or bias of a classification system. In particular, designers’ divergence from universal guidelines and designers’ choices among sources of evidence represent opportunities to align a classification system to its community. I recommend that classification research focus on such instances of divergence and choice to understand the connection between classification design and the values of classification systems. The Introduction motivates the problem space around values in classification systems and outlines my approach in focusing on classification design. The Literature Review outlines the dominant theories in classification scholarship according to three elements of classification design: what decisions designers make, what information designers use in their decisions, and what skills designers apply to their decisions. In the Methods chapter, I introduce the site of my ethnographic research (The Fanwork Repository), detail my ethnographic methods, summarize the types of data I collected, and describe my grounded analysis. 
Three findings chapters examine one type of complex decision each: Names, Works, and Guidelines, respectively. In the fourth findings chapter, Synthesis, I define 10 factors designers considered across these complex design decisions. I then discuss how the factors figured into complex design decisions, how the factors overlapped and conflicted in design decisions, and how designers understood their role in making complex design decisions. In the Discussion chapter I connect the findings from the site of my ethnography to classification scholarship. In the Conclusion, I consider the contribution of examining classification systems as infrastructure, highlight the differences in accounts of classification design decisions made visible through classification theory and infrastructure studies approaches, and present suggestions for future research in classification design and the study of classification systems as infrastructure.

Texas ScholarWorks

Personalization of tagging systems

Author: Clements M. (Maarten)
Reinders M.J.T.
Vries A.P. (Arjen) de
Wang J. (Jun)
Yang J.
Publication venue: Pergamon
Publication date: 01/01/2010

Social media systems have encouraged end user participation in the Internet, for the purpose of storing and distributing Internet content, sharing opinions and maintaining relationships. Collaborative tagging allows users to annotate the resulting user-generated content, and enables effective retrieval of otherwise uncategorised data. However, compared to professional web content production, collaborative tagging systems face the challenge that end-users assign tags in an uncontrolled manner, resulting in unsystematic and inconsistent metadata. This paper introduces a framework for the personalization of social media systems. We pinpoint three tasks that would benefit from personalization: collaborative tagging, collaborative browsing and collaborative s

CWI's Institutional Repository

Guided generation of pedagogical concept maps from the Wikipedia

Author: Lahti Lauri
Publication venue: Association for the Advancement of Computing in Education (AACE)
Publication date: 01/01/2009

We propose a new method for guided generation of concept maps from open access online knowledge resources such as wikis. Based on this method we have implemented a prototype extracting semantic relations from sentences surrounding hyperlinks in the Wikipedia’s articles and letting a learner create customized learning objects in real-time based on collaborative recommendations considering her earlier knowledge. Open source modules enable pedagogically motivated exploration in Wiki spaces, corresponding to an intelligent tutoring system. The method extracted compact noun–verb–noun phrases, suggested for labeling arcs between nodes that were labeled with article titles. On average, 80 percent of these phrases were useful while their length was only 20 percent of the length of the original sentences. Experiments indicate that even simple analysis algorithms can well support user-initiated information retrieval and building intuitive learning objects that follow the learner’s needs.

Aaltodoc Publication Archive

Applying Wikipedia to Interactive Information Retrieval

Author: Milne David N.
Publication venue: 'University of Waikato'
Publication date: 15/09/2010

There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. 
Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval

Research Commons@Waikato