22,515 research outputs found
Recommended from our members
Extending Faceted Search to the Open-Domain Web
Faceted search enables users to navigate a multi-dimensional information space by combining keyword search with drill-down options in each facets. For example, when searching “computer monitor”\u27 in an e-commerce site, users can select brands and monitor types from the the provided facets {“Samsung”, “Dell”, “Acer”, ...} and {“LET-Lit”, “LCD”, “OLED”, ...}. It has been used successfully for many vertical applications, including e-commerce and digital libraries. However, this idea is not well explored for general web search in an open-domain setting, even though it holds great potential for assisting multi-faceted queries and exploratory search.
The goal of this work is to explore this potential by extending faceted search into the open-domain web setting, which we call Faceted Web Search. We address three fundamental issues in Faceted Web Search, namely: how to automatically generate facets (facet generation); how to re-organize search results with users\u27 selections on facets (facet feedback); and how to evaluate generated facets and entire Faceted Web Search systems.
In conventional faceted search, facets are generated in advance for an entire corpus either manually or semi-automatically, and then recommended for particular queries in most of the previous work. However, this approach is difficult to extend to the entire web due to the web\u27s large and heterogeneous nature. We instead propose a query-dependent approach, which extracts facets for queries from their web search results. We further improve our facet generation model under a more practical scenario, where users care more about precision of presented facets than recall.
The dominant facet feedback method in conventional faceted search is Boolean filtering, which filters search results by users\u27 selections on facets. However, our investigation shows Boolean filtering is too strict when extended to the open-domain setting. Thus, we propose soft ranking models for Faceted Web Search, which expand original queries with users\u27 selections on facets to re-rank search results. Our experiments show that the soft ranking models are more effective than Boolean filtering models for Faceted Web Search.
To evaluate Faceted Web Search, we propose both intrinsic evaluation, which evaluates facet generation on its own, and extrinsic evaluation, which evaluates an entire Faceted Web Search system by its utility in assisting search clarification. We also design a method for building reusable test collections for such evaluations. Our experiments show that using the Faceted Web Search interface can significantly improve the original ranking if allowed sufficient time for user feedback on facets
Extending a geo-catalogue with matching capabilities
To achieve semantic interoperability, geo-spatial applications need to be equipped with tools able to understand user terminology that is typically different from the one enforced by standards. In this paper we summarize our experience in providing a semantic extension to the geo-catalogue of the Autonomous Province of Trento (PAT) in Italy. The semantic extension is based on the adoption of the S-Match semantic matching tool and on the use of a specifically designed faceted ontology codifying domain specific knowledge. We also briefly report our experience in the integration of the ontology with the geo-spatial ontology GeoWordNet
Living Knowledge
Diversity, especially manifested in language and knowledge, is a function of local goals, needs, competences, beliefs, culture, opinions and personal experience. The Living Knowledge project considers diversity as an asset rather than a problem. With the project, foundational ideas emerged from the synergic contribution of different disciplines, methodologies (with which many partners were previously unfamiliar) and technologies flowed in concrete diversity-aware applications such as the Future Predictor and the Media Content Analyser providing users with better structured information while coping with Web scale complexities. The key notions of diversity, fact, opinion and bias have been defined in relation to three methodologies: Media Content Analysis (MCA) which operates from a social sciences perspective; Multimodal Genre Analysis (MGA) which operates from a semiotic perspective and Facet Analysis (FA) which operates from a knowledge representation and organization perspective. A conceptual architecture that pulls all of them together has become the core of the tools for automatic extraction and the way they interact. In particular, the conceptual architecture has been implemented with the Media Content Analyser application. The scientific and technological results obtained are described in the following
Towards improving web service repositories through semantic web techniques
The success of the Web services technology has brought topicsas software reuse and discovery once again on the agenda of software engineers. While there are several efforts towards automating Web service discovery and composition, many developers still search for services
via online Web service repositories and then combine them manually. However, from our analysis of these repositories, it yields that, unlike traditional software libraries, they rely on little metadata to support
service discovery. We believe that the major cause is the difficulty of automatically deriving metadata that would describe rapidly changing Web service collections. In this paper, we discuss the major shortcomings of state of the art Web service repositories and, as a solution, we
report on ongoing work and ideas on how to use techniques developed in the context of the Semantic Web (ontology learning, mapping, metadata based presentation) to improve the current situation
Developing an open data portal for the ESA climate change initiative
We introduce the rationale for, and architecture of, the European Space Agency Climate Change Initiative (CCI) Open Data Portal (http://cci.esa.int/data/). The Open Data Portal hosts a set of richly diverse datasets – 13 “Essential Climate Variables” – from the CCI programme in a consistent and harmonised form and to provides a single point of access for the (>100 TB) data for broad dissemination to an international user community. These data have been produced by a range of different institutions and vary across both scientific and spatio-temporal characteristics. This heterogeneity of the data together with the range of services to be supported presented significant technical challenges.
An iterative development methodology was key to tackling these challenges: the system developed exploits a workflow which takes data that conforms to the CCI data specification, ingests it into a managed archive and uses both manual and automatically generated metadata to support data discovery, browse, and delivery services. It utilises both Earth System Grid Federation (ESGF) data nodes and the Open Geospatial Consortium Catalogue Service for the Web (OGC-CSW) interface, serving data into both the ESGF and the Global Earth Observation System of Systems (GEOSS). A key part of the system is a new vocabulary server, populated with CCI specific terms and relationships which integrates OGC-CSW and ESGF search services together, developed as part of a dialogue between domain scientists and linked data specialists. These services have enabled the development of a unified user interface for graphical search and visualisation – the CCI Open Data Portal Web Presence
Evaluating tag-based information access in image collections
The availability of social tags has greatly enhanced access to information. Tag clouds have emerged as a new "social" way to find and visualize information, providing both one-click access to information and a snapshot of the "aboutness" of a tagged collection. A range of research projects explored and compared different tag artifacts for information access ranging from regular tag clouds to tag hierarchies. At the same time, there is a lack of user studies that compare the effectiveness of different types of tag-based browsing interfaces from the users point of view. This paper contributes to the research on tag-based information access by presenting a controlled user study that compared three types of tag-based interfaces on two recognized types of search tasks - lookup and exploratory search. Our results demonstrate that tag-based browsing interfaces significantly outperform traditional search interfaces in both performance and user satisfaction. At the same time, the differences between the two types of tag-based browsing interfaces explored in our study are not as clear. Copyright 2012 ACM
- …