Search CORE

23 research outputs found

Determining and satisfying search users real needs via socially constructed search concept classification

Author: Dreher Heinz
Zhu Dengya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

The focus of the research is to disambiguate search queries by categorizing the results returned by search engines and interacting with the user to refine both the query and the results. A novel special-purpose search-browser has been developed which combines search engine results, a lightweight ontology based on the Open Directory Project (ODP) that serves as navigator and classifier, and search results categorization. Categories are formed based on the ODP as a predefined ontology, and Lucene is to be employed to calculate the similarity between items retrieved by the search engine and concepts in the ODP. Through interaction with users, the search-browser improves the quality of search results by excluding irrelevant documents and ontologically categorizing results for user inspection.
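The categorization step described in this abstract can be illustrated with a minimal sketch. It assumes a plain bag-of-words cosine similarity as a stand-in for the Lucene scoring the abstract mentions, and the category names and descriptions below are hypothetical placeholders, not actual ODP entries.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorize(snippet, categories):
    """Return (best_category, score) for a search-result snippet."""
    vec = Counter(tokenize(snippet))
    scores = {name: cosine(vec, Counter(tokenize(desc)))
              for name, desc in categories.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical category descriptions (assumption: not real ODP data).
CATEGORIES = {
    "Computers/Programming": "java programming language software code compiler",
    "Regional/Asia/Indonesia": "java island indonesia jakarta travel volcano",
    "Recreation/Food/Coffee": "java coffee beans roast espresso drink",
}

# Hypothetical result snippets for the ambiguous query "java".
RESULTS = [
    "Download the Java compiler and write software code",
    "Travel guide to Java island, Indonesia",
]

for snippet in RESULTS:
    print(snippet, "->", categorize(snippet, CATEGORIES))
```

Grouping results under the best-matching category gives the user a set of senses to choose from, which is the kind of interactive refinement the abstract describes.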

Crossref

espace@Curtin

Analyse de l'ambiguïté des requêtes utilisateurs par catégorisation thématique.

Author: Lalleman Fanny
Publication venue: HAL CCSD
Publication date: 01/07/2011

In this article, we seek to identify the nature of the ambiguity of user queries submitted to a news search engine, 2424actu.fr, using a categorization task. First, we review the different forms of query ambiguity already described in natural language processing work. We contrast the lexicographic view of ambiguity with the one described by classification techniques applied to information retrieval. Second, we apply a thematic categorization method to explore query ambiguity; this lets us conduct a semantic analysis of these queries that takes into account the temporal dimension specific to the news context. We propose a typology of ambiguity phenomena based on our semantic analysis. Finally, we compare exploration by categorization with a resource such as Wikipedia, showing concretely where the two approaches diverge.

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

Discovering web services to specify more complete system requirements

Author: G. Salton
H. Schutze
K. Zachos
M. Stevenson
N. Leavitt
S. Robertson
S.V. Jones
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Service-centric systems pose new challenges and opportunities for requirements processes and techniques. This paper reports new techniques developed by the EU-funded SeCSE Integrated Project that enable service discovery during early requirements processes and exploit discovered services to enhance requirements specifications. The paper describes the algorithm for discovering services from requirements expressed using structured natural language, and demonstrates it using an automotive example. The paper also reports a first evaluation of the utility of the environment that implements this algorithm when improving the specification of requirements with retrieved services

City Research Online

Crossref

Détecter le potentiel d'ambiguïté d'une requête - le cas des recherches portant sur l'actualité

Author: Fabre Cécile
Heinecke Johannes
Lalleman Fanny
Publication venue: HAL CCSD
Publication date: 01/01/2012

The aim of the work presented here is to examine the notion of ambiguity by studying the queries submitted to an information retrieval system, Orange's 2424actu.fr site, which was in operation from 1/10/2009 to 1/09/2011. The system handles a collection of documents on French current affairs, a particularly fast-changing domain and therefore well suited to examining the question of ambiguity. We seek to determine the nature of query ambiguity by examining the available query logs and comparing them against various contextual cues that enrich our picture of the semantic variability of the query terms.

Scientific Publications of the University of Toulouse II Le Mirail

EDP Sciences OAI-PMH repository (1.2.0)

HAL Descartes

Automatic Concept Extraction in Semantic Summarization Process

Author: Antonella Carbonaro
Publication venue: 'IntechOpen'
Publication date: 01/01/2012

The Semantic Web offers a generic infrastructure for the interchange, integration and creative reuse of structured data, which can help to cross some of the boundaries that Web 2.0 is facing. Currently, Web 2.0 offers poor query possibilities apart from searching by keywords or tags. There has been a great deal of interest in the development of semantic-based systems to facilitate knowledge representation and extraction and content integration [1], [2]. A semantic-based approach to retrieving relevant material can help to address issues such as determining the type or the quality of the information suggested from a personalized environment. In this context, standard keyword search has very limited effectiveness: it cannot, for example, filter by the type, the level or the quality of information.

Potentially, one of the biggest application areas of content-based exploration might be personalized searching frameworks (e.g., [3], [4]). Whereas search engines currently provide largely anonymous information, a new framework might highlight or recommend web pages related to key concepts. We can consider semantic information representation as an important step towards wide and efficient manipulation and retrieval of information [5], [6], [7]. In the digital library community a flat list of attribute/value pairs is often assumed to be available. In the Semantic Web community, annotations are often assumed to be an instance of an ontology. Through ontologies the system will express key entities and relationships describing resources in a formal machine-processable representation. An ontology-based knowledge representation could be used for content analysis and object recognition, for reasoning processes and for enabling user-friendly and intelligent multimedia content search and retrieval.

Text summarization has been an interesting and active research area since the 1960s. The underlying assumption is that a small portion or several keywords of the original long document can represent the whole informatively and/or indicatively. Reading or processing this shorter version of the document would save time and other resources [8]. This property is especially valuable at present due to the vast availability of information. A concept-based approach to representing dynamic and unstructured information can help to address issues such as determining the key concepts and summarizing the information exchanged within a personalized environment. In this context, a concept is represented by a Wikipedia article. With millions of articles and thousands of contributors, this online repository of knowledge is the largest and fastest growing encyclopedia in existence. The problem described above can then be divided into three steps:

• Mapping each of a series of terms to the most appropriate Wikipedia article (disambiguation).
• Assigning a score to each item identified on the basis of its importance in the given context.
• Extracting the n items with the highest score.

Text summarization can be applied to many fields, from information retrieval to text mining processes and text display. Text summarization could also be very useful in a personalized searching framework.

The chapter is organized as follows. The next section introduces the personalized searching framework as one of the possible application areas of automatic concept extraction systems. Section three describes the summarization process, providing details on system architecture, methodology and tools. Section four provides an overview of document summarization approaches that have been recently developed. Section five summarizes a number of real-world applications which might benefit from word sense disambiguation (WSD). Section six introduces Wikipedia and WordNet as used in our project. Section seven describes the logical structure of the project, describing software components and databases. Finally, Section eight provides some considerations.
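The three steps named in this abstract (disambiguation, scoring, extraction of the n highest-scoring items) can be sketched as follows. This is only an illustration under stated assumptions: the sense inventory `SENSES` is a hypothetical toy table rather than Wikipedia data, disambiguation is done by simple context-word overlap, and the score is the number of mentions, none of which the abstract specifies.

```python
from collections import Counter

# Hypothetical sense inventory: term -> {article title: context words}.
# Assumption: stands in for a lookup against Wikipedia articles.
SENSES = {
    "jaguar": {
        "Jaguar (animal)": {"cat", "jungle", "predator", "prey"},
        "Jaguar Cars": {"car", "engine", "luxury", "vehicle"},
    },
    "amazon": {
        "Amazon rainforest": {"jungle", "river", "forest", "brazil"},
        "Amazon (company)": {"retail", "cloud", "shopping", "company"},
    },
    "predator": {
        "Predation": {"prey", "hunt", "animal", "jungle"},
    },
}

def disambiguate(term, context):
    """Step 1: pick the article whose context words overlap the text most."""
    candidates = SENSES.get(term)
    if not candidates:
        return None
    return max(candidates, key=lambda title: len(candidates[title] & context))

def extract_concepts(text, n=2):
    """Steps 2 and 3: score each mapped article, then keep the top n."""
    tokens = text.lower().split()
    context = set(tokens)
    scores = Counter()
    for token in tokens:
        article = disambiguate(token, context)
        if article is not None:
            scores[article] += 1  # assumed score: number of mentions
    return scores.most_common(n)

TEXT = ("the jaguar is a jungle predator and the jaguar hunts prey "
        "in the amazon jungle")
print(extract_concepts(TEXT, n=2))
```

A real system would replace the overlap heuristic and the mention count with whatever disambiguation and importance measures the chapter defines; the sketch only shows how the three steps chain together.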

IntechOpen

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

USI: a fast and accurate approach for conceptual document annotation

Author: Fiorini Nicolas
Montmain Jacky
Ranwez Sylvie
Ranwez Vincent
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/03/2015

Background: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. Results: In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related, already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via the usual metrics and semantic similarity. Conclusions: By relying only on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet it provides better results than the other approaches by giving a consistent annotation scored with a global criterion instead of one score per concept.

Improving self-organising information maps as navigational tools: A semantic approach

Author: Brusilovsky P
He D
Lin YL
Publication venue: 'Emerald'
Publication date: 01/01/2011

Purpose - The goal of the research is to explore whether the use of higher-level semantic features can help us to build better self-organising map (SOM) representation as measured from a human-centred perspective. The authors also explore an automatic evaluation method that utilises human expert knowledge encapsulated in the structure of traditional textbooks to determine map representation quality. Design/methodology/approach - Two types of document representations involving semantic features have been explored - i.e. using only one individual semantic feature, and mixing a semantic feature with keywords. Experiments were conducted to investigate the impact of semantic representation quality on the map. The experiments were performed on data collections from a single book corpus and a multiple book corpus. Findings - Combining keywords with certain semantic features achieves significant improvement of representation quality over the keywords-only approach in a relatively homogeneous single book corpus. Changing the ratios in combining different features also affects the performance. While semantic mixtures can work well in a single book corpus, they lose their advantages over keywords in the multiple book corpus. This raises a concern about whether the semantic representations in the multiple book corpus are homogeneous and coherent enough for applying semantic features. The terminology issue among textbooks affects the ability of the SOM to generate a high quality map for heterogeneous collections. Originality/value - The authors explored the use of higher-level document representation features for the development of better quality SOM. In addition the authors have piloted a specific method for evaluating the SOM quality based on the organisation of information content in the map. © 2011 Emerald Group Publishing Limited

Crossref

D-Scholarship@Pitt

Word sense discrimination in information retrieval: a spectral clustering-based approach

Author: Chifu Adrian-Gabriel
Hristea Florentina
Mothe Josiane
Popescu Marius
Publication venue: 'Elsevier BV'
Publication date: 01/07/2014

Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been defined to help systems choose which documents should be retrieved in relation to an ambiguous query. However, the only approaches that show a genuine benefit for word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method we develop is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. For several TREC ad hoc collections we show that our method is useful in the case of queries which contain ambiguous terms. We are interested in improving the level of precision after 5, 10 and 30 retrieved documents (P@5, P@10, P@30) respectively. We show that precision can be improved by 8% above current state-of-the-art baselines. We also focus on poor performing queries.
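The P@k measures mentioned in this abstract follow the standard definition of precision at rank k: the fraction of the first k retrieved documents that are relevant. A minimal sketch, using hypothetical document identifiers and relevance judgements (not data from the paper), compares an initial ranking with a reordered one:

```python
def precision_at_k(ranked_doc_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k

# Hypothetical rankings and relevance judgements for one query.
initial_ranking = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d0"]
reranked = ["d1", "d3", "d4", "d7", "d2", "d9", "d8", "d5", "d6", "d0"]
relevant = {"d1", "d3", "d4", "d2"}

for name, ranking in [("initial", initial_ranking), ("reranked", reranked)]:
    print(name, precision_at_k(ranking, relevant, 5))
```

Because P@k looks only at the top of the list, a reordering step that moves relevant documents upward can raise it without retrieving any new documents.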

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes