From Natural Language to a SPARQL Query in the SWIP System
Our goal is to give users a way to query knowledge bases using queries expressed in natural language. We aim to hide the complexity of formulating queries in a graph query language such as SPARQL. The main originality of our approach lies in its use of query patterns. In this article, we justify the assumption that queries from "real-life" users are variations on a few typical families of queries. We also explain how our approach can be adapted to different languages. The first evaluations on the QALD-2 challenge dataset show the relevance of our approach.
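As a rough illustration of the query-pattern idea, the sketch below fills the slots of one fixed SPARQL template with resource URIs. The abstract does not describe how SWIP defines or selects its patterns, so the template, the `instantiate` helper, and the `example.org` URIs are all hypothetical, and the step that maps a user's words to URIs is assumed to have happened already.

```python
from string import Template

# Hypothetical pattern for questions of the form
# "which <class> has <property> <value>?". Not taken from SWIP.
PATTERN = Template(
    "SELECT DISTINCT ?x WHERE {\n"
    "  ?x a <$cls> .\n"
    "  ?x <$prop> <$value> .\n"
    "}"
)


def instantiate(cls: str, prop: str, value: str) -> str:
    """Fill the pattern slots with URIs already chosen for the user's keywords."""
    return PATTERN.substitute(cls=cls, prop=prop, value=value)


# Illustrative URIs only; a real system would look them up in the knowledge base.
query = instantiate(
    "http://example.org/Film",
    "http://example.org/directedBy",
    "http://example.org/SomeDirector",
)
print(query)
```

Under this reading, a "family" of user queries corresponds to one such template, and individual queries differ only in the values bound to its slots.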
Measuring mobile search tasks on Android platform
The presented work concerns complex search tasks and exploratory search. Search algorithms need statistical data to improve and must take the effects of the user platform into account. Mobile use is growing quickly, and more and more search queries are made from mobile devices. Because mobile platforms differ from desktops in specific ways, comparing mobile search with desktop-based search is an interesting research field. The Search Experiment application for Android developed in this work makes it possible to compare the behavior of different users carrying out the same search tasks. The application can log the sequences of actions users perform during their searches and provides a data store engine for the logs. The work gives a brief overview of related work and similar implementations and describes the implementation details of the Search Experiment application for the Android platform. It also gives a short summary of the collected statistics and the usability of the program.
Optimizing Interactive Systems via Data-Driven Objectives
Effective optimization is essential for real-world interactive systems to
provide a satisfactory user experience in response to changing user behavior.
However, it is often challenging to find an objective to optimize for
interactive systems (e.g., policy learning in task-oriented dialog systems).
Generally, such objectives are manually crafted and rarely capture complex user
needs in an accurate manner. We propose an approach that infers the objective
directly from observed user interactions. These inferences can be made
regardless of prior knowledge and across different types of user behavior. We
introduce Interactive System Optimizer (ISO), a novel algorithm that uses these
inferred objectives for optimization. Our main contribution is a new general
principled approach to optimizing interactive systems using data-driven
objectives. We demonstrate the high effectiveness of ISO over several
simulations.
Comment: 30 pages, 12 figures. arXiv admin note: text overlap with
arXiv:1802.0630
Rewarding the Location of Terms in Sentences to Enhance Probabilistic Information Retrieval
In most traditional retrieval models, the weight (or probability) of a query term is estimated based on its own distribution or statistics. Intuitively, however, nouns are more important in information retrieval and are more often found near the beginning and the end of sentences. In this thesis, we investigate the effect on information retrieval of rewarding terms based on their location in sentences. In particular, we propose a kernel-based method to capture the term placement pattern, from which a novel Term Location retrieval model is derived in combination with the BM25 model to enhance probabilistic information retrieval. Experiments on five TREC datasets of varied size and content indicate that the proposed model significantly outperforms the optimized BM25 and DirichletLM in MAP over all datasets with all kernel functions, and outperforms the optimized BM25 and DirichletLM over most of the datasets in P@5 and P@20 with different kernel functions.
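To make the idea of a location reward concrete, here is a minimal sketch that scales a standard BM25 term weight by a Gaussian kernel on the distance between a term occurrence and the nearest sentence boundary. The abstract does not give the thesis's kernels or the way they are combined with BM25, so `location_reward`, `score_term`, the multiplicative combination, and the parameters `sigma` and `alpha` are assumptions made for illustration only.

```python
import math


def bm25_term(tf, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
    """Standard BM25 weight of one query term in one document."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1.0) / norm


def location_reward(index, sentence_len, sigma=0.25):
    """Assumed Gaussian kernel: 1.0 at either end of a sentence, lower in the middle."""
    if sentence_len <= 1:
        return 1.0
    rel = index / (sentence_len - 1)   # relative position in [0, 1]
    dist = min(rel, 1.0 - rel)         # distance to the nearest boundary
    return math.exp(-(dist ** 2) / (2.0 * sigma ** 2))


def score_term(term, sentences, df, n_docs, avg_len, alpha=0.5):
    """BM25 weight scaled by the mean location reward of the term's occurrences.

    `sentences` is one document given as a list of token lists. The
    combination rule (1 + alpha * mean reward) is an assumption.
    """
    rewards = [location_reward(i, len(s))
               for s in sentences for i, tok in enumerate(s) if tok == term]
    if not rewards:
        return 0.0
    doc_len = sum(len(s) for s in sentences)
    base = bm25_term(len(rewards), df, n_docs, doc_len, avg_len)
    return base * (1.0 + alpha * sum(rewards) / len(rewards))


doc = [["retrieval", "models", "rank", "documents", "quickly"]]
edge = score_term("retrieval", doc, df=10, n_docs=100, avg_len=5.0)
middle = score_term("rank", doc, df=10, n_docs=100, avg_len=5.0)
```

With identical term statistics, `edge` comes out larger than `middle` because the first token sits on a sentence boundary while the third sits in the centre.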
FreeLib: A Self-Sustainable Peer-to-Peer Digital Library Framework for Evolving Communities
The need for efficient solutions to the problem of disseminating and sharing data is growing. Digital libraries provide an efficient solution for disseminating and sharing large volumes of data to diverse sets of users. They enable the use of structured and well-defined metadata to provide quality search services. Most of the digital libraries built so far follow a centralized model. The centralized model is efficient; however, it has some inherent problems. It is not suitable when content contribution is highly distributed over a very large number of participants. It also requires organizational support to provide resources (hardware, software, and network bandwidth) and to manage processes for collecting, ingesting, curating, and maintaining the content.
In this research, we develop an alternative digital library framework based on a peer-to-peer model. The framework utilizes resources contributed by participating nodes to achieve self-sustainability. A second key contribution of this research is a significant enhancement of search performance by utilizing the novel concept of community evolution. As demonstrated in this thesis, bringing users who share similar interests together in a community significantly enhances search performance. Evolving users into communities is based on a simple analysis of user access patterns, carried out in a completely distributed manner. This community evolution process is completely transparent to the user. In our framework, the community membership of each node evolves continuously. This allows users to move between communities as their interests shift between topics, so search performance is enhanced even when a user's interests change. It also gives our framework great flexibility, as it allows communities to dissolve and new communities to form and evolve over time to reflect the latest user interests. In addition to self-sustainability and performance enhancements, our framework has the potential to build extremely large collections even though every node maintains only a small collection of digital objects.
A user-centred approach to information retrieval
A user model is a fundamental component in user-centred information retrieval systems. It enables personalization of a user's search experience. The development of such a model involves three phases: collecting information about each user, representing such information, and integrating the model into a retrieval application. Progress in this area is typically met with privacy and scalability challenges that hinder the ability to synthesize collective knowledge from each user's search behaviour. In this thesis, I propose a framework that addresses each of these three phases. The proposed framework is based on social role theory from the social science literature and at the centre of this theory is the concept of a social position. A social position is a label for a group of users with similar behavioural patterns. Examples of such positions are traveller, patient, movie fan, and computer scientist. In this thesis, a social position acts as a label for users who are expected to have similar interests. The proposed framework does not require real users' data; rather it uses the web as a resource to model users.
The proposed framework offers a data-driven and modular design for each of the three phases of building a user model. First, I present an approach to identify social positions from natural language sentences. I formulate this task as a binary classification task and develop a method to enumerate candidate social positions. The proposed classifier achieves an accuracy score of 85.8%, which indicates that social positions can be identified with good accuracy. Through an inter-annotator agreement study, I further show a reasonable level of agreement between users when identifying social positions.
Second, I introduce a novel topic modelling-based approach to represent each social position as a multinomial distribution over words. This approach estimates a topic from a document collection for each position. To construct such a collection for a particular position, I propose a seeding algorithm that extracts a set of terms relevant to the social position. Coherence-based evaluation shows that the proposed approach learns significantly more coherent representations when compared with a relevance modelling baseline.
Third, I present a diversification approach based on the proposed framework. Diversification algorithms aim to return a result list for a search query that would potentially satisfy users with diverse information needs. I propose to identify social positions that are relevant to a search query. These positions act as an implicit representation of the many possible interpretations of the search query. Then, relevant positions are provided to a diversification technique that proportionally diversifies results based on each social position's importance. I evaluate my approach using four test collections provided by the diversity task of the Text REtrieval Conference (TREC) web tracks for 2009, 2010, 2011, and 2012. Results demonstrate that my proposed diversification approach is effective and provides statistically significant improvements over various implicit diversification approaches.
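The notion of diversifying results in proportion to each social position's importance can be sketched with a simple seat-allocation loop. This is not the thesis's algorithm, which the abstract does not specify; it is a simplified scheme that uses the Sainte-Laguë quotient to decide which position should fill the next rank. The `diversify` function, the position weights, and the relevance scores are hypothetical.

```python
def diversify(weights, relevance, k):
    """Return up to k documents, giving each position slots roughly
    in proportion to its weight.

    weights:   {position: importance}
    relevance: {position: {doc_id: score}}, with the same keys as weights
    """
    seats = {p: 0 for p in weights}
    ranking = []
    candidates = {d for scores in relevance.values() for d in scores}
    while candidates and len(ranking) < k:
        # Sainte-Lague quotient: a position that holds few slots
        # relative to its weight gets the next slot.
        pos = max(weights, key=lambda p: weights[p] / (2 * seats[p] + 1))
        # Best remaining document for that position; sorting first makes
        # ties resolve by document id, so the output is deterministic.
        doc = max(sorted(candidates), key=lambda d: relevance[pos].get(d, 0.0))
        ranking.append(doc)
        candidates.remove(doc)
        seats[pos] += 1
    return ranking


# Hypothetical positions and scores for one ambiguous query.
weights = {"traveller": 0.6, "patient": 0.4}
relevance = {
    "traveller": {"d1": 0.9, "d2": 0.8, "d3": 0.1},
    "patient": {"d3": 0.7, "d4": 0.6},
}
ranking = diversify(weights, relevance, k=4)
print(ranking)  # ['d1', 'd3', 'd2', 'd4']
```

In this toy run the two positions alternate, starting with the heavier one, so the top of the list covers both interpretations of the query instead of only the most likely one.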
Fourth, I introduce a session-based search system under the framework of learning to rank. Such a system aims to improve the retrieval performance for a search query using previous user interactions during the search session. I present a method to match a search session to its most relevant social positions based on the session's interaction data. I then suggest identifying related sessions from query logs that are likely to have been issued by users with similar information needs. Novel learning features are then estimated from the session's social positions, related sessions, and interaction data. I evaluate the proposed system using four test collections from the TREC session track. This approach achieves state-of-the-art results compared with effective session-based search systems. I demonstrate that this strong performance is mainly attributable to features derived from the social positions' data.