7 research outputs found

    A new integrated model for multitasking during web searching

    Investigating multitasking information behaviour, particularly while using the web, has become an increasingly important research area. People's reliance on the web to seek and find information has encouraged a number of researchers to investigate the characteristics of information seeking behaviour and the web seeking strategies used. The current research set out to explore multitasking information behaviour while using the web in relation to people's personal characteristics, working memory, and flow (a state where people feel in control and immersed in the task). Also investigated were the effects of pre-determined knowledge about search tasks and the artefact characteristics. In addition, the study investigated cognitive states (interactions between the user and the system) and cognitive coordination shifts (the way people change their actions to search effectively) while multitasking on the web. The research was exploratory, using a mixed method approach. Thirty university students participated: 10 psychologists, 10 accountants and 10 mechanical engineers. The data collection tools used were: pre- and post-questionnaires, pre-interviews, a working memory test, a flow state scale test, audio-visual data, web search logs, think aloud data, observation, and the critical decision method. Based on the working memory test, the participants were divided into two groups, those with high scores and those with lower scores. Similarly, participants were divided into two groups based on their flow state scale tests. All participants searched for information on the web for four topics: two for which they had prior knowledge and two for which they had none. The results revealed that working memory capacity affects multitasking information behaviour during web searching. 
For example, the participants in the high working memory group and high flow group had a significantly greater number of cognitive coordination and state shifts than the low working memory group and low flow group. Further, the perception of task complexity was related to working memory capacity; those with low memory capacity thought task complexity increased towards the end of tasks for which they had no prior knowledge compared to tasks for which they had prior knowledge. The results also showed that all participants, regardless of their working memory capacity and flow level, had the same first frequent cognitive coordination and cognitive state sequences: from strategy to topic. In respect of disciplinary differences, accountants rated task complexity at the end of the web seeking procedure to be statistically less significant for information tasks with prior knowledge compared to the participants from the other disciplines. Moreover, multitasking information behaviour characteristics such as the number of queries, web search sessions and opened tabs/windows during searches were affected by discipline. The findings of the research enabled an exploratory integrated model to be created, which illustrates the nature of multitasking information behaviour when using the web. One other contribution of this research was to develop new, more specific and closely grounded definitions of task complexity and artefact characteristics. This new research may influence the creation of more effective web search systems by placing more emphasis on our understanding of the complex cognitive mechanisms of multitasking information behaviour when using the web.

    Selective web information retrieval

    This thesis proposes selective Web information retrieval, a framework formulated in terms of statistical decision theory, with the aim of applying an appropriate retrieval approach on a per-query basis. The main component of the framework is a decision mechanism that selects the retrieval approach for each query. The selection of a particular retrieval approach is based on the outcome of an experiment, which is performed before the final ranking of the retrieved documents. The experiment is a process that extracts features from a sample of the set of retrieved documents. This thesis investigates three broad types of experiments. The first counts the occurrences of query terms in the retrieved documents, indicating the extent to which the query topic is covered in the document collection. The second type of experiment considers information from the distribution of retrieved documents in larger aggregates of related Web documents, such as whole Web sites, or directories within Web sites. The third type of experiment estimates the usefulness of the hyperlink structure among a sample of the set of retrieved Web documents. The proposed experiments are evaluated in the context of both informational and navigational search tasks with an optimal Bayesian decision mechanism, where it is assumed that relevance information exists. This thesis further investigates the implications of applying selective Web information retrieval in an operational setting, where the tuning of a decision mechanism is based on limited existing relevance information and the information retrieval system’s input is a stream of queries related to mixed informational and navigational search tasks. First, the experiments are evaluated using different training and testing query sets, as well as a mixture of different types of queries. 
Second, query sampling is introduced, in order to approximate the queries that a retrieval system receives, and to tune an ad-hoc decision mechanism with a broad set of automatically sampled queries.
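
The per-query decision described in this abstract can be illustrated with a minimal sketch of a Bayesian decision rule. It is not the thesis's implementation. The approach names, the Gaussian model of the experiment outcome, the priors, and the zero-one loss are all assumptions chosen for illustration.

```python
from math import exp, pi, sqrt


def gaussian(x, mean, std):
    """Density of a normal distribution, used here as an assumed likelihood."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))


def select_approach(outcome, models, loss):
    """Pick the retrieval approach with the lowest expected loss.

    outcome: score produced by the experiment for the current query.
    models:  approach -> (prior, mean, std) of the outcome for queries
             on which that approach is the most effective one (hypothetical values).
    loss:    loss[a][b] is the cost of applying approach a when b is best.
    """
    # Posterior probability that each approach is the best one, given the outcome.
    joint = {b: prior * gaussian(outcome, mean, std)
             for b, (prior, mean, std) in models.items()}
    total = sum(joint.values())
    posterior = {b: p / total for b, p in joint.items()}
    # Expected loss of applying each approach under that posterior.
    expected = {a: sum(loss[a][b] * posterior[b] for b in posterior)
                for a in models}
    return min(expected, key=expected.get)


# Hypothetical setup: two approaches and a zero-one loss.
MODELS = {
    "content_only": (0.5, 0.2, 0.1),
    "content_plus_links": (0.5, 0.8, 0.1),
}
ZERO_ONE = {a: {b: 0.0 if a == b else 1.0 for b in MODELS} for a in MODELS}

low = select_approach(0.25, MODELS, ZERO_ONE)
high = select_approach(0.75, MODELS, ZERO_ONE)
```

With a zero-one loss the rule reduces to choosing the approach with the highest posterior probability. A different loss table would let the mechanism prefer a safer approach when the evidence is weak.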

    Integration of distributed terminology resources to facilitate subject cross-browsing for library portal systems

    With the increase in the number of distributed library information resources, users may have to interact with different user interfaces, learn to switch their mental models between these interfaces, and familiarise themselves with the controlled vocabularies used by different resources. For this reason, library professionals have developed library portals to integrate these distributed information resources, and to assist end-users in cross-accessing distributed resources via a single access point in their own library. There are two important subject-based services that a library portal system might be able to provide. The first is a federated search service, in which a user can input a single query to cross-search a number of information resources. The second is a subject cross-browsing service, which can offer a knowledge navigation tree to link the subject schemes used by distributed resources. However, the development of subject cross-searching and browsing services has been impeded by the heterogeneity of the Knowledge Organisation Systems (KOS) used by different information resources. Due to the lack of mappings between different KOS, it is impossible to offer a subject cross-browsing service for a library portal system. [Continues.]

    Doctor of Philosophy in Computer Science

    Over the last decade, social media has emerged as a revolutionary platform for informal communication and social interactions among people. Publicly expressing thoughts, opinions, and feelings is one of the key characteristics of social media. In this dissertation, I present research on automatically acquiring knowledge from social media that can be used to recognize people's affective state (i.e., what someone feels at a given time) in text. This research addresses two types of affective knowledge: 1) hashtag indicators of emotion consisting of emotion hashtags and emotion hashtag patterns, and 2) affective understanding of similes (a form of figurative comparison). My research introduces a bootstrapped learning algorithm for learning hashtag indicators of emotions from tweets with respect to five emotion categories: Affection, Anger/Rage, Fear/Anxiety, Joy, and Sadness/Disappointment. With a few seed emotion hashtags per emotion category, the bootstrapping algorithm iteratively learns new hashtags and more generalized hashtag patterns by analyzing emotion in tweets that contain these indicators. Emotion phrases are also harvested from the learned indicators to train additional classifiers that use the surrounding word context of the phrases as features. This is the first work to learn hashtag indicators of emotions. My research also presents a supervised classification method for classifying affective polarity of similes in Twitter. Using lexical, semantic, and sentiment properties of different simile components as features, supervised classifiers are trained to classify a simile into a positive or negative affective polarity class. The property of comparison is also fundamental to the affective understanding of similes. 
My research introduces a novel framework for inferring implicit properties that 1) uses syntactic constructions, statistical association, dictionary definitions and word embedding vector similarity to generate and rank candidate properties, 2) re-ranks the top properties using influence from multiple simile components, and 3) aggregates the ranks of each property from different methods to create a final ranked list of properties. The inferred properties are used to derive additional features for the supervised classifiers to further improve affective polarity recognition. Experimental results show substantial improvements in affective understanding of similes over the use of existing sentiment resources.
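
The rank aggregation step mentioned in this abstract can be sketched as follows. The abstract does not say which aggregation rule the dissertation uses, so this sketch assumes a simple sum of rank positions, with a penalty rank for candidates that a method did not return. The candidate properties and the three method names are hypothetical.

```python
def aggregate_ranks(rankings):
    """Merge several ranked lists into one by summing rank positions.

    rankings: list of lists, each ordered from best to worst candidate.
    A candidate missing from a list gets rank len(list) + 1 (an assumed penalty).
    Ties are broken alphabetically so the result is deterministic.
    """
    candidates = {c for ranking in rankings for c in ranking}
    totals = {}
    for c in candidates:
        totals[c] = sum(
            ranking.index(c) + 1 if c in ranking else len(ranking) + 1
            for ranking in rankings
        )
    return sorted(candidates, key=lambda c: (totals[c], c))


# Hypothetical candidate properties for a simile, ranked by three methods.
by_syntax = ["cold", "hard", "clear"]
by_association = ["cold", "clear", "slippery"]
by_embedding = ["hard", "cold", "clear"]

final = aggregate_ranks([by_syntax, by_association, by_embedding])
print(final)  # ['cold', 'hard', 'clear', 'slippery']
```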

    Définition et évaluation de modèles d'agrégation pour l'estimation de la pertinence multidimensionnelle en recherche d'information

    The main research topic of this document revolves around the field of information retrieval (IR). Traditional IR models rank documents by computing single scores separately with respect to one single objective criterion. Recently, an increasing number of IR studies has triggered a resurgence of interest in redefining the algorithmic estimation of relevance, which implies a shift from topical to multidimensional relevance assessment. In our work, we specifically address the multidimensional relevance assessment and evaluation problems. To tackle this challenge, state-of-the-art approaches are often based on linear combination mechanisms. However, these methods rely on the unrealistic hypotheses of additivity and independence of the relevance dimensions, which makes them unsuitable in many real situations where criteria are correlated. Other techniques from the machine learning area have also been proposed. These learn a model from example inputs and generalize it to combine the different criteria. Nonetheless, these methods tend to offer only limited insight into how to consider the importance of, and the interaction between, the criteria. In addition to the sensitivity of the parameters used within these algorithms, it is quite difficult to understand why one criterion is preferred over another. To address this problem, we proposed a model based on a multi-criteria aggregation operator that is able to overcome the problem of additivity. Our model is based on a fuzzy measure that offers semantic interpretations of the correlations and interactions between the criteria. We have adapted this model to multidimensional relevance estimation in two scenarios: (i) a tweet search task and (ii) two personalized IR settings. The second line of research focuses on the integration of the temporal factor in the aggregation process, in order to consider the changes of document collections over time. 
To do so, we have proposed a time-aware IR model for combining the temporal relevance criterion with the topical relevance one. Then, we performed a time series analysis to identify the temporal nature of queries, and we proposed an evaluation framework within a time-aware IR setting. The general problem addressed in our work lies within the scientific field of information retrieval (IR). Classical IR models are generally based on a definition of relevance that is tied essentially to the topical match between the subject of the query and the subject of the document. The concept of relevance has been revisited at different levels, integrating various factors related to the user and their environment in an IR situation. In this work, we specifically address the problem of modelling multidimensional relevance by defining new models for aggregating criteria and evaluating them in IR tasks. To address this problem, state-of-the-art work relies mainly on simple linear combinations. However, these methods rest on the unrealistic assumption of additivity or independence of the dimensions, which makes the model unsuitable in many real search situations in which the criteria are correlated or interact with one another. Other techniques from machine learning have also been proposed, which learn a model from examples and generalize it to rank and aggregate the criteria. However, these methods tend to offer limited insight into how to account for the importance of, and the interaction between, the criteria. In addition to the sensitivity of the parameters used in these algorithms, it is very difficult to understand why one criterion is preferred over another. 
To address this first research direction, we proposed a multi-criteria relevance combination model based on an aggregation operator that overcomes the additivity problem of classical combination functions. Our model is based on a measure that gives a clearer picture of the correlations and interactions between the criteria. We adapted this model to two multi-criteria relevance combination scenarios: (i) a multi-criteria information retrieval setting for tweet search and (ii) two personalized information retrieval settings. The second research direction concerns integrating the temporal factor into the aggregation process in order to account for changes in document collections over time. To this end, we proposed a time-aware aggregation model that combines the temporal factor with the topical relevance factor. With this goal, we performed a temporal analysis to elicit the temporal aspect of queries, and we proposed an evaluation of this model in a time-aware search task.
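
To make the contrast with linear combination concrete, here is a minimal sketch of aggregating relevance scores with respect to a fuzzy measure. The abstract does not name its operator. This sketch assumes the discrete Choquet integral, a standard aggregation operator defined on fuzzy measures. The criteria names, the scores, and the measure values are hypothetical.

```python
from math import isclose


def choquet(scores, mu):
    """Discrete Choquet integral of per-criterion scores.

    scores: criterion -> score in [0, 1].
    mu:     frozenset of criteria -> weight in [0, 1], a fuzzy measure that is
            monotone and gives the full set of criteria a weight of 1.0.
    """
    ordered = sorted(scores, key=scores.get)  # criteria by ascending score
    total, previous = 0.0, 0.0
    for i, criterion in enumerate(ordered):
        # Coalition of all criteria scoring at least as high as this one.
        coalition = frozenset(ordered[i:])
        total += (scores[criterion] - previous) * mu[coalition]
        previous = scores[criterion]
    return total


# Hypothetical non-additive measure: the two criteria together are worth
# more than the sum of their individual weights (0.6 + 0.3 < 1.0).
MU = {
    frozenset({"topical"}): 0.6,
    frozenset({"recency"}): 0.3,
    frozenset({"topical", "recency"}): 1.0,
}

doc_a = choquet({"topical": 0.8, "recency": 0.4}, MU)
doc_b = choquet({"topical": 0.4, "recency": 0.8}, MU)
```

When the measure is additive, meaning the weight of every coalition equals the sum of its members' weights, the same function returns the ordinary weighted sum. In that sense the linear combination is a special case of this operator.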