
    Events and Controversies: Influences of a Shocking News Event on Information Seeking

    It has been suggested that online search and retrieval contributes to the intellectual isolation of users within their preexisting ideologies, where people's prior views are strengthened and alternative viewpoints are infrequently encountered. This so-called "filter bubble" phenomenon has been called out as especially detrimental when it comes to dialog among people on controversial, emotionally charged topics, such as the labeling of genetically modified food, the right to bear arms, the death penalty, and online privacy. We seek to identify and study information-seeking behavior and access to alternative versus reinforcing viewpoints following shocking, emotional, and large-scale news events. As a case study, we analyze search and browsing on gun control/rights, a strongly polarizing topic for both citizens and leaders of the United States. We study the period of time preceding and following a mass shooting to understand how its occurrence, follow-on discussions, and debate may have been linked to changes in the patterns of searching and browsing. We employ information-theoretic measures to quantify the diversity of Web domains of interest to users and understand the browsing patterns of users. We use these measures to characterize the influence of news events on Web search and browsing patterns.
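
    One common information-theoretic diversity measure is Shannon entropy over the distribution of visited domains. The abstract does not say which measures the authors use, so the following is only a minimal sketch under that assumption; the function name and the sample browsing histories are hypothetical.

```python
import math
from collections import Counter


def domain_entropy(domains):
    """Shannon entropy, in bits, of the distribution of visited domains.

    Assumption: entropy stands in for the unspecified
    "information-theoretic measures" mentioned in the abstract.
    """
    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over domains of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical browsing histories, made up for illustration.
narrow = ["a.example", "a.example", "a.example", "a.example"]
mixed = ["a.example", "b.example", "c.example", "d.example"]

print(domain_entropy(narrow))
print(domain_entropy(mixed))
```

    A user who visits a single domain scores zero, and the score grows as visits spread more evenly over more domains, so comparing scores before and after an event gives one rough view of a change in diversity.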

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed to work in a given domain in other domains. Comment: Knowledge-based System

    VAS (Visual Analysis System): An information visualization engine to interpret World Wide Web structure

    People increasingly encounter problems of interpreting and filtering mass quantities of information. The enormous growth of information systems on the World Wide Web has demonstrated that we need systems to filter, interpret, organize and present information in ways that allow users to use these large quantities of information. People need to be able to extract knowledge from this sometimes meaningful but sometimes useless mass of data in order to make informed decisions. Web users need to have some kind of information about the sort of page they might visit, such as, is it a rarely referenced or often-referenced page? This master's thesis presents a method to address these problems using data mining and information visualization techniques.

    Analysing Web Multimedia Query Reformulation Behaviour

    Current multimedia Web search engines still use keywords as the primary means to search. Owing to the richness of multimedia content, general users constantly experience difficulties in formulating textual queries that are representative enough for their needs. As a result, query reformulation becomes an inevitable part of most multimedia searches. Previous Web query formulation studies did not investigate the modification sequences and thus can only report limited findings on the reformulation behavior. In this study, we propose an automatic approach to examine multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the most dominant type of modification in visual searches but less important in audio searches. Image search users prefer the specified search strategy more than video and audio users. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of the search strategies in different types of multimedia searching provides some insights into users' searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.
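
    To make the idea of reformulation types concrete, consecutive queries in a log can be compared term by term. The abstract does not describe the authors' automatic approach, so the following is only an illustrative sketch: the function name, the category labels, and the set-based rules are all assumptions.

```python
def classify_reformulation(previous, current):
    """Label how `current` differs from `previous` by comparing term sets.

    Assumption: these categories and rules are illustrative only and are
    not taken from the study.
    """
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())
    if prev_terms == curr_terms:
        return "repeat"            # same terms, possibly reordered
    if prev_terms < curr_terms:
        return "specification"     # terms were added
    if curr_terms < prev_terms:
        return "generalization"    # terms were removed
    if prev_terms & curr_terms:
        return "replacement"       # some terms kept, others swapped
    return "new"                   # no terms in common


# Hypothetical consecutive queries, made up for illustration.
print(classify_reformulation("red car", "red car photo"))
print(classify_reformulation("red car", "red truck"))
```

    Counting these labels over a whole session log would give a rough profile of which modification types dominate, although detecting synonym replacement would need a thesaurus or similar resource that this sketch omits.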

    Task-based user profiling for query refinement (toque)

    The information needs of search engine users vary in complexity. Some simple needs can be satisfied by using a single query, while complicated ones require a series of queries spanning a period of time. A search task, consisting of a sequence of search queries serving the same information need, can be treated as an atomic unit for modeling users' search preferences and has been applied in improving the accuracy of search results. However, existing studies on user search tasks mainly focus on applying users' interests in re-ranking search results. Only a few studies have examined the effects of utilizing search tasks to assist users in obtaining effective queries. Moreover, fewer existing studies have examined the dynamic characteristics of users' search interests within a search task. Furthermore, even fewer studies have examined approaches to selective personalization for candidate refined queries that are expected to benefit from its application. This study proposes a framework for modeling users' task-based dynamic search interests to address these issues and makes the following contributions. First, task identification: a cross-session based method is proposed to discover tasks by modeling the best-link structure of queries, based on the commonly shared clicked results. A graph-based representation method is introduced to improve the effectiveness of link prediction in a query sequence. Second, dynamic task-level search interest representation: a four-tuple user profiling model is introduced to represent long- and short-term user interests extracted from search tasks and sessions. It models users' interests at the task level to re-rank candidate queries through modules of task identification and update. Third, selective personalization: a two-step personalization algorithm is proposed to improve the rankings of candidate queries for query refinement by assessing the task dependency via exploiting a latent task space. Experimental results show that the proposed TOQUE framework contributes to an increased precision of candidate queries and thus shortened search sessions.