21 research outputs found

    Sensemaking for Broad Topics via Automated Extraction and Recursive Search

    The availability of vast amounts of diverse information related to a broad topic makes it difficult and time-consuming for users to find and digest the right information regarding various low-level topics within the broader space. Current approaches to addressing these challenges include providing curated topical pages, relevant query refinement suggestions, lists of subtopics, etc. However, these approaches do not scale and offer inadequate support for sensemaking. This disclosure describes automated techniques that extract information from online information sources by using a query related to a high-level topic to recursively formulate additional queries for subtopics, constructing a hierarchical set of topics related to the broad query. The results can be used to provide a user interface organized by the hierarchical topic levels, which can make it faster and easier for users to understand and navigate information regarding a high-level topic.
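    The recursive query formulation described in this abstract can be sketched as follows. This is a minimal illustration, not the method from the disclosure: the function `build_topic_tree`, the `suggest_subtopics` callable, the depth limit, and the toy data are all assumptions introduced here. In a real system, `suggest_subtopics` would issue a query against an information source and extract candidate subtopics from the results.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical toy data standing in for an online information source.
TOY_SUBTOPICS: Dict[str, List[str]] = {
    "coffee": ["brewing", "beans"],
    "brewing": ["espresso", "coffee"],
    "beans": ["arabica", "brewing"],
}


def build_topic_tree(
    query: str,
    suggest_subtopics: Callable[[str], List[str]],
    max_depth: int = 2,
    seen: Optional[Set[str]] = None,
) -> dict:
    """Recursively expand a broad query into a hierarchy of subtopics.

    `seen` prevents a topic from appearing twice, which also guards
    against cycles such as coffee -> brewing -> coffee.
    """
    seen = set() if seen is None else seen
    seen.add(query)
    node = {"topic": query, "children": []}
    if max_depth == 0:
        return node  # depth limit reached: keep this topic as a leaf
    for sub in suggest_subtopics(query):
        if sub in seen:
            continue  # already placed elsewhere in the hierarchy
        child = build_topic_tree(sub, suggest_subtopics, max_depth - 1, seen)
        node["children"].append(child)
    return node


tree = build_topic_tree("coffee", lambda q: TOY_SUBTOPICS.get(q, []))
```

    The depth limit and the shared `seen` set are design choices made for this sketch; they keep the recursion bounded and place each topic at the first level where it is encountered.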

    Automated Extraction of Pivot Topics for Sideways Expansion of Search Scope

    Users benefit from mechanisms that can help them refine their queries to facilitate searching for information connected to their underlying intent. Apart from refinements that narrow the scope of a query, users can benefit from suggestions that help them pivot their information seeking by expanding their search sideways to related topics. This disclosure describes computational techniques for automated determination of suitable topics and/or queries for helping users expand the scope of their information search by pivoting to topics related to their query. The techniques involve selecting a meta-query, performing query expansion, and identifying, aggregating, and deduplicating related entities. The identified entities are clustered and ranked to enable selection of particular entities that can be shown to users as pivot topics.
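    The aggregation, deduplication, and ranking steps named in this abstract can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the normalization rule, the ranking signal (how many expanded queries surfaced an entity), and the toy data. The clustering step mentioned in the abstract is omitted, and the disclosure may rank entities quite differently.

```python
from collections import Counter
from typing import Dict, List


def normalize(entity: str) -> str:
    """Case-fold and collapse whitespace so near-identical strings match."""
    return " ".join(entity.lower().split())


def rank_pivot_candidates(
    results_per_query: Dict[str, List[str]],
    original_query: str,
    top_k: int = 3,
) -> List[str]:
    """Aggregate entities across expanded queries, deduplicate, and rank.

    Entities are ranked by the number of expanded queries that returned
    them, with ties broken alphabetically (an assumed heuristic).
    """
    support: Counter = Counter()
    display: Dict[str, str] = {}
    original = normalize(original_query)
    for entities in results_per_query.values():
        seen_here = set()
        for entity in entities:
            key = normalize(entity)
            if key == original or key in seen_here:
                continue  # skip the query itself and within-list duplicates
            seen_here.add(key)
            support[key] += 1
            display.setdefault(key, entity)  # keep first-seen surface form
    ranked = sorted(support, key=lambda k: (-support[k], k))
    return [display[k] for k in ranked[:top_k]]


# Hypothetical expanded queries and the entities each one surfaced.
toy_results = {
    "tea alternatives": ["Coffee", "Yerba Mate", "yerba  mate"],
    "drinks like tea": ["Yerba mate", "Kombucha", "Tea"],
    "tea vs": ["Coffee", "Kombucha", "Matcha"],
}
pivots = rank_pivot_candidates(toy_results, "tea")
```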

    Evaluation of utility of LSA for word sense discrimination

    The goal of the ongoing project described in this paper is to evaluate the utility of Latent Semantic Analysis (LSA) for unsupervised word sense discrimination. The hypothesis is that LSA can be used to compute context vectors for ambiguous words that can be clustered together, with each cluster corresponding to a different sense of the word. In this paper we report first experimental results on the tightness, separation and purity of sense-based clusters as a function of vector space dimensionality and of the distance metric used.
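    Of the three cluster-quality measures mentioned, purity has a widely used standard definition: each cluster is credited with its most frequent gold sense, and the credited items are divided by the total number of items. The sketch below computes that standard measure. Whether the paper uses exactly this definition is an assumption, and the function name and toy labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def cluster_purity(
    cluster_ids: Sequence[Hashable], sense_labels: Sequence[Hashable]
) -> float:
    """Fraction of items that carry the majority sense of their cluster."""
    if len(cluster_ids) != len(sense_labels):
        raise ValueError("cluster_ids and sense_labels must align")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    senses_by_cluster = defaultdict(Counter)
    for cluster, sense in zip(cluster_ids, sense_labels):
        senses_by_cluster[cluster][sense] += 1
    # Credit each cluster with the count of its most frequent sense.
    majority_total = sum(
        max(counts.values()) for counts in senses_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Hypothetical clustering of six contexts of the ambiguous word "bank".
clusters = [0, 0, 0, 1, 1, 1]
senses = ["river", "river", "money", "money", "money", "money"]
purity = cluster_purity(clusters, senses)
```

    Here cluster 0 contributes two correctly grouped contexts and cluster 1 contributes three, so five of the six contexts sit with their cluster's majority sense.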

    Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news

    Story segmentation of news broadcasts has been shown to improve the accuracy of subsequent processes such as question answering and information retrieval. In previous work, a decision tree was trained on automatically extracted lexical and acoustic features to predict story boundaries, using hypothesized sentence boundaries to define potential story boundaries. In this paper, we empirically evaluate several alternative choices of input segmentation for three languages: English, Mandarin and Arabic. Our results suggest that the best performance can be achieved by using 250 ms pause-based segmentation or sentence boundaries determined using a very low confidence score threshold. Index Terms: story boundary detection, segmentation
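    As a rough illustration of pause-based input segmentation, the sketch below splits a stream of time-stamped words wherever the silence between consecutive words reaches a threshold, using the 250 ms value from the abstract as the default. The input format (token, start time, end time in seconds), the function name, and the toy timings are assumptions; the paper's actual pipeline operates on speech recognizer output and is not reproduced here.

```python
from typing import List, Sequence, Tuple

Word = Tuple[str, float, float]  # (token, start_sec, end_sec), assumed format


def segment_by_pause(
    words: Sequence[Word], min_pause: float = 0.250
) -> List[List[str]]:
    """Start a new segment whenever the inter-word gap is >= min_pause."""
    segments: List[List[str]] = []
    current: List[str] = []
    prev_end = 0.0
    for token, start, end in words:
        if current and start - prev_end >= min_pause:
            segments.append(current)  # pause is long enough: close segment
            current = []
        current.append(token)
        prev_end = end
    if current:
        segments.append(current)
    return segments


# Hypothetical word timings with gaps of 0.05 s, 0.40 s, 0.02 s and 0.70 s.
toy_words = [
    ("good", 0.00, 0.30),
    ("evening", 0.35, 0.80),
    ("in", 1.20, 1.30),
    ("sports", 1.32, 1.80),
    ("tonight", 2.50, 3.00),
]
segments = segment_by_pause(toy_words)
```

    Each resulting segment end would then serve as a candidate story boundary for a downstream classifier.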