
    Semantic Equivalence of e-Commerce Queries

    Search query variation poses a challenge in e-commerce search, as equivalent search intents can be expressed through different queries with surface-level differences. This paper introduces a framework to recognize and leverage query equivalence to enhance searcher and business outcomes. The proposed approach addresses three key problems: mapping queries to vector representations of search intent, identifying nearest neighbor queries expressing equivalent or similar intent, and optimizing for user or business objectives. The framework utilizes both surface similarity and behavioral similarity to determine query equivalence. Surface similarity involves canonicalizing queries based on word inflection, word order, compounding, and noise words. Behavioral similarity leverages historical search behavior to generate vector representations of query intent. An offline process is used to train a sentence similarity model, while an online nearest neighbor approach supports processing of unseen queries. Experimental evaluations demonstrate the effectiveness of the proposed approach, outperforming popular sentence transformer models and achieving a Pearson correlation of 0.85 for query similarity. The results highlight the potential of leveraging historical behavior data and training models to recognize and utilize query equivalence in e-commerce search, leading to improved user experiences and business outcomes. Further advancements and benchmark datasets are encouraged to facilitate the development of solutions for this critical problem in the e-commerce domain. Comment: The 6th Workshop on e-Commerce and NL
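
    The surface-similarity step described in this abstract (canonicalizing on inflection, word order, compounding, and noise words) might be sketched as follows. This is only an illustration and not the authors' implementation: the noise-word list, the compound vocabulary, the crude plural stripping, and all function names are hypothetical, and the behavioral-similarity side of the framework is not covered.

```python
import re

# Hypothetical resources; a real system would derive these from data.
NOISE_WORDS = {"a", "an", "the", "for", "with", "of", "in"}
KNOWN_COMPOUNDS = {"notebook", "backpack"}


def naive_singular(token):
    """Crude inflection handling: strip a trailing 's' (but not 'ss')."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def merge_compounds(tokens):
    """Join adjacent tokens whose concatenation is a known compound."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] + tokens[i + 1] in KNOWN_COMPOUNDS:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def canonicalize(query):
    """Map a query to a canonical string; equal strings suggest surface equivalence."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    tokens = [naive_singular(t) for t in tokens if t not in NOISE_WORDS]
    tokens = merge_compounds(tokens)
    # Sorting removes word-order differences.
    return " ".join(sorted(tokens))


if __name__ == "__main__":
    print(canonicalize("Shoes for men"))
    print(canonicalize("note books"))
```

    Two queries with the same canonical form would be treated as surface-equivalent under these assumptions; queries that differ only in meaning-bearing words keep distinct forms.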

    Deriving query suggestions for site search

    Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files. © 2013 ASIS&T

    'A Modern Up-To-Date Laptop' -- Vagueness in Natural Language Queries for Product Search

    With the rise of voice assistants and an increase in mobile search usage, natural language has become an important query language. So far, most current systems are not able to process these queries because of the vagueness and ambiguity in natural language. Users have adapted their query formulation to what they think the search engine is capable of, which adds to their cognitive burden. With our research, we contribute to the design of interactive search systems by investigating the genuine information need in a product search scenario. In a crowd-sourcing experiment, we collected 132 information needs in natural language. We examine the vagueness of the formulations and their match to retailer-generated content and user-generated product reviews. Our findings reveal high variance in the level of vagueness and the potential of user reviews as a source for supporting users with rather vague search intents.

    Using Linked Data for Prosopographical Research of Historical Persons : Case U.S. Congress Legislators

    This paper shows how biographical registries can be represented as Linked Data, enriched by data linking to related data sources, and used in Digital Humanities. As a use case, a database of 11 987 historical U.S. Congress Legislators in 1789–2018 was transformed into a knowledge graph. The data was published as a Linked Data service, including a SPARQL endpoint, on top of which tools for biographical and prosopographical research are implemented. A faceted browser named U.S. Congress Prosopographer with visualization tools for knowledge discovery is presented to provide new insights in political history. Peer reviewed

    Personal Semantics

    Quantified self, life logging, digital eyeglasses: technology is advancing rapidly to a point where people can gather masses of data about their own persons and their own lives. Large-scale models of what people are doing are being built by credit companies, advertising agencies, and national security agencies, using digital traces that people leave behind them. How can individuals exploit their own data for their own benefit? With this mass of personal data, we will need to induce personal semantic dimensions to sift data and find what is meaningful to each individual. In this chapter, we present semantic dimensions made by experts and by crowds. We show the type of information that individuals will have access to once lifelogging becomes common, and we sketch what personal semantic dimensions might look like.

    A Sparse Stress Model

    Force-directed layout methods constitute the most common approach to draw general graphs. Among them, stress minimization produces layouts of comparatively high quality but also imposes comparatively high computational demands. We propose a speed-up method based on the aggregation of terms in the objective function. It is akin to aggregate repulsion from far-away nodes during spring embedding but transfers the idea from the layout space into a preprocessing phase. An initial experimental study informs a method to select representatives, and subsequent more extensive experiments indicate that our method yields better approximations of minimum-stress layouts in less time than related methods. Comment: Appears in the Proceedings of the 24th International Symposium on Graph Drawing and Network Visualization (GD 2016)
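
    For orientation, the objective that stress minimization works on can be sketched as below. This computes the conventional full stress, the sum over node pairs of (layout distance minus graph distance) squared, weighted by the inverse squared graph distance; that weighting is a common convention assumed here, not something the abstract states. The sparse aggregation proposed in the paper is not reproduced. The function names and the assumption of a connected, unweighted graph are illustrative.

```python
from collections import deque
from itertools import combinations
import math


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as adjacency lists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def stress(adj, pos):
    """Full stress of a layout: one term per node pair (quadratic in node count).

    Assumes the graph is connected, so every pair has a finite distance.
    """
    dist = {u: bfs_distances(adj, u) for u in adj}
    total = 0.0
    for u, v in combinations(sorted(adj), 2):
        d = dist[u][v]
        layout_d = math.dist(pos[u], pos[v])
        total += (layout_d - d) ** 2 / d ** 2
    return total


if __name__ == "__main__":
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    straight = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
    bent = {"a": (0, 0), "b": (1, 0), "c": (1, 1)}
    print(stress(path, straight), stress(path, bent))
```

    The quadratic number of pairwise terms is what makes the full objective expensive, which is the cost the abstract's aggregation of terms aims to reduce.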

    The State-of-the-Art of Set Visualization

    Sets comprise a generic data model that has been used in a variety of data analysis problems. Such problems involve analysing and visualizing set relations between multiple sets defined over the same collection of elements. However, visualizing sets is a non-trivial problem due to the large number of possible relations between them. We provide a systematic overview of state-of-the-art techniques for visualizing different kinds of set relations. We classify these techniques into six main categories according to the visual representations they use and the tasks they support. We compare the categories to provide guidance for choosing an appropriate technique for a given problem. Finally, we identify challenges in this area that need further research and propose possible directions to address these challenges. Further resources on set visualization are available at http://www.setviz.net

    Cell lineage visualisation

    Cell lineages describe the developmental history of cell populations and are produced by combining time-lapse imaging and image processing. Biomedical researchers study cell lineages to understand fundamental processes, such as cell differentiation and the pharmacodynamic action of anticancer agents. Yet, the interpretation of cell lineages is hindered by their complexity and insufficient capacity for visual analysis. We present a novel approach for interactive visualisation of cell lineages. Based on an understanding of cellular biology and live-cell imaging methodology, we identify three requirements: multimodality (cell lineages combine spatial, temporal, and other properties), symmetry (related to lineage branching structure), and synchrony (related to temporal alignment of cellular events). We address these by combining visual summaries of the spatiotemporal behaviour of an arbitrary number of lineages, including variation from average behaviour, with node-link representations that emphasise the presence or absence of symmetry and synchrony. We illustrate the merit of our approach by presenting a real-world case study where the cytotoxic action of the anticancer drug topotecan was determined.