26 research outputs found

    Web Search using Improved Concept Based Query Refinement

    The information extracted from Web pages can be used for effective query expansion. Improving the accuracy of Web search engines requires the inclusion of metadata, not only to analyze Web content but also to interpret it. With the Web of today being unstructured and semantically heterogeneous, keyword-based queries are likely to miss important results. Using data mining methods, our system derives dependency rules and applies them to concept-based queries. This paper presents a novel approach for query expansion that applies dependency rules mined from a large Web World, combining several existing techniques for data extraction and mining, and integrates the system into COMPACT, our prototype implementation of a concept-based search engine.

    Tweet Enrichment for Effective Dimensions Classification in Online Reputation Management

    Online Reputation Management (ORM) is concerned with the monitoring of public opinions on social media for entities such as commercial organisations. In particular, we investigate the task of reputation dimension classification, which aims to classify tweets that mention a business entity into different dimensions (e.g. "financial performance" or "products and services"). However, producing a general reputation dimension classification system that can be used across businesses of different types is challenging, due to the brief nature of tweets and the lack of terms in tweets that relate to specific reputation dimensions. To tackle these issues, we propose a robust and effective tweet enrichment approach that expands tweets with additional discriminative terms from a contemporary Web corpus. Using the RepLab 2014 test collection, we show that our tweet enrichment approach outperforms effective baselines including the top performing submission to RepLab 2014. Moreover, we show that the achieved accuracy scores are very close to the upper bound that our approach could achieve on this collection.

    Information Retrieval based on Content and Location Ontology for Search Engine (CLOSE)

    This paper focuses on personalizing a search engine using data mining techniques, so that user preferences are taken into consideration. Clickthrough data is applied to the user profile to mine user preferences and to extract features that indicate what users are really interested in. The basic idea is to construct content and location ontologies, where content represents the user's previous search records and location refers to the user's current location. SpyNB is the approach used to mine user preferences from the clickthrough data. The ranked support vector machine (RVSM) is then applied to the search results so that they are displayed according to the user preferences derived from the clickthrough data.

    Graph Enhanced BERT for Query Understanding

    Query understanding plays a key role in exploring users' search intents and helping users locate their most desired information. However, it is inherently challenging since it needs to capture semantic information from short and ambiguous queries and often requires massive task-specific labeled data. In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks because they can extract general semantic information from large-scale corpora. Therefore, there are unprecedented opportunities to adopt PLMs for query understanding. However, there is a gap between the goal of query understanding and existing pre-training strategies: the goal of query understanding is to boost search performance, while existing strategies rarely consider this goal. Thus, directly applying them to query understanding is sub-optimal. On the other hand, search logs contain user clicks between queries and URLs that provide rich information about users' search behavior on queries beyond their content. Therefore, in this paper, we aim to fill this gap by exploring search logs. In particular, to incorporate search logs into pre-training, we first construct a query graph where nodes are queries and two queries are connected if they lead to clicks on the same URLs. Then we propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph. In other words, GE-BERT can capture both the semantic information and the users' search behavioral information of queries. Extensive experiments on various query understanding tasks have demonstrated the effectiveness of the proposed framework.
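The query graph described in this abstract (queries as nodes, an edge between two queries that led to clicks on the same URL) can be illustrated with a minimal sketch. The log format, the function name `build_query_graph`, and the example queries and URLs are assumptions made for illustration; they are not taken from the GE-BERT paper, and the pre-training step itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_query_graph(click_log):
    """Return the set of undirected edges between queries sharing a clicked URL.

    click_log is assumed to be an iterable of (query, clicked_url) pairs;
    this format is hypothetical.
    """
    # Group queries by the URL they led to.
    queries_by_url = defaultdict(set)
    for query, url in click_log:
        queries_by_url[url].add(query)

    # Connect every pair of distinct queries that share a URL.
    edges = set()
    for queries in queries_by_url.values():
        for a, b in combinations(sorted(queries), 2):
            edges.add((a, b))
    return edges


# Toy click log (made-up data).
click_log = [
    ("jaguar price", "cars.example/jaguar"),
    ("jaguar xf cost", "cars.example/jaguar"),
    ("jaguar habitat", "zoo.example/jaguar"),
]
edges = build_query_graph(click_log)
```

Here the two car-related queries are linked because they share a clicked URL, while the wildlife query stays isolated even though all three share the word "jaguar".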

    SHORT TEXT INFERENCE USING ENHANCED STRING SEMANTICS

    The encoding is created by a deep neural network that is trained on texts represented by word-count vectors (bag-of-words representation). Unfortunately, for short texts such as search queries, tweets, or news titles, such representations are inadequate to capture the semantics. Clustering short texts (for example, news titles) by their meaning is a difficult task. The semantic hashing approach encodes the meaning of a text into a compact code. Thus, to tell whether two texts have similar meanings, we simply check whether they have similar codes. To cluster short texts by their meanings, we propose to add more semantic signals to short texts. Specifically, for each term in a short text, we obtain its concepts and co-occurring terms from a probabilistic knowledge base to enrich the short text. In addition, we introduce a simplified deep learning network consisting of 3-layer stacked auto-encoders for semantic hashing. Comprehensive experiments show that, with more semantic signals, our simplified deep learning model is able to capture the semantics of short texts, which facilitates various applications including short text retrieval, classification, and general-purpose text processing.
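The idea of checking whether two texts have similar compact codes can be sketched as a Hamming-distance comparison. The 8-bit codes and the threshold below are invented for illustration; in the approach described above the codes would come from the trained auto-encoder network, which is not reproduced here.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def similar(code_a: int, code_b: int, max_bits: int = 2) -> bool:
    """Treat two texts as similar if their codes differ in at most max_bits bits.

    max_bits is an assumed threshold, not a value from the paper.
    """
    return hamming_distance(code_a, code_b) <= max_bits


# Hypothetical 8-bit codes for three short texts.
code_news_1 = 0b10110010
code_news_2 = 0b10110011  # differs from code_news_1 in one bit
code_other = 0b01001101   # differs from code_news_1 in every bit

near = similar(code_news_1, code_news_2)
far = similar(code_news_1, code_other)
```

Comparing codes this way costs one XOR and a bit count per pair, which is what makes compact codes attractive for clustering and retrieval over many short texts.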

    Advising Customers on Products in Navigating Online Shops – An Empirical Analysis

    Most online shops do not provide advisory services that take advantage of expert product knowledge. Therefore, consumers may spend a higher search effort than necessary to find products that serve their needs. This study investigates to what extent an ontology-based, "advisory" navigation menu can decrease a consumer's search effort. For this purpose, we conducted a laboratory experiment with 159 participants to assess the impact of an ontology-based navigation menu on participants' information behavior in an online shop. Our log file-based comparison with a conventional navigation menu showed a significant decrease of search effort for the advisory navigation menu. Comparison criteria include the number of product result pages viewed, the number of detail pages viewed, and the number of filters used in a session. Implications of this research concern the development of online shop interfaces that use ontology-based product catalogues and therefore support consumers in their information search.

    Survey and evaluation of query intent detection methods

    Second ACM International Conference on Web Search and Data Mining, Barcelona (Spain). User interactions with search engines reveal three main underlying intents, namely navigational, informational, and transactional. By providing more accurate results depending on such query intents, the performance of search engines can be greatly improved. Therefore, query classification has been an active research topic in recent years. However, while query topic classification has been given a specific bakeoff, no evaluation campaign has been devoted to the study of automatic query intent detection. In this paper, some of the available query intent detection techniques are reviewed, an evaluation framework is proposed, and it is used to compare those methods in order to shed light on their relative performance and drawbacks. As will be shown, manually prepared gold-standard files are much needed, and traditional pooling is not the most feasible evaluation method. In addition, future lines of work in both query intent detection and its evaluation are proposed.

    Concept-based short text classification and ranking

    Most existing approaches for text classification represent texts as vectors of words, namely "Bag-of-Words." This text representation results in a very high dimensionality of feature space and frequently suffers from surface mismatching. Short texts make these issues even more serious, due to their shortness and sparsity. In this paper, we propose using "Bag-of-Concepts" in short text representation, aiming to avoid the surface mismatching and handle the synonym and polysemy problem. Based on "Bag-of-Concepts," a novel framework is proposed for lightweight short text classification applications. By leveraging a large taxonomy knowledgebase, it learns a concept model for each category, and conceptualizes a short text to a set of relevant concepts. A concept-based similarity mechanism is presented to classify the given short text to the most similar category. One advantage of this mechanism is that it facilitates short text ranking after classification, which is needed in many applications, such as query or ad recommendation. We demonstrate the usage of our proposed framework through a real online application: Channel-based Query Recommendation. Experiments show that our framework can map queries to channels with a high degree of precision (avg. precision = 90.3%), which is critical for recommendation applications.
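A rough sketch of the "Bag-of-Concepts" idea follows: a short text is mapped to concepts through a taxonomy and then assigned to the category whose concept model is most similar. The toy taxonomy, the hand-written category models, and the use of cosine similarity are all assumptions for illustration; the paper learns its concept models from a large knowledge base and defines its own similarity mechanism.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy mapping a term to the concepts it may denote.
TAXONOMY = {
    "python": ["programming language", "snake"],
    "java": ["programming language", "island"],
    "cobra": ["snake"],
    "compiler": ["software"],
}

# Hypothetical concept models, one weighted bag of concepts per category.
CATEGORY_MODELS = {
    "technology": Counter({"programming language": 3, "software": 2}),
    "animals": Counter({"snake": 3}),
}


def conceptualize(text):
    """Map a short text to a bag of concepts via the taxonomy."""
    concepts = Counter()
    for term in text.lower().split():
        concepts.update(TAXONOMY.get(term, []))
    return concepts


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, category_models):
    """Return the best-matching category and the score of every category."""
    concepts = conceptualize(text)
    scores = {name: cosine(concepts, model) for name, model in category_models.items()}
    return max(scores, key=scores.get), scores


label, scores = classify("java compiler", CATEGORY_MODELS)
```

Because every category receives a score, the same scores can be reused to rank short texts within a category after classification, which is the property the abstract highlights for query or ad recommendation.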