3,161 research outputs found

    Location-Aware Keyword Query Proposal Based On File Proximity

    Query suggestions in web search help users find relevant content without having to know exactly how to phrase their queries. Existing keyword suggestion approaches do not take into account the locations of users and query results; i.e., the geographic proximity of a user to the retrieved results is not considered as a factor in the recommendation. However, in many applications (e.g., location-based services) the relevance of search results is known to be correlated with their geographic proximity to the query issuer. We build a location-aware keyword query suggestion framework. We propose a weighted keyword-document graph that captures both the semantic relevance between keyword queries and the geographic distance between the resulting documents and the user location. To select the highest-scoring keyword queries as suggestions, the graph is browsed in a random-walk-with-restart fashion. A partition-based technique outperforms the baseline method by up to an order of magnitude. We assess the performance of our framework and algorithms using real data.
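
    The random walk with restart over a keyword-document graph described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the function name `rwr_suggest`, the `closeness` function, the linear blend controlled by `beta`, and the toy data are all hypothetical. The sketch also assumes the query keyword is already a node in the graph.

```python
import math
from collections import defaultdict


def rwr_suggest(kw_docs, doc_locs, user_loc, query,
                beta=0.5, restart=0.15, iters=100):
    """Rank keywords by a random walk with restart from `query`.

    kw_docs:  keyword -> {document -> textual edge weight}
    doc_locs: document -> (x, y) location
    Assumes `query` is one of the keys of kw_docs.
    """
    def closeness(doc):
        # Hypothetical spatial score in (0, 1]: nearer documents score higher.
        x, y = doc_locs[doc]
        return 1.0 / (1.0 + math.hypot(x - user_loc[0], y - user_loc[1]))

    # Undirected bipartite graph; each edge weight blends the textual
    # weight with the document's closeness to the user (assumed blend).
    adj = defaultdict(dict)
    for kw, docs in kw_docs.items():
        for doc, text_w in docs.items():
            w = beta * text_w + (1.0 - beta) * closeness(doc)
            adj[("kw", kw)][("doc", doc)] = w
            adj[("doc", doc)][("kw", kw)] = w

    start = ("kw", query)
    score = {node: 0.0 for node in adj}
    score[start] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in score.items():
            total = sum(adj[node].values())
            for nbr, w in adj[node].items():
                # Walk to a neighbor with probability proportional to weight.
                nxt[nbr] += (1.0 - restart) * mass * w / total
        # With probability `restart`, jump back to the query keyword.
        nxt[start] += restart
        score = nxt

    ranked = sorted(
        ((s, name) for (kind, name), s in score.items()
         if kind == "kw" and name != query),
        reverse=True,
    )
    return [name for _, name in ranked]


# Toy data: d1 is next to the user, d2 is far away.
kw_docs = {
    "pizza": {"d1": 1.0, "d2": 1.0},
    "pizza delivery": {"d1": 1.0},
    "pizza recipe": {"d2": 1.0},
}
doc_locs = {"d1": (0.0, 1.0), "d2": (30.0, 40.0)}
print(rwr_suggest(kw_docs, doc_locs, (0.0, 0.0), "pizza"))
```

    In this toy graph the two candidate keywords have identical textual weights, so any difference in their ranking comes only from how close their documents are to the user.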

    Toward Entity-Aware Search

    As the Web has evolved into a data-rich repository, current search engines, with their standard "page view," are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we have obtained so far show the clear promise of entity-aware search in its usefulness, effectiveness, efficiency and scalability.

    Large Formless Sets of Data for Competitor Mining

    A company's success is determined by its ability to make an item more appealing to customers than its competitors' items. Several questions arise in the context of this challenge: how can the competitiveness between two items be formalized and quantified? Who are the primary competitors of a given item? What are an item's most distinguishing features? Although this problem has wide influence and relevance, only a limited amount of effort has gone into developing a suitable solution. In this project, the competitiveness between two items is defined formally, based on the market segments that they can both cover. Our competitiveness assessment uses customer reviews, a source of information that is widely available in a variety of domains. We provide efficient approaches for assessing competitiveness in large datasets and address the inherent challenge of finding the top-k competitors of a given item. Finally, we test the validity and scalability of our results using a variety of datasets from different domains.

    Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation

    Software bugs and failures cost trillions of dollars every year, and can even lead to deadly accidents (e.g., the Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making the necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often construct ad hoc queries on the fly, and attempt to locate reusable code on the Internet that could assist them in either bug fixing or feature implementation. Unfortunately, even experienced developers often fail to construct the right search queries. Even when developers come up with a few ad hoc queries, most of them require frequent modifications, which cost significant development time and effort. Thus, constructing an appropriate query for localizing software bugs, programming concepts or even reusable code is a major challenge. In this thesis, we address this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures, which were previously overlooked, and (3) by exploiting the crowd knowledge and word semantics derived from the Stack Overflow Q&A site, which were previously untapped.
Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches by a significant margin.
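
For context, the traditional TF-IDF keyword selection that this abstract names as the alternative its algorithms outperform can be sketched as follows. This is only an illustration of that generic baseline, not of CodeRank or BugDoctor; the function names `tokenize` and `tfidf_query`, the smoothed IDF variant, and the toy reports are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphabetic tokens only; real tools typically also split
    # identifiers and remove stop words (not done in this sketch).
    return re.findall(r"[a-z]+", text.lower())


def tfidf_query(report, corpus, k=3):
    """Pick the k highest-scoring TF-IDF terms of `report` as a search query."""
    docs = [set(tokenize(d)) for d in corpus]
    n = len(docs)
    tf = Counter(tokenize(report))

    def idf(term):
        # Smoothed inverse document frequency (an assumed variant).
        df = sum(1 for d in docs if term in d)
        return math.log((1 + n) / (1 + df)) + 1.0

    scores = {term: count * idf(term) for term, count in tf.items()}
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical issue reports used as the background corpus.
corpus = [
    "the app crashes on start",
    "the button is the wrong color",
    "the parser fails on empty input",
]
report = "the parser throws NullPointerException when the parser reads empty file"
print(tfidf_query(report, corpus))
```

A term such as "the" occurs often in the report but appears in every background report, so its low IDF keeps it out of the selected query, while a repeated and rarer term such as "parser" scores highest.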

    POSITION ATTENTIVE KEYWORD ENQUIRY PROPOSAL BASED ON PAPER CLOSENESS

    We design the first-ever location-aware keyword query suggestion (LKS) framework, for suggestions that are closely related to the user's information needs and that retrieve relevant documents near the query issuer's location. Existing keyword suggestion techniques do not consider the locations of the users and the query results; i.e., the spatial proximity of the user to the retrieved results is not taken as a factor in the recommendation. We propose a weighted keyword-document graph, which captures both the semantic relevance between keyword queries and the spatial distance between the resulting documents and the user location. Our proposed LKS framework is orthogonal to, and can be easily integrated into, suggestion techniques that make use of the query-URL bipartite graph. LKS has a different goal and therefore differs from other location-aware recommendation methods. The first challenge in our LKS framework is how to effectively measure keyword query similarity while capturing the spatial distance factor. To verify this assertion, we conducted experiments using two denser versions of our datasets, among them the dense AOL-D. In particular, the hybrid method outperforms other approaches because it uses both spatial and textual factors during the ink propagation process, and therefore better predicts how the ink tends to flow and cluster, achieving better partitioning. A baseline algorithm extended from algorithm BCA is introduced to solve the problem. Then, we propose a partition-based algorithm that computes the scores of the candidate keyword queries at the partition level and uses a lazy mechanism to reduce the computational cost.

    Sistema de Sugestões Sensível ao Contexto

    Over the last few years, pervasive systems have undergone considerable development. Human-human interaction can also take advantage of these systems by using their ability to perceive the surrounding environment. In this dissertation, we have developed a pervasive system, named ConversationaL Aware Suggestion SYstem (CLASSY), which is aware of the conversational context and suggests to users documents that are potentially useful or that otherwise save time in carrying out a specific task. We have also proposed two different approaches: the Neighborhood approach, which uses semantic similarity based on proximity data to classify the relationship between tokens; and the Reinforcement Learning approach, which uses the implicit feedback associated with each suggestion as a source of knowledge that can be used to improve the system's performance over time. The tests conducted showed that these two approaches not only enhanced the pervasive behavior of the system, but also increased its overall performance. A case study on the importance of feedback in context-limited environments was also carried out; its results showed that feedback remains a useful source of knowledge regardless of the characteristics of the conversational environment.
    Master's in Computer and Telematics Engineering

    A FRAMEWORK FOR QUERY RECOMMENDATION ON LOCATION-BASED QUERIES

    Existing keyword suggestion techniques do not consider the locations of the users and the query results; i.e., the spatial proximity of the user to the retrieved results is not taken as a factor in the recommendations. We propose a weighted keyword-document graph, which captures both the semantic relevance between keyword queries and the spatial distance between the resulting documents and the user location. We design the first-ever location-aware keyword query suggestion (LKS) framework, for suggestions that are highly relevant to the user's information needs and that retrieve relevant documents close to the query issuer's location. Our proposed LKS framework is orthogonal to, and can be conveniently integrated into, all suggestion techniques that make use of the query-URL bipartite graph. LKS has a different goal and therefore is distinct from other location-aware recommendation methods. The first challenge in our LKS framework is how to effectively measure keyword query similarity while capturing the spatial distance factor. To verify this assertion, we conducted experiments using two denser versions of our datasets, among them the dense AOL-D. In particular, the hybrid method outperforms other approaches because it uses both spatial and textual factors during the ink propagation process, and therefore better predicts how the ink tends to flow and cluster, achieving better partitioning. A baseline algorithm extended from algorithm BCA is introduced to solve the problem. Then, we propose a partition-based algorithm that computes the scores of the candidate keyword queries at the partition level and relies on a lazy mechanism to help reduce the computational cost.

    A Systematic Review of Automated Query Reformulations in Source Code Search

    Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trial and error during a code search. Over the years, many studies have attempted to reformulate developers' ad hoc queries to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss the best practices and future opportunities to advance the state of research in search query reformulations. Comment: 81 pages, accepted at TOSE

    OPTIMISED DOCUMENT RECOMMENDATION BASED ON CONTENT AND QUERYING USER LOCATION

    We design the first-ever location-aware keyword query suggestion (LKS) framework, for suggestions that are highly relevant to the user's information needs and that retrieve relevant documents near the query issuer's location. Existing keyword suggestion techniques do not consider the locations of the users and the query results; i.e., the spatial proximity of the user to the retrieved results is not taken as a factor in the recommendations. We propose a weighted keyword-document graph, which captures both the semantic relevance between keyword queries and the spatial distance between the resulting documents and the user location. Our proposed LKS framework is orthogonal to, and can be easily integrated into, all suggestion techniques that use the query-URL bipartite graph. LKS has a different goal and therefore differs from other location-aware recommendation methods. The first challenge in our LKS framework is how to effectively measure keyword query similarity while capturing the spatial distance factor. To verify this assertion, we conducted experiments using two denser versions of our datasets, among them the dense AOL-D. In particular, the hybrid method outperforms other approaches because it uses both spatial and textual factors throughout the ink propagation process, and so better predicts how the ink tends to flow and cluster, achieving better partitioning. A baseline algorithm extended from algorithm BCA is introduced to solve the problem. Then, we propose a partition-based algorithm that computes the scores of the candidate keyword queries at the partition level and relies on a lazy mechanism to help reduce the computational cost.