Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data.
Multiple Models for Recommending Temporal Aspects of Entities
Entity aspect recommendation is an emerging task in semantic search that
helps users discover serendipitous and prominent information with respect to an
entity, of which salience (e.g., popularity) is the most important factor in
previous work. However, entity aspects are temporally dynamic and often driven
by events happening over time. For such cases, aspect suggestion based solely
on salience features can give unsatisfactory results, for two reasons. First,
salience is often accumulated over a long time period and does not account for
recency. Second, many aspects related to an event entity are strongly
time-dependent. In this paper, we study the task of temporal aspect
recommendation for a given entity, which aims at recommending the most relevant
aspects and takes into account time in order to improve search experience. We
propose a novel event-centric ensemble ranking method that learns from multiple
time and type-dependent models and dynamically trades off salience and recency
characteristics. Through extensive experiments on real-world query logs, we
demonstrate that our method is robust and achieves better effectiveness than
competitive baselines. Comment: In proceedings of the 15th Extended Semantic Web Conference (ESWC
2018).
Query Click and Text Similarity Graph for Query Suggestions
Query suggestion is an important search engine feature, given the explosive and diverse growth of web content. Different kinds of suggestions, such as queries, images, movies, music, and books, are used every day, and they draw on various types of data sources. If we model the data as graphs, we can build a general method for any kind of suggestion. In this paper, we propose a general method for query suggestion that combines two graphs: (1) a query click graph, which captures the relationship between queries whose clicks frequently land on common URLs, and (2) a query text similarity graph, which measures the similarity between two queries using Jaccard similarity. The proposed method provides queries that are both literally and semantically relevant to the user's need. Simulation results show that the proposed algorithm outperforms the heat diffusion method by providing a larger number of relevant queries. It can be used for recommendation tasks such as query, image, and product suggestion.
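The two-graph idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how the graphs are weighted or merged, so the sketch assumes that click-graph edges are scored by the overlap of clicked-URL sets, that text similarity is Jaccard similarity over lowercase word tokens, and that the two scores are mixed linearly with a hypothetical weight `alpha`. The function names and the sample click log are likewise illustrative.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word-token sets of two queries."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def click_similarity(clicks: dict, q1: str, q2: str) -> float:
    """Overlap of the URL sets clicked for two queries (an assumed edge weight)."""
    u1, u2 = clicks.get(q1, set()), clicks.get(q2, set())
    union = u1 | u2
    return len(u1 & u2) / len(union) if union else 0.0


def suggest(query: str, clicks: dict, alpha: float = 0.5, k: int = 3) -> list:
    """Rank other logged queries by a linear mix of click and text similarity.

    alpha is a hypothetical mixing weight; the source does not specify one.
    """
    scores = {}
    for other in clicks:
        if other == query:
            continue
        score = (alpha * click_similarity(clicks, query, other)
                 + (1 - alpha) * jaccard(query, other))
        if score > 0:
            scores[other] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda q: (-scores[q], q))[:k]


# Made-up click log: query -> set of clicked URLs.
clicks = {
    "cheap flights": {"a.com", "b.com"},
    "flight deals": {"a.com", "b.com", "c.com"},
    "cheap hotels": {"d.com"},
    "python tutorial": {"e.com"},
}
print(suggest("cheap flights", clicks))  # ['flight deals', 'cheap hotels']
```

In this toy log, "flight deals" shares no word tokens with "cheap flights" but is reached through shared clicks, while "cheap hotels" is reached only through text overlap, which is the complementary behavior the abstract attributes to combining the two graphs.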