18,371 research outputs found
Anticipating Information Needs Based on Check-in Activity
In this work we address the development of a smart personal assistant that is
capable of anticipating a user's information needs based on a novel type of
context: the person's activity inferred from her check-in records on a
location-based social network. Our main contribution is a method that
translates a check-in activity into an information need, which is in turn
addressed with an appropriate information card. This task is challenging
because of the large number of possible activities and related information
needs, which need to be addressed in a mobile dashboard that is limited in
size. Our approach considers each possible activity that might follow after the
last (and already finished) activity, and selects the top information cards
such that they maximize the likelihood of satisfying the user's information
needs for all possible future scenarios. The proposed models also incorporate
knowledge about the temporal dynamics of information needs. Using a combination
of historical check-in data and manual assessments collected via crowdsourcing,
we show experimentally the effectiveness of our approach.Comment: Proceedings of the 10th ACM International Conference on Web Search
and Data Mining (WSDM '17), 201
Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture
We present the architecture behind Twitter's real-time related query
suggestion and spelling correction service. Although these tasks have received
much attention in the web search literature, the Twitter context introduces a
real-time "twist": after significant breaking news events, we aim to provide
relevant results within minutes. This paper provides a case study illustrating
the challenges of real-time data processing in the era of "big data". We tell
the story of how our system was built twice: our first implementation was built
on a typical Hadoop-based analytics stack, but was later replaced because it
did not meet the latency requirements necessary to generate meaningful
real-time results. The second implementation, which is the system deployed in
production, is a custom in-memory processing engine specifically designed for
the task. This experience taught us that the current typical usage of Hadoop as
a "big data" platform, while great for experimentation, is not well suited to
low-latency processing, and points the way to future work on data analytics
platforms that can handle "big" as well as "fast" data
Context Sensitive Search String Composition Algorithm using User Intention to Handle Ambiguous Keywords
Finding the required URL among the first few result pages of a search engine is still a challenging task. This may require number of reformulations of the search string thus adversely affecting user's search time. Query ambiguity and polysemy are major reasons for not obtaining relevant results in the top few result pages. Efficient query composition and data organization are necessary for getting effective results. Context of the information need and the user intent may improve the autocomplete feature of existing search engines. This research proposes a Funnel Mesh-5 algorithm (FM5) to construct a search string taking into account context of information need and user intention with three main steps 1) Predict user intention with user profiles and the past searches via weighted mesh structure 2) Resolve ambiguity and polysemy of search strings with context and user intention 3) Generate a personalized disambiguated search string by query expansion encompassing user intention and predicted query. Experimental results for the proposed approach and a comparison with direct use of search engine are presented. A comparison of FM5 algorithm with K Nearest Neighbor algorithm for user intention identification is also presented. The proposed system provides better precision for search results for ambiguous search strings with improved identification of the user intention. Results are presented for English language dataset as well as Marathi (an Indian language) dataset of ambiguous search strings.
Declarative Ajax Web Applications through SQL++ on a Unified Application State
Implementing even a conceptually simple web application requires an
inordinate amount of time. FORWARD addresses three problems that reduce
developer productivity: (a) Impedance mismatch across the multiple languages
used at different tiers of the application architecture. (b) Distributed data
access across the multiple data sources of the application (SQL database, user
input of the browser page, session data in the application server, etc). (c)
Asynchronous, incremental modification of the pages, as performed by Ajax
actions.
FORWARD belongs to a novel family of web application frameworks that attack
impedance mismatch by offering a single unifying language. FORWARD's language
is SQL++, a minimally extended SQL. FORWARD's architecture is based on two
novel cornerstones: (a) A Unified Application State (UAS), which is a virtual
database over the multiple data sources. The UAS is accessed via distributed
SQL++ queries, therefore resolving the distributed data access problem. (b)
Declarative page specifications, which treat the data displayed by pages as
rendered SQL++ page queries. The resulting pages are automatically
incrementally modified by FORWARD. User input on the page becomes part of the
UAS.
We show that SQL++ captures the semi-structured nature of web pages and
subsumes the data models of two important data sources of the UAS: SQL
databases and JavaScript components. We show that simple markup is sufficient
for creating Ajax displays and for modeling user input on the page as UAS data
sources. Finally, we discuss the page specification syntax and semantics that
are needed in order to avoid race conditions and conflicts between the user
input and the automated Ajax page modifications.
FORWARD has been used in the development of eight commercial and academic
applications. An alpha-release web-based IDE (itself built in FORWARD) enables
development in the cloud.Comment: Proceedings of the 14th International Symposium on Database
Programming Languages (DBPL 2013), August 30, 2013, Riva del Garda, Trento,
Ital
- …