Time Pressure and System Delays in Information Search
We report preliminary results on the impact of time pressure and system delays on search behavior from a laboratory study with forty-three participants. To induce time pressure, we randomly assigned half of our study participants to a treatment condition where they were allowed only five minutes to search for each of four ad-hoc search topics. The other half of the participants were given no task time limits. For half of participants' search tasks (n=2), five-second delays were introduced after queries were submitted and SERP results were clicked. Results showed that participants in the time pressure condition queried at a significantly higher rate, viewed significantly fewer documents per query, had significantly shallower hover and view depths, and spent significantly less time examining documents and SERPs. We found few significant differences in search behavior for system delay or for interaction effects between time pressure and system delay. These initial results show that time pressure has a significant impact on search behavior and suggest the design of search interfaces and features that support people who are searching under time pressure.
On the Role of Engagement in Information Seeking Contexts: From Research to Implementation
This workshop will provide a forum for researchers, practitioners and developers interested in user engagement and emotion in the context of information systems design and use. Specifically, we seek to address questions such as “How do we ensure that the measurement of subjective user experiences is robust and scalable?”, “How do we design for engaging and emotionally compelling experiences?”, and “How do we prevent disengagement?” The ability to answer these questions relies upon: a solid
conceptual understanding of subjective experiences; robust, scalable approaches to measuring engagement; and the ability to utilize this knowledge in information systems design. This three-part workshop will include: talks by the organizers to ground the workshop's themes; position paper presentations and design exemplars from attendees; and an interactive session focused on design scenarios and prototyping. The intersection of emotion and engagement with measurement and design in information seeking contexts is a timely issue for the iSchool community.
i-DATAQUEST: A Proposal for a Manufacturing Data Query System Based on a Graph
During the manufacturing product life cycle, an increasing volume of data is generated and stored in distributed resources. These data are heterogeneous, explicitly and implicitly linked, and may be structured or unstructured. The rapid, exhaustive, and relevant acquisition of information from these data is a major issue for the manufacturing industry. The key challenges in this context are to transform heterogeneous data into a common searchable data model, to allow semantic search, to detect implicit links between data, and to rank results by relevance. To address this issue, the authors propose a query system based on a graph database. This graph is built from all the transformed manufacturing data and is then enriched with explicit and implicit data links. Finally, the enriched graph is queried through an extended query system defined by a knowledge graph. The authors describe a proof of concept to validate the proposal. After a partial implementation of this proof of concept, the authors obtain acceptable results, although the system response time still needs improvement. Finally, the authors discuss open questions concerning rights management, user profiles/customization, and data updates.
Chaire ENSAM-Capgemini sur le PLM du futu
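The pipeline the abstract describes (transform heterogeneous records into a common graph model, enrich the graph with explicit and implicit links, then query it by traversal) can be sketched minimally as follows. This is an illustrative toy, not the authors' system: the class name, the attribute-sharing heuristic for implicit links, and all record identifiers are hypothetical.

```python
# Minimal sketch of the described pipeline (all names hypothetical):
# heterogeneous records become nodes in one common model, implicit links
# are detected from shared attribute values, and queries traverse the graph.
from collections import defaultdict

class ManufacturingGraph:
    def __init__(self):
        self.nodes = {}                  # node id -> attribute dict
        self.edges = defaultdict(set)    # node id -> linked node ids

    def add_record(self, node_id, attrs):
        """Transform one source record into the common node model."""
        self.nodes[node_id] = attrs

    def add_link(self, a, b):
        """Explicit link taken directly from the source data."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def enrich_implicit_links(self, key):
        """Implicit link: connect nodes sharing the same attribute value."""
        by_value = defaultdict(list)
        for node_id, attrs in self.nodes.items():
            if key in attrs:
                by_value[attrs[key]].append(node_id)
        for group in by_value.values():
            for a in group:
                for b in group:
                    if a != b:
                        self.add_link(a, b)

    def neighbours(self, node_id):
        return sorted(self.edges[node_id])

g = ManufacturingGraph()
g.add_record("part:42", {"material": "steel"})
g.add_record("doc:7", {"material": "steel", "type": "datasheet"})
g.enrich_implicit_links("material")
print(g.neighbours("part:42"))  # the datasheet becomes reachable via the implicit link
```

A real system would of course use a graph database and a knowledge-graph-driven query layer; the point here is only the transform-enrich-traverse shape of the approach.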
A Comparison of Supervised Learning to Match Methods for Product Search
The vocabulary gap is a core challenge in information retrieval (IR). In
e-commerce applications like product search, the vocabulary gap is reported to
be a bigger challenge than in more traditional application areas in IR, such as
news search or web search. As recent learning to match methods have made
important advances in bridging the vocabulary gap for these traditional IR
areas, we investigate their potential in the context of product search. In this
paper we provide insights into using recent learning to match methods for
product search. We compare both effectiveness and efficiency of these methods
in a product search setting and analyze their performance on two product search
datasets, with 50,000 queries each. One is an open dataset made available as
part of a community benchmark activity at CIKM 2016. The other is a proprietary
query log obtained from a European e-commerce platform. This comparison is
conducted towards a better understanding of trade-offs in choosing a preferred
model for this task. We find that (1) models that have been specifically
designed for short text matching, like MV-LSTM and DRMMTKS, are consistently
among the top three methods in all experiments; however, taking efficiency and
accuracy into account at the same time, ARC-I is the preferred model for real
world use cases; and (2) the performance from a state-of-the-art BERT-based
model is mediocre, which we attribute to the fact that the text BERT is
pre-trained on is very different from the text we have in product search. We
also provide insights into factors that can influence model behavior for
different types of query, such as the length of the retrieved list and query
complexity, and discuss the implications of our findings for e-commerce
practitioners with respect to choosing a well-performing method.
Comment: 10 pages, 5 figures, Accepted at SIGIR Workshop on eCommerce 202
This is not the End: Rethinking Serverless Function Termination
Elastic scaling is one of the central benefits provided by serverless
platforms, and requires that they scale resource up and down in response to
changing workloads. Serverless platforms scale-down resources by terminating
previously launched instances (which are containers or processes). The
serverless programming model ensures that terminating instances is safe
assuming all application code running on the instance has either completed or
timed out. Safety thus depends on the serverless platform's correctly
determining that application processing is complete.
In this paper, we start with the observation that current serverless
platforms do not account for pending asynchronous I/O operations when
determining whether application processing is complete. These platforms are
thus unsafe when executing programs that use asynchronous I/O, and incorrectly
deciding that application processing has terminated can result in data
inconsistency when these platforms are used. We show that the reason for this
problem is that current serverless semantics couple termination and response
generation in serverless applications. We address this problem by proposing an
extension to current semantics that decouples response generation and
termination, and demonstrate the efficacy and benefits of our proposal by
extending OpenWhisk, an open source serverless platform.
Quality versus efficiency in document scoring with learning-to-rank models
Learning-to-Rank (LtR) techniques leverage machine learning algorithms and large amounts of training data to induce high-quality ranking functions. Given a set of documents and a user query, these functions are able to precisely predict a score for each of the documents, in turn exploited to effectively rank them. Although the scoring efficiency of LtR models is critical in several applications – e.g., it directly impacts the response time and throughput of Web query processing – it has received relatively little attention so far.
The goal of this work is to experimentally investigate the scoring efficiency of LtR models along with their ranking quality. Specifically, we show that machine-learned ranking models exhibit a quality versus efficiency trade-off. For example, each family of LtR algorithms has tuning parameters that can influence both effectiveness and efficiency, where higher ranking quality is generally obtained with more complex and expensive models. Moreover, LtR algorithms that learn complex models, such as those based on forests of regression trees, are generally more expensive and more effective than other algorithms that induce simpler models like linear combinations of features.
We extensively analyze the quality versus efficiency trade-off of a wide spectrum of state-of-the-art LtR methods, and we propose a sound methodology to devise the most effective ranker given a time budget. To guarantee reproducibility, we used publicly available datasets and we contribute an open source C++ framework providing optimized, multi-threaded implementations of the most effective tree-based learners: Gradient Boosted Regression Trees (GBRT), Lambda-Mart (λ-MART), and the first public-domain implementation of Oblivious Lambda-Mart, an algorithm that induces forests of oblivious regression trees.
We investigate how the different training parameters impact the quality versus efficiency trade-off, and provide a thorough comparison of several algorithms in the quality-cost space. The experiments conducted show that there is no overall best algorithm; rather, the optimal choice depends on the time budget.
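The core of the methodology sketched in this abstract, picking the most effective ranker that fits a scoring-time budget, reduces to a simple constrained selection. The model names, quality scores, and per-document costs below are hypothetical placeholders, not numbers from the paper.

```python
# Sketch of budget-constrained ranker selection (all numbers hypothetical):
# among models whose per-document scoring cost fits the budget, pick the
# one with the highest ranking quality.

def best_ranker(models, budget_us):
    """models: list of (name, quality, cost_us_per_doc) tuples."""
    affordable = [m for m in models if m[2] <= budget_us]
    if not affordable:
        return None
    return max(affordable, key=lambda m: m[1])

models = [
    ("linear",         0.71,   1.0),   # cheap, least effective
    ("gbrt_small",     0.75,  10.0),
    ("lambdamart_big", 0.78, 120.0),   # most effective, most expensive
]

print(best_ranker(models, budget_us=5))    # ('linear', 0.71, 1.0)
print(best_ranker(models, budget_us=50))   # ('gbrt_small', 0.75, 10.0)
print(best_ranker(models, budget_us=500))  # ('lambdamart_big', 0.78, 120.0)
```

This captures the paper's conclusion in miniature: as the budget grows, the optimal choice shifts from simple linear models toward large tree ensembles, so no single algorithm dominates across budgets.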