An Improved Crowdsourcing Based Evaluation Technique for Word Embedding Methods
In this proposal track paper, we have presented a crowdsourcing-based word embedding evaluation technique that will be more reliable and linguistically justified. The method is designed for intrinsic evaluation and extends the approach proposed in (Schnabel et al., 2015). Our improved evaluation technique captures word relatedness based on the word context
Target Apps Selection: Towards a Unified Search Framework for Mobile Devices
With the recent growth of conversational systems and intelligent assistants
such as Apple Siri and Google Assistant, mobile devices are becoming even more
pervasive in our lives. As a consequence, users are increasingly engaged with
mobile apps and frequently search within them to satisfy an information need.
However, users cannot search within their apps through their intelligent
assistants. This requires a unified mobile search framework that identifies the
target app(s) for the user's query, submits the query to the app(s), and
presents the results to the user. In this paper, we take the first step forward
towards developing unified mobile search. In more detail, we introduce and
study the task of target apps selection, which has various potential real-world
applications. To this end, we analyze attributes of search queries as well as
user behaviors when searching with different mobile apps. The analyses are
done based on thousands of queries that we collected through crowdsourcing. We
finally study the performance of state-of-the-art retrieval models for this
task and propose two simple yet effective neural models that significantly
outperform the baselines. Our neural approaches are based on learning
high-dimensional representations for mobile apps. Our analyses and experiments
suggest specific future directions in this research area.
Comment: To appear at SIGIR 201
On Identifying Disaster-Related Tweets: Matching-based or Learning-based?
Social media services such as Twitter are emerging as platforms contributing to
situational awareness during disasters. Information shared on Twitter by both
the affected population (e.g., requesting assistance, warning) and those outside
the impact zone (e.g., providing assistance) would help first responders,
decision makers, and the public to understand the situation first-hand.
Effective use of such information requires timely selection and analysis of
tweets that are relevant to a particular disaster. Even though abundant tweets
are promising as a data source, it is challenging to automatically identify
relevant messages since tweets are short and unstructured, resulting in
unsatisfactory classification performance of conventional learning-based
approaches. Thus, we propose a simple yet effective algorithm to identify
relevant messages based on matching keywords and hashtags, and provide a
comparison between matching-based and learning-based approaches. To evaluate
the two approaches, we put them into a framework specifically proposed for
analyzing disaster-related tweets. Analysis results on eleven datasets with
various disaster types show that our technique provides relevant tweets of
higher quality and more interpretable results of sentiment analysis tasks when
compared to learning approach
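
The matching-based idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the tokenizer, the `is_relevant` function, and the keyword and hashtag sets are all hypothetical, chosen only to show what "matching keywords and hashtags" might look like in its simplest form.

```python
import re

# Hypothetical term lists; the abstract does not say which terms are used.
KEYWORDS = {"flood", "earthquake", "evacuate"}
HASHTAGS = {"#flood", "#nepalearthquake"}


def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())


def is_relevant(tweet, keywords=KEYWORDS, hashtags=HASHTAGS):
    """Return True if the tweet contains any listed keyword or hashtag."""
    tokens = set(tokenize(tweet))
    return bool(tokens & keywords) or bool(tokens & hashtags)


tweets = [
    "Need help, roads are blocked near the river #flood",
    "Great coffee this morning",
    "Earthquake felt downtown, please evacuate",
]

# Keep only the tweets that match at least one term.
relevant = [t for t in tweets if is_relevant(t)]
```

Exact token matching keeps every decision traceable to a specific term, which is one plausible reading of the interpretability claim, but it misses inflected forms such as "flooded" unless they are listed explicitly.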
Accelerating Innovation Through Analogy Mining
The availability of large idea repositories (e.g., the U.S. patent database)
could significantly accelerate innovation and discovery by providing people
with inspiration from solutions to analogous problems. However, finding useful
analogies in these large, messy, real-world repositories remains a persistent
challenge for both human and automated methods. Previous approaches include
costly hand-created databases that have high relational structure (e.g.,
predicate calculus representations) but are very sparse. Simpler
machine-learning/information-retrieval similarity metrics can scale to large,
natural-language datasets, but struggle to account for structural similarity,
which is central to analogy. In this paper we explore the viability and value
of learning simpler structural representations, specifically, "problem
schemas", which specify the purpose of a product and the mechanisms by which it
achieves that purpose. Our approach combines crowdsourcing and recurrent neural
networks to extract purpose and mechanism vector representations from product
descriptions. We demonstrate that these learned vectors allow us to find
analogies with higher precision and recall than traditional
information-retrieval methods. In an ideation experiment, analogies retrieved
by our models significantly increased people's likelihood of generating
creative ideas compared to analogies retrieved by traditional methods. Our
results suggest a promising approach to enabling computational analogy at scale
is to learn and leverage weaker structural representations.
Comment: KDD 201
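
As a rough illustration of how separate purpose and mechanism vectors could be used for retrieval, the sketch below ranks candidate products against a query. The scoring rule (reward similar purpose, penalize similar mechanism), the `mechanism_penalty` weight, and the toy two-dimensional vectors are assumptions made for this example; the abstract does not specify how the learned vectors are compared.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_analogies(query, products, mechanism_penalty=0.5):
    """Rank products by purpose similarity minus a mechanism-similarity penalty.

    `query` is a (purpose_vector, mechanism_vector) pair and `products`
    maps a product name to such a pair. The scoring rule is hypothetical.
    """
    query_purpose, query_mechanism = query
    scored = []
    for name, (purpose, mechanism) in products.items():
        score = cosine(query_purpose, purpose) - mechanism_penalty * cosine(
            query_mechanism, mechanism
        )
        scored.append((score, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Toy vectors standing in for learned representations.
query = ((1.0, 0.0), (1.0, 0.0))
products = {
    "same_purpose_new_mechanism": ((1.0, 0.1), (0.0, 1.0)),
    "near_duplicate": ((1.0, 0.0), (1.0, 0.0)),
    "unrelated": ((0.0, 1.0), (0.0, 1.0)),
}

ranking = rank_analogies(query, products)
```

With these toy vectors, the product that shares the query's purpose but uses a different mechanism ranks first, ahead of the near duplicate and the unrelated product.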