1,449 research outputs found
Accelerating Innovation Through Analogy Mining
The availability of large idea repositories (e.g., the U.S. patent database)
could significantly accelerate innovation and discovery by providing people
with inspiration from solutions to analogous problems. However, finding useful
analogies in these large, messy, real-world repositories remains a persistent
challenge for both human and automated methods. Previous approaches include
costly hand-created databases that have high relational structure (e.g.,
predicate calculus representations) but are very sparse. Simpler
machine-learning/information-retrieval similarity metrics can scale to large,
natural-language datasets, but struggle to account for structural similarity,
which is central to analogy. In this paper we explore the viability and value
of learning simpler structural representations, specifically, "problem
schemas", which specify the purpose of a product and the mechanisms by which it
achieves that purpose. Our approach combines crowdsourcing and recurrent neural
networks to extract purpose and mechanism vector representations from product
descriptions. We demonstrate that these learned vectors allow us to find
analogies with higher precision and recall than traditional
information-retrieval methods. In an ideation experiment, analogies retrieved
by our models significantly increased people's likelihood of generating
creative ideas compared to analogies retrieved by traditional methods. Our
results suggest that a promising approach to enabling computational analogy at scale
is to learn and leverage weaker structural representations.
Comment: KDD 201
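The idea of retrieving analogies from separate purpose and mechanism vectors can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the one below (high purpose similarity, penalized mechanism similarity) is an assumption, as are the function names, the `mech_penalty` parameter, and the toy two-dimensional vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_analogies(query_purpose, query_mechanism, candidates, mech_penalty=0.5):
    """Rank candidates so that similar purpose and dissimilar mechanism score high.

    candidates: iterable of (name, purpose_vector, mechanism_vector).
    The scoring rule is a hypothetical illustration, not the paper's method.
    """
    scored = []
    for name, purpose, mechanism in candidates:
        score = cosine(query_purpose, purpose) - mech_penalty * cosine(query_mechanism, mechanism)
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored]

# Toy vectors invented for illustration only.
candidates = [
    ("same_product", (1.0, 0.0), (1.0, 0.0)),  # same purpose, same mechanism
    ("analogy", (1.0, 0.0), (0.0, 1.0)),       # same purpose, different mechanism
    ("unrelated", (0.0, 1.0), (0.0, 1.0)),     # different purpose
]
ranking = rank_analogies((1.0, 0.0), (1.0, 0.0), candidates)
```

In this toy setup the candidate that shares the purpose but uses a different mechanism ranks first, which is the behavior an analogy search would want.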
Mathematical practice, crowdsourcing, and social machines
The highest level of mathematics has traditionally been seen as a solitary
endeavour, to produce a proof for review and acceptance by research peers.
Mathematics is now at a remarkable inflexion point, with new technology
radically extending the power and limits of individuals. Crowdsourcing pulls
together diverse experts to solve problems; symbolic computation tackles huge
routine calculations; and computers check proofs too long and complicated for
humans to comprehend.
Mathematical practice is an emerging interdisciplinary field which draws on
philosophy and social science to understand how mathematics is produced. Online
mathematical activity provides a novel and rich source of data for empirical
investigation of mathematical practice - for example the community question
answering system mathoverflow contains around 40,000 mathematical
conversations, and polymath collaborations provide transcripts of the
process of discovering proofs. Our preliminary investigations have demonstrated
the importance of "soft" aspects such as analogy and creativity, alongside
deduction and proof, in the production of mathematics, and have given us new
ways to think about the roles of people and machines in creating new
mathematical knowledge. We discuss further investigation of these resources and
what it might reveal.
Crowdsourced mathematical activity is an example of a "social machine", a new
paradigm, identified by Berners-Lee, for viewing a combination of people and
computers as a single problem-solving entity, and the subject of major
international research endeavours. We outline a future research agenda for
mathematics social machines, a combination of people, computers, and
mathematical archives to create and apply mathematics, with the potential to
change the way people do mathematics, and to transform the reach, pace, and
impact of mathematics research.
Comment: To appear, Springer LNCS, Proceedings of Conferences on Intelligent
Computer Mathematics, CICM 2013, July 2013 Bath, U
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants
Reasoning is a crucial part of natural language argumentation. To comprehend
an argument, one must analyze its warrant, which explains why its claim follows
from its premises. As arguments are highly contextualized, warrants are usually
presupposed and left implicit. Thus, comprehension requires not only
language understanding and logic skills but also common sense. In
this paper we develop a methodology for reconstructing warrants systematically.
We operationalize it in a scalable crowdsourcing process, resulting in a freely
licensed dataset with warrants for 2k authentic arguments from news comments.
On this basis, we present a new challenging task, the argument reasoning
comprehension task. Given an argument with a claim and a premise, the goal is
to choose the correct implicit warrant from two options. Both warrants are
plausible and lexically close, but lead to contradicting claims. A solution to
this task will define a substantial step towards automatic warrant
reconstruction. However, experiments with several neural attention and language
models reveal that current approaches do not suffice.
Comment: Accepted as NAACL 2018 Long Paper; see details on the front pag
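The task format described above (an argument with a claim and a premise, plus two candidate warrants of which one is correct) can be sketched as a small data structure with an accuracy measure. The field names, the `always_first` baseline, and the placeholder strings are assumptions for illustration; they are not taken from the released dataset.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class Instance:
    """One task item: choose which of two warrants links the premise to the claim."""
    claim: str
    premise: str
    warrant0: str
    warrant1: str
    correct: int  # index of the correct warrant: 0 or 1

def accuracy(instances: Sequence[Instance], choose: Callable[[Instance], int]) -> float:
    """Fraction of instances on which `choose` picks the correct warrant."""
    if not instances:
        raise ValueError("need at least one instance")
    hits = sum(1 for inst in instances if choose(inst) == inst.correct)
    return hits / len(instances)

def always_first(inst: Instance) -> int:
    """Trivial baseline (hypothetical): always pick the first warrant."""
    return 0

# Placeholder items, invented for illustration only.
items = [
    Instance("claim A", "premise A", "warrant A0", "warrant A1", correct=0),
    Instance("claim B", "premise B", "warrant B0", "warrant B1", correct=1),
]
baseline_accuracy = accuracy(items, always_first)
```

With two options per item and balanced labels, a baseline like this sits at chance level, which is the reference point any model on the task has to beat.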
SensEmbed: Learning sense embeddings for word and relational similarity
Word embeddings have recently gained considerable popularity for modeling words in different Natural Language Processing (NLP) tasks including semantic similarity measurement. However, notwithstanding their success, word embeddings are by their very nature unable to capture polysemy, as different meanings of a word are conflated into a single representation. In addition, their learning process usually relies on massive corpora only, preventing them from taking advantage of structured knowledge. We address both issues by proposing a multifaceted approach that transforms word embeddings to the sense level and leverages knowledge from a large semantic network for effective semantic similarity measurement. We evaluate our approach on word similarity and relational similarity frameworks, reporting state-of-the-art performance on multiple datasets.
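To see why sense-level vectors help with polysemy, the sketch below compares two words by their closest pair of senses instead of by a single conflated vector. The abstract does not state how sense vectors are combined, so the max-over-sense-pairs rule is an assumption, and the two-dimensional vectors are invented for illustration.

```python
import math
from itertools import product

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def closest_sense_similarity(senses_a, senses_b):
    """Word similarity as the best cosine over all pairs of sense vectors."""
    if not senses_a or not senses_b:
        raise ValueError("each word needs at least one sense vector")
    return max(cosine(u, v) for u, v in product(senses_a, senses_b))

# Toy sense inventory: "bank" has a financial sense and a riverside sense.
bank_senses = [(1.0, 0.0), (0.0, 1.0)]
river_senses = [(0.0, 1.0)]

# A single word-level vector conflates both senses of "bank".
bank_word_vector = (0.5, 0.5)

sense_level = closest_sense_similarity(bank_senses, river_senses)
word_level = cosine(bank_word_vector, river_senses[0])
```

Here the sense-level score picks out the riverside sense of "bank", while the single word-level vector yields a lower score because the financial sense dilutes it.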
Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification
Many real world problems can now be effectively solved using supervised
machine learning. A major roadblock is often the lack of an adequate quantity
of labeled data for training. A possible solution is to assign the task of
labeling data to a crowd, and then infer the true label using aggregation
methods. A well-known approach for aggregation is the Dawid-Skene (DS)
algorithm, which is based on the principle of Expectation-Maximization (EM). We
propose a new simple, yet effective, EM-based algorithm, which can be
interpreted as a "hard" version of DS, that allows much faster convergence
while maintaining similar accuracy in aggregation. We show the use of this
algorithm as a quick and effective technique for online, real-time sentiment
annotation. We also prove that our algorithm converges to the estimated labels
at a linear rate. Our experiments on standard datasets show a significant
speedup in time taken for aggregation - up to 8x over Dawid-Skene and
6x over other fast EM methods, at competitive accuracy performance. The
code for the implementation of the algorithms can be found at
https://github.com/GoodDeeds/Fast-Dawid-Skene
Comment: 8 pages, 5 tables, 1 figure, KDD Workshop on Issues of Sentiment
Discovery and Opinion Mining (WISDOM) 201
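A "hard" EM variant of Dawid-Skene can be sketched as follows: each item is assigned its single most likely class in the E-step rather than a probability distribution over classes. This is a minimal illustration under assumptions (majority-vote initialization, additive smoothing, the function and parameter names), not the authors' implementation, which is available at the repository linked above.

```python
import math
from collections import Counter

def hard_dawid_skene(votes, n_classes, max_iter=50, smoothing=0.01):
    """Aggregate crowd labels with a hard-assignment EM loop.

    votes: dict mapping item -> list of (worker, label) pairs, labels in range(n_classes).
    Returns a dict mapping item -> estimated label.
    """
    # Initialization (assumed): majority vote per item.
    labels = {item: Counter(l for _, l in wl).most_common(1)[0][0]
              for item, wl in votes.items()}
    workers = {w for wl in votes.values() for w, _ in wl}

    for _ in range(max_iter):
        # M-step: class priors and per-worker confusion counts from the hard labels.
        prior = [smoothing] * n_classes
        conf = {w: [[smoothing] * n_classes for _ in range(n_classes)] for w in workers}
        for item, wl in votes.items():
            true_label = labels[item]
            prior[true_label] += 1
            for worker, label in wl:
                conf[worker][true_label][label] += 1
        total = sum(prior)
        log_prior = [math.log(p / total) for p in prior]
        log_conf = {w: [[math.log(c / sum(row)) for c in row] for row in matrix]
                    for w, matrix in conf.items()}

        # Hard E-step: pick the single most likely class for each item.
        new_labels = {}
        for item, wl in votes.items():
            scores = [log_prior[t] + sum(log_conf[w][t][l] for w, l in wl)
                      for t in range(n_classes)]
            new_labels[item] = max(range(n_classes), key=scores.__getitem__)

        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
    return labels

# Toy votes invented for illustration: workers "a" and "b" agree, "c" is noisy.
votes = {
    "i0": [("a", 0), ("b", 0), ("c", 0)],
    "i1": [("a", 1), ("b", 1), ("c", 1)],
    "i2": [("a", 0), ("b", 0), ("c", 1)],
    "i3": [("a", 1), ("b", 1), ("c", 0)],
}
estimated = hard_dawid_skene(votes, n_classes=2)
```

Because the labels are discrete, the loop can stop as soon as no assignment changes between iterations, which is one intuition for why a hard variant may converge in fewer steps than soft EM.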