1,449 research outputs found

    Accelerating Innovation Through Analogy Mining

    Full text link
    The availability of large idea repositories (e.g., the U.S. patent database) could significantly accelerate innovation and discovery by providing people with inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for either human or automated methods. Previous approaches include costly hand-created databases that have high relational structure (e.g., predicate calculus representations) but are very sparse. Simpler machine-learning/information-retrieval similarity metrics can scale to large, natural-language datasets, but struggle to account for structural similarity, which is central to analogy. In this paper we explore the viability and value of learning simpler structural representations, specifically, "problem schemas", which specify the purpose of a product and the mechanisms by which it achieves that purpose. Our approach combines crowdsourcing and recurrent neural networks to extract purpose and mechanism vector representations from product descriptions. We demonstrate that these learned vectors allow us to find analogies with higher precision and recall than traditional information-retrieval methods. In an ideation experiment, analogies retrieved by our models significantly increased people's likelihood of generating creative ideas compared to analogies retrieved by traditional methods. Our results suggest a promising approach to enabling computational analogy at scale is to learn and leverage weaker structural representations.Comment: KDD 201

    Mathematical practice, crowdsourcing, and social machines

    Full text link
    The highest level of mathematics has traditionally been seen as a solitary endeavour, to produce a proof for review and acceptance by research peers. Mathematics is now at a remarkable inflexion point, with new technology radically extending the power and limits of individuals. Crowdsourcing pulls together diverse experts to solve problems; symbolic computation tackles huge routine calculations; and computers check proofs too long and complicated for humans to comprehend. Mathematical practice is an emerging interdisciplinary field which draws on philosophy and social science to understand how mathematics is produced. Online mathematical activity provides a novel and rich source of data for empirical investigation of mathematical practice - for example the community question answering system {\it mathoverflow} contains around 40,000 mathematical conversations, and {\it polymath} collaborations provide transcripts of the process of discovering proofs. Our preliminary investigations have demonstrated the importance of "soft" aspects such as analogy and creativity, alongside deduction and proof, in the production of mathematics, and have given us new ways to think about the roles of people and machines in creating new mathematical knowledge. We discuss further investigation of these resources and what it might reveal. Crowdsourced mathematical activity is an example of a "social machine", a new paradigm, identified by Berners-Lee, for viewing a combination of people and computers as a single problem-solving entity, and the subject of major international research endeavours. We outline a future research agenda for mathematics social machines, a combination of people, computers, and mathematical archives to create and apply mathematics, with the potential to change the way people do mathematics, and to transform the reach, pace, and impact of mathematics research.Comment: To appear, Springer LNCS, Proceedings of Conferences on Intelligent Computer Mathematics, CICM 2013, July 2013 Bath, U

    The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants

    Full text link
    Reasoning is a crucial part of natural language argumentation. To comprehend an argument, one must analyze its warrant, which explains why its claim follows from its premises. As arguments are highly contextualized, warrants are usually presupposed and left implicit. Thus, the comprehension does not only require language understanding and logic skills, but also depends on common sense. In this paper we develop a methodology for reconstructing warrants systematically. We operationalize it in a scalable crowdsourcing process, resulting in a freely licensed dataset with warrants for 2k authentic arguments from news comments. On this basis, we present a new challenging task, the argument reasoning comprehension task. Given an argument with a claim and a premise, the goal is to choose the correct implicit warrant from two options. Both warrants are plausible and lexically close, but lead to contradicting claims. A solution to this task will define a substantial step towards automatic warrant reconstruction. However, experiments with several neural attention and language models reveal that current approaches do not suffice.Comment: Accepted as NAACL 2018 Long Paper; see details on the front pag

    SensEmbed: Learning sense embeddings for word and relational similarity

    Get PDF
    Word embeddings have recently gained considerable popularity for modeling words in different Natural Language Processing (NLP) tasks including semantic similarity measurement. However, notwithstanding their success, word embeddings are by their very nature unable to capture polysemy, as different meanings of a word are conflated into a single representation. In addition, their learning process usually relies on massive corpora only, preventing them from taking advantage of structured knowledge. We address both issues by proposing a multifaceted approach that transforms word embeddings to the sense level and leverages knowledge from a large semantic network for effective semantic similarity measurement. We evaluate our approach on word similarity and relational similarity frameworks, reporting state-of-the-art performance on multiple datasets

    Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification

    Full text link
    Many real world problems can now be effectively solved using supervised machine learning. A major roadblock is often the lack of an adequate quantity of labeled data for training. A possible solution is to assign the task of labeling data to a crowd, and then infer the true label using aggregation methods. A well-known approach for aggregation is the Dawid-Skene (DS) algorithm, which is based on the principle of Expectation-Maximization (EM). We propose a new simple, yet effective, EM-based algorithm, which can be interpreted as a `hard' version of DS, that allows much faster convergence while maintaining similar accuracy in aggregation. We show the use of this algorithm as a quick and effective technique for online, real-time sentiment annotation. We also prove that our algorithm converges to the estimated labels at a linear rate. Our experiments on standard datasets show a significant speedup in time taken for aggregation - upto ∼\sim8x over Dawid-Skene and ∼\sim6x over other fast EM methods, at competitive accuracy performance. The code for the implementation of the algorithms can be found at https://github.com/GoodDeeds/Fast-Dawid-SkeneComment: 8 pages, 5 tables, 1 figure, KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM) 201
    • …
    corecore