Corpus Wide Argument Mining -- a Working Solution
One of the main tasks in argument mining is the retrieval of argumentative
content pertaining to a given topic. Most previous work addressed this task by
retrieving a relatively small number of relevant documents as the initial
source for such content. This line of research yielded moderate success, which
is of limited use in a real-world system. Furthermore, for such a system to
yield a comprehensive set of relevant arguments, over a wide range of topics,
it requires leveraging a large and diverse corpus in an appropriate manner.
Here we present a first end-to-end high-precision, corpus-wide argument mining
system. This is made possible by combining sentence-level queries over an
appropriate indexing of a very large corpus of newspaper articles, with an
iterative annotation scheme. This scheme addresses the inherent label bias in
the data and pinpoints the regions of the sample space whose manual labeling is
required to obtain high precision among top-ranked candidates.
Text Mining Infrastructure in R
During the last decade, text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package, which provides a framework for text mining applications within R. We give a survey of text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels.
Introduction
There has been little overt discussion of the experimental philosophy of logic or mathematics. So it may be tempting to assume that application of the methods of experimental philosophy to these areas is impractical or unavailing. This assumption is undercut by three trends in recent research: a renewed interest in historical antecedents of experimental philosophy in philosophical logic; a “practice turn” in the philosophies of mathematics and logic; and philosophical interest in a substantial body of work in adjacent disciplines, such as the psychology of reasoning and mathematics education. This introduction offers a snapshot of each trend and addresses how they intersect with some of the standard criticisms of experimental philosophy. It also briefly summarizes the specific contribution of the other chapters of this book.
Analyzing collaborative learning processes automatically
In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project that has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important part of making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.
Accelerating Innovation Through Analogy Mining
The availability of large idea repositories (e.g., the U.S. patent database)
could significantly accelerate innovation and discovery by providing people
with inspiration from solutions to analogous problems. However, finding useful
analogies in these large, messy, real-world repositories remains a persistent
challenge for either human or automated methods. Previous approaches include
costly hand-created databases that have high relational structure (e.g.,
predicate calculus representations) but are very sparse. Simpler
machine-learning/information-retrieval similarity metrics can scale to large,
natural-language datasets, but struggle to account for structural similarity,
which is central to analogy. In this paper we explore the viability and value
of learning simpler structural representations, specifically, "problem
schemas", which specify the purpose of a product and the mechanisms by which it
achieves that purpose. Our approach combines crowdsourcing and recurrent neural
networks to extract purpose and mechanism vector representations from product
descriptions. We demonstrate that these learned vectors allow us to find
analogies with higher precision and recall than traditional
information-retrieval methods. In an ideation experiment, analogies retrieved
by our models significantly increased people's likelihood of generating
creative ideas compared to analogies retrieved by traditional methods. Our
results suggest that a promising approach to enabling computational analogy at
scale is to learn and leverage weaker structural representations.
Comment: KDD 201