27,730 research outputs found
Causal schema induction for knowledge discovery
Making sense of familiar yet new situations typically involves making
generalizations about causal schemas, stories that help humans reason about
event sequences. Reasoning about events includes identifying cause and effect
relations shared across event instances, a process we refer to as causal schema
induction. Statistical schema induction systems may leverage structural
knowledge encoded in discourse or the causal graphs associated with event
meaning; however, resources to study such causal structure are few in number and
limited in size. In this work, we investigate how to apply schema induction
models to the task of knowledge discovery for enhanced search of
English-language news texts. To tackle the problem of data scarcity, we present
Torquestra, a manually curated dataset of text-graph-schema units integrating
temporal, event, and causal structures. We benchmark our dataset on three
knowledge discovery tasks, building and evaluating models for each. Results
show that systems that harness causal structure are effective at identifying
texts sharing similar causal meaning components rather than relying on lexical
cues alone. We make our dataset and models available for research purposes.
Comment: 8 pages, appendi
Decision table for classifying point sources based on FIRST and 2MASS databases
With the availability of multiwavelength, multiscale and multiepoch
astronomical catalogues, the number of features to describe astronomical
objects has increased. The better the features we select to classify objects,
the higher the classification accuracy. In this paper, we use data sets of
stars and quasars from the near-infrared band and the radio band. A best-first
search method was then applied to select features. For the data with the
selected features, a decision table algorithm was implemented. The
classification accuracy is more than 95.9%. As a result, the feature selection
method improves the effectiveness and efficiency of the classification method.
Moreover, the results show that the decision table is robust and effective for
discriminating celestial objects and can be used for preselecting quasar
candidates for large survey projects.
Comment: 10 pages. accepted by Advances in Space Researc
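To make the combination of feature selection and a decision table concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the features are assumed to be already discretized, the toy rows and the feature names (colour, radio, noise) are invented, and a plain greedy forward search stands in for the best-first search named in the abstract.

```python
from collections import Counter, defaultdict


def fit_decision_table(rows, labels, features):
    """Map each combination of selected feature values to its majority class."""
    cells = defaultdict(Counter)
    for row, label in zip(rows, labels):
        cells[tuple(row[f] for f in features)][label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen cells
    table = {key: counts.most_common(1)[0][0] for key, counts in cells.items()}
    return table, default


def predict(table, default, features, row):
    """Look up the row's cell; fall back to the overall majority class."""
    return table.get(tuple(row[f] for f in features), default)


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a decision table built on `features`."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        table, default = fit_decision_table(train_rows, train_labels, features)
        hits += predict(table, default, features, rows[i]) == labels[i]
    return hits / len(rows)


def greedy_select(rows, labels, all_features):
    """Greedy forward selection (a simplified stand-in for best-first search)."""
    selected, best = [], loo_accuracy(rows, labels, [])
    improved = True
    while improved:
        improved = False
        for f in all_features:
            if f in selected:
                continue
            score = loo_accuracy(rows, labels, selected + [f])
            if score > best:
                best, choice, improved = score, f, True
        if improved:
            selected.append(choice)
    return selected, best


# Invented toy data: the class depends only on the "radio" feature.
rows = [
    {"colour": "red", "radio": "loud", "noise": 0},
    {"colour": "red", "radio": "loud", "noise": 1},
    {"colour": "blue", "radio": "loud", "noise": 0},
    {"colour": "blue", "radio": "quiet", "noise": 1},
    {"colour": "red", "radio": "quiet", "noise": 0},
    {"colour": "blue", "radio": "quiet", "noise": 0},
]
labels = ["quasar", "quasar", "quasar", "star", "star", "star"]

selected, accuracy = greedy_select(rows, labels, ["colour", "radio", "noise"])
print(selected, accuracy)
```

On this toy data the search keeps only the informative feature, because adding either of the others splits the table into cells that are too sparse to generalize.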
Attribute oriented induction with star schema
This paper proposes a novel star schema attribute induction as a new attribute
induction paradigm and as an improvement on current attribute oriented
induction. The novel star schema attribute induction is compared with current
attribute oriented induction, based on characteristic rules and using a
non-rule-based concept hierarchy, by implementing both approaches. In the novel
star schema attribute induction, several improvements have been implemented:
elimination of the threshold number as the maximum-tuples control for the
generalization result, no ANY as the most general concept, replacement of the
concept hierarchy with a concept tree, simplification of the generalization
strategy steps, and elimination of the attribute oriented induction algorithm.
The novel star schema attribute induction is more powerful than current
attribute oriented induction since it can produce a small number of final
generalization tuples and there is no ANY in the results.
Comment: 23 Pages, IJDM
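The generalization step that attribute induction relies on can be sketched in a few lines of Python. This is a generic illustration, not the paper's star schema algorithm: the concept tree, the tuples, and the function names are all hypothetical, and the tree deliberately has no ANY root, so generalization stops at the top named concept.

```python
from collections import Counter

# Hypothetical concept tree, stored as child -> parent. No "ANY" root exists.
CONCEPT_TREE = {
    "Jakarta": "Indonesia",
    "Bandung": "Indonesia",
    "Osaka": "Japan",
    "Tokyo": "Japan",
}


def generalize(value, tree):
    """Climb one level in the concept tree; top-level values stay unchanged."""
    return tree.get(value, value)


def induce(tuples, attribute_index, tree):
    """Generalize one attribute, then merge identical tuples with a count."""
    merged = Counter()
    for row in tuples:
        row = list(row)
        row[attribute_index] = generalize(row[attribute_index], tree)
        merged[tuple(row)] += 1
    return merged


# Invented input relation: (city, role).
tuples = [
    ("Jakarta", "student"),
    ("Bandung", "student"),
    ("Tokyo", "student"),
    ("Osaka", "staff"),
]

result = induce(tuples, 0, CONCEPT_TREE)
for row, count in result.items():
    print(row, count)
```

Four input tuples collapse into three generalized tuples, and each count records how many original tuples a generalized tuple covers.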
A comparative survey of integrated learning systems
This paper presents the duction framework for unifying the three basic forms of inference - deduction, abduction, and induction - by specifying the possible relationships and influences among them in the context of integrated learning. Special assumptive forms of inference are defined that extend the use of these inference methods, and the properties of these forms are explored. A comparison to a related inference-based learning framework is made. Finally, several existing integrated learning programs are examined from the perspective of the duction framework.
Probabilities and health risks: a qualitative approach
Health risks, defined in terms of the probability that an individual will suffer a particular type of adverse health event within a given time period, can be understood as referencing either natural entities or complex patterns of belief which incorporate the observer's values and knowledge, the position adopted in the present paper. The subjectivity inherent in judgements about adversity and time frames can be easily recognised, but social scientists have tended to accept uncritically the objectivity of probability. Most commonly in health risk analysis, the term probability refers to rates established by induction, and so requires the definition of a numerator and denominator. Depending upon their specification, many probabilities may be reasonably postulated for the same event, and individuals may change their risks by deciding to seek or avoid information. These apparent absurdities can be understood if probability is conceptualised as the projection of expectation onto the external world. Probabilities based on induction from observed frequencies provide glimpses of the future at the price of acceptance of the simplifying heuristic that statistics derived from aggregate groups can be validly attributed to individuals within them. The paper illustrates four implications of this conceptualisation of probability with qualitative data from a variety of sources, particularly a study of genetic counselling for pregnant women in a U.K. hospital. Firstly, the official selection of a specific probability heuristic reflects organisational constraints and values as well as predictive optimisation. Secondly, professionals and service users must work to maintain the facticity of an established heuristic in the face of alternatives. Thirdly, individuals, both lay and professional, manage probabilistic information in ways which support their strategic objectives. 
Fourthly, predictively sub-optimum schema, for example the idea of AIDS as a gay plague, may be selected because they match prevailing social value systems
Memory-Based Learning: Using Similarity for Smoothing
This paper analyses the relation between the use of similarity in
Memory-Based Learning and the notion of backed-off smoothing in statistical
language modeling. We show that the two approaches are closely related, and we
argue that feature weighting methods in the Memory-Based paradigm can offer the
advantage of automatically specifying a suitable domain-specific hierarchy
between most specific and most general conditioning information without the
need for a large number of parameters. We report two applications of this
approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art
performance in both domains, and allows the easy integration of diverse
information sources, such as rich lexical representations.
Comment: 8 pages, uses aclap.sty, To appear in Proc. ACL/EACL 9
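The link between feature-weighted similarity and backed-off smoothing can be illustrated with a small Python sketch of a nearest-neighbour classifier for PP-attachment. It is a sketch under assumptions, not the authors' system: the stored examples, the feature order (verb, noun1, preposition, noun2), and the weight values are invented for illustration, and similarity is a plain weighted overlap.

```python
def similarity(a, b, weights):
    """Weighted overlap: sum the weights of the features on which a and b agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)


def classify(memory, weights, query):
    """Return the label of the stored example most similar to the query."""
    _, label = max(memory, key=lambda item: similarity(item[0], query, weights))
    return label


# Invented memory of (verb, noun1, preposition, noun2) -> attachment site.
memory = [
    (("eat", "pizza", "with", "fork"), "V"),
    (("eat", "pizza", "with", "cheese"), "N"),
    (("see", "man", "in", "park"), "N"),
]

# Hypothetical feature weights; the preposition is assumed to matter most.
weights = [1.0, 1.0, 3.0, 1.0]

print(classify(memory, weights, ("eat", "salad", "with", "fork")))
print(classify(memory, weights, ("buy", "pizza", "in", "park")))
```

Neither query has an exact match in memory, so the decision falls to whichever stored example agrees on the most heavily weighted features. The weights thereby induce an ordering from specific to general matches without a separate parameter for every back-off level.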
PASS: a simple classifier system for data analysis
Let x be a vector of predictors and y a scalar response associated with it. Consider the regression problem of inferring the relationship between predictors and response on the basis of a sample of observed pairs (x,y). This is a familiar problem for which a variety of methods are available. This paper describes a new method based on the classifier system approach to problem solving. Classifier systems provide a rich framework for learning and induction, and they have been successfully applied in the artificial intelligence literature for some time. The present method enriches the simplest classifier system architecture with some new heuristics and explores its potential in a purely inferential context. A prototype called PASS (Predictive Adaptative Sequential System) has been built to test these ideas empirically. Preliminary Monte Carlo experiments indicate that PASS is able to discover the structure imposed on the data in a wide array of cases.