Causal schema induction for knowledge discovery

Author: Hwang Jena D.; Pustejovsky James; Regan Michael; Sakaguchi Keisuke
Publication date: 27/03/2023

Making sense of familiar yet new situations typically involves making generalizations about causal schemas, stories that help humans reason about event sequences. Reasoning about events includes identifying cause and effect relations shared across event instances, a process we refer to as causal schema induction. Statistical schema induction systems may leverage structural knowledge encoded in discourse or the causal graphs associated with event meaning; however, resources to study such causal structure are few in number and limited in size. In this work, we investigate how to apply schema induction models to the task of knowledge discovery for enhanced search of English-language news texts. To tackle the problem of data scarcity, we present Torquestra, a manually curated dataset of text-graph-schema units integrating temporal, event, and causal structures. We benchmark our dataset on three knowledge discovery tasks, building and evaluating models for each. Results show that systems that harness causal structure are effective at identifying texts sharing similar causal meaning components rather than relying on lexical cues alone. We make our dataset and models available for research purposes. Comment: 8 pages, appendi

arXiv.org e-Print Archive

Decision table for classifying point sources based on FIRST and 2MASS databases

Author: Ball; Breiman; Dan Gao; Ginsberg; Hog; Jarrett; Stone; Véron-Cetty; Weiss; Witten; Yanxia Zhang; Yongheng Zhao; Zhang; Zhang
Publication venue: 'Elsevier BV'
Publication date: 31/08/2007

With the availability of multiwavelength, multiscale and multiepoch astronomical catalogues, the number of features available to describe astronomical objects has increased. The better the features we select to classify objects, the higher the classification accuracy. In this paper, we use data sets of stars and quasars from the near-infrared and radio bands. A best-first search method was applied to select features, and the decision table algorithm was then applied to the data with the selected features. The classification accuracy is more than 95.9%. The feature selection method thus improves the effectiveness and efficiency of the classification method. Moreover, the result shows that the decision table is robust and effective for discriminating celestial objects and can be used for preselecting quasar candidates for large survey projects. Comment: 10 pages. accepted by Advances in Space Researc

arXiv.org e-Print Archive

Crossref

Attribute oriented induction with star schema

Author: H Spits Warnars H. L.
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 29/05/2010

This paper proposes a novel star schema attribute induction as a new attribute induction paradigm and as an improvement on current attribute oriented induction. The novel star schema attribute induction is compared with current attribute oriented induction, based on characteristic rules and a non-rule-based concept hierarchy, by implementing both approaches. The novel star schema attribute induction implements several improvements: eliminating the threshold number used as the maximum-tuples control for the generalization result, removing ANY as the most general concept, replacing the concept hierarchy with a concept tree, simplifying the generalization strategy steps, and eliminating the attribute oriented induction algorithm. Novel star schema attribute induction is more powerful than current attribute oriented induction, since it can produce a small number of final generalization tuples and there is no ANY in the results. Comment: 23 Pages, IJDM

arXiv.org e-Print Archive

CiteSeerX

Crossref

A comparative survey of integrated learning systems

Author: Cain Timothy
Publication venue: eScholarship, University of California
Publication date: 31/05/1990

This paper presents the duction framework for unifying the three basic forms of inference - deduction, abduction, and induction - by specifying the possible relationships and influences among them in the context of integrated learning. Special assumptive forms of inference are defined that extend the use of these inference methods, and the properties of these forms are explored. A comparison to a related inference-based learning framework is made. Finally, several existing integrated learning programs are examined from the perspective of the duction framework.

eScholarship - University of California

Probabilities and health risks: a qualitative approach

Author: Heyman Bob
Publication venue: 'Elsevier BV'
Publication date: 15/10/2002

Health risks, defined in terms of the probability that an individual will suffer a particular type of adverse health event within a given time period, can be understood as referencing either natural entities or complex patterns of belief which incorporate the observer's values and knowledge, the position adopted in the present paper. The subjectivity inherent in judgements about adversity and time frames can be easily recognised, but social scientists have tended to accept uncritically the objectivity of probability. Most commonly in health risk analysis, the term probability refers to rates established by induction, and so requires the definition of a numerator and denominator. Depending upon their specification, many probabilities may be reasonably postulated for the same event, and individuals may change their risks by deciding to seek or avoid information. These apparent absurdities can be understood if probability is conceptualised as the projection of expectation onto the external world. Probabilities based on induction from observed frequencies provide glimpses of the future at the price of acceptance of the simplifying heuristic that statistics derived from aggregate groups can be validly attributed to individuals within them. The paper illustrates four implications of this conceptualisation of probability with qualitative data from a variety of sources, particularly a study of genetic counselling for pregnant women in a U.K. hospital. Firstly, the official selection of a specific probability heuristic reflects organisational constraints and values as well as predictive optimisation. Secondly, professionals and service users must work to maintain the facticity of an established heuristic in the face of alternatives. Thirdly, individuals, both lay and professional, manage probabilistic information in ways which support their strategic objectives. Fourthly, predictively sub-optimum schemas, for example the idea of AIDS as a gay plague, may be selected because they match prevailing social value systems.

University of Huddersfield Repository

Memory-Based Learning: Using Similarity for Smoothing

Author: Daelemans Walter; Zavrel Jakub
Publication date: 01/01/1997

This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations. Comment: 8 pages, uses aclap.sty, To appear in Proc. ACL/EACL 9

arXiv.org e-Print Archive

CiteSeerX

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

PASS: a simple classifier system for data analysis

Author: Muruzábal Jorge
Publication date: 01/09/1993

Let x be a vector of predictors and y a scalar response associated with it. Consider the regression problem of inferring the relationship between predictors and response on the basis of a sample of observed pairs (x,y). This is a familiar problem for which a variety of methods are available. This paper describes a new method based on the classifier system approach to problem solving. Classifier systems provide a rich framework for learning and induction, and they have been successfully applied in the artificial intelligence literature for some time. The present method enriches the simplest classifier system architecture with some new heuristics and explores its potential in a purely inferential context. A prototype called PASS (Predictive Adaptative Sequential System) has been built to test these ideas empirically. Preliminary Monte Carlo experiments indicate that PASS is able to discover the structure imposed on the data in a wide array of cases.

Universidad Carlos III de Madrid e-Archivo