    On Cognitive Preferences and the Plausibility of Rule-based Models

    It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likelihood that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, or the recognition heuristic, and investigate their relation to rule length and plausibility. Comment: V4: Another rewrite of section on interpretability to clarify focus on plausibility and relation to interpretability, comprehensibility, and justifiability.

    Marketing Portfolio Choices by Independent Peach Growers: An Application of the Polychotomous Selection Model

    In selecting a marketing channel for fresh peach sales, Georgia commercial peach growers choose the channel after accounting for buyers' preferences for quality attributes. Using the polychotomous selection model and survey data, we identified external and internal quality attributes as essential factors influencing the choice of a marketing channel and the share of the crop marketed. Other factors influencing the choice and the volume sold through each marketing channel included orchard characteristics and the variety-determined fruit maturity.

    Can models be useful for deciding to convert to organic fruit growing? An introduction to the discussion

    Modern high-input agriculture has produced great increases in crop yields, but the social and environmental costs have also been high. Over the past decades, sustainability has increasingly become a guiding principle in agriculture. In this context, organic farming became recognised by farmers, policymakers and consumers as one of the possibilities for farming in a more sustainable way (De Cock L., 2005).

    Knowledge structure representation and automated updates in intelligent information management systems

    A continuing effort to apply rapid prototyping and Artificial Intelligence techniques to problems associated with projected Space Station-era information management systems is examined. In particular, timely updating of the various databases and knowledge structures within the proposed intelligent information management system (IIMS) is critical to support decision-making processes. Because of the large amounts of data entering the IIMS on a daily basis, information updates will need to be performed automatically, with some systems requiring that data be incorporated and made available to users within a few hours. Meeting these demands depends, first, on the design and implementation of information structures that are easily modified and expanded and, second, on the incorporation of intelligent automated update techniques that will allow meaningful information relationships to be established. Potential techniques are studied for developing such an automated update capability, and IIMS update requirements are examined in light of results obtained from the IIMS prototyping effort.

    Ensemble Committees for Stock Return Classification and Prediction

    This paper considers a portfolio trading strategy formulated by algorithms in the field of machine learning. The profitability of the strategy is measured by the algorithm's capability to consistently and accurately identify stock indices with positive or negative returns, and to generate a preferred portfolio allocation on the basis of a learned model. Stocks are characterized by time series data sets consisting of technical variables that reflect market conditions in a previous time interval, which are used to produce binary classification decisions in subsequent intervals. The learned model is constructed as a committee of random forest classifiers, a non-linear support vector machine classifier, a relevance vector machine classifier, and a constituent ensemble of k-nearest neighbors classifiers. The Global Industry Classification Standard (GICS) is used to explore the ensemble model's efficacy within the context of various fields of investment including Energy, Materials, Financials, and Information Technology. Data from 2006 to 2012, inclusive, are considered, which are chosen for providing a range of market circumstances for evaluating the model. The model is observed to achieve an accuracy of approximately 70% when predicting stock price returns three months in advance. Comment: 15 pages, 4 figures, Neukom Institute Computational Undergraduate Research prize - second place
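
The committee idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the members' outputs are combined, so the unweighted majority vote below is an assumption, and the three threshold rules are hypothetical stand-ins for the random forest, SVM, RVM, and k-nearest neighbors members. The feature layout (momentum, volatility, relative volume) is also invented for illustration.

```python
from typing import Callable, Sequence

# A committee member maps a feature vector to +1 (positive return) or -1.
Classifier = Callable[[Sequence[float]], int]


def committee_predict(members: Sequence[Classifier],
                      features: Sequence[float]) -> int:
    """Combine member votes by unweighted majority (assumed combination rule)."""
    votes = sum(member(features) for member in members)
    return 1 if votes > 0 else -1


# Hypothetical stand-in members; real members would be trained classifiers.
def momentum_rule(x: Sequence[float]) -> int:
    return 1 if x[0] > 0 else -1      # x[0]: recent momentum


def volatility_rule(x: Sequence[float]) -> int:
    return 1 if x[1] < 0.3 else -1    # x[1]: recent volatility


def volume_rule(x: Sequence[float]) -> int:
    return 1 if x[2] > 1.0 else -1    # x[2]: volume relative to average


members = [momentum_rule, volatility_rule, volume_rule]
prediction = committee_predict(members, [0.02, 0.5, 1.4])
```

An odd number of members avoids tied votes in this sketch; a weighted vote would be an equally plausible reading of "committee".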

    Feature selection methods for solving the reference class problem

    Probabilistic inferences from frequencies, such as "Most Quakers are pacifists; Nixon is a Quaker, so probably Nixon is a pacifist", suffer from the problem that an individual is typically a member of many "reference classes" (such as Quakers, Republicans, Californians, etc.) in which the frequency of the target attribute varies. How should one choose the best class or combine the information? The article argues that the problem can be solved by the feature selection methods used in contemporary Big Data science: the correct reference class is that determined by the features relevant to the target, and relevance is measured by correlation (that is, a feature is relevant if it makes a difference to the frequency of the target).
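
A small sketch may make the abstract's proposal concrete. It is not the article's method: the difference in target frequency with and without a feature stands in for the correlation measure the abstract mentions, and the records, the feature names, and the 0.2 threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, bool]


def frequency(records: List[Record], target: str) -> float:
    """Share of records in which the target attribute holds."""
    return sum(r[target] for r in records) / len(records)


def relevance(records: List[Record], feature: str, target: str) -> float:
    """Difference the feature makes to the frequency of the target."""
    with_f = [r for r in records if r[feature]]
    without_f = [r for r in records if not r[feature]]
    if not with_f or not without_f:
        return 0.0  # feature never varies, so it cannot make a difference
    return frequency(with_f, target) - frequency(without_f, target)


def reference_class_estimate(records: List[Record], individual: Record,
                             features: List[str], target: str,
                             threshold: float = 0.2) -> Tuple[List[str], float]:
    """Keep only relevant features, then use the matching records as the class."""
    relevant = [f for f in features
                if abs(relevance(records, f, target)) >= threshold]
    matching = [r for r in records
                if all(r[f] == individual[f] for f in relevant)]
    pool = matching if matching else records  # fall back to the whole population
    return relevant, frequency(pool, target)


# Toy population (invented): (quaker, californian, pacifist)
rows = [(True, True, True), (True, False, True), (True, True, True),
        (True, False, False), (False, True, False), (False, False, True),
        (False, True, False), (False, False, False)]
records = [{"quaker": q, "californian": c, "pacifist": p} for q, c, p in rows]

nixon = {"quaker": True, "californian": True}
relevant, estimate = reference_class_estimate(
    records, nixon, ["quaker", "californian"], "pacifist")
```

In this toy data, being a Quaker changes the pacifist frequency while being Californian does not, so only the first feature defines the reference class.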

    Using machine learning techniques to automate sky survey catalog generation

    We describe the application of machine classification techniques to the development of an automated tool for the reduction of a large scientific data set. The 2nd Palomar Observatory Sky Survey provides comprehensive photographic coverage of the northern celestial hemisphere. The photographic plates are being digitized into images containing on the order of 10^7 galaxies and 10^8 stars. Since the size of this data set precludes manual analysis and classification of objects, our approach is to develop a software system which integrates independently developed techniques for image processing and data classification. Image processing routines are applied to identify and measure features of sky objects. Selected features are used to determine the classification of each object. GID3* and O-BTree, two inductive learning techniques, are used to automatically learn classification decision trees from examples. We describe the techniques used, the details of our specific application, and the initial encouraging results which indicate that our approach is well-suited to the problem. The benefits of the approach are increased data reduction throughput, consistency of classification, and the automated derivation of classification rules that will form an objective, examinable basis for classifying sky objects. Furthermore, astronomers will be freed from the tedium of an intensely visual task to pursue more challenging analysis and interpretation problems given automatically cataloged data.

    An Analytical Comparison of Some Rule-Learning Programs
