17 research outputs found

    Inductive Policy: The Pragmatics of Bias Selection

    This paper extends the currently accepted model of inductive bias by identifying six categories of bias and by separating inductive bias from the policy for its selection (the inductive policy). We analyze existing "bias selection" systems, examining the similarities and differences in their inductive policies, and identify three techniques useful for building inductive policies. We then present a framework for representing and automatically selecting a wide variety of biases and describe experiments with an instantiation of the framework addressing various pragmatic tradeoffs of time, space, accuracy, and the cost of errors. The experiments show that a common framework can be used to implement policies for a variety of different types of bias selection, such as parameter selection, term selection, and example selection, using similar techniques. The experiments also show that different tradeoffs can be made by the implementation of different policies; for example, from the same data different rule sets can be learned based on different tradeoffs of accuracy versus the cost of erroneous predictions.

    Scaling Up: Distributed Machine Learning with Cooperation

    Machine-learning methods are becoming increasingly popular for automated data analysis. However, standard methods do not scale up to massive scientific and business data sets without expensive hardware. This paper investigates a practical alternative for scaling up: the use of distributed processing to take advantage of the often dormant PCs and workstations available on local networks. Each workstation runs a common rule-learning program on a subset of the data. We first show that for commonly used rule-evaluation criteria, a simple form of cooperation can guarantee that a rule will look good to the set of cooperating learners if and only if it would look good to a single learner operating with the entire data set. We then show how such a system can further capitalize on different perspectives by sharing learned knowledge for significant reduction in search effort. We demonstrate the power of the method by learning from a massive data set taken from the domain of cel..
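
    The "if and only if" claim in the abstract above can be illustrated with a small sketch. It is not the paper's implementation: it assumes precision (true positives divided by covered examples) as the rule-evaluation criterion, and the names `coverage_counts`, `looks_good`, and `cooperative_verdict` are hypothetical. Each learner scores a rule on its own subset; a rule that looks good to at least one learner is then judged on the pooled counts from all learners.

```python
from typing import Callable, Dict, Sequence, Tuple

Example = Tuple[Dict[str, int], bool]    # (attribute values, is_positive)
Rule = Callable[[Dict[str, int]], bool]  # True when the rule covers an example


def coverage_counts(rule: Rule, data: Sequence[Example]) -> Tuple[int, int]:
    """Return (true positives, false positives) among the covered examples."""
    tp = sum(1 for attrs, label in data if rule(attrs) and label)
    fp = sum(1 for attrs, label in data if rule(attrs) and not label)
    return tp, fp


def looks_good(tp: int, fp: int, min_precision: float) -> bool:
    """Assumed criterion: the rule covers something and is precise enough."""
    covered = tp + fp
    return covered > 0 and tp >= min_precision * covered


def cooperative_verdict(
    rule: Rule, partitions: Sequence[Sequence[Example]], min_precision: float
) -> bool:
    """Judge a rule using only per-partition counts (assumed cooperation scheme)."""
    local = [coverage_counts(rule, part) for part in partitions]
    # Step 1: some learner must find the rule good on its own subset,
    # otherwise nobody proposes it.
    if not any(looks_good(tp, fp, min_precision) for tp, fp in local):
        return False
    # Step 2: the other learners report their counts; the pooled totals
    # equal the counts a single learner would see on the full data set.
    total_tp = sum(tp for tp, _ in local)
    total_fp = sum(fp for _, fp in local)
    return looks_good(total_tp, total_fp, min_precision)
```

    Under this assumed criterion, step 1 never discards a rule that is good on the full data: pooled precision is a weighted average of the local precisions, so a rule whose pooled precision meets the threshold must meet it on at least one subset.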

    Exploiting Background Knowledge in Automated Discovery

    Prior work in automated scientific discovery has been successful in finding patterns in data, given that a reasonably small set of mostly relevant features is specified. The work described in this paper places data in the context of large bodies of background knowledge. Specifically, data items are connected to multiple databases of background knowledge represented as inheritance networks. The system has made a practical impact on botanical toxicology research, which required linking cases of plant exposure to databases of botanical, geographical, and climate background knowledge.