The Lish: a data model to support analysis by end user programmers
For end user programmers needing to carry out data analysis, the spreadsheet is an attractive choice, but it offers little safety net against user errors. Reducing these errors is an active research area, but one aspect that has received little investigation is the role played by the underlying data model: the grid of cells. I am working on an alternative model, the “lish”, based on nested lists of cells. Its theoretical advantages include fewer and more concise formulae, and easier updates to the structure. A user study is in preparation to assess its practical utility.
Learned Changes in Stimulus Representations (A Personal History)
Almost 40 years ago I began what turned out to be a programme of research on the way in which experience can change the effectiveness of the events used as stimuli in standard associative learning procedures. In this personal history I will describe my early (failed) attempts to find evidence for the acquired distinctiveness of cues, and my conclusion that experience tends to reduce, not enhance, the associability of stimuli. I then go on to describe my attempts to square this conclusion with the stubborn empirical fact that, in some circumstances, pretraining with (or preexposure to) stimuli can facilitate subsequent discrimination between them.
I describe experiments (conducted mostly with rats as the subjects) showing how some of these effects can be explained in associative terms. Others, however, seemed to demand an explanation in terms of a new learning process that modulates the effective salience of stimuli. I go on to describe attempts to specify the nature of this process, and (bringing the story up to date) to describe recent experiments investigating the effects of salience modulation in human perceptual learning.
Experiment versus analogy in the search for animal sentience
Deciding between rival accounts of an instance of an animal’s behavior can frequently be achieved by experimental tests of the different predictions made by the alternatives. When, however, one (or both) of the alternatives is expressed in terms of the mental state of the animal, an experimental test to distinguish them can be hard to find. Although it is unsatisfactory in many ways, it may be necessary to fall back on argument from analogy with human behavior and experience.
Structuring Spreadsheets with the “Lish” Data Model
A spreadsheet is remarkably flexible in representing various forms of structured data, but the individual cells have no knowledge of the larger structures of which they may form a part. This can hamper comprehension and increase formula replication, and each of these raises the risk of error. We explore a novel data model (called the “lish”) that could form an alternative to the traditional grid in a spreadsheet-like environment. Its aim is to capture some of these higher structures while preserving the simplicity that makes a spreadsheet so attractive. It is based on cells organised into nested lists, in each of which the user may optionally employ a template to prototype repeating structures. These template elements can be likened to the marginal “cells” in the borders of a traditional worksheet, but are proper members of the sheet and may themselves contain internal structure. A small demonstration application shows the “lish” in operation.
Wide, long, or nested data? Reconciling the machine and human viewpoints
Data expressed in tables may be re-arranged in various forms while conveying the same information. This can create a tension when one form is easier for a human reader to comprehend, but another form is more convenient for processing by machine. This problem has received considerable attention for data scientists writing code, but rather less for end user analysts using spreadsheets. We propose a new data model, the “lish”, which supports a spreadsheet-like flexibility of layout, while capturing sufficient structure to facilitate processing. Using a typical example in a prototype editor, we demonstrate how it might help users resolve the tension between the two forms. A user study is in preparation.
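The tension between forms described above can be illustrated with a minimal sketch that converts the same table between a "wide" layout (one record per entity) and a "long" layout (one record per measurement). This is not the lish model or the prototype editor; the function names, the column names and the sample data are all hypothetical, chosen only to show that both forms carry the same information.

```python
def wide_to_long(rows, id_key, var_name="variable", value_name="value"):
    """Turn one record per entity (wide) into one record per measurement (long)."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every long row instead
            long_rows.append({id_key: row[id_key], var_name: key, value_name: value})
    return long_rows


def long_to_wide(rows, id_key, var_name="variable", value_name="value"):
    """Inverse of wide_to_long: gather the measurements back onto one record per entity."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        record[row[var_name]] = row[value_name]
    return list(wide.values())


# Hypothetical sample data: sales per region, with one column per year.
wide_table = [
    {"region": "North", "2022": 10, "2023": 12},
    {"region": "South", "2022": 7, "2023": 9},
]
long_table = wide_to_long(wide_table, "region", var_name="year", value_name="sales")
round_trip = long_to_wide(long_table, "region", var_name="year", value_name="sales")
```

The wide form is usually the one a reader scans most easily, while the long form is the one that filters and aggregates cleanly; the round trip shows that neither loses information.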
The Lish: A Data Model for Grid Free Spreadsheets
Throughout the history of the spreadsheet, and throughout the majority of research into improving it, the grid of cells has remained a constant as the underlying data model. An idea that has received recent interest is to provide users with a spreadsheet-like environment based on something other than a grid. The attraction is that if salient features of the data structure can be made more explicit, the machine will be able to provide certain types of error checking and automation.
In this project I consider one such grid replacement, a new data model which I call the “lish”. It is based on nested lists of cells, composed according to rules that allow repeating structures to be described. It allows columns, tables, groups of tables and other structures to be treated as coherent objects. This supports a novel form of cell range selection, and allows the machine to ensure that related structures are kept consistent. The model is also more accommodating than the grid of dynamic space allocation, where the number of cells occupied by a result is not known in advance.
Then, I develop a “lish calculus”, an extension to vector arithmetic for hierarchical structures that provides a concise notation for calculations with lishes. This simplifies the usual spreadsheet formula expressions, and enables the machine to interpret them consistently with the context in which they are located.
I evaluate the lish in the framework of the cognitive dimensions of notations, with the help of example use cases and a user study based on a prototype lish editor. These verify many of the hypothesised advantages, but also reveal some difficulties for users. I close with an analysis of how the lish might be revised to address these shortcomings, while continuing to capitalise on the essential benefits.
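The abstract above describes the lish calculus only as an extension of vector arithmetic to hierarchical structures. As a rough illustration of that general idea, and not of the author's actual calculus, the following sketch applies a binary operator elementwise over nested Python lists, broadcasting a scalar over any structure. The function name `nested_apply`, the broadcasting rule and the sample values are assumptions made for this example.

```python
import operator


def nested_apply(op, left, right):
    """Apply a binary operator elementwise over nested lists.

    Assumed rule (not taken from the source): a scalar on either side is
    broadcast over the structure on the other side, and two lists must have
    the same length at every level.
    """
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)
    if left_is_list and right_is_list:
        if len(left) != len(right):
            raise ValueError("structures do not match")
        return [nested_apply(op, a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [nested_apply(op, a, right) for a in left]
    return [nested_apply(op, left, b) for b in right]


# Hypothetical data: two groups of items, with a price and a quantity for each item.
prices = [[2.0, 3.0], [4.0]]
quantities = [[10, 1], [5]]
totals = nested_apply(operator.mul, prices, quantities)
doubled = nested_apply(operator.mul, prices, 2)
```

A single expression over whole structures takes the place of one formula per cell, which is the kind of conciseness the abstract attributes to the calculus.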
A logic boosting approach to inducing multiclass alternating decision trees
The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several methods for extending the algorithm to the multiclass case by splitting the problem into several two-class problems. It also adapts the LogitBoost procedure to induce alternating decision trees directly. Experimental results confirm that this procedure is comparable in accuracy with methods that are based on the original ADTree formulation, while inducing much smaller trees.
Data mining in bioinformatics using Weka
The Weka machine learning workbench provides a general purpose environment for automatic classification, regression, clustering and feature selection, which are common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and supports data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it.
Benchmarking attribute selection techniques for discrete class data mining
Data engineering is generally considered to be a central issue in the development of data mining applications. The success of many learning schemes, in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building phase can result in poor predictive performance and increased computation.
Attribute selection generally involves a combination of search and attribute utility estimation plus evaluation with respect to specific learning schemes. This leads to a large number of possible permutations and has led to a situation where very few benchmark studies have been conducted.
This paper presents a benchmark comparison of several attribute selection methods for supervised classification. All the methods produce an attribute ranking, a useful device for isolating the individual merit of an attribute. Attribute selection is achieved by cross-validating the attribute rankings with respect to a classification learner to find the best attributes. Results are reported for a selection of standard data sets and two diverse learning schemes: C4.5 and naïve Bayes.
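To make the notion of an attribute ranking concrete, here is a minimal sketch that scores discrete attributes by information gain with respect to the class and sorts them. The abstract does not name its ranking methods, so information gain is an assumption chosen for illustration, as are the function names and the toy data. The sketch covers only the ranking step, not the cross-validation against a learner.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on one discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Return (attribute, score) pairs sorted from most to least informative."""
    scores = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: "outlook" determines the class, "windy" is irrelevant.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
ranking = rank_attributes(rows, labels)
```

Given such a ranking, the selection step described in the abstract would train the learner on the top-k attributes for increasing k and keep the k with the best cross-validated accuracy.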