8 research outputs found

    Case Representation, Acquisition, and Retrieval in SIROCCO

    As part of our investigation of how abstract principles are operationalized to facilitate their application to specific fact situations, we have begun to develop and experiment with SIROCCO (System for Intelligent Retrieval of Operationalized Cases and COdes), a CBR retrieval and analysis system applied to the domain of engineering ethics. SIROCCO is intended to retrieve decided engineering ethics cases and previously applied ethics codes to assist engineers and students in analyzing new cases. Here we describe three things: a limited but expressive language designed to represent a wide range of ethics cases in SIROCCO; a World Wide Web tool developed to perform case acquisition and support a measure of consistency in representation; and an experiment to validate the initial phase of SIROCCO's retrieval algorithm and test its sensitivity to small variations in case description.

    Unsupervised Feature Selection for Text Data

    Feature selection for unsupervised tasks is particularly challenging, especially when dealing with text data. The increase in online documents and email communication creates a need for tools that can operate without the supervision of the user. In this paper we look at novel feature selection techniques that address this need. A distributional similarity measure from information theory is applied to measure feature utility. This utility informs the search for both representative and diverse features in two complementary ways: CLUSTER divides the entire feature space and then selects one feature to represent each cluster, while GREEDY grows the feature subset one greedily selected feature at a time. In particular, we found that GREEDY's local search is suited to learning smaller feature subsets, while CLUSTER is able to improve the global quality of larger feature sets. Experiments with four email data sets show significant improvement in retrieval accuracy with nearest-neighbour-based search methods compared to an existing frequency-based method. Importantly, both GREEDY and CLUSTER make significant progress towards the upper-bound performance set by a standard supervised feature selection method.
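
The greedy strategy described in this abstract can be illustrated with a minimal sketch. It is not the authors' GREEDY algorithm: the abstract does not name the similarity measure or the selection criterion, so Jensen-Shannon divergence stands in for the "distributional similarity measure", and the seeding and max-min selection rules, along with the names `js_divergence`, `normalise`, and `greedy_select`, are assumptions made only for illustration.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Assumed stand-in for the abstract's unnamed distributional measure.
    """
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def normalise(counts):
    """Turn raw per-document counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def greedy_select(feature_counts, k):
    """Grow a feature subset one feature at a time (hypothetical criterion).

    Seed with the feature closest on average to all others (representative),
    then repeatedly add the feature whose nearest selected neighbour is
    farthest away (diverse).
    """
    dists = {f: normalise(c) for f, c in feature_counts.items()}
    names = sorted(dists)
    seed = min(
        names,
        key=lambda f: sum(js_divergence(dists[f], dists[g]) for g in names),
    )
    selected = [seed]
    while len(selected) < min(k, len(names)):
        remaining = [f for f in names if f not in selected]
        best = max(
            remaining,
            key=lambda f: min(js_divergence(dists[f], dists[s]) for s in selected),
        )
        selected.append(best)
    return selected


# Toy data (invented): how often each term occurs in four emails.
toy_counts = {
    "meeting": [4, 4, 0, 0],
    "agenda": [3, 5, 0, 0],
    "invoice": [0, 0, 5, 3],
    "payment": [0, 0, 4, 4],
}
chosen = greedy_select(toy_counts, 2)
```

On the toy data, the two selected terms come from different topical groups, because terms that never share a document have the maximum divergence of 1.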

    An analysis of research themes in the CBR conference literature

    9th European Conference on Case-Based Reasoning (ECCBR 2008), Trier, Germany, September 1-4, 2008
    After fifteen years of CBR conferences, this paper sets out to examine the themes that have evolved in CBR research as revealed by the implicit and explicit relationships between the conference papers. We have examined a number of metrics for demonstrating connections between papers and between authors and have found that a clustering based on co-citation of papers appears to produce the most meaningful organisation. We have employed an Ensemble Non-negative Matrix Factorisation (NMF) approach that produces a “soft” hierarchical clustering, where papers can belong to more than one cluster. This is useful as papers can naturally relate to more than one research area. We have produced timelines for each of these clusters that highlight influential papers and illustrate the life-cycle of research themes over the last fifteen years. The insights afforded by this analysis are presented in detail. In addition to the analysis of the sub-structure of CBR research, this paper also presents some global statistics on the CBR conference literature.
    Science Foundation Ireland, Grant No. 05/IN.1/I24
    Conference details: http://www.wi2.uni-trier.de/eccbr08/index.ph
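
The co-citation relationship that this abstract clusters on can be illustrated with a small sketch. Two papers are co-cited whenever a third paper cites both. The function name `cocitation_counts` and the toy citation lists are invented for illustration, and the sketch covers only the counting step, not the Ensemble NMF clustering.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations):
    """Count how often each pair of papers is cited together.

    `citations` maps a citing paper to the list of papers it cites.
    Pairs are stored as sorted tuples so (A, B) and (B, A) coincide.
    """
    counts = Counter()
    for cited in citations.values():
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts


# Toy citation lists (hypothetical paper identifiers).
toy_citations = {
    "P1": ["A", "B"],
    "P2": ["A", "B", "C"],
    "P3": ["B", "C"],
}
pairs = cocitation_counts(toy_citations)
```

The resulting pair counts can be arranged as a symmetric paper-by-paper matrix, which is the kind of non-negative input a matrix factorisation method operates on.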

    The Depth of the Surface Zone of a Liquid
