4 research outputs found

    Iterative Optimization and Simplification of Hierarchical Clusterings

    Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in the background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts -- often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to 'externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions. Comment: See http://www.jair.org/ for any accompanying file.

    A probabilistic model of cross-categorization

    Most natural domains can be represented in multiple ways: we can categorize foods in terms of their nutritional content or social role, animals in terms of their taxonomic groupings or their ecological niches, and musical instruments in terms of their taxonomic categories or social uses. Previous approaches to modeling human categorization have largely ignored the problem of cross-categorization, focusing on learning just a single system of categories that explains all of the features. Cross-categorization presents a difficult problem: how can we infer categories without first knowing which features the categories are meant to explain? We present a novel model that suggests that human cross-categorization is a result of joint inference about multiple systems of categories and the features that they explain. We also formalize two commonly proposed alternative explanations for cross-categorization behavior: a features-first and an objects-first approach. The features-first approach suggests that cross-categorization is a consequence of attentional processes, where features are selected by an attentional mechanism first and categories are derived second. The objects-first approach suggests that cross-categorization is a consequence of repeated, sequential attempts to explain features, where categories are derived first, then features that are poorly explained are recategorized. We present two sets of simulations and experiments testing the models’ predictions about human categorization. We find that an approach based on joint inference provides the best fit to human categorization behavior, and we suggest that a full account of human category learning will need to incorporate something akin to these capabilities.

    Exploring students' conceptualisations of technology through their experiences of it (in and out of school)

    The aim of this study was to explore students’ experiences and conceptualisation of technology. The study employed a qualitative case study approach, with a sample comprising 16 students and four teachers from two schools, as the researcher believed that looking at more than one school would lead to a better understanding of technology education in the classroom. The data were collected using semi-structured interviews for both students and teachers and lesson observation notes, and transcripts were analysed using an inductive thematic analysis technique developed by Braun et al. (2014). The findings indicate the existence of a huge disconnect between the students’ experience of technology and their perspectives on technology, which seemed to suggest that more efforts are needed to introduce learning activities that reflect existing literature’s views of the meaning of technology. This study found that outside school, students conceptualised technology in terms of technological artefacts, which was indicative of their limited perspectives on technology. The findings also showed that in the classroom, most students’ experience of technology aligned more with theoretical aspects of technical drawing (construction of different angles, bisection of lines and angles), which is similar to what students are taught in technical and vocational education that aims to prepare them for specific jobs, while technology education leads them to develop technological literacy – the ability to use, manage, understand and evaluate technology in general. This difference highlighted the need to develop a framework for understanding technological concepts in order to help students understand not only familiar aspects of technology, but also unfamiliar things and ideas or concepts that have not been discussed much in the literature (De Vries, 2005; Collier-Reed, 2009; DiGironimo, 2011). 
    Consequently, this study proposed a model for conceptualising technology that has potential educational benefits, which could be developed to help future research, policy and practice to enhance students’ strengths and reduce their weaknesses in learning about technology. If we are educating students to be technologically literate, we must encourage them to advance their understanding of technology for real-world learning and help them to become global citizens.

    Ant colony optimization based clustering for data partitioning.

    Woo Kwan Ho. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 148-155). Abstracts in English and Chinese.
    Contents:
    Abstract
    Acknowledgements
    List of Figures
    List of Tables
    Chapter 1: Introduction
    Chapter 2: Literature Reviews
        2.1 Block Clustering
        2.2 Clustering XML by structure
            2.2.1 Definition of XML schematic information
            2.2.2 Identification of XML schematic information
    Chapter 3: Bi-Tour Ant Colony Optimization for diagonal clustering
        3.1 Motivation
        3.2 Framework of Bi-Tour Ant Colony Algorithm
        3.3 Re-order of the data matrix in BTACO clustering method
            3.3.1 Review of Ant Colony Optimization
            3.3.2 Bi-Tour Ant Colony Optimization
        3.4 Determination of partitioning scheme
            3.4.1 Weighed Sum of Error (WSE)
            3.4.2 Materialization of partitioning scheme via hypothetic matrix
            3.4.3 Search of best-fit hypothetic matrix
            3.4.4 Dynamic programming approach
            3.4.5 Heuristic partitioning approach
        3.5 Experimental Study
            3.5.1 Data set
            3.5.2 Study on DP Approach and HP Approach
            3.5.3 Study on parameter settings
            3.5.4 Comparison with GA-based & hierarchical clustering methods
        3.6 Chapter conclusion
    Chapter 4: Application of BTACO-based clustering in XML database system
        4.1 Introduction
        4.2 Overview of normalization and vertical partitioning in relational DB design
            4.2.1 Normalization of relational models in database design
            4.2.2 Vertical partitioning in database design
        4.3 Clustering XML documents
        4.4 Proposed approach using BTACO-based clustering
            4.4.1 Clustering XML documents by structure
            4.4.2 Clustering XML documents by user transaction patterns
            4.4.3 Implementation of Query Manager for our experimental study
        4.5 Experimental Study
            4.5.1 Experimental Study on the clustering by structure
            4.5.2 Experimental Study on the clustering by user access patterns
        4.6 Chapter conclusion
    Chapter 5: Conclusions
        5.1 Contributions
        5.2 Future works
    Bibliography
    Appendix I
    Appendix II
        Index tables for Profile A
        Index tables for Profile B
    Appendix III