
    Towards fully covariant machine learning

    Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the group of transformations that takes one possible representation to another. These are the passive symmetries; they include coordinate freedom, gauge symmetry, and units covariance, all of which have led to important results in physics. In machine learning, the most visible passive symmetry is the relabeling or permutation symmetry of graphs. Our goal is to understand the implications for machine learning of the many passive symmetries in play. We discuss dos and don'ts for machine learning practice if passive symmetries are to be respected. We discuss links to causal modeling, and argue that the implementation of passive symmetries is particularly valuable when the goal of the learning problem is to generalize out of sample. This paper is conceptual: It translates among the languages of physics, mathematics, and machine-learning. We believe that consideration and implementation of passive symmetries might help machine learning in the same ways that it transformed physics in the twentieth century. Comment: substantial revision from v1; submitted to TML
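The relabeling symmetry of graphs mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper; the function names (`relabel`, `degree_sequence`, `first_node_degree`) and the three-node example graph are assumptions made for illustration only. The sketch contrasts a feature that is unchanged when nodes are renamed with one that depends on the arbitrary labeling.

```python
def relabel(adjacency, permutation):
    """Return the adjacency matrix after renaming node i to permutation[i]."""
    n = len(adjacency)
    relabeled = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            relabeled[permutation[i]][permutation[j]] = adjacency[i][j]
    return relabeled


def degree_sequence(adjacency):
    """Sorted node degrees: does not depend on how the nodes are labeled."""
    return sorted(sum(row) for row in adjacency)


def first_node_degree(adjacency):
    """Degree of whichever node happens to carry label 0: depends on the labeling."""
    return sum(adjacency[0])


# Hypothetical example: the path graph 0 - 1 - 2, then nodes 0 and 1 swap names.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
swapped = relabel(path_graph, [1, 0, 2])

print(degree_sequence(path_graph), degree_sequence(swapped))      # same for both
print(first_node_degree(path_graph), first_node_degree(swapped))  # differs
```

A model built only from features of the first kind gives the same output for every relabeling of the same graph, which is the sense in which it respects the passive symmetry.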

    Information Retrieval Performance Enhancement Using The Average Standard Estimator And The Multi-criteria Decision Weighted Set

    Information retrieval is much more challenging than traditional small document collection retrieval. The main difference is the importance of correlations between related concepts in complex data structures. These structures have been studied by several information retrieval systems. This research began by performing a comprehensive review and comparison of several techniques of matrix dimensionality estimation and their respective effects on enhancing retrieval performance using singular value decomposition and latent semantic analysis. Two novel techniques have been introduced in this research to enhance intrinsic dimensionality estimation: the Multi-criteria Decision Weighted model, which estimates matrix intrinsic dimensionality for large document collections, and the Average Standard Estimator (ASE), which estimates data intrinsic dimensionality based on the singular value decomposition (SVD). ASE estimates the level of significance for singular values resulting from the singular value decomposition. ASE assumes that those variables with deep relations have sufficient correlation and that only those relationships with high singular values are significant and should be maintained. Experimental results over all possible dimensions indicated that ASE improved matrix intrinsic dimensionality estimation by including the effect of both the singular values' magnitude of decrease and random noise distractors. Analysis based on selected performance measures indicates that for each document collection there is a region of lower dimensionalities associated with improved retrieval performance. However, there was clear disagreement between the various performance measures on the model associated with best performance.
The introduction of the multi-weighted model and Analytical Hierarchy Processing (AHP) analysis helped rank dimensionality estimation techniques and facilitated satisfying overall model goals by leveraging contradicting constraints and satisfying information retrieval priorities. ASE provided the best estimate for MEDLINE intrinsic dimensionality among all other dimensionality estimation techniques, and further, ASE improved precision and relative relevance by 10.2% and 7.4% respectively. AHP analysis indicates that ASE and the weighted model ranked the best among other methods with 30.3% and 20.3% in satisfying overall model goals in MEDLINE and 22.6% and 25.1% for CRANFIELD. The weighted model improved MEDLINE relative relevance by 4.4%, while the scree plot, weighted model, and ASE provided better estimation of data intrinsic dimensionality for the CRANFIELD collection than Kaiser-Guttman and percentage of variance. The ASE dimensionality estimation technique provided a better estimation of CISI intrinsic dimensionality than all other tested methods, since all methods except ASE tend to underestimate the CISI document collection's intrinsic dimensionality. ASE improved CISI average relative relevance and average search length by 28.4% and 22.0% respectively. This research provided evidence supporting a system using a weighted multi-criteria performance evaluation technique resulting in better overall performance than a single-criterion ranking model. Thus, the weighted multi-criteria model with dimensionality reduction provides a more efficient implementation for information retrieval than using a full rank model.
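Two of the baseline estimators named in this abstract, Kaiser-Guttman and percentage of variance, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: squared singular values are treated as eigenvalues, Kaiser-Guttman is taken in its "greater than the mean eigenvalue" form, the 0.9 threshold is arbitrary, and the singular values are made-up numbers. ASE itself is not shown because the abstract does not define it.

```python
def kaiser_guttman_rank(singular_values):
    """Count components whose eigenvalue (squared singular value) exceeds the mean eigenvalue."""
    eigenvalues = [s * s for s in singular_values]
    mean_eigenvalue = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for ev in eigenvalues if ev > mean_eigenvalue)


def variance_fraction_rank(singular_values, threshold=0.9):
    """Smallest k whose leading k components explain at least `threshold` of the total variance."""
    eigenvalues = sorted((s * s for s in singular_values), reverse=True)
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(eigenvalues, start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical singular values of a term-document matrix (illustrative only).
example_singular_values = [10.0, 6.0, 3.0, 1.0, 0.5]

print(kaiser_guttman_rank(example_singular_values))           # 2
print(variance_fraction_rank(example_singular_values, 0.9))   # 2
print(variance_fraction_rank(example_singular_values, 0.95))  # 3
```

The chosen rank k would then be used to truncate the SVD before retrieval; as the abstract notes, different estimators can disagree on k for the same collection.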

    High School Physical Sciences Teachers’ Competence in Some Basic Cognitive Skills

    The successful implementation of the national high school Physical Sciences curriculum in South Africa, which places strong emphasis on critical thinking and reasoning abilities of students, would need teachers who are competent in cognitive skills and strategies. The main objectives of this study were to test South African high school Physical Sciences teachers’ competence in the cognitive skills and strategies needed for studying Physical Sciences effectively and also to identify possible reasons for their difficulties and suggest methods for overcoming them. The study method used was the analysis of teachers’ answers to questions that were carefully designed to test competence in explanation skills, mathematical skills, graphical skills, three-dimensional visualization skills, information-processing skills and reasoning skills. Seventy-three teachers from about 50 Dinaledi schools in the North West and KwaZulu-Natal provinces were tested. Teachers’ competence was found to be poor in most of the skills tested. About 40% (average performance in all 14 test questions) of them had difficulty in answering the questions. Teachers’ lack of competence in cognitive skills and strategies would be an important limiting factor in the successful implementation of the Physical Sciences curriculum. An urgent need therefore exists for training teachers to increase their competence in the cognitive skills and strategies that are needed for studying science effectively. Keywords: Cognitive skills, thinking skills, questions testing skills, problem solving, teacher training, high school physical scienc

    Application Of Model Problem Based Learning (PBL) With Creative Problem Solving (CPS) In Arithmetic Sequence And Series


    Holistic biomimicry: a biologically inspired approach to environmentally benign engineering

    Humanity's activities increasingly threaten Earth's richness of life, of which mankind is a part. As part of the response, the environmentally conscious attempt to engineer products, processes and systems that interact harmoniously with the living world. Current environmental design guidance draws upon a wealth of experiences with the products of engineering that damaged humanity's environment. Efforts to create such guidelines inductively attempt to tease right action from examination of past mistakes. Unfortunately, avoidance of past errors cannot guarantee environmentally sustainable designs in the future. One needs to examine and understand an example of an environmentally sustainable, complex, multi-scale system to engineer designs with similar characteristics. This dissertation benchmarks and evaluates the efficacy of guidance from one such environmentally sustainable system resting at humanity's doorstep - the biosphere. Taking a holistic view of biomimicry, emulation of and inspiration by life, this work extracts overarching principles of life from academic life science literature using a sociological technique known as constant comparative method. It translates these principles into bio-inspired sustainable engineering guidelines. During this process, it identifies physically rooted measures and metrics that link guidelines to engineering applications. Qualitative validation for principles and guidelines takes the form of review by biology experts and comparison with existing environmentally benign design and manufacturing guidelines. Three select bio-inspired guidelines at three different organizational scales of engineering interest are quantitatively validated. Physical experiments with self-cleaning surfaces quantify the potential environmental benefits generated by applying the first, sub-product scale guideline. 
An interpretation of a metabolically rooted guideline applied at the product / organism organizational scale is shown to correlate with existing environmental metrics and predict a sustainability threshold. Finally, design of a carpet recycling network illustrates the quantitative environmental benefits one reaps by applying the third, multi-facility scale bio-inspired sustainability guideline. Taken as a whole, this work contributes (1) a set of biologically inspired sustainability principles for engineering, (2) a translation of these principles into measures applicable to design, (3) examples demonstrating a new, holistic form of biomimicry and (4) a deductive, novel approach to environmentally benign engineering. Life, the collection of processes that tamed and maintained themselves on planet Earth's once hostile surface, long ago confronted and solved the fundamental problems facing all organisms. Through this work, it is hoped that humanity has taken one small step toward self-mastery, thus drawing closer to a solution to the latest problem facing all organisms. Ph.D. Committee Chair: Bert Bras; Committee Member: David Rosen; Committee Member: Dayna Baumeister; Committee Member: Janet Allen; Committee Member: Jeannette Yen; Committee Member: Matthew Realf

    A psychometric evaluation of a state testing program: accommodated versus non-accommodated students.

    Federal legislation such as No Child Left Behind mandated that students with disabilities be included in accountability standards, creating an important responsibility to fairly assess all students, even those with disabilities. Consequently, a sense of urgency was placed on the entire educational system to ensure that these students had a fair chance at being adequately tested, to ensure equity and access. Alternate assessments and accommodations are a part of access for these students in the testing realm. This study, which centered on fairness, focused on psychometrically comparing three subject exams administered to eighth grade students who received accommodations as opposed to those who did not. Covariance matrices, dimensionality, factor structure, and item functioning were all compared across groups to examine invariance and show that the use of accommodations did not affect the validity or fairness of the testing program. Analyses revealed that for each of the tests, there was a significant difference in the covariance matrices, but unidimensionality held across groups, implying that the dimensionality structure was the same for both groups. The factor structure was tested only for the math exam, and the one-factor structure held for accommodated and non-accommodated students, again confirming unidimensionality. Despite consistent dimensions being measured for both groups, differential item functioning (DIF) did manifest, but without the actual test items it could not be attributed to bias.

    Acquiring symbolic design optimization problem reformulation knowledge: On computable relationships between design syntax and semantics

    This thesis presents a computational method for the inductive inference of explicit and implicit semantic design knowledge from the symbolic-mathematical syntax of design formulations using an unsupervised pattern recognition and extraction approach. Existing research shows that AI / machine learning based design computation approaches either require high levels of knowledge engineering or large training databases to acquire problem reformulation knowledge. The method presented in this thesis addresses these methodological limitations. The thesis develops, tests, and evaluates ways in which the method may be employed for design problem reformulation. The method is based on the linear-algebra-based factorization method Singular Value Decomposition (SVD), dimensionality reduction, and similarity measurement through unsupervised clustering. The method calculates linear approximations of the associative patterns of symbol co-occurrences in a design problem representation to infer induced coupling strengths between variables, constraints and system components. Unsupervised clustering of these approximations is used to identify useful reformulations. These two components of the method automate a range of reformulation tasks that have traditionally required different solution algorithms. Example reformulation tasks that it performs include selection of linked design variables, parameters and constraints, design decomposition, modularity and integrative systems analysis, heuristically aiding design “case” identification, topology modeling and layout planning. The relationship between the syntax of design representation and the encoded semantic meaning is an open design theory research question. Based on the results of the method, the thesis presents a set of theoretical postulates on computable relationships between design syntax and semantics.
The postulates relate the performance of the method with empirical findings and theoretical insights provided by cognitive neuroscience and cognitive science on how the human mind engages in symbol processing and the resulting capacities inherent in symbolic representational systems to encode “meaning”. The performance of the method suggests that semantic “meaning” is a higher order, global phenomenon that lies distributed in the design representation in explicit and implicit ways. A one-to-one local mapping between a design symbol and its meaning, a largely prevalent approach adopted by many AI and learning algorithms, may not be sufficient to capture and represent this meaning. By changing the theoretical standpoint on how a “symbol” is defined in design representations, it was possible to use a simple set of mathematical ideas to perform unsupervised inductive inference of knowledge in a knowledge-lean and training-lean manner, for a knowledge domain that traditionally relies on “giving” the system complex design domain and task knowledge for performing the same set of tasks.