
    Incremental Learning of Nonparametric Bayesian Mixture Models

    Clustering is a fundamental task in many vision applications. To date, most clustering algorithms work in a batch setting, in which training examples must be gathered in a large group before learning can begin. Here we explore incremental clustering, in which data can arrive continuously. We present a novel incremental model-based clustering algorithm based on nonparametric Bayesian methods, which we call Memory Bounded Variational Dirichlet Process (MB-VDP). The number of clusters is determined flexibly by the data, and the approach can be used to automatically discover object categories. The computation required to produce model updates is bounded and does not grow with the amount of data processed. The technique is well suited to very large datasets, and we show that our approach outperforms existing online alternatives for learning nonparametric Bayesian mixture models.
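
The bounded-update idea in the abstract above can be illustrated with a much simpler stand-in. The sketch below is not MB-VDP and involves no variational inference; it is a plain distance-threshold sequential clusterer, chosen only to show how keeping per-cluster sufficient statistics (a count and a running sum) lets each update run without revisiting earlier data. The class names and the `new_cluster_distance` parameter are hypothetical.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class ClusterStats:
    """Sufficient statistics for one cluster: point count and per-dimension sum."""
    count: int
    total: list[float]

    @property
    def mean(self) -> list[float]:
        return [s / self.count for s in self.total]


@dataclass
class IncrementalClusterer:
    # Hypothetical parameter: distance beyond which a point opens a new cluster.
    new_cluster_distance: float
    clusters: list[ClusterStats] = field(default_factory=list)

    def update(self, point: list[float]) -> int:
        """Absorb one point and return the index of the cluster it joined."""
        if self.clusters:
            distances = [dist(point, c.mean) for c in self.clusters]
            nearest = min(range(len(distances)), key=distances.__getitem__)
            if distances[nearest] <= self.new_cluster_distance:
                c = self.clusters[nearest]
                c.count += 1
                c.total = [s + x for s, x in zip(c.total, point)]
                return nearest
        # No cluster is close enough (or none exist yet): open a new one.
        self.clusters.append(ClusterStats(count=1, total=list(point)))
        return len(self.clusters) - 1


clusterer = IncrementalClusterer(new_cluster_distance=1.0)
stream = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels = [clusterer.update(p) for p in stream]
```

In this stand-in, memory grows with the number of clusters rather than with the number of points seen, and the number of clusters is set by the data and the threshold rather than fixed in advance.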

    Reusing Data and Metadata to Create New Metadata Through Machine-Learning & Other Programmatic Methods

    Recent improvements in natural language processing (NLP) enable metadata to be created programmatically from reused original metadata or even the dataset itself. Transfer learning applied to NLP has greatly improved performance and reduced training data requirements. In this talk, we'll compare machine-generated metadata to human-generated metadata and discuss characteristics of metadata and data archives that affect suitability for machine-learning reuse of metadata. Whereas human-generated metadata is often populated once, populated from the perspective of the data supplier, populated by many individuals with different words for the same thing, and limited in length, machine-generated metadata can be updated any number of times, generated from the perspective of any user, constrained to a standardized set of terms that can evolve over time, and be of any required length. Machine-learning-generated metadata offers benefits but also brings additional needs in terms of version control, process transparency, human-computer interaction, and IT requirements. As a successful example, we'll discuss how a dataset of abstracts and associated human-tagged keywords from a standardized list of several thousand keywords was used to create a machine-learning model that predicted keyword metadata for open-source code projects on code.nasa.gov. We'll also discuss a less successful example from data.nasa.gov to show how data archive architecture and characteristics of the initial metadata can be strong controls on how easy it is to leverage programmatic methods to reuse metadata to create additional metadata.

    Learning Heterogeneous Similarity Measures for Hybrid-Recommendations in Meta-Mining

    The notion of meta-mining has appeared recently and extends traditional meta-learning in two ways. First, it does not learn meta-models that support only the learning algorithm selection task, but ones that support the whole data-mining process. In addition, it abandons the so-called black-box approach to algorithm description followed in meta-learning. Now, in addition to datasets, algorithms and workflows also have descriptors. For the latter two, these descriptions are semantic, describing properties of the algorithms. With the availability of descriptors for both datasets and data-mining workflows, the traditional modelling techniques followed in meta-learning, typically based on classification and regression algorithms, are no longer appropriate. Instead, we are faced with a problem whose nature is much closer to the problems that appear in recommendation systems. The most important meta-mining requirements are that suggestions should use only dataset and workflow descriptors, and that the cold-start problem be handled, e.g. providing workflow suggestions for new datasets. In this paper we take a different view of the meta-mining modelling problem and treat it as a recommender problem. In order to account for the meta-mining specificities, we derive a novel metric-learning-based recommender approach. Our method learns two homogeneous metrics, one in the dataset space and one in the workflow space, and a heterogeneous one in the dataset-workflow space. All learned metrics reflect similarities established from the dataset-workflow preference matrix. We demonstrate our method on meta-mining over biological problems (microarray datasets). The application of our method is not limited to the meta-mining problem; its formulation is general enough to be applied to problems with similar requirements.
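
The cold-start requirement described above (recommending workflows for a new dataset from descriptors alone) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the form of the heterogeneous metric, so a bilinear score x^T W y is assumed here, and the matrix `W`, the descriptor vectors, and the workflow names are all hypothetical values rather than anything learned from a preference matrix.

```python
def heterogeneous_score(dataset_desc, workflow_desc, W):
    """Assumed bilinear similarity x^T W y between a dataset and a workflow descriptor."""
    return sum(
        x * W[i][j] * y
        for i, x in enumerate(dataset_desc)
        for j, y in enumerate(workflow_desc)
    )


def recommend(dataset_desc, workflows, W, top_k=1):
    """Rank workflows for a dataset using descriptors only (no past preferences needed)."""
    ranked = sorted(
        workflows,
        key=lambda name: heterogeneous_score(dataset_desc, workflows[name], W),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical stand-in for a learned dataset-workflow matrix (identity for simplicity).
W = [[1.0, 0.0],
     [0.0, 1.0]]

# Hypothetical workflow descriptors.
workflows = {"wf_a": [1.0, 0.0], "wf_b": [0.0, 1.0]}

# A new dataset with no recorded preferences, described only by its descriptor.
new_dataset = [0.2, 0.9]
best = recommend(new_dataset, workflows, W)
```

Because the score depends only on descriptors and `W`, a dataset that never appeared in the preference matrix can still be ranked against every workflow.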

    Methods of Measuring the Students’ Results Obtained in the Teaching-Learning Process

    The experimental implementation of multimedia teaching-learning technologies and the determination of their efficiency were carried out to establish which transformations are essential for the educational system in order to synchronize it with the general development tendencies of contemporary society. In this article I present the results of an experiment conducted at the Faculty of Economics Sciences, specialization Finances Banks. In this experiment we applied the parallel-groups technique, which involved 4 groups of second-year students: 2 groups formed the experimental team, for whom multimedia courses were used in the training process, and 2 were control groups, for whom teaching followed the traditional system. Statistical processing of the experimental data confirmed the hypotheses about the positive impact of implementing multimedia courses in the teaching-learning process in the experimental groups, and about the efficiency of the methods applied to the experimental groups compared to the traditional methods applied to the control groups. The research proposes a new perspective on the teaching-learning process, corresponding to present requirements, which, by using information technology, offers new possibilities to stimulate interest and new ways to actively involve the student in the knowledge process.
    Keywords: traditional teaching-learning process, multimedia technology, knowledge acquiring coefficient, automation coefficient, efficiency coefficient