2,621 research outputs found

    Generalized additive modelling with implicit variable selection by likelihood based boosting

    The use of generalized additive models in statistical data analysis suffers from the restriction to few explanatory variables and the problems of selection of smoothing parameters. Generalized additive model boosting circumvents these problems by means of stagewise fitting of weak learners. A fitting procedure is derived which works for all simple exponential family distributions, including binomial, Poisson and normal response variables. The procedure combines the selection of variables and the determination of the appropriate amount of smoothing. As weak learners, penalized regression splines and the newly introduced penalized stumps are considered. Estimates of standard deviations and stopping criteria, which are notorious problems in iterative procedures, are based on an approximate hat matrix. The method is shown to outperform common procedures for the fitting of generalized additive models. In particular, in high-dimensional settings it is the only method that works properly.
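
    The stagewise idea in this abstract can be illustrated with a minimal sketch. It is a simplification and not the authors' procedure: it assumes squared-error loss instead of a general exponential-family likelihood, uses plain stumps damped by a shrinkage factor `nu` instead of penalized stumps, and stops after a fixed number of steps instead of using a hat-matrix criterion. The names `fit_stump`, `boost`, and `predict` are hypothetical. At each step only the single best-fitting feature is updated, which is what produces implicit variable selection.

```python
def fit_stump(x, r):
    """Best single-split stump for residuals r on one feature x (least squares).

    Returns (gain, threshold, left_mean, right_mean), or None if x is constant.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    total = sum(r)
    best = None
    left_sum = 0.0
    for k in range(1, n):
        i = order[k - 1]
        left_sum += r[i]
        if x[order[k]] == x[i]:
            continue  # cannot split between equal values
        right_sum = total - left_sum
        # Reduction in squared error compared with predicting zero everywhere
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if best is None or gain > best[0]:
            threshold = (x[i] + x[order[k]]) / 2
            best = (gain, threshold, left_sum / k, right_sum / (n - k))
    return best


def boost(X, y, steps=50, nu=0.1):
    """Componentwise boosting: each step updates only the best single feature."""
    n, p = len(y), len(X[0])
    intercept = sum(y) / n
    pred = [intercept] * n
    stumps = []  # entries: (feature index, threshold, left value, right value)
    for _ in range(steps):
        resid = [y[i] - pred[i] for i in range(n)]
        candidates = []
        for j in range(p):
            s = fit_stump([row[j] for row in X], resid)
            if s is not None:
                candidates.append((s[0], j, s[1], s[2], s[3]))
        if not candidates:
            break
        _, j, thr, left, right = max(candidates)
        stumps.append((j, thr, nu * left, nu * right))  # shrunken update
        for i in range(n):
            pred[i] += nu * (left if X[i][j] <= thr else right)
    return intercept, stumps


def predict(intercept, stumps, row):
    return intercept + sum(lv if row[j] <= thr else rv for j, thr, lv, rv in stumps)
```

    Features that never win a step never enter `stumps`, so the set of feature indices appearing there is the selected model in this sketch.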

    Adult beginner distance language learner perceptions and use of assignment feedback

    This qualitative study examines perceptions and use of assignment feedback among adult beginner modern foreign language learners on higher education distance learning courses. A survey of responses to feedback on assignments by 43 Open University students on beginner language courses in Spanish, French, and German indicated that respondents can be classified into three groups: those who use feedback strategically by integrating it into the learning process and comparing it with, for example, informal feedback from interaction with native speakers; those who take note of feedback, but seem not to use it strategically; and those who appear to take little account of either marks or feedback. The first group proved to be the most confident and most likely to maintain their motivation in the longer term. The conclusion discusses some of the pedagogical and policy implications of the findings.

    Hypermedia for language learning: The FREE model at Coventry University

    Coventry University is pioneering the integration of hypermedia into the curriculum for the teaching of Italian language and society with the creation of a package based on Nerino Rossi's novel La neve nel bicchiere. The novel was already in use as a basic course text, and developing a hypermedia package was felt to be the ideal way of creating a more stimulating means of access to it. The procedure used in creating the package is described, as are its contents, the ways in which the students use it and the tasks they are given to perform, the feedback from the students, and its impact on their performance. The testing of the prototype has helped in creating a new cognitive model: the FREE (Fluid Role‐Exchange Environment), which functions as a fluid and interactive ‘pool’ where the three main actors, or actants, i.e. the learner, the lecturer and the computer, exchange roles. Within the FREE, students were involved in the construction and evaluation of the courseware, as well as testing the various versions of the prototype. The development and use of hypermedia inside and outside the classroom has made it possible to change both the students’ and the lecturer's attitude towards the material being learnt. However, the courseware does not seem to equip students sufficiently for essay writing, and this problem needs further investigation.

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys.
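
    The three problems named in this abstract can be made concrete with a small sketch. It is one possible instance and not a method taken from the survey itself: documents are represented as bags of lowercase whitespace-separated words, the classifier is a multinomial naive Bayes model with add-one smoothing, and evaluation is plain accuracy on held-out documents. The function names `tokenize`, `train`, `classify`, and `accuracy` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Document representation: a bag of lowercase whitespace-separated words."""
    return text.lower().split()


def train(docs):
    """Classifier construction from preclassified (text, category) pairs."""
    class_docs = Counter()             # number of documents per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for text, cat in docs:
        class_docs[cat] += 1
        for w in tokenize(text):
            word_counts[cat][w] += 1
            vocab.add(w)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_cat, best_score = None, -math.inf
    for cat in class_docs:
        total = sum(word_counts[cat].values())
        score = math.log(class_docs[cat] / n_docs)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[cat][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat


def accuracy(model, test_docs):
    """Classifier evaluation: fraction of held-out documents labelled correctly."""
    hits = sum(classify(model, text) == cat for text, cat in test_docs)
    return hits / len(test_docs)
```

    Porting this sketch to a new domain only requires a new set of preclassified documents, which is the portability advantage the abstract attributes to the machine learning approach.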

    Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models

    Various privacy-preserving frameworks that respect the individual's privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMM) using component-wise gradient boosting (CWB). Making use of CWB allows us to reframe the GAMM estimation as a distributed fitting of base learners using the $L_2$-loss. In order to account for the heterogeneity of different data location sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaption of CWB preserves all the important properties of the original algorithm, such as an unbiased feature selection and the feasibility to fit models in high-dimensional feature spaces, and yields equivalent model estimates as CWB on pooled data. Next to a derivation of the equivalence of both algorithms, we also showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
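
    Why a least-squares base learner can be fitted across sites without loss can be seen in a minimal sketch. It is an illustration under simplifying assumptions and not the paper's algorithm: a single linear base learner with design [1, x], where each site shares only aggregated sums (the entries of X^T X and X^T y) and never its individual rows. The names `site_summaries`, `combine`, and `solve` are hypothetical, and the sketch makes no claim about formal privacy guarantees.

```python
def site_summaries(xs, ys):
    """Computed locally at one site: entries of X^T X and X^T y for design [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n, sx, sxx, sy, sxy)


def combine(summaries):
    """Computed at the host: element-wise sum of the site-level aggregates."""
    return tuple(sum(parts) for parts in zip(*summaries))


def solve(summary):
    """Least-squares intercept and slope from the aggregated normal equations."""
    n, sx, sxx, sy, sxy = summary
    det = n * sxx - sx * sx  # zero if all x values are identical
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope
```

    Because the sums over all sites equal the sums over the pooled data, `solve(combine(...))` returns the same estimate as a fit on the pooled data, up to floating-point rounding.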

    A literature survey of active machine learning in the context of natural language processing

    Active learning is a supervised machine learning technique in which the learner is in control of the data used for learning. That control is utilized by the learner to ask an oracle, typically a human with extensive knowledge of the domain at hand, about the classes of the instances for which the model learned so far makes unreliable predictions. The active learning process takes as input a set of labeled examples, as well as a larger set of unlabeled examples, and produces a classifier and a relatively small set of newly labeled data. The overall goal is to create as good a classifier as possible, without having to mark up and supply the learner with more data than necessary. The learning process aims at keeping the human annotation effort to a minimum, only asking for advice where the training utility of the result of such a query is high. Active learning has been successfully applied to a number of natural language processing tasks, such as information extraction, named entity recognition, text categorization, part-of-speech tagging, parsing, and word sense disambiguation. This report is a literature survey of active learning from the perspective of natural language processing.
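
    The loop described in this abstract can be sketched with uncertainty sampling, one common query strategy. The sketch is a toy under stated assumptions and is not taken from the report: instances are single numbers, the classifier is a threshold with class 1 above it, the data are separable, and the seed set contains at least one example of each class. The names `fit_threshold` and `active_learn` are hypothetical, and `oracle` stands in for the human annotator.

```python
def fit_threshold(labeled):
    """Threshold classifier: midpoint between the largest class-0 example
    and the smallest class-1 example (assumes separable data)."""
    neg = max(x for x, y in labeled if y == 0)
    pos = min(x for x, y in labeled if y == 1)
    return (neg + pos) / 2


def active_learn(labeled, pool, oracle, budget):
    """Query the oracle for at most `budget` labels and return the final
    threshold together with the enlarged labeled set."""
    labeled = list(labeled)
    pool = list(pool)
    queries = 0
    while pool and queries < budget:
        thr = fit_threshold(labeled)
        # Uncertainty sampling: the instance closest to the decision boundary
        # is the one the current model is least sure about.
        x = min(pool, key=lambda v: abs(v - thr))
        pool.remove(x)
        labeled.append((x, oracle(x)))
        queries += 1
    return fit_threshold(labeled), labeled
```

    In this toy setting the queries home in on the true boundary much like a binary search, so a handful of labels is enough to place the threshold even though the pool holds many more unlabeled instances.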

    Binary and Ordinal Random Effects Models Including Variable Selection

    A likelihood-based boosting approach for fitting binary and ordinal mixed models is presented. In contrast to common procedures, it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. Constructed as a componentwise boosting method, it is able to perform variable selection, with the complexity of the resulting estimator being determined by information criteria. The method is investigated in simulation studies both for cumulative and sequential models and is illustrated by using real data sets.