566 research outputs found
Nonparametric Statistical Inference with an Emphasis on Information-Theoretic Methods
This book addresses contemporary statistical inference problems in which no assumptions, or only minimal ones, are imposed on the nature of the studied phenomenon. Information-theoretic methods play an important role in such scenarios. The approaches discussed include various high-dimensional regression problems, time series, and dependence analyses.
VI Workshop on Computational Data Analysis and Numerical Methods: Book of Abstracts
The VI Workshop on Computational Data Analysis and Numerical Methods (WCDANM) will be held on June 27-29, 2019, in the Department of Mathematics of the University of Beira Interior (UBI), Covilhã, Portugal. It is a unique opportunity to disseminate scientific research in Mathematics in general, with particular relevance to Computational Data Analysis and Numerical Methods in both theoretical and practical fields, using new techniques and giving special emphasis to applications in Medicine, Biology, Biotechnology, Engineering, Industry, Environmental Sciences, Finance, Insurance, Management and Administration. The meeting will provide a forum for discussion and debate of ideas of interest to the scientific community in general. New scientific collaborations among colleagues, namely in Masters and PhD projects, are expected to arise from the meeting. The event is open to the entire scientific community (with or without a communication/poster).
Density Estimates as Representations of Agricultural Fields for Remote Sensing-Based Monitoring of Tillage and Vegetation Cover
We consider the use of remote sensing for large-scale monitoring of agricultural land use, focusing on classification of tillage and vegetation cover for individual field parcels across large spatial areas. From the perspective of remote sensing and modelling, field parcels are challenging objects of interest due to their highly varying shape and size but relatively uniform pixel content and texture. To model such areas we need representations that can be reliably estimated even for small parcels and that are invariant to the size of the parcel. We propose representing the parcels using density estimates of remote imaging pixels and provide a computational pipeline that combines the representation with arbitrary supervised learning algorithms, while allowing easy integration of multiple imaging sources. We demonstrate the method in the task of automatic monitoring of the autumn tillage method and vegetation cover of Finnish crop fields, based on the integrated analysis of the intensity of Synthetic Aperture Radar (SAR) polarisation bands of the Sentinel-1 satellite and spectral indices calculated from Sentinel-2 multispectral image data. We use a collection of 127,757 field parcels monitored in April 2018 and annotated with six tillage method and vegetation cover classes, reaching 70% classification accuracy on test parcels when using both SAR and multispectral data. Besides this task, the method could also be applied directly to other agricultural monitoring tasks, such as crop yield prediction.
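The core idea above, a fixed-length, size-invariant density estimate per parcel fed to an arbitrary supervised learner, can be sketched as follows. This is a toy illustration and not the authors' pipeline: the beta-distributed "pixels", the two hypothetical cover classes, and the choice of a normalized histogram plus scikit-learn's RandomForestClassifier are all my own assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def parcel_histogram(pixels, bins=16, value_range=(0.0, 1.0)):
    """Normalized histogram of a parcel's pixel values: a fixed-length,
    size-invariant stand-in for the density-estimate representation."""
    hist, _ = np.histogram(pixels, bins=bins, range=value_range)
    return hist / max(hist.sum(), 1)

# Toy parcels: variable numbers of pixels drawn from two different
# beta distributions, one per (hypothetical) cover class.
rng = np.random.default_rng(0)
parcels, labels = [], []
for _ in range(200):
    cls = int(rng.integers(2))
    a = 1.0 if cls == 0 else 5.0
    parcels.append(rng.beta(a, 2.0, size=int(rng.integers(30, 300))))
    labels.append(cls)

# Every parcel maps to the same 16-dim feature vector regardless of its
# size, so any supervised learner can be plugged in downstream.
X = np.stack([parcel_histogram(p) for p in parcels])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:150], labels[:150])
acc = clf.score(X[150:], labels[150:])
```

In a real pipeline the histogram would be computed over SAR intensities and spectral indices (possibly jointly, as a multivariate density estimate), but the size-invariance argument is the same: parcels with 30 pixels and 300 pixels produce comparable feature vectors.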
Model-based Boosting in R: A Hands-on Tutorial Using the R Package mboost
We provide a detailed hands-on tutorial for the R add-on package mboost. The package implements boosting for optimizing general risk functions, utilizing component-wise (penalized) least squares estimates as base-learners for fitting various kinds of generalized linear and generalized additive models to potentially high-dimensional data. We give the theoretical background and demonstrate how mboost can be used to fit interpretable models of different complexity. As a running example throughout the tutorial, we use mboost to predict body fat from anthropometric measurements.
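mboost itself is an R package; as a language-neutral illustration of the algorithm it implements, here is a minimal Python sketch of component-wise L2 boosting. The function name, step settings, and toy data are my own assumptions, not mboost's API: at each iteration a univariate least-squares base-learner is fitted per covariate to the current residuals, and only the best-fitting component is updated, which yields implicit variable selection.

```python
import numpy as np

def componentwise_l2_boost(X, y, steps=300, nu=0.1):
    """Component-wise L2 boosting: each iteration fits one simple
    least-squares base-learner per covariate to the current residuals
    and updates only the best-fitting coordinate."""
    n, p = X.shape
    coef = np.zeros(p)
    intercept = y.mean()
    resid = y - intercept
    for _ in range(steps):
        best_j, best_b, best_sse = 0, 0.0, np.inf
        for j in range(p):
            xj = X[:, j]
            b = xj @ resid / (xj @ xj)          # univariate LS fit to residuals
            sse = ((resid - b * xj) ** 2).sum()
            if sse < best_sse:
                best_j, best_b, best_sse = j, b, sse
        coef[best_j] += nu * best_b             # damped update of one coordinate
        resid -= nu * best_b * X[:, best_j]
    return intercept, coef

# Toy data: only the first two of five covariates matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)
intercept, coef = componentwise_l2_boost(X, y)
```

The damping factor `nu` is the usual small step length in boosting; early stopping (the `steps` budget) acts as regularization, so irrelevant covariates keep near-zero coefficients.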
Algoritmos de aprendizagem adaptativos para classificadores de redes Bayesianas (Adaptive learning algorithms for Bayesian network classifiers)
PhD in Mathematics (Doutoramento em Matemática).
This thesis mainly addresses the development of adaptive learning algorithms
for Bayesian network classifiers (BNCs) in an on-line learning scenario, in which data arrive at the learning system sequentially. The current predictive model first makes a prediction and is then updated with the new data. This scenario corresponds to Dawid's prequential approach to the statistical validation of models. An efficient adaptive algorithm in a prequential learning framework must, above all, be able to improve its predictive accuracy over time while reducing the cost of adaptation. However, in many real-world situations it may be difficult to improve performance and adapt to changing environments, a problem known as concept drift. In changing environments, learning algorithms should be equipped with control and adaptation mechanisms that allow them to adjust quickly to these changes.
We have integrated all the adaptive algorithms into an adaptive prequential
framework for supervised learning called AdPreqFr4SL, which attempts to
handle the cost-performance trade-off and also to cope with concept drift.
The cost-performance trade-off is approached through bias management and adaptation control. The rationale is as follows. Instead of selecting a particular class of BNCs and using it throughout the learning process, we use the class of k-Dependence Bayesian Classifiers (k-DBCs) and start with the simplest model, Naïve Bayes (obtained by setting the maximum number of allowable attribute dependencies, k, to 0). We can then improve the performance of Naïve Bayes over time by trading off bias reduction, achieved by adding new attribute dependencies, against variance reduction, achieved by estimating the parameters more accurately. As the learning process advances, we place more focus on bias management: we reduce the bias resulting from the independence assumption by gradually adding dependencies between the attributes over time. To this end, we gradually increase k so that at each learning step we can use a class-model of k-DBCs that better suits the available data. Thus, we can avoid the problems caused by either too much bias (underfitting) or too much variance (overfitting).
On the other hand, updating the structure of BNCs with new data is a very costly task, so some adaptation control is desirable to decide whether adapting the structure is necessary. We reduce the cost of updating by using new data primarily to adapt the parameters; only when it is detected that the current structure no longer guarantees the desired improvement in performance do we adapt the structure. To handle concept drift, our framework includes a method based on Statistical Quality Control, which has proved effective at recognizing concept changes.
We experimentally evaluated the AdPreqFr4SL on artificial domains and benchmark problems and showed its advantages in comparison with its non-adaptive versions.
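The prequential predict-then-update protocol described above can be sketched with the k = 0 member of the k-DBC family, a discrete Naïve Bayes. This is a minimal illustrative sketch, not the thesis's AdPreqFr4SL: the class, the Laplace smoothing, and the synthetic data stream are my own assumptions, and the structure adaptation (increasing k) and drift control are omitted.

```python
import numpy as np

class PrequentialNB:
    """Minimal discrete Naive Bayes (the k = 0 member of the k-DBC family),
    used prequentially: predict first, then learn from the revealed label."""

    def __init__(self, n_features, n_values, n_classes, alpha=1.0):
        # Laplace-smoothed sufficient statistics.
        self.class_counts = np.full(n_classes, alpha)
        self.feat_counts = np.full((n_classes, n_features, n_values), alpha)

    def predict(self, x):
        logp = np.log(self.class_counts / self.class_counts.sum())
        for f, v in enumerate(x):
            cond = self.feat_counts[:, f, v] / self.feat_counts[:, f].sum(axis=1)
            logp += np.log(cond)
        return int(np.argmax(logp))

    def update(self, x, y):
        self.class_counts[y] += 1
        for f, v in enumerate(x):
            self.feat_counts[y, f, v] += 1

# Prequential loop: the error is counted on the prediction made *before*
# the example is used for learning, as in Dawid's prequential approach.
rng = np.random.default_rng(1)
model = PrequentialNB(n_features=3, n_values=2, n_classes=2)
errors = 0
for t in range(2000):
    y = int(rng.integers(2))
    x = [y if rng.random() < 0.85 else 1 - y for _ in range(3)]  # noisy copies of y
    errors += model.predict(x) != y
    model.update(x, y)
error_rate = errors / 2000
```

Because only the counts are touched at each step, a parameter update is O(number of features); this is the cheap adaptation path the thesis contrasts with the costly structure update, which would only be triggered when monitored performance stalls.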
Towards More Scalable and Robust Machine Learning
For many data-intensive real-world applications, such as recognizing objects in images, detecting spam emails, and recommending items on retail websites, the most successful current approaches involve learning rich prediction rules from large datasets. These machine learning tasks pose many challenges. For example, as the size of the datasets and the complexity of the prediction rules increase, there is a significant challenge in designing scalable methods that can effectively exploit the availability of distributed computing units. As another example, many machine learning applications are subject to data corruption, communication errors, and even adversarial attacks during training and testing. To build reliable machine learning models, we therefore also have to tackle the challenge of robustness in machine learning.
In this dissertation, we study several topics on scalability and robustness in large-scale learning, with a focus on establishing solid theoretical foundations for these problems, and we demonstrate recent progress towards the ambitious goal of building more scalable and robust machine learning models. We start with the speedup saturation problem in distributed stochastic gradient descent (SGD) algorithms with large mini-batches. We introduce the notion of gradient diversity, a metric of the dissimilarity between concurrent gradient updates, and show its key role in the convergence and generalization performance of mini-batch SGD. We then move on to Byzantine distributed learning, a topic that involves both scalability and robustness in distributed learning. In the Byzantine setting that we consider, a fraction of the distributed worker machines can exhibit arbitrary or even adversarial behavior. We design statistically and computationally efficient algorithms to defend against Byzantine failures in distributed optimization with convex and non-convex objectives. Lastly, we discuss the adversarial example phenomenon.
We provide a theoretical analysis of the adversarially robust generalization properties of machine learning models through the lens of Rademacher complexity.
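The gradient diversity metric mentioned in the abstract is commonly formalized as the ratio of the sum of squared per-example gradient norms to the squared norm of their sum; the sketch below uses that formalization, which may differ in normalization from the dissertation's exact definition.

```python
import numpy as np

def gradient_diversity(grads):
    """Ratio of the sum of squared per-example gradient norms to the
    squared norm of their sum. Identical gradients give the minimum
    value 1/n; larger values indicate dissimilar concurrent updates,
    which permit larger mini-batches before speedup saturates."""
    grads = np.asarray(grads, dtype=float)
    num = (grads ** 2).sum()
    denom = np.linalg.norm(grads.sum(axis=0)) ** 2
    return num / denom

identical = [[1.0, 0.0]] * 4   # four equal gradients: redundant updates
orthogonal = np.eye(4)         # four mutually orthogonal gradients
```

With four identical gradients the metric attains its minimum 1/4 (averaging them adds nothing over a single one), while four orthogonal gradients give 1.0: each example contributes a genuinely new direction, so a mini-batch of all four is as informative as four sequential steps.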