20 research outputs found

    Choosing the most effective pattern classification model under learning-time constraint

    Get PDF
    Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presentsCNPQ - CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICOCAPES - COORDENAÇÃO DE APERFEIÇOAMENTO DE PESSOAL DE NÍVEL SUPERIOR - FAPESP - FUNDAÇÃO DE AMPARO À PESQUISA DO ESTADO DE SÃO PAULOFUNDECT - FUNDAÇÃO DE APOIO AO DESENVOLVIMENTO DConselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)CNPq [303182/2011-3, 477692/2012-5, 552559/2010-5, 481556/2009-5, 303673/2010-9, 470571/2013-6, 306166/2014-3, 311140/2014-9]CAPES [01-P-01965/2012]FAPESP [2011/14058-5, 2012/18768-0, 2007/52015-0, 2013/20387-7, 2014/16250-9]311140/2014-9; 303182/2011-3; 477692/2012-5; 552559/2010-5; 481556/2009-5; 303673/2010-9; 303182/2011-3; 470571/2013-6; 306166/2014-301-P-01965/20122011/14058-5, 2012/18768-0; 2007/52015-0; 2013/20387-7; 2014/16250-9sem informaçã

    Pervasive gaps in Amazonian ecological research

    Get PDF
    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear un derstanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5–7 vast areas of the tropics remain understudied.8–11 In the American tropics, Amazonia stands out as the world’s most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepre sented in biodiversity databases.13–15 To worsen this situation, human-induced modifications16,17 may elim inate pieces of the Amazon’s biodiversity puzzle before we can use them to understand how ecological com munities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple or ganism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region’s vulnerability to environmental change. 15%–18% of the most ne glected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lostinfo:eu-repo/semantics/publishedVersio

    Pervasive gaps in Amazonian ecological research

    Get PDF

    Pervasive gaps in Amazonian ecological research

    Get PDF
    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5,6,7 vast areas of the tropics remain understudied.8,9,10,11 In the American tropics, Amazonia stands out as the world's most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13,14,15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon's biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region's vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost

    Improving semi-supervised learning through optimum connectivity

    No full text
    Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)The annotation of large data sets by a classifier is a problem whose challenge increases as the number of labeled samples used to train the classifier reduces in comparison to the number of unlabeled samples. In this context, semi-supervised learning methods aim at discovering and labeling informative samples among the unlabeled ones, such that their addition to the correct class in the training set can improve classification performance. We present a semi-supervised learning approach that connects unlabeled and labeled samples as nodes of a minimum-spanning tree and partitions the tree into an optimum-path forest rooted at the labeled nodes. It is suitable when most samples from a same class are more closely connected through sequences of nearby samples than samples from distinct classes, which is usually the case in data sets with a reasonable relation between number of samples and feature space dimension. The proposed solution is validated by using several data sets and state-of-the-art methods as baselines. (C) 2016 Elsevier Ltd. All rights reserved.The annotation of large data sets by a classifier is a problem whose challenge increases as the number of labeled samples used to train the classifier reduces in comparison to the number of unlabeled samples. In this context, semi-supervised learning meth607285FUNDECT - FUNDAÇÃO DE APOIO AO DESENVOLVIMENTO DO ENSINO, CIÊNCIA E TECNOLOGIACNPQ - CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICOFAPESP - FUNDAÇÃO DE AMPARO À PESQUISA DO ESTADO DE SÃO PAULOConselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)303673/2010-9; 479070/2013-0; 302970/2014-2; 303182/2011-3; 470571/2013-6; 306166/2014-32013/20387-7; 2014/16250-9sem informaçã

    Example of interactive graph-based image segmentation.

    No full text
    <p>(a) The user draws labeled markers (a training set) inside and outside the object, and segmentation is based on optimum path competition from the markers in an image graph. (b) Image segmentation first relies on a pixel classifier, which is trained from the markers to create a fuzzy object map (the object should appear brighter than the background). (c) Second, the image is interpreted as a graph, whose arc weights should be lower on the border of the object than elsewhere. (d)-(f) The visual feedback from these results guides the user to the image location where more markers must be selected, improving fuzzy object map, arc weights, and so segmentation along a few interventions.</p

    SensIT Vehicle (combined).

    No full text
    <p>Comparison of all classifiers against each other with the Nemenyi test and learning time constraint equals to 1, 5, 20, 60, 300, and 1200 seconds. Groups of classifiers that are not significantly different (at p = 0.05) are connected.</p

    Correlation table between each pair of training sets,

    No full text
    <p><b>Cod-RNA</b> (a—d). <b>Connect</b> (e—h). <b>Covertype</b> (i—l). <b>IJCNN</b> (m—p). <b>SensIT</b> (q—t).</p
    corecore