20 research outputs found
Choosing the most effective pattern classification model under learning-time constraint
Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.
Funding: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) [303182/2011-3, 477692/2012-5, 552559/2010-5, 481556/2009-5, 303673/2010-9, 470571/2013-6, 306166/2014-3, 311140/2014-9]; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) [01-P-01965/2012]; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) [2011/14058-5, 2012/18768-0, 2007/52015-0, 2013/20387-7, 2014/16250-9]; FUNDECT.
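The idea of learning from classification errors within a time limit can be illustrated with a minimal sketch. It is not the authors' protocol: the nearest-centroid classifier, the function names, and the `init_size` and `batch` parameters are all assumptions chosen to keep the example small. Any classifier with fit and predict steps could take its place.

```python
import random
import time

def fit_centroids(samples, labels):
    """Nearest-centroid stand-in for any classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, c in zip(samples, labels):
        acc = sums.setdefault(c, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict(model, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))

def learn_under_time_limit(samples, labels, time_limit_s,
                           init_size=10, batch=10, seed=0):
    """Grow the training set with misclassified learning samples until the
    time budget is spent or no errors remain (hypothetical procedure)."""
    rng = random.Random(seed)
    pool = list(range(len(samples)))
    rng.shuffle(pool)
    train, pool = pool[:init_size], pool[init_size:]
    deadline = time.perf_counter() + time_limit_s

    def fit():
        return fit_centroids([samples[i] for i in train],
                             [labels[i] for i in train])

    model = fit()
    while pool and time.perf_counter() < deadline:
        # Classify the rest of the learning set and collect the errors.
        wrong = [i for i in pool if predict(model, samples[i]) != labels[i]]
        if not wrong:
            break
        picked = wrong[:batch]
        train.extend(picked)
        picked_set = set(picked)
        pool = [i for i in pool if i not in picked_set]
        model = fit()  # retrain on the enlarged training set
    return model, len(train)
```

Under this sketch, a faster classifier completes more retraining rounds before the deadline and so ends with a larger training set; whether that translates into better accuracy on unseen data is the question the abstract raises.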
Pervasive gaps in Amazonian ecological research
Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5–7 vast areas of the tropics remain understudied.8–11 In the American tropics, Amazonia stands out as the world’s most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13–15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon’s biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region’s vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost.
Improving semi-supervised learning through optimum connectivity
The annotation of large data sets by a classifier is a problem whose challenge increases as the number of labeled samples used to train the classifier reduces in comparison to the number of unlabeled samples. In this context, semi-supervised learning methods aim at discovering and labeling informative samples among the unlabeled ones, such that their addition to the correct class in the training set can improve classification performance. We present a semi-supervised learning approach that connects unlabeled and labeled samples as nodes of a minimum-spanning tree and partitions the tree into an optimum-path forest rooted at the labeled nodes. It is suitable when most samples from the same class are more closely connected through sequences of nearby samples than samples from distinct classes, which is usually the case in data sets with a reasonable relation between number of samples and feature space dimension. The proposed solution is validated by using several data sets and state-of-the-art methods as baselines. © 2016 Elsevier Ltd. All rights reserved.
Funding: Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia (FUNDECT); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) [303673/2010-9; 479070/2013-0; 302970/2014-2; 303182/2011-3; 470571/2013-6; 306166/2014-3]; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) [2013/20387-7; 2014/16250-9].
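The tree-partitioning idea can be sketched in a few lines. This is a simplified illustration, not the published method: it assumes Euclidean distances, builds the minimum-spanning tree with a quadratic-time Prim's algorithm, and uses the maximum edge weight along a tree path as the path cost, so that each unlabeled node takes the label of the seed reaching it at the lowest cost. The function names and the `seeds` dictionary are hypothetical.

```python
import heapq
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns adjacency lists of (neighbor, edge_weight) pairs."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge into the tree
    parent = [-1] * n
    best[0] = 0.0
    adj = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return adj

def propagate_labels(points, seeds):
    """seeds maps node index -> class label. Every node receives the label
    of the seed whose tree path to it has the smallest maximum edge weight."""
    adj = minimum_spanning_tree(points)
    cost = [math.inf] * len(points)
    label = [None] * len(points)
    heap = []
    for s, lab in seeds.items():
        cost[s], label[s] = 0.0, lab
        heapq.heappush(heap, (0.0, s))
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            new_cost = max(c, w)
            if new_cost < cost[v]:
                cost[v], label[v] = new_cost, label[u]
                heapq.heappush(heap, (new_cost, v))
    return label
```

For example, `propagate_labels([(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)], {0: "a", 5: "b"})` cuts the tree at the long edge between the two groups, which matches the abstract's condition that same-class samples are linked through sequences of nearby samples.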
Correlation table between each pair of training sets,
<p><b>Cod-RNA</b> (a–d). <b>Connect</b> (e–h). <b>Covertype</b> (i–l). <b>IJCNN</b> (m–p). <b>SensIT</b> (q–t).</p>
Example of interactive graph-based image segmentation.
<p>(a) The user draws labeled markers (a training set) inside and outside the object, and segmentation is based on optimum path competition from the markers in an image graph. (b) Image segmentation first relies on a pixel classifier, which is trained from the markers to create a fuzzy object map (the object should appear brighter than the background). (c) Second, the image is interpreted as a graph, whose arc weights should be lower on the border of the object than elsewhere. (d)-(f) The visual feedback from these results guides the user to the image locations where more markers must be selected, improving the fuzzy object map, the arc weights, and hence the segmentation over a few interventions.</p>
SensIT Vehicle (combined).
<p>Comparison of all classifiers against each other with the Nemenyi test under learning-time constraints of 1, 5, 20, 60, 300, and 1200 seconds. Groups of classifiers that are not significantly different (at p = 0.05) are connected.</p>
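As a rough illustration of how such groups are usually derived, the Nemenyi test treats two classifiers as not significantly different when their average ranks differ by less than the critical difference CD = q_alpha * sqrt(k(k+1) / (6N)), for k classifiers compared over N datasets. The sketch below assumes the commonly tabulated q values for alpha = 0.05 and k up to 10; the function names and the example ranks are hypothetical and are not taken from the figure.

```python
import math

# Commonly tabulated critical values for the two-tailed Nemenyi test at
# alpha = 0.05, indexed by the number of classifiers k (assumed table).
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
               7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}

def critical_difference(k, n_datasets):
    """Critical difference in average rank for k classifiers on N datasets."""
    return Q_ALPHA_005[k] * math.sqrt(k * (k + 1) / (6.0 * n_datasets))

def not_significantly_different(avg_ranks, n_datasets):
    """Return pairs of classifiers whose average ranks differ by less than CD.
    avg_ranks maps a classifier name to its average rank."""
    cd = critical_difference(len(avg_ranks), n_datasets)
    names = sorted(avg_ranks, key=avg_ranks.get)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if abs(avg_ranks[a] - avg_ranks[b]) < cd]
```

With made-up average ranks such as `{"A": 1.0, "B": 2.0, "C": 3.5, "D": 3.5}` over 14 datasets, the pairs returned are the ones a critical-difference diagram would connect.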