4 research outputs found

    Robust Multi-class Graph Transduction with higher order regularization

    Graph transduction refers to a family of algorithms that learn from both labeled and unlabeled examples using a weighted graph and scarce label information via regularization or label propagation. A recent empirical study showed that the Robust Multi-class Graph Transduction (RMGT) algorithm achieves state-of-the-art performance on a variety of graph transduction tasks. Although RMGT is parameter-free, it was designed specifically to use the combinatorial Laplacian within its regularization framework. Unfortunately, the combinatorial Laplacian may not be the most appropriate graph Laplacian for all real applications, and recent empirical studies showed that normalized and iterated Laplacians may be better suited than combinatorial Laplacians in general tasks. In this paper, we generalize the RMGT algorithm to any positive semidefinite matrix. We thereby provide a novel graph transduction method that can naturally deal with higher order regularization. To show the effectiveness of our method, we empirically evaluate it against five state-of-the-art graph-based semi-supervised learning algorithms with respect to graph construction and parameter selection on a number of benchmark data sets. Through a detailed experimental analysis using recently proposed empirical evaluation models, we see that our method achieved competitive performance on most data sets. In addition, our method achieved good stability with respect to the graph's parameter for most data sets and graph construction methods, which is a valuable property for real applications. However, the Laplacian's degree value may have a moderate influence on the performance of our method. Funding: CAPES; FAPESP (grant 2012/50714-7); CNPq (grant 446330/2014-0).
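    As a rough illustration of the Laplacians this abstract compares, the sketch below builds a combinatorial or symmetric normalized Laplacian from an affinity matrix and raises it to an integer power (an iterated Laplacian). It is not the RMGT generalization itself. The function names, the dense RBF affinity, and the `sigma` and `degree` parameters are assumptions made for this example.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Dense RBF affinity matrix with a zero diagonal (illustrative helper)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def iterated_laplacian(W, degree=2, normalized=True):
    """Return L**degree for the combinatorial or symmetric normalized Laplacian.

    Any integer power of a symmetric positive semidefinite Laplacian is
    again positive semidefinite, so it can serve as a regularizer matrix.
    """
    d = W.sum(axis=1)
    if normalized:
        # L = I - D^{-1/2} W D^{-1/2}; isolated nodes keep a zero scaling.
        inv_sqrt = np.zeros_like(d)
        inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])
        L = np.eye(len(d)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    else:
        # L = D - W
        L = np.diag(d) - W
    return np.linalg.matrix_power(L, degree)
```

    With `degree=1` this reduces to the ordinary Laplacian; larger degrees correspond to the higher order regularization mentioned in the abstract.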

    Time series transductive classification on imbalanced data sets: an experimental study

    Graph-based semi-supervised learning (SSL) algorithms perform well on a variety of domains, such as digit recognition and text classification, when the data lie on a low-dimensional manifold. However, it is surprising that these methods have not been effectively applied to time series classification tasks. In this paper, we provide a comprehensive empirical comparison of state-of-the-art graph-based SSL algorithms with respect to graph construction and parameter selection. Specifically, we focus on the problem of time series transductive classification on imbalanced data sets. Through a comprehensive analysis using recently proposed empirical evaluation models, we confirm some of the hypotheses raised in previous work and show that some of them may not hold in the time series domain. From our results, we suggest the use of the Gaussian Fields and Harmonic Functions algorithm with the mutual k-nearest neighbors graph weighted by the RBF kernel, setting k = 20, on general tasks of time series transductive classification on imbalanced data sets. Funding: São Paulo Research Foundation (FAPESP) (grants 2011/17698-5 and 2012/50714-7).
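    The recommended configuration can be sketched as follows: a mutual k-nearest neighbors graph weighted by the RBF kernel, followed by the harmonic solution of Gaussian Fields and Harmonic Functions. This is a minimal sketch and not the authors' code. It assumes fixed-length series compared with Euclidean distance; the function names and the `sigma` default are hypothetical.

```python
import numpy as np

def mutual_knn_rbf_graph(X, k=20, sigma=1.0):
    """Mutual k-NN graph with RBF weights (assumes Euclidean distance, k < n)."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)           # a point is never its own neighbor
    nn = np.argsort(sq, axis=1)[:, :k]     # indices of the k nearest neighbors
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    mutual = A & A.T                       # keep an edge only if both ends agree
    return np.where(mutual, np.exp(-sq / (2.0 * sigma ** 2)), 0.0)

def gfhf(W, y_labeled, labeled_idx, n_classes):
    """Harmonic solution F_u = -L_uu^{-1} L_ul Y_l on the unlabeled nodes."""
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L = np.diag(W.sum(axis=1)) - W         # combinatorial Laplacian
    Y_l = np.eye(n_classes)[np.asarray(y_labeled)]
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    # A mutual k-NN graph can leave components without labels, making L_uu
    # singular; least squares then returns a minimum-norm solution instead
    # of raising an error.
    F_u = np.linalg.lstsq(L_uu, -L_ul @ Y_l, rcond=None)[0]
    return unlabeled_idx, F_u.argmax(axis=1)
```

    Nodes in a component with no labeled example receive an all-zero score row, so their predicted class defaults to class 0 and should be treated as undecided.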

    Photography-based taxonomy is inadequate, unnecessary, and potentially harmful for biological sciences

    The question of whether taxonomic descriptions naming new animal species without type specimen(s) deposited in collections should be accepted for publication by scientific journals and allowed by the Code has already been discussed in Zootaxa (Dubois & Nemésio 2007; Donegan 2008, 2009; Nemésio 2009a–b; Dubois 2009; Gentile & Snell 2009; Minelli 2009; Cianferoni & Bartolozzi 2016; Amorim et al. 2016). This question was again raised in a letter supported by 35 signatories published in the journal Nature (Pape et al. 2016) on 15 September 2016. On 25 September 2016, the following rebuttal (strictly limited to 300 words as per the editorial rules of Nature) was submitted to Nature, which on 18 October 2016 refused to publish it. As we think this problem is a very important one for zoological taxonomy, this text is published here exactly as submitted to Nature, followed by the list of the 493 taxonomists and collection-based researchers who signed it in the short time span from 20 September to 6 October 2016.

    Aprendizado semissupervisionado restrito baseado em grafos com regularização de ordem elevada (Constrained graph-based semi-supervised learning with higher order regularization)

    Graph-based semi-supervised learning (SSL) algorithms have been widely studied in the last few years. Most of these algorithms were designed from unconstrained optimization problems using a Laplacian regularizer term as a smoothness functional in an attempt to reflect the intrinsic geometric structure of the data's marginal distribution. Although a number of recent research papers still focus on unconstrained methods for graph-based SSL, a recent statistical analysis showed that many of these algorithms may be unstable on transductive regression. Therefore, we focus on providing new constrained methods for graph-based SSL. We begin by analyzing the regularization framework of existing unconstrained methods. Then, we incorporate two normalization constraints into the optimization problem of three of these methods. We show that the proposed optimization problems have closed-form solutions. By generalizing one of these constraints to any distribution, we provide generalized methods for constrained graph-based SSL. The proposed methods have a more flexible regularization framework than the corresponding unconstrained methods. More precisely, our methods can deal with any graph Laplacian and use higher order regularization, which is effective on general SSL tasks. To show the effectiveness of the proposed methods, we provide comprehensive experimental analyses. Specifically, our experiments are subdivided into two parts. In the first part, we evaluate existing graph-based SSL algorithms on time series data to find their weaknesses. In the second part, we evaluate the proposed constrained methods against six state-of-the-art graph-based SSL algorithms on benchmark data sets. Since the widely used best case analysis may hide useful information concerning the SSL algorithms' performance with respect to parameter selection, we used recently proposed empirical evaluation models to evaluate our results. Our results show that our methods outperform the competing methods on most parameter settings and graph construction methods. However, we found a few experimental settings in which our methods showed poor performance. To facilitate the reproduction of our results, the source codes, data sets, and experimental results are freely available.