7 research outputs found

    On-line PCA with Optimal Regrets

    We carefully investigate the on-line version of PCA, where in each trial a learning algorithm plays a k-dimensional subspace and suffers the compression loss on the next instance when it is projected onto the chosen subspace. In this setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and Exponentiated Gradient (EG). We show that both algorithms are essentially optimal in the worst case. This comes as a surprise, since EG is known to perform sub-optimally when the instances are sparse. This different behavior of EG for PCA is mainly related to the non-negativity of the loss in this case, which makes the PCA setting qualitatively different from other settings studied in the literature. Furthermore, we show that when regret bounds are considered as a function of a loss budget, EG remains optimal and strictly outperforms GD. Next, we study an extension of the PCA setting in which Nature is allowed to play dense instances, which are positive matrices with bounded largest eigenvalue. Again we show that EG is optimal and strictly better than GD in this setting.
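    The abstract contrasts a multiplicative (EG) and an additive (GD) update for on-line PCA. The sketch below is only a minimal illustration, not the paper's algorithm: it assumes the standard density-matrix formulation of on-line PCA (a symmetric parameter matrix W with unit trace and strictly positive eigenvalues, compression loss tr(W x xᵀ) on instance x), and it omits the eigenvalue-capping/projection step needed to recover an actual k-dimensional subspace.

```python
import numpy as np
from scipy.linalg import expm, logm

def matrix_eg_step(W, x, eta):
    """One (unprojected) matrix Exponentiated Gradient step for on-line PCA.

    Assumes W is symmetric positive definite with trace 1; the loss on
    instance x is tr(W x x^T).  The capping of eigenvalues needed to read
    off a k-dimensional subspace is omitted in this sketch.
    """
    G = np.outer(x, x)                   # gradient of the loss tr(W x x^T)
    W_new = expm(logm(W) - eta * G)      # multiplicative (EG) update
    return W_new / np.trace(W_new)       # renormalise to trace 1

def gd_step(W, x, eta):
    """One (unprojected) Gradient Descent step on the same loss."""
    G = np.outer(x, x)
    return W - eta * G                   # projection back onto the feasible set omitted
```

    The point of the contrast is that EG moves multiplicatively in the eigenvalues of W while GD moves additively, which is what drives the different regret behaviors discussed above.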

    Leading strategies in competitive on-line prediction

    We start from a simple asymptotic result for the problem of on-line regression with the quadratic loss function: the class of continuous limited-memory prediction strategies admits a "leading prediction strategy", which not only asymptotically performs at least as well as any continuous limited-memory strategy but also satisfies the property that the excess loss of any continuous limited-memory strategy is determined by how closely it imitates the leading strategy. More specifically, for any class of prediction strategies constituting a reproducing kernel Hilbert space we construct a leading strategy, in the sense that the loss of any prediction strategy whose norm is not too large is determined by how closely it imitates the leading strategy. This result is extended to the loss functions given by Bregman divergences and by strictly proper scoring rules. Comment: 20 pages; a conference version is to appear in the ALT 2006 proceedings.
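    The losses considered extend from the quadratic loss to general Bregman divergences. As a small illustrative helper (not taken from the paper), the generic divergence d_F(x, y) = F(x) - F(y) - <grad F(y), x - y> recovers the squared loss when F(u) = ||u||^2:

```python
import numpy as np

def bregman_divergence(F, grad_F, x, y):
    """Bregman divergence d_F(x, y) = F(x) - F(y) - <grad F(y), x - y>.

    Purely illustrative helper; with F(u) = ||u||^2 it reduces to the
    squared (quadratic) loss ||x - y||^2 used in the paper's opening result.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return F(x) - F(y) - np.dot(grad_F(y), x - y)

# Example: the quadratic loss as a Bregman divergence
sq = lambda u: float(np.dot(u, u))
grad_sq = lambda u: 2.0 * u
print(bregman_divergence(sq, grad_sq, np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # -> 2.0
```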

    Annotation of the modular polyketide synthase and nonribosomal peptide synthetase gene clusters in the genome of Streptomyces tsukubaensis NRRL18488

    The high G+C content and large genome size make the sequencing and assembly of Streptomyces genomes more difficult than for other bacteria. Many pharmaceutically important natural products are synthesized by modular polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). The analysis of such gene clusters is difficult if the genome sequence is not of the highest quality, because clusters can be distributed over several contigs and sequencing errors can introduce apparent frameshifts into the large PKS and NRPS proteins. An additional problem is that the modular nature of the clusters results in the presence of imperfect repeats, which may cause assembly errors. The genome sequence of Streptomyces tsukubaensis NRRL18488 was scanned for potential PKS and NRPS modular clusters. A phylogenetic approach was used to identify multiple contigs belonging to the same cluster. Four PKS clusters and six NRPS clusters were identified. Contigs containing cluster sequences were analyzed in detail using the ClustScan program, which suggested the order and orientation of the contigs. Sequencing of the appropriate PCR products confirmed the ordering and allowed the correction of apparent frameshifts resulting from sequencing errors. The product chemistry of such correctly assembled clusters could also be predicted. The analysis of one PKS cluster showed that it should produce a bafilomycin-like compound, and reverse transcription (RT)-PCR was used to show that the cluster was transcribed. © 2012, American Society for Microbiology. We thank the Government of Slovenia, Ministry of Higher Education, Science and Technology (Slovenian Research Agency [ARRS]), for the award of grants no. J4-9331 and L4-2188 to H.P. We also thank the Ministry of the Economy, the JAPTI Agency, and the European Social Fund (contract no. 102/2008) for the funds awarded for the employment of G.K. This work was also funded by a cooperation grant of the German Academic Exchange Service (DAAD) and the Ministry of Science, Education, and Sports, Republic of Croatia (to J.C. and D.H.), and by grant 09/5 (to D.H.) from the Croatian Science Foundation. Peer Reviewed.

    7th Drug hypersensitivity meeting: part two

    No abstract available.

    Clustering with Lower Bound on Similarity

    We propose a new method, called SimClus, for clustering with a lower bound on similarity. Instead of accepting k, the number of clusters to find, as input, this similarity-based approach imposes a lower bound on the similarity between an object and its corresponding cluster representative (with one representative per cluster). SimClus achieves an O(log n) approximation bound on the number of clusters, whereas for the best previous algorithm the bound can be as poor as O(n). Experiments on real and synthetic datasets show that our algorithm produces more than 40% fewer representative objects, yet offers the same or better clustering quality. We also propose a dynamic variant of the algorithm, which can be used effectively in an on-line setting.
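    A greedy, set-cover-style selection of representatives is the usual route to an O(log n) bound on the number of clusters of this kind. The sketch below illustrates only that idea, under the assumption that every object's self-similarity meets the threshold; SimClus itself, including its dynamic variant, follows the paper rather than this code.

```python
import numpy as np

def greedy_similarity_clustering(sim, tau):
    """Greedy sketch of clustering with a lower bound on similarity.

    sim : (n, n) pairwise similarity matrix; assumes sim[i, i] >= tau
    tau : lower bound every object must satisfy with respect to the
          representative of its cluster
    """
    n = sim.shape[0]
    covers = sim >= tau                      # covers[i, j]: object i can represent object j
    uncovered = np.ones(n, dtype=bool)
    reps, assignment = [], np.full(n, -1)
    while uncovered.any():
        gain = covers[:, uncovered].sum(axis=1)   # uncovered objects each candidate would cover
        r = int(gain.argmax())                    # greedily pick the best candidate
        newly = covers[r] & uncovered
        assignment[newly] = r
        uncovered &= ~newly
        reps.append(r)
    return reps, assignment
```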

    Loss bounds for online category ranking

    Category ranking is the task of ordering labels with respect to their relevance to an input instance. In this paper we describe and analyze several algorithms for online category ranking, where the instances are revealed in a sequential manner. We describe additive and multiplicative updates which constitute the core of the learning algorithms. The updates are derived by casting the processing of each new instance as a constrained optimization problem. We derive loss bounds for the algorithms by using the properties of the dual solution while imposing additional constraints on the dual form. Finally, we outline and analyze the convergence of a general update that can be employed with any Bregman divergence.
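    To make the notion of an additive update concrete, the following perceptron-style sketch nudges one prototype per label whenever a relevant label is not ranked above an irrelevant one. It is only a hedged illustration of the general shape of such updates, not the constrained-optimization updates analyzed in the paper; W, relevant, and eta are names introduced here for the example.

```python
import numpy as np

def additive_ranking_update(W, x, relevant, eta=1.0):
    """One additive (perceptron-style) update for online category ranking.

    W        : (num_labels, dim) weight matrix, one prototype per label
    x        : input instance, shape (dim,)
    relevant : set of indices of labels relevant to x
    """
    scores = W @ x                                # current score of every label
    irrelevant = set(range(W.shape[0])) - set(relevant)
    for r in relevant:
        for i in irrelevant:
            if scores[r] <= scores[i]:            # ranking violation: irrelevant label not below relevant
                W[r] += eta * x                   # push the relevant label's score up
                W[i] -= eta * x                   # push the irrelevant label's score down
    return W
```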