840 research outputs found

    On-line PCA with Optimal Regrets

    Full text link
    We carefully investigate the on-line version of PCA, where in each trial a learning algorithm plays a k-dimensional subspace, and suffers the compression loss on the next instance when projected into the chosen subspace. In this setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and Exponentiated Gradient (EG). We show that both algorithms are essentially optimal in the worst-case. This comes as a surprise, since EG is known to perform sub-optimally when the instances are sparse. This different behavior of EG for PCA is mainly related to the non-negativity of the loss in this case, which makes the PCA setting qualitatively different from other settings studied in the literature. Furthermore, we show that when considering regret bounds as function of a loss budget, EG remains optimal and strictly outperforms GD. Next, we study the extension of the PCA setting, in which the Nature is allowed to play with dense instances, which are positive matrices with bounded largest eigenvalue. Again we can show that EG is optimal and strictly better than GD in this setting

    General E(2)-Equivariant Steerable CNNs

    Get PDF

    Multitask Protein Function Prediction Through Task Dissimilarity

    Get PDF
    Automated protein function prediction is a challenging problem with distinctive features, such as the hierarchical organization of protein functions and the scarcity of annotated proteins for most biological functions. We propose a multitask learning algorithm addressing both issues. Unlike standard multitask algorithms, which use task (protein functions) similarity information as a bias to speed up learning, we show that dissimilarity information enforces separation of rare class labels from frequent class labels, and for this reason is better suited for solving unbalanced protein function prediction problems. We support our claim by showing that a multitask extension of the label propagation algorithm empirically works best when the task relatedness information is represented using a dissimilarity matrix as opposed to a similarity matrix. Moreover, the experimental comparison carried out on three model organism shows that our method has a more stable performance in both "protein-centric" and "function-centric" evaluation settings

    Revisiting the Core Ontology and Problem in Requirements Engineering

    Full text link
    In their seminal paper in the ACM Transactions on Software Engineering and Methodology, Zave and Jackson established a core ontology for Requirements Engineering (RE) and used it to formulate the "requirements problem", thereby defining what it means to successfully complete RE. Given that stakeholders of the system-to-be communicate the information needed to perform RE, we show that Zave and Jackson's ontology is incomplete. It does not cover all types of basic concerns that the stakeholders communicate. These include beliefs, desires, intentions, and attitudes. In response, we propose a core ontology that covers these concerns and is grounded in sound conceptual foundations resting on a foundational ontology. The new core ontology for RE leads to a new formulation of the requirements problem that extends Zave and Jackson's formulation. We thereby establish new standards for what minimum information should be represented in RE languages and new criteria for determining whether RE has been successfully completed.Comment: Appears in the proceedings of the 16th IEEE International Requirements Engineering Conference, 2008 (RE'08). Best paper awar

    Bandit Online Optimization Over the Permutahedron

    Full text link
    The permutahedron is the convex polytope with vertex set consisting of the vectors (π(1),,π(n))(\pi(1),\dots, \pi(n)) for all permutations (bijections) π\pi over {1,,n}\{1,\dots, n\}. We study a bandit game in which, at each step tt, an adversary chooses a hidden weight weight vector sts_t, a player chooses a vertex πt\pi_t of the permutahedron and suffers an observed loss of i=1nπ(i)st(i)\sum_{i=1}^n \pi(i) s_t(i). A previous algorithm CombBand of Cesa-Bianchi et al (2009) guarantees a regret of O(nTlogn)O(n\sqrt{T \log n}) for a time horizon of TT. Unfortunately, CombBand requires at each step an nn-by-nn matrix permanent approximation to within improved accuracy as TT grows, resulting in a total running time that is super linear in TT, making it impractical for large time horizons. We provide an algorithm of regret O(n3/2T)O(n^{3/2}\sqrt{T}) with total time complexity O(n3T)O(n^3T). The ideas are a combination of CombBand and a recent algorithm by Ailon (2013) for online optimization over the permutahedron in the full information setting. The technical core is a bound on the variance of the Plackett-Luce noisy sorting process's "pseudo loss". The bound is obtained by establishing positive semi-definiteness of a family of 3-by-3 matrices generated from rational functions of exponentials of 3 parameters

    Correlation Clustering with Adaptive Similarity Queries

    Get PDF
    In correlation clustering, we are givennobjects together with a binary similarityscore between each pair of them. The goal is to partition the objects into clustersso to minimise the disagreements with the scores. In this work we investigatecorrelation clustering as an active learning problem: each similarity score can belearned by making a query, and the goal is to minimise both the disagreementsand the total number of queries. On the one hand, we describe simple activelearning algorithms, which provably achieve an almost optimal trade-off whilegiving cluster recovery guarantees, and we test them on different datasets. On theother hand, we prove information-theoretical bounds on the number of queriesnecessary to guarantee a prescribed disagreement bound. These results give a richcharacterization of the trade-off between queries and clustering error

    China’s Emergence in the World Economy and Business Cycles in Latin America

    Get PDF
    The international business cycle is very important for Latin America's economic performance as the recent global crisis vividly illustrated. This paper investigates how changes in trade linkages between China, Latin America, and the rest of the world have altered the transmission mechanism of international business cycles to Latin America. Evidence based on a Global Vector Autoregressive (GVAR) model for 5 large Latin American economies and all major advanced and emerging economies of the world shows that the long-term impact of a China GDP shock on the typical Latin American economy has increased by three times since mid-1990s. At the same time, the long-term impact of a US GDP shock has halved, while the transmission of shocks to Latin America and the rest of emerging Asia (excluding China and India) GDP has not undergone any significant change. Contrary to common wisdom, we find that these changes owe more to the changed impact of China on Latin America's traditional and largest trading partners than to increased direct bilateral trade linkages boosted by the decade-long commodity price boom. These findings help to explain why Latin America did so well during the global crisis, but point to the risks associated with a deceleration in China's economic growth in the future for both Latin America and the rest of the world economy. The evidence reported also suggests that the emergence of China as an important source of world growth might be the driver of the so called "decoupling" of emerging markets business cycle from that of advanced economies reported in the existing literature.Latin Americ
    corecore