On-line PCA with Optimal Regrets
We carefully investigate the on-line version of PCA, where in each trial a
learning algorithm plays a k-dimensional subspace, and suffers the compression
loss on the next instance when projected into the chosen subspace. In this
setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and
Exponentiated Gradient (EG). We show that both algorithms are essentially
optimal in the worst-case. This comes as a surprise, since EG is known to
perform sub-optimally when the instances are sparse. This different behavior of
EG for PCA is mainly related to the non-negativity of the loss in this case,
which makes the PCA setting qualitatively different from other settings studied
in the literature. Furthermore, we show that when considering regret bounds as
a function of a loss budget, EG remains optimal and strictly outperforms GD.
Next, we study an extension of the PCA setting in which Nature is allowed
to play dense instances, i.e., positive matrices with bounded largest
eigenvalue. Again we show that EG is optimal and strictly better than GD in
this setting.
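To make the EG side concrete, here is a minimal sketch of the matrix Exponentiated Gradient step that on-line PCA algorithms of this kind build on, assuming the learner maintains a trace-one density matrix and each instance is a unit-norm vector; the rounding of the density matrix to an actual k-dimensional subspace is omitted, and the names and learning-rate handling are illustrative rather than taken from the paper.

```python
import numpy as np

def matrix_eg_step(W, x, eta):
    """One matrix Exponentiated Gradient update for on-line PCA (sketch).

    W   : symmetric PSD density matrix with trace 1 (the learner's state)
    x   : next instance, a unit-norm vector; the compression-loss gradient
          for this instance is the rank-one matrix x x^T
    eta : learning rate
    Implements W <- exp(log W - eta * x x^T) / Z via eigendecompositions.
    """
    grad = np.outer(x, x)
    vals, vecs = np.linalg.eigh(W)
    # Clip eigenvalues away from zero so the matrix log is well defined.
    log_W = vecs @ np.diag(np.log(np.maximum(vals, 1e-12))) @ vecs.T
    M = log_W - eta * grad
    mvals, mvecs = np.linalg.eigh(M)
    expM = mvecs @ np.diag(np.exp(mvals - mvals.max())) @ mvecs.T  # shift for stability
    return expM / np.trace(expM)  # renormalize to trace 1
```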
Multitask Protein Function Prediction Through Task Dissimilarity
Automated protein function prediction is a challenging problem with distinctive features, such as the hierarchical organization of protein functions and the scarcity of annotated proteins for most biological functions. We propose a multitask learning algorithm addressing both issues. Unlike standard multitask algorithms, which use similarity information among tasks (protein functions) as a bias to speed up learning, we show that dissimilarity information enforces separation of rare class labels from frequent class labels, and for this reason is better suited for solving unbalanced protein function prediction problems. We support our claim by showing that a multitask extension of the label propagation algorithm empirically works best when the task relatedness information is represented using a dissimilarity matrix as opposed to a similarity matrix. Moreover, an experimental comparison carried out on three model organisms shows that our method has more stable performance in both "protein-centric" and "function-centric" evaluation settings.
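As an illustration of how task relatedness can enter label propagation, the hypothetical sketch below couples per-function propagation through a task matrix B; the abstract's point is that encoding B from dissimilarities rather than similarities works better on unbalanced problems. The function name, parameters, and update rule are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def multitask_label_propagation(S, Y, B, alpha=0.9, iters=100):
    """Hypothetical multitask label propagation (sketch, not the paper's method).

    S : (n, n) row-normalized protein-protein similarity matrix
    Y : (n, T) initial labels, one column per protein function (task)
    B : (T, T) task-relatedness matrix mixing information across tasks;
        per the abstract, a dissimilarity-based B separates rare labels
        from frequent ones better than a similarity-based B
    """
    F = Y.astype(float).copy()
    for _ in range(iters):
        # Propagate along the protein graph, mix across tasks, anchor to Y.
        F = alpha * (S @ F @ B) + (1 - alpha) * Y
    return F
```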
Uncertainty and Economic Activity: A Global Perspective
The 2007-2008 global financial crisis and the subsequent anemic recovery have rekindled academic interest in quantifying the impact of uncertainty on macroeconomic dynamics, based on the premise that uncertainty causes economic activity to slow down and contract. In this paper, we study the interrelation between financial market volatility and economic activity, assuming that both variables are driven by the same set of unobserved common factors. We further assume that these common factors affect volatility and economic activity with a time lag of at least a quarter. Under these assumptions, we show analytically that volatility is forward-looking and that the output equation of a typical VAR estimated in the literature is mis-specified, as least squares estimates of this equation are inconsistent. Empirically, we document a statistically significant and economically sizable impact of future output growth on current volatility, and no effect of volatility shocks on business cycles, over and above those driven by the common factors. We interpret this evidence as suggesting that volatility is a symptom rather than a cause of economic instability.
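The identification point can be illustrated with a toy simulation in which a common factor shows up in volatility before it shows up in output growth; this is an illustrative caricature under assumed dynamics, not the paper's model. Volatility then predicts future output growth even though volatility shocks have no causal effect, so a least squares output equation on lagged volatility is biased.

```python
import numpy as np

rng = np.random.default_rng(0)
T, phi = 20000, 0.8

# Common AR(1) factor; it drives volatility now and activity one period later.
f = np.zeros(T)
for t in range(1, T):
    f[t] = phi * f[t - 1] + rng.normal()

vol = f + rng.normal(size=T)                                # loads on current factor
y = np.concatenate(([0.0], f[:-1])) + rng.normal(size=T)    # loads on lagged factor

# Volatility has no causal effect on y here, yet regressing y_t on vol_{t-1}
# (as in a typical VAR output equation) picks up the omitted common factor.
beta = np.cov(vol[:-1], y[1:])[0, 1] / np.var(vol[:-1])
print(f"OLS slope of y_t on vol_(t-1): {beta:.2f}  (true causal effect: 0)")
```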
Correlation Clustering with Adaptive Similarity Queries
In correlation clustering, we are given n objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretic bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error.
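For intuition, here is a sketch of a pivot-style clustering loop that spends a query budget adaptively. It is loosely modeled on query-efficient variants of the classical Pivot/QwikCluster algorithm and is not claimed to be the paper's exact procedure; the query interface and budget handling are assumptions.

```python
import random

def pivot_clustering(objects, query, budget):
    """Correlation clustering with pairwise similarity queries (sketch).

    query(u, v) -> +1 (similar) or -1 (dissimilar); each call consumes
    one unit of the query budget.
    """
    remaining, clusters, used = list(objects), [], 0
    while remaining and used < budget:
        # Pick a random pivot and query it against every remaining object.
        pivot = remaining.pop(random.randrange(len(remaining)))
        cluster, rest = [pivot], []
        for v in remaining:
            if used < budget:
                used += 1
                (cluster if query(pivot, v) > 0 else rest).append(v)
            else:
                rest.append(v)  # budget exhausted: defer to singletons below
        clusters.append(cluster)
        remaining = rest
    clusters.extend([v] for v in remaining)  # leftovers become singletons
    return clusters
```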
China’s Emergence in the World Economy and Business Cycles in Latin America
The international business cycle is very important for Latin America's economic performance, as the recent global crisis vividly illustrated. This paper investigates how changes in trade linkages between China, Latin America, and the rest of the world have altered the transmission mechanism of international business cycles to Latin America. Evidence based on a Global Vector Autoregressive (GVAR) model for five large Latin American economies and all major advanced and emerging economies of the world shows that the long-term impact of a China GDP shock on the typical Latin American economy has tripled since the mid-1990s. At the same time, the long-term impact of a US GDP shock has halved, while the transmission of GDP shocks from the rest of emerging Asia (excluding China and India) to Latin America has not undergone any significant change. Contrary to common wisdom, we find that these changes owe more to the changed impact of China on Latin America's traditional and largest trading partners than to increased direct bilateral trade linkages boosted by the decade-long commodity price boom. These findings help to explain why Latin America did so well during the global crisis, but point to the risks associated with a deceleration in China's economic growth in the future for both Latin America and the rest of the world economy. The evidence reported also suggests that the emergence of China as an important source of world growth might be the driver of the so-called "decoupling" of emerging markets' business cycles from those of advanced economies reported in the existing literature.
Revisiting the Core Ontology and Problem in Requirements Engineering
In their seminal paper in the ACM Transactions on Software Engineering and
Methodology, Zave and Jackson established a core ontology for Requirements
Engineering (RE) and used it to formulate the "requirements problem", thereby
defining what it means to successfully complete RE. Given that stakeholders of
the system-to-be communicate the information needed to perform RE, we show that
Zave and Jackson's ontology is incomplete. It does not cover all types of basic
concerns that the stakeholders communicate. These include beliefs, desires,
intentions, and attitudes. In response, we propose a core ontology that covers
these concerns and is grounded in sound conceptual foundations resting on a
foundational ontology. The new core ontology for RE leads to a new formulation
of the requirements problem that extends Zave and Jackson's formulation. We
thereby establish new standards for what minimum information should be
represented in RE languages and new criteria for determining whether RE has
been successfully completed.

Comment: Appears in the proceedings of the 16th IEEE International
Requirements Engineering Conference, 2008 (RE'08). Best paper award.
Bandit Online Optimization Over the Permutahedron
The permutahedron is the convex polytope with vertex set consisting of the
vectors $(\pi(1), \dots, \pi(n))$ for all permutations (bijections) $\pi$ over
$\{1, \dots, n\}$. We study a bandit game in which, at each step $t$, an
adversary chooses a hidden weight vector $s_t$, a player chooses a vertex
$\pi_t$ of the permutahedron and suffers an observed loss of
$\sum_{i=1}^n \pi_t(i)\, s_t(i)$.
A previous algorithm, CombBand of Cesa-Bianchi et al. (2009), guarantees a
regret of $O(n\sqrt{T \log n})$ for a time horizon of $T$. Unfortunately,
CombBand requires at each step an $n$-by-$n$ matrix permanent approximation to
within improved accuracy as $T$ grows, resulting in a total running time that
is super-linear in $T$, making it impractical for large time horizons.
We provide an algorithm of regret $O(n^{3/2}\sqrt{T})$ with total time
complexity $O(n^3 T)$. The ideas are a combination of CombBand and a recent
algorithm by Ailon (2013) for online optimization over the permutahedron in the
full information setting. The technical core is a bound on the variance of the
Plackett-Luce noisy sorting process's "pseudo loss". The bound is obtained by
establishing positive semi-definiteness of a family of 3-by-3 matrices
generated from rational functions of exponentials of 3 parameters.
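To make the Plackett-Luce step concrete, the sketch below samples a permutation from the Plackett-Luce model and evaluates the linear loss against a weight vector. The variance-bounding and bandit estimation machinery the abstract describes is omitted, and the function names and score parameterization are illustrative assumptions.

```python
import numpy as np

def plackett_luce_sample(theta, rng):
    """Sample a permutation from the Plackett-Luce model with scores theta.

    Positions are filled left to right: among the items not yet placed,
    item i is chosen next with probability proportional to exp(theta[i]).
    """
    remaining = list(range(len(theta)))
    perm = []
    while remaining:
        w = np.exp([theta[i] for i in remaining])
        pick = rng.choice(len(remaining), p=w / w.sum())
        perm.append(remaining.pop(pick))
    return perm  # perm[k] = item placed at (0-based) position k

def linear_loss(perm, s):
    """Loss sum_i pi(i) * s(i), where pi(i) is item i's 1-based position."""
    pos = np.empty(len(perm))
    for k, item in enumerate(perm):
        pos[item] = k + 1
    return float(pos @ s)
```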