14 research outputs found

    Smooth Minimization of Nonsmooth Functions with Parallel Coordinate Descent Methods

    39 pages, 1 algorithm, 3 figures, 2 tables. International audience.
    We study the performance of a family of randomized parallel coordinate descent methods for minimizing the sum of nonsmooth, separable convex functions. The problem class includes as special cases L1-regularized L1 regression and the minimization of the exponential loss (the "AdaBoost problem"). We assume the input data defining the loss function is contained in a sparse m×n matrix A with at most ω nonzeros in each row. Our methods need O(nβ/τ) iterations to find an approximate solution with high probability, where τ is the number of processors and β = 1 + (ω−1)(τ−1)/(n−1) for the fastest variant. The O(·) notation hides dependence on quantities such as the required accuracy, the confidence level, and the distance of the starting iterate from an optimal point. Since β/τ is a decreasing function of τ, the method needs fewer iterations when more processors are used. Certain variants of our algorithms perform on average only O(nnz(A)/n) arithmetic operations per processor during a single iteration and, because β decreases when ω does, fewer iterations are needed for sparser problems.
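
    The complexity claim can be made concrete with a small numerical sketch. The snippet below is a minimal illustration, not the authors' implementation; the function names and the example values of n and ω are hypothetical. It evaluates β = 1 + (ω−1)(τ−1)/(n−1) and the leading term nβ/τ of the iteration bound for increasing τ, showing that adding processors reduces the number of iterations needed.

    ```python
    # Minimal sketch (not the paper's code) of the iteration-count formula
    # from the abstract: beta = 1 + (omega - 1)(tau - 1)/(n - 1) and the
    # leading term n * beta / tau of the O(n*beta/tau) bound. Constants
    # hidden by the O(.) notation are ignored here.

    def beta(n, omega, tau):
        """Degree-of-partial-separability factor for the fastest variant."""
        return 1 + (omega - 1) * (tau - 1) / (n - 1)

    def iteration_bound(n, omega, tau):
        """Leading term n * beta / tau of the iteration complexity."""
        return n * beta(n, omega, tau) / tau

    if __name__ == "__main__":
        n, omega = 10**6, 10  # hypothetical: n columns, at most omega nonzeros per row of A
        for tau in (1, 4, 16, 64, 256):
            print(f"tau={tau:4d}  beta={beta(n, omega, tau):7.4f}  "
                  f"n*beta/tau={iteration_bound(n, omega, tau):12.1f}")
    ```

    Because β grows only mildly with τ when ω is small relative to n, β/τ shrinks as τ grows, which is the sense in which the method benefits from more processors and from sparser data.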