262,064 research outputs found

    The computational complexity of ReLU network training parameterized by data dimensionality

    Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension d of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter d and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including ℓp-loss for all p ∈ [0, ∞]. In particular, we improve a known polynomial-time algorithm for constant d and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.
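    For reference, the following is a minimal NumPy sketch of the object being analyzed: a two-layer (one hidden layer) ReLU network and an ℓp training loss. The network width k = 2, the toy data, and all variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def two_layer_relu(X, W, a):
    """Two-layer network: x -> a . relu(W x), with k hidden ReLU units."""
    return np.maximum(X @ W.T, 0.0) @ a          # shape (n,)

def lp_loss(pred, y, p=2.0):
    """ell_p training loss over the n data points (shown here for p >= 1)."""
    return float(np.sum(np.abs(pred - y) ** p))

rng = np.random.default_rng(0)
n, d, k = 10, 3, 2                               # n points in R^d, k hidden units
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W = rng.normal(size=(k, d))                      # hidden-layer weights (one row per unit)
a = rng.normal(size=k)                           # output weights

print(lp_loss(two_layer_relu(X, W, a), y, p=2.0))
```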

    Effects of the Generation Size and Overlap on Throughput and Complexity in Randomized Linear Network Coding

    To reduce computational complexity and delay in randomized network coded content distribution, and for some other practical reasons, coding is not performed simultaneously over all content blocks, but over much smaller, possibly overlapping subsets of these blocks, known as generations. A penalty of this strategy is throughput reduction. To analyze the throughput loss, we model coding over generations with random generation scheduling as a coupon collector's brotherhood problem. This model enables us to derive the expected number of coded packets needed for successful decoding of the entire content, as well as the probability of decoding failure (the latter only when generations do not overlap), and further to quantify the tradeoff between computational complexity and throughput. Interestingly, with a moderate increase in the generation size, throughput quickly approaches link capacity. Overlaps between generations can further improve throughput substantially for relatively small generation sizes. Comment: To appear in IEEE Transactions on Information Theory, Special Issue: Facets of Coding Theory: From Algorithms to Networks, Feb 201
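    To make the coupon-collector view concrete, here is a toy Monte Carlo sketch of random generation scheduling. It assumes non-overlapping generations and that every packet received for a generation is innovative (so a generation of size g decodes after g of its packets arrive); field-size effects and overlaps are deliberately ignored, and all parameter values are illustrative.

```python
import random

def expected_packets(num_generations, generation_size, trials=500, seed=0):
    """Monte Carlo estimate of the number of coded packets needed until every
    generation has received generation_size packets, when each arriving packet
    belongs to a uniformly random generation (coupon collector's brotherhood)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = [0] * num_generations
        packets = 0
        while min(counts) < generation_size:
            counts[rng.randrange(num_generations)] += 1
            packets += 1
        total += packets
    return total / trials

# Relative overhead (received packets / minimum possible) shrinks as generations grow,
# mirroring the observation that throughput approaches capacity for moderate generation sizes.
for g in (16, 32, 64):
    est = expected_packets(num_generations=16, generation_size=g)
    print(g, round(est), round(est / (16 * g), 3))
```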

    Scalable Coordinated Beamforming for Dense Wireless Cooperative Networks

    To meet the ever-growing demand for both high throughput and uniform coverage in future wireless networks, dense network deployment will be ubiquitous, for which cooperation among the access points is critical. Given the computational complexity of designing coordinated beamformers for dense networks, low-complexity, suboptimal precoding strategies are often adopted; however, it is unclear how much performance is lost. To enable optimal coordinated beamforming, in this paper we propose a framework for designing a scalable beamforming algorithm based on the alternating direction method of multipliers (ADMM). Specifically, we first apply the matrix stuffing technique to transform the original optimization problem into an equivalent ADMM-compliant problem, which is much more efficient than the widely used modeling framework CVX. We then use the ADMM algorithm, a.k.a. the operator splitting method, to solve the transformed ADMM-compliant problem efficiently. In particular, the subproblems of the ADMM algorithm at each iteration can be solved in closed form and in parallel. Simulation results show that the proposed techniques yield significant computational savings compared to state-of-the-art interior-point solvers, and that optimal coordinated beamforming can significantly improve system performance compared to suboptimal zero-forcing beamforming.
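    The sketch below illustrates the kind of ADMM / operator-splitting iteration with closed-form subproblems that the abstract refers to, applied to a generic norm-ball-constrained least-squares problem rather than the paper's actual beamforming formulation; the splitting, penalty parameter rho, and problem data are all illustrative assumptions.

```python
import numpy as np

def admm_ball_constrained_ls(A, b, radius, rho=1.0, iters=200):
    """Scaled-form ADMM for  min ||Ax - b||^2  s.t.  ||x|| <= radius,
    split as f(x) = ||Ax - b||^2 and g(z) = indicator of the norm ball,
    with the consensus constraint x = z. Both updates are closed-form."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    # x-update system: (A^T A + (rho/2) I) x = A^T b + (rho/2)(z - u).
    M = A.T @ A + (rho / 2.0) * np.eye(n)
    for _ in range(iters):
        x = np.linalg.solve(M, A.T @ b + (rho / 2.0) * (z - u))  # closed-form least-squares step
        v = x + u
        z = v * min(1.0, radius / (np.linalg.norm(v) + 1e-12))   # closed-form projection onto the ball
        u = u + x - z                                            # scaled dual update
    return z

rng = np.random.default_rng(1)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
x_hat = admm_ball_constrained_ls(A, b, radius=0.5)
print(np.linalg.norm(x_hat))  # stays within the radius-0.5 ball
```

    The relevant design point is that each iteration reduces to a cached linear solve plus a cheap projection; a per-user or per-antenna decomposition with the same structure is what would make such updates parallelizable in the beamforming setting.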

    Recovery Guarantees for One-hidden-layer Neural Networks

    In this paper, we consider regression problems with one-hidden-layer neural networks (1NNs). We distill some properties of activation functions that lead to local strong convexity in the neighborhood of the ground-truth parameters for the 1NN squared-loss objective. Most popular nonlinear activation functions satisfy the distilled properties, including rectified linear units (ReLUs), leaky ReLUs, squared ReLUs and sigmoids. For activation functions that are also smooth, we show local linear convergence guarantees of gradient descent under a resampling rule. For homogeneous activations, we show tensor methods are able to initialize the parameters to fall into the local strong convexity region. As a result, tensor initialization followed by gradient descent is guaranteed to recover the ground truth with sample complexity d · log(1/ϵ) · poly(k, λ) and computational complexity n · d · poly(k, λ) for smooth homogeneous activations with high probability, where d is the dimension of the input, k (k ≤ d) is the number of hidden nodes, λ is a conditioning property of the ground-truth parameter matrix between the input layer and the hidden layer, ϵ is the targeted precision and n is the number of samples. To the best of our knowledge, this is the first work that provides recovery guarantees for 1NNs with both sample complexity and computational complexity linear in the input dimension and logarithmic in the precision. Comment: ICML 201
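    As a rough illustration of the 1NN squared-loss objective and the gradient-descent phase of such a recovery procedure, here is a NumPy sketch with ReLU activations and fixed unit output weights. The random initialization near the ground truth is a stand-in assumption for the paper's tensor-method initialization, and the resampling rule is not modeled; all dimensions and step sizes are illustrative.

```python
import numpy as np

def onn_forward(X, W, v):
    """One-hidden-layer network: f(x) = sum_i v_i * relu(w_i . x)."""
    return np.maximum(X @ W.T, 0.0) @ v

def squared_loss_grad(X, y, W, v):
    """Gradient of 0.5 * sum_n (f(x_n) - y_n)^2 with respect to W (shape (k, d))."""
    H = np.maximum(X @ W.T, 0.0)            # (n, k) hidden activations
    r = H @ v - y                           # (n,) residuals
    G = (X @ W.T > 0).astype(float)         # ReLU derivative indicators
    return (G * np.outer(r, v)).T @ X       # row i: sum_n r_n * v_i * 1[w_i.x_n > 0] * x_n

rng = np.random.default_rng(2)
d, k, n = 8, 3, 2000
W_true = rng.normal(size=(k, d))            # ground-truth hidden-layer weights
v = np.ones(k)                              # fixed output weights
X = rng.normal(size=(n, d))
y = onn_forward(X, W_true, v)               # noiseless labels from the ground truth

W = W_true + 0.1 * rng.normal(size=(k, d))  # stand-in for the tensor-method initialization
for _ in range(500):
    W -= 0.05 * squared_loss_grad(X, y, W, v) / n
print(np.linalg.norm(W - W_true))           # parameter error should shrink from its initial value
```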