1,014 research outputs found
Learning detectors quickly using structured covariance matrices
The computer vision community is increasingly interested in the rapid
estimation of object detectors. Canonical hard negative mining strategies are
slow because they require multiple passes over the large negative training set.
Recent work has demonstrated that if the distribution of negative examples is
assumed to be stationary, then Linear Discriminant Analysis (LDA) can learn
comparable detectors without ever revisiting the negative set. Even with this
insight, however, learning a single object detector can still take tens of
seconds on a modern desktop computer. This paper proposes to leverage the
resulting structured covariance matrix to obtain detectors with identical
performance in orders of magnitude less time and memory. We also elucidate an
important connection to the correlation filter literature, demonstrating that
correlation filters, too, can be trained without ever revisiting the negative
set.
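As a minimal sketch of the stationary-LDA idea the abstract builds on (not the paper's structured-covariance speedups), the negative statistics below are estimated once and then reused to train each new detector from its positives alone; the feature dimension, regularizer, and synthetic data are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
dim = 512                                  # stand-in feature dimension (e.g. flattened HOG)
neg = rng.standard_normal((50_000, dim))   # stand-in for features of generic negative windows

# Estimated ONCE from generic images, then shared by every detector.
mu_neg = neg.mean(axis=0)
sigma = np.cov(neg, rowvar=False) + 1e-2 * np.eye(dim)  # regularized covariance
factor = cho_factor(sigma)                 # factorize once, reuse per detector

def lda_detector(pos_feats):
    """Train a linear detector from positives alone: w = Sigma^{-1}(mu_pos - mu_neg)."""
    return cho_solve(factor, pos_feats.mean(axis=0) - mu_neg)

w = lda_detector(rng.standard_normal((200, dim)) + 0.5)  # toy positive features
scores = neg[:5] @ w                       # window scoring reduces to a dot product
```

The per-detector cost is just a mean and a triangular solve; the paper goes further by exploiting the Toeplitz structure of the stationary covariance, which the dense factorization above deliberately ignores for clarity.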
Channel Covariance Matrix Estimation via Dimension Reduction for Hybrid MIMO MmWave Communication Systems
Hybrid massive MIMO structures, with their lower hardware complexity and power
consumption, have been considered a potential candidate for millimeter wave
(mmWave) communications. Channel covariance information can be used for
designing transmitter precoders, receiver combiners, channel estimators, etc.
However, hybrid structures allow only a lower-dimensional signal to be
observed, which complicates channel covariance matrix estimation. In this
paper, we formulate channel covariance estimation as a structured low-rank
matrix sensing problem via Kronecker product expansion and use a
low-complexity algorithm to solve it. Numerical results with uniform linear
arrays (ULA) and uniform squared planar arrays (USPA) are provided to
demonstrate the effectiveness of the proposed method.
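The paper's actual algorithm is a low-rank matrix-sensing solver; as a simpler stand-in that shows the same observation model, the sketch below recovers a Hermitian Toeplitz ULA covariance from reduced-dimension hybrid observations by plain least squares over the Toeplitz parameters. The array sizes, path gains, and random combiner are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
M, L, N = 16, 6, 5000        # antennas, RF chains (L < M), snapshots

# Ground-truth ULA channel covariance: a few plane-wave paths (illustrative).
ang = np.deg2rad([-20.0, 10.0, 45.0])
A = np.exp(1j * np.pi * np.outer(np.arange(M), np.sin(ang)))
R = (A * np.array([1.0, 0.6, 0.3])) @ A.conj().T + 0.01 * np.eye(M)

# Hybrid front end: only y = W^H h is observed, so S ~= W^H R W is all we can sample.
W = (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))) / np.sqrt(2 * M)
Lc = np.linalg.cholesky(R)
h = Lc @ (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
y = W.conj().T @ h
S = y @ y.conj().T / N

# Hermitian Toeplitz basis: one symmetric and one antisymmetric generator per lag.
basis = [np.eye(M)]
for k in range(1, M):
    E = np.eye(M, k=k)
    basis += [E + E.T, 1j * (E - E.T)]

# Least squares on vec(S) = sum_j theta_j vec(W^H B_j W), solved in real arithmetic.
cols = np.stack([(W.conj().T @ B @ W).ravel() for B in basis], axis=1)
Areal = np.vstack([cols.real, cols.imag])
breal = np.concatenate([S.ravel().real, S.ravel().imag])
theta, *_ = np.linalg.lstsq(Areal, breal, rcond=None)
R_hat = sum(t * B for t, B in zip(theta, basis))
print(np.linalg.norm(R_hat - R) / np.linalg.norm(R))  # relative estimation error
```

The structural point carries over: although only an L x L covariance is observable, the M x M covariance has far fewer free parameters than entries (2M - 1 here), which is what makes recovery from the compressed observations possible at all.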
Applications and accuracy of the parallel diagonal dominant algorithm
The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First, the PDD algorithm is introduced; it is then extended to solve periodic tridiagonal systems, and a variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems: the symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error and that the algorithm is a good candidate for emerging massively parallel machines.
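A serial sketch of the PDD idea, under stated assumptions (block size divides n; the per-block loops below would run concurrently in a real parallel implementation): each block solves its own decoupled tridiagonal system plus two "spike" systems, and the reduced interface system is approximated by independent 2x2 problems, which is exactly the truncation that relies on diagonal dominance:

```python
import numpy as np

def thomas(a, b, c, d):
    """Serial tridiagonal solve; a[i] multiplies x[i-1], c[i] multiplies x[i+1]."""
    n = len(b)
    cp, dp, x = np.empty(n), np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        den = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / den
        dp[i] = (d[i] - a[i] * dp[i - 1]) / den
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def pdd_solve(a, b, c, d, p):
    """Approximate PDD solve of a diagonally dominant tridiagonal system, p blocks."""
    n = len(b); m = n // p
    xt, v, w = np.empty(n), np.zeros(n), np.zeros(n)
    for j in range(p):                       # independent per-block work
        s, e = j * m, (j + 1) * m
        aj, cj = a[s:e].copy(), c[s:e].copy()
        aj[0] = cj[-1] = 0.0                 # decouple the block from its neighbours
        xt[s:e] = thomas(aj, b[s:e], cj, d[s:e])
        if j < p - 1:                        # right spike from coupling element c[e-1]
            r = np.zeros(m); r[-1] = c[e - 1]
            v[s:e] = thomas(aj, b[s:e], cj, r)
        if j > 0:                            # left spike from coupling element a[s]
            r = np.zeros(m); r[0] = a[s]
            w[s:e] = thomas(aj, b[s:e], cj, r)
    x = xt.copy()
    for j in range(p - 1):                   # PDD: one independent 2x2 system per interface
        k = (j + 1) * m
        det = 1.0 - w[k] * v[k - 1]          # cross-interface terms are dropped here
        y0 = (xt[k] - w[k] * xt[k - 1]) / det
        y1 = (xt[k - 1] - v[k - 1] * xt[k]) / det
        x[k - m:k] -= v[k - m:k] * y0
        x[k:k + m] -= w[k:k + m] * y1
    return x

n, p = 512, 8
a = -np.ones(n); b = 4.0 * np.ones(n); c = -np.ones(n); d = np.ones(n)
x = pdd_solve(a, b, c, d, p)
res = b * x
res[1:] += a[1:] * x[:-1]
res[:-1] += c[:-1] * x[1:]
print(np.abs(res - d).max())                 # near machine precision for block size 64
```

The dropped interface terms are spike values that decay geometrically across a block, so for a diagonally dominant system with a reasonable block size the approximation error falls below rounding error, matching the abstract's claim of a good relative-error bound.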
A simple parallel prefix algorithm for compact finite-difference schemes
A compact scheme is a discretization scheme that is advantageous for obtaining highly accurate solutions. However, the systems arising from compact schemes are tridiagonal and difficult to solve efficiently on parallel computers. Exploiting their almost-symmetric Toeplitz structure, a parallel algorithm, simple parallel prefix (SPP), is proposed. The SPP algorithm requires less memory than conventional LU decomposition and is highly efficient on parallel machines. It consists of a prefix communication pattern and AXPY operations. Both the computation and the communication can be truncated without degrading the accuracy when the system is diagonally dominant. A formal accuracy study was conducted to provide a simple truncation formula. Experimental results were measured on a MasPar MP-1 SIMD machine and on a Cray 2 vector machine, and they show that the simple parallel prefix algorithm is a good algorithm for compact schemes on high-performance computers.
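The full SPP algorithm is not reproduced here, but the property that justifies its truncation is easy to demonstrate: for a diagonally dominant Toeplitz tridiagonal system, the LU (Thomas) elimination coefficients converge geometrically to a constant, so only a few leading terms of the prefix computation matter. The (1/4, 1, 1/4) coefficients below are the classic fourth-order compact (Pade) scheme:

```python
import numpy as np

# Fourth-order compact (Pade) scheme: alpha f'_{i-1} + f'_i + alpha f'_{i+1} = rhs_i.
alpha = 0.25

# Thomas forward elimination on the Toeplitz system (alpha, 1, alpha) produces
# multipliers cp[i] = alpha / (1 - alpha * cp[i-1]); they converge geometrically,
# which is what lets SPP truncate both computation and communication.
cp = alpha
for i in range(1, 31):
    cp_next = alpha / (1.0 - alpha * cp)
    if i % 5 == 0:
        print(i, abs(cp_next - cp))          # change shrinks geometrically with i
    cp = cp_next

# Fixed point of cp = alpha / (1 - alpha * cp), taking the stable (smaller) root.
fixed_point = (1.0 - np.sqrt(1.0 - 4.0 * alpha**2)) / (2.0 * alpha)
print(cp, fixed_point)                       # the iterate has reached the fixed point
```

Once the multipliers have converged to the fixed point, all remaining rows of the factorization are identical, so the per-row prefix terms beyond the truncation point contribute nothing at working precision.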
Architectures for block Toeplitz systems
In this paper, efficient VLSI architectures of highly concurrent algorithms for the solution of block linear systems with Toeplitz or near-to-Toeplitz entries are presented. The main features of the proposed scheme are the use of scalar-only operations (multiplications/divisions and additions) and local communication, which enables the development of a wavefront array architecture. Both the mean squared error and the total squared error formulations are described, and a variety of implementations are given.
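The paper targets hardware, but the flavor of recursion such arrays pipeline can be shown in software. Below is the standard scalar Levinson-Durbin recursion for the mean-squared-error (linear prediction) formulation, offered as a hedged stand-in for the block algorithms the architectures implement; note it uses only scalar multiplications/divisions and additions with local data flow:

```python
import numpy as np
from scipy.linalg import toeplitz

def levinson_durbin(r):
    """Solve toeplitz(r[:n]) w = r[1:n+1] in O(n^2) scalar mul/div/add operations."""
    n = len(r) - 1
    a = np.zeros(n + 1); a[0] = 1.0          # prediction-error filter [1, -w]
    e = r[0]                                 # mean squared prediction error
    for i in range(1, n + 1):
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]  # inner product: the wavefront step
        k = -acc / e                         # reflection coefficient
        a[1:i] += k * a[i - 1:0:-1]          # order update of the filter
        a[i] = k
        e *= 1.0 - k * k                     # error shrinks at every order
    return -a[1:], e

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
r = np.array([x[:len(x) - k] @ x[k:] for k in range(6)]) / len(x)  # autocorrelation
w, e = levinson_durbin(r)
print(np.allclose(w, np.linalg.solve(toeplitz(r[:5]), r[1:6])))     # sanity check
```

Each order update touches only neighboring coefficients and one accumulated inner product, which is why the computation maps naturally onto a locally connected wavefront array.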
Very Large-Scale Singular Value Decomposition Using Tensor Train Networks
We propose new algorithms for singular value decomposition (SVD) of very
large-scale matrices based on a low-rank tensor approximation technique called
the tensor train (TT) format. The proposed algorithms can compute several
dominant singular values and corresponding singular vectors for large-scale
structured matrices given in a TT format. The computational complexity of the
proposed methods scales logarithmically with the matrix size under the
assumption that both the matrix and the singular vectors admit low-rank TT
decompositions. The proposed methods, which are called the alternating least
squares for SVD (ALS-SVD) and modified alternating least squares for SVD
(MALS-SVD), compute the left and right singular vectors approximately through
block TT decompositions. The very large-scale optimization problem is reduced
to sequential small-scale optimization problems, and each core tensor of the
block TT decompositions can be updated by applying any standard optimization
methods. The optimal ranks of the block TT decompositions are determined
adaptively during the iteration process, so that high approximation accuracy
can be achieved. Extensive numerical simulations are conducted for several
types of TT-structured matrices, such as the Hilbert matrix, Toeplitz
matrices, random matrices with prescribed singular values, and tridiagonal
matrices. The simulation results demonstrate the effectiveness of the proposed
methods compared with standard SVD algorithms and with TT-based algorithms
developed for symmetric eigenvalue decomposition.
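The ALS-SVD/MALS-SVD machinery is too long to reproduce here, but the representation it relies on can be sketched: the quantized-TT compression of a Hilbert matrix via sequential truncated SVDs (Oseledets' TT-SVD), showing the low TT ranks that make the logarithmic-complexity claim plausible. The matrix size and truncation tolerance are illustrative choices:

```python
import numpy as np

def tt_svd(t, eps=1e-8):
    """Decompose a d-way tensor into TT cores by sequential truncated SVDs."""
    dims, d = t.shape, t.ndim
    cores, r = [], 1
    C = t.reshape(dims[0], -1)
    for k in range(d - 1):
        C = C.reshape(r * dims[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        rk = max(1, int((s > eps * s[0]).sum()))   # adaptive rank truncation
        cores.append(U[:, :rk].reshape(r, dims[k], rk))
        C = s[:rk, None] * Vt[:rk]
        r = rk
    cores.append(C.reshape(r, dims[-1], 1))
    return cores

d = 8                                       # 256 x 256 Hilbert matrix
n = 2 ** d
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
H = 1.0 / (i + j + 1)

# Quantize: split row and column indices into bits and interleave them, so each
# TT core carries one (row bit, column bit) pair -- the "matrix TT" layout.
T = H.reshape([2] * (2 * d))
perm = [ax for pair in zip(range(d), range(d, 2 * d)) for ax in pair]
T = T.transpose(perm).reshape([4] * d)

cores = tt_svd(T)
print([c.shape for c in cores])             # TT ranks stay small
print(sum(c.size for c in cores), H.size)   # compressed vs dense storage
```

Once a matrix lives in this format, algorithms such as ALS-SVD only ever solve small local problems in one core at a time while the others are held fixed, which is the source of the logarithmic scaling in the matrix size.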