
    The Dual JL Transforms and Superfast Matrix Algorithms

    Full text link
    We call a matrix algorithm superfast (aka running at sublinear cost) if it involves far fewer flops and memory cells than the matrix has entries. Using such algorithms is highly desirable or even imperative in computations for Big Data, which involve immense matrices and are quite typically reduced to solving the linear least squares problem and/or computing a low rank approximation of an input matrix. The known algorithms for these problems are not superfast, but we prove that certain superfast modifications of them output reasonable or even nearly optimal solutions for large input classes. We also propose, analyze, and test a novel superfast algorithm for iterative refinement of any crude but sufficiently close low rank approximation of a matrix. The results of our numerical tests are in good accordance with our formal study. Comment: 36.1 pages, 5 figures, and 1 table. arXiv admin note: text overlap with arXiv:1710.07946, arXiv:1906.0411
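
    As a point of reference for the sketching theme in the title, the following is a minimal illustration (not taken from the paper) of how a JL-type random sketch reduces an overdetermined least squares problem to a much smaller one. Note that forming S @ A below reads every entry of A, so this baseline is not itself superfast; the paper studies modifications that avoid that cost.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    m, n, k = 2000, 50, 400            # tall input; sketch size k is much larger than n

    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    # JL-type sketching matrix (scaled random signs); an illustrative choice,
    # not the "dual" JL transforms studied in the paper.
    S = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)

    x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]   # reduced problem
    x_exact = np.linalg.lstsq(A, b, rcond=None)[0]            # reference solution

    # Ratio of residual norms; close to 1 means the sketched solution is nearly optimal.
    print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
    ```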

    Genetic analysis aimed at the conservation of the Malayan tapir (Tapirus indicus)

    Get PDF
    Associated degree programme: Leading Graduate Program in Primatology and Wildlife Science. Kyoto University, new system, doctoral course, Doctor of Science; degree nos. Kou 24471 and Rihaku 4970; shelf mark 新制||理||1709 (University Library). Division of Biological Sciences, Graduate School of Science, Kyoto University. Examination committee: Prof. Miho Murayama (chief examiner), Prof. Gen'ichi Idani, Prof. Satoshi Hirata. Conferred under Article 4, Paragraph 1 of the Degree Regulations. Doctor of Science, Kyoto University, DGA

    Superfast Refinement of Low Rank Approximation of a Matrix

    Full text link
    Low rank approximation (LRA) of a matrix is a hot subject of modern computations. In applications to Big Data mining and analysis the input matrices are usually so immense that one must apply superfast algorithms, which access only a tiny fraction of the input entries and involve far fewer memory cells and flops than an input matrix has entries. Recently we devised and analyzed some superfast LRA algorithms; in this paper we extend a classical algorithm of iterative refinement of the solution of linear systems of equations to superfast refinement of a crude but reasonably close LRA; we also support our superfast refinement algorithm with some superfast heuristic recipes for a posteriori estimation of the errors of LRA and with superfast back-and-forth transition between any LRA of a matrix and its SVD. Our algorithm of iterative refinement of LRA is the first attempt of this kind and should motivate further effort in that direction, but already our initial tests are in good accordance with our formal study. Comment: 12.5 pages, 1 table and 1 figure
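
    For context, the classical iterative refinement scheme for linear systems that the abstract refers to can be sketched as follows. This is the textbook baseline for A x = b, not the paper's superfast LRA refinement, which adapts the same residual-and-correct idea to low rank approximations.

    ```python
    import numpy as np

    def iterative_refinement(A, b, iters=3):
        """Textbook iterative refinement for A x = b: repeatedly correct the
        current solution by solving for the residual."""
        x = np.linalg.solve(A, b)      # initial (possibly crude) solution
        for _ in range(iters):
            r = b - A @ x              # residual of the current solution
            d = np.linalg.solve(A, r)  # correction computed from the residual
            x = x + d
        return x
    ```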

    Matrix Low Rank Approximation at Sublinear Cost

    Full text link
    A matrix algorithm runs at sublinear cost if the number of arithmetic operations involved is far fewer than the number of entries of the input matrix. Such algorithms are especially crucial for applications in the field of Big Data, where input matrices are so immense that one can store only a fraction of the entire matrix in the memory of modern machines. Typically, such matrices admit Low Rank Approximation (LRA) that can be stored and processed at sublinear cost. Can we compute LRA at sublinear cost? Our counterexample presented in Appendix C shows that no sublinear cost algorithm can compute accurate LRA for arbitrary input. However, for a decade researchers have observed that many sublinear cost algorithms, such as Cross-Approximation (C--A) iterations, routinely compute accurate LRA. We partly resolve this long-known contradiction by proving that: (i) sublinear cost variations of a popular subspace sampling algorithm can compute accurate LRA for a large class of inputs with high probability; (ii) a single two-stage C--A loop computes accurate LRA provided that the input is reasonably close to a low rank matrix and the C--A loop starts with a submatrix that shares the same numerical rank with the input; (iii) for arbitrary Symmetric Positive Semi-Definite (SPSD) input, there exists a deterministic sublinear cost algorithm that outputs a close to optimal LRA in the Chebyshev norm; (iv) for any input, an LRA based on given sets of columns and rows can be computed at sublinear cost, and this approximation is near optimal.
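
    A minimal sketch of the standard CUR construction from given index sets (cf. item (iv)): only the selected rows and columns of the input are read, which is why the cost is sublinear in the number of matrix entries. The index sets below are an illustrative choice, not the paper's selection procedure.

    ```python
    import numpy as np

    def cur_lra(A, I, J):
        """CUR LRA of A from given row indices I and column indices J.
        Only the selected rows and columns of A are accessed."""
        C = A[:, J]                          # selected columns
        R = A[I, :]                          # selected rows
        U = np.linalg.pinv(A[np.ix_(I, J)])  # generator: pseudo-inverse of the core submatrix
        return C, U, R

    # Toy usage on an exactly rank-2 matrix; the chosen indices are illustrative.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((300, 2)) @ rng.standard_normal((2, 200))
    C, U, R = cur_lra(A, I=[10, 120], J=[5, 150])
    print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))  # tiny relative error
    ```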

    A cost-effective blood DNA methylation-based age estimation method in domestic cats, Tsushima leopard cats (Prionailurus bengalensis euptilurus) and Panthera species, using targeted bisulphite sequencing and machine learning models

    Get PDF
    Individual age can be used to design more efficient and suitable management plans in both in situ and ex situ conservation programmes for targeted wildlife species. DNA methylation is a promising marker of epigenetic ageing that can accurately estimate age from small amounts of biological material, which can be collected in a minimally invasive manner. In this study, we sequenced five targeted genetic regions and used 8–23 selected CpG sites to build age estimation models using machine learning methods at only about $3–7 per sample. Blood samples of seven Felidae species were used, ranging from small to large and from domestic to endangered species: domestic cats (Felis catus, 139 samples), Tsushima leopard cats (Prionailurus bengalensis euptilurus, 84 samples) and five Panthera species (96 samples). The models achieved satisfactory accuracy, with the mean absolute error of the most accurate models recorded at 1.966, 1.348 and 1.552 years in domestic cats, Tsushima leopard cats and Panthera spp., respectively. The models developed for domestic cats and Tsushima leopard cats were applicable to individuals regardless of health condition; therefore, they can be applied to samples collected from individuals with diverse characteristics, which is often the case in conservation. We also showed the possibility of developing universal age estimation models for the five Panthera spp. using only two of the five genetic regions. We do not recommend building a common age estimation model for all the target species using our markers, because of the degraded performance of models that included all species.
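
    As a rough illustration of the modelling step described above, the sketch below fits a penalized regression (elastic net, a common choice for methylation clocks) to per-site CpG methylation fractions and reports cross-validated mean absolute error. The data here are synthetic and the model choice is an assumption; the study's actual feature selection and models are not reproduced.

    ```python
    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(0)

    # Hypothetical data: methylation fractions (0..1) at 20 selected CpG sites
    # for 140 individuals with known chronological ages in years.
    n_samples, n_cpg = 140, 20
    X = rng.uniform(0.0, 1.0, size=(n_samples, n_cpg))
    age = np.clip(X[:, :5].sum(axis=1) * 4 + rng.normal(0, 1, n_samples), 0, None)

    model = ElasticNetCV(cv=5, random_state=0)          # one common clock-style model
    pred = cross_val_predict(model, X, age, cv=5)       # out-of-fold age predictions
    print("MAE (years):", mean_absolute_error(age, pred))
    ```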

    CUR Low Rank Approximation of a Matrix at Sublinear Cost

    Full text link
    Low rank approximation of a matrix (hereafter LRA) is a highly important area of Numerical Linear and Multilinear Algebra and of Data Mining and Analysis. One can operate with LRA at sublinear cost, that is, by using far fewer memory cells and flops than an input matrix has entries, but no sublinear cost algorithm can compute accurate LRA of the worst case input matrices, or even of the matrices of the small families specified in our Appendix. Nevertheless we prove that the celebrated Cross-Approximation (C-A) algorithms, and even more primitive sublinear cost algorithms, output quite accurate LRA for a large subclass of the class of all matrices that admit LRA, and in a sense for most of such matrices. Moreover, we accentuate the power of sublinear cost LRA by means of multiplicative pre-processing of an input matrix, and this also reveals a link between C-A algorithms and Randomized and Sketching LRA algorithms. Our tests are in good accordance with our formal study. Comment: 29 pages, 5 figures, 5 tables. arXiv admin note: text overlap with arXiv:1906.0492
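
    The "Randomized and Sketching LRA algorithms" mentioned above can be illustrated by the classical randomized range finder sketched below. It is a standard baseline that reads all of A (and so is not sublinear), shown only to make the multiplicative pre-processing link concrete; it is not the paper's C-A algorithm.

    ```python
    import numpy as np

    def sketched_lra(A, r, oversample=8, rng=None):
        """Classical randomized (sketching) LRA: compress A with a random test
        matrix, orthogonalize, and project. Returns Q, B with A ~= Q @ B."""
        rng = rng or np.random.default_rng(0)
        Omega = rng.standard_normal((A.shape[1], r + oversample))  # random test matrix
        Q, _ = np.linalg.qr(A @ Omega)   # orthonormal basis for an approximate range of A
        B = Q.T @ A                      # small factor; rank of Q @ B is at most r + oversample
        return Q, B
    ```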