Double Data Piling Phenomenon in Heterogeneous Covariance Models
Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Department of Statistics, August 2023. Sungkyu Jung.

In this work, we characterize two data piling phenomena for a high-dimensional binary classification problem with heterogeneous covariance models. Data piling refers to the phenomenon where projections of the training data onto a direction vector have exactly two distinct values, one for each class. This first data piling phenomenon occurs for any data when the dimension p is larger than the sample size n. We show that the second data piling phenomenon, which refers to data piling of independent test data, can occur in an asymptotic context where p grows while n is fixed. We further show that a second maximal data piling direction, which gives an asymptotic maximal distance between the two piles of independent test data, can be obtained by projecting the first maximal data piling direction onto the nullspace of the common leading eigenspace. Based on the second data piling phenomenon, we propose novel linear classification rules which ensure perfect classification of high-dimension low-sample-size data under generalized heterogeneous spiked covariance models.

Chapter 1 Introduction 1
Chapter 2 Heterogeneous Covariance Models 6
Chapter 3 Data Piling of Independent Test Data 10
3.1 One-component Covariance Model 11
3.2 Main Theorem 20
Chapter 4 Estimation of Second Maximal Data Piling Direction 26
Chapter 5 Simulation 33
Chapter 6 Discussion 37
Appendix A Asymptotic Properties of High-dimensional Sample Within-scatter Matrix 42
A.1 Proof of Lemma 3 45
A.2 Proof of Lemma 4 47
Appendix B Technical Details of Main Results 52
B.1 Proof of Theorem 5 52
B.2 Proof of Theorem 6 55
B.3 Proof of Theorem 7 58
B.4 Proof of Theorem 8 59
B.5 Proof of Theorem 9 60
Abstract in Korean 63
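
The first data piling phenomenon described in the abstract (training projections collapsing to exactly two values when p is larger than n) can be illustrated numerically. The following is a minimal sketch, not the thesis's own procedure: the function name `piling_direction`, the simulated Gaussian data, and the choice of a minimum-norm least-squares solution to obtain a piling direction are all assumptions made for illustration.

```python
import numpy as np

def piling_direction(X, y):
    """Return a unit vector w for which X @ w takes one value per class.

    Illustrative helper (hypothetical name). Assumes p > n and generic data,
    so the centered data matrix has rank n - 1.
    """
    Xc = X - X.mean(axis=0)   # center the training data
    t = y - y.mean()          # centered class indicator (two distinct values)
    # Minimum-norm solution of Xc @ w = t; exact when p > n and data are generic.
    w, *_ = np.linalg.lstsq(Xc, t, rcond=None)
    return w / np.linalg.norm(w)

# Simulated two-class data with different covariance scales (assumed setup).
rng = np.random.default_rng(0)
n1, n2, p = 10, 10, 200
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n1, p)),
    rng.normal(0.3, 2.0, size=(n2, p)),
])
y = np.repeat([0.0, 1.0], [n1, n2])

w = piling_direction(X, y)
proj = X @ w

# Within-class spread of the projections and the gap between the two piles.
spread0 = proj[y == 0].max() - proj[y == 0].min()
spread1 = proj[y == 1].max() - proj[y == 1].min()
gap = abs(proj[y == 1].mean() - proj[y == 0].mean())
print(spread0, spread1, gap)
```

Both within-class spreads should be zero up to floating-point error while the gap stays clearly positive. Projecting fresh test points onto the same `w` would generally not pile in this way, which is the distinction between the first and second phenomena that the abstract draws.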