17 research outputs found

    Online multi-modal robust non-negative dictionary learning for visual tracking

    © 2015 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Dictionary learning is a method of acquiring a collection of atoms for subsequent signal representation. Owing to its excellent representation ability, dictionary learning has been widely applied in multimedia and computer vision. However, conventional dictionary learning algorithms fail to deal with multi-modal datasets. In this paper, we propose an online multi-modal robust non-negative dictionary learning (OMRNDL) algorithm to overcome this deficiency. Notably, OMRNDL casts visual tracking as a dictionary learning problem under the particle filter framework and captures the intrinsic knowledge about the target from multiple visual modalities, e.g., pixel intensity and texture information. To this end, OMRNDL adaptively learns an individual dictionary, i.e., template, for each modality from available frames, and then represents new particles over all the learned dictionaries by minimizing the fitting loss of data based on M-estimation. The resultant representation coefficient can be viewed as the common semantic representation of particles across multiple modalities, and can be utilized to track the target. OMRNDL incrementally learns the dictionary and the coefficient of each particle, using multiplicative update rules to guarantee the non-negativity of each. Experimental results on a popular challenging video benchmark validate the effectiveness of OMRNDL for visual tracking both quantitatively and qualitatively.
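The way a multiplicative update keeps coefficients non-negative can be illustrated with a minimal sketch. It is not the OMRNDL algorithm: it assumes a single fixed dictionary and a plain squared-error loss in place of the M-estimation loss, and the names `nonneg_code` and `squared_error` are hypothetical.

```python
def squared_error(D, x, h):
    """Squared reconstruction error ||x - D h||^2 for a dictionary given as rows."""
    return sum(
        (x[i] - sum(D[i][j] * h[j] for j in range(len(h)))) ** 2
        for i in range(len(x))
    )


def nonneg_code(D, x, n_iter=200, eps=1e-12):
    """Code a non-negative sample x over a fixed non-negative dictionary D.

    D is a list of m rows with k entries each (columns are atoms).
    Assumption: squared-error loss, not the robust loss of the abstract.
    """
    m, k = len(D), len(D[0])
    h = [1.0] * k  # strictly positive start
    # Precompute D^T x and D^T D, which do not change across iterations.
    Dtx = [sum(D[i][j] * x[i] for i in range(m)) for j in range(k)]
    DtD = [[sum(D[i][a] * D[i][b] for i in range(m)) for b in range(k)]
           for a in range(k)]
    for _ in range(n_iter):
        DtDh = [sum(DtD[a][b] * h[b] for b in range(k)) for a in range(k)]
        # Each coefficient is scaled by a non-negative ratio, so it can
        # never become negative; no projection step is needed.
        h = [h[a] * Dtx[a] / (DtDh[a] + eps) for a in range(k)]
    return h
```

Because every step multiplies a non-negative coefficient by a non-negative ratio, the constraint holds by construction rather than by clipping.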

    Understanding and Diagnosing Visual Tracking Systems

    Several benchmark datasets for visual tracking research have been proposed in recent years. Despite their usefulness, whether they are sufficient for understanding and diagnosing the strengths and weaknesses of different trackers remains questionable. To address this issue, we propose a framework that breaks a tracker down into five constituent parts, namely, motion model, feature extractor, observation model, model updater, and ensemble post-processor. We then conduct ablative experiments on each component to study how it affects the overall result. Surprisingly, our findings contradict some common beliefs in the visual tracking research community. We find that the feature extractor plays the most important role in a tracker. On the other hand, although the observation model is the focus of many studies, we find that it often brings no significant improvement. Moreover, the motion model and model updater contain many details that could affect the result. Also, the ensemble post-processor can improve the result substantially when the constituent trackers have high diversity. Based on our findings, we put together some very elementary building blocks to give a basic tracker that is competitive in performance with state-of-the-art trackers. We believe our framework can provide a solid baseline when conducting controlled experiments for visual tracking research.

    Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization

    We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include the existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes as the number of iterations approaches infinity, which is a very weak condition. Our method can therefore use a surrogate function that directly approximates the non-smooth objective function. In comparison, all the existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM methods to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over the state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point. Comment: AAAI1
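For readers unfamiliar with majorization-minimization, the sketch below shows the classical scheme on a toy non-smooth problem: minimizing the sum of absolute deviations, whose minimizer is a median. It is not the relaxed MM method of the abstract. Each |r| is majorized by the quadratic r^2 / (2|r_k|) + |r_k| / 2 built at the current residual r_k, and the function name `mm_median` and the clamp `eps` are assumptions of this example.

```python
def mm_median(a, n_iter=100, eps=1e-9):
    """Minimize f(x) = sum_i |x - a_i| with classical MM.

    The surrogate at iterate x_k is a weighted least-squares problem with
    weights 1 / |x_k - a_i|, so each step has a closed-form minimizer.
    Residuals are clamped at eps to avoid division by zero, which means
    the scheme strictly minimizes a slightly smoothed objective.
    """
    def objective(x):
        return sum(abs(x - ai) for ai in a)

    x = sum(a) / len(a)  # start from the mean
    history = [objective(x)]
    for _ in range(n_iter):
        weights = [1.0 / max(abs(x - ai), eps) for ai in a]
        # Minimizer of the quadratic surrogate: a weighted average.
        x = sum(w * ai for w, ai in zip(weights, a)) / sum(weights)
        history.append(objective(x))
    return x, history
```

Calling `mm_median([1.0, 2.0, 3.0, 10.0, 100.0])` drives the iterate from the mean toward the median of the sample, and the returned history records the objective value at each step.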

    Online Learning a High-Quality Dictionary and Classifier Jointly for Multitask Object Tracking

    Adversarial Dictionary Learning Algorithm for Outlier Detection

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical Engineering, August 2020. 박종우. In this thesis, we propose a semi-supervised dictionary learning algorithm that learns representations of only non-outlier data. The presence of outliers in a dataset is a major drawback for dictionary learning, resulting in less than desirable performance in real-world applications. Our adversarial dictionary learning (ADL) algorithm exploits a supervision dataset composed of known outliers. The algorithm penalizes a dictionary that expresses the known outliers well, which makes dictionary learning robust to the outliers present in the dataset. The proposed method can handle highly corrupted datasets that cannot be dealt with effectively using conventional robust dictionary learning algorithms. We empirically show the usefulness of our algorithm with extensive experiments on anomaly detection, using both synthetic univariate time-series data and multivariate point data.
    Contents: 1 Introduction; 1.1 Related Works; 1.2 Contributions of This Thesis; 1.3 Organization; 2 Sparse Representation and Dictionary Learning; 2.1 Sparse Representation; 2.1.1 Problem Definition of Sparse Representation; 2.1.2 Sparse representation with l0-norm regularization; 2.1.3 Sparse representation with l1-norm regularization; 2.1.4 Sparse representation with lp-norm regularization (0 < p < 1); 2.2 Dictionary Learning; 2.2.1 Problem Definition of Dictionary Learning; 2.2.2 Dictionary Learning Methods; 3 Adversarial Dictionary Learning; 3.1 Problem Formulation; 3.2 Adversarial Loss; 3.3 Optimization Algorithm; 4 Experiments; 4.1 Data Description; 4.1.1 Univariate Time-series Data; 4.1.2 Multivariate Point Data; 4.2 Evaluation Process; 4.2.1 A Baseline of Anomaly Detection; 4.2.2 ROC Curve and AUC; 4.3 Experiment Setting; 4.4 Results; 5 Conclusion; Bibliography; Abstract (in Korean)

    Online NMF with Multiple Dictionaries Considering Outliers

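    The real world contains a wide variety of mixed signals, such as environmental sounds. Many of these signals can be represented with non-negative values, and some contain outliers in addition to noise such as Gaussian noise. We focus on the signal of a specific component within such real-world mixed signals and aim at signal analysis that takes the characteristics of that signal into account. Analyzing mixed signals that contain outliers enables noise removal and super-resolution for images, and source separation and automatic music transcription for audio. Beyond these, understanding the structure of the data allows it to be handled from various perspectives, such as entertainment and security. Non-negative matrix factorization (NMF) is one method for analyzing such signals. Any signal that can be represented as a non-negative matrix can be decomposed into two matrices, called the basis matrix and the coefficient matrix, and the basis matrix captures the frequently occurring patterns of the signal. Extensions include models that are robust to noise and online learning models that can handle large-scale data. Prior work has also studied online NMF that accounts for outliers. In this study, we analyze signals in which the characteristics of some component signals are known in advance, such as a reporter's voice during a typhoon broadcast or an instrumental performance amid the cheering at a high school baseball game. Furthermore, making the method learnable online allows it to be applied to large-scale data analysis. Because signals that are added sequentially can be analyzed, the method can also handle diverse mixed signals. In the proposed method, in addition to the conventional method, we prepare data in which the characteristics of a specific signal within the mixture have been learned beforehand, and analyze the mixed signal on that basis. We focus on the advantage that learning with prior knowledge of the target signal's characteristics makes it possible to extract the target signal from various mixed signals. Signal separation experiments with the proposed method on synthetic data, image data, and audio data did not yield good results, but they identified several issues to be examined, such as the constraints on the multiple coefficient and basis matrices and the initial-value settings. The University of Electro-Communications 201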
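The idea of factorizing a mixture while keeping a pre-learned basis fixed can be sketched as follows. This is a batch illustration under a squared-error loss, not the online, outlier-aware method of the abstract; the function `nmf_with_known_basis`, its parameters, and the helper functions are hypothetical.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


def nmf_with_known_basis(V, W_known, n_free, n_iter=300, eps=1e-12, seed=0):
    """Factorize V ~ [W_known | W_free] H with multiplicative updates.

    The columns of W_known (the pre-learned basis) are never modified;
    only the n_free extra columns and the coefficient matrix H are learned.
    """
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    k_known = len(W_known[0])
    k = k_known + n_free
    W = [list(W_known[i]) + [rng.random() + 0.1 for _ in range(n_free)]
         for i in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    errors = [frob_error(V, W, H)]
    for _ in range(n_iter):
        # Coefficient update: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        # Basis update restricted to the free columns:
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        for i in range(m):
            for a in range(k_known, k):
                W[i][a] *= num[i][a] / (den[i][a] + eps)
        errors.append(frob_error(V, W, H))
    return W, H, errors
```

The rows of `H` that correspond to the known columns give the activation of the signal of interest, so the product of the known basis with those rows is one way to read off the extracted component.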