1,511 research outputs found

    Adaptation of CNN Classifiers to Prior Shift

    In many classification tasks, the relative class frequencies (class prior probabilities) in the test set differ from the relative class frequencies at training time. This phenomenon, called label shift or prior shift, can negatively affect the classifier's performance. For a probabilistic classifier approximating posterior probabilities, the predictions can be adapted to the label shift by re-weighting them with the ratio of test-set to training-set priors. Because labels in the test set are usually unknown, the prior ratio has to be estimated in an unsupervised manner. This thesis reviews existing methods for adapting probabilistic classifiers to label shift and for estimating test priors on an unlabeled test set. Moreover, we propose novel algorithms for estimating the new priors and the prior ratio. The methods are designed to handle a known issue of confusion matrix-based methods, where inconsistent estimates of decision probabilities and the confusion matrix lead to negative values in the estimated priors. Experimental evaluation shows that our method improves the stability of prior estimation and the adapted classifier's accuracy compared to the baseline confusion matrix-based methods, and achieves state-of-the-art performance in prior shift adaptation.
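    As a concrete illustration of the re-weighting step described in this abstract, the sketch below assumes a classifier that outputs (approximately calibrated) posterior probabilities and that the test-set priors are known or have already been estimated; function and variable names are illustrative, not taken from the thesis.

    import numpy as np

    def adapt_to_prior_shift(posteriors, train_priors, test_priors):
        """Re-weight posterior estimates p_train(y|x) by the ratio of
        test to training class priors, then renormalize.

        posteriors   : (n_samples, n_classes) array of p_train(y|x)
        train_priors : (n_classes,) class frequencies at training time
        test_priors  : (n_classes,) class frequencies expected at test time
                       (known or estimated in an unsupervised manner)
        """
        ratio = np.asarray(test_priors) / np.asarray(train_priors)
        adapted = posteriors * ratio                    # broadcast over classes
        adapted /= adapted.sum(axis=1, keepdims=True)   # renormalize each row
        return adapted

    # Illustrative usage with made-up numbers
    p = np.array([[0.7, 0.3], [0.4, 0.6]])   # training-time posteriors
    train_pi = np.array([0.5, 0.5])
    test_pi = np.array([0.2, 0.8])           # shifted test-time priors
    print(adapt_to_prior_shift(p, train_pi, test_pi))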

    ICLabel: An automated electroencephalographic independent component classifier, dataset, and website

    The electroencephalogram (EEG) provides a non-invasive, minimally restrictive, and relatively low-cost measure of mesoscale brain dynamics with high temporal resolution. Although signals recorded in parallel by multiple, near-adjacent EEG scalp electrode channels are highly correlated and combine signals from many different sources, both biological and non-biological, independent component analysis (ICA) has been shown to isolate the various source generator processes underlying those recordings. Independent components (ICs) found by ICA decomposition can be manually inspected, selected, and interpreted, but doing so requires both time and practice, as ICs have no particular order or intrinsic interpretation and therefore require further study of their properties. Alternatively, sufficiently accurate automated IC classifiers can be used to classify ICs into broad source categories, speeding the analysis of EEG studies with many subjects and enabling the use of ICA decomposition in near-real-time applications. While many such classifiers have been proposed recently, this work presents the ICLabel project, comprising (1) an IC dataset containing spatiotemporal measures for over 200,000 ICs from more than 6,000 EEG recordings, (2) a website for collecting crowdsourced IC labels and educating EEG researchers and practitioners about IC interpretation, and (3) the automated ICLabel classifier. The classifier improves upon existing methods in two ways: by improving the accuracy of the computed label estimates and by enhancing computational efficiency. The ICLabel classifier outperforms or performs comparably to the previous best publicly available method for all measured IC categories while computing those labels ten times faster, as shown in a rigorous comparison against all other publicly available EEG IC classifiers.
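    The decomposition step this abstract relies on can be illustrated with a small, self-contained sketch; this is not the ICLabel pipeline, just scikit-learn's FastICA applied to synthetic multichannel data standing in for an EEG recording (channel counts and signal shapes are arbitrary).

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)

    # Synthetic stand-in for a multichannel EEG recording:
    # three latent sources linearly mixed into eight "electrode" channels.
    n_samples = 2000
    t = np.linspace(0, 8, n_samples)
    sources = np.c_[np.sin(2 * np.pi * 1.5 * t),           # slow oscillation
                    np.sign(np.sin(2 * np.pi * 5.0 * t)),  # square-wave artifact
                    rng.laplace(size=n_samples)]           # noisy source
    mixing = rng.normal(size=(8, 3))
    recording = sources @ mixing.T                          # shape (n_samples, 8)

    # ICA recovers statistically independent components from the mixture.
    ica = FastICA(n_components=3, random_state=0)
    components = ica.fit_transform(recording)   # estimated ICs, (n_samples, 3)
    unmixing = ica.components_                  # linear map from channels to ICs
    print(components.shape, unmixing.shape)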

    Adapting Classifiers To Changing Class Priors During Deployment

    Conventional classifiers are trained and evaluated using balanced data sets in which all classes are equally present. Classifiers are now trained on large data sets such as ImageNet and are able to classify hundreds (if not thousands) of different classes. On one hand, it is desirable to train such a general-purpose classifier on a very large number of classes so that it performs well regardless of the setting in which it is deployed. On the other hand, it is unlikely that all classes known to the classifier will occur in every deployment scenario, or that they will occur with the same prior probability. In reality, only a relatively small subset of the known classes may be present in a particular setting or environment. For example, a classifier will encounter mostly animals if it is deployed in a zoo or for monitoring wildlife, mostly aircraft and service vehicles at an airport, or various types of automobiles and commercial vehicles if it is used for monitoring traffic. Furthermore, the exact class priors are generally unknown and can vary over time. In this paper, we explore different methods for estimating the class priors based on the output of the classifier itself. We then show that incorporating the estimated class priors into the overall decision scheme enables the classifier to increase its run-time accuracy in the context of its deployment scenario.
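    The abstract does not fix a single estimator, so the sketch below shows one standard unsupervised approach to the problem it describes: the EM-style procedure of Saerens, Latinne and Decaestecker, which alternates between adapting posteriors with the current prior estimate and re-averaging them over the unlabeled test set. Treat it as an illustrative baseline, not necessarily the method explored in the paper.

    import numpy as np

    def em_estimate_test_priors(posteriors, train_priors, n_iter=100, tol=1e-6):
        """Estimate test-set class priors from classifier outputs alone.

        posteriors   : (n_samples, n_classes) p_train(y|x) on unlabeled test data
        train_priors : (n_classes,) class frequencies at training time
        Returns the estimated test-set priors.
        """
        train_priors = np.asarray(train_priors, dtype=float)
        priors = train_priors.copy()
        for _ in range(n_iter):
            # E-step: adapt posteriors to the current prior estimate.
            adapted = posteriors * (priors / train_priors)
            adapted /= adapted.sum(axis=1, keepdims=True)
            # M-step: new priors are the mean adapted posterior per class.
            new_priors = adapted.mean(axis=0)
            if np.max(np.abs(new_priors - priors)) < tol:
                return new_priors
            priors = new_priors
        return priors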

    Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

    Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. The problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections in an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate the proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely the Street View Text, ICDAR 2003, 2011 and 2013, and IIIT 5K-word datasets, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated into our framework to further improve recognition performance.
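    To make the energy-minimization idea concrete, the sketch below solves a chain-structured version of such a model exactly by dynamic programming, with made-up unary costs standing in for character detection scores and pairwise costs standing in for the lexicon/bigram prior; the paper's actual CRF and features are richer than this toy example.

    import numpy as np

    def minimize_chain_energy(unary, pairwise):
        """Exact minimization of E(y) = sum_i unary[i, y_i]
        + sum_i pairwise[y_i, y_{i+1}] on a chain (Viterbi-style DP).

        unary    : (n_positions, n_labels) per-character detection costs
        pairwise : (n_labels, n_labels) bigram transition costs
        Returns the minimizing label sequence and its energy.
        """
        n_pos, n_lab = unary.shape
        cost = unary[0].copy()
        back = np.zeros((n_pos, n_lab), dtype=int)
        for i in range(1, n_pos):
            # total[j, k]: best cost of ending at label k via previous label j
            total = cost[:, None] + pairwise + unary[i][None, :]
            back[i] = total.argmin(axis=0)
            cost = total.min(axis=0)
        labels = [int(cost.argmin())]
        for i in range(n_pos - 1, 0, -1):
            labels.append(int(back[i, labels[-1]]))
        return labels[::-1], float(cost.min())

    # Toy usage: 4 character positions, alphabet of 3 labels, random costs.
    rng = np.random.default_rng(1)
    word, energy = minimize_chain_energy(rng.random((4, 3)), rng.random((3, 3)))
    print(word, energy)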

    Similarity Discriminant Analysis
