
    Hierarchical Subquery Evaluation for Active Learning on a Graph

    To train good supervised and semi-supervised object classifiers, it is critical that we not waste the time of the human experts who are providing the training labels. Existing active learning strategies can have uneven performance, being efficient on some datasets but wasteful on others, or inconsistent even between runs on the same dataset. We propose perplexity-based graph construction and a new hierarchical subquery evaluation algorithm to combat this variability, and to release the potential of Expected Error Reduction. Under some specific circumstances, Expected Error Reduction has been one of the strongest-performing informativeness criteria for active learning. Until now, it has also been prohibitively costly to compute for sizeable datasets. We demonstrate our highly practical algorithm, comparing it to other active learning measures on classification datasets that vary in sparsity, dimensionality, and size. Our algorithm is consistent over multiple runs and achieves high accuracy, while querying the human expert for labels at a frequency that matches their desired time budget. Comment: CVPR 201

    ARTMAP-IC and Medical Diagnosis: Instance Counting and Inconsistent Cases

    For complex database prediction problems such as medical diagnosis, the ARTMAP-IC neural network adds distributed prediction and category instance counting to the basic fuzzy ARTMAP system. For the ARTMAP match tracking algorithm, which controls search following a predictive error, a new version facilitates prediction with sparse or inconsistent data. Compared to the original match tracking algorithm (MT+), the new algorithm (MT-) better approximates the real-time network differential equations and further compresses memory without loss of performance. Simulations examine predictive accuracy on four medical databases: Pima Indian diabetes, breast cancer, heart disease, and gall bladder removal. ARTMAP-IC results are equal to or better than those of logistic regression, K nearest neighbor (KNN), the ADAP perceptron, multisurface pattern separation, CLASSIT, instance-based (IBL), and C4. ARTMAP dynamics are fast, stable, and scalable. A voting strategy improves prediction by training the system several times on different orderings of an input set. Voting, instance counting, and distributed representations combine to form confidence estimates for competing predictions. National Science Foundation (IRI 94-01659); Office of Naval Research (N00014-95-J-0409, N00014-95-0657

    Nearest-neighbor based Manifold Expansion Technique for Active Learning

    The present disclosure describes a nearest-neighbor based manifold expansion technique integrated into an active learner for seeking human review. Initially, the active learner performs a sampling formulation in which an unlabeled dataset, including unlabeled examples, is provided as an input to the active learner. The unlabeled dataset is then divided into seed datasets (i.e. a positive seed dataset and a negative seed dataset) and a test dataset. The positive seed dataset includes positive seeds, the negative seed dataset includes negative seeds and the test dataset includes test examples. In a voting process, each of the positive seeds and the negative seeds votes to the test examples that are in a neighborhood of the positive seed or the negative seed. A ranked list of the test examples is prepared based on an overall score for each test example accumulated by votes. Top-k examples in the ranked list are sent to annotators for review. The annotators assign labels (i.e. positive or negative) to the top-k examples. The annotators can interpret why a particular example got a particular score and how much the positive seeds and the negative seeds contributed to that score. The examples labeled by the annotators are added to the seed datasets. The voting process is executed again based on the updated seed datasets. This way, the voting process is executed continuously, and the ranked list is updated in an incremental manner in real time
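    The voting step described above can be sketched roughly as follows. This is only an illustrative sketch, not the disclosed implementation: the function name `rank_by_seed_votes`, the fixed-radius Euclidean neighborhood, the +1/-1 vote weights, and ranking by highest score are all assumptions made for the example.

```python
import math
from collections import defaultdict


def rank_by_seed_votes(pos_seeds, neg_seeds, test_examples, radius, k):
    """Rank test examples by votes from nearby seeds (hypothetical helper).

    Assumptions (not from the source): a neighborhood is a Euclidean ball of
    the given radius, positive seeds vote +1, and negative seeds vote -1.
    Returns the indices of the top-k test examples and, for each example that
    received votes, the list of (seed_label, seed_index, vote) contributions.
    """
    scores = defaultdict(float)
    contributions = defaultdict(list)
    for label, seeds, vote in (("pos", pos_seeds, 1.0), ("neg", neg_seeds, -1.0)):
        for s_idx, seed in enumerate(seeds):
            for t_idx, example in enumerate(test_examples):
                # A seed votes only for test examples inside its neighborhood.
                if math.dist(seed, example) <= radius:
                    scores[t_idx] += vote
                    contributions[t_idx].append((label, s_idx, vote))
    # Highest accumulated score first; ties keep their original order.
    ranked = sorted(range(len(test_examples)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], dict(contributions)
```

    Keeping the per-seed contributions alongside the score is what would let an annotator see which seeds contributed to a given example's score. After the top-k examples are labeled, they would be appended to the seed lists and the function called again.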

    XSS-FP: Browser Fingerprinting using HTML Parser Quirks

    There are many scenarios in which inferring the type of a client browser is desirable, for instance to fight against session stealing. This is known as browser fingerprinting. This paper presents and evaluates a novel fingerprinting technique to determine the exact nature (browser type and version, e.g. Firefox 15) of a web browser, exploiting HTML parser quirks exercised through XSS. Our experiments show that the exact version of a web browser can be determined with 71% accuracy, and that only 6 tests are sufficient to quickly determine the exact family a web browser belongs to

    Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

    Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when using high-dimensional representations, such as Fisher vectors and convolutional neural network features. We also propose a window refinement method, which improves the localization accuracy by incorporating an objectness prior. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset, which verifies the effectiveness of our approach. Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI

    Extracting News Events from Microblogs

    The Twitter stream has become a large source of information for many people, but the magnitude of tweets and the noisy nature of its content have long made harvesting knowledge from Twitter a challenging task for researchers. Aiming at overcoming some of the main challenges of extracting the hidden information from tweet streams, this work proposes a new approach for real-time detection of news events from the Twitter stream. We divide our approach into three steps. The first step is to use a neural network or deep learning to detect news-relevant tweets from the stream. The second step is to apply a novel streaming data clustering algorithm to the detected news tweets to form news events. The third and final step is to rank the detected events based on the size of the event clusters and growth speed of the tweet frequencies. We evaluate the proposed system on a large, publicly available corpus of annotated news events from Twitter. As part of the evaluation, we compare our approach with a related state-of-the-art solution. Overall, our experiments and user-based evaluation show that our approach to detecting current (real) news events delivers state-of-the-art performance

    Metric-based Few-shot Classification in Remote Sensing Image

    Target recognition based on deep learning relies on a large quantity of samples, but in some specific remote sensing scenes, the samples are very rare. Currently, few-shot learning can obtain high-performance target classification models using only a few samples, but most research is based on natural scenes. Therefore, this paper proposes a metric-based few-shot classification technology in remote sensing. First, we constructed a dataset (RSD-FSC) for few-shot classification in remote sensing, which contains 21 classes of typical target sample slices of remote sensing images. Second, based on metric learning, a k-nearest neighbor classification network is proposed to find multiple training samples similar to the testing target; the similarity between the testing target and these similar samples is then calculated to classify the testing target. Finally, the 5-way 1-shot, 5-way 5-shot and 5-way 10-shot experiments are conducted to improve the generalization of the model on few-shot classification tasks. The experimental results show that for few-shot samples of newly emerged classes, when the number of training samples is 1, 5 and 10, the average accuracy of target recognition can reach 59.134%, 82.553% and 87.796%, respectively. This demonstrates that our proposed method can resolve few-shot classification in remote sensing images and perform better than other few-shot classification methods