75 research outputs found

    A framework for evaluating automatic image annotation algorithms

    Several Automatic Image Annotation (AIA) algorithms have been introduced recently, each reported to outperform previous models. However, each of them has been evaluated using different descriptors, different collections or parts of collections, or "easy" settings. This renders their results non-comparable, and we show that collection-specific properties, rather than the models themselves, are responsible for the high reported performance measures. In this paper we introduce a framework for the evaluation of image annotation models, which we use to evaluate two state-of-the-art AIA algorithms. Our findings reveal that a simple Support Vector Machine (SVM) approach using global MPEG-7 features outperforms state-of-the-art AIA models across several collection settings. These models appear to depend heavily on the set of features and the data used, and it is easy to exploit collection-specific properties, such as tag popularity (especially in the commonly used Corel 5K dataset), and still achieve good performance
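The SVM baseline the abstract describes can be sketched as one binary classifier per tag over global image descriptors. This is a minimal illustration with synthetic stand-in data (the random features and the tag vocabulary are assumptions, not the paper's actual descriptors or dataset):

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import SVC

# Hypothetical stand-in data: each row plays the role of a global descriptor
# (the paper uses global MPEG-7 features); each image carries a set of tags.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
tags = [["sky", "sea"] if i % 2 else ["grass"] for i in range(200)]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)  # binary tag-indicator matrix, one column per tag

# One binary RBF-SVM per tag -- the simple baseline the abstract describes.
clf = OneVsRestClassifier(SVC(kernel="rbf", probability=True))
clf.fit(X, Y)

# Annotate an image with its k most probable tags.
probs = clf.predict_proba(X[:1])[0]
top2 = [mlb.classes_[i] for i in np.argsort(probs)[::-1][:2]]
```

Annotation then reduces to ranking tags by per-tag classifier confidence, which is what makes such a baseline sensitive to tag-popularity effects in skewed collections like Corel 5K.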

    Bi-directional representation learning for multi-label classification

    Multi-label classification is a central problem in many application domains. In this paper, we present a novel supervised bi-directional model that learns a low-dimensional mid-level representation for multi-label classification. Unlike traditional multi-label learning methods which identify intermediate representations from either the input space or the output space but not both, the mid-level representation in our model has two complementary parts that capture intrinsic information of the input data and the output labels respectively under the autoencoder principle while augmenting each other for the target output label prediction. The resulting optimization problem can be solved efficiently using an iterative procedure with alternating steps, while closed-form solutions exist for one major step. Our experiments conducted on a variety of multi-label data sets demonstrate the efficacy of the proposed bi-directional representation learning model for multi-label classification
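The "alternating steps with a closed-form solution for one major step" idea can be sketched with a toy low-rank model: predict the label matrix through a k-dimensional mid-level code, solving one factor in closed form and updating the other by gradient descent. Everything below (data, dimensions, step size) is an illustrative assumption, not the paper's actual objective:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                      # input features
Y = (rng.normal(size=(100, 5)) > 0).astype(float)   # multi-label targets
k, lam, lr = 8, 1e-2, 5e-4                          # code size, ridge weight, step

# Toy alternating scheme: predict Y through a k-dimensional mid-level code,
#   Y ~ (X @ W) @ V
# The V-step has a closed-form ridge solution; W is then updated by a
# gradient step, mirroring "alternating steps, closed form for one step".
W = 0.1 * rng.normal(size=(20, k))
losses = []
for _ in range(300):
    H = X @ W                                                  # mid-level code
    V = np.linalg.solve(H.T @ H + lam * np.eye(k), H.T @ Y)    # closed-form step
    R = H @ V - Y                                              # residual
    losses.append((R ** 2).sum())
    W -= lr * (X.T @ (R @ V.T) + lam * W)                      # gradient step on W
```

The squared-error loss decreases across alternations; the paper's model additionally ties the code to both input and output reconstructions under the autoencoder principle, which this sketch omits.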

    Informedia at TRECVID 2003: Analyzing and searching broadcast news video

    We submitted a number of semantic classifiers, most of which were trained solely on keyframes. We also experimented with runs in which classifiers were trained exclusively on text data and relative time within the video, while a few were trained using all available modalities.
    Interactive search: This year, we submitted two runs using different versions of the Informedia systems. In one run, a version identical to last year's interactive system was used by five researchers, who split the topics between themselves. The system interface emphasizes text queries, allowing search across ASR, closed captions, and OCR text. The result set can then be manipulated through:
    • storyboards of images spanning video story segments
    • emphasis of shots matching a user's query, reducing the image count to a manageable size
    • resolution and layout under user control
    • additional filtering through shot classifiers such as outdoors, shots with people, etc.
    • display of filter counts and distributions to guide their use in manipulating storyboard views.
    In the best-performing interactive run, a single researcher used an improved version of the system for all topics, which allowed more effective browsing and visualization of the results of text queries using

    Bilkent University at TRECVID 2007

    We describe our fourth participation in the TRECVID video retrieval evaluation, which includes two high-level feature extraction runs and one manual search run. All of these runs used a system trained on the common development collection. Only visual information, consisting of color, texture, and edge-based low-level features, was used

    An information-theoretic framework for semantic-multimedia retrieval

    This article is set in the context of searching text and image repositories by keyword. We develop a unified probabilistic framework for text, image, and combined text and image retrieval that is based on the detection of keywords (concepts) using automated image annotation technology. Our framework is deeply rooted in information theory and lends itself to use with other media types. We estimate a statistical model in a multimodal feature space for each possible query keyword. The key element of our framework is to identify feature space transformations that make them comparable in complexity and density. We select the optimal multimodal feature space with a minimum description length criterion from a set of candidate feature spaces that are computed with the average-mutual-information criterion for the text part and hierarchical expectation maximization for the visual part of the data. We evaluate our approach in three retrieval experiments (only text retrieval, only image retrieval, and text combined with image retrieval), verify the framework’s low computational complexity, and compare with existing state-of-the-art ad-hoc models
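The minimum-description-length selection step can be illustrated with a standard two-part-code criterion: fit each candidate model, score it by likelihood plus a parameter-count penalty, and keep the shortest code. The sketch below uses BIC (an MDL-style criterion) over Gaussian mixtures on synthetic data; the data, candidate set, and use of `GaussianMixture` are illustrative assumptions, not the paper's actual feature spaces:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical feature vectors for one keyword: two well-separated clusters.
X = np.vstack([rng.normal(-2, 1, size=(150, 4)),
               rng.normal(2, 1, size=(150, 4))])

# MDL-style selection: description length ~ BIC = -2 log L + p log n,
# where p counts free parameters; keep the candidate with the shortest code.
candidates = {k: GaussianMixture(n_components=k, random_state=0).fit(X)
              for k in (1, 2, 3, 4)}
scores = {k: m.bic(X) for k, m in candidates.items()}
best_k = min(scores, key=scores.get)
```

On this two-cluster sample the two-component model yields the shortest description; richer models fit no better once the parameter penalty is charged, which is the trade-off the framework exploits when choosing among candidate multimodal feature spaces.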

    Muscular cystic hydatidosis: case report

    BACKGROUND: Hydatidosis is a zoonosis caused by Echinococcus granulosus; humans become infected by ingesting eggs released in the faeces of infected dogs. Hydatid cysts are mostly located in the liver and/or lungs, whereas musculoskeletal hydatidosis is very rare. CASE PRESENTATION: We report an unusual case of primary muscular hydatidosis in proximity to the adductor magnus in a young Sicilian man. The patient, 34 years old, was admitted to the Department of Infectious and Tropical Diseases after ultrasonographic detection, subsequently confirmed by magnetic resonance imaging, of an ovular, cyst-like mass (13 × 8 cm) in the adductor magnus of the left thigh, containing several small cystic formations. Serological tests for hydatidosis gave negative results. A second blood sample, drawn 10 days after the first, showed an increase in the antibody titer for hydatidosis. The patient underwent surgical excision of the lesion with perioperative albendazole prophylaxis. The histopathological examination of the bioptic material was not decisive for the diagnosis, so further tests were performed: additional serological tests for hydatidosis evaluating IgE and IgG serotypes (Western Blot and REAST), and molecular analysis of the excised material. These more specific serological tests gave positive results for hydatidosis, and sequencing of the polymerase chain reaction products from the cyst revealed E. granulosus DNA, genotype G1. No post-surgical complications were observed during the following 6 months. CONCLUSION: Cystic hydatidosis should always be considered in the differential diagnosis of any cystic mass, regardless of its location, even in epidemiological contexts less suggestive of the disease. The diagnosis should be reached by taking into consideration the clinical aspects, the epidemiology of the disease, and the imaging and immunological tests but, as demonstrated in this case, without neglecting the numerous possibilities offered by new serological devices and modern molecular biology techniques

    Bilkent University at TRECVID 2006

    We describe our third participation in the TRECVID video retrieval evaluation, which includes one high-level feature extraction run, two manual search runs, and one interactive search run. All of these runs used a system trained on the common development collection. Only visual and textual information were used: visual information consisted of color, texture, and edge-based low-level features, and textual information consisted of the speech transcript provided with the collection

    Sparse Kernel Learning for Image Annotation

    In this paper we introduce a sparse kernel learning framework for the Continuous Relevance Model (CRM). State-of-the-art image annotation models linearly combine evidence from several different feature types to improve image annotation accuracy. While previous authors have focused on learning the linear combination weights for these features, there has been no work examining the optimal combination of kernels. We address this gap by formulating a sparse kernel learning framework for the CRM, dubbed the SKL-CRM, that greedily selects an optimal combination of kernels. Our kernel learning framework rapidly converges to an annotation accuracy that substantially outperforms a host of state-of-the-art annotation models. We draw two surprising conclusions: firstly, if the kernels are chosen correctly, only a very small number of features are required to achieve superior performance over models that utilise a full suite of feature types; and secondly, the standard default selection of kernels commonly used in the literature is sub-optimal, and it is much better to adapt the kernel choice based on the feature type and image dataset
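The greedy kernel-selection loop can be sketched as forward selection over a candidate kernel set, adding whichever kernel most improves cross-validated accuracy of a kernel classifier. This is a generic illustration with synthetic data and an SVM stand-in, not the SKL-CRM itself (which pairs one kernel with each image feature type inside the CRM):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import laplacian_kernel, linear_kernel, rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in task: synthetic binary labels; one matrix reused for all kernels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
cands = {"rbf": rbf_kernel(X), "laplacian": laplacian_kernel(X),
         "linear": linear_kernel(X)}

def cv_acc(K):
    # 3-fold accuracy of an SVM operating on a precomputed kernel matrix.
    return cross_val_score(SVC(kernel="precomputed"), K, y, cv=3).mean()

# Greedy forward selection: add the kernel that most improves CV accuracy,
# stopping as soon as no remaining candidate helps (this yields sparsity).
chosen, K_sum, best = [], np.zeros_like(cands["rbf"]), -np.inf
while len(chosen) < len(cands):
    gains = {n: cv_acc(K_sum + K) for n, K in cands.items() if n not in chosen}
    name = max(gains, key=gains.get)
    if gains[name] <= best:
        break
    chosen.append(name)
    K_sum = K_sum + cands[name]
    best = gains[name]
```

Because the loop halts at the first non-improving step, it typically selects only a few kernels, echoing the paper's finding that a small, well-chosen subset can beat a full suite of feature types.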