9 research outputs found

    Learning Localized Perceptual Similarity Metrics for Interactive Categorization


    Fast Interactive Search with a Scale-Free Comparison Oracle

    A comparison-based search algorithm lets a user find a target item t in a database by answering queries of the form, "Which of items i and j is closer to t?" Instead of formulating an explicit query (such as one or several keywords), the user navigates towards the target via a sequence of such (typically noisy) queries. We propose a scale-free probabilistic oracle model called γ-CKL for such similarity triplets (i, j; t), which generalizes the CKL triplet model proposed in the literature. The generalization affords independent control over the discriminating power of the oracle and the dimension of the feature space containing the items. We develop a search algorithm with a provably exponential rate of convergence under the γ-CKL oracle, thanks to a backtracking strategy that deals with the unavoidable errors in updating the belief region around the target. We evaluate the performance of the algorithm both over the posited oracle and over several real-world triplet datasets. We also report on a comprehensive user study, where human subjects navigate a database of face portraits.
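The comparison-oracle setting above can be illustrated with a minimal sketch. This is not the paper's algorithm: the noise model below is a simple distance-ratio rule merely inspired by the γ-CKL idea, and the search is a plain noisy tournament with no belief region or backtracking; all function names and parameters are illustrative assumptions.

```python
import math
import random

def oracle(i, j, t, gamma=4.0):
    """Noisy comparison oracle: answers True when item i is judged closer
    to target t.  The answer probability follows an assumed distance-ratio
    rule; larger gamma means a more reliable (more discriminating) oracle."""
    di, dj = math.dist(i, t), math.dist(j, t)
    if di == dj:
        return random.random() < 0.5
    p_i_closer = dj**gamma / (di**gamma + dj**gamma)
    return random.random() < p_i_closer

def search(items, target, rounds=300):
    """Noisy tournament: keep a current best guess and repeatedly pit it
    against a random challenger, retaining whichever the oracle prefers."""
    best = random.choice(items)
    for _ in range(rounds):
        challenger = random.choice(items)
        if challenger == best:
            continue
        if not oracle(best, challenger, target):
            best = challenger
    return best
```

With a reliable oracle the current guess drifts towards the target; the paper's contribution lies precisely in making such convergence provably fast despite the noise, which this toy loop does not attempt.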

    Visipedia - Multi-Dimensional Object Embedding Based on Perceptual Similarity

    Problems such as fine-grained categorization and human-based computation have become increasingly popular in the community in recent years, as evidenced by the large number of published works on these topics. Whereas most of these works use "classical" visual features extracted by machine, this one focuses on perceptual properties that cannot easily be captured by machine and that require involving humans in the data-collection process. The thesis examines ways to obtain perceptual similarities from humans cheaply and effectively, also in terms of scalability. It evaluates several relevant experiments and proposes methods that improve the efficiency of data collection. Existing methods for learning a multi-dimensional embedding and for navigating its space are also reviewed and compared. The acquired observations are subsequently used in a complex experiment evaluated on a food image dataset, covering the whole procedure from eliciting similarities from humans, through learning the embedding, to searching the resulting multi-dimensional space.

    A Statistical Framework for Image Category Search from a Mental Picture

    Keywords: Image Retrieval; Relevance Feedback; Page Zero Problem; Mental Matching; Bayesian System; Statistical Learning

    Starting from a member of an image database designated the "query image," traditional image retrieval techniques, for example search by visual similarity, allow one to locate additional instances of a target category residing in the database. However, in many cases, the query image or, more generally, the target category, resides only in the mind of the user as a set of subjective visual patterns, psychological impressions or "mental pictures." Consequently, since image databases available today are often unstructured and lack reliable semantic annotations, it is often not obvious how to initiate a search session; this is the "page zero problem." We propose a new statistical framework based on relevance feedback to locate an instance of a semantic category in an unstructured image database with no semantic annotations. A search session is initiated from a random sample of images. At each retrieval round the user is asked to select one image from among a set of displayed images – the one that is closest in his opinion to the target class. The matching is then "mental." Performance is measured by the number of iterations necessary to display an image which satisfies the user, at which point standard techniques can be employed to display other instances. Our core contribution is a Bayesian formulation which scales to large databases. The two key components are a response model which accounts for the user's subjective perception of similarity and a display algorithm which seeks to maximize the flow of information. Experiments with real users and two databases of 20,000 and 60,000 images demonstrate the efficiency of the search process.
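A toy version of such a relevance-feedback loop can be sketched as follows. This is an illustrative simplification, not the paper's system: images are plain 2-D feature vectors, the response model is an assumed softmax over distances, and the display set is sampled at random rather than chosen to maximize information flow; all names and parameters below are assumptions.

```python
import math
import random

def response_prob(display, target, beta=2.0):
    """Assumed softmax response model: the user tends to pick the displayed
    image closest to the mental target; beta controls the noise level."""
    scores = [math.exp(-beta * math.dist(d, target)) for d in display]
    z = sum(scores)
    return [s / z for s in scores]

def mental_search(images, target, n_display=4, rounds=20):
    """Bayesian relevance feedback: maintain a posterior over which image
    is the target, update it after each simulated user choice, and return
    the maximum a posteriori image index."""
    posterior = {i: 1.0 / len(images) for i in range(len(images))}
    for _ in range(rounds):
        display = [images[i]
                   for i in random.sample(range(len(images)), n_display)]
        # simulate the user's (noisy) pick of the closest displayed image
        probs = response_prob(display, target)
        choice = random.choices(range(n_display), weights=probs)[0]
        # Bayes update: likelihood of this pick if image x were the target
        for i, img in enumerate(images):
            posterior[i] *= response_prob(display, img)[choice]
        z = sum(posterior.values())
        posterior = {i: p / z for i, p in posterior.items()}
    return max(posterior, key=posterior.get)
```

The paper's display algorithm replaces the random sampling above with an information-maximizing choice, which is what makes the framework scale to databases of tens of thousands of images.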

    Graph-Based Information Processing:Scaling Laws and Applications

    We live in a world characterized by massive information transfer and real-time communication. The demand for efficient yet low-complexity algorithms is widespread across different fields, including machine learning, signal processing and communications. Most of the problems that we encounter across these disciplines involve a large number of modules interacting with each other. It is therefore natural to represent these interactions, and the flow of information between the modules, in terms of a graph. This leads to the study of a graph-based information-processing framework, which can be used to gain insight into the development of algorithms for a diverse set of applications. We investigate the behaviour of large-scale networks (ranging from wireless sensor networks to social networks) as a function of underlying parameters. In particular, we study the scaling laws and applications of graph-based information processing in sensor networks/arrays, sparsity pattern recovery and interactive content search. In the first part of this thesis, we explore location estimation from incomplete information, a problem that arises often in wireless sensor networks and ultrasound tomography devices. In such applications, the data gathered by the sensors are only useful if we can pinpoint the sensors' positions with reasonable accuracy. This problem is particularly challenging when we must infer the positions from basic information such as proximity or incomplete (and often noisy) pairwise distances. As the sensors deployed in a sensor network are often of low quality and unreliable, we also need a mechanism to single out those that do not work properly. In the second part, we frame the network tomography problem as a well-studied inverse problem in statistics called group testing. Group testing involves detecting a small set of defective items in a large population by grouping subsets of items into different pools. The result of each pool is a binary output indicating whether the pool contains a defective item or not. Motivated by the network tomography application, we consider the general framework of group testing with graph constraints. As opposed to conventional group testing, where any subset of items can be grouped, here a test is admissible only if it induces a connected subgraph. Given this constraint, we are interested in bounding the number of pools required to identify the defective items. Once the positions of the sensors are known and the defective sensors are identified, we investigate another important feature of networks, namely navigability: how fast nodes can deliver a message from one end to another by means of local operations. In the final part, we consider navigating through a database of objects by means of comparisons. Contrary to traditional databases, users do not submit queries that are subsequently matched to objects. Instead, at each step, the database presents two objects to the user, who then selects the object of the pair closest to the target that she has in mind. This process continues until, based on the user's answers, the database can identify the target. Search through comparisons amounts to determining which pairs should be presented to the user in order to find the target object as quickly as possible. Interestingly, this problem has a natural connection with the navigability property studied in the second part, which enables us to develop efficient algorithms.
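The pooling step described above can be illustrated with a minimal group-testing sketch. Assumptions: this is the classic unconstrained setting with random pools and the standard COMP decoder; the thesis's graph-constrained variant, where each admissible pool must induce a connected subgraph, is not modeled here.

```python
import random

def run_tests(defective, pools):
    """Each pool is positive iff it contains at least one defective item."""
    return [any(i in defective for i in pool) for pool in pools]

def comp_decode(n, pools, results):
    """COMP decoder: every item that appears in at least one negative pool
    is certainly healthy; all remaining items are declared defective."""
    healthy = set()
    for pool, positive in zip(pools, results):
        if not positive:
            healthy.update(pool)
    return sorted(set(range(n)) - healthy)
```

COMP never misses a true defective, since a defective item can only ever appear in positive pools, but it may retain a few false positives when too few pools are used; bounding the number of pools needed, under the connectivity constraint, is exactly the question the thesis studies.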