
    A Quantitative Understanding of Pattern Matching

    This paper shows that the recent approach to quantitative typing systems for programming languages can be extended to pattern matching features. Indeed, we define two resource-aware type systems, named U and E, for a λ-calculus equipped with pairs for both patterns and terms. Our typing systems borrow some basic ideas from [Antonio Bucciarelli et al., 2015], which characterises (head) normalisation in a qualitative way, in the sense that typability and normalisation coincide. But, in contrast to [Antonio Bucciarelli et al., 2015], our systems also provide quantitative information about the dynamics of the calculus. Indeed, system U provides upper bounds for the length of (head) normalisation sequences plus the size of their corresponding normal forms, while system E, which can be seen as a refinement of system U, produces exact bounds for each of them. This is achieved by means of a non-idempotent intersection type system equipped with different technical tools. First of all, we use product types to type pairs instead of the disjoint unions in [Antonio Bucciarelli et al., 2015], which turn out to be an essential quantitative tool because they remove the confusion between "being a pair" and "being duplicable". Secondly, typing sequents in system E are decorated with tuples of integers, which provide quantitative information about normalisation sequences, notably time (cf. length) and space (cf. size). Moreover, the time resource information is remarkably refined, because it discriminates between different kinds of reduction steps performed during evaluation, so that beta, substitution and matching steps are counted separately. Another key tool of system E is that the type system distinguishes between consuming (contributing to time) and persistent (contributing to space) constructors.
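
    The separate step counters can be made concrete with a small interpreter. The sketch below (plain Python, written for this summary) evaluates a λ-calculus with pairs to weak head normal form while tallying beta, substitution and matching steps separately; the syntax, the capture-naive substitution and the example term are illustrative assumptions, not the calculus or the counting discipline of systems U and E.

```python
# Minimal sketch: a lambda-calculus with pairs whose weak-head evaluator counts
# beta, substitution and matching steps separately. Illustrative only -- not the
# calculus or counting discipline of systems U/E.
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    body: object

@dataclass
class App:
    fn: object
    arg: object

@dataclass
class Pair:
    fst: object
    snd: object

@dataclass
class LetPair:   # "match pair as (x, y) in body"
    x: str
    y: str
    pair: object
    body: object

counters = {"beta": 0, "subst": 0, "match": 0}

def subst(t, name, value):
    """Capture-naive substitution; every replaced occurrence is one 'subst' step."""
    if isinstance(t, Var):
        if t.name == name:
            counters["subst"] += 1
            return value
        return t
    if isinstance(t, Lam):
        return t if t.param == name else Lam(t.param, subst(t.body, name, value))
    if isinstance(t, App):
        return App(subst(t.fn, name, value), subst(t.arg, name, value))
    if isinstance(t, Pair):
        return Pair(subst(t.fst, name, value), subst(t.snd, name, value))
    if isinstance(t, LetPair):
        body = t.body if name in (t.x, t.y) else subst(t.body, name, value)
        return LetPair(t.x, t.y, subst(t.pair, name, value), body)
    return t

def eval_head(t):
    """Weak head evaluation, tallying each kind of reduction step."""
    if isinstance(t, App):
        fn = eval_head(t.fn)
        if isinstance(fn, Lam):
            counters["beta"] += 1
            return eval_head(subst(fn.body, fn.param, t.arg))
        return App(fn, t.arg)
    if isinstance(t, LetPair):
        p = eval_head(t.pair)
        if isinstance(p, Pair):
            counters["match"] += 1
            return eval_head(subst(subst(t.body, t.x, p.fst), t.y, p.snd))
        return LetPair(t.x, t.y, p, t.body)
    return t

# Example: match the pair ((\z.z) a, b) against (x, y), then return x y.
identity = Lam("z", Var("z"))
prog = LetPair("x", "y",
               Pair(App(identity, Var("a")), Var("b")),
               App(Var("x"), Var("y")))
print(eval_head(prog), counters)  # the counters keep beta/subst/match steps apart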

    Generic 3D Representation via Pose Estimation and Matching

    Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate that our representation achieves state-of-the-art wide baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learned features). We also show 6DOF camera pose estimation given a pair of local image patches. The accuracy of both supervised tasks is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions. Comment: Published in ECCV16. See the project website http://3drepresentation.stanford.edu/ and dataset website https://github.com/amir32002/3D_Street_Vie
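
    As a rough illustration of the multi-task setup described above, the sketch below (assuming PyTorch) shares one convolutional encoder between two image patches and attaches two heads, one regressing a 6-DOF relative pose and one classifying whether the patches match; the layer sizes, patch size and heads are assumptions made for the example, not the paper's architecture.

```python
# Hypothetical multi-task siamese ConvNet: shared patch encoder with a relative
# pose head and a match head. Sizes are illustrative, not the paper's network.
import torch
import torch.nn as nn

class PosePlusMatchNet(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(                        # shared representation
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim), nn.ReLU(),
        )
        self.pose_head = nn.Linear(2 * embed_dim, 6)         # 6-DOF relative pose
        self.match_head = nn.Linear(2 * embed_dim, 1)        # same 3D point or not

    def forward(self, patch_a, patch_b):
        feats = torch.cat([self.encoder(patch_a), self.encoder(patch_b)], dim=1)
        return self.pose_head(feats), self.match_head(feats)

net = PosePlusMatchNet()
a, b = torch.randn(4, 3, 48, 48), torch.randn(4, 3, 48, 48)  # toy patch pairs
pose, match_logit = net(a, b)
print(pose.shape, match_logit.shape)                         # (4, 6) and (4, 1)
```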

    Deep Video Color Propagation

    Traditional approaches for color propagation in videos rely on some form of matching between consecutive video frames. Using appearance descriptors, colors are then propagated both spatially and temporally. These methods, however, are computationally expensive and do not take advantage of the semantic information of the scene. In this work we propose a deep learning framework for color propagation that combines a local strategy, to propagate colors frame-by-frame ensuring temporal stability, and a global strategy, using semantics for color propagation within a longer range. Our evaluation shows the superiority of our strategy over existing video and image color propagation methods as well as neural photo-realistic style transfer approaches. Comment: BMVC 201
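
    The two-strategy design can be caricatured in a few lines of NumPy: a local estimate obtained by warping the previously colorized frame with optical flow, and a global estimate obtained by copying colors from a reference frame via dense feature matching, blended with a fixed weight. All functions and inputs below are illustrative stand-ins; in the proposed framework these components are learned.

```python
# Rough sketch (assumptions throughout, not the paper's network): colorize frame t
# by blending a local estimate (previous frame warped by flow) with a global
# estimate (colors pulled from a reference frame by naive dense matching).
import numpy as np

def warp_with_flow(img, flow):
    """Nearest-neighbour backward warp: sample img at (x + u, y + v)."""
    h, w = flow.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[sy, sx]

def propagate_frame(prev_color, flow_prev_to_t, ref_color, feat_t, feat_ref, alpha=0.7):
    local = warp_with_flow(prev_color, flow_prev_to_t)         # temporal, frame-by-frame
    # Global estimate: each pixel copies the color of its best-matching reference pixel.
    c = feat_t.shape[-1]
    sims = feat_t.reshape(-1, c) @ feat_ref.reshape(-1, c).T   # dense similarity matrix
    idx = sims.argmax(axis=1)
    global_est = ref_color.reshape(-1, ref_color.shape[-1])[idx].reshape(local.shape)
    return alpha * local + (1 - alpha) * global_est            # simple fixed blend

# Toy usage with random data (8x8 frames, 16-dim per-pixel features).
h = w = 8
out = propagate_frame(np.random.rand(h, w, 3), np.zeros((h, w, 2)),
                      np.random.rand(h, w, 3), np.random.rand(h, w, 16),
                      np.random.rand(h, w, 16))
print(out.shape)  # (8, 8, 3)
```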

    ADVISE: Symbolism and External Knowledge for Decoding Advertisements

    In order to convey the most content in their limited space, advertisements embed references to outside knowledge via symbolism. For example, a motorcycle stands for adventure (a positive property the ad wants associated with the product being sold), and a gun stands for danger (a negative property to dissuade viewers from undesirable behaviors). We show how to use symbolic references to better understand the meaning of an ad. We further show how anchoring ad understanding in general-purpose object recognition and image captioning improves results. We formulate the ad understanding task as matching the ad image to human-generated statements that describe the action that the ad prompts, and the rationale it provides for taking this action. Our proposed method outperforms the state of the art on this task, and on an alternative formulation of question-answering on ads. We show additional applications of our learned representations for matching ads to slogans, and clustering ads according to their topic, without extra training. Comment: To appear, Proceedings of the European Conference on Computer Vision (ECCV)
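
    The matching formulation can be sketched as scoring an ad image against candidate action-reason statements in a joint embedding space (assuming PyTorch); the random linear projections below stand in for learned image and text encoders and are not the ADVISE model itself.

```python
# Schematic image-statement matching: embed both modalities into a shared space
# and rank candidate statements by cosine similarity. Encoders are stand-ins.
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Linear(2048, 256)   # stand-in for a CNN image embedding
text_encoder = torch.nn.Linear(300, 256)     # stand-in for a sentence embedding

image_feat = torch.randn(1, 2048)            # pooled visual features of one ad
statement_feats = torch.randn(5, 300)        # 5 candidate action-reason statements

img = F.normalize(image_encoder(image_feat), dim=-1)
txt = F.normalize(text_encoder(statement_feats), dim=-1)
scores = img @ txt.t()                        # cosine similarities, shape (1, 5)
print(scores.argmax(dim=-1))                  # index of the best-matching statement
```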