    Cross-dimensional Weighting for Aggregated Deep Convolutional Features

    We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs. We first present a generalized framework that encompasses a broad family of approaches and includes cross-dimensional pooling and weighting steps. We then propose specific non-parametric schemes for both spatial- and channel-wise weighting that boost the effect of highly active spatial responses and at the same time regulate burstiness effects. We experiment on different public datasets for image search and show that our approach outperforms the current state-of-the-art for approaches based on pre-trained networks. We also provide an easy-to-use, open source implementation that reproduces our results. Comment: Accepted for publication at the 4th Workshop on Web-scale Vision and Social Media (VSM), ECCV 201

    Efficient On-the-fly Category Retrieval using ConvNets and GPUs

    We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval, where classifiers are learnt at run time for a textual query from downloaded images and used to rank large image or video datasets. We make three contributions: (i) we present an evaluation of state-of-the-art image representations for object category retrieval over standard benchmark datasets containing 1M+ images; (ii) we show that ConvNets can be used to obtain features which are highly performant, and yet much lower dimensional than previous state-of-the-art image representations, and that their dimensionality can be reduced further without loss in performance by compression using product quantization or binarization. Consequently, features with state-of-the-art performance on large-scale datasets of millions of images can fit in the memory of even a commodity GPU card; (iii) we show that an SVM classifier can be learnt within a ConvNet framework on a GPU in parallel with downloading the new training images, allowing for a continuous refinement of the model as more images become available, and simultaneous training and ranking. The outcome is an on-the-fly system that significantly outperforms its predecessors in terms of precision of retrieval, memory requirements, and speed, facilitating accurate on-the-fly learning and ranking in under a second on a single GPU. Comment: Published in proceedings of ACCV 201

    Surface composition of BaTiO3/SrTiO3(001) films grown by atomic oxygen plasma assisted molecular beam epitaxy

    We have investigated the growth of BaTiO3 thin films deposited on pure and 1% Nb-doped SrTiO3(001) single crystals using atomic oxygen assisted molecular beam epitaxy (AO-MBE) and dedicated Ba and Ti Knudsen cells. Thicknesses up to 30 nm were investigated for various layer compositions. We demonstrate 2D growth and epitaxial single crystalline BaTiO3 layers up to 10 nm before additional 3D features appear; lattice parameter relaxation occurs during the first few nanometers and is completed at ~10 nm. The presence of a Ba oxide rich top layer that probably favors 2D growth is evidenced for well crystallized layers. We show that the Ba oxide rich top layer can be removed by chemical etching. The present work stresses the importance of stoichiometry and surface composition of BaTiO3 layers, especially in view of their integration in devices. Comment: In press in J. Appl. Phy

    PlaNet - Photo Geolocation with Convolutional Neural Networks

    Is it possible to build a system to determine the location where a photo was taken using just its pixels? In general, the problem seems exceptionally difficult: it is trivial to construct situations where no location can be inferred. Yet images often contain informative cues such as landmarks, weather patterns, vegetation, road markings, and architectural details, which in combination may allow one to determine an approximate location and occasionally an exact location. Websites such as GeoGuessr and View from your Window suggest that humans are relatively good at integrating these cues to geolocate images, especially en masse. In computer vision, the photo geolocation problem is usually approached using image retrieval methods. In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells, and train a deep network using millions of geotagged images. While previous approaches only recognize landmarks or perform approximate matching using global image descriptors, our model is able to use and integrate multiple visible cues. We show that the resulting model, called PlaNet, outperforms previous approaches and even attains superhuman levels of accuracy in some cases. Moreover, we extend our model to photo albums by combining it with a long short-term memory (LSTM) architecture. By learning to exploit temporal coherence to geolocate uncertain photos, we demonstrate that this model achieves a 50% performance improvement over the single-image model.

    Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval

    Optimising a ranking-based metric, such as Average Precision (AP), is notoriously challenging because it is non-differentiable, and hence cannot be optimised directly using gradient-descent methods. To this end, we introduce an objective that optimises instead a smoothed approximation of AP, coined Smooth-AP. Smooth-AP is a plug-and-play objective function that allows for end-to-end training of deep networks with a simple and elegant implementation. We also present an analysis of why directly optimising the ranking-based metric of AP offers benefits over other deep metric learning losses. We apply Smooth-AP to standard retrieval benchmarks: Stanford Online Products and VehicleID, and also evaluate on larger-scale datasets: INaturalist for fine-grained category retrieval, and VGGFace2 and IJB-C for face retrieval. In all cases, we improve the performance over the state-of-the-art, especially for larger-scale datasets, thus demonstrating the effectiveness and scalability of Smooth-AP to real-world scenarios. Comment: Accepted at ECCV 202
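    To illustrate why AP is non-differentiable and what a smoothed approximation can look like, the sketch below computes exact AP with a hard step function and a relaxed version in which the step is replaced by a logistic sigmoid with temperature tau. The sigmoid relaxation and the names exact_ap, smooth_ap and tau are assumptions made for illustration; the abstract does not specify the form of the approximation.

```python
import math


def sigmoid(x, tau):
    """Logistic function with temperature tau (assumed relaxation of a step)."""
    z = x / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def average_precision(scores, labels, step):
    """AP where each rank is 1 + the sum of step(s_j - s_i) over other items j."""
    n = len(scores)
    positives = [i for i in range(n) if labels[i]]
    total = 0.0
    for i in positives:
        # Rank of item i among the positive items only.
        rank_pos = 1.0 + sum(step(scores[j] - scores[i]) for j in positives if j != i)
        # Rank of item i among all items.
        rank_all = 1.0 + sum(step(scores[j] - scores[i]) for j in range(n) if j != i)
        total += rank_pos / rank_all
    return total / len(positives)


def exact_ap(scores, labels):
    """Exact AP: the hard step has zero gradient almost everywhere."""
    return average_precision(scores, labels, lambda d: 1.0 if d > 0 else 0.0)


def smooth_ap(scores, labels, tau=0.01):
    """Illustrative smoothed AP: a sigmoid stands in for the hard step."""
    return average_precision(scores, labels, lambda d: sigmoid(d, tau))
```

    With a small tau the smoothed value approaches the exact AP; a larger tau gives a looser approximation whose value varies smoothly with the scores.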

    The breakdown of the municipality as caring platform: lessons for co-design and co-learning in the age of platform capitalism

    If municipalities were the caring platforms of the 19th-20th century sharing economy, how does care manifest in civic structures of the current period? We consider how platforms - from the local initiatives of communities transforming neighbourhoods, to the city, in the form of the local authority - are involved, trusted and/or relied on in the design of shared services and amenities for the public good. We use contrasting cases of interaction between local government and civil society organisations in Sweden and the UK to explore trends in public service provision. We look at how care can manifest between state and citizens and at the roles that co-design and co-learning play in developing contextually sensitive opportunities for caring platforms. In this way, we seek to learn from platforms in transition about the importance of co-learning in political and structural contexts and make recommendations for the co-design of (digital) platforms to care with and for civil society.

    Variable and value elimination in binary constraint satisfaction via forbidden patterns

    Variable or value elimination in a constraint satisfaction problem (CSP) can be used in preprocessing or during search to reduce search space size. A variable elimination rule (value elimination rule) allows the polynomial-time identification of certain variables (domain elements) whose elimination, without the introduction of extra compensatory constraints, does not affect the satisfiability of an instance. We show that there are essentially just four variable elimination rules and three value elimination rules defined by forbidding generic sub-instances, known as irreducible existential patterns, in arc-consistent CSP instances. One of the variable elimination rules is the already-known Broken Triangle Property, whereas the other three are novel. The three value elimination rules can all be seen as strict generalisations of neighbourhood substitution. Comment: A full version of an IJCAI'13 paper to appear in Journal of Computer and System Sciences (JCSS)
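    As background for the value elimination rules mentioned above, the sketch below shows plain neighbourhood substitution in a binary CSP, using its usual definition: a value b of variable x can be removed if some other value a of x is compatible with every value that b is compatible with, in every other variable. It does not implement the paper's new rules, and the data layout (domains, allowed) and function names are hypothetical.

```python
def supports(domains, allowed, x, a, y):
    """Values of y compatible with x=a; unconstrained pairs allow everything."""
    if (x, y) in allowed:
        return {b for b in domains[y] if (a, b) in allowed[(x, y)]}
    if (y, x) in allowed:
        return {b for b in domains[y] if (b, a) in allowed[(y, x)]}
    return set(domains[y])


def eliminate_by_substitution(domains, allowed, x):
    """Return x's values that survive sequential neighbourhood substitution."""
    remaining = list(domains[x])
    others = [y for y in domains if y != x]
    for b in list(remaining):
        for a in remaining:
            # b is substitutable by a if a's supports contain b's supports
            # in every other variable y.
            if a != b and all(
                supports(domains, allowed, x, b, y) <= supports(domains, allowed, x, a, y)
                for y in others
            ):
                remaining.remove(b)
                break
    return remaining


# Hypothetical instance: x in {1, 2, 3}, y in {1, 2}, one constraint on (x, y).
domains = {"x": [1, 2, 3], "y": [1, 2]}
allowed = {("x", "y"): {(1, 1), (1, 2), (2, 1), (3, 2)}}
survivors = eliminate_by_substitution(domains, allowed, "x")
```

    Here x=1 is compatible with both values of y, so x=2 and x=3 are each substitutable by x=1 and can be dropped without affecting satisfiability.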