
    WordSup: Exploiting Word Annotations for Character based Text Detection

    Texts in images are usually organized as a hierarchy of visual elements, i.e. characters, words, text lines, and text blocks. Among these, the character is the most basic element across writing systems, including Western scripts, Chinese, Japanese, and mathematical expressions. It is therefore natural and convenient to build a common text detection engine on top of character detectors. However, training character detectors requires a vast number of location-annotated characters, which are expensive to obtain; existing real-world text datasets are mostly annotated at the word or line level. To remedy this dilemma, we propose a weakly supervised framework that can exploit word annotations, either tight quadrangles or looser bounding boxes, for character detector training. When applied to scene text detection, this lets us train a robust character detector from the word annotations in rich large-scale real scene text datasets, e.g. ICDAR15 and COCO-Text. The character detector plays a key role in the pipeline of our text detection engine, which achieves state-of-the-art performance on several challenging scene text detection benchmarks. We also demonstrate the flexibility of our pipeline in various scenarios, including deformed text detection and math expression recognition.
    Comment: 2017 International Conference on Computer Vision
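    The weak-supervision idea can be sketched as alternating between running the current character detector and converting word-level boxes into character-level pseudo-labels. The snippet below is a deliberately simplified illustration of the second step only (the paper's actual supervision mask also scores how well the selected characters cover each word); the function name and the `min_score` threshold are our assumptions, not the authors' code.

    ```python
    import numpy as np

    def chars_inside_word(char_boxes, char_scores, word_box, min_score=0.5):
        """Keep character candidates that the current detector scores highly
        AND whose centers fall inside an annotated word box; these become
        pseudo ground-truth characters for the next training round.
        Boxes are (x1, y1, x2, y2); char_scores are detector confidences."""
        wx1, wy1, wx2, wy2 = word_box
        cx = (char_boxes[:, 0] + char_boxes[:, 2]) / 2.0  # candidate centers
        cy = (char_boxes[:, 1] + char_boxes[:, 3]) / 2.0
        inside = (cx >= wx1) & (cx <= wx2) & (cy >= wy1) & (cy <= wy2)
        keep = inside & (char_scores >= min_score)
        return char_boxes[keep]
    ```

    Iterating this selection lets word annotations gradually bootstrap a character detector without any character-level labels.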

    Deep Extreme Cut: From Extreme Points to Object Segmentation

    This paper explores the use of an object's extreme points (left-most, right-most, top, and bottom pixels) as input for obtaining precise object segmentation in images and videos. We do so by adding an extra channel to the input of a convolutional neural network (CNN), containing a Gaussian centered on each of the extreme points. The CNN learns to transform this information into a segmentation of the object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (GrabCut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We obtain the most precise results to date, with less user input, across an extensive and varied selection of benchmarks and datasets. All our models and code are publicly available at http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/.
    Comment: CVPR 2018 camera ready. Project webpage and code: http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr
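    The extra-channel construction is simple to prototype. Below is a minimal NumPy sketch, not the authors' released code: the function names, the value of sigma, the max-overlay of the four Gaussians, and the 0-255 scaling of the heatmap are all our assumptions.

    ```python
    import numpy as np

    def extreme_points_channel(shape, points, sigma=10.0):
        """Build a heatmap with a 2D Gaussian centered on each extreme point.
        `shape` is (H, W); `points` is a list of (x, y) extreme points
        (left-most, right-most, top, bottom)."""
        h, w = shape
        ys, xs = np.mgrid[0:h, 0:w]
        heatmap = np.zeros((h, w), dtype=np.float32)
        for (px, py) in points:
            g = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
            heatmap = np.maximum(heatmap, g)  # overlay the four Gaussians
        return heatmap

    def make_network_input(image, points, sigma=10.0):
        """Concatenate the heatmap as a 4th channel onto an HxWx3 image."""
        heat = extreme_points_channel(image.shape[:2], points, sigma)
        return np.concatenate([image.astype(np.float32),
                               heat[..., None] * 255.0], axis=-1)
    ```

    The resulting 4-channel tensor is what the CNN consumes in place of the plain RGB image.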

    Cortical Synchronization and Perceptual Framing

    How does the brain group different parts of an object into a coherent visual object representation? Different parts of an object may be processed by the brain at different rates and may thus become desynchronized. Perceptual framing is a process that resynchronizes the cortical activities corresponding to the same retinal object. A neural network model is presented that can rapidly resynchronize desynchronized neural activities, providing a link between perceptual and brain data. Model properties quantitatively simulate perceptual framing data, including psychophysical data about temporal order judgments and the reduction of threshold contrast as a function of stimulus length. The same model has earlier been used to explain data about illusory contour formation, texture segregation, shape-from-shading, 3-D vision, and cortical receptive fields. It thereby shows how many of these data may be understood as manifestations of a cortical grouping process that can rapidly resynchronize image parts that belong together in visual object representations. The model exhibits better synchronization in the presence of noise than without it, a form of stochastic resonance, and synchronizes robustly when cells representing different stimulus orientations compete. These properties arise when fast long-range cooperation and slow short-range competition interact via nonlinear feedback with cells that obey shunting equations.
    Funding: Office of Naval Research (N00014-92-J-1309, N00014-95-I-0409, N00014-95-I-0657, N00014-92-J-4015); Air Force Office of Scientific Research (F49620-92-J-0334, F49620-92-J-0225)
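    The shunting dynamics mentioned at the end are easy to illustrate. A single shunting cell obeys an equation of the form dx/dt = -A*x + (B - x)*E - (x + C)*I, whose multiplicative terms bound the activity between -C and B. The Euler-integration sketch below is ours; the parameter values are illustrative assumptions, not the paper's fitted constants, and the full model couples many such cells through cooperative-competitive feedback.

    ```python
    import numpy as np

    def simulate_shunting(E, I, A=1.0, B=1.0, C=0.5, dt=0.01, steps=2000):
        """Integrate dx/dt = -A*x + (B - x)*E - (x + C)*I by forward Euler.
        E is the excitatory input, I the inhibitory input."""
        x = 0.0
        trace = np.empty(steps)
        for t in range(steps):
            dx = -A * x + (B - x) * E - (x + C) * I
            x += dt * dx
            trace[t] = x
        return trace  # activity stays bounded in [-C, B] by the shunting terms

    activity = simulate_shunting(E=2.0, I=0.5)
    print(activity[-1])  # settles near (B*E - C*I) / (A + E + I) = 0.5
    ```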

    Panchromatic spectral energy distributions of Herschel sources

    (abridged) Far-infrared Herschel photometry from the PEP and HerMES programs is combined with ancillary datasets in the GOODS-N, GOODS-S, and COSMOS fields. Based on this rich dataset, we reproduce the restframe UV-to-FIR ten-color distribution of galaxies using a superposition of multivariate Gaussian modes. The median SED of each mode is then fitted with a modified version of the MAGPHYS code that combines stellar light, emission from dust heated by stars, and a possible warm dust contribution heated by an AGN. The Gaussian grouping so defined is also used to identify rare sources. The zoology of outliers includes Herschel-detected ellipticals, very blue z~1 Lyman-break galaxies, quiescent spirals, and torus-dominated AGN with star formation. From these groups and outliers, a new template library is assembled, consisting of 32 SEDs describing the intrinsic scatter in the restframe UV-to-submm colors of infrared galaxies. This library is tested against L(IR) estimates with and without Herschel data included, and compared to eight other popular methods often adopted in the literature. When Herschel photometry is included, these approaches produce L(IR) values consistent with each other within a median absolute deviation of 10-20%, the scatter being dominated by fine tuning of the codes rather than by the choice of SED templates. Finally, the library is used to classify 24 micron detected sources in the PEP GOODS fields. AGN appear to be distributed in the stellar mass (M*) vs. star formation rate (SFR) space along with all other galaxies, regardless of the amount of infrared luminosity they power, with a tendency to lie on the high-SFR side of the "main sequence". The incidence of warmer star-forming sources grows for objects with higher specific star formation rates (sSFR), and they tend to populate the "off-sequence" region of the M*-SFR-z space.
    Comment: Accepted for publication in A&A. Some figures are presented in low resolution. The new galaxy templates are available for download at http://www.mpe.mpg.de/ir/Research/PEP/uvfir_temp
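    The grouping-and-outlier step described above maps naturally onto standard Gaussian mixture modeling. The sketch below uses scikit-learn rather than the authors' actual fitting machinery, and the component count, variable names, placeholder data, and 1% likelihood cut are illustrative assumptions only.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Placeholder for the N x 10 matrix of restframe UV-to-FIR colors.
    colors = np.random.randn(5000, 10)

    # Fit a superposition of multivariate Gaussian modes to the color space.
    gmm = GaussianMixture(n_components=6, covariance_type='full').fit(colors)

    modes = gmm.predict(colors)           # Gaussian-mode membership per galaxy
    log_like = gmm.score_samples(colors)  # per-source log-likelihood under the fit

    # Sources the mixture describes worst are candidate rare objects/outliers.
    outliers = log_like < np.percentile(log_like, 1)
    print(f"{outliers.sum()} candidate rare sources")
    ```

    In this scheme, the median SED of each mode would then be the input to the MAGPHYS-based fitting, while the flagged outliers are inspected individually.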