1,294 research outputs found

    Move Forward and Tell: A Progressive Generator of Video Descriptions

    We present an efficient framework that can generate a coherent paragraph to describe a given video. Previous works on video captioning usually focus on video clips. They typically treat an entire video as a whole and generate the caption conditioned on a single embedding. In contrast, we consider videos with rich temporal structures and aim to generate paragraph descriptions that preserve the story flow while being coherent and concise. Towards this goal, we propose a new approach, which produces a descriptive paragraph by assembling temporally localized descriptions. Given a video, it selects a sequence of distinctive clips and generates sentences thereon in a coherent manner. In particular, the selection of clips and the production of sentences are done jointly and progressively, driven by a recurrent network: what to describe next depends on what has been said before. Here, the recurrent network is learned via self-critical sequence training with both sentence-level and paragraph-level rewards. On the ActivityNet Captions dataset, our method demonstrated the capability of generating high-quality paragraph descriptions for videos. Compared to those by other methods, the descriptions produced by our method are often more relevant, more coherent, and more concise. Comment: Accepted by ECCV 201

    Evaluation of wind tunnel performance testings of an advanced 45 deg swept 8-bladed propeller at Mach numbers from 0.45 to 0.85

    The increased emphasis on fuel conservation in the world and the rapid increase in the cost of jet fuel have stimulated a series of studies of both conventional and unconventional propulsion systems for commercial aircraft. The results of these studies indicate that a fuel saving of 15 to 30 percent may be realized by the use of an advanced high-speed turboprop (Prop-Fan) compared to aircraft equipped with high bypass turbofan engines of equivalent technology. The Prop-Fan propulsion system is being investigated as part of the NASA Aircraft Energy Efficient Program. This effort includes the wind tunnel testing of a series of 8- and 10-blade Prop-Fan models incorporating swept blades. Test results indicate efficiency levels near the goal of 80 percent at Mach 0.8 cruise and an altitude of 10.67 km (35,000 ft). Each successive swept model has shown improved efficiency relative to the straight blade model. The fourth model, with 45 deg swept blades reported herein, shows a net efficiency of 78.2 percent at the design point with a power loading of 301 kW/sq meter and a tip speed of 243.8 m/sec (800 ft/sec).

    Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation

    Joint segmentation and classification of fine-grained actions is important for applications of human-robot interaction, video surveillance, and human skill evaluation. However, despite substantial recent progress in large-scale action classification, the performance of state-of-the-art fine-grained action recognition approaches remains low. We propose a model for action segmentation which combines low-level spatiotemporal features with a high-level segmental classifier. Our spatiotemporal CNN comprises a spatial component that uses convolutional filters to capture information about objects and their relationships, and a temporal component that uses large 1D convolutional filters to capture information about how object relationships change across time. These features are used in tandem with a semi-Markov model that models transitions from one action to another. We introduce an efficient constrained segmental inference algorithm for this model that is orders of magnitude faster than the current approach. We highlight the effectiveness of our Segmental Spatiotemporal CNN on cooking and surgical action datasets for which we observe substantially improved performance relative to recent baseline methods. Comment: Updated from the ECCV 2016 version. We fixed an important mathematical error and made the section on segmental inference clearer.

    Conditional Image-Text Embedding Networks

    This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model. In order to differentiate text phrases into semantically distinct subspaces, we propose a concept weight branch that automatically assigns phrases to embeddings, whereas prior works predefine such assignments. Our proposed solution simplifies the representation requirements for individual embeddings and allows the underrepresented concepts to take advantage of the shared representations before feeding them into concept-specific layers. Comprehensive experiments verify the effectiveness of our approach across three phrase grounding datasets, Flickr30K Entities, ReferIt Game, and Visual Genome, where we obtain improvements of 4%, 3%, and 4%, respectively, in grounding performance over a strong region-phrase embedding baseline. Comment: ECCV 2018 accepted paper

    Learning Visual Question Answering by Bootstrapping Hard Attention

    Attention mechanisms in biological perception are thought to select subsets of perceptual information for more sophisticated processing which would be prohibitive to perform on all sensory inputs. In computer vision, however, there has been relatively little exploration of hard attention, where some information is selectively ignored, in spite of the success of soft attention, where information is re-weighted and aggregated, but never filtered out. Here, we introduce a new approach for hard attention and find it achieves very competitive performance on recently released visual question answering datasets, equalling and in some cases surpassing similar soft attention architectures while entirely ignoring some features. Even though the hard attention mechanism is thought to be non-differentiable, we found that the feature magnitudes correlate with semantic relevance, and provide a useful signal for our mechanism's attentional selection criterion. Because hard attention selects important features of the input information, it can also be more efficient than analogous soft attention mechanisms. This is especially important for recent approaches that use non-local pairwise operations, whereby computational and memory costs are quadratic in the size of the set of features. Comment: ECCV 201
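
The selection criterion described in this abstract can be illustrated with a minimal sketch. It assumes that "feature magnitude" means the L2 norm of each feature vector and that selection keeps the `k` largest; the function name `hard_attention_top_k`, the value of `k`, and the toy inputs are hypothetical and not taken from the paper.

```python
import math


def hard_attention_top_k(features, k):
    """Keep the k feature vectors with the largest L2 norm and drop the rest.

    features: list of equal-length lists of floats, one vector per location.
    Returns (kept_indices, kept_vectors), with indices in their original order.
    This is an illustrative assumption, not the authors' implementation.
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    # Magnitude of each feature vector (assumed to be the L2 norm).
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in features]
    # Rank locations by magnitude, largest first, then restore input order.
    ranked = sorted(range(len(features)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:k])
    return kept, [features[i] for i in kept]


# Toy input: four locations with two-dimensional features (hypothetical values).
features = [[0.1, 0.0], [3.0, 4.0], [0.5, 0.5], [1.0, 2.0]]
kept_indices, kept_vectors = hard_attention_top_k(features, k=2)
```

Under these assumptions, a pairwise operation applied to the retained set touches on the order of k² pairs rather than n², which is the efficiency argument the abstract makes.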

    Single microtubules and small networks become significantly stiffer on short time-scales upon mechanical stimulation

    The transfer of mechanical signals through cells is a complex phenomenon. To uncover a new mechanotransduction pathway, we study the frequency-dependent transport of mechanical stimuli by single microtubules and small networks in a bottom-up approach using optically trapped beads as anchor points. We interconnected microtubules to linear and triangular geometries to perform micro-rheology by defined oscillations of the beads relative to each other. We found a substantial stiffening of single filaments above a characteristic transition frequency of 1-30 Hz depending on the filament's molecular composition. Below this frequency, filament elasticity only depends on its contour and persistence length. Interestingly, this elastic behavior is transferable to small networks, where we found the surprising effect that linear two-filament connections act as transistor-like, angle-dependent momentum filters, whereas triangular networks act as stabilizing elements. These observations imply that cells can tune mechanical signals by temporal and spatial filtering more strongly and more flexibly than expected.

    The exit velocity of a compressed air cannon

    The use of compressed air cannons in an undergraduate lab provides a way to illustrate the cooperation of diverse physics concepts, such as conservation of momentum, the work-kinetic energy theorem, expansion of gas, air drag, and elementary Newtonian mechanics. However, recent proposals have disagreed as to whether the expansion of the gas in the cannon should be modeled as an adiabatic or an isothermal process. We built an air cannon that utilized a diaphragm valve to release our pressurized gas and found that neither model accurately predicted the exit velocity of our projectile. We present a new model, based on the flow of air through the valve, that is in much better agreement with our data.
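
The two textbook models this abstract contrasts can be sketched with the work-kinetic energy theorem. The sketch assumes an ideal gas, an instantly opening valve, and no friction or air drag; it does not reproduce the authors' valve-flow model. The function name `exit_velocity` and all numerical values are hypothetical.

```python
import math


def exit_velocity(p0, v0, area, length, mass, p_atm=101_325.0, gamma=None):
    """Projectile exit speed (m/s) from the work done by the expanding gas.

    p0: initial absolute reservoir pressure (Pa); v0: reservoir volume (m^3)
    area: barrel cross-section (m^2); length: barrel length (m); mass in kg.
    gamma=None selects the isothermal model; otherwise the adiabatic model
    with that heat-capacity ratio. Friction and drag are neglected.
    """
    v_final = v0 + area * length  # gas volume when the projectile exits
    if gamma is None:
        # Isothermal expansion: W = p0 * V0 * ln(Vf / V0)
        gas_work = p0 * v0 * math.log(v_final / v0)
    else:
        # Adiabatic expansion: W = p0 * V0 / (gamma - 1) * (1 - (V0/Vf)^(gamma - 1))
        gas_work = p0 * v0 / (gamma - 1.0) * (1.0 - (v0 / v_final) ** (gamma - 1.0))
    # The atmosphere ahead of the projectile does negative work on it.
    net_work = gas_work - p_atm * area * length
    if net_work <= 0.0:
        return 0.0
    return math.sqrt(2.0 * net_work / mass)


# Hypothetical cannon: 500 kPa, 1 L reservoir, 5 cm^2 bore, 1 m barrel, 20 g ball.
params = dict(p0=500e3, v0=1e-3, area=5e-4, length=1.0, mass=0.02)
v_isothermal = exit_velocity(**params)
v_adiabatic = exit_velocity(**params, gamma=1.4)
```

For the same inputs the isothermal model predicts the higher speed, because the gas is assumed to stay at constant temperature and so does more work during the expansion.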

    An optically actuated surface scanning probe

    We demonstrate the use of an extended, optically trapped probe that is capable of imaging surface topography with nanometre precision, whilst applying ultra-low forces on the femtonewton scale. This degree of precision and sensitivity is acquired through three distinct strategies. First, the probe itself is shaped in such a way as to soften the trap along the sensing axis and stiffen it in transverse directions. Next, these characteristics are enhanced by selectively position clamping independent motions of the probe. Finally, force clamping is used to refine the surface contact response. Detailed analyses are presented for each of these mechanisms. To test our sensor, we scan it laterally over a calibration sample consisting of a series of graduated steps, and demonstrate a height resolution of ∼ 11 nm. Using equipartition theory, we estimate that an average force of only ∼ 140 fN is exerted on the sample during the scan, making this technique ideal for the investigation of delicate biological samples.

    Ontogeny of ependymoglial cells lining the third ventricle in mice.

    During hypothalamic development, the germinative neuroepithelium gives birth to diverse neural cells that regulate numerous physiological functions in adulthood. Here, we studied the ontogeny of ependymal cells in the mouse mediobasal hypothalamus using the BrdU approach and publicly available single-cell RNAseq datasets. We observed that while typical ependymal cells are mainly produced at E13, tanycyte birth depends on time and subtypes and lasts up to P8. Typical ependymocytes and β tanycytes are the first to arise at the top and bottom of the dorsoventral axis around E13, whereas α tanycytes emerge later in development, generating an outside-in dorsoventral gradient along the third ventricle. Additionally, α tanycyte generation displayed a rostral-to-caudal pattern. Finally, tanycytes mature progressively until they reach transcriptional maturity between P4 and P14. Altogether, these data show that ependyma generation differs in time and distribution, highlighting the heterogeneity of the third ventricle.