183 research outputs found

    Sintering and Properties of Nb4AlC3 Ceramic

    Field Assisted Material Engineering (FAME)

    To further improve the energy savings of Spark Plasma Sintering, we have developed a very rapid sintering technique called Flash SPS (FSPS), with heating rates on the order of 10^4 to 10^5 °C/minute [1]. Unlike Flash Sintering, which is based on high voltage (≈100 V), FSPS is based on low voltage (≈10 V) and can be scaled up to sample volumes of several tens of cubic centimetres. Flash SPS allows densification of metallic conductors like ZrB2 and HfB2 with a discharge time as short as 20-30 seconds. FSPS of semiconductors like silicon carbide and boron carbide has also been demonstrated. Highly customized and versatile equipment with ultrafast responsive controls and programmable bipolar power supplies (up to 20 kHz, 1 MA, 500 V) has been built. The developed methodology has been applied to produce FSPSed samples of ultra-refractory materials larger than 6 cm in diameter. Understanding the role of the intrinsic electrical field in the properties-microstructure-processing triangle remains one of our primary scientific goals and the main open question. We have tried to give some answers by approaching the problem at different length scales (see figure 1), developing dedicated equipment and controls, simulations (FEM and ab initio), thermo-kinetic analysis, in situ observations, and accurate temperature measurements and calibrations.

    Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification

    Local features at neighboring spatial positions in feature maps are highly correlated because their receptive fields often overlap. Self-attention usually computes the weight score of each local feature from a weighted sum (or another function) of that feature's own elements, which ignores interactions among local features. To address this, we propose an effective interaction-aware self-attention model, inspired by PCA, to learn attention maps. Furthermore, since different layers in a deep network capture feature maps at different scales, we use these feature maps to construct a spatial pyramid and then use the multi-scale information to obtain more accurate attention scores, which weight the local features at all spatial positions of the feature maps to calculate attention maps. Moreover, our spatial pyramid attention places no restriction on the number of input feature maps, so it is easily extended to a spatio-temporal version. Finally, our model is embedded in general CNNs to form end-to-end attention networks for action classification. Experimental results show that our method achieves state-of-the-art results on UCF101, HMDB51 and untrimmed Charades. Comment: Accepted by ECCV201

    EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

    Current metrics for video captioning are mostly based on text-level comparison between reference and candidate captions. However, they have some insuperable drawbacks: they cannot handle videos without references, and they may produce biased evaluations because of the one-to-many nature of video-to-text and the neglect of visual relevance. From a human evaluator's viewpoint, a high-quality caption should be consistent with the provided video, but need not be similar to the reference literally or semantically. Inspired by human evaluation, we propose EMScore (Embedding Matching-based score), a novel reference-free metric for video captioning, which directly measures similarity between the video and candidate captions. Benefiting from the recent development of large-scale pre-training models, we exploit a well pre-trained vision-language model to extract visual and linguistic embeddings for computing EMScore. Specifically, EMScore combines matching scores at both the coarse-grained (video and caption) and fine-grained (frames and words) levels, which takes both the overall understanding and the detailed characteristics of the video into account. Furthermore, considering the potential information gain, EMScore can be flexibly extended to conditions where human-labeled references are available. Last but not least, we collect the VATEX-EVAL and ActivityNet-FOIL datasets to systematically evaluate the existing metrics. VATEX-EVAL experiments demonstrate that EMScore has higher human correlation and lower reference dependency. The ActivityNet-FOIL experiment verifies that EMScore can effectively identify "hallucinating" captions. The datasets will be released to facilitate the development of video captioning metrics. The code is available at: https://github.com/ShiYaya/emscore. Comment: cvpr202
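
The coarse-plus-fine matching idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). The function name `embedding_matching_score`, the mean pooling of frame embeddings, the greedy max-cosine matching, the F-score combination, and the equal averaging of the two levels are all assumptions made for illustration, and the embeddings are plain Python lists standing in for the output of a vision-language model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def embedding_matching_score(frame_embs, word_embs, caption_emb):
    """Hypothetical reference-free score combining two matching levels.

    Coarse level: whole video (assumed here to be the mean of the frame
    embeddings) against the whole-caption embedding.
    Fine level: each word greedily matched to its most similar frame
    (precision) and each frame to its most similar word (recall).
    """
    coarse = cosine(mean_vector(frame_embs), caption_emb)

    precision = sum(
        max(cosine(w, f) for f in frame_embs) for w in word_embs
    ) / len(word_embs)
    recall = sum(
        max(cosine(f, w) for w in word_embs) for f in frame_embs
    ) / len(frame_embs)
    total = precision + recall
    fine = 2 * precision * recall / total if total > 0 else 0.0

    # Equal weighting of the two levels is an assumption of this sketch.
    return (coarse + fine) / 2


# Toy usage: two frames and two words with perfectly aligned embeddings.
frames = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
caption = [1.0, 1.0]
score = embedding_matching_score(frames, words, caption)
```

Because no reference caption appears anywhere in the computation, the score depends only on the video and the candidate, which is the property the abstract emphasizes.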

    Human Action Recognition Using Pyramid Vocabulary Tree

    Abstract. Bag-of-visual-words (BOVW) approaches are widely used in human action recognition. Usually, a large BOVW vocabulary is more discriminative for inter-class action classification, while a small one is more robust to noise and thus more tolerant of intra-class variance. In this paper, we propose a pyramid vocabulary tree to model local spatio-temporal features, which can characterize inter-class differences while also allowing intra-class variance. Moreover, since BOVW is geometrically unconstrained, we further consider the spatio-temporal information of local features and propose a sparse spatio-temporal pyramid matching kernel (termed SST-PMK) to compute similarity measures between video sequences. SST-PMK satisfies Mercer's condition and is therefore readily integrated into an SVM to perform action recognition. Experimental results on the Weizmann datasets show that both the pyramid vocabulary tree and the SST-PMK lead to a significant improvement in human action recognition. Keywords: Action recognition, Bag-of-visual-words (BOVW), Pyramid matching kernel (PMK)
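
As a rough illustration of how a vocabulary pyramid trades discriminative power against robustness, the sketch below computes a generic pyramid match score over visual-word histograms. It is the standard pyramid match formulation (new histogram-intersection matches at each coarser level, weighted by 1/2^level), not the SST-PMK proposed in the abstract. The binary vocabulary tree, in which a word's parent is obtained by dropping its lowest bit, and the function names are assumptions made for illustration.

```python
from collections import Counter


def histogram_intersection(h1, h2):
    """Number of matches between two count histograms."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def pyramid_match(words_a, words_b, depth):
    """Generic pyramid match score between two bags of visual words.

    words_a, words_b: integer word ids at the finest vocabulary level.
    depth: number of coarsening steps; level 0 is the finest vocabulary.
    Assumes a binary vocabulary tree, so the ancestor of a word at a
    given level is the word id shifted right by that many bits.
    """
    score = 0.0
    previous = 0
    for level in range(depth + 1):
        hist_a = Counter(w >> level for w in words_a)
        hist_b = Counter(w >> level for w in words_b)
        matches = histogram_intersection(hist_a, hist_b)
        # Only matches that first appear at this level count, and they
        # are discounted because coarser words are less discriminative.
        score += (matches - previous) / (2 ** level)
        previous = matches
    return score


# Toy usage: two clips described by three visual words each.
similarity = pyramid_match([0, 1, 2], [0, 3, 3], depth=2)
```

In the toy example one pair of words matches at the finest level, one more after the first merge, and the last after the second merge, so matches found only in small (coarse) vocabularies contribute less than matches found in the large one.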