68 research outputs found

    Low-carbon scenario analysis on urban transport of one metropolitan in China in 2020

    Purpose: This paper discusses possible ways of implementing effective energy conservation and GHG emission reduction measures by providing: forecasts of the mid-to-long term city-wide carbon emission rate; and an analysis of potential low-carbon transport solutions. Design/methodology/approach: According to the characteristics of the transport system in Beijing, and based on a review and application analysis of existing transport energy and GHG emission calculation models, a comprehensive carbon emission calculation model was established. Existing data were utilized with regression analysis to project the prospective traffic data in the baseline scenario for the target year of 2020 and to calculate the emission amount. Four low-carbon scenarios were set in accordance with the goal of “low carbon transportation, green trip”, and the effectiveness of each low-carbon scenario was evaluated by comparing it with the baseline scenario in terms of the respective GHG emission rate. Findings: Under the current development trend in policy environment and technical specifications, the total projected GHG (CO2) emissions from the transport sector in Beijing in 2020 will reach 24.69 million t CO2; private vehicles are the major contributor among all transport modes at 15.96 million t CO2. Practical implications: Limiting the growth in private-vehicle ownership, reducing the frequency of mid-to-long range travel and the average trip distance, and promoting public-transit-oriented policies are all possible ways to reduce carbon emissions. The most effective practice involves a shift in public travel behavior. Originality/value: This paper presents a method to forecast the mid-to-long term city-wide carbon emission rate and provides some potential low-carbon transport solutions. Peer Reviewed
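
    The scenario comparison described above can be illustrated as a simple bottom-up calculation: per-mode emissions are trips × average trip distance × per-passenger-km emission factor, summed over modes, with each low-carbon scenario adjusting mode shares or trip lengths relative to the baseline. The sketch below follows that idea only; the mode names, trip counts, and emission factors are illustrative assumptions, not figures from the paper.

    ```python
    # Minimal sketch of a bottom-up transport CO2 scenario comparison.
    # All figures are illustrative placeholders, not data from the study.

    BASELINE = {
        # mode: (daily trips, avg trip distance in km, kg CO2 per passenger-km)
        "private_vehicle": (10_000_000, 12.0, 0.18),
        "bus":             (12_000_000,  8.0, 0.05),
        "rail":            ( 8_000_000, 10.0, 0.03),
    }

    def annual_emissions(modes, days=365):
        """Total annual CO2 in million tonnes for a given mode split."""
        kg = sum(trips * dist * ef * days for trips, dist, ef in modes.values())
        return kg / 1e9  # kg -> million tonnes

    def shift_to_transit(modes, share):
        """Scenario: move a share of private-vehicle trips onto rail."""
        out = dict(modes)
        car_trips, car_dist, car_ef = out["private_vehicle"]
        rail_trips, rail_dist, rail_ef = out["rail"]
        moved = car_trips * share
        out["private_vehicle"] = (car_trips - moved, car_dist, car_ef)
        out["rail"] = (rail_trips + moved, rail_dist, rail_ef)
        return out

    if __name__ == "__main__":
        base = annual_emissions(BASELINE)
        low_carbon = annual_emissions(shift_to_transit(BASELINE, 0.20))
        print(f"baseline:  {base:.2f} Mt CO2")
        print(f"scenario:  {low_carbon:.2f} Mt CO2 ({base - low_carbon:.2f} Mt saved)")
    ```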

    Analysis on Alighting and Boarding Movement Laws in Subway Using Modified Social Force Model

    This paper presents a multi-agent simulator based on the social force model to simulate each passenger's boarding and alighting behavior both in a train and on a platform seamlessly. Passengers are divided into three types: those boarding, those alighting, and those staying in the train. They have different individual attributes and follow different walking rules. Owing to the characteristics of the subway environment and of passengers' behavior during boarding and alighting, several adjustments and improvements were made to the basic social force model: (1) In some cases during boarding and alighting, the driving force toward the destination needs to be doubled, and the repulsive force between two agents needs to be reduced. (2) Passengers who stay in the train move quite differently from ordinary pedestrians: they usually want to remain still unless they are in front of the door. To describe their behavior, we introduce a tangent detour force; the scope of interaction between agents is also extended so that some passengers outside the visual field are counted. (3) The repulsive force between an agent and an obstacle is divided into a frontal force and a convex-corner force, which have different spheres of influence and calculation methods. With these modifications, the agents exhibit reasonable intelligence and diversity during alighting and boarding.
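
    To make the kind of adjustment described above concrete, the sketch below shows a basic social-force update step with a doubled driving force and weakened inter-agent repulsion during boarding/alighting. It is a minimal illustration of the general mechanism, not the paper's model; all parameter values are assumptions.

    ```python
    # Illustrative social-force update with boarding/alighting adjustments:
    # stronger driving force toward the door, weaker repulsion between agents.
    import numpy as np

    TAU = 0.5        # relaxation time in seconds (assumed)
    A, B = 2.0, 0.3  # repulsion strength / range (assumed)

    def driving_force(pos, vel, goal, v_desired=1.34, boarding=True):
        e = (goal - pos) / (np.linalg.norm(goal - pos) + 1e-9)
        f = (v_desired * e - vel) / TAU
        return 2.0 * f if boarding else f       # doubled during boarding/alighting

    def agent_repulsion(pos_i, pos_j, boarding=True):
        d = np.linalg.norm(pos_i - pos_j)
        strength = 0.5 * A if boarding else A   # reduced repulsion near the door
        n = (pos_i - pos_j) / (d + 1e-9)
        return strength * np.exp(-d / B) * n

    def step(pos, vel, goal, others, dt=0.1):
        f = driving_force(pos, vel, goal)
        for p in others:
            f += agent_repulsion(pos, p)
        vel = vel + f * dt
        return pos + vel * dt, vel

    # one agent heading for the train door past two standing passengers
    pos, vel = step(np.array([0.0, 0.0]), np.zeros(2), np.array([3.0, 0.0]),
                    [np.array([1.0, 0.3]), np.array([1.5, -0.2])])
    ```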

    RepViT: Revisiting Mobile CNN From ViT Perspective

    Recently, lightweight Vision Transformers (ViTs) have demonstrated superior performance and lower latency compared with lightweight Convolutional Neural Networks (CNNs) on resource-constrained mobile devices. This improvement is usually attributed to the multi-head self-attention module, which enables the model to learn global representations. However, the architectural disparities between lightweight ViTs and lightweight CNNs have not been adequately examined. In this study, we revisit the efficient design of lightweight CNNs and emphasize their potential for mobile devices. We incrementally enhance the mobile-friendliness of a standard lightweight CNN, specifically MobileNetV3, by integrating the efficient architectural choices of lightweight ViTs. This results in a new family of pure lightweight CNNs, namely RepViT. Extensive experiments show that RepViT outperforms existing state-of-the-art lightweight ViTs and exhibits favorable latency across various vision tasks. On ImageNet, RepViT achieves over 80% top-1 accuracy with nearly 1ms latency on an iPhone 12, which, to the best of our knowledge, is the first time for a lightweight model. Our largest model, RepViT-M3, obtains 81.4% accuracy with only 1.3ms latency. The code and trained models are available at https://github.com/jameslahm/RepViT. Comment: 9 pages, 7 figures
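
    The general recipe the abstract describes, a mobile CNN block reorganized along ViT lines into a depthwise "token mixer" followed by a pointwise "channel mixer" with residual connections, can be sketched roughly as below. This follows the high-level idea only and is not the released RepViT architecture; see the linked repository for the exact blocks.

    ```python
    # Rough PyTorch sketch of a RepViT-style block: a depthwise 3x3 token mixer
    # with an identity shortcut (mergeable at inference via reparameterization),
    # followed by a 1x1-conv FFN channel mixer with a residual connection.
    import torch
    import torch.nn as nn

    class RepDWBlock(nn.Module):
        def __init__(self, dim, ffn_ratio=2):
            super().__init__()
            # token mixer: depthwise conv + BN, parallel to an identity branch
            self.dw = nn.Conv2d(dim, dim, 3, padding=1, groups=dim, bias=False)
            self.bn = nn.BatchNorm2d(dim)
            # channel mixer: 1x1 expand -> GELU -> 1x1 project
            hidden = dim * ffn_ratio
            self.ffn = nn.Sequential(
                nn.Conv2d(dim, hidden, 1), nn.GELU(), nn.Conv2d(hidden, dim, 1)
            )

        def forward(self, x):
            x = x + self.bn(self.dw(x))   # reparameterizable branch + identity
            x = x + self.ffn(x)           # residual channel mixing
            return x

    x = torch.randn(1, 64, 56, 56)
    print(RepDWBlock(64)(x).shape)        # torch.Size([1, 64, 56, 56])
    ```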

    Investigating the effectiveness of coacervates produced from conjugated and unconjugated Spirulina protein in delivering unstable oil to the intestinal phase of digestion

    This study investigated the potential of complex coacervates produced using Spirulina protein concentrate (SPC) conjugated with maltodextrin (MD) and carrageenan (CG) for encapsulating and delivering sensitive oils. A wet-heating Maillard reaction was employed to conjugate SPC with MD, followed by coacervation with CG to form the conjugate-based coacervates. Additionally, a mixture of unconjugated SPC and MD was coacervated with CG to produce mixture-based coacervates. Both types of coacervates were utilised as wall materials for encapsulating canola oil. The in-vitro digestion of the resulting microcapsules was assessed in the oral, gastric, and intestinal phases, focusing on physicochemical parameters such as droplet size, zeta-potential, microstructure, proteolysis, oil release, and lipolysis. The findings revealed that microcapsules prepared using both (SPC-MD mixture)-CG and (SPC-MD conjugate)-CG coacervates were remarkably stable against gastric digestion, as evidenced by the minimal production of free amino acids (15 mM). Most of the encapsulated oil (62–67%) was released during the intestinal phase due to the breakdown of the coacervates. Notably, the microcapsules produced with (SPC-MD conjugate)-CG coacervates demonstrated a lower degree of lipolysis (41.77% free fatty acid content) compared to those prepared with (SPC-MD mixture)-CG coacervates (53.35% free fatty acid content). These results highlight the potential of complex coacervates produced using conjugated SPC as promising materials for the encapsulation and delivery of sensitive oils.

    IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval

    Enabling bi-directional retrieval of images and texts is important for understanding the correspondence between vision and language. Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner. However, most of them consider all semantics equally and thus align them uniformly, regardless of their diverse complexities. In fact, semantics are diverse (i.e., involving different kinds of semantic concepts), and humans usually follow a latent structure to combine them into understandable language. It may be difficult for existing methods to optimally capture such sophisticated correspondences. In this paper, to address this deficiency, we propose an Iterative Matching with Recurrent Attention Memory (IMRAM) method, in which correspondences between images and texts are captured with multiple steps of alignment. Specifically, we introduce an iterative matching scheme to explore such fine-grained correspondence progressively. A memory distillation unit is used to refine alignment knowledge from early steps to later ones. Experimental results on three benchmark datasets, i.e., Flickr8K, Flickr30K, and MS COCO, show that our IMRAM achieves state-of-the-art performance, demonstrating its effectiveness. Experiments on a practical business advertisement dataset, named \Ads{}, further validate the applicability of our method in practical scenarios. Comment: 9 pages; accepted by CVPR 2020
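
    The iterative matching idea can be sketched in a simplified form: text fragments attend over image regions, a gated "memory" update refines the query, and per-step matching scores are accumulated. The sketch below illustrates that loop only; the tensor shapes, gating form, and temperature are assumptions rather than the paper's exact formulation.

    ```python
    # Simplified sketch of iterative matching with a gated memory refinement.
    import torch
    import torch.nn.functional as F

    def attend(query, regions, temperature=9.0):
        """query: (n_words, d), regions: (n_regions, d) -> context (n_words, d)."""
        sim = F.normalize(query, dim=-1) @ F.normalize(regions, dim=-1).t()
        attn = F.softmax(temperature * sim, dim=-1)
        return attn @ regions

    def memory_update(query, context, W):
        """Gated fusion of the current query with its attended context."""
        gate = torch.sigmoid(torch.cat([query, context], dim=-1) @ W)
        return gate * query + (1 - gate) * context

    def iterative_match(words, regions, steps=3):
        d = words.size(-1)
        W = torch.randn(2 * d, d) * 0.02   # stands in for a learned projection
        score, query = 0.0, words
        for _ in range(steps):
            ctx = attend(query, regions)
            score = score + F.cosine_similarity(query, ctx, dim=-1).mean()
            query = memory_update(query, ctx, W)  # refine alignment for next step
        return score

    score = iterative_match(torch.randn(12, 256), torch.randn(36, 256))
    ```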

    GRN: Gated Relation Network to Enhance Convolutional Neural Network for Named Entity Recognition

    The dominant approaches for named entity recognition (NER) mostly adopt complex recurrent neural networks (RNNs), e.g., long short-term memory (LSTM). However, RNNs are limited by their recurrent nature in terms of computational efficiency. In contrast, convolutional neural networks (CNNs) can fully exploit GPU parallelism with their feedforward architectures. Yet little attention has been paid to performing NER with CNNs, mainly owing to their difficulty in capturing long-term context information in a sequence. In this paper, we propose a simple but effective CNN-based network for NER, the gated relation network (GRN), which is more capable than common CNNs of capturing long-term context. Specifically, in GRN we first employ CNNs to explore the local context features of each word. We then model the relations between words and use them as gates to fuse local context features into global ones for predicting labels. Without using recurrent layers that process a sentence sequentially, our GRN allows computations to be performed in parallel across the entire sentence. Experiments on two benchmark NER datasets (CoNLL-2003 and OntoNotes 5.0) show that our proposed GRN can achieve state-of-the-art performance with or without external knowledge. It also enjoys lower time costs to train and test. We have made the code publicly available at https://github.com/HuiChen24/NER-GRN. Comment: This paper is accepted by AAAI 2019
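
    A toy version of the gated-relation idea is sketched below: local features for each word (here from a 1D convolution) are fused into global ones through sigmoid gates computed from word pairs. The dimensions and exact gating form are assumptions, not the released GRN implementation.

    ```python
    # Toy sketch of relation gating over CNN word features for NER.
    import torch
    import torch.nn as nn

    class GatedRelationLayer(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.local = nn.Conv1d(dim, dim, kernel_size=3, padding=1)  # local context
            self.gate = nn.Linear(2 * dim, dim)                         # pairwise gate

        def forward(self, h):                    # h: (seq_len, dim)
            local = self.local(h.t().unsqueeze(0)).squeeze(0).t()       # (seq_len, dim)
            n = local.size(0)
            pairs = torch.cat([local.unsqueeze(1).expand(n, n, -1),
                               local.unsqueeze(0).expand(n, n, -1)], dim=-1)
            gates = torch.sigmoid(self.gate(pairs))                     # (n, n, dim)
            # gate each word's contribution to every other word, then fuse
            return (gates * local.unsqueeze(0)).mean(dim=1)             # (seq_len, dim)

    out = GatedRelationLayer(128)(torch.randn(20, 128))
    print(out.shape)   # torch.Size([20, 128]); feed to a token classifier for tags
    ```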

    VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

    Vision and text have been fully explored in contemporary video-text foundation models, while other modalities such as audio and subtitles in videos have not received sufficient attention. In this paper, we set out to establish connections between the multi-modality video tracks, including Vision, Audio, and Subtitle, and Text by exploring an automatically generated large-scale omni-modality video caption dataset called VAST-27M. Specifically, we first collect 27 million open-domain video clips and separately train a vision captioner and an audio captioner to generate vision and audio captions. We then employ an off-the-shelf Large Language Model (LLM) to integrate the generated captions, together with subtitles and instructional prompts, into omni-modality captions. Based on the proposed VAST-27M dataset, we train an omni-modality video-text foundation model named VAST, which can perceive and process the vision, audio, and subtitle modalities from video, and better support various tasks, including vision-text, audio-text, and multi-modal video-text tasks (retrieval, captioning, and QA). Extensive experiments demonstrate the effectiveness of the proposed VAST-27M corpus and the VAST foundation model. VAST achieves 22 new state-of-the-art results on various cross-modality benchmarks. Code, model, and dataset will be released at https://github.com/TXH-mercury/VAST. Comment: 23 pages, 5 figures
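
    The caption-fusion step described above, packing the vision caption, audio caption, and subtitle into an instruction for an off-the-shelf LLM, can be sketched as below. The prompt wording, the example clip, and the `call_llm` hook are placeholders, not the actual VAST-27M pipeline.

    ```python
    # Minimal sketch of fusing per-modality captions into one omni-modality caption.
    def build_fusion_prompt(vision_caption: str, audio_caption: str, subtitle: str) -> str:
        return (
            "Combine the following descriptions of one video clip into a single "
            "fluent caption covering what is seen, heard, and said.\n"
            f"Vision: {vision_caption}\n"
            f"Audio: {audio_caption}\n"
            f"Subtitle: {subtitle}\n"
            "Omni-modality caption:"
        )

    def call_llm(prompt: str) -> str:
        # placeholder for whatever LLM endpoint is used to generate the caption
        raise NotImplementedError

    clip = {
        "vision": "A man rides a mountain bike down a forest trail.",
        "audio": "Birds chirping and tires crunching over gravel.",
        "subtitle": "Almost at the bottom, hold on!",
    }
    prompt = build_fusion_prompt(clip["vision"], clip["audio"], clip["subtitle"])
    # omni_caption = call_llm(prompt)
    ```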

    Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

    For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). Whereas all existing methods directly transfer a source-learned model to the target language, in this paper we propose to fine-tune the learned model on a few similar examples given a test case, which can benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can quickly adapt to a given test case, and propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board. Comment: This paper is accepted by AAAI 2020. Code is available at https://github.com/microsoft/vert-papers/tree/master/papers/Meta-Cros
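
    The test-time adaptation step can be sketched schematically: retrieve the k most similar source sentences, take a few gradient steps from the meta-learned initialization on them, then predict on the test sentence. The retrieval measure, model interface, and hyperparameters below are placeholders, not the paper's exact procedure.

    ```python
    # Schematic per-test-case fine-tuning from a meta-learned initialization.
    import copy
    import torch
    import torch.nn.functional as F

    def retrieve_similar(test_emb, source_embs, k=3):
        """Indices of the k source sentences most similar to the test sentence."""
        sims = F.cosine_similarity(test_emb.unsqueeze(0), source_embs, dim=-1)
        return sims.topk(k).indices

    def adapt_and_predict(meta_model, test_x, test_emb, source, steps=5, lr=1e-3):
        idx = retrieve_similar(test_emb, source["embs"])
        model = copy.deepcopy(meta_model)          # start from the meta-learned init
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(steps):                     # fine-tune on the similar examples
            logits = model(source["x"][idx])       # (k, seq_len, n_tags)
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                                   source["y"][idx].view(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            return model(test_x).argmax(-1)        # predicted NER tag ids
    ```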

    InfoEntropy Loss to Mitigate Bias of Learning Difficulties for Generative Language Models

    Generative language models are usually pretrained on large text corpora by predicting the next token (i.e., sub-word/word/phrase) given the previous ones. Recent works have demonstrated the impressive performance of large generative language models on downstream tasks. However, existing generative language models generally neglect an inherent challenge in text corpora during training, i.e., the imbalance between frequent tokens and infrequent ones. This can lead a language model to be dominated by common, easy-to-learn tokens, thereby overlooking the infrequent, difficult-to-learn ones. To alleviate this, we propose an Information Entropy Loss (InfoEntropy Loss) function. During training, it dynamically assesses the learning difficulty of a to-be-learned token according to the information entropy of the corresponding predicted probability distribution over the vocabulary, and then scales the training loss adaptively, leading the model to focus more on difficult-to-learn tokens. On the Pile dataset, we train generative language models at scales of 468M, 1.2B, and 6.7B parameters. Experiments show that models trained with the proposed InfoEntropy Loss gain consistent performance improvements on downstream benchmarks.
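
    An entropy-weighted token loss in this spirit can be sketched as below: the cross-entropy of each token is scaled by the normalised entropy of the model's predicted distribution, up-weighting tokens the model is uncertain about. The exact weighting function used in the paper may differ.

    ```python
    # Sketch of an entropy-scaled next-token loss (assumed form, not the paper's exact one).
    import torch
    import torch.nn.functional as F

    def info_entropy_loss(logits, targets, gamma=1.0):
        """logits: (batch, seq, vocab), targets: (batch, seq)."""
        log_p = F.log_softmax(logits, dim=-1)
        p = log_p.exp()
        entropy = -(p * log_p).sum(-1)                                        # per-token entropy
        entropy = entropy / torch.log(torch.tensor(float(logits.size(-1))))  # normalise to [0, 1]
        ce = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
        return ((entropy.detach() ** gamma) * ce).mean()                     # weight harder tokens more

    loss = info_entropy_loss(torch.randn(2, 16, 32000), torch.randint(0, 32000, (2, 16)))
    ```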