264 research outputs found

    Flow-Guided Feature Aggregation for Video Object Detection

    Extending state-of-the-art object detectors from image to video is challenging. Detection accuracy suffers from degraded object appearance in videos, e.g., motion blur, video defocus, and rare poses. Existing work attempts to exploit temporal information at the box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate, end-to-end learning framework for video object detection that instead leverages temporal coherence at the feature level. It improves per-frame features by aggregating nearby features along motion paths, thereby improving video recognition accuracy. Our method significantly improves upon strong single-frame baselines on ImageNet VID, especially for the more challenging fast-moving objects. Our framework is principled and on par with the best engineered systems that won the ImageNet VID challenges 2016, without additional bells and whistles. The proposed method, together with Deep Feature Flow, powered the winning entry of the ImageNet VID challenges 2017. The code is available at https://github.com/msracver/Flow-Guided-Feature-Aggregation.
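    The aggregation idea above can be sketched in a few lines of numpy. This is a minimal illustration only, assuming nearest-neighbour warping and cosine-similarity softmax weights; the paper uses learned optical flow and embedding sub-networks, and all function names here are hypothetical:

```python
import numpy as np

def warp(feat, flow):
    """Nearest-neighbour warp of a feature map (C, H, W) by a flow
    field (2, H, W); a crude stand-in for bilinear flow warping."""
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[1]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + flow[0]).astype(int), 0, W - 1)
    return feat[:, src_y, src_x]

def aggregate(ref_feat, nearby_feats, flows, eps=1e-8):
    """Aggregate features from nearby frames along motion paths.
    Weights come from per-position cosine similarity to the reference
    frame's features, normalised with a softmax over frames."""
    warped = [warp(f, fl) for f, fl in zip(nearby_feats, flows)]
    sims = []
    for w in warped:
        num = (w * ref_feat).sum(axis=0)
        den = np.linalg.norm(w, axis=0) * np.linalg.norm(ref_feat, axis=0) + eps
        sims.append(num / den)
    sims = np.stack(sims)                                  # (K, H, W)
    weights = np.exp(sims) / np.exp(sims).sum(axis=0, keepdims=True)
    return (np.stack(warped) * weights[:, None]).sum(axis=0)
```

    With zero flow and identical nearby frames the aggregation reduces to the reference features themselves, which is the expected degenerate case.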

    Decision Making on Government Subsidy for Highway Public-Private Partnership Projects in China Using an Iteration Game Model

    Government subsidy is an important fiscal-expenditure responsibility in public-private partnership (PPP) projects. However, an improper subsidy strategy may cause over-compensation or under-compensation. In this research, an iteration game model combining game theory and real options theory is established to describe the periodic decision-making process. The strategy game model characterizes the behavioral interactions between stakeholders, and real options theory is used to predict project performance under the influence of their decisions. In addition, two new indicators, the efficiency of fund (SE) and the total extra cost paid by the private sector (ME), are proposed to evaluate the extra project revenue generated by each unit of subsidy and the incentive effects of the subsidy. The preliminary results indicate that periodic, iterative negotiations over the subsidy effectively improve the efficiency of fund compared with the traditional approach. The results also show that it is important for the public sector to provide incentives that encourage the private sector to put more effort into the project, rather than merely providing funding support. Further study will focus on more detailed and complicated stakeholder behaviors based on the model proposed in this paper.

    Bright Soliton Solution of (1+1)-Dimensional Quantum System with Power-Law Dependent Nonlinearity

    We study the nonlinear dynamics of a (1+1)-dimensional quantum system in power-law dependent media, based on the nonlinear Schrödinger equation (NLSE) incorporating power-law dependent nonlinearity, linear attenuation, self-steepening terms, and a third-order dispersion term. The analytical bright soliton solution of this NLSE is derived via the F-expansion method. The key features of the bright soliton solution are demonstrated pictorially, which, together with the typical analytical formulation of the soliton solution, shows the applicability of our theoretical treatment.
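    A generic higher-order NLSE containing the four ingredients listed above (power-law nonlinearity, linear attenuation, self-steepening, third-order dispersion) can be written as follows; the coefficient names are placeholders, not the paper's notation:

```latex
i\,\frac{\partial u}{\partial z}
  + \frac{a}{2}\,\frac{\partial^{2} u}{\partial t^{2}}
  + b\,|u|^{2n}\,u
  + i\,c\,u
  + i\,s\,\frac{\partial}{\partial t}\!\left(|u|^{2n}u\right)
  + i\,d\,\frac{\partial^{3} u}{\partial t^{3}} = 0
```

    Here $b\,|u|^{2n}u$ is the power-law nonlinearity, $i c u$ the linear attenuation, the $s$-term the self-steepening, and the $d$-term the third-order dispersion.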

    Analysis of the influence of side wall opening on the arch structure of metro station using the PBA method

    To meet traffic and commercial needs, it is sometimes necessary to open the side wall of a metro station, yet research on the mechanical properties and safety of the arch affected by such openings in stations built with the pile-beam-arch (PBA) method is scarce. In this paper, based on the Tianhe East Station project of Guangzhou Metro Line 11, located in soft-hard uneven strata and constructed with the PBA method, the settlement law and mechanical characteristics of the arch under different side-wall opening conditions are analyzed, and the influence of opening construction and opening span on arch safety is further studied. The results show that the settlement caused by opening the side wall is mainly concentrated in the upper part of the opening area and gradually expands around it as the opening span increases, with the maximum settlement occurring in the middle of the arch. Opening leads to differential settlement at the two ends of the arch; as the opening span increases, settlement grows faster on the right side of the arch than on the left. Opening the side wall increases the safety factor of the arch body and decreases the safety factor of the right arch foot, while the safety factor of the left arch foot changes little; all safety factors meet the specification requirements.

    Multi-Vector Retrieval as Sparse Alignment

    Multi-vector retrieval models improve over single-vector dual encoders on many information retrieval tasks. In this paper, we cast the multi-vector retrieval problem as sparse alignment between query and document tokens. We propose AligneR, a novel multi-vector retrieval model that learns sparsified pairwise alignments between query and document tokens (e.g. `dog' vs. `puppy') and per-token unary saliences reflecting their relative importance for retrieval. We show that controlling the sparsity of pairwise token alignments often brings significant performance gains. While most factoid questions focusing on a specific part of a document require a smaller number of alignments, others requiring a broader understanding of a document favor a larger number of alignments. Unary saliences, on the other hand, decide whether a token ever needs to be aligned with others for retrieval (e.g. `kind' from `kind of currency is used in new zealand'). With sparsified unary saliences, we are able to prune a large number of query and document token vectors and improve the efficiency of multi-vector retrieval. We learn the sparse unary saliences with entropy-regularized linear programming, which outperforms other methods at achieving sparsity. In a zero-shot setting, AligneR scores 51.1 points nDCG@10, achieving a new retriever-only state-of-the-art on 13 tasks in the BEIR benchmark. In addition, adapting pairwise alignments with a few examples (<= 8) further improves the performance by up to 15.7 points nDCG@10 for argument retrieval tasks. The unary saliences of AligneR help us keep only 20% of the document token representations with minimal performance loss. We further show that our model often produces interpretable alignments and significantly improves its performance when initialized from larger language models.
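    The scoring idea behind sparse token alignment can be sketched as follows. This is a hypothetical, non-learned stand-in: a hard top-k over the salience-weighted similarity matrix replaces the paper's learned, entropy-regularized sparsity:

```python
import numpy as np

def sparse_alignment_score(q_vecs, d_vecs, q_sal, d_sal, k=2):
    """Score a (query, document) pair by sparse token alignment.
    q_vecs: (m, dim) query token vectors; d_vecs: (n, dim) document
    token vectors; q_sal/d_sal: per-token salience weights in [0, 1].
    Keeps only the k strongest pairwise alignments overall."""
    sim = q_vecs @ d_vecs.T                    # (m, n) token-pair similarities
    sim = sim * np.outer(q_sal, d_sal)         # down-weight low-salience tokens
    flat = np.sort(sim.ravel())[::-1]          # strongest alignments first
    return float(flat[:k].sum())
```

    Setting a token's salience to zero removes all of its alignments, which is how pruning token representations for efficiency would fall out of this formulation.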

    Characterization of cellulase production by carbon sources in two Bacillus species

    The induction of cellulase production in two Bacillus spp. was studied by measuring cellulase activities under different carbon sources. The results indicate that cellulase could not be induced by cellulosic material as a sole carbon source. Instead, it could be induced by monosaccharides or disaccharides with a reducing group. Moreover, the expression of the cellulase components was synergistic. When cell wall/envelope enzyme and endoenzyme from the two Bacillus spp. acted on these inducers, analysis of the reaction products by high performance liquid chromatography (HPLC) revealed that the enzymes were inactive on the inducers. This indicates that the inducers entered the cells directly and served the function of induction.

    Keywords: Bacillus, cellulase, induction, carbon source

    PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs

    Pre-training large models is increasingly prevalent, driven by ever-growing user-generated content across many machine learning application categories. It has been recognized that learning contextual knowledge from datasets depicting user-content interaction plays a vital role in downstream tasks. Despite several studies attempting to learn contextual knowledge via pre-training methods, finding an optimal training objective and strategy for this type of task remains a challenging problem. In this work, we contend that there are two distinct aspects of contextual knowledge, namely the user side and the content side, for datasets where user-content interaction can be represented as a bipartite graph. To learn contextual knowledge, we propose a pre-training method that learns a bi-directional mapping between the spaces of the user side and the content side. We formulate the training goal as a contrastive learning task and propose a dual-Transformer architecture to encode the contextual knowledge. We evaluate the proposed method on the recommendation task. The empirical studies demonstrate that the proposed method outperforms all baselines with significant gains.
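    A common way to formulate such a contrastive objective over paired user-side and content-side encodings is an in-batch InfoNCE-style loss. The sketch below is a generic version of that idea, not the paper's exact objective:

```python
import numpy as np

def in_batch_contrastive_loss(user_emb, content_emb, tau=0.1):
    """In-batch contrastive loss between user-side and content-side
    encodings: row i of each matrix is a positive pair, and all other
    rows in the batch serve as in-batch negatives."""
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    c = content_emb / np.linalg.norm(content_emb, axis=1, keepdims=True)
    logits = (u @ c.T) / tau                        # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())        # positives on the diagonal
```

    When the two sides agree perfectly on each pair and disagree on all others, the loss approaches zero, which is the intended bi-directional mapping between the two spaces.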

    FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data

    Sequential tabular data is one of the most commonly used data types in real-world applications. Unlike conventional tabular data, where rows in a table are independent, sequential tabular data contains rich contextual and sequential information, where some fields change dynamically over time and others are static. Existing transformer-based approaches for sequential tabular data overlook the differences between dynamic and static fields by replicating and filling static fields into each transformer input, and ignore temporal information between rows, which leads to three major disadvantages: (1) computational overhead, (2) artificially simplified data for the masked language modeling pre-training task, which may yield less meaningful representations, and (3) disregard of the temporal behavioral patterns implied by time intervals. In this work, we propose FATA-Trans, a model with two field transformers for modeling sequential tabular data, where static and dynamic field information are processed separately. FATA-Trans is field- and time-aware for sequential tabular data. The field-type embedding enables FATA-Trans to capture differences between static and dynamic fields, while the time-aware position embedding exploits both order and time-interval information between rows, helping the model detect underlying temporal behavior in a sequence. Our experiments on three benchmark datasets demonstrate that the learned representations from FATA-Trans consistently outperform state-of-the-art solutions on the downstream tasks. We also present visualization studies to highlight the insights captured by the learned representations, enhancing our understanding of the underlying data. Our code is available at https://github.com/zdy93/FATA-Trans. Comment: This work is accepted by the ACM International Conference on Information and Knowledge Management (CIKM) 202
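    One simple way to realize a time-aware position embedding is to devote half the embedding dimensions to the row's order index and half to the (log-scaled) time gap since the previous row, each encoded sinusoidally. This is a hypothetical sketch of the idea, not FATA-Trans's exact formulation:

```python
import numpy as np

def time_aware_position_embedding(timestamps, dim=8):
    """Sinusoidal embedding of both row order and inter-row time gaps.
    The first dim/2 dimensions encode the position index; the rest
    encode log1p of the time interval since the previous row."""
    ts = np.asarray(timestamps, dtype=float)
    gaps = np.diff(ts, prepend=ts[0])                 # first gap is 0
    half = dim // 2
    freqs = 1.0 / (10000 ** (np.arange(half // 2) * 2.0 / half))

    def sincos(values):
        ang = values[:, None] * freqs[None, :]
        return np.concatenate([np.sin(ang), np.cos(ang)], axis=1)

    pos_part = sincos(np.arange(len(ts), dtype=float))
    gap_part = sincos(np.log1p(gaps))
    return np.concatenate([pos_part, gap_part], axis=1)  # (T, dim)
```

    Two rows with equal time gaps share the gap half of the embedding but still differ in the position half, so the model can tell order and elapsed time apart.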

    Multitask Learning for Time Series Data with 2D Convolution

    Multitask learning (MTL) aims to develop a unified model that can handle a set of closely related tasks simultaneously. By optimizing the model across multiple tasks, MTL generally surpasses its non-MTL counterparts in generalizability. Although MTL has been extensively researched in domains such as computer vision, natural language processing, and recommendation systems, its application to time series data has received limited attention. In this paper, we investigate the application of MTL to the time series classification (TSC) problem. However, when we integrate the state-of-the-art 1D convolution-based TSC model with MTL, the performance of the TSC model actually deteriorates. Comparing the 1D convolution-based models against the Dynamic Time Warping (DTW) distance function suggests that the underwhelming results stem from the limited expressive power of the 1D convolutional layers. To overcome this challenge, we propose a novel design for a 2D convolution-based model that enhances the model's expressiveness. Leveraging this advantage, our proposed method outperforms competing approaches on both the UCR Archive and an industrial transaction TSC dataset.
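    The DTW baseline referred to above is the classic dynamic-programming alignment distance; a minimal reference implementation (squared point-wise cost, no warping window) looks like this:

```python
import numpy as np

def dtw_distance(a, b):
    """O(len(a)*len(b)) dynamic-programming DTW distance between two
    1-D series: each cell accumulates the cheapest warping path that
    aligns prefixes of a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```

    Because the path may repeat indices, series that differ only by local time stretching (e.g. [0, 0, 1, 1] vs. [0, 1]) still get distance zero, which is exactly the invariance 1D convolutions struggle to express.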

    Toward a Foundation Model for Time Series Data

    A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised pre-training techniques, that can be adapted to various downstream tasks. However, current research on time series pre-training has predominantly focused on models trained exclusively on data from a single domain. As a result, these models possess domain-specific knowledge that may not be easily transferable to time series from other domains. In this paper, we aim to develop an effective time series foundation model by leveraging unlabeled samples from multiple domains. To achieve this, we repurposed the publicly available UCR Archive and evaluated four existing self-supervised learning-based pre-training methods, along with a novel method, on the datasets. We tested these methods using four popular neural network architectures for time series to understand how the pre-training methods interact with different network designs. Our experimental results show that pre-training improves downstream classification tasks by enhancing the convergence of the fine-tuning process. Furthermore, we found that the proposed pre-training method, when combined with the Transformer model, outperforms the alternatives.
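    One widely used self-supervised objective for time series pre-training of the kind evaluated here is masked reconstruction: corrupt random time steps and train the network to fill them back in. The data-side half of that pipeline can be sketched as follows (a generic illustration, not any specific method from the paper):

```python
import numpy as np

def mask_series(x, mask_ratio=0.15, seed=0):
    """Randomly zero-mask a fraction of time steps for a
    masked-reconstruction pre-training objective. Returns the
    corrupted series and the boolean mask, so a model's loss can be
    computed only on the masked positions."""
    rng = np.random.default_rng(seed)
    mask = rng.random(len(x)) < mask_ratio
    corrupted = np.where(mask, 0.0, x)
    return corrupted, mask
```

    During pre-training, the model sees `corrupted` and is penalized for its reconstruction error at the positions where `mask` is true; no labels are needed, which is what lets unlabeled samples from many domains be pooled.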