Recommended from our members

Sampling and Learning of the And-Or Graph

Author: Zhang Ruize
Publication venue: eScholarship, University of California
Publication date: 01/01/2020
The And-Or graph is a tool for knowledge representation. In this thesis we first study the sampling of the And-Or graph with or without context constraints. Without any constraint on the potential functions of the And-Or graph nodes, the positions and shapes of different components of the face images are not aligned properly. In contrast, with both unary constraints and binary constraints, the components are aligned and the samples are more representative of the And-Or graph. We further explore parameter and structure learning of the And-Or graph by implementing and applying some existing algorithms. The experimental results on 1D text data and 2D face image data are shown. While there is no apparent difference between the sampling results of the parameter-learned And-Or graph and the true And-Or graph, the sampling results of the structure-learned And-Or graph are not perfect and could be further improved.

eScholarship - University of California

From Synthetic to Real: Unveiling the Power of Synthetic Data for Video Person Re-ID

Author: Feng Wei
Han Ruize
Zhang Xiangqun
Publication date: 03/02/2024

In this paper, we study a new problem of cross-domain video based person re-identification (Re-ID). Specifically, we take the synthetic video dataset as the source domain for training and use the real-world videos for testing, which significantly reduces the dependence on real training data collection and annotation. To unveil the power of synthetic data for video person Re-ID, we first propose a self-supervised domain invariant feature learning strategy for both static and temporal features. Then, to further improve the person identification ability in the target domain, we develop a mean-teacher scheme with the self-supervised ID consistency loss. Experimental results on four real datasets verify the rationality of cross-synthetic-real domain adaption and the effectiveness of our method. We are also surprised to find that the synthetic data performs even better than the real data in the cross-domain setting

arXiv.org e-Print Archive

MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning

Author: Feng Ruoxuan
Hu Di
Xu Ruize
Zhang Shi-Xiong
Publication date: 11/03/2023

Audio-visual learning helps to comprehensively understand the world by fusing practical information from multiple modalities. However, recent studies show that the imbalanced optimization of uni-modal encoders in a joint-learning model is a bottleneck to enhancing the model's performance. We further find that the up-to-date imbalance-mitigating methods fail on some audio-visual fine-grained tasks, which have a higher demand for distinguishable feature distribution. Fueled by the success of cosine loss that builds hyperspherical feature spaces and achieves lower intra-class angular variability, this paper proposes Multi-Modal Cosine loss, MMCosine. It performs a modality-wise $L_2$ normalization to features and weights towards balanced and better multi-modal fine-grained learning. We demonstrate that our method can alleviate the imbalanced optimization from the perspective of weight norm and fully exploit the discriminability of the cosine metric. Extensive experiments prove the effectiveness of our method and the versatility with advanced multi-modal fusion strategies and up-to-date imbalance-mitigating methods.
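As a rough illustration of the modality-wise $L_2$ normalization described in this abstract, the sketch below computes cosine logits separately for an audio and a visual branch and then sums them. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the summation of the two branches, and the `scale` parameter are all hypothetical choices made for this example.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_logits(feature, class_weights):
    """Cosine similarity between one feature vector and each class weight vector."""
    f = l2_normalize(feature)
    return [
        sum(a * b for a, b in zip(f, l2_normalize(w)))
        for w in class_weights
    ]


def mmcosine_logits(audio_feat, visual_feat, audio_weights, visual_weights, scale=10.0):
    """Sum per-modality cosine logits (assumed fusion rule) and apply a scale.

    `scale` is a hypothetical temperature-like factor; its value here is arbitrary.
    """
    cos_a = cosine_logits(audio_feat, audio_weights)
    cos_v = cosine_logits(visual_feat, visual_weights)
    return [scale * (a + v) for a, v in zip(cos_a, cos_v)]
```

Because both features and weights are normalized within each modality, rescaling one branch's features or weights leaves the logits unchanged, so neither modality can dominate through a larger norm alone.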

arXiv.org e-Print Archive

Nearest Neighbor Machine Translation is Meta-Optimizer on Output Projection Layer

Author: Du Yichao
Gao Ruize
Liu Lemao
Wang Rui
Zhang Zhirui
Publication date: 24/10/2023

Nearest Neighbor Machine Translation ($k$NN-MT) has achieved great success in domain adaptation tasks by integrating pre-trained Neural Machine Translation (NMT) models with domain-specific token-level retrieval. However, the reasons underlying its success have not been thoroughly investigated. In this paper, we comprehensively analyze $k$NN-MT through theoretical and empirical studies. Initially, we provide new insights into the working mechanism of $k$NN-MT as an efficient technique to implicitly execute gradient descent on the output projection layer of NMT, indicating that it is a specific case of model fine-tuning. Subsequently, we conduct multi-domain experiments and word-level analysis to examine the differences in performance between $k$NN-MT and entire-model fine-tuning. Our findings suggest that: (1) Incorporating $k$NN-MT with adapters yields comparable translation performance to fine-tuning on in-domain test sets, while achieving better performance on out-of-domain test sets; (2) Fine-tuning significantly outperforms $k$NN-MT on the recall of in-domain low-frequency words, but this gap could be bridged by optimizing the context representations with additional adapter layers. Comment: Accepted by EMNLP202

arXiv.org e-Print Archive

OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control

Author: Gao Feng
Wang Yu
Wu Yi
Xu Botian
Yu Chao
Zhang Ruize
Publication date: 22/09/2023

In this work, we introduce OmniDrones, an efficient and flexible platform tailored for reinforcement learning in drone control, built on Nvidia's Omniverse Isaac Sim. It employs a bottom-up design approach that allows users to easily design and experiment with various application scenarios on top of GPU-parallelized simulations. It also offers a range of benchmark tasks, presenting challenges ranging from single-drone hovering to over-actuated system tracking. In summary, we propose an open-sourced drone simulation platform, equipped with an extensive suite of tools for drone learning. It includes 4 drone models, 5 sensor modalities, 4 control modes, over 10 benchmark tasks, and a selection of widely used RL baselines. To showcase the capabilities of OmniDrones and to support future research, we also provide preliminary results on these benchmark tasks. We hope this platform will encourage further studies on applying RL to practical drone systems. Comment: Submitted to IEEE RA-

arXiv.org e-Print Archive

Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication

Author: Cheng Ying
Huang Xuanjing
Li Piji
Shan Haijun
Wang Ruize
Wei Zhongyu
Zhang Ji
Zhang Qi
Publication date: 01/01/2020

Visual storytelling aims to generate a narrative paragraph from a sequence of images automatically. Existing approaches construct a text description independently for each image and roughly concatenate them as a story, which leads to semantically incoherent content. In this paper, we propose a new approach to visual storytelling by introducing a topic description task to detect the global semantic context of an image stream. A story is then constructed with the guidance of the topic description. To combine the two generation tasks, we propose a multi-agent communication framework that regards the topic description generator and the story generator as two agents and learns them simultaneously via an iterative updating mechanism. We validate our approach on the VIST dataset, where quantitative results, ablations, and human evaluation demonstrate that our method generates higher-quality stories than state-of-the-art methods. Comment: Accepted to COLING 202

arXiv.org e-Print Archive

Crossref

Eliminating Contextual Bias in Aspect-based Sentiment Analysis.

Author: An Ruize
Song Dawei
Zhang Chen
Pretrained language models (LMs) have made remarkable achievements in aspect-based sentiment analysis (ABSA). However, these models may struggle in some particular cases (e.g., detecting sentiments expressed towards targeted aspects with only implicit or adversarial expressions). Since it is hard for models to align implicit or adversarial expressions with their corresponding aspects, the sentiments of the targeted aspects would largely be affected by the expressions towards other aspects in the sentence. We call this phenomenon contextual bias. To tackle the problem, we propose a flexible aspect-oriented debiasing method (Arde) to eliminate the harmful contextual bias without the need to adjust the underlying LMs. Intuitively, Arde calibrates the prediction towards the targeted aspect by subtracting the bias towards the context. Arde is also theoretically supported by counterfactual reasoning theory. Experiments are conducted on the SemEval benchmark, and the results show that Arde can empirically improve the accuracy on contextually biased aspect sentiments without degrading the accuracy on unbiased ones. Driven by the recent success of large language models (LLMs, e.g., ChatGPT), we further uncover that even LLMs can fail to address certain contextual bias, which can nevertheless be effectively tackled by Arde.

Open Research Online

A Benchmark of Video-Based Clothes-Changing Person Re-Identification

Author: Feng Wei
Han Ruize
Li Xiaoyu
Wang Likai
Wang Song
Yang Jialin
Zhang Xiangqun
Publication date: 20/11/2022

Person re-identification (Re-ID) is a classical computer vision task and has achieved great progress so far. Recently, long-term Re-ID with clothes-changing has attracted increasing attention. However, existing methods mainly focus on the image-based setting, where richer temporal information is overlooked. In this paper, we focus on the relatively new yet practical problem of clothes-changing video-based person re-identification (CCVReID), which is less studied. We systematically study this problem by simultaneously considering the challenge of clothes inconsistency and the temporal information contained in the video sequence. Based on this, we develop a two-branch confidence-aware re-ranking framework for handling the CCVReID problem. The proposed framework integrates two branches that consider both the classical appearance features and cloth-free gait features through a confidence-guided re-ranking strategy. This method provides a baseline for further studies. We also build two new benchmark datasets for the CCVReID problem, a large-scale synthetic video dataset and a real-world one, both containing human sequences with various clothing changes. We will release the benchmark and code in this work to the public.

arXiv.org e-Print Archive

Gains and losses from collusion: an empirical study on market behaviors of China’s power enterprises

Author: Gao Ruize
Li Ang
Ran Chang
Zhang Zhenji
Publication venue: OmniaScience
Publication date: 01/01/2015

Purpose: Collusion is a common behavior of oligarch enterprises aiming to gain an advantage in market competition. The purpose of this research is to explore the positive and negative effects of collusion among electricity generation manufacturers through statistical analysis. Specifically, these effects are examined both in the market economy at a macro-economic level and in enterprise behavior at a micro-economic level. Design/methodology/approach: This research designs a model as an extension of Porter’s model (Green & Porter, 1984). FIML is applied in this model. Taking the price bidding project launched in China’s power industry as an example, this paper conducts empirical research on the relevant price data collected from subordinate power plants of China’s five power generation groups in the pilots. Findings: The paper finds that power generation enterprises face collusion issues in the market, in a situation where non-cooperative competition and collusion alternate. Under competition, the market is relatively steady, which results in a lower network price and helps the development of the whole industry. However, once a cartel is formed, the price rises, bringing power enterprises into conflicts of interest with transmission-distribution companies. At the same time, a higher power price forms in the market, making consumers suffer losses. All of these are bad for industry development. Not only does the collusion of power enterprises affect the power price, but the market power caused by a long-standing cartel also reduces market entry in electricity generation. Market resources are concentrated in the hands of the cartel, causing low effective competition in the market, which has negative effects on users. Implications: The empirical research also indicates that collusion undoubtedly benefits the power enterprises involved. As a cooperation pattern, collusion can lead to synergy between the relevant companies. However, collusion harms the interests of other market entities. While enterprises cooperate to create common interests, collusion may bring harm to those outside the industry. Originality/value: Using an empirical research method, the paper takes China’s power industry as an example to show the gains and losses of collusion from two aspects, namely market economy and strategic management. Peer Reviewe

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC