279 research outputs found
Nonparametric Treatment Effect Identification in School Choice
We study identification and estimation of treatment effects in common school
choice settings, under unrestricted heterogeneity in individual potential
outcomes. We propose two notions of identification, corresponding to design-
and sampling-based uncertainty, respectively. We characterize the set of causal
estimands that are identified for a large variety of school choice mechanisms,
including ones that feature both random and non-random tie-breaking; we discuss
their policy implications. We also study the asymptotic behavior of
nonparametric estimators for these causal estimands. Lastly, we connect our
approach to the propensity score approach proposed in Abdulkadiroglu, Angrist,
Narita, and Pathak (2017a, forthcoming), and derive the implicit estimands of
the latter approach, under fully heterogeneous treatment effects.
Comment: Presented at SOLE 202
Efficient Estimation of Average Derivatives in NPIV Models: Simulation Comparisons of Neural Network Estimators
Artificial Neural Networks (ANNs) can be viewed as \emph{nonlinear sieves} that can approximate complex functions of high dimensional variables more effectively than linear sieves. We investigate the computational performance of various ANNs in nonparametric instrumental variables (NPIV) models of moderately high dimensional covariates that are relevant to empirical economics. We present two efficient procedures for estimation and inference on a weighted average derivative (WAD): an orthogonalized plug-in with optimally-weighted sieve minimum distance (OP-OSMD) procedure and a sieve efficient score (ES) procedure. Both estimators for WAD use ANN sieves to approximate the unknown NPIV function and are root-n asymptotically normal and first-order equivalent. We provide a detailed practitioner's recipe for implementing both efficient procedures. This involves the choice of tuning parameters for the unknown NPIV, the conditional expectations and the optimal weighting function that are present in both procedures, as well as the choice of tuning parameters for the unknown Riesz representer in the ES procedure. We compare their finite-sample performances in various simulation designs that involve smooth NPIV functions of up to 13 continuous covariates, different nonlinearities and covariate correlations. Some Monte Carlo findings include: 1) tuning and optimization are more delicate in ANN estimation; 2) given proper tuning, both ANN estimators with various architectures can perform well; 3) ANN OP-OSMD estimators are easier to tune than ANN ES estimators; 4) stable inferences are more difficult to achieve with ANN (than spline) estimators; 5) there are gaps between current implementations and approximation theories. Finally, we apply ANN NPIV to estimate average partial derivatives in two empirical demand examples with multivariate covariates.
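The WAD target above can be illustrated with a minimal plug-in sketch: given any fitted function (here a known toy function standing in for the ANN sieve estimate) and a weight vector, average its finite-difference derivative over the sample. All names are hypothetical; the paper's OP-OSMD and ES procedures add orthogonalization and efficient weighting that this sketch omits.

```python
import numpy as np

def plug_in_wad(f_hat, X, weights, j=0, h=1e-4):
    """Plug-in weighted average derivative of a fitted function f_hat
    with respect to covariate j, via central finite differences."""
    e = np.zeros(X.shape[1])
    e[j] = h
    deriv = (f_hat(X + e) - f_hat(X - e)) / (2 * h)  # elementwise d f/dx_j
    return np.mean(weights * deriv)

# Toy check: f(x) = x0^2 + x1 has df/dx0 = 2*x0, so with uniform weights
# the plug-in WAD should match the sample mean of 2*x0.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 2))
f = lambda Z: Z[:, 0] ** 2 + Z[:, 1]
wad = plug_in_wad(f, X, weights=np.ones(len(X)), j=0)
```

For a quadratic, the central difference is exact up to floating-point error, which makes this a convenient sanity check before plugging in a trained sieve.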
Tucker Bilinear Attention Network for Multi-scale Remote Sensing Object Detection
Object detection on VHR remote sensing images plays a vital role in
applications such as urban planning, land resource management, and rescue
missions. The large-scale variation of the remote-sensing targets is one of the
main challenges in VHR remote-sensing object detection. Existing methods
improve the detection accuracy of high-resolution remote sensing objects by
improving the structure of feature pyramids and adopting different attention
modules. However, small targets are still frequently missed because their key
detail features are lost, and there is room for improvement in how multiscale
features are fused and balanced. To address this issue, this
paper proposes two novel modules: Guided Attention and Tucker Bilinear
Attention, which are applied to the stages of early fusion and late fusion
respectively. The former can effectively retain clean key detail features, and
the latter can better balance features through semantic-level correlation
mining. Based on two modules, we build a new multi-scale remote sensing object
detection framework. No bells and whistles. The proposed method largely
improves the average precisions of small objects and achieves the highest mean
average precisions compared with 9 state-of-the-art methods on DOTA, DIOR, and
NWPU VHR-10. Code and models are available at
https://github.com/Shinichict/GTNet.
Comment: arXiv admin note: text overlap with arXiv:1705.06676,
arXiv:2209.13351 by other authors
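The guided-attention idea described above (retaining clean detail features in early fusion) can be loosely sketched as a gate computed from the semantically richer deep feature map and applied to the shallow detail map before fusion. The shapes, the 1x1-conv gate, and all names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def guided_attention_fuse(shallow, deep, w):
    """Toy early-fusion gate: a per-pixel attention map derived from the
    deep (semantic) feature modulates the shallow (detail) feature, so
    detail channels deemed irrelevant by the semantics are suppressed
    before the maps are summed.  Shapes are (C, H, W); `w` acts as a
    1x1-conv weight of shape (C, C) producing gate logits."""
    logits = np.tensordot(w, deep, axes=([1], [0]))  # (C, H, W)
    gate = sigmoid(logits)                           # values in (0, 1)
    return gate * shallow + deep

rng = np.random.default_rng(1)
shallow = rng.normal(size=(8, 16, 16))
deep = rng.normal(size=(8, 16, 16))
w = rng.normal(size=(8, 8)) * 0.1
fused = guided_attention_fuse(shallow, deep, w)
```

In a real detector the gate would be learned end-to-end; the point of the sketch is only the data flow: semantics gate detail, then the two streams fuse.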
Manin triples associated to -Lie bialgebras
In this paper, we study the Manin triples associated to -Lie bialgebras.
We develop the method of double constructions as well as operad matrices to
make -Lie bialgebras into Manin triples. Then, the related Manin triples
lead to a natural construction of metric -Lie algebras. Moreover, a
one-to-one correspondence between the double of -Lie bialgebras and Manin
triples of -Lie algebras is established.
Inducing Causal Structure for Abstractive Text Summarization
Mainstream data-driven abstractive summarization models tend to capture
correlations rather than causal relationships. Among such
correlations, there can be spurious ones which suffer from the language prior
learned from the training corpus and therefore undermine the overall
effectiveness of the learned model. To tackle this issue, we introduce a
Structural Causal Model (SCM) to induce the underlying causal structure of the
summarization data. We assume several latent causal factors and non-causal
factors, representing the content and style of the document and summary.
Theoretically, we prove that the latent factors in our SCM can be identified by
fitting the observed training data under certain conditions. On the basis of
this, we propose a Causality Inspired Sequence-to-Sequence model (CI-Seq2Seq)
to learn the causal representations that can mimic the causal factors, guiding
us to pursue causal information for summary generation. The key idea is to
reformulate the Variational Auto-encoder (VAE) to fit the joint distribution of
the document and summary variables from the training corpus. Experimental
results on two widely used text summarization datasets demonstrate the
advantages of our approach.
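The VAE reformulation described above can be written as an evidence lower bound on the joint likelihood of document $d$ and summary $s$. The factorization below is an illustrative assumption (a single latent $z$ collecting the causal and non-causal factors), not necessarily the paper's exact objective:

$$\log p_\theta(d, s) \;\ge\; \mathbb{E}_{q_\phi(z \mid d, s)}\!\big[\log p_\theta(d \mid z) + \log p_\theta(s \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid d, s)\,\big\|\,p(z)\big),$$

i.e., the encoder $q_\phi$ infers the latent factors from the (document, summary) pair, and two decoders reconstruct document and summary from the shared latents, which is what lets the model fit their joint distribution.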
Worse outcome in breast cancer with higher tumor-infiltrating FOXP3+ Tregs : a systematic review and meta-analysis
Table S1. Characteristics of the included studies. (DOCX 39 kb)
On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective
Recently, we have witnessed generative retrieval increasingly gaining
attention in the information retrieval (IR) field, which retrieves documents by
directly generating their identifiers. So far, much effort has been devoted to
developing effective generative retrieval models. There has been less attention
paid to the robustness perspective. When a new retrieval paradigm enters
real-world applications, it is also critical to measure its
out-of-distribution (OOD) generalization, i.e., how generative retrieval
models would generalize to new distributions. To answer this question, we first
define OOD robustness from three perspectives in retrieval problems: 1) The
query variations; 2) The unforeseen query types; and 3) The unforeseen tasks.
Based on this taxonomy, we conduct empirical studies to analyze the OOD
robustness of several representative generative retrieval models against dense
retrieval models. The empirical results indicate that the OOD robustness of
generative retrieval models requires enhancement. We hope studying the OOD
robustness of generative retrieval models would be advantageous to the IR
community.
Comment: 4 pages, submitted to GenIR2
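The query-variation facet of the taxonomy above amounts to running the same retrieval metric on in-distribution and perturbed query sets and comparing. A minimal sketch with mean reciprocal rank (MRR) and made-up toy runs, purely to show the bookkeeping:

```python
def mrr(ranked_lists, relevant):
    """Mean reciprocal rank over queries; each ranked list holds doc ids
    in score order, `relevant` maps query index -> the relevant doc id."""
    rr = []
    for q, docs in enumerate(ranked_lists):
        rank = docs.index(relevant[q]) + 1 if relevant[q] in docs else None
        rr.append(1.0 / rank if rank else 0.0)
    return sum(rr) / len(rr)

# In-distribution vs. query-variation runs for 3 toy queries.
id_runs  = [["d1", "d2"], ["d3", "d4"], ["d5", "d6"]]
ood_runs = [["d2", "d1"], ["d3", "d4"], ["d6", "d5"]]
relevant = {0: "d1", 1: "d3", 2: "d5"}
id_mrr, ood_mrr = mrr(id_runs, relevant), mrr(ood_runs, relevant)
rel_drop = (id_mrr - ood_mrr) / id_mrr  # relative OOD degradation
```

Reporting the relative drop rather than the raw OOD score makes models with different in-distribution ceilings comparable.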
Continual Learning for Generative Retrieval over Dynamic Corpora
Generative retrieval (GR) directly predicts the identifiers of relevant
documents (i.e., docids) based on a parametric model. It has achieved solid
performance on many ad-hoc retrieval tasks. So far, these tasks have assumed a
static document collection. In many practical scenarios, however, document
collections are dynamic, where new documents are continuously added to the
corpus. The ability to incrementally index new documents while preserving the
ability to answer queries with both previously and newly indexed relevant
documents is vital to applying GR models. In this paper, we address this
practical continual learning problem for GR. We put forward a novel
Continual-LEarner for generatiVE Retrieval (CLEVER) model and make two major
contributions to continual learning for GR: (i) To encode new documents into
docids with low computational cost, we present Incremental Product
Quantization, which updates a partial quantization codebook according to two
adaptive thresholds; and (ii) To memorize new documents for querying without
forgetting previous knowledge, we propose a memory-augmented learning
mechanism, to form meaningful connections between old and new documents.
Empirical results demonstrate the effectiveness and efficiency of the proposed
model.
Comment: Accepted by CIKM 202
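The Incremental Product Quantization idea above can be sketched in a few lines: split each document embedding into subvectors, map each to its nearest sub-codebook centroid to form a docid, and only touch the codebook when a new document is poorly covered. The single fixed threshold and halfway update below are simplifying assumptions; CLEVER's two adaptive thresholds and partial-codebook updates are richer.

```python
import numpy as np

def pq_encode(x, codebooks):
    """Product quantization: split x into M subvectors and map each to
    the index of its nearest centroid in the matching sub-codebook."""
    subs = np.split(x, len(codebooks))
    return [int(np.argmin(np.linalg.norm(cb - s, axis=1)))
            for cb, s in zip(codebooks, subs)]

def incremental_update(x, codebooks, threshold):
    """Toy incremental codebook update: for each subvector, if the
    nearest centroid is farther than `threshold`, move that centroid
    halfway toward the new subvector; otherwise leave it untouched."""
    for cb, s in zip(codebooks, np.split(x, len(codebooks))):
        d = np.linalg.norm(cb - s, axis=1)
        k = int(np.argmin(d))
        if d[k] > threshold:
            cb[k] = 0.5 * (cb[k] + s)
    return codebooks

rng = np.random.default_rng(2)
codebooks = [rng.normal(size=(4, 2)) for _ in range(3)]  # M=3 sub-codebooks
doc = rng.normal(size=6)                                 # one new embedding
code = pq_encode(doc, codebooks)                         # docid: 3 indices
codebooks = incremental_update(doc, codebooks, threshold=0.1)
```

The appeal for continual GR is that only the centroids near new documents change, so previously assigned docids for old documents mostly remain valid.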
- …