papers | Rae Yu 余蕊琪

2025

AoS
Strong approximations for empirical processes indexed by Lipschitz functions

Matias D Cattaneo and Ruiqi Rae Yu

Annals of Statistics, 2025

This paper presents new uniform Gaussian strong approximations for empirical processes indexed by classes of functions based on \(d\)-variate random vectors (\(d \geq 1\)). First, a uniform Gaussian strong approximation is established for general empirical processes indexed by possibly Lipschitz functions, improving on previous results in the literature. In the setting considered by [Rio (1994)], and if the function class is Lipschitzian, our result improves the approximation rate \(n^{-1/(2d)}\) to \(n^{-1/\max\{d,2\}}\), up to a \(\operatorname{polylog}(n)\) term, where \(n\) denotes the sample size. Remarkably, we establish a valid uniform Gaussian strong approximation at the rate \(n^{-1/2}\log n\) for \(d=2\), which was previously known to be valid only for univariate (\(d=1\)) empirical processes via the celebrated Hungarian construction [Komlós, Major, Tusnády (1975)]. Second, a uniform Gaussian strong approximation is established for multiplicative separable empirical processes indexed by possibly Lipschitz functions, which addresses some outstanding problems in the literature [Chernozhukov, Chetverikov, Kato (2014)]. Finally, two other uniform Gaussian strong approximation results are presented when the function class is a sequence of Haar bases based on quasi-uniform partitions. Applications to nonparametric density and regression estimation are discussed.
@article{cattaneo2024strong,
  title = {Strong approximations for empirical processes indexed by Lipschitz functions},
  author = {Cattaneo, Matias D and Yu, Ruiqi Rae},
  journal = {Annals of Statistics},
  year = {2025},
  url = {https://projecteuclid.org/journals/annals-of-statistics/volume-53/issue-3/Strong-approximations-for-empirical-processes-indexed-by-Lipschitz-functions/10.1214/25-AOS2500.full},
}
arXiv
The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation

Matias D Cattaneo, Jason M Klusowski, and Ruiqi Rae Yu

arXiv preprint arXiv:2509.11381, 2025

This paper studies recursive decision trees for heterogeneous causal treatment effect estimation and inference in experimental and observational settings. These procedures are typically fitted with the CART (Classification and Regression Tree) algorithm [Breiman et al. (1984)] or close variants, and thus are often believed to be “adaptive” to high-dimensional data, sparsity, or other structural features of the data-generating process. Building on the “honest” causal trees proposed by [Athey & Imbens (2016)], which have become standard in academia and industry, we analyze those estimators (and variants) and establish lower bounds on their estimation error. We show that these popular heterogeneous-treatment-effect estimators cannot attain a polynomial-in-\(n\) convergence rate under basic conditions, where \(n\) denotes the sample size. Contrary to common belief, honesty does not remove these limitations and at best yields negligible logarithmic improvements in sample size or dimension. Consequently, these widely used estimators can perform poorly in practice and may even be inconsistent in some settings. Theoretical insights are corroborated with simulation evidence.
@article{cattaneo2025honest,
  title = {The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation},
  author = {Cattaneo, Matias D and Klusowski, Jason M and Yu, Ruiqi Rae},
  journal = {arXiv preprint arXiv:2509.11381},
  year = {2025},
  url = {https://arxiv.org/abs/2509.11381},
}
arXiv
Robust Inference for the Direct Average Treatment Effect with Treatment Assignment Interference

Matias D Cattaneo, Yihan He, and Ruiqi Rae Yu

arXiv preprint arXiv:2502.13238, 2025

Uncertainty quantification in causal inference settings with random network interference is a challenging open problem. We study the large sample distributional properties of the classical difference-in-means Hajek treatment effect estimator, and propose a robust inference procedure for the (conditional) direct average treatment effect, allowing for cross-unit interference in both the outcome and treatment equations. Leveraging ideas from statistical physics, we introduce a novel Ising model capturing interference in the treatment assignment, and then obtain three main results. First, we establish a Berry-Esseen distributional approximation pointwise in the degree of interference generated by the Ising model. Our distributional approximation recovers known results in the literature under no interference in treatment assignment, and also highlights a fundamental fragility of inference procedures developed using such a pointwise approximation. Second, we establish a uniform distributional approximation for the Hajek estimator, and develop robust inference procedures that remain valid regardless of the unknown degree of interference in the Ising model. Third, we propose a novel resampling method for implementing the robust inference procedure. A key technical innovation underlying our work is a new De-Finetti Machine that facilitates conditional i.i.d. Gaussianization, a technique that may be of independent interest in other settings.
@article{cattaneo2025robust,
  title = {Robust Inference for the Direct Average Treatment Effect with Treatment Assignment Interference},
  author = {Cattaneo, Matias D and He, Yihan and Yu, Ruiqi Rae},
  journal = {arXiv preprint arXiv:2502.13238},
  year = {2025},
  url = {https://arxiv.org/abs/2502.13238},
}
arXiv
rd2d: Causal Inference in Boundary Discontinuity Designs

Matias D Cattaneo, Rocio Titiunik, and Ruiqi Rae Yu

arXiv preprint arXiv:2505.07989, 2025

Boundary discontinuity designs, also known as Multi-Score Regression Discontinuity (RD) designs, of which Geographic RD designs are a prominent example, are often used in empirical research to learn about causal treatment effects along a continuous assignment boundary defined by a bivariate score. This article introduces the R package ‘rd2d’, which implements and extends the methodological results developed in Cattaneo et al. [2025] for boundary discontinuity designs. The package employs local polynomial estimation and inference using either the bivariate score or a univariate distance-to-boundary metric. It features novel data-driven bandwidth selection procedures, and offers both pointwise and uniform estimation and inference along the assignment boundary. The numerical performance of the package is demonstrated through a simulation study.
@article{cattaneo2025rd2d,
  title = {rd2d: Causal Inference in Boundary Discontinuity Designs},
  author = {Cattaneo, Matias D and Titiunik, Rocio and Yu, Ruiqi Rae},
  journal = {arXiv preprint arXiv:2505.07989},
  year = {2025},
  url = {https://arxiv.org/abs/2505.07989},
}
arXiv
Estimation and Inference in Boundary Discontinuity Designs: Location-Based Methods

Matias D Cattaneo, Rocio Titiunik, and Ruiqi Rae Yu

arXiv preprint arXiv:2505.05670, 2025

Boundary discontinuity designs are used to learn about causal treatment effects along a continuous assignment boundary that splits units into control and treatment groups according to a bivariate location score. We analyze the statistical properties of local polynomial treatment effect estimators employing location information for each unit. We develop pointwise and uniform estimation and inference methods for both the conditional treatment effect function at the assignment boundary and transformations thereof, which aggregate information along the boundary. We illustrate our methods with an empirical application. Companion general-purpose software is provided.
@article{cattaneo2025location,
  title = {Estimation and Inference in Boundary Discontinuity Designs: Location-Based Methods},
  author = {Cattaneo, Matias D and Titiunik, Rocio and Yu, Ruiqi Rae},
  journal = {arXiv preprint arXiv:2505.05670},
  year = {2025},
  url = {https://arxiv.org/abs/2505.05670},
}
arXiv
Estimation and Inference in Boundary Discontinuity Designs: Distance-Based Methods

Matias D Cattaneo, Rocio Titiunik, and Ruiqi Rae Yu

arXiv preprint arXiv:2510.26051, 2025

We study the statistical properties of nonparametric distance-based (isotropic) local polynomial regression estimators of the conditional average treatment effect at the boundary, a key causal functional parameter capturing heterogeneous treatment effects in boundary discontinuity designs. We present necessary and/or sufficient conditions for identification, estimation and inference in large samples, both pointwise and uniformly along the assignment boundary. Our theoretical results highlight the crucial role played by the “regularity” of the boundary (a one-dimensional manifold) over which identification, estimation and inference are conducted. Our methods are illustrated with simulated and real-world data. Companion general-purpose software is provided.
@article{cattaneo2025distance,
  title = {Estimation and Inference in Boundary Discontinuity Designs: Distance-Based Methods},
  author = {Cattaneo, Matias D and Titiunik, Rocio and Yu, Ruiqi Rae},
  journal = {arXiv preprint arXiv:2510.26051},
  year = {2025},
  url = {https://arxiv.org/abs/2510.26051},
}
working paper
Estimation and Inference in Boundary Discontinuity Designs: Pooling

Matias D Cattaneo, Rocio Titiunik, and Ruiqi Rae Yu

2025

Boundary discontinuity designs are used to learn about causal treatment effects along a continuous assignment boundary that splits units into control and treatment groups according to a bivariate location score. We study the statistical properties of a pooling-based local polynomial regression estimator for the boundary average treatment effect, which uses all observations whose scores lie within a small region covering the assignment boundary.
@article{cattaneo2025pool,
  title = {Estimation and Inference in Boundary Discontinuity Designs: Pooling},
  author = {Cattaneo, Matias D and Titiunik, Rocio and Yu, Ruiqi Rae},
  year = {2025},
}
working paper
Boundary Discontinuity Designs: Theory and Practice

Matias D Cattaneo, Rocio Titiunik, and Ruiqi Rae Yu

2025

This chapter provides a review of boundary discontinuity designs, a powerful non-experimental research methodology that identifies causal effects by exploiting a thresholding treatment assignment rule based on a bivariate score and a boundary curve. This methodology generalizes standard regression discontinuity designs based on a univariate score and scalar cutoff, and has specific challenges and features related to its multi-dimensional nature. We synthesize the empirical literature by systematically reviewing over 80 empirical papers, tracing the method’s application from its formative uses to its widespread and sophisticated implementation in modern research. In addition to this empirical survey, the chapter overviews the latest methodological developments, offering a guide to the state of the art in identification, estimation and inference results for boundary discontinuity designs.
@article{cattaneo2025review,
  title = {Boundary Discontinuity Designs: Theory and Practice},
  author = {Cattaneo, Matias D and Titiunik, Rocio and Yu, Ruiqi Rae},
  year = {2025},
}