## Background

I am an associate professor in the Department of Statistics at the University of Wisconsin-Madison. I am also an affiliate of the Department of Educational Psychology (Quantitative Methods), the Department of Biostatistics and Medical Informatics (BMI), the Center for Demography and Ecology (CDE), and the Center for Demography of Health and Aging (CDHA). From 2015 to 2016, I was a postdoc in Economics at the Stanford Graduate School of Business, advised by Professor Guido Imbens. In 2015, I received my Ph.D. in Statistics from the Wharton School of Business of the University of Pennsylvania, where I was co-advised by Professors T. Tony Cai and Dylan S. Small.

Broadly speaking, my research is focused on developing methods to analyze causal relationships by using instrumental variables, econometrics, semi/nonparametric methods, network analysis, and machine learning. I am interested in applications to genetics, epidemiology, infectious diseases, health policy, education, and applied microeconomics. I am currently an associate editor of Biometrics (2016-) and Journal of the American Statistical Association: Theory and Methods (2023-).

My research is supported by the National Science Foundation, the National Institutes of Health, the Wisconsin Department of Agriculture, Trade and Consumer Protection (DATCP), UW-Madison's Data Science Initiative, and the Wisconsin Alumni Research Foundation.

## Education

NSF Postdoc, Stanford Graduate School of Business, Stanford University (2015-2016)

Ph.D. Statistics, The Wharton School of Business, University of Pennsylvania (2010-2015)

M.S. Statistics, Stanford University (2009-2010)

B.S. Mathematical and Computational Science, Stanford University (2006-2010)

## Papers

Athey, S., Chetty, R., Imbens, G., Kang, H. (2024+) The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely. *Review of Economic Studies.*

Kang, H., Guo, Z., Liu, Z., Small, D. (2024+) Identification and Inference with Invalid Instruments. *Annual Review of Statistics and Its Application*.

Park, B., Kang, H., Zahasky, C. (2024+) Statistical Mapping of PFOA and PFOS in Groundwater throughout the Contiguous United States. *Environmental Science & Technology*.

Park, C. and Kang, H. (2024) A Groupwise Approach for Inferring Heterogeneous Treatment Effects in Causal Inference. *Journal of the Royal Statistical Society, Series A: Statistics in Society*. 187(2): 374-392.

Yan, D., Hu, B., Darst, B., Mukherjee, S., Kunkle, B., Deming, Y., Dumitrescu, L., Wang, Y., Naj, A., Kuzma, A., Zhao, Y., Kang, H., Johnson, S., Cruchaga, C., Hohman, T., Crane, P., Engelman, C., Alzheimer's Disease Genetics Consortium (ADGC), Lu, Q. (2024). Biobank-wide association scan identifies risk factors for late-onset Alzheimer's disease and endophenotypes. *eLife*. 12:RP01360.

Miao, X., Zhao, J., Kang, H. (2024) Transfer Learning Between U.S. Presidential Elections: How Should We Learn From A 2020 Ad Campaign To Inform 2024 Ad Campaigns? *arXiv*.

Lin, X. and Kang, H. (2024) Efficient, Cross-Fitting Estimation of Semiparametric Spatial Point Processes. *arXiv*.

Heng, S. and Kang, H. (2024) Generalized Rosenbaum Bounds Sensitivity Analysis for Matched Observational Studies with Treatment Doses: Sufficiency, Consistency, and Efficiency. *arXiv*.

Wu, Y., Kang, H., Ye, T. (2024) Debiased Multivariable Mendelian Randomization. *arXiv*.

Park, C., Chen, G., Yu, M., Kang, H. (2023+) Minimum Resource Threshold Policy Under Partial Interference. *Journal of the American Statistical Association*.

Suk, Y. and Kang, H. (2023) Tuning Random Forests for Causal Inference Under Cluster-Level Unmeasured Confounding. *Multivariate Behavioral Research.* 58(2): 408-440.

Kang, H. (2023) Summarising multiple bounds of the average causal effect in Mendelian randomization. *Paediatric and Perinatal Epidemiology.* 37(4): 338-340.

Park, C. and Kang, H. (2023) Assumption-Lean Analysis of Cluster Randomized Trials in Infectious Diseases for Intent-to-Treat Effects and Network Effects. *Journal of the American Statistical Association.* 118: 1195-1206.

Kokandakar, A., Kang, H., Deshpande, S. (2023) Bayesian Causal Forests & the 2022 ACIC Data Challenge: Scalability and Sensitivity. *Observational Studies.* 9(3): 29-41.

Kang, H.*, Park, C.*, and Trane, R.* (2023) Propensity Score Modeling: Key Challenges When Moving Beyond the No-Interference Assumption. *Observational Studies.* 9(1): 43-54. (*: equal contribution)

Kang, H. (2023) Discussion on "Instrumented difference-in-differences" by Ting Ye, Ashkan Ertefaie, James Flory, Sean Hennessy, and Dylan S. Small. *Biometrics.* 79(2): 592-596.

Baiocchi, M. and Kang, H. (2023) Matching with Instrumental Variables. *Handbook of Matching and Weighting Adjustments for Causal Inference.*

Dong, R., Lu, Q., Kang, H., Suridjan, I., Kollmorgen, G., Wild, N., Deming, Y., Van Hulle, C. A., Anderson, R. M., Zetterberg, H., Blennow, K., Carlsson, C. M., Asthana, S. A., Johnson, S. C., Engelman, C. D. (2023) CSF metabolites associated with biomarkers of Alzheimer's disease pathology. *Frontiers in Aging Neuroscience.* 15: 1-15.

Keele, L. and Kang, H. (2022) An introduction to spillover effects in cluster randomized trials with noncompliance. *Clinical Trials.* 19(4): 375-379.

Ma, L., Kang, H., Liu, L. (2022) Semiparametric Efficient Dimension Reduction in Multivariate Regression with an Inner Envelope. *arXiv.*

Suk, Y., Steiner, P., Kim, J.S., Kang, H. (2022) Regression Discontinuity Designs with an Ordinal Running Variable: Evaluating the Effects of Extended Time Accommodations for English Language Learners. *Journal of Educational and Behavioral Statistics.* 47(4): 459-484.

Trane, R. and Kang, H. (2022) Nonparametric Bounds in Two-Sample Summary-Data Mendelian Randomization: Some Cautionary Tales for Practice. *Statistics in Medicine*, 41(14): 2523-2541.

Johnson, M., Cao, J., Kang, H. (2022) Detecting Heterogeneous Treatment Effects with Instrumental Variables and Application to the Oregon Health Insurance Experiment. *Annals of Applied Statistics*, 16(2): 1111-1129.

Kang, H.*, Lee, Y.*, Cai, T.T., Small, D.S. (2022) Two Robust Tools for Inference about Causal Effects with Invalid Instruments. *Biometrics*, 78: 24-34. (*: equal contribution).

Sanderson, E., Glymour, M. M., Holmes, M. V., Kang, H., Morrison, J., Munafo, M. R., Palmer, T., Schooling, C. M., Wallace, C., Zhao, Q., Davey Smith, G. (2022) Mendelian Randomization. *Nature Reviews Methods Primers*, 2(6): 1-21.

Park, C., Kang, H. (2022) Efficient Semiparametric Estimation of Network Treatment Effects Under Partial Interference. *Biometrika*. 109(4): 1015-1031.

Wang, S., Kang, H. (2022) Weak-Instrument Robust Tests in Two-Sample Summary-Data Mendelian Randomization. *Biometrics*. 78(4): 1699-1713.

Suk, Y. and Kang, H. (2022) Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding. *Psychometrika*, 87: 310-343.

Kancharla, M. and Kang, H. (2021) A Robust, Differentially Private Randomized Experiment for Evaluating Online Educational Programs With Sensitive Student Data. *arXiv*. [Video].

Park, C. and Kang, H. (2021) More Efficient, Doubly Robust, Nonparametric Estimators of Treatment Effects in Multilevel Studies. *arXiv*.

Heng, S.*, Kang, H.*, Small, D. S., Fogarty, C. B. (2021) Increasing Power for Observational Studies of Aberrant Response: An Adaptive Approach. *Journal of the Royal Statistical Society: Series B*, 83(3): 482-504. (*: equal contribution).

Ye, T., Shao, J., Kang, H. (2021) Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization. *Annals of Statistics*, 49(4): 2079-2100.

Suk, Y., Kim, J-S., Kang, H. (2021) Hybridizing Machine Learning Methods and Finite Mixture Models for Estimating Heterogeneous Treatment Effects in Latent Classes. *Journal of Educational and Behavioral Statistics*, 46(3): 323-347.

Suk, Y., Kang, H., Kim, J-S. (2021) Random Forests Approach for Causal Inference with Clustered Observational Data. *Multivariate Behavioral Research*. 56(6): 828-852.

Kang, H., Jiang, Y., Zhao, Q., Small, D. S. (2021) ivmodel: An R Package for Inference and Sensitivity Analysis of Instrumental Variables Models with One Endogenous Variable. *Observational Studies*, 7: 1-24.

Panyard, D., Kim, K. M., Darst, B. F., Deming, Y. K., Zhong, X., Wu, Y., Kang, H., Carlsson, C. M., Johnson, S.C., Asthana, S.A., Engelman, C.D., Lu, Q. (2021) Cerebrospinal fluid metabolomics identifies 19 brain-related phenotype associations. *Communications Biology*, 4:63.1-11.

Bi, N., Kang, H., Taylor, J. (2020) Inferring Treatment Effects After Testing Instrument Strength in Linear Models. *arXiv*. [Video].

Bi, N., Kang, H., Taylor, J. (2019) Inference After Selecting Plausibly Valid Instruments with Application to Mendelian Randomization. *arXiv*.

You, J. C., Jones, E., Cross, D. E., Lyon, A. C., Kang, H., Newberg, A. B., Lippa, C. F. (2019) Association of beta-Amyloid Burden With Sleep Dysfunction and Cognitive Impairment in Elderly Individuals With Cognitive Disorders. *Journal of the American Medical Association: Network Open*, 2(10):e1913383. 1-12.

Hu, B., Shen, N., Li, J., Kang, H., Hong, J., Fletcher, J., Greenberg, J., Mailick, M., and Lu, Q. (2019) Genome-wide association study reveals sex-specific genetic architecture of facial attractiveness. *PLoS Genetics*, 15(4): 1-18.

Kang, H., Keele, L. (2018) Spillover Effects in Cluster Randomized Trials with Noncompliance. *arXiv*.

Kang, H., Keele, L. (2018) Estimation Methods for Cluster Randomized Trials with Noncompliance: A Study of A Biometric Smartcard Payment System in India. *arXiv*.

Guo, Z., Kang, H., Cai, T. T., Small, D. S. (2018) Testing Endogeneity with High Dimensional Covariates. *Journal of Econometrics*, 207:175-187.

Kang, H., Peck, L., Keele, L. (2018) Inference for Instrumental Variables: A Randomization Inference Approach. *Journal of the Royal Statistical Society: Series A*, 181, 1231-1254.

Guo, Z., Kang, H., Cai, T. T., Small, D. S. (2018) Confidence Interval for Causal Effects with Invalid Instruments using Two-Stage Hard Thresholding with Voting. *Journal of the Royal Statistical Society: Series B*, 80:793-815.

Kang, H., Imbens, G. (2016) Peer Encouragement Designs in Causal Inference with Partial Interference and Identification of Local Average Network Effects. *arXiv*.

Kang, H. (2016) Commentary: Matched Instrumental Variables: A Possible Solution to Severe Confounding in Matched Observational Studies? *Epidemiology*, 27, 624-632.

Kang, H., Kreuels, B., May, J., Small, D. S. (2016) Full Matching Approach to Instrumental Variables Estimation with Application to the Effect of Malaria on Stunting. *Annals of Applied Statistics*, 10, 335-364.

Kang, H., Zhang, A., Cai, T. T., Small, D. S. (2016) Instrumental Variables Estimation with Some Invalid Instruments and its Application to Mendelian Randomization. *Journal of the American Statistical Association*, 111, 132-144.

Kang, H., Kreuels, B., Adjei, O., Krumkamp, R., May, J., Small, D. S. (2013) The Causal Effect of Malaria on Stunting: A Mendelian Randomization and Matching Approach. *International Journal of Epidemiology*, 42, 1390-1398.

## Software

All the software is available on GitHub: [link]

## Teaching

Stat 992: Causal Inference (Spring 2024) [link]