Yong Jae Lee, UW-Madison

Yong Jae Lee

Associate Professor
Department of Computer Sciences
University of Wisconsin-Madison

Office: Computer Sciences 5395
Email: yongjaelee at cs dot wisc dot edu
Phone: 608-262-1804

Bio / CV / Research Statement / Github / Google Scholar

I am an Associate Professor in the Computer Sciences Department at UW-Madison. My research interests are in computer vision and machine learning, with a focus on creating robust visual recognition systems that can learn to understand the visual world with minimal human supervision. Before joining UW-Madison in Fall 2021, I spent one year as an AI Visiting Faculty at Cruise, and before that, 6 wonderful years as an Assistant and then Associate Professor at UC Davis. Prior to that, I was a Postdoctoral Fellow in the EECS Dept at UC Berkeley working with Alyosha Efros (8/2013-6/2014). I also spent a year as a Postdoctoral Fellow in the Robotics Institute at Carnegie Mellon University (8/2012-8/2013), where I had the good fortune to work with Alyosha Efros and Martial Hebert. I obtained my Ph.D. from the University of Texas at Austin in May 2012 under the supervision of Kristen Grauman, and my B.S. from the University of Illinois at Urbana-Champaign in May 2006. I have also worked with Larry Zitnick and Michael Cohen as a summer intern at Microsoft Research.

Prospective PhD students: please apply to the UW-Madison Computer Sciences program if you'd like to work with me. Unfortunately, I may not be able to respond to individual emails about admissions.

News

8/2021: I joined the CS department at UW-Madison as an Associate Professor

9/2020: Our paper on adaptive anti-aliasing in convnets won the Best Paper Award at BMVC 2020 [link]

6/2020: I will join the CS department at UW-Madison as an Associate Professor in Fall 2021, and spend one year at Cruise as an AI Visiting Faculty till then

5/2020: I received tenure -- I am grateful to all my former and current students, collaborators, mentors, and letter writers

10/2019: Our real-time instance segmentation method, YOLACT, won the Most Innovative Award at the COCO Object Detection Challenge, ICCV 2019

9/2019: I received the UC Davis College of Engineering Outstanding Junior Faculty Award [link]

6/2019: My PhD student Krishna Kumar Singh won the 2019 UC Davis Best Graduate Researcher in Computer Science Award (Honorable Mention) -- congrats Krishna!

6/2019: My undergraduate student Daniel Bolya won the 2019 UC Davis Chancellor's Award for Excellence in Undergraduate Research (Honorable Mention) -- congrats Daniel!

3/2019: I received the Adobe Data Science Research Award -- thanks Adobe!

8/2018: I received the AWS ML Research Award and a new NSF grant to study 1st-3rd person videos

6/2018: My PhD student Fanyi Xiao won the 2018 UC Davis Best Graduate Researcher in Computer Science Award -- congrats Fanyi!

3/2018: I received the NSF CAREER Award [link]

8/2017: I received the Army Research Office (ARO) Young Investigator Award [link]

7/2017: I received the Hellman Foundation Fellowship [link]

4/2016: Organizing the 4th Workshop on Egocentric (First-Person) Vision at CVPR 2016 with Michael Ryoo (Indiana), Kris Kitani (CMU), and Yin Li (Georgia Tech)

7/2014: I joined the CS department at UC Davis as an Assistant Professor

Research Interests

My research interests are in computer vision and machine learning. I am particularly interested in creating robust visual recognition systems that can understand visual data with minimal human supervision.

Teaching

CS 839: Learning based Image Synthesis and Manipulation (Fall 2024)
CS 639: Deep Learning for Computer Vision (Spring 2024)
CS 839: Learning based Image Synthesis and Manipulation (Fall 2023)
CS 639: Deep Learning for Computer Vision (Spring 2023)
CS 839: Learning based Image Synthesis and Manipulation (Fall 2022)
CS 839: Deep Learning for Visual Recognition (Spring 2022)
ECS 174: Computer Vision (Spring 2020)
ECS 269: Visual Recognition (Fall 2019)
ECS 174: Computer Vision (Spring 2019)
ECS 269: Visual Recognition (Fall 2018)
ECS 174: Computer Vision (Spring 2018)
ECS 289G: Visual Recognition (Winter 2018)
ECS 174: Computer Vision (Spring 2017)
ECS 289G: Visual Recognition (Fall 2016)
ECS 289G: Visual Recognition (Fall 2015)
ECS 189G: Intro to Computer Vision (Spring 2015)
ECS 289H: Visual Recognition (Fall 2014)

Lab Members

Zhuoran Yu (PhD student)
Mu Cai (PhD student)
Zeyi Huang (PhD student)
Thao Nguyen (PhD student)
Anirudh Sundara Rajan (MS student)
Harris Zhang (MS student)
Jaden Park (PhD student)
Aniket Rege (PhD student, co-advised with Ramya Vinayak)
Bocheng Zou (Undergraduate student); Trewartha Senior Thesis Research Award 2024

Alumni

Former PhD students
Fanyi Xiao (2015-2020) → Research Scientist at Meta AI; UC Davis Best Graduate Researcher in Computer Science Award 2018
Krishna Kumar Singh (2015-2020) → Research Scientist at Adobe Research; UC Davis Best Graduate Researcher in Computer Science Award (Honorable Mention) 2019
Maheen Rashid (2015-2021) → Research Scientist at Univrses
Xueyan Zou (2018-2024) → Postdoc at UCSD
Utkarsh Ojha (2019-2024) → Postdoc at Carnegie Mellon University
Haotian Liu (2019-2024) → Research Scientist at xAI
Yuheng Li (2019-2024) → Research Scientist at Adobe Research

Former MS students
Zhongzheng (Jason) Ren (2016-2018) → PhD student at UIUC
Wenjian Hu (2017-2018) → Research Scientist at Facebook
Leonardo Ferrer (2017) → Software Engineer at Google
Wei-Pang (Tyler) Jan (2019) → Software Engineer at Amazon
Chong Zhou (2019-2020) → PhD student at Nanyang Technological University
Yangming Wen (2019-2020) → Research Engineer at Electronic Arts
Yang Xue (2020-2022) → AI Perception Engineer at Black Sesame Technologies
Rafael A. Rivera-Soto (2020-2021) → PhD student at Johns Hopkins University

Former BS students and visitors
Antonia Creswell (2014-2015) → PhD student at Imperial College London
Yi Mang (Terry) Yang (2017-2018) → Software Engineer at Amazon
Xie Zhou (2018-2019) → MS student at UC Berkeley
Daniel Bolya (2018-2019) → PhD student at Georgia Tech; Chancellor's Award for Excellence in Undergraduate Research (Honorable Mention) 2019, NSF Graduate Research Fellowship
Aron Sarmasi (2018-2019) → MS student at UC Davis
Waiyu Lam (2019-2020) → MS student at Cornell
Qi Zhu (summer 2015, co-supervised with Ian Davidson) → PhD student at UIUC
Xiuye Gu (summer 2016, 2018-2019) → MS student at Stanford
Haolin Fu (summer 2016, co-supervised with Cho-Jui Hsieh) → MS student at Yale
Haotian Liu (summer 2018) → PhD student at UC Davis
Hao Yu (summer 2018) → PhD student at Boston University
Weixin Luo (visiting research scholar, 2018-2019)
Moniek Smink (Undergraduate student) → MS student at TU Delft; Hilldale Research Fellowship 2023-2024
Harris Zhang (Undergraduate student) → MS student at UW-Madison; Trewartha Senior Thesis Research Award 2024

Funding

I am grateful for support from the National Science Foundation (CAREER IIS-1751206 / IIS-2150012, IIS-1748387, IIS-1812850 / IIS-2204808, and IIS-2404180), Army Research Office (Young Investigator Program), NASA, Wisconsin Alumni Research Foundation, Hellman Fellows Program, Intel, Adobe, Nvidia, Amazon, ETRI, Samsung, AmFam, and Sony.

Recent talks

Real-time Instance Segmentation with the YOLACT Family
OpenMMLab Tutorial, CVPR, June 2021 (virtual)
[talk video]

Learning to Understand Visual Data with Minimal Human Supervision
University of Wisconsin - Madison, April 2020 (virtual)
[talk video]

Preprints

NEW! Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos
Jianrui Zhang*, Mu Cai*, and Yong Jae Lee
(*equal contribution)
arXiv 2024
[project page] [arXiv] [code] [data] [leaderboard]

NEW! TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai, Reuben Tan, Jianrui Zhang, Bocheng Zou, Kai Zhang, Feng Yao, Fangrui Zhu, Jing Gu, Yiwu Zhong, Yuzhang Shang, Yao Dou, Jaden Park, Jianfeng Gao^, Yong Jae Lee^, Jianwei Yang^
(^equal advising)
arXiv 2024
[project page] [arXiv] [code] [data] [leaderboard]

NEW! LLaVA-NeXT: Improved reasoning, OCR, and world knowledge
Haotian Liu, Chunyuan Li, Yuheng Li, Bo Li, Yuanhan Zhang, Sheng Shen, and Yong Jae Lee
January 2024
[blog]

NEW! Matryoshka Multimodal Models
Mu Cai, Jianwei Yang, Jianfeng Gao, and Yong Jae Lee
arXiv 2024
[project page] [arXiv] [code]

NEW! LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li, Cristina Mata, Jongwoo Park, Kumara Kahatapitiya, Yoo Sung Jang, Jinghuan Shang, Kanchana Ranasinghe, Ryan Burgert, Mu Cai, Yong Jae Lee, and Michael S. Ryoo
arXiv 2024
[project page] [arXiv] [code]

NEW! LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
Yuzhang Shang*, Mu Cai*, Bingxin Xu, Yong Jae Lee^, and Yan Yan^
(*equal contribution, ^equal advising)
arXiv 2024
[project page] [arXiv] [code]

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images
Zhuoran Yu, Chenchen Zhu, Sean Culatana, Raghuraman Krishnamoorthi, Fanyi Xiao, and Yong Jae Lee
arXiv 2023
[arXiv]

Generate Anything Anywhere in Any Scene
Yuheng Li, Haotian Liu, Yangming Wen, and Yong Jae Lee
arXiv 2023
[arXiv]

Publications

NEW! Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
Mu Cai*, Zeyi Huang*, Yuheng Li, Haohan Wang, and Yong Jae Lee
(*equal contribution)
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2025
[arXiv]

NEW! Yo'LLaVA: Your Personalized Language and Vision Assistant
Thao Nguyen, Haotian Liu, Mu Cai, Yuheng Li, Utkarsh Ojha, and Yong Jae Lee
Neural Information Processing Systems (NeurIPS), 2024
[project page] [arXiv] [code]

NEW! What can Foundation Models’ Embeddings do?
Xueyan Zou, Linjie Li, Jianfeng Wang, Jianwei Yang, Mingyu Ding, Junyi Wei, Zhengyuan Yang, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee*, and Lijuan Wang*
(*equal advising)
Neural Information Processing Systems (NeurIPS), 2024
[arXiv] [code]

NEW! VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
Bocheng Zou*, Mu Cai*, Jianrui Zhang, and Yong Jae Lee
(*equal contribution)
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
[project page] [arXiv] [code] [dataset]

NEW! MATE: Meet At The Embedding - Connecting Images with Long Texts
Young Kyun Jang, Junmo Kang, Yong Jae Lee, and Donghyun Kim
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2024
[arXiv]

NEW! Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
Yuheng Li, Haotian Liu, Mu Cai, Yijun Li, Eli Shechtman, Zhe Lin, Yong Jae Lee, and Krishna Kumar Singh
Proceedings of the European Conference on Computer Vision (ECCV), 2024
[project page] [arXiv] [code]

NEW! CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples
Jianrui Zhang*, Mu Cai*, Tengyang Xie, and Yong Jae Lee
(*equal contribution)
Findings of the Association for Computational Linguistics (ACL Findings), 2024
[project page] [arXiv] [code]

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for Point Clouds
Mu Cai, Chenxu Luo, Yong Jae Lee, and Xiaodong Yang
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
[arXiv]

Improved Baselines with Visual Instruction Tuning (LLaVA-1.5)
Haotian Liu, Chunyuan Li, Yuheng Li, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (Highlight, top 2.8%)
[project page] [arXiv] [demo] [code]

Making Large Multimodal Models Understand Arbitrary Visual Prompts (ViP-LLaVA)
Mu Cai, Haotian Liu, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, Dennis Park, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[project page] [arXiv] [demo] [code]

Edit One for All: Interactive Batch Image Editing
Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[project page] [arXiv] [code]

Computer Vision on the Edge: Individual Cattle Identification in Real-Time With ReadMyCow System
Moniek Smink, Haotian Liu, Dorte Dopfer, and Yong Jae Lee
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2024
[pdf]

Investigating the Catastrophic Forgetting in Multimodal Large Language Models
Yuexiang Zhai, Shengbang Tong, Xiao Li, Mu Cai, Qing Qu, Yong Jae Lee, and Yi Ma
Conference on Parsimony and Learning (CPAL), 2024
[arXiv]

Exploring the Capabilities of a General-Purpose Robotic Arm in Chess Gameplay
Kazuki Shin, Sankalp Yamsani, Roman Mineyev, Hongyu Chen, Nitish Gandi, Yong Jae Lee, and Joohyung Kim
IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2023
[pdf] [video]

Visual Instruction Tuning (LLaVA)
Haotian Liu*, Chunyuan Li*, Qingyang Wu, and Yong Jae Lee
(*equal contribution)
Neural Information Processing Systems (NeurIPS), 2023 (Oral presentation, top 0.5%)
[project page] [arXiv] [demo] [code]

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha*, Yuheng Li*, Anirudh Sundara Rajan*, Yingyu Liang, and Yong Jae Lee
(*equal contribution)
Neural Information Processing Systems (NeurIPS), 2023
[arXiv]

Visual Instruction Inversion: Image Editing via Image Prompting
Thao Nguyen, Yuheng Li, Utkarsh Ojha, and Yong Jae Lee
Neural Information Processing Systems (NeurIPS), 2023
[project page] [arXiv] [code]

Segment Everything Everywhere All at Once
Xueyan Zou‡, Jianwei Yang‡, Hao Zhang‡, Feng Li‡, Linjie Li, Jianfeng Wang, Lijuan Wang, Jianfeng Gao*, and Yong Jae Lee*
(‡,* equal contribution)
Neural Information Processing Systems (NeurIPS), 2023
[arXiv] [demo] [code]

A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance
Zeyi Huang, Andy Zhou, Zijian Ling, Mu Cai, Haohan Wang, and Yong Jae Lee
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2023
[arXiv] [code]

GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li*, and Yong Jae Lee*
(*equal advising)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[project page] [arXiv] [demo] [code]

Generalized Decoding for Pixel, Image, and Language
Xueyan Zou*, Zi-Yi Dou*, Jianwei Yang*, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Harkirat Behl, Yong Jae Lee‡, and Jianfeng Gao‡
(*,‡ equal contribution)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[project page] [arXiv] [demo] [code]

Towards Universal Fake Image Detectors that Generalize Across Generative Models
Utkarsh Ojha*, Yuheng Li*, and Yong Jae Lee
(*equal contribution)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[project page] [arXiv] [code]

REACT: Learning Customized Visual Models with Retrieval-Augmented Knowledge
Haotian Liu, Kilho Son, Jianwei Yang, Ce Liu, Jianfeng Gao, Yong Jae Lee*, and Chunyuan Li*
(*equal advising)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023 (Highlight, top 2.5%)
[project page] [arXiv] [code]

InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning
Zhuoran Yu, Yin Li, and Yong Jae Lee
International Conference on Learning Representations (ICLR), 2023
[arXiv] [code]

Delving Deeper into Anti-aliasing in ConvNets
Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yuheng Li, and Yong Jae Lee
International Journal of Computer Vision (IJCV), 2022 (journal extension of our BMVC 2020 conference paper)
Invited article for best papers of BMVC 2020
[pdf] [code]

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
Chunyuan Li*, Haotian Liu*, Liunian Harold Li, Pengchuan Zhang, Jyoti Aneja, Jianwei Yang, Ping Jin, Houdong Hu, Zicheng Liu, Yong Jae Lee, and Jianfeng Gao
(*equal contribution)
Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2022
[project page] [arXiv] [talk video] [toolkit]

Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu, Mu Cai, and Yong Jae Lee
Proceedings of the European Conference on Computer Vision (ECCV), 2022
[arXiv] [code] [talk video]

Contrastive Learning for Diverse Disentangled Foreground Generation
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, and Krishna Kumar Singh
Proceedings of the European Conference on Computer Vision (ECCV), 2022
[project page] [arXiv]

GIRAFFE HD: A High-Resolution 3D-aware Generative Model
Yang Xue, Yuheng Li, Krishna Kumar Singh, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[project page] [arXiv] [code]

The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization
Zeyi Huang*, Haohan Wang*, Dong Huang, Yong Jae Lee†, and Eric Xing†
(*,† equal contribution)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv] [code]

Toward Learning Human-aligned Cross-domain Robust Models by Countering Misaligned Features
Haohan Wang, Zeyi Huang, Hanlin Zhang, Yong Jae Lee, and Eric Xing
Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2022
[arXiv]

Equine Pain Behaviour Classification via Self-supervised Disentangled Pose Representation
Maheen Rashid, Sofia Broome, Katrina Ask, Elin Hernlund, Pia Haubro Andersen, Hedvig Kjellstrom, and Yong Jae Lee
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2022
[arXiv]

PartGAN: Weakly-supervised Part Decomposition for Image Generation and Segmentation
Yuheng Li, Krishna Kumar Singh, Yang Xue, and Yong Jae Lee
Proceedings of the British Machine Vision Conference (BMVC), 2021
[pdf]

Collaging Class-specific GANs for Semantic Image Synthesis
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, and Krishna Kumar Singh
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021
[arXiv] [talk video]

YolactEdge: Real-time Instance Segmentation on the Edge
Haotian Liu*, Rafael A. Rivera-Soto*, Fanyi Xiao, and Yong Jae Lee
(*equal contribution)
IEEE International Conference on Robotics and Automation (ICRA), 2021
[arXiv] [code] [youtube] [talk video] [Colab Notebook] [Colab Notebook (TensorRT)]

Few-shot Image Generation via Cross-domain Correspondence
Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, and Richard Zhang
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[project page] [arXiv] [code]

Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou, Linjie Yang, Ding Liu, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[arXiv] [code] [youtube]

Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains
Utkarsh Ojha, Krishna Kumar Singh, and Yong Jae Lee
International Conference on Learning Representations (ICLR), 2021
[project page] [open review] [arXiv] [talk video]

SinGAN-GIF: Learning a Generative Video Model from a Single GIF
Rajat Arora and Yong Jae Lee
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2021
[project page] [pdf] [talk video]

Seeing the Unseen: Predicting the First-Person Camera Wearer's Location and Pose in Third-Person Scenes
Yangming Wen, Krishna Kumar Singh, Markham Anderson, Wei-Pang Jan, and Yong Jae Lee
International Workshop on Egocentric Perception, Interaction and Computing (EPIC), ICCV 2021
[pdf]

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Class-Imbalanced Data
Utkarsh Ojha, Krishna Kumar Singh, Cho-Jui Hsieh, and Yong Jae Lee
Neural Information Processing Systems (NeurIPS), 2020
[project page] [arXiv] [code]

YOLACT++: Better Real-time Instance Segmentation
Daniel Bolya*, Chong Zhou*, Fanyi Xiao, and Yong Jae Lee
(*equal contribution)
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 (journal extension of our ICCV 2019 conference paper with improved models)
[arXiv] [code]

Delving Deeper into Anti-aliasing in ConvNets
Xueyan Zou, Fanyi Xiao, Zhiding Yu, and Yong Jae Lee
Proceedings of the British Machine Vision Conference (BMVC), 2020 (Oral presentation)
Best Paper Award
[project page] [arXiv] [code] [talk video]

Password-conditioned Anonymization and Deanonymization with Face Identity Transformers
Xiuye Gu, Weixin Luo, Michael Ryoo, and Yong Jae Lee
Proceedings of the European Conference on Computer Vision (ECCV), 2020
[arXiv] [code] [demo] [1 min talk video] [10 min talk video]

Boxer: Preventing Fraud by Scanning Credit Cards
Zainul Abi Din, Hari Venugopalan, Jaime Park, Andy Li, Weisu Yin, Haohui Mai, Yong Jae Lee, Steven Liu, and Samuel T. King
Proceedings of the USENIX Security Symposium (USENIX Security), 2020
[pdf] [project page] [talk video]

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation
Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[arXiv] [code] [youtube] [talk video]

Don’t Judge an Object by Its Context: Learning to Overcome Contextual Bias
Krishna Kumar Singh, Dhruv Mahajan, Kristen Grauman, Yong Jae Lee, Matt Feiszli, and Deepti Ghadiyaram
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 (Oral presentation)
[arXiv] [project page]

Instance-aware, Context-focused, and Memory-efficient Weakly-supervised Object Detection
Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander Schwing, and Jan Kautz
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[arXiv] [project page] [code]

Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks
Maheen Rashid, Hedvig Kjellström, and Yong Jae Lee
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2020
[arXiv] [code]

Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, and Christoph Feichtenhofer
arXiv 2019
[arXiv]

YOLACT: Real-time Instance Segmentation
Daniel Bolya, Chong Zhou, Fanyi Xiao, and Yong Jae Lee
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019 (Oral presentation)
Most Innovative Award, COCO Object Detection Challenge, ICCV 2019
[arXiv] [code] [pdf] [talk video]

Identity from here, Pose from there: Self-supervised Disentanglement and Generation of Objects using Unlabeled Videos
Fanyi Xiao, Haotian Liu, and Yong Jae Lee
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019
[pdf]

FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery
Krishna Kumar Singh*, Utkarsh Ojha*, and Yong Jae Lee
(*equal contribution)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral presentation)
[project page] [pdf] [arXiv] [code] [youtube] [talk video]

You reap what you sow: Using Videos to Generate High Precision Object Proposals for Weakly-supervised Object Detection
Krishna Kumar Singh and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
[project page] [pdf] [code]

HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
Xiuye Gu, Yijie Wang, Chongruo Wu, Yong Jae Lee, and Panqu Wang
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
[pdf] [supp] [code]

Video Object Detection with an Aligned Spatial-Temporal Memory
Fanyi Xiao and Yong Jae Lee
Proceedings of the European Conference on Computer Vision (ECCV), 2018
[project page] [pdf] [code]

Learning to Anonymize Faces for Privacy Preserving Action Detection
Zhongzheng Ren, Yong Jae Lee, and Michael Ryoo
Proceedings of the European Conference on Computer Vision (ECCV), 2018
[project page] [pdf] [youtube]

DOCK: Detecting Objects by transferring Common-sense Knowledge
Krishna Kumar Singh, Santosh Divvala, Ali Farhadi, and Yong Jae Lee
Proceedings of the European Conference on Computer Vision (ECCV), 2018
[project page] [pdf] [code]

A Visual Attention Grounding Neural Model for Multimodal Machine Translation
Mingyang Zhou, Runxiang Cheng, Yong Jae Lee, and Zhou Yu
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018 (Oral presentation)
[pdf]

Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery
Zhongzheng Ren and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[project page] [pdf] [code]

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks
Wenjian Hu, Krishna Kumar Singh*, Fanyi Xiao*, Jinyoung Han, Chen-Nee Chuah, and Yong Jae Lee
(*equal contribution)
Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2018
[pdf]

Can a Machine Learn to See Horse Pain? An Interdisciplinary Approach Towards Automated Decoding of Facial Expressions of Pain in the Horse
Pia Andersen, Karina Gleerup, Jennifer Wathan, Britt Coles, Hedvig Kjellström, Sofia Broome, Yong Jae Lee, Maheen Rashid, Claudia Sonder, Erika Rosenberger, and Deborah Forster
International Conference on Methods and Techniques in Behavioral Research (Measuring Behavior), 2018
[pdf]

What Should I Annotate? An Automatic Tool for Finding Video Segments for EquiFACS Annotation
Maheen Rashid, Sofia Broome, Pia Andersen, Karina Gleerup, and Yong Jae Lee
International Conference on Methods and Techniques in Behavioral Research (Measuring Behavior), 2018
[pdf]

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-supervised Object and Action Localization
Krishna Kumar Singh and Yong Jae Lee
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017
[project page] [pdf] [supp] [code]

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
Fanyi Xiao, Leonid Sigal, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[project page] [pdf]

Interspecies Knowledge Transfer for Facial Keypoint Detection
Maheen Rashid, Xiuye Gu, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[project page] [pdf] [code] [data]

Identifying First-Person Camera Wearers in Third-Person Videos
Chenyou Fan, Jangwon Lee, Mingze Xu, Krishna Kumar Singh, Yong Jae Lee, David Crandall, and Michael Ryoo
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[pdf]

Who Moved My Cheese? Automatic Annotation of Rodent Behaviors with Convolutional Neural Networks
Zhongzheng Ren, Adriana Noronha, Annie Vogel Ciernia, and Yong Jae Lee
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2017
[project page] [pdf] [code] [data]

Analyzing the Adoption and Cascading Process of OSN-Based Gifting Applications: An Empirical Study
M. Rezaur Rahman, Jinyoung Han, Yong Jae Lee, and Chen-Nee Chuah
ACM Transactions on the Web (TWEB), 2017
[pdf]

End-to-End Localization and Ranking for Relative Attributes
Krishna Kumar Singh and Yong Jae Lee
Proceedings of the European Conference on Computer Vision (ECCV), 2016
[project page] [pdf] [code]

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
Krishna Kumar Singh, Fanyi Xiao, and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
[project page] [pdf] [arXiv (with more results)] [code]

Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals
Fanyi Xiao and Yong Jae Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight presentation)
[project page] [pdf] [code]

Localizing and Visualizing Relative Attributes
Fanyi Xiao and Yong Jae Lee
Springer Book Chapter on Visual Attributes, 2016
[pdf] [code]

Discovering Mid-level Visual Connections in Space and Time
Yong Jae Lee, Alexei A. Efros, and Martial Hebert
Springer Book Chapter on Visual Analysis and Geo-Localization of Large Scale Imagery, 2016
[pdf] [code] [data]

Discovering the Spatial Extent of Relative Attributes
Fanyi Xiao and Yong Jae Lee
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015 (Oral presentation)
[project page] [pdf] [slides] [code] [video presentation]

FlowWeb: Joint Image Set Alignment by Weaving Consistent, Pixel-wise Correspondences
Tinghui Zhou, Yong Jae Lee, Stella X. Yu, and Alexei A. Efros
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 (Oral presentation)
[project page] [pdf] [code]

Predicting Important Objects for Egocentric Video Summarization
Yong Jae Lee and Kristen Grauman
International Journal of Computer Vision (IJCV), 2015
[project page] [pdf] [arXiv] [data]

Weakly-supervised Discovery of Visual Pattern Configurations
Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka, and Trevor Darrell
Neural Information Processing Systems (NIPS), 2014
[pdf]

AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections
Jun-Yan Zhu, Yong Jae Lee, and Alexei A. Efros
ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2014 (Oral presentation)
[project page] [pdf] [youtube] [See article in The New Yorker]

Style-aware Mid-level Representation for Discovering Visual Connections in Space and Time
Yong Jae Lee, Alexei A. Efros, and Martial Hebert
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013 (Oral presentation)
[project page] [pdf] [slides] [code] [data] [video presentation]

Discovering Important People and Objects for Egocentric Video Summarization
Yong Jae Lee, Joydeep Ghosh, and Kristen Grauman
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
[project page] [pdf] [supp] [extended abstract] [data]

Object-Graphs for Context-Aware Visual Category Discovery
Yong Jae Lee and Kristen Grauman
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2012
[project page] [pdf] [code]

Key-Segments for Video Object Segmentation
Yong Jae Lee, Jaechul Kim, and Kristen Grauman
Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2011
[project page] [pdf] [code] [data]

ShadowDraw: Real-Time User Guidance for Freehand Drawing
Yong Jae Lee, Larry Zitnick, and Michael Cohen
ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2011 (Oral presentation)
[project page] [pdf] [slides] [video] [youtube] [data]

Face Discovery with Social Context
Yong Jae Lee and Kristen Grauman
Proceedings of the British Machine Vision Conference (BMVC), 2011
[project page] [pdf] [extended abstract]

Learning the Easy Things First: Self-Paced Visual Category Discovery
Yong Jae Lee and Kristen Grauman
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011
[project page] [pdf]

Object-Graphs for Context-Aware Category Discovery
Yong Jae Lee and Kristen Grauman
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010 (Oral presentation)
[project page] [pdf] [supp] [slides] [code]

Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images
Yong Jae Lee and Kristen Grauman
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010
[project page] [pdf] [supp] [data]

Foreground Focus: Unsupervised Learning from Partially Matching Images
Yong Jae Lee and Kristen Grauman
International Journal of Computer Vision (IJCV), 2009
[project page] [pdf]

Shape Discovery from Unlabeled Image Collections
Yong Jae Lee and Kristen Grauman
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
[project page] [pdf] [supp]

Foreground Focus: Finding Meaningful Features in Unlabeled Images
Yong Jae Lee and Kristen Grauman
Proceedings of the British Machine Vision Conference (BMVC), 2008 (Oral presentation)
[project page] [pdf] [slides]

Ray-based Color Image Segmentation
Changhai Xu, Yong Jae Lee, and Benjamin Kuipers
Proceedings of the Canadian Conference on Computer and Robot Vision (CRV), 2008
[pdf]

Theses

PhD thesis: Visual Object Category Discovery in Images and Videos
MS thesis: Foreground Focus: Finding Meaningful Features in Unlabeled Images


Real-time Instance Segmentation with the YOLACT Family
OpenMMLab Tutorial, CVPR, June 2021 (virtual)
[talk video]

Learning to Understand Visual Data with Minimal Human Supervision
University of Wisconsin-Madison, April 2020 (virtual)
[talk video]