Machine Teaching

Machine teaching is the control of machine learning. The machine learning algorithm defines a dynamical system where the state (i.e. model) is driven by training data. Machine teaching designs the optimal training data to drive the learning algorithm to a target model.
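
The control view above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not a method taken from any paper listed on this page: the learner is assumed to be online gradient descent on squared loss, the teacher is assumed to know the learner's current state and step size, and the teacher may pick any real-valued item. The names `learner_step` and `teach_one_step` are hypothetical.

```python
import numpy as np

def learner_step(w, x, y, eta=0.1):
    """Learner dynamics: one gradient step on the loss 0.5 * (w.x - y)^2."""
    return w - eta * (w @ x - y) * x

def teach_one_step(w, w_target, eta=0.1):
    """Teacher (controller): choose one item (x, y) that drives w onto w_target.

    Assumes the teacher knows w and eta and may choose x and y freely.
    """
    d = w - w_target
    if np.allclose(d, 0):
        return None                # the learner is already at the target
    x = d                          # input points along the current error
    y = w @ x - 1.0 / eta          # makes eta * (w.x - y) equal 1
    return x, y

w0 = np.array([2.0, -1.0])         # learner's initial model (state)
w_star = np.array([0.5, 0.5])      # target model
x, y = teach_one_step(w0, w_star)
w1 = learner_step(w0, x, y)        # state after the single teaching item
```

With no limit on the item, one well-chosen example is enough for this learner. The problem becomes harder, and more realistic, once the teacher's choice of items is constrained.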

Machine teaching has various applications, including security, education, and AI system building:

In security applications, the dynamical system is a machine learning algorithm. An adversary intentionally modifies the training data ("poisoning") to force machine learning into building a nefarious model. Machine teaching quantifies the capability of the adversary, and offers defenses.
In education applications, the dynamical system is a human student. The teacher optimizes the lesson (i.e. training data) to help the student learn a target model (i.e. achieve an educational goal). If we are willing to assume a cognitive learning model of the student, we can use machine teaching to reverse-engineer the optimal training data.
In AI system building, machine teaching enables a domain expert to build a machine learning model faster and better than simply providing labeled training data.

This page contains our research on the theory, algorithms, and applications of machine teaching.

Publications

Tutorials

Xiaojin Zhu. An optimal control view of adversarial machine learning. arXiv:1811.04422, 2018.
[link]

Xiaojin Zhu, Adish Singla, Sandra Zilles, Anna N. Rafferty. An Overview of Machine Teaching. arXiv:1801.05927, 2018.

Xiaojin Zhu. Machine Teaching: an Inverse Problem to Machine Learning and an Approach Toward Optimal Education. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI "Blue Sky" Senior Member Presentation Track), 2015. AAAI / Computing Community Consortium "Blue Sky Ideas" Track Prize.
An overview of machine teaching.
[pdf | talk slides]

Theory of Machine Teaching

Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, and Adish Singla. Preference-based batch and sequential teaching: Towards a unified view of models. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
[arXiv]

Sanjoy Dasgupta, Daniel Hsu, Stefanos Poulis, Xiaojin Zhu. Teaching a black-box learner. In The 36th International Conference on Machine Learning (ICML), 2019.

Laurent Lessard, Xuezhou Zhang, and Xiaojin Zhu. An optimal control approach to sequential machine teaching. In The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
[pdf | arXiv 1810.06175]

Yuzhe Ma, Robert Nowak, Philippe Rigollet, Xuezhou Zhang, and Xiaojin Zhu. Teacher improves learning by selecting a training subset. In The 21st International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
[pdf | AMPL code]

Xiaojin Zhu, Ji Liu, and Manuel Lopes. No learner left behind: On the complexity of teaching multiple learners simultaneously. In The 26th International Joint Conference on Artificial Intelligence (IJCAI), 2017.
Minimax teaching dimension to make the worst learner in a class learn. Partitioning the class into sections improves teaching dimension.
[pdf]

Ji Liu and Xiaojin Zhu. The teaching dimension of linear learners. Journal of Machine Learning Research, 17(162):1-25, 2016.
This is the journal version of the ICML'16 paper, with a discussion on teacher-learner collusion.
[link]

Ji Liu, Xiaojin Zhu, and H. Gorune Ohannessian. The Teaching Dimension of Linear Learners. In The 33rd International Conference on Machine Learning (ICML), 2016.
We provide lower bounds on the training set size needed to perfectly teach a linear learner. We also provide the corresponding upper bounds (and thus the teaching dimension) by exhibiting teaching sets for SVM, logistic regression, and ridge regression.
[pdf | supplementary | arXiv preprint]
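
To make the idea of a teaching set concrete, here is a small sketch for ridge regression. It assumes a homogeneous learner (no bias term) that minimizes 0.5 * ||Xw - y||^2 + 0.5 * lam * ||w||^2; the construction below is our illustration under that assumption and is not necessarily the one used in the paper. The names `ridge_fit` and `one_item_teaching_set` are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution of (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def one_item_teaching_set(w_target, lam):
    """A single training item (x, y) whose ridge solution is w_target.

    Taking x = w_target and y = ||w_target||^2 + lam gives
    (x x^T + lam * I) w_target = x * y, so w_target solves the normal equations.
    """
    x = w_target.copy()
    y = w_target @ w_target + lam
    return x[None, :], np.array([y])

w_star = np.array([1.0, -2.0, 0.5])   # target model
lam = 0.1                             # regularization weight assumed known to the teacher
X, y = one_item_teaching_set(w_star, lam)
w_hat = ridge_fit(X, y, lam)          # what the learner recovers from one item
```

The label is inflated by `lam` to cancel the shrinkage that regularization would otherwise apply to the learned weights.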

Xuezhou Zhang, Hrag Gorune Ohannessian, Ayon Sen, Scott Alfeld and Xiaojin Zhu. Optimal Teaching for Online Perceptrons. In NIPS 2016 workshop on Constructive Machine Learning, 2016. [pdf]

Xiaojin Zhu. Machine teaching for Bayesian learners in the exponential family. In Advances in Neural Information Processing Systems (NIPS), 2013.
We study machine teaching, or optimal teaching, the inverse problem of machine learning.
[pdf | poster]

Applications in Security, Trustworthy, and Interpretable AI

Xuezhou Zhang, Xiaojin Zhu, and Stephen Wright. Training set debugging using trusted items. In The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018.
[pdf]

Scott Alfeld, Xiaojin Zhu, and Paul Barford. Explicit defense actions against test-set attacks. In The Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017.
[pdf]

Scott Alfeld, Xiaojin Zhu, and Paul Barford. Data Poisoning Attacks against Autoregressive Models. In The Thirtieth AAAI Conference on Artificial Intelligence (AAAI), 2016.
[pdf]

Gabriel Cadamuro, Ran Gilad-Bachrach, and Xiaojin Zhu. Debugging machine learning models. In ICML Workshop on Reliable Machine Learning in the Wild, 2016.
Training data repair to ensure certain test items are correctly predicted. An application of machine teaching.
[pdf | extended abstract for CHI 2016 workshop on human centred machine learning]

Shike Mei and Xiaojin Zhu. The security of latent Dirichlet allocation. In The Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
How might an attacker poison the corpus to manipulate LDA topics? We answer this question via machine teaching.
[pdf]

Shike Mei and Xiaojin Zhu. Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15), 2015.
An application of machine teaching to identify the optimal training-set attacks against a learning algorithm.
[pdf | poster ad | poster | Mendota ice data | Tech Report 1813]

Shike Mei and Xiaojin Zhu. Some Submodular Data-Poisoning Attacks on Machine Learners. Computer Science Tech Report 1822, University of Wisconsin-Madison, 2015.
[pdf]

Applications in Education and Cognitive Psychology

Robert M. Nosofsky, Craig A. Sanders, Xiaojin Zhu, and Mark A. McDaniel. Model-guided search for optimal natural-science-category training exemplars: A work in progress. Psychonomic Bulletin & Review, 2018.

Ayon Sen, Purav Patel, Martina A. Rau, Blake Mason, Robert Nowak, Timothy T. Rogers, and Xiaojin Zhu.
- For teaching perceptual fluency, machines beat human experts. In The 40th Annual Conference of the Cognitive Science Society (CogSci), 2018. [pdf]
- Machine beats human at sequencing visuals for perceptual-fluency practice. In Educational Data Mining, 2018. [pdf]

Kaustubh Patil, Xiaojin Zhu, Lukasz Kopec, and Bradley Love. Optimal Teaching for Limited-Capacity Human Learners. In Advances in Neural Information Processing Systems (NIPS), 2014.
Using machine teaching, we construct an optimal training data set to teach human students a categorization task, assuming the students use the Generalized Context Model (GCM) as the learning algorithm. Our optimal training data set is non-iid, has interesting "idealization" properties, and outperforms iid training data sets sampled from the underlying test distribution.
[pdf | poster | spotlight | data]

Faisal Khan, Xiaojin Zhu, and Bilge Mutlu. How do humans teach: On curriculum learning and teaching dimension. In Advances in Neural Information Processing Systems (NIPS) 25. 2011.
What is the optimal teaching strategy for a threshold function in 1D? Should one start teaching with items around the threshold? Or with the most unambiguous items farthest from the threshold? Two computational theories, teaching dimension and curriculum learning, disagree. We show that humans do the latter. We then extend teaching dimension theory to explain it.
[pdf | data | slides | UCSD teaching workshop talk]
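
The teaching-dimension strategy for a 1D threshold can be sketched as follows. This is an illustrative toy, assuming a noiseless version-space learner on [0, 1] that labels x as 1 when x is at or above the threshold; it does not model the human behavior or the extended theory reported in the paper. The function names are hypothetical.

```python
import random

def version_space(items):
    """Interval (lo, hi) of thresholds consistent with the labeled items.

    An item (x, 0) means x is below the threshold; (x, 1) means x is at or above it.
    """
    lo = max((x for x, y in items if y == 0), default=0.0)
    hi = min((x for x, y in items if y == 1), default=1.0)
    return lo, hi

def teaching_set(theta, eps):
    """Two items that bracket the threshold theta within a window of width eps."""
    return [(theta - eps / 2, 0), (theta + eps / 2, 1)]

def iid_set(theta, n, rng):
    """n items drawn uniformly from [0, 1] and labeled by the true threshold."""
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x >= theta)) for x in xs]

theta, eps = 0.3, 0.01
lo_t, hi_t = version_space(teaching_set(theta, eps))                   # two chosen items
lo_r, hi_r = version_space(iid_set(theta, 20, random.Random(0)))       # twenty random items
print("taught width:", hi_t - lo_t, "iid width:", hi_r - lo_r)
```

Two items placed next to the threshold pin it down to any desired precision, whereas randomly drawn items narrow the interval only gradually as more are added.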

Applications in AI System Building

Jina Suh, Xiaojin Zhu, and Saleema Amershi. The label complexity of mixed-initiative classifier training. In The 33rd International Conference on Machine Learning (ICML), 2016.
Do you do interactive machine learning with a human oracle? Then don't use active learning alone: mixing it with machine teaching is far better both in theory and in practice.
[pdf | supplementary]

Christopher Meek, Patrice Y. Simard, and Xiaojin Zhu. Analysis of a design pattern for teaching with features and labels, 2016. arXiv

Workshops

Machine Teaching for Humans. Jan 2023. Madeira.
NIPS Workshop on Teaching Machines, Robots, and Humans, Dec 2017, Long Beach, CA.
Dagstuhl Seminar on Machine Learning and Formal Methods. August 2017, Germany.

Talks

Toward adversarial learning as control at 2nd ARO/IARPA Workshop on Adversarial Machine Learning. University of Maryland, May 2018
Machine Teaching and its Applications. Department of Computer Science, Duke University. 2018
Machine Teaching as a Probe for Learning Mechanism in Humans at the Tsinghua Laboratory of Brain and Intelligence Workshop on Brain and Artificial Intelligence. Beijing, China. 2017
Debugging the Machine Learning Pipeline at the Interpretable Machine Learning Symposium, NIPS 2017
Introduction to Machine Teaching at the Workshop on Teaching Machines, Robots, and Humans, NIPS 2017
Dagstuhl Seminar on Machine Learning and Formal Methods. August 2017, Germany. A Challenge in Machine Teaching.
Talk at Simons Institute Workshop on Interactive Learning 2017, Berkeley, CA. Machine Teaching in Interactive Learning
Talk at ICML 2016 Workshop on Reliable Machine Learning in the Wild, New York: Machine Teaching and Security
Talk at NIPS 2015 Workshop, Montreal, Canada: Machine Teaching for Personalized Education, Security, Interactive Machine Learning
Talk at ICML 2015 Workshop on Machine Learning for Education, Lille, France: Machine Teaching.

Code and Data

Poolmate: pool-based machine teaching search algorithms through a file-based API, Version 0.1a0. Poolmate provides a command-line interface to algorithms that search for teaching sets among a candidate pool. It is designed to work with any learner that can communicate through a file-based API.
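
The general idea of pool-based search can be sketched in a few lines. The example below does not use Poolmate's interface or its search algorithms, neither of which is described on this page. It assumes a toy 1-nearest-neighbor learner on 1D inputs and exhaustively scores every size-k subset of a small pool; all names are hypothetical.

```python
from itertools import combinations

def nn_predict(train, x):
    """1-nearest-neighbor prediction from a collection of (x, label) pairs."""
    return min(train, key=lambda item: abs(item[0] - x))[1]

def error(train, eval_set):
    """Fraction of evaluation items that the learner trained on `train` gets wrong."""
    wrong = sum(nn_predict(train, x) != y for x, y in eval_set)
    return wrong / len(eval_set)

def search_pool(pool, k, eval_set):
    """Score every size-k subset of the pool and return one with the lowest error."""
    return min(combinations(pool, k), key=lambda s: error(s, eval_set))

# Candidate pool: 11 grid points labeled by a threshold at 0.45 (assumed target).
pool = [(i / 10, int(i / 10 >= 0.45)) for i in range(11)]
best = search_pool(pool, 2, pool)      # a two-item teaching set drawn from the pool
best_error = error(best, pool)
```

Exhaustive enumeration is only feasible for tiny pools. Larger pools call for heuristic search over subsets.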

Training set debugging using trusted items (AAAI 2018) DUTI code

In the media

Squashing the bugs in machine learning: Researchers make computer-trained models more trustworthy. Computer Sciences Department News, UW-Madison. By Jennifer Smith, March 8, 2018

"Machine Teaching" Is Seen as Way to Develop Personalized Curricula. The Chronicle of Higher Education. By Mary Ellen McIntire, August 12, 2015

In the Mind of a Student. Inside Higher Ed. By Jacqueline Thomsen, September 25, 2015

Machine teaching holds the power to illuminate human learning. University of Wisconsin-Madison News. By Jennifer A. Smith, August 10, 2015

Machine Learning? No, Machine Teaching. Science 2.0. August 13, 2015

The Flipside of Machine Learning. The R&D Magazine. By Greg Watry, August 13, 2015
