[Paper] How to personalize education at scale.

06 Oct 2018

Education (conversely, learning) is one of the quintessential human experiences. It is also one of the most practical of human endeavors, leading directly to higher income and quality of life, greater social mobility (one of the strongest indicators of societal fairness), and greater tolerance for other people and beliefs.

Differences in education quality are also one of the greatest sources of inequity in the contemporary US, and across the globe. All else being equal, families (or communities) with more money can invest more in education, which does produce results. The issue is not that these families can afford high-quality education; the issue is that there are many families that cannot.

To illustrate the magnitude of this effect, Bloom (of “Bloom’s Taxonomy”) wrote in 1984 about the “Two-Sigma Problem” - that students who receive one-on-one tutoring perform about two standard deviations above the average of students who receive classroom (1 teacher to ~30 students) instruction. This means that in randomized trials, the average student who receives one-on-one instruction performs better than 98% of students in a regular classroom. Clearly, personalized instruction is effective.
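
As a rough check on the 98% figure, the two-sigma effect can be converted into a percentile. This is only a sketch, and it assumes classroom test scores are approximately normally distributed, which is an assumption added here rather than something the text above states. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: classroom scores follow a standard normal distribution.
effect_size_in_sd = 2.0

# Fraction of classroom students scoring below a student who sits
# two standard deviations above the classroom mean.
percentile = NormalDist(mu=0.0, sigma=1.0).cdf(effect_size_in_sd)

print(f"{percentile:.1%}")  # prints 97.7%
```

Under that assumption, the average tutored student lands at roughly the 98th percentile of the classroom distribution.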

The Problem - Cost

The problem is that under traditional educational paradigms, personalization of education is expensive. It requires a tutor or instructor for every student, which just isn’t feasible in most economies. Engaging parents, leveraging near-peer tutoring (students who passed the class previously), and connecting students with mentors in the community are all great ways to move toward personalized learning for every student. But unfortunately, these do not address the fundamental cost of one-on-one tutoring; they only spread the cost out over more people. Furthermore, these individuals don’t necessarily have the pedagogical training and domain expertise that a dedicated tutor or professional teacher will have acquired through education degrees covering scientifically validated instructional approaches or through years of experience with scores of students.

In an attempt to address this problem in a cost-effective way, computerized tutors have become a popular area of research, with some notable successes. Programs like the Carnegie Math Cognitive Tutor, Khan Academy, or Duolingo have achieved wide adoption with measured results, in some cases approaching the efficacy of personalized instruction. All of these successes have come from domains that are problem-rich; mathematics and foreign language instruction lend themselves easily to automatically graded questions that give a good idea of what students know. These programs all track what students know and are likely to get right, ensuring that the instruction provided is always appropriate to what the student has mastered. However, they do not accommodate differences in student background, interests, interpretation, or motivation. The question is, how can we extend these amazing results to other subjects like history, writing, or art?

A Solution - Machine Learning?

One of the most exciting technologies for adapting and personalizing processes at scale is machine learning. The data-driven processes that allow Facebook to recognize and tag faces, Google to guess what you’re looking for, and Netflix to recommend a movie you might like can be used to recognize different types of learners, suggest curricula, and recommend resources that can help students understand new topics. The technology is ripe for adaptation to education, if only we can solve a few small problems:

First, we need a way of measuring when we have succeeded. We need a measure of what students have learned, and it needs to be something that’s specific enough to track the benefits to individual students, short enough to be used whenever needed, and cheap enough to create that we can make one for every topic we might want to teach. One-on-one interviews are reliable but expensive to administer, while automatically graded exams can be either too coarse or too difficult to design. To address this problem, I’ve been working on methods for generating quizzes using crowd-sourcing and machine learning, and some collaborators and I recently had a paper accepted on this topic: “Harnessing the Wisdom of the Classes: Classsourcing and Machine Learning for Assessment Instrument Generation”. In this paper we use a Multi-Armed Bandit Process to select, from crowd-sourced contributions, the questions that are most informative in distinguishing levels of student knowledge. (More on this in a future blog post!)
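
To give a flavor of how a bandit can pick informative questions, here is a minimal sketch using Thompson sampling, a standard bandit algorithm. It is not the method from the paper. Each candidate question is an arm, and a "success" means the question separated a stronger student from a weaker one on that trial. The function name `select_questions` and the simulated `discrimination_rates` are hypothetical and made up for illustration.

```python
import random


def select_questions(discrimination_rates, rounds=2000, seed=0):
    """Thompson sampling over candidate questions (one arm per question).

    discrimination_rates: simulated, invented probabilities that each
    question distinguishes a stronger student from a weaker one.
    Returns the number of times each question was chosen.
    """
    rng = random.Random(seed)
    n = len(discrimination_rates)
    successes = [1] * n  # Beta(1, 1) prior for every question
    failures = [1] * n
    pulls = [0] * n
    for _ in range(rounds):
        # Sample a plausible discrimination rate for each question
        # from its current Beta posterior, then ask the best-looking one.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        chosen = max(range(n), key=lambda i: samples[i])
        # Simulate whether the question was informative this time.
        if rng.random() < discrimination_rates[chosen]:
            successes[chosen] += 1
        else:
            failures[chosen] += 1
        pulls[chosen] += 1
    return pulls


pulls = select_questions([0.2, 0.5, 0.8])
best = max(range(len(pulls)), key=lambda i: pulls[i])
print(pulls, "most-asked question index:", best)
```

Over time the sampler spends most of its trials on the question that discriminates best, while still occasionally trying the others in case the early evidence was misleading.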

Second, we need a way of modeling students so we can predict how they will respond to different instruction. Part of that model is what they know, and the computerized tutors have shown that mastery tracking is enormously helpful, but students are more than just buckets of knowledge. The question is, what other features are helpful for predicting how students learn? Some collaborators at a company in China called Special A Education have suggested that personality assessments, such as the MBTI or HEXACO, may be helpful for augmenting our model of students. They have also suggested that teachers may be able to identify character traits and students may be able to identify interests. As data is collected from more students, we can rigorously evaluate which of these additional features (or others) are most useful for determining how individual students learn best. Although it can be tempting to simply hand-pick features that we think might be helpful, machine learning can help us to systematically identify features that are genuinely and statistically reliable. An exciting new direction is using open-ended “Notice and Wonder” activities to generate topic-specific features that might be useful for modeling students.

Third, we need a way of systematically and efficiently determining the best way to teach each type of student. This is why I am in grad school right now, working in Reinforcement Learning. The idea behind reinforcement learning is to create algorithms that can improve over time based on signals of how well they have succeeded. The canonical reinforcement learning problem is the Markov Decision Process (MDP). MDPs have four parts (although the count can change depending on how pedantic the RL researcher is feeling): (S, A, R, T). S stands for the state space, that is, the set of possible states of the universe. In the education setting, this includes how much the student knows, what their interests are, and the other features we determine by solving the previous problem. A stands for the action space. This is the set of actions the system can take to impact the world. In education, this might be showing an educational video, having the student read an article, play a game, create something, or complete an activity (online or with a teacher or peer in person). R stands for reward, and is the way that we measure success. In our setting this might be how much students know when we test them, or how quickly they master a topic. T represents the transitions in state: if I teach student B using action C, how will the state of the student change?
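
To make the (S, A, R, T) tuple concrete, here is a toy tutoring MDP. Everything in it is invented for illustration: three mastery levels as states, two teaching actions, made-up transition probabilities, and a reward of 1 when the student first reaches mastery. It is solved with value iteration, which is a planning method that assumes T is already known. A real reinforcement learning system would have to learn from experience instead.

```python
STATES = [0, 1, 2]  # S: 0 = novice, 1 = partial understanding, 2 = mastered
ACTIONS = ["video", "practice"]  # A: what the tutor can do
GAMMA = 0.9  # discount factor, so faster mastery is worth more

# T[state][action] -> list of (next_state, probability). Numbers are invented.
T = {
    0: {"video": [(1, 0.6), (0, 0.4)], "practice": [(1, 0.3), (0, 0.7)]},
    1: {"video": [(2, 0.2), (1, 0.8)], "practice": [(2, 0.7), (1, 0.3)]},
    2: {"video": [(2, 1.0)], "practice": [(2, 1.0)]},  # mastery is absorbing
}


def reward(state, next_state):
    """R: 1 when the student first reaches mastery, otherwise 0."""
    return 1.0 if state != 2 and next_state == 2 else 0.0


def q_value(state, action, values):
    """Expected discounted return of taking `action` in `state`."""
    return sum(
        prob * (reward(state, nxt) + GAMMA * values[nxt])
        for nxt, prob in T[state][action]
    )


def value_iteration(sweeps=200):
    """Repeatedly back up state values, then read off the greedy policy."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        values = {s: max(q_value(s, a, values) for a in ACTIONS) for s in STATES}
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values)) for s in STATES}
    return values, policy


values, policy = value_iteration()
print(policy[0], policy[1])  # prints: video practice
```

With these made-up numbers, the best plan is to show a novice the video and then switch to practice once the student has partial understanding. The point is that the right action depends on the student's state.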

Clearly, education (and in fact, pedagogical experimentation) can be cast as a reinforcement learning problem. The problem is, it’s a really big reinforcement learning problem. There are so many different types of students, so many different ways of teaching them, and so many different things to teach that there’s no way we can just try every combination and see what works. We have to generalize across different contexts, deal with imperfect knowledge of the student, and hopefully notify teachers when it would be really helpful to have a new way of teaching certain students. But I’ve also just linked to a number of papers from people who have invented techniques that might be able to overcome these difficulties. We have unprecedented access to data and an unprecedented ability to disseminate knowledge. The time is right for us to use this cutting-edge technology to address one of the most exciting possibilities of our time: giving every student the best education they can get.

A Note about Human Relationships

The goal of this article is not to push for computerized education because it is cheap. The goal of this article is to inspire us to work together on scalable personalized education, because it is effective. Electronic supplements to in-person education can free teachers working with groups of students to focus on important personal skills, to develop students’ communication and collaboration, and to inspire and display the qualities that make us uniquely human - our curiosity, empathy, and courage. Computerized tools can empower classrooms by removing or streamlining the mundane parts of learning, like assessment, memorization, and presentation of fact. Computerized tools can empower classrooms by providing data-driven support for novel pedagogical practices or learning activities. Computerized tools can empower classrooms by equipping interested parties like teachers, parents, and administrators with interpretable measures of what individual students have learned.

In the spirit of empowering human relationships, let’s work together for an amazing future!

This article is based on the paper “Personalized Education at Scale”, which I recently wrote with Evan Cater and Michael Littman.

To cite, feel free to use:

  title={Personalized Education at Scale},
  author={Saarinen, Sam and Cater, Evan and Littman, Michael},
  journal={arXiv preprint arXiv:1809.10025},

- (S)am
