Houston: Human tutoring versus Computer tutoring. Who wins?

This is (probably) the final blog post about our work in Houston. For newer readers: in Summer 2010 we sent a team to Houston. Their job was to design and launch a massive cohort of full-time math tutors. This was 1 part of a 5-pronged effort (called Apollo 20) to do "turnarounds" of 9 of the district's lowest-performing schools.

The news:

1. Roland Fryer, the economist who put this whole thing together, today published a scholarly paper analyzing the effects of the Apollo effort.

2. Ericka Mellon of the Houston Chronicle wrote a news article today about that study.

Tutoring seems to work

Fryer's research found that the tutoring - pairing one tutor with two students - was extremely effective, equating to between five and nine extra months in school.

Yay, team.

One new wrinkle in this discussion. If computer-based tutoring could get anywhere CLOSE to the effect of human tutoring, of course we'd propose computer tutoring! Much, much cheaper. And computer tutoring is so durned hot right now.

Houston created a natural experiment. Only 6th and 9th graders got MATCH-style human tutoring, and that's where scores rose the most. Kids in the other grades who were below grade level got a computer-based "double dose" class instead, along with the same longer day and different teachers, yet their math scores moved far less.

Here is the description of the 2 approaches:

For all sixth and ninth grade students, one period Monday through Thursday was devoted to receiving two-on-one tutoring in math. The total number of hours a student was tutored was approximately 189 hours for ninth graders and 215 hours for sixth graders. All sixth and ninth grade students received a class period of math tutoring every day, regardless of their previous math performance. The tutorials were a part of the regular class schedule for students, and students attended these tutorials in separate classrooms laid out intentionally to support the tutorial program.

This model was strongly recommended by the MATCH School, which has been successfully implementing a similar tutoring model since 2004. The justification for the model was twofold: first, all students could benefit from high-dosage tutoring, either to remediate deficiencies in students’ math skills or to provide acceleration for students already performing at or above grade level; second, including all students in a grade in the tutorial program was thought to remove the negative stigma often attached to tutoring programs that are exclusively used for remediation.

We hired 250 tutors – 230 were from the greater Houston area, 3 moved from other parts of Texas, and 17 moved from outside of Texas. Tutors were paid $20,000 with the possibility of earning an average bonus of $3,500 based on tutor attendance and student performance.

Consistent with Neal (forthcoming), the student performance portion of the tutor incentive program was based on relative student performance within the distribution of students in the district on the end-of-year state assessment. Tutor candidates were recruited from lists of Teach for America and MATCH applicants; additionally, the position was posted on college and university job boards at over 200 institutions across the country. We partnered with a core team of MATCH alumni who helped screen, hire, and train tutors based on the “No Excuses” philosophy, and develop a curriculum tightly aligned with Texas state standards.

In non-tutored grades – seven, eight, ten, eleven, and twelve – students received a “double dose” of math or reading – if they were below grade level – in the subject in which they were the furthest behind.

This provided an extra 189 hours for high school students and 215 hours for middle school students of math/reading instruction for students who are below grade level. The curriculum for the extra math class was based on the Carnegie Math program. The Carnegie Math curriculum uses personalized math software featuring differentiated instruction based on previous student performance. The program incorporates continual assessment that is visible to both students and teachers.

The curriculum for the extra reading class utilized the READ 180 program. The READ 180 model relies on a very specific classroom instructional model: 20 minutes of whole-group instruction, an hour of small-group rotations among three stations (instructional software, small-group instruction, and modeled/independent reading) for 20 minutes each, and 10 minutes of whole-group wrap-up. The program provides specific supports for special education students and English Language Learners. The books used by students in the modeled/independent reading station are leveled readers that allow students to read age-appropriate subject matter at their tested lexile level. As with Carnegie Math, students are frequently assessed to determine their lexile level in order to adapt instruction to fit individual needs.

Computers are great for helping people learn what they want to learn. They're not particularly good at getting someone to learn something they do not want to learn. For that, you need very skilled people (teachers and tutors) who can build relationships, use that to generate order and effort from kids, and then turn that effort into learning. A computer needs to start on "third base" -- take effort and flip that into learning.

I think Steve Jobs had it right.

It is so much more hopeful to think that technology can solve the problems that are more human and more organizational and more political in nature, and it ain't so. We need to attack these things at the root, which is people and how much freedom we give people, the competition that will attract the best people. Unfortunately, there are side effects, like pushing out a lot of 46 year old teachers who lost their spirit fifteen years ago and shouldn't be teaching anymore. I feel very strongly about this. I wish it was as simple as giving it over to the computer.

Taken together:

We show that the average impact of these changes on student achievement is 0.276 standard deviations in math and 0.059 standard deviations in reading, which is strikingly similar to reported impacts of attending the Harlem Children’s Zone and Knowledge is Power Program schools – two strict “No Excuses” adherents.

So that's big.

Then we separate the "whole effect" into "high-dosage tutoring" and "everything else" -- and high-dosage tutoring is creating most of the gains, and the other stuff is kind of "bringing down the average."

For example,

Grade 6 math has an effect size of +.484 standard deviations, versus .119 in Grades 7 and 8. So the gains are about four times as large (roughly 300% bigger). It's double a typical KIPP effect.

Grade 9 math has an effect size of (a whopping) .739 SDs, versus .165 in Grades 10 and 11. Same thing.
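
If you want to check the arithmetic yourself, here is a quick sketch in Python. The four effect sizes are the ones quoted above; the function name and the dictionary layout are just my own illustration, not anything from Fryer's paper. It prints how many times as large the tutored-grade effect is, and the same gap expressed as "percent bigger" (a ratio of 4x works out to 300% bigger, not 400%).

```python
# Math effect sizes in standard deviations, as quoted in this post.
# The grouping labels are illustrative, not taken from the paper.
effects = {
    "middle school": {"tutored": 0.484, "untutored": 0.119},  # grade 6 vs grades 7-8
    "high school": {"tutored": 0.739, "untutored": 0.165},    # grade 9 vs grades 10-11
}


def compare(tutored, untutored):
    """Return (ratio, percent bigger) of the tutored effect over the untutored one."""
    ratio = tutored / untutored
    pct_bigger = (ratio - 1) * 100
    return ratio, pct_bigger


for level, e in effects.items():
    ratio, pct = compare(e["tutored"], e["untutored"])
    print(f"{level}: {ratio:.1f}x as large, {pct:.0f}% bigger")
```

This is only a comparison of point estimates. It says nothing about the standard errors around each number, so treat the ratios as a rough guide.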

That's not to say we can't leverage computers to help deliver much better tutoring. My view is simply that we need a lot more experimentation on adult/computer tutoring duos, where the computer extends the reach/capacity of a skilled human.

Former MATCH Corps member Tim Johnson is still down there (running this program for the district), while Patti Tao is consulting on similar work just launched in Denver. Fryer has launched a nonprofit (I'm on the board) to try to grow this turnaround school work. It's called Blueprint Schools. We're extremely grateful to superintendent Terry Grier for the opportunity.

Looking forward:

There's a big question on the scalability of full-time tutors. Of course. Justifiably. People think about the notion and their heads spin.

My colleague Alan Safran has become extremely interested in that question. He hopes to launch more experiments using full-time tutors -- in charters, districts, and possibly in other ways. He is looking for partners. If you want to know more, let me know.