Do Value-Added Methods Level the Playing Field for Teachers?



  • Value-added measures partially level the playing field by controlling for many student characteristics. But if they don't fully adjust for all the factors that influence achievement and that consistently differ among classrooms, they may be distorted, or confounded. (An estimate of a teacher’s effect is said to be confounded when her contribution cannot be separated from other factors outside of her control, namely the students in her classroom.)
  • Simple value-added models that control for just a few test scores (or only one score) and no other variables produce measures that underestimate teachers with low-achieving students and overestimate teachers with high-achieving students.
  • The evidence, while inconclusive, generally suggests that confounding is weak. But it would not be prudent to conclude that confounding is not a problem for any teacher. In particular, the evidence on comparing teachers across schools is limited.
  • Studies assess general patterns of confounding. They do not examine confounding for individual teachers, and they can't rule out the possibility that some teachers consistently teach students who are distinct enough to cause confounding.
  • Value-added models often control for variables such as average prior achievement for a classroom or school, but this practice could introduce errors into value-added estimates.
  • Confounding might lead school systems to draw erroneous conclusions about their teachers – conclusions that carry heavy costs to both teachers and society.
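
The point about simple models can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method used in any of the studies summarized here: every teacher is given the same true effect (zero), classrooms differ only in average student ability, and each test measures ability with random noise. The function name, class sizes, and noise levels are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_value_added(n_teachers=20, class_size=200, seed=0):
    """Simulate equally effective teachers whose classrooms differ in ability."""
    rng = np.random.default_rng(seed)
    # Assumption: average classroom ability rises steadily across teachers.
    class_means = np.linspace(-1.0, 1.0, n_teachers)
    teacher = np.repeat(np.arange(n_teachers), class_size)
    ability = class_means[teacher] + rng.normal(0.0, 1.0, teacher.size)
    # Assumption: both tests are noisy measures of ability, and every
    # teacher's true effect is zero, so any nonzero estimate is distortion.
    prior = ability + rng.normal(0.0, 1.0, teacher.size)
    current = ability + rng.normal(0.0, 1.0, teacher.size)
    # Simple value-added model: regress the current score on one prior score.
    slope, intercept = np.polyfit(prior, current, 1)
    residual = current - (intercept + slope * prior)
    # A teacher's estimate is the mean residual of the students in her class.
    estimates = np.array(
        [residual[teacher == t].mean() for t in range(n_teachers)]
    )
    return class_means, estimates

class_means, estimates = simulate_value_added()
corr = np.corrcoef(class_means, estimates)[0, 1]
print(f"correlation of classroom ability with estimated value-added: {corr:.2f}")
```

Under these assumptions the single noisy prior score only partly adjusts for ability, so teachers of the lowest-ability classrooms receive negative estimates and teachers of the highest-ability classrooms receive positive ones, even though all of them are equally effective by construction.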


Value-added models have caught the interest of policymakers because, unlike other ways of using student test scores for accountability, they purport to "level the playing field." That is, they supposedly reflect only a teacher's effectiveness, not whether she teaches high- or low-income students, for instance, or students in accelerated or standard classes. Yet many people are concerned that a teacher's estimated effect from value-added measures will be sensitive to the characteristics of her students. More specifically, they believe that teachers of low-income, minority, or special education students will have lower value-added scores than equally effective teachers who are teaching students outside these populations. Other people worry that the opposite might be true: that some value-added models might cause teachers of low-income, minority, or special education students to have higher value-added scores than equally effective teachers who work with higher-achieving, less risky populations.

In this brief, we discuss what is and is not known about how well value-added measures level the playing field for teachers by controlling for student characteristics. We first discuss the results of empirical explorations. We then address outstanding questions and the challenges to answering them with empirical data. Finally, we discuss the implications of these findings for teacher evaluations and the actions that may be based on them.

The Evidence on Charter Schools and Test Scores

The Shanker Institute has just released a policy brief titled "The Evidence on Charter Schools and Test Scores," which finds:

The available research suggests that charter schools’ effects on test score gains vary by location, school/student characteristics and other factors. When there are differences, they tend to be modest. There is tentative evidence suggesting that high-performing charter schools share certain key features, especially private donations, large expansions of school time, tutoring programs and strong discipline policies. Finally, while there may be a role for state/local policies in ensuring quality as charters proliferate, scaling up proven approaches is constrained by the lack of adequate funding, and the few places where charter sectors as a whole have been shown to get very strong results seem to be those in which their presence is more limited. Overall, after more than 20 years of proliferation, charter schools face the same challenges as regular public schools in boosting student achievement, and future research should continue to focus on identifying the policies, practices and other characteristics that help explain the wide variation in their results.

Research doesn’t back up key ed reforms

There is no solid evidence supporting many of the positions on teachers and teacher evaluation taken by some school reformers today, according to a new assessment of research on the subject.

The Education Writers Association released a new brief that draws on more than 40 research studies or research syntheses, as well as interviews with scholars who work in this field.

You can read the entire brief (written by Education Week assistant editor Stephen Sawchuk), but here are the bottom-line conclusions of each section:

Q) Are teachers the most important factor affecting student achievement?

A) Research has shown that the variation in student achievement is predominantly a product of individual and family background characteristics. Of the school factors that have been isolated for study, teachers are probably the most important determinants of how students will perform on standardized tests.

Q) Are value-added estimations reliable or stable?

A) Value-added models appear to pick up some differences in teacher quality, but they can be influenced by a number of factors, such as the statistical controls selected. They may also be affected by the characteristics of schools and peers. The impact of unmeasured factors in schools, such as principals and choice of curriculum, is less clear.

Q) What are the differences in achievement between students who have effective or ineffective teachers for several years in a row?

A) Some teachers produce stronger achievement gains among their students than others do. However, estimates of an individual teacher’s effectiveness can vary from year to year, and the impact of an effective teacher seems to decrease with time. The cumulative effect on students’ learning from having a succession of strong teachers is not clear.

Q) Do teacher characteristics such as academic achievement, years of experience, and certification affect student test scores?

A) Teachers improve in effectiveness at least over their first few years on the job. Characteristics such as board certification and content knowledge in math are sometimes linked with student achievement. Still, these factors don’t explain much of the difference in teacher effectiveness overall.

Q) Does merit pay for teachers produce better student achievement or retain more-effective teachers?

A) In the United States, merit pay exclusively focused on rewarding teachers whose students produce gains has not been shown to improve student achievement, though some international studies show positive effects. Research has been mixed on comprehensive pay models that incorporate other elements, such as professional development. Scholars are still examining whether such programs might work over time by attracting more effective teachers.

Q) Do students in unionized states do better than students in states without unions?

A) Students tend to do well in some heavily unionized states, but it isn’t possible to conclude that it is the presence or absence of unions that causes that achievement.

What Studies Say About Teacher Effectiveness

Teacher attrition and education policy

One of the genuinely major issues facing education is teacher attrition. Each year, significant numbers of teachers leave the profession. From a human capital perspective, this is hugely expensive and harms the delivery of education. A large number of studies have been performed to assess this problem. A recent study, published by the American Educational Research Association, looked at all the major studies in this area. The study can be found here (pdf). What follows are some of the concluding remarks.

This literature review provides a summary and critical evaluation of the recent published research on the topic of teacher recruitment and retention. We reviewed studies that examined (1) the characteristics of individuals who enter teaching, (2) the characteristics of individuals who remain in teaching, (3) the external characteristics of schools and districts that affect recruitment and retention, (4) compensation policies that affect recruitment and retention, (5) pre-service policies that affect recruitment and retention, and (6) in-service policies that affect recruitment and retention.

The reviewed research offered several consistent findings. The strongest results were those relating to the influence of various factors on attrition due to the widespread availability of longitudinal data sets that track the employment of teachers. Below, we summarize the findings that emerged in the recent empirical research literature.

  1. Results that arose fairly consistently regarding the characteristics of individuals who enter the teaching profession were as follows:
    • Females formed greater proportions of new teachers than males.
    • Whites formed greater proportions of new teachers than minorities, although there is evidence that minority participation rose in the early 1990s.
    • College graduates with higher measured academic ability were less likely to enter teaching than were other college graduates. It is possible, however, that these differences were driven by the measured ability of elementary school teachers, who represent the majority of teachers.
    • A more tentative finding based on a small number of weaker studies is that an altruistic desire to serve society is one of the primary motivations for pursuing teaching.
  2. Several findings emerged with a strong degree of consistency in empirical studies of the characteristics of individuals who leave the teaching profession:
    • The highest turnover and attrition rates seen for teachers occurred in their first years of teaching and after many years of teaching when they were near retirement, thus producing a U-shaped pattern of attrition with respect to age or experience.
    • Minority teachers tended to have lower attrition rates than White teachers.
    • Teachers in the fields of science and mathematics were more likely to leave teaching than teachers in other fields.
    • Teachers with higher measured academic ability (as measured by test scores) were more likely to leave teaching.
    • Female teachers typically had higher attrition rates than male teachers.
  3. Regarding the external characteristics of schools and districts that are related to teacher recruitment and retention rates, the empirical literature provided the following fairly consistent findings:
    • Schools with higher proportions of minority, low-income, and low-performing students tended to have higher attrition rates.
    • In most studies, urban school districts had higher attrition rates than suburban and rural districts.
    • Teacher retention was generally found to be higher in public schools than in private schools.
  4. The following statements summarize the consistent research findings regarding compensation policies and their relationship to teacher recruitment and retention:
    • Higher salaries were associated with lower teacher attrition.
    • Teachers were responsive to salaries outside their districts and their profession.
    • In surveys of teachers, self-reported dissatisfaction with salary was associated with higher attrition and decreased commitment to teaching.
  5. Rigorous empirical studies of the impact of pre-service policies on teacher recruitment and retention were sparse. In general, few results emerged across studies, and the following findings were therefore not particularly robust:
    • Graduates of nontraditional and alternative teacher education programs appear to have higher rates of retention in teaching than national comparison groups and may differ from traditional recruits in their background characteristics.
    • There was tentative evidence that streamlined routes to credentialing provide more incentive to enter teaching than monetary rewards.
    • Pre-service testing requirements may adversely affect the entry of minority candidates into teaching.
  6. Findings from the research on in-service policies that affect teacher recruitment and retention were as follows:
    • Schools that provided mentoring and induction programs, particularly those related to collegial support, had lower rates of turnover among beginning teachers.
    • Schools that provided teachers with more autonomy and administrative support had lower levels of teacher attrition and migration.
    • A tentative finding was that accountability policies might lead to increased attrition in low-performing schools. The entry, mobility, and attrition patterns summarized

As the results of these studies show, creating a lower-paid, less secure profession, as some current corporate education reform policies would do, would worsen retention, to the detriment of students. Workplace conditions, classroom resources, and support are also critical, and may be lost without the ability to bargain for them.

Clearly, delivery of high-quality education at an affordable cost is a complex subject with many variables at play. The highly prescriptive and simplistic approaches to reform in SB5 and HB153, adopted without any broad expert consultation, are bound to produce sub-optimal results. Ohio should take advantage of its localized control and delivery of education, and its vast expert resources in education and pedagogy, to experiment in reform before committing to a simplistic, one-size-fits-all approach.

A second literature review on teacher attrition can be found here.

Making (Up) The Grade In Ohio

Every year, most schools (and districts) in Ohio get one of six grades: Emergency, Watch, Continuous Improvement, Effective, Excellent, and Excellent with Distinction. Schools that receive poor grades over a period of years face a cascade of increasingly severe sanctions. This means that these report card grades are serious business.

The method for determining grades is a seemingly arbitrary step-by-step process outlined on page eight of this guidebook. I won’t bore you with the details, but suffice it to say that a huge factor determining a school’s grade is whether it meets certain benchmarks on one of two measures: The aforementioned “performance index” and the percentage of state standards that it meets. Both of these are “absolute” performance measures – they focus on how well students score on state tests (specifically, how many meet proficiency and other benchmarks), not on whether or not their scores improve. And neither accounts for differences in student characteristics, such as learning disabilities and income.

As I have discussed before, there is a growing consensus in education policy that, to the degree that schools and teachers should be judged on the basis of test results, the focus should be on whether students are improving (i.e., growth), not how highly they score (i.e., absolute performance). The reasoning is simple: Upon entry into the schooling system, poor kids (and those with disabilities, non-native English speakers, etc.) tend to score lower by absolute standards, and since schools have no control over this, they should be judged by the effect that they have on students, not on which students they happen to receive. That’s why high-profile schools like KIPP are considered effective, even though their overall scores are much lower than those in affluent suburbs.
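
The difference between the two kinds of measure can be made concrete with a toy calculation. This is only an illustrative sketch: the two schools, their scores, and the proficiency cutoff of 400 are all hypothetical, and the functions below are simplified stand-ins, not Ohio's actual performance index or growth model.

```python
from statistics import mean

def absolute_performance(scores, cutoff):
    """Share of students whose current-year score meets the cutoff."""
    return mean(1.0 if s >= cutoff else 0.0 for s in scores)

def average_growth(prior_scores, current_scores):
    """Mean score change for the same students, listed in the same order."""
    return mean(c - p for p, c in zip(prior_scores, current_scores))

# Hypothetical data: School A's students start low, School B's start high.
PROFICIENCY_CUTOFF = 400  # assumed cutoff, not an actual state benchmark
school_a_prior, school_a_current = [340, 360, 380, 390], [375, 395, 410, 420]
school_b_prior, school_b_current = [420, 440, 450, 470], [422, 441, 455, 470]

a_abs = absolute_performance(school_a_current, PROFICIENCY_CUTOFF)
b_abs = absolute_performance(school_b_current, PROFICIENCY_CUTOFF)
a_growth = average_growth(school_a_prior, school_a_current)
b_growth = average_growth(school_b_prior, school_b_current)

print(f"School A: {a_abs:.0%} proficient, average growth {a_growth:+.1f}")
print(f"School B: {b_abs:.0%} proficient, average growth {b_growth:+.1f}")
```

In this made-up example, School B looks far better on the absolute measure (all of its students clear the cutoff, versus half of School A's), while School A's students gain far more points over the year. Which school "performs" better depends entirely on which measure is used.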

The strong relationship between district poverty and one of these absolute performance measures – the state’s performance index that Fordham’s Terry Ryan discussed – is clear in the graph below, which I presented in a previous post.


Perhaps the people who designed the Ohio system made a good-faith effort to achieve “balance” between the various components – a very difficult endeavor to be sure. But what they ended up with was a somewhat arbitrary formula that produces troubling, implausible results based on contradictory notions of how to measure performance. The grades are as much a function of income and other student characteristics as anything else, and they’re more likely to change than stay the same between years. So, while I can’t say what the perfect system would look like, I can say that Ohio’s report card grades, without substantial changes, should be taken with a shaker full of salt.

Unfortunately, that’s easy for me to say, but parents, teachers, administrators, and other stakeholders have no such luxury. These grades are used in the highest-stakes decisions.

