Why Test Scores CAN'T Evaluate Teachers

From the National Education Policy Center. The entire post is well worth a read; here's the synopsis:

The key element here that distinguishes Student Growth Percentiles from some of the other things that people have used in research is the use of percentiles. It's there in the title, so you'd expect it to have something to do with percentiles. What does that mean? It means that these measures are scale-free. They get away from psychometric scaling in a way that many researchers - not all, but many - say is important.

Now these researchers are not psychometricians, who aren't arguing against the scale. The psychometricians who create our tests, they create a scale, and they use scientific formulae and theories and models to come up with a scale. It's like on the SAT, you can get between 200 and 800. And the idea there is that the difference in the learning or achievement between a 200 and a 300 is the same as between a 700 and an 800.

There is no proof that that is true. There is no proof that that is true. There can't be any proof that is true. But, if you believe their model, then you would agree that that's a good estimate to make. There are a lot of people who argue... they don't trust those scales. And they'd rather use percentiles because it gets them away from the scale.

Let's state this another way so we're absolutely clear: there is, according to Jonah Rockoff, no proof that a gain on a state test like the NJASK from 150 to 160 represents the same amount of "growth" in learning as a gain from 250 to 260. If two students have the same numeric growth but start at different places, there is no proof that their "growth" is equivalent.

Now there's a corollary to this, and it's important: you also can't say that two students who have different numeric levels of "growth" are actually equivalent. I mean, if we don't know whether the same numerical gains at different points on the scale are really equivalent, how can we know whether one is actually "better" or "worse"? And if that's true, how can we possibly compare different numerical gains?
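To make the "scale-free" point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the six scores are made up (they are not NJASK data), the square-root rescaling simply stands in for "some other defensible scale," and `percentile_ranks` is a plain percentile-rank helper, not the method actually used to compute Student Growth Percentiles. It only shows that rankings survive a change of scale while "equal" gains do not.

```python
import math


def percentile_ranks(scores):
    """Percent of the *other* students scoring strictly below each student."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / (n - 1) for s in scores]


# Hypothetical scale scores for six students (illustrative only).
scores = [150, 160, 200, 250, 260, 290]

# Hypothetical alternative scale: any order-preserving rescaling would do.
rescaled = [math.sqrt(s) for s in scores]

# Percentile ranks depend only on the ordering, so they are unchanged.
same_ranks = percentile_ranks(scores) == percentile_ranks(rescaled)

# Two gains that look identical on the original scale...
gain_low = scores[1] - scores[0]    # 150 -> 160
gain_high = scores[4] - scores[3]   # 250 -> 260

# ...stop being identical once the scale itself is changed.
gain_low_rescaled = rescaled[1] - rescaled[0]
gain_high_rescaled = rescaled[4] - rescaled[3]

print("ranks unchanged:", same_ranks)
print("gains equal on original scale:", gain_low == gain_high)
print("gains equal on rescaled scale:", gain_low_rescaled == gain_high_rescaled)
```

Under these assumptions, whether a 10-point gain at the bottom "equals" a 10-point gain higher up depends entirely on which scale you choose to believe, while the percentile ranks do not move.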


Teacher Turnover Affects All Students' Achievement

In light of the Kasich education cuts, and the looming sequestration that will lead to large education cuts, this article appearing in Education Week should be borne in mind by lawmakers.

When teachers leave schools, overall morale appears to suffer enough that student achievement declines—both for those taught by the departed teachers and by students whose teachers stayed put, concludes a study recently presented at a conference held by the Center for Longitudinal Data in Education Research.

The impact of teacher turnover is one of the teacher-quality topics that's been hard for researchers to get their arms around. The phenomenon of high rates of teacher turnover has certainly been proven to occur in high-poverty schools more than low-poverty ones. The eminently logical assumption has been that such turnover harms student achievement.

But a couple years back, two researchers did an analysis that showed, counter-intuitively, it's actually the less-effective teachers, rather than the more-effective ones, who tend to leave schools with a high concentration of low-achieving, minority students. It raised the question of whether a degree of turnover might be beneficial, since it seemed to purge schools of underperforming teachers.

When reporting on that study, I played devil's advocate by pointing out that it didn't address the cultural impact of having a staff that's always in flux. The recently released CALDER paper suggests I may have been right in probing this question.

Written by the University of Michigan's Matthew Ronfeldt, Stanford University's Susanna Loeb, and the University of Virginia's Jim Wyckoff, the new paper basically picks up on the same question. Even if overall teacher effectiveness stays the same in a school with turnover, it's well documented that turnover hurts staff cohesion and the shared sense of community in schools, the scholars reasoned. Could that have an impact on student achievement, too?

To find out, they looked at a set of New York City test-score data from 4th and 5th graders over the course of eight years. The data were linked to teacher characteristics.
(All the usual caveats about limitations of test scores apply, of course.)

Among their findings:

• For each analysis, students taught by teachers in the same grade-level team in the same school did worse in years where turnover rates were higher, compared with years in which there was less teacher turnover.
• An increase in teacher turnover by 1 standard deviation corresponded with a decrease in math achievement of 2 percent of a standard deviation; students in grade levels with 100 percent turnover were especially affected, with lower test scores by anywhere from 6 percent to 10 percent of a standard deviation based on the content area.
• The effects were seen in both large and small schools, new and old ones.
• The negative effect of turnover on student achievement was larger in schools with more low-achieving and black students.

Read the whole piece here.

Are you an entertainer?

As we are seeing an explosion of technology, both in our personal lives and pushed into the classroom, studies like these are important and interesting.

There is a widespread belief among teachers that students’ constant use of digital technology is hampering their attention spans and ability to persevere in the face of challenging tasks, according to two surveys of teachers being released on Thursday.

The researchers note that their findings represent the subjective views of teachers and should not be seen as definitive proof that widespread use of computers, phones and video games affects students’ capability to focus.

Even so, the researchers who performed the studies, as well as scholars who study technology’s impact on behavior and the brain, say the studies are significant because of the vantage points of teachers, who spend hours a day observing students.
Teachers who were not involved in the surveys echoed their findings in interviews, saying they felt they had to work harder to capture and hold students’ attention.

“I’m an entertainer. I have to do a song and dance to capture their attention,” said Hope Molina-Porter, 37, an English teacher at Troy High School in Fullerton, Calif., who has taught for 14 years. She teaches accelerated students, but has noted a marked decline in the depth and analysis of their written work.

You can read the entire study from Common Sense Media, here, titled "Children, Teens, and Entertainment Media: The View from the Classroom."

Certainly provocative.

PolitiFact is mostly made up

PolitiFact Ohio, a "fact checking" operation run by the Cleveland Plain Dealer, decided to check out the following statement by Cleveland teachers:

"The (Jackson) plan (for reforming Cleveland schools) lacks any data or methods proven to raise student achievement."

PolitiFact goes through a number of cases, based upon assertions made by CEO Gordon:

"And while Gordon conceded "there is no empirical study that shows the portfolio strategy is the one strategy" he said there is some evidence that some of the approaches in the Jackson plan have worked to raise test scores."

PolitiFact looked at some "evidence", and so shall we.

For example, Gordon mentioned research that has been done by the Center on Reinventing Public Education. The non-profit group recently issued a report on a number of big city school districts trying reforms similar to those in Jackson’s plan.

The group’s report looked at Denver schools, where many teachers voluntarily opted for a merit pay system instead of the standard teaching contract. Known as the ProComp program, it ties teacher pay to education levels and offers bonus pay to teachers who work in the toughest schools and whose students score higher on tests.

Researchers at the University of Colorado found "significant and positive ProComp effects at both middle and high school for both math and reading, and the effects are larger at high school than middle school." The researchers cautioned, however, that it generally was the more effective teachers who opted into the program.

ProComp was funded by voters to the tune of $25 million in order to pay teachers more. Unless there's a provision in Frank Jackson's plan to ask voters for an additional $25 million on top of the $65 million deficit, we can see straight away that the Jackson plan and the Denver ProComp system are not at all similar or worthy of comparison.

But let us pretend Frank Jackson's plan does involve giving teachers up to almost $4,000 a year in bonuses. According to a recent study of the ProComp system, researchers found:

DPS has experienced significant student learning gains across grades and subjects, but it is not clear that this was the result of ProComp. There was not a consistent pattern across grade levels and subjects in the relationship between ProComp and observed achievement gains. In some cases, the gains appeared primarily among students with ProComp teachers, while in other cases it is Non-ProComp teachers who appeared to be more effective. Though puzzling, these findings are consistent with research on other well known interventions that include elements similar to ProComp.

Clearly there is no evidence, as the Cleveland teachers said, that this kind of compensation improves student performance. Gordon and PolitiFact are WRONG.

PolitiFact's next step was to look at the Colorado Innovation Schools Act:

Another approach tried in Colorado — a 2008 law called the Innovation Schools Act — gives school officials who opt into the program greater school autonomy and flexibility in operations and academic decisions. "The innovation schools are experiencing growth in test scores but many were exceeding state averages prior to being innovation schools," said a recent report from researchers who have studied the schools.

There are just 21 innovation schools - an incredibly small sample - but according to a recent report:

• Innovation schools did not tend to look drastically different than other schools.
• Innovation schools have experienced high rates of mobility among teachers and principals. Their teachers tend to be somewhat less experienced and are less likely to have master’s degrees than teachers in comparable schools.
• There are not yet clear trends to help us understand how Innovation will affect student achievement.

They sound an awful lot like most Cleveland charters, and like most Cleveland charters they rely upon less experienced, less qualified teachers, and are not producing better results than traditional schools. Gordon and PolitiFact are WRONG to look at Innovation Schools as evidence of successful reforms.

Next, PolitiFact uses this:

The Baltimore school district — after working hand in hand with the union — implemented a reworked teacher contract largely based on teacher evaluations and student test scores. That contract only went into effect last year so it’s too soon to say whether it has improved student test scores.

In their own words, there is no evidence this works to improve student achievement, exactly what the teachers in Cleveland claim. Why did PolitiFact even introduce this as evidence? Moving on.

The Jackson plan also calls for increased learning time through either longer school days or a longer school year, a hot topic among educational academics. Research on the subject is mixed — a fact Gordon acknowledged. "Well, no one factor in of itself is a magic bullet solution," he said. "You are going to find time studies where it did work and time studies where it didn’t work."

Now we're getting desperate. So how does PolitiFact rule on this mountain of evidence?

Ohio AFT union head Melissa Cropper said Mayor Frank Jackson’s sweeping plan to improve Cleveland schools "lacks any data or methods proven to raise student achievement" as she labeled the proposal an attack on teachers. For PolitiFact Ohio, a key part of that statement is "lacks any."

While the specific approach Jackson mapped out for Cleveland hasn’t been proven, it does clearly contain elements that researchers suggest may work — at least in some cases -- such as merit pay for teachers, greater flexibility for schools in how they go about their business and longer school days or school year.
On the Truth-O-Meter, the claim by the Ohio Federation of Teachers rates Mostly False.

Huh? "Elements"? "May work"? PolitiFact contorts its own piece to arrive at this ridiculously tortured conclusion. There is no research-based evidence that what is proposed in the "Cleveland Plan" will work (though we hope some of it does!). To then arrive at a conclusion that the teachers are "mostly wrong" is absurd. This should come as no surprise, as the Plain Dealer has been carrying the water for Frank Jackson and his SB5 plan on its opinion pages from the get-go - and that's a Fact, totally true.

Misconceptions and Realities about Teacher Evaluations

A letter signed by 88 educational researchers from 16 universities was recently sent to the Mayor of Chicago regarding his plans to implement a teacher evaluation system. Because the Chicago plan is similar in some respects to Ohio's, we thought we would reprint the letter here.

In what follows, we draw on research to describe three significant concerns with this plan.

Concern #1: CPS is not ready to implement a teacher-evaluation system that is based on significant use of “student growth.” For Type I or Type II assessments, CPS must identify the assessments to be used, decide how to measure student growth on those assessments, and translate student growth into teacher-evaluation ratings. They must determine how certain student characteristics such as placement in special education, limited English-language proficiency, and residence in low-income households will be taken into consideration. They have to make sure that the necessary technology is available and usable, guarantee that they can correctly match teachers to their actual students, and determine that the tests are aligned to the new Common Core State Standards (CCSS).

In addition, teachers, principals, and other school administrators have to be trained on the use of student assessments for teacher evaluation. This training is on top of training already planned about CCSS and the Charlotte Danielson Framework for Teaching, used for the “teacher practice” part of evaluation.

For most teachers, a Type I or II assessment does not exist for their subject or grade level, so most teachers will need a Type III assessment. While work is being done nationally to develop what are commonly called assessments for “non-tested” subjects, this work is in its infancy. CPS must identify at least one Type III assessment for every grade and every subject, determine how student growth will be measured on these assessments, and translate the student growth from these different assessments into teacher-evaluation ratings in an equitable manner.

If CPS insists on implementing a teacher-evaluation system that incorporates student growth in September 2012, we can expect to see a widely flawed system that overwhelms principals and teachers and causes students to suffer.

Concern #2: Educational research and researchers strongly caution against teacher-evaluation approaches that use Value-Added Models (VAMs).

Chicago already uses a VAM statistical model to determine which schools are put on probation, closed, or turned around. For the new teacher-evaluation system, student growth on Type I or Type II assessments will be measured with VAMs or similar models. Yet, ten prominent researchers of assessment, teaching, and learning recently wrote an open letter that included some of the following concerns about using student test scores to evaluate educators[1]:

a. Value-added models (VAMs) of teacher effectiveness do not produce stable ratings of teachers. For example, different statistical models (all based on reasonable assumptions) can yield different effectiveness scores. [2] Researchers have found that how a teacher is rated changes from class to class, from year to year, and even from test to test. [3]

b. There is no evidence that evaluation systems that incorporate student test scores produce gains in student achievement. In order to determine if there is a relationship, researchers recommend small-scale pilot testing of such systems. Student test scores have not been found to be a strong predictor of the quality of teaching as measured by other instruments or approaches. [4]

c. Assessments designed to evaluate student learning are not necessarily valid for measuring teacher effectiveness or student learning growth. [5] Using them to measure the latter is akin to using a meter stick to weigh a person: you might be able to develop a formula that links height and weight, but there will be plenty of error in your calculations.

Concern #3: Students will be adversely affected by the implementation of this new teacher-evaluation system.

When a teacher’s livelihood is directly impacted by his or her students’ scores on an end-of-year examination, test scores take front and center. The nurturing relationship between teacher and student changes for the worse, including in the following ways:

a. With a focus on end-of-year testing, there inevitably will be a narrowing of the curriculum as teachers focus more on test preparation and skill-and-drill teaching. [6] Enrichment activities in the arts, music, civics, and other non-tested areas will diminish.

b. Teachers will subtly but surely be incentivized to avoid students with health issues, students with disabilities, students who are English Language Learners, or students suffering from emotional issues. Research has shown that no model yet developed can adequately account for all of these ongoing factors. [7]

c. The dynamic between students and teacher will change. Instead of “teacher and student versus the exam,” it will be “teacher versus students’ performance on the exam.”

d. Collaboration among teachers will be replaced by competition. With a “value-added” system, a 5th grade teacher has little incentive to make sure that his or her incoming students score well on the 4th grade exams, because incoming students with high scores would make his or her job more challenging.

e. When competition replaces collaboration, every student loses.

You can read the whole letter below.

Misconceptions and Realities about Teacher and Principal Evaluation