
Ohio Value-added measures poverty

Congratulations, Ohio corporate education reformers: you have discovered yet another way to measure poverty. Unfortunately, you seem to believe this is also a good way to evaluate teachers.

Value-added was supposed to be the great equalizer -- a measure of schools that would finally judge fairly how much poor students are learning compared with their wealthier peers.

Meant to gauge whether students learn as much as expected in a given year, value-added will become a key part of rating individual teachers from rich and poor districts alike next school year.

But a Plain Dealer/StateImpact Ohio analysis raises questions about how much of an equalizer it truly is, even as the state ramps up its use.

The 2011-12 value-added results show that districts, schools and teachers with large numbers of poor students tend to have lower value-added results than those that serve more-affluent ones.

Of course, there are going to be defenders of the high-stakes sweepstakes:

"Value-added is not influenced by socioeconomic status," said Matt Cohen, the chief research officer at the Ohio Department of Education. "That much is pretty clear."

That is the same Matt Cohen who admitted he is no expert and has no idea how value-added is calculated:

The department’s top research official, Matt Cohen, acknowledged that he can’t explain the details of exactly how Ohio’s value-added model works. He said that’s not a problem.

“It’s not important for me to be able to be the expert,” he said. “I rely on the expertise of people who have been involved in the field.” 

Perhaps if Mr. Cohen became more familiar with the science and the data, he would realize that:

  • Value-added scores were 2½ times higher on average for districts where the median family income is above $35,000 than for districts with income below that amount.
  • For low-poverty school districts, two-thirds had positive value-added scores -- scores indicating students made more than a year's worth of progress.
  • For high-poverty school districts, two-thirds had negative value-added scores -- scores indicating that students made less than a year's progress.

  • Almost 40 percent of low-poverty schools scored "Above" the state's value-added target, compared with 20 percent of high-poverty schools.
  • At the same time, 25 percent of high-poverty schools scored "Below" state value-added targets while low-poverty schools were half as likely to score "Below."

  • Students in high-poverty schools are more likely to have teachers rated "Least Effective" -- the lowest state rating -- than "Most Effective" -- the highest of five ratings. The three ratings in the middle are treated by the state as essentially average performance.

Is there really any doubt what is truly being measured here? Ohio's secret value-added formula is good at measuring poverty, not teacher effectiveness.
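To see how little analysis it takes to surface that kind of pattern, here is a minimal sketch of the sort of district-level comparison described in the bullets above. It is hypothetical: the file name ohio_districts.csv and the columns median_income and value_added are invented stand-ins, not the Plain Dealer/StateImpact data or methodology.

```python
# Hypothetical sketch: one row per district, with an invented
# "median_income" (median family income, dollars) and "value_added"
# (district composite score, 0 = one year's expected growth).
import pandas as pd

districts = pd.read_csv("ohio_districts.csv")  # invented file name

# Split districts at the $35,000 median-family-income line cited above.
poorer = districts[districts["median_income"] < 35_000]
wealthier = districts[districts["median_income"] >= 35_000]

print("Mean value-added, wealthier districts:", wealthier["value_added"].mean())
print("Mean value-added, poorer districts:", poorer["value_added"].mean())

# Share of each group beating the one-year-of-growth mark (score > 0).
print("Positive share, wealthier:", (wealthier["value_added"] > 0).mean())
print("Positive share, poorer:", (poorer["value_added"] > 0).mean())

# One number that should be near zero if value-added were really
# independent of socioeconomic status:
print("Income/value-added correlation:",
      districts["median_income"].corr(districts["value_added"]))
```

If Mr. Cohen's claim were right, the two group means would be close and the correlation near zero; the figures reported above point the other way.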

We predict that districts, administrators, and those connected to the development of value-added measures will be deluged with lawsuits once high-stakes decisions are attached to the misguided application of these diagnostic scores.

OEA Response to PD and NPR Teacher shaming

Here's the statement from the Ohio Education Association, which represents over 121,000 educators:

Responding to a series of newspaper, web and radio stories on value-added scores of individual Ohio teachers, Patricia Frost-Brooks, President of the Ohio Education Association, criticized the fairness of the stories and the wisdom of using value-added scores as such a prominent index of teacher success:

"The Ohio Education Association was not contacted for comment on the Plain Dealer/StateImpact Ohio stories, despite our expertise, which would have provided desperately needed context and perspective. Reporters and editors admitted this value-added data was 'flawed,' but they chose surprise and impact over fairness, balance and accuracy," Frost-Brooks said.

"We are all accountable for student success – teachers, support professionals, parents, students and elected officials. And the Ohio Education Association is committed to fair teacher evaluation systems that include student performance, among other multiple measures. But listing teachers as effective or ineffective based on narrow tests not designed to be used for this purpose is a disservice to everyone.

"Value-added ratings can never paint a complete or objective picture of an individual teacher’s work or performance. Trained educators can use a student’s value-added data, along with other student data, to improve student instruction. But the stories promote a simplistic and inaccurate view of value-added as a valid basis for high-stakes decisions on schools, teachers and students."

It is very questionable that reporters would not contact the state's largest teachers association in crafting their story.

Battelle Blasts Papers' decision

From our mailbag, Battelle for Kids condemns the Plain Dealer and NPR's decision to publish teachers' value-added scores, calling it "the poster child for name, blame, and shame and the antithesis of our approach to using value-added data."

To: All SOAR districts
From: Jim Mahoney and Bobby Moore
Date: June 17, 2013

Yesterday, a three-part series on value-added was launched by The Cleveland Plain Dealer and State Impact Ohio. It includes both articles and radio segments specific to value-added analysis as a measure of teacher effectiveness. Highlighted in the articles is a link to a database of teacher ratings, hosted by The Plain Dealer and the State Impact Ohio partnership.

Currently, Ohio laws governing the release of teacher records would apply to teacher value-added results. Thus, teacher level value-added information is subject to public records requests through ODE. Through The Plain Dealer and State Impact Ohio database, the general public can now access a teacher's overall composite rating derived from two years of his/her results in grades 4-8 math and reading. These data reflect information for less than 1/3 of the math and reading, grades 4-8 teachers in Ohio.

Battelle for Kids was not aware these ratings would be published in this way, at this time.

While Battelle for Kids does support the use of value-added information for school improvement and as one of several components of a multi-measures evaluation system, value-added should NOT be used in isolation to draw conclusions about a teacher's effectiveness.

Multiple data points over time from multiple perspectives are crucial because teaching and learning and the evaluation of teaching and learning are complex.

Therefore, we are NOT supportive of these ratings being publically available and discourage promoting the use of this public database.

Talking points and articles, to support your local conversations, are available on the Ohio Student Progress Portal.

http://cts.vresp.com/c/?BattelleForKids/f43a0e1b46/fb8aa9ca4e/313346eb88/sflang=en

Obviously, this is the poster child for name, blame, and shame and the antithesis of our approach to using value-added data.

Please call if you have any questions.

Thank you for all you do for Ohio's students!

-Jim and Bobby

Shame on the PD and NPR

When the Cleveland Plain Dealer and NPR decided to publish the names of 4,200 Ohio teachers and their value-added grades, their reasoning was specious and self-serving. Most of all, it is damaging to the teaching profession in Ohio.

Despite pointing out all the flaws, caveats, and controversies with the use of value-added as a means to evaluate teachers, both publications decided to go ahead and shame these 4,200 teachers anyway. The publication of teachers' names and scores isn't new. It was first done by the LA Times, and was a factor in the suicide of one teacher. The LA Times' findings and analysis were then discredited:

The research on which the Los Angeles Times relied for its August 2010 teacher effectiveness reporting was demonstrably inadequate to support the published rankings. Using the same L.A. Unified School District data and the same methods as the Times, this study probes deeper and finds the earlier research to have serious weaknesses.

DUE DILIGENCE AND THE EVALUATION OF TEACHERS by National Education Policy Center

The Plain Dealer analysis is weaker than the LA Times', relying on just two years' worth of data rather than seven. In fact, the Plain Dealer and NPR stated they published only 4,200 teachers' scores, rather than the 12,000 scores they had data for, because most teachers had only one year's worth of data. That is a serious error, as value-added is known to be highly unreliable and subject to massive variance; the sketch below illustrates why.
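A back-of-the-envelope simulation shows why the number of years matters so much. This is a minimal sketch with invented numbers (a stable teacher effect plus large year-to-year classroom noise), not SAS's actual model or the Plain Dealer's method:

```python
# Minimal Monte Carlo sketch of value-added reliability. All numbers
# are invented for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 10_000
true_effect = rng.normal(0, 1, n_teachers)  # stable teacher quality
noise_sd = 2.0                              # year-to-year classroom noise

def estimate(years):
    """Average `years` noisy annual scores for every teacher."""
    scores = true_effect[:, None] + rng.normal(0, noise_sd, (n_teachers, years))
    return scores.mean(axis=1)

for years in (1, 2, 3, 7):
    est = estimate(years)
    # Correlation of the estimate with the true effect: a rough gauge
    # of how much of the published rating is signal rather than noise.
    print(years, "year(s):", round(float(np.corrcoef(est, true_effect)[0, 1]), 2))
```

With these illustrative numbers, a one-year rating correlates only weakly with the stable teacher effect, and a two-year average remains far noisier than a seven-year one, which is the heart of the objection to publishing ratings built on so little data.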

Beyond the questionable statistical analysis, the publication of teachers' names and value-added scores has been criticized by a great number of people, including corporate education reformer Bill Gates, in a New York Times op-ed titled "Shame Is Not the Solution":

LAST week, the New York State Court of Appeals ruled that teachers’ individual performance assessments could be made public. I have no opinion on the ruling as a matter of law, but as a harbinger of education policy in the United States, it is a big mistake.

I am a strong proponent of measuring teachers’ effectiveness, and my foundation works with many schools to help make sure that such evaluations improve the overall quality of teaching. But publicly ranking teachers by name will not help them get better at their jobs or improve student learning. On the contrary, it will make it a lot harder to implement teacher evaluation systems that work.

Gates isn't the only high-profile corporate education reformer critical of such shaming. Wendy Kopp, CEO of Teach for America, has also spoken out against the practice:

Kopp is not shy about saying what she'd do differently as New York City schools chancellor. While the Bloomberg administration is fighting the United Federation of Teachers in court for the right to release to the news media individual teachers' "value added" ratings—an estimate of how effective a teacher is at improving his or her students' standardized test scores—Kopp says she finds the idea "baffling" and believes doing so would undermine trust among teachers and between teachers and administrators.

"The principals of very high performing schools would all say their No. 1 strategy is to build extraordinary teams," Kopp said. "I can't imagine it's a good organizational strategy to go publish the names of teachers and one data point about whether they are effective or not in the newspaper."

Indeed, if the editors of the Plain Dealer and NPR had read their own reporting, they would have realized the public release of this information was unsound, unfair and damaging. Let's look at the warning signs in their own reporting:

...scores can vary from year to year.

Yet they relied upon only one year's worth of data for much of their analysis, and just two years for the teachers whose names they published.

...decided it was more important to provide information — even if flawed.

How can it be useful to the layperson to be provided with flawed information? Why would a newspaper knowingly publish flawed information?

...these scores are only a part of the criteria necessary for full and accurate evaluation of an individual teacher.

And yet they published 4,200 teachers' value-added scores, when value-added at best makes up only 35% of a teacher's evaluation. Laypeople will not understand that these scores are only a partial measurement of a teacher's effectiveness, and a poor one at that.

...There are a lot of questions still about the particular formula Ohio uses.

Indeed, so many questions that one would best be advised to wait until those questions are answered before publicly shaming teachers who were part of a pilot program being used to answer those questions.

...variables beyond a teacher’s control need to be considered in arriving at a fair and accurate formula.

Yet none of these reporters considered any of these factors in publishing teachers' names, and readers will wholly miss that necessary context.

...The company that calculates value-added for Ohio says scores are most reliable with three years of data.

Again, the data is unreliable, especially with less than three years' worth of data, yet the Plain Dealer and NPR decided they should shame teachers using just two years' worth.

...Ohio’s value-added ratings do not account for the socioeconomic backgrounds of students, as they do in some other states.

How many "ineffective" teachers are really just working in depressed socioeconomic classrooms? The reporters seem not to care and publish the names anyway.

...Value-added scores are not a teacher’s full rating.

Nowhere in the publication of these names are the teachers' full ratings indicated. This again leaves laypeople and site visitors to think these flawed value-added scores are the final reflection of a teacher's quality.

...ratings are still something of an experiment.

How absurd is the decision to publish now seeming? Shaming people on the basis of the results of an experiment! By their very nature, experiments can demonstrate that something is wrong, not right.

...The details of how the scores are calculated aren’t public.

We don't even know if the value-added scores are correct and accurate, because the formula is secret. How can it be fair for the results of a secret formula to be made public? Did that not raise any alarm bells for the Plain Dealer and NPR?

...The department’s top research official, Matt Cohen, acknowledged that he can’t explain the details of exactly how Ohio’s value-added model works.

But somehow NPR listeners and Cleveland Plain Dealer readers are supposed to understand the complexities, and read the necessary context into the publication of individual teacher scores?

...StateImpact/Plain Dealer analysis of initial state data suggests.

"Initial", "Suggests". They have decided to shame teachers without properly vetting the data and their own analysis - exactly the same problem the LA Times ran into that we highlighted at the top of this article.

It doesn't take a lot of "analysis" to understand that a failing newspaper needed controversy and eyeballs, and that their decision to shame teachers was made in their own economic interests, not the public good. In the end, the real shame falls not on teachers, who are working hard every day, often in difficult situations made worse by draconian budget cuts, endless political meddling, and student poverty - but on the editors of these two publications for putting their own narrow self-interest above that of Ohio's children.

It's a disgrace for which they ought to make 4,200 apologies.

Value-added: How Ohio is destroying a profession

We ended the week last week with a post titled "The 'fun' begins soon", which took a look at the imminent changes to education policy in Ohio. We planned on detailing each of these issues over the next few weeks.

Little did we know that the 'fun' would begin that weekend. It came in the manner of the Cleveland Plain Dealer and NPR publishing a story on the changing landscape of teacher evaluations titled "Grading the Teachers: How Ohio is Measuring Teacher Quality by the Numbers".

It's a solid, long piece, worth the time taken to read it. It covers some, though not all, of the problems of using value-added measurements to evaluate teachers:

Those ratings are still something of an experiment. Only reading and math teachers in grades four to eight get value-added ratings now. But the state is exploring how to expand value-added to other grades and subjects.

Among some teachers, there’s confusion about how these measures are calculated and what they mean.

“We just know they have to do better than they did last year,” Beachwood fourth-grade teacher Alesha Trudell said.

Some of the confusion may be due to a lack of transparency around the value-added model.

The details of how the scores are calculated aren’t public. The Ohio Education Department will pay a North Carolina-based company, SAS Institute Inc., $2.3 million this year to do value-added calculations for teachers and schools. The company has released some information on its value-added model but declined to release key details about how Ohio teachers’ value-added scores are calculated.

The Education Department doesn’t have a copy of the full model and data rules either.

The department’s top research official, Matt Cohen, acknowledged that he can’t explain the details of exactly how Ohio’s value-added model works. He said that’s not a problem.

Evaluating a teacher on a secret formula isn't a practice that can be sustained, supported or defended. The article further details a common theme we hear over and over again:

But many teachers believe Ohio’s value-added model is essentially unfair. They say it doesn’t account for forces that are out of their control. They also echo a common complaint about standardized tests: that too much is riding on these exams.

“It’s hard for me to think that my evaluation and possibly some day my pay could be in a 13-year-old’s hands who might be falling asleep during the test or might have other things on their mind,” said Zielke, the Columbus middle school teacher.

The article also analyzes several thousand value-added scores, and that analysis demonstrates what we have long reported: value-added is a poor indicator of teacher quality, with too many external factors affecting the score.

A StateImpact/Plain Dealer analysis of initial state data suggests that teachers with high value-added ratings are more likely to work in schools with fewer poor students: A top-rated teacher is almost twice as likely to work at a school where most students are not from low-income families as in a school where most students are from low-income families.
[…]
Teachers say they’ve seen their value-added scores drop when they’ve had larger classes. Or classes with more students who have special needs. Or more students who are struggling to read.

Teachers who switch from one grade to another are more likely to see their value-added ratings change than teachers who teach the same grade year after year, the StateImpact/Plain Dealer analysis shows. But their ratings went down at about the same rate as teachers who taught the same grade level from one year to the next and saw their ratings change.

What are we measuring here? Surely not teacher quality, but rather socioeconomic factors and budget conditions of the schools and their students.

Teachers are intelligent people, and they are going to adapt to this knowledge in lots of unfortunate ways. It will become progressively harder for districts with poor students to recruit and retain the best teachers. But perhaps the most pernicious effect is captured at the end of the article:

Stephon says the idea of Plecnik being an ineffective teacher is “outrageous.”

But Plecnik is through. She’s quitting her job at the end of this school year to go back to school and train to be a counselor — in the community, not in schools.

Plecnik was already frustrated by the focus on testing, mandatory meetings and piles of paperwork. She developed medical problems from the stress of her job, she said. But receiving the news that despite her hard work and the praise of her students and peers the state thought she was Least Effective pushed her out the door.

“That’s when I said I can’t do it anymore,” she said. “For my own sanity, I had to leave.”

The Cleveland Plain Dealer and NPR then decided to add to this stress by publishing individual teachers' value-added scores - a matter we will address in our next post.

The Foolish Endeavor of Rating Ed Schools by Graduates’ Value-Added

Via School Finance 101.

Knowing that I’ve been writing a fair amount about various methods for attributing student achievement to their teachers, several colleagues forwarded to me the recently released standards of the Council For the Accreditation of Educator Preparation, or CAEP. Specifically, several colleagues pointed me toward Standard 4.1 Impact on Student Learning:

4.1.The provider documents, using value-added measures where available, other state-supported P-12 impact measures, and any other measures constructed by the provider, that program completers contribute to an expected level of P-12 student growth.

http://caepnet.org/commission/standards/standard4/

Now, it’s one thing when relatively under-informed pundits, think tankers, politicians and their policy advisors pitch a misguided use of statistical information for immediate policy adoption. It’s yet another when professional organizations are complicit in this misguided use. There’s just no excuse for that! (political pressure, public polling data, or otherwise)

The problems associated with attempting to derive any reasonable conclusions about teacher preparation program quality based on value-added or student growth data (of the students they teach in their first assignments) are insurmountable from a research perspective.

Worse, the perverse incentives likely induced by such a policy are far more likely to do real harm than any good, when it comes to the distribution of teacher and teaching quality across school settings within states.

First and foremost, the idea that we can draw this simple line below between preparation and practice contradicts nearly every reality of modern day teacher credentialing and progress into and through the profession:

one teacher prep institution –> one teacher –> one job in one school –> one representative group of students

The modern day teacher collects multiple credentials from multiple institutions, may switch jobs a handful of times early in his/her career and may serve a very specific type of student, unlike those taught by either peers from the same credentialing program or those from other credentialing programs. This model also relies heavily on minimal to no migration of teachers across state borders (well, either little or none, or a ton of it, so that a state would have a large enough share of teachers from specific out of state institutions to compare). I discuss these issues in earlier posts.

Setting aside that none of the oversimplified assumptions of the linear diagram above hold (a lot to ignore!), let’s probe the more geeky technical issues of trying to use VAM to evaluate ed school effectiveness.

There exist a handful of recent studies which attempt to tease out certification program effects on graduates' students' outcomes, most of which encounter the same problems. Here's a look at one of the better studies on this topic:

  • Mihaly, K., McCaffrey, D. F., Sass, T. R., & Lockwood, J. R. (2012). Where You Come From or Where You Go?

Specifically, this study tries to tease out the problem that arises when graduates of credentialing programs don’t sort evenly across a state. In other words, a problem that ALWAYS occurs in reality!

Researchy language tends to downplay these problems by phrasing them only in technical terms, always assuming there is some way to overcome them with a statistical tweak or two. Sometimes there just isn't, and this is one of those times!
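To make the sorting problem concrete, here is a minimal simulation, using invented numbers rather than anything from the Mihaly et al. paper: two programs that train equally good teachers, whose graduates sort into very different schools.

```python
# Minimal sketch of the sorting problem: identical programs, unequal
# placements. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000  # graduates per program

# Both programs produce teachers of identical average quality.
quality_a = rng.normal(0, 1, n)
quality_b = rng.normal(0, 1, n)

# Program A's graduates land mostly in high-poverty schools, whose
# context effect the growth measure fails to strip out.
context_a = rng.normal(-0.5, 1, n)
context_b = rng.normal(+0.5, 1, n)

# Measured "growth" mixes teacher quality with school context.
growth_a = quality_a + context_a
growth_b = quality_b + context_b

print("Program A apparent effect:", round(growth_a.mean(), 2))  # about -0.5
print("Program B apparent effect:", round(growth_b.mean(), 2))  # about +0.5
# Two equally good programs, two very different "ratings": the gap is
# entirely an artifact of where the graduates ended up teaching.
```

No within-state statistical tweak can recover the true (identical) program effects here, because program and placement are confounded by construction - which is exactly the situation when graduates do not sort evenly across a state.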

[readon2 url="http://schoolfinance101.wordpress.com/2013/02/25/revisiting-the-foolish-endeavor-of-rating-ed-schools-by-graduates-value-added/"]Continue reading...[/readon2]