Lawsuit filed over unfair teacher evaluations

The Washington Post is reporting on a lawsuit filed by Florida teachers that could shake the foundations of teacher evaluation systems not just in Florida, but across the country, including here in Ohio.

A group of teachers and their unions filed a lawsuit on Tuesday against Florida officials that challenges the state’s educator evaluation system, under which many teachers are evaluated on the standardized test scores of students they do not teach.

The seven teachers who filed the lawsuit include Kim Cook, who, as this post explains, was evaluated at Irby Elementary, a K-2 school where she works and was named Teacher of the Year last December. But 40 percent of that evaluation was based on test scores of students at Alachua Elementary, a school into which Irby feeds, whom she never taught.

Kim Cook's story is very unnerving

Here’s the crazy story of Kim Cook, a teacher at Irby Elementary, a K-2 school which feeds into Alachua Elementary, for grades 3-5, just down the road in Alachua, Fla. She was recently chosen by the teachers at her school as their Teacher of the Year.

Her plight dates back to last spring, when the Florida Legislature passed Senate Bill 736, which mandates that 40 percent of a teacher’s evaluation must be based on student scores on the state’s standardized tests, a method known as the value-added model, or VAM. It is essentially a formula that supposedly tells how much “value” a teacher has added to a student’s test score. Assessment experts say it is a terrible way to evaluate teachers, but it has still been adopted by many states with the support of the Obama administration.

Since Cook’s school only goes through second grade, her school district is using the FCAT scores from the third graders at Alachua Elementary School to determine the VAM score for every teacher at her school.

Alachua Elementary School did not do well in 2011-12 evaluations that just came out; it received a D. Under the VAM model, the state awarded that school — and Cook’s school, by default — 10 points out of 100 for their D.

In this school district, there are three components to teacher evaluations:
1. A lesson study worth 20 percent. In the lesson study, small groups of teachers work together to create an exemplary lesson, observe one of the teachers implement it, critique the teacher’s performance and discuss improvement.
2. Principal appraisal worth 40 percent of overall score.
3. VAM data (scores from the standardized Florida Comprehensive Assessment Test for elementary schools) worth 40 percent of the overall score.

Cook received full points on her lesson study: 100 x 0.20 (20%) = 20 points
Cook received an 88/100 from her former principal: 88 x 0.40 (40%) = 35.2 points
VAM data, points awarded by the state for the FCAT scores at Alachua Elementary School: 10 x 0.40 (40%) = 4 points
Total points she received: 59.2 (Unsatisfactory)
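To make the arithmetic explicit, here's a minimal sketch of the weighted-score calculation, using the component weights and Cook's scores given above (the component labels are ours):

```python
# Weighted evaluation score in Cook's district: three components,
# each scored 0-100, weighted 20/40/40 as described above.
WEIGHTS = {"lesson_study": 0.20, "principal_appraisal": 0.40, "vam": 0.40}

def evaluation_score(scores):
    """Return the weighted total (out of 100) for one teacher."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)

cook = {"lesson_study": 100, "principal_appraisal": 88, "vam": 10}
print(evaluation_score(cook))  # 59.2, rated "Unsatisfactory"
```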

Here's a video of Kim speaking on this issue

We imagine this will be the first, but not the last, legal action against the many provisions corporate education reformers are trying to cram into teacher evaluations.

Improving the Budget Bill Part I

HB 59, the Governor's budget bill, can be significantly improved during the legislative process. We're going to detail some of the ways improvements can be made.

Improvements can start with correcting a major policy flaw inserted into HB 555 at the last minute. HB 555 radically changed the method of calculating evaluations for about a third of Ohio's teachers. If a teacher's schedule is composed only of courses or subjects for which the value-added progress dimension is applicable, then only their value-added score can now be used as the 50% of their evaluation that is based on student growth. Gone is the ability to use multiple measures of student growth, such as Student Learning Objectives (SLOs).
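Here is a sketch of how the rule operates as described above; the function and field names are ours and the measure list is illustrative:

```python
# HB 555's rule: if every course on a teacher's schedule is covered by
# the value-added progress dimension, value-added alone must supply the
# student-growth half of the evaluation; otherwise multiple measures
# such as SLOs remain available.
def growth_measures(schedule):
    if all(course["value_added_applicable"] for course in schedule):
        return ["value-added"]       # sole permitted growth measure
    return ["value-added", "SLOs"]   # multiple measures allowed

schedule = [{"name": "Grade 5 Math", "value_added_applicable": True},
            {"name": "Grade 5 Reading", "value_added_applicable": True}]
print(growth_measures(schedule))  # ['value-added']
```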

We therefore suggest the legislature correct this wrong-headed policy by repealing this provision of HB 555.

Furthermore, greater evaluation fairness could be achieved by lowering the number of absences a student is allowed before their test scores can be excluded from a teacher's value-add score. Currently a student needs to be absent 60 times - or 1/3 of a school year. This is an absurd amount of schooling to miss and still have that student's score count towards the evaluation of his or her teacher. This absence exclusion should be lowered to a more reasonable 15 absences.

Value-add should not be used to punish teachers on evaluations; instead, it should be just one component of a multiple-measure framework and a tool to help teachers improve student learning. HB 555 moved us much further away from that goal.

Correlation? What correlation?

Dublin teacher Kevin Griffin brings to our attention this graph, which he describes as follows:

The chart plots the Value-Added scores of teachers who teach the same subject to two different grade levels in the same school year (e.g., Ms. Smith teaches 7th and 8th grade Math, and Mr. Richards teaches 4th and 5th grade Reading). The X-axis represents the teacher's VA score for one grade level and the Y-axis represents the VA score for the other grade level taught.

If the theory behind evaluating teachers based on value-added is valid then a “great” 7th grade math teacher should also be a “great” 8th grade math teacher (upper right corner) and a “bad” 7th grade math teacher should also be a “bad” 8th grade math teacher (lower left corner). There should, in theory, be a straight line (or at least close) showing a direct correlation between 7th grade VA scores and 8th grade VA scores since those students, despite being a grade apart, have the same teacher.

Here's the graph

Looks more like a random number generator to us. Would you like your career to hinge on a random number generator?
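The check Griffin's chart invites is easy to run yourself. Here is a sketch of measuring the agreement between the two grade-level scores; the paired scores below are invented for illustration:

```python
# Pair each teacher's VA score in one grade with the same teacher's
# score in the other grade, then compute Pearson's r.
from statistics import correlation  # Python 3.10+

grade7 = [1.0, -1.0, 0.5, -0.5, 2.0, -2.0]  # hypothetical VA scores
grade8 = [0.3, 0.4, -1.2, 1.1, -0.2, -0.4]  # same teachers, other grade

# A stable teacher effect would push r toward 1; a cloud of points
# like the one in the chart yields r near 0 (here, -0.15).
print(round(correlation(grade7, grade8), 2))
```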

Gates Foundation Wastes More Money Pushing VAM

Makes it hard to trust the corporate ed reformers when they goose their stats as badly as this.

Any attempt to evaluate teachers that is spoken of repeatedly as being "scientific" is naturally going to provoke rebuttals that verge on technical geek-speak. The MET Project's "Ensuring Fair and Reliable Measures of Effective Teaching" brief invites just such a rebuttal. MET was funded by the Bill & Melinda Gates Foundation.

At the center of the brief's claims are a couple of figures (“scatter diagrams” in statistical lingo) that show remarkable agreement in VAM scores for teachers in Language Arts and Math for two consecutive years. The dots form virtual straight lines. A teacher with a high VAM score one year can be relied on to have an equally high VAM score the next, so Figure 2 seems to say.

Not so. The scatter diagrams are not dots of teachers' VAM scores but of averages of groups of VAM scores. For some unexplained reason, the statisticians who analyzed the data for the MET Project report divided the 3,000 teachers into 20 groups of about 150 teachers each and plotted the average VAM scores for each group. Why?

And whatever the reason might be, why would one do such a thing when it has been known for more than 60 years now that correlating averages of groups grossly overstates the strength of the relationship between two variables? W.S. Robinson in 1950 named this the "ecological correlation fallacy." Please look it up in Wikipedia. The fallacy was used decades ago to argue that African-Americans were illiterate because the correlation of %-African-American and %-illiterate was extremely high when measured at the level of the 50 states. In truth, at the level of persons, the correlation is very much lower; we’re talking about differences as great as .90 for aggregates vs .20 for persons.

That the average of VAM scores for 150 teachers agrees with next year's VAM score average for the same 150 teachers gives us no confidence that an individual teacher's VAM score is reliable across years. In fact, such scores are not, as several studies have repeatedly shown.
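You can watch the fallacy inflate a correlation with a few lines of simulation. This sketch is ours, not MET's actual analysis, and the choice to group teachers by their year-1 score is an illustrative assumption:

```python
# Aggregation effect: year-to-year scores that correlate weakly for
# individual teachers, but whose 20 group averages (150 teachers each,
# as in the MET figures) correlate almost perfectly. All simulated.
import random
from statistics import correlation, mean  # correlation: Python 3.10+

random.seed(0)
N, GROUPS = 3000, 20

true_effect = [random.gauss(0, 1) for _ in range(N)]
# Noisy yearly estimates: measurement error swamps the stable signal,
# giving an individual-level correlation near 1/(1+4) = 0.2.
year1 = [t + random.gauss(0, 2) for t in true_effect]
year2 = [t + random.gauss(0, 2) for t in true_effect]

print(round(correlation(year1, year2), 2))  # ~0.2 for persons

# Sort by the year-1 score, bin into 20 groups, average each group.
paired = sorted(zip(year1, year2))
size = N // GROUPS
g1 = [mean(x for x, _ in paired[i:i + size]) for i in range(0, N, size)]
g2 = [mean(y for _, y in paired[i:i + size]) for i in range(0, N, size)]

print(round(correlation(g1, g2), 2))  # ~0.9+ for the group averages
```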

[readon2 url="http://ed2worlds.blogspot.com/2013/01/gates-foundation-wastes-more-money.html"]Continue reading...[/readon2]

HB555 Analysis

The Ohio House of Representatives approved HB 555 without amendments in a party-line vote, 58-27. The bill now heads to the Senate.

The Legislative Service Commission has analyzed the bill and produced the report below. While the devil is in the details, and there are some devils, here's a brief breakdown of the policies HB 555 contains:

  • Replaces the current academic performance rating system for school districts, individual buildings of districts, community schools, STEM schools, and college-preparatory boarding schools with a phased-in letter grade system under which districts and schools are assigned grades of "A," "B," "C," "D," or "F" based on 15 measures to reflect the performance profile of each district or school.
  • Creates six component classifications in which each performance measure is categorized; a grade is assigned for each component and factored into the overall grade for a school district or building.
  • Requires the State Board of Education to develop an alternative academic performance rating system for community schools serving primarily students enrolled in dropout prevention and recovery programs.
  • Establishes criteria for closing dropout prevention and recovery community schools based on their academic performance.
  • Requires the Department of Education to review additional information included on report cards and submit to the Governor and the General Assembly recommendations for revisions.
  • Establishes a new evaluation process for determining which community school sponsors may sponsor additional schools.
  • Permits the Ohio Office of School Sponsorship to sponsor a community school if the school's sponsor has been prohibited from sponsoring additional schools.
  • Delays implementation of the new sponsor evaluation system until the 2015-2016 school year.
  • Renames the Ohio Accountability Task Force as the Ohio Accountability Advisory Committee and alters its membership and duties.
  • Requires the State Board to submit to the General Assembly recommendations for a comprehensive statewide plan to intervene in and improve the performance of persistently poor performing schools and school districts.
  • Reinstates the permanent requirement for five scoring ranges on the state achievement assessments.
  • Requires a school district to provide immediate services and regular diagnostic assessments for a student found to have a reading deficiency pending development of the student's reading improvement and monitoring plan required under continuing law.
  • Adds college-preparatory boarding schools to the provisions requiring the Department of Education to rank public schools by expenditures.
  • Requires that a designated fiscal officer of a community school be licensed as a school treasurer by the State Board of Education prior to assuming the duties of fiscal officer.
  • Requires the Department of Education to conduct two application periods each year for the Educational Choice Scholarship Program.
  • Establishes measures the Superintendent of Public Instruction must consider before approving new Internet- or computer-based community schools.
  • Restates that the requirements of the standards-based state framework for teacher evaluations and the standards and procedures for nonrenewal of a teacher's contract as a result of the evaluation prevail over any conflicting provisions of a collective bargaining agreement entered into on or after the effective date of the bill.
  • Specifically permits educational service centers to partner in the development of STEM schools.
  • Permits an educational service center to sponsor a new start-up community school in any challenged district in the state, instead of just its service territory, so long as it receives approval to do so from the Department of Education.
  • Qualifies for a War Orphans Scholarship the children of military veterans who participated in an operation for which the Armed Forces Expeditionary Medal was awarded.
  • Authorizes the administrators of the Ohio National Guard Scholarship Program and the Ohio War Orphans Scholarship Program to apply for and receive grants; to accept gifts, bequests, and contributions from public and private sources; and to deposit all such contributions into the respective National Guard Scholarship Reserve Fund (existing) or the Ohio War Orphans Scholarship Fund (created by the bill).

OFT is asking that the following fixes be made to HB 555:

  1. Eliminate graded items for the current school year. It’s not fair to change the rules in the middle of the game, or year. Delay any grades to 2014-2015.
  2. Don’t grade items that are impacted by a lack of resources - participation in AP courses, dual enrollment participation rate, K-3 literacy rate, college admission testing scores, remediation.
  3. Eliminate Accountability Board language.
  4. A composite score dilutes the value of the dashboard and should be eliminated.
  5. Eliminate language that raises the standard and the cut score for achievement tests. This causes double jeopardy for school districts. Raising the cut score and standards from 75 to 80 percent will force more school districts to have lower scores, making them and their buildings subject to possible vouchers for low performance. Only the cut score should be raised.
  6. Safe harbor: For three years the student portion of teacher evaluations should be reduced from 50 percent to 25 percent. For three years school districts currently earning a continuous improvement rating or higher should be exempt from sanctions.

HB 555 Analysis

How Should Educators Interpret Value-Added Scores?

Highlights

  • Each teacher, in principle, possesses one true value-added score each year, but we never see that "true" score. Instead, we see a single estimate within a range of plausible scores.
  • The range of plausible value-added scores (the confidence interval) can overlap considerably for many teachers. Consequently, we cannot readily distinguish many teachers from one another with respect to their true value-added scores.
  • Two conditions would enable us to achieve value-added estimates with high reliability: first, if teachers' value-added measurements were more precise, and second, if teachers’ true value-added scores varied more dramatically than they do.
  • Two kinds of errors of interpretation are possible when classifying teachers based on value-added: a) “false identifications” of teachers who are actually above a certain percentile but who are mistakenly classified as below it; and b) “false non-identifications” of teachers who are actually below a certain percentile but who are classified as above it. Falsely identifying teachers as being below a threshold poses risk to teachers, but failing to identify teachers who are truly ineffective poses risks to students.
  • Districts can conduct a procedure to identify how uncertainty about true value-added scores contributes to potential errors of classification. First, specify the group of teachers you wish to identify. Then, specify the fraction of false identifications you are willing to tolerate. Finally, specify the likely correlation between value-added scores this year and next year. In most real-world settings, the degree of uncertainty will lead to considerable rates of misclassification of teachers, as the sketch after this list illustrates.
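As a rough illustration of that procedure, here is a simulation sketch; the reliability value and cutoff are assumptions we chose for illustration, not figures from the brief:

```python
# How often does a noisy estimate flag a teacher as "bottom 20%" whose
# true value-added score is not in the bottom 20%? Reliability here
# means var(true) / var(estimate); the 0.4 figure is an assumption.
import random

random.seed(1)
N = 100_000
RELIABILITY = 0.4
CUTOFF = 0.20  # flag the bottom 20% of estimates

noise_sd = (1 / RELIABILITY - 1) ** 0.5
true_scores = [random.gauss(0, 1) for _ in range(N)]
estimates = [t + random.gauss(0, noise_sd) for t in true_scores]

true_cut = sorted(true_scores)[int(N * CUTOFF)]
est_cut = sorted(estimates)[int(N * CUTOFF)]

flagged = [i for i, e in enumerate(estimates) if e < est_cut]
false_ids = sum(true_scores[i] >= true_cut for i in flagged)

# Share of flagged teachers who do not truly belong to the bottom 20%.
print(f"{false_ids / len(flagged):.0%}")
```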

Introduction

A teacher's value-added score is intended to convey how much that teacher has contributed to student learning in a particular subject in a particular year. Different school districts define and compute value-added scores in different ways. But all of them share the idea that teachers who are particularly successful will help their students make large learning gains, that these gains can be measured by students' performance on achievement tests, and that the value-added score isolates the teacher's contribution to these gains.
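Definitions differ, but a bare-bones sketch of that shared idea, assuming a simple prior-score adjustment (one common form, not any particular district's model), might look like this:

```python
# Predict each student's current score from their prior score, then
# credit each teacher with the average amount by which their students
# beat (or miss) the prediction. The data is invented; real systems
# add many more controls.
from collections import defaultdict
from statistics import covariance, mean, variance  # covariance: Python 3.10+

# (teacher, prior-year score, current-year score) per student
students = [
    ("Smith", 60, 72), ("Smith", 70, 80), ("Smith", 55, 61),
    ("Jones", 65, 66), ("Jones", 80, 78), ("Jones", 50, 54),
]

priors = [p for _, p, _ in students]
currents = [c for _, _, c in students]

# One-variable least squares: current ~ a + b * prior
b = covariance(priors, currents) / variance(priors)
a = mean(currents) - b * mean(priors)

residuals = defaultdict(list)
for teacher, prior, current in students:
    residuals[teacher].append(current - (a + b * prior))

# A teacher's "value added" is the mean residual of their students.
for teacher, res in residuals.items():
    print(teacher, round(mean(res), 2))
```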

A variety of people may see value-added estimates, and each group may use them for different purposes. Teachers themselves may want to compare their scores with those of others and use them to improve their work. Administrators may use them to make decisions about teaching assignments, professional development, pay, or promotion. Parents, if they see the scores, may use them to request particular teachers for their children. And, finally, researchers may use the estimates for studies on improving instruction.

Using value-added scores in any of these ways can be controversial. Some people doubt the validity of the achievement tests on which the scores are based, some question the emphasis on test scores to begin with, and others challenge the very idea that student learning gains reflect how well teachers do their jobs.

Our purpose is not to settle these controversies, but, rather, to answer a more limited, but essential, question: How might educators reasonably interpret value-added scores? Social science has yet to come up with a perfect measure of teacher effectiveness, so anyone who makes decisions on the basis of value-added estimates will be doing so in the midst of uncertainty. Making choices in the face of doubt is hardly unusual – we routinely contend with projected weather forecasts, financial predictions, medical diagnoses, and election polls. But as in these other areas, in order to sensibly interpret value-added scores, it is important to do two things: understand the sources of uncertainty and quantify its extent. Our aim is to identify possible errors of interpretation, to consider how likely these errors are to arise, and to help educators assess how consequential they are for different decisions.

We'll begin by asking how value-added scores are defined and computed. Next, we'll consider two sources of error: statistical bias and statistical imprecision.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/interpreting-value-added/"]Continue reading...[/readon2]