A new study published in Education Policy Analysis Archives, titled "Houston, We Have a Problem: Teachers Find No Value in the SAS Education Value-Added Assessment System (EVAAS®)," looks at the use of value-added modeling in the real world. Its findings are not shocking, but they continue to be troubling as we enter a high-stakes phase of deployment.
Today, SAS EVAAS® is the most widely used VAM in the country, and North Carolina, Ohio, Pennsylvania and Tennessee use the model state-wide (Collins & Amrein-Beardsley, 2014). Despite widespread popularity of the SAS EVAAS®, however, no research has been done from the perspective of teachers to examine how their practices are impacted by this methodology that professedly identifies effective and ineffective teachers. Even more disconcerting is that districts and states are tying consequences to the data generated from the SAS EVAAS®, entrusting the sophisticated methodologies to produce accurate, consistent, and reliable data, when it remains unknown how the model actually works in practice.
As you can see, the findings here are directly relevant to educators in Ohio. The report looked at a number of factors, including reliability, which once again proves to be anything but:
As discussed in related literature (Baker et al., 2010; Corcoran, 2010; EPI, 2010; Otterman, 2010; Schochet & Chiang, 2010) and preliminary studies in SSD (Amrein-Beardsley & Collins, 2012), it was evident that inconsistent SAS EVAAS® scores year-to-year were an issue of concern. According to teachers who participated in this study, reliability as measured by consistent SAS EVAAS® scores year-to-year was, ironically, an inconsistent reality. About half of the responding teachers reported consistent data whereas the other half did not, just like one would expect with the flip of a coin (see also Amrein-Beardsley & Collins, 2012).
Unless school districts could prevent teacher mobility and ensure equal, random student assignment, it appears that EVAAS is unable to produce reliable results, at least greater than 50% of the time.
A random number generator isn't an appropriate tool for measuring anything, let alone educator effectiveness that might lead to high-stakes career decisions.
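The "flip of a coin" comparison can be made concrete with a small simulation (a hypothetical sketch of my own, not part of the study). If a binary effective/ineffective label carries no stable signal about a teacher, two independent years of labels will agree only about half the time, which is exactly the agreement rate the surveyed teachers reported:

```python
import random

random.seed(42)

def agreement_rate(n_teachers, signal):
    """Simulate two years of binary effective/ineffective labels.
    With probability `signal`, a teacher's label reflects a stable
    trait and is identical both years; otherwise each year's label
    is an independent coin flip. Returns the fraction of teachers
    whose two labels agree."""
    agree = 0
    for _ in range(n_teachers):
        if random.random() < signal:
            agree += 1  # stable trait: same label both years
        else:
            # pure noise: two independent flips agree half the time
            agree += random.choice([True, False]) == random.choice([True, False])
    return agree / n_teachers

print(agreement_rate(100_000, 0.0))  # pure noise: close to 0.5
print(agreement_rate(100_000, 1.0))  # fully stable trait: 1.0
```

The `signal` parameter is an illustrative knob, not anything estimated in the study; the point is simply that roughly 50% year-to-year agreement is what a measure with no stable signal would produce.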
Furthermore, the study found that teachers are discovering that, despite claims to the contrary, the SAS formula for calculating value-added scores is highly dependent upon the student population:
Teachers repeatedly identified specific groups of students (e.g., gifted, ELL, transition, special education) that typically demonstrated little to no SAS EVAAS® growth. Other teachers described various teaching scenarios, such as teaching back-to-back grade levels or switching grade levels, which negatively impacted their SAS EVAAS® scores. Such reports contradict Dr. Sanders’ claim that a teacher in one environment is equally as effective in another (LeClaire, 2011).
In conclusion, the study finds:
The results from this study provide very important information of which not only SSD administrators should be aware, but also any other administrators from districts or states currently using or planning to use a VAM for teacher accountability. Although high-stakes use certainly exacerbates such findings, it is important to consider and understand that unintended consequences will accompany the intended consequences of implementing SAS EVAAS®, or likely any other VAM. Reminiscent of Campbell’s law, the overreliance on value-added assessment data (assumed to have great significance) to make high-stakes decisions risks contamination of the entire educational process, for students, teachers and administrators (Nichols & Berliner, 2007). Accordingly, these findings also strongly validate researchers’ recommendations to not use value-added data for high-stakes consequences (Eckert & Dabrowski, 2010; EPI, 2010; Harris, 2011). While the SAS EVAAS® model’s vulnerability as expressed by the SSD EVAAS®-eligible teachers is certainly compounded by the district’s high-stakes use, the model’s reliability and validity issues combined with teachers’ feedback that the SAS EVAAS® reports do not provide sufficient information to allow for instructional modification or reflection, would make it seem inappropriate at this point to use value-added data for anything.
The full study can be read below.