
Charter School Authorization And Growth

If you ask a charter school supporter why charter schools tend to exhibit inconsistency in their measured test-based impact, there’s a good chance they’ll talk about authorizing. That is, they will tell you that the quality of authorization laws and practices — the guidelines by which charters are granted, renewed and revoked — drives much and perhaps even most of the variation in the performance of charters relative to comparable district schools, and that strengthening these laws is the key to improving performance.

Accordingly, a recently announced campaign by the National Association of Charter School Authorizers aims to step up the rate at which charter authorizers close “low-performing schools” and to make authorizers more selective in allowing new schools to open. In addition, a recent CREDO study found (among other things) that charter middle and high schools’ performance during their first few years is more predictive of future performance than many people may have thought, thus lending support to the idea of opening and closing schools as an improvement strategy.

Below are a few quick points about the authorization issue, which lead up to a question about the relationship between selectivity and charter sector growth.

The reasonable expectation is that authorization matters, but its impact is moderate. Although there has been some research on authorizer type and related factors, there is, as yet, scant evidence as to the influence of authorization laws/practices on charter performance. In part, this is because such effects are difficult to examine empirically. However, without some kind of evidence, the “authorization theory” may seem a bit tautological: There are bad charters because authorizers allow bad charters to open, and fail to close them.

That said, the criteria and processes by which charters are granted/renewed almost certainly have a meaningful effect on performance, and this is an important area for policy research. On the other hand, it’s a big stretch to believe that these policies can explain a large share of the variation in charter effects. There’s a reasonable middle ground for speculation here: Authorization has an important but moderate impact, and, thus, improving these laws and practices is definitely worthwhile, but seems unlikely to alter radically the comparative performance landscape in the short- and medium-term (more on this below).

Strong authorization policies are a good idea regardless of the evidence. Just to be clear, even if future studies find no connection between improved authorization practices and outcomes, test-based or otherwise, it’s impossible to think of any credible argument against them. If you’re looking to open a new school (or you’re deciding whether or not to renew an existing one), there should be strong, well-defined criteria for being allowed to do so. Anything less serves nobody, regardless of their views on charter schools.

[readon2 url="http://shankerblog.org/?p=8510"]Continue reading...[/readon2]

Two new studies question value-added measures

The evidence is becoming overwhelming, as yet more studies show that using value-added to measure teacher quality is fraught with error.

Academic tracking in secondary education appears to confound an increasingly common method for gauging differences in teacher quality, according to two recently released studies.

Failing to account for how students are sorted into more- or less-rigorous classes—as well as the effect different tracks have on student learning—can lead to biased "value added" estimates of middle and high school teachers' ability to boost their students' standardized-test scores, the papers conclude.

"I think it suggests that we're making even more errors than we need to—and probably pretty large errors—when we're applying value-added to the middle school level," said Douglas N. Harris, an associate professor of economics at Tulane University in New Orleans, whose study examines the application of a value-added approach to middle school math scores.

High-school-level findings from a second study, by C. Kirabo Jackson, an associate professor of human development and social policy at Northwestern University in Evanston, Ill., complement Mr. Harris' paper.

"At the elementary level, [value-added] is a pretty reliable measure, in terms of predicting how teachers will perform the following year," Mr. Jackson said. "At the high school level, it is quite a bit less reliable, so the scope for using this to improve student outcomes is much more limited."

The first study mentioned in this article concludes (emphasis ours):

We test the degree to which variation in measured performance is due to misalignment versus selection bias in a statewide sample of middle schools where students and teachers are assigned to explicit “tracks,” reflecting heterogeneous student ability and/or preferences. We find that failing to account for tracks leads to large biases in teacher value-added estimates.

A teacher of all lower track courses whose measured value-added is at the 50th percentile could increase her measured value-added to the 99th percentile simply by switching to all upper-track courses. We estimate that 75-95 percent of the bias is due to student sorting and the remainder due to test misalignment.

We also decompose the remaining bias into two parts, metric and multidimensionality misalignment, which work in opposite directions. Even after accounting for explicit tracking, the standard method for estimating teacher value-added may yield biased estimates.
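To see the mechanics of the sorting bias described above, consider a minimal simulation (a sketch; the sizes and magnitudes below are illustrative assumptions, not parameters from the study): students sort into tracks partly on ability, the prior-year score measures that ability with error, and a value-added model that omits track indicators credits upper-track teachers with their students' unmeasured ability.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers, n_students = 40, 30

true_effect = rng.normal(0, 0.2, n_teachers)   # true teacher value-added
track = np.repeat([0, 1], n_teachers // 2)     # 0 = lower track, 1 = upper track

rows = []
for t in range(n_teachers):
    ability = rng.normal(1.0 * track[t], 1, n_students)   # sorting on ability
    prior = ability + rng.normal(0, 0.7, n_students)      # noisy prior score
    post = ability + true_effect[t] + rng.normal(0, 0.3, n_students)
    rows += [(t, track[t], p, q) for p, q in zip(prior, post)]

data = np.array(rows)
tid = data[:, 0].astype(int)
trk, prior, post = data[:, 1], data[:, 2], data[:, 3]

def value_added(control_track):
    """Regress post score on prior score (optionally plus a track dummy),
    then average each teacher's residuals: a simple VA estimate."""
    cols = [np.ones_like(prior), prior] + ([trk] if control_track else [])
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    resid = post - X @ beta
    return np.array([resid[tid == t].mean() for t in range(n_teachers)])

for control in (False, True):
    va = value_added(control)
    r = np.corrcoef(va, true_effect)[0, 1]
    gap = va[track == 1].mean() - va[track == 0].mean()
    print(f"track control={control}: corr with truth={r:.2f}, "
          f"upper-minus-lower gap={gap:+.2f}")
```

With the track dummy included, the between-track gap in the estimates largely disappears and they line up much better with the true effects; without it, upper-track teachers are systematically over-rated, which is the pattern the study describes.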

The second study replicates the findings and concludes:

Unlike in elementary-school, high-school teacher effects may be confounded with both selection to tracks and unobserved track-level treatments. I document sizable confounding track effects, and show that traditional tests for the existence of teacher effects are likely biased. After accounting for these biases, algebra teachers have modest effects and there is little evidence of English teacher effects.

Unlike in elementary-school, value-added estimates are weak predictors of teachers’ future performance. Results indicate that either (a) teachers are less influential in high-school than in elementary-school, or (b) test-scores are a poor metric to measure teacher quality at the high-school level.
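A quick way to see what "quite a bit less reliable" means is to compute the year-to-year correlation of simulated estimates (the magnitudes are assumptions for illustration, not numbers from Jackson's paper): each year's estimate is a stable teacher component plus annual noise, and when the stable component shrinks, so does the power to predict next year's result.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # simulated teachers

def year_to_year_corr(signal_sd, noise_sd):
    """Each year's estimate = stable teacher effect + independent annual noise;
    return the correlation between two consecutive years of estimates."""
    effect = rng.normal(0, signal_sd, n)
    year1 = effect + rng.normal(0, noise_sd, n)
    year2 = effect + rng.normal(0, noise_sd, n)
    return np.corrcoef(year1, year2)[0, 1]

# Illustrative, assumed magnitudes: a larger stable component ("elementary-like")
# versus a smaller one ("high-school-like"), with identical annual noise.
print("elementary-like :", round(year_to_year_corr(0.20, 0.15), 2))
print("high-school-like:", round(year_to_year_corr(0.05, 0.15), 2))
```

The underlying arithmetic is just signal variance divided by total variance; nothing differs between the two scenarios except how much of the estimate is stable signal rather than year-to-year noise.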

Corporate education reformers need to begin to address the science that is refuting their policies; the sooner this happens, the less damage is likely to be wrought.

Value-Added and Teacher Branding

The video and report below discuss the problems found with value-added.

Audrey Amrein-Beardsley and Clarin Collins of the Mary Lou Fulton Teachers College at Arizona State University present “The SAS Education Value-Added Assessment System (SAS® EVAAS®) in the Houston Independent School District (HISD): Intended and Unintended Consequences”.

The SAS Educational Value-Added Assessment System (SAS® EVAAS®) is the most widely used value-added system in the country. It is also self-proclaimed as “the most robust and reliable” system available, with its greatest benefit to help educators improve their teaching practices. This study critically examined the effects of SAS® EVAAS® as experienced by teachers, in one of the largest, high-needs urban school districts in the nation – the Houston Independent School District (HISD).

Using a multiple methods approach, this study critically analyzed retrospective quantitative and qualitative data to better comprehend and understand the evidence collected from four teachers whose contracts were not renewed in the summer of 2011, in part given their low SAS® EVAAS® scores.

This study also suggests some intended and unintended effects that seem to be occurring as a result of SAS® EVAAS® implementation in HISD. In addition to issues with reliability, bias, teacher attribution, and validity, high-stakes use of SAS® EVAAS® in this district seems to be exacerbating unintended effects.

Here's the video:

Teacher Retention: Estimating and Understanding the Effects of Financial Incentives

There is currently much interest in improving access to high-quality teachers (Clotfelter, Ladd, & Vigdor, 2010; Hanushek, 2007) through improved recruitment and retention. Prior research has shown that it is difficult to retain teachers, particularly in high-poverty schools (Boyd et al., 2011; Ingersoll, 2004). Although there is no one reason for this difficulty, there is some evidence to suggest teachers may leave certain schools or the profession in part because of dissatisfaction with low salaries (Ingersoll, 2001).

Thus, it is possible that by offering teachers financial incentives, whether in the form of alternative compensation systems or standalone bonuses, they would become more satisfied with their jobs and retention would increase. As of yet, however, support for this approach has not been grounded in empirical research.

Denver’s Professional Compensation System for Teachers (“ProComp”) is one of the most prominent alternative teacher compensation reforms in the nation.* Via a combination of ten financial incentives, ProComp seeks to increase student achievement by motivating teachers to improve their instructional practices and by attracting and retaining high-quality teachers to work in the district.

My research examines ProComp in terms of: 1) whether it has increased retention rates; 2) the relationship between retention and school quality (defined in terms of student test score growth); and 3) the reasons underlying these effects. I pay special attention to the effects of ProComp on schools that serve high concentrations of poor students – “Hard to Serve” (HTS) schools where teachers are eligible to receive a financial incentive to stay. The quantitative findings are discussed briefly below (I will discuss my other results in a future post).

[readon2 url="http://shankerblog.org/?p=4633"]Continue reading...[/readon2]

The full paper is below:

TEACHER RETENTION: ESTIMATING AND UNDERSTANDING THE EFFECTS OF FINANCIAL INCENTIVES IN DENVER

What Value-Added Research Does And Does Not Show

Worth reading in its entirety.

For example, the most prominent conclusion of this body of evidence is that teachers are very important, that there’s a big difference between effective and ineffective teachers, and that whatever is responsible for all this variation is very difficult to measure (see here, here, here and here). These analyses use test scores not as judge and jury, but as a reasonable substitute for “real learning,” with which one might draw inferences about the overall distribution of “real teacher effects.”

And then there are all the peripheral contributions to understanding that this line of work has made.

Prior to the proliferation of growth models, most of these conclusions were already known to teachers and to education researchers, but research in this field has helped to validate and elaborate on them. That’s what good social science is supposed to do.

Conversely, however, what this body of research does not show is that it’s a good idea to use value-added and other growth model estimates as heavily-weighted components in teacher evaluations or other personnel-related systems. There is, to my knowledge, not a shred of evidence that doing so will improve either teaching or learning, and anyone who says otherwise is misinformed.*

As has been discussed before, there is a big difference between demonstrating that teachers matter overall – that their test-based effects vary widely, and in a manner that is not just random – and being able to accurately identify the “good” and “bad” performers at the level of individual teachers. Frankly, to whatever degree the value-added literature provides tentative guidance on how these estimates might be used productively in actual policies, it suggests that, in most states and districts, it is being done in a disturbingly ill-advised manner.

[readon2 url="http://shankerblog.org/?p=4358&mid=5417"]Read entire article[/readon2]

Like an untested drug?

If there were a new drug that had shown some promise in curing the flu in lab trials, but there were also indicators that it had some nasty, in some cases fatal, side effects, would you think that drug required more testing and trials, or that it should be rushed into production and given out as widely as possible?

That's basically the scenario we have with using value-added scores for high-stakes decision making when it comes to teachers. Sure, no one is actually going to die, but if corporate education reformers have their way, many teachers might wrongly lose their jobs, the money wasted will never be used to actually educate a student, and there is the opportunity cost of missing the chance to get genuinely effective reforms into the classroom.

Given the context-dependency of the estimators’ ability to produce accurate results, however, and our current lack of knowledge regarding prevailing assignment practices, VAM-based measures of teacher performance, as currently applied in practice and research, must be subjected to close scrutiny regarding the methods used and interpreted with a high degree of caution.

Methods of constructing estimates of teacher effects that we can trust for high-stakes evaluative purposes must be further studied, and there is much left to investigate. In future research, we will explore the extent to which various estimation methods, including more sophisticated dynamic treatment effects estimators, can handle further complexity in the DGPs.

The addition of test measurement error, school effects, time-varying teacher effects, and different types of interactions among teachers and students are a few of many possible dimensions of complexity that must be studied. Finally, diagnostics are needed to identify the structure of decay and prevailing teacher assignment mechanisms. If contextual norms with regard to grouping and assignment mechanisms can be deduced from available data, then it may be possible to determine which estimators should be applied in a given context.
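The authors' point about context-dependency can be illustrated with a toy data-generating process (an assumed sketch, not the study's code): simulate score decay and two assignment regimes, then compare a gain-score estimator, which implicitly assumes no decay, with a lagged-score regression that estimates it.

```python
import numpy as np

rng = np.random.default_rng(2)
n_teachers, n_per = 50, 30
decay = 0.6   # only 60% of last year's score persists into this year

def simulate(assignment):
    """Toy DGP: post = decay * prior + teacher effect + noise."""
    effect = rng.normal(0, 0.2, n_teachers)
    prior = rng.normal(0, 1, n_teachers * n_per)
    if assignment == "tracked":   # sorted into classes on prior score + luck
        order = np.argsort(prior + rng.normal(0, 0.5, prior.size))
    else:                         # random assignment
        order = rng.permutation(prior.size)
    tid = np.empty(prior.size, dtype=int)
    tid[order] = np.repeat(np.arange(n_teachers), n_per)
    post = decay * prior + effect[tid] + rng.normal(0, 0.3, prior.size)
    return tid, prior, post, effect

def estimate(tid, prior, post, lagged):
    """Teacher effects from a gain-score model (assumes decay = 1)
    or a lagged-score model (estimates the decay)."""
    D = np.eye(n_teachers)[tid]                  # teacher dummies
    if lagged:
        X = np.column_stack([prior, D])
        beta, *_ = np.linalg.lstsq(X, post, rcond=None)
        return beta[1:]
    beta, *_ = np.linalg.lstsq(D, post - prior, rcond=None)
    return beta

for assignment in ("random", "tracked"):
    tid, prior, post, effect = simulate(assignment)
    for lagged, name in ((False, "gain-score  "), (True, "lagged-score")):
        est = estimate(tid, prior, post, lagged)
        print(f"{assignment:7s} assignment, {name}: "
              f"corr with truth = {np.corrcoef(est, effect)[0, 1]:.2f}")
```

An estimator that looks adequate under random assignment can fail badly under tracked assignment, which is why diagnostics for the prevailing assignment mechanism matter before trusting any particular value-added model.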

We must be able to prove that evaluations and the metrics that make them up are fair, accurate and stable, and if they are to have any real benefit they must ultimately demonstrate a cost-effective way to improve student achievement and education quality. We're simply not there yet, and pretending we are is dangerous and carries some very real risks.