Spring 1998

'League Tables' of School Performance
By Ken Gannicott

A Legitimate Tool of Public Policy

League tables of school performance are hated with a passion by many educators, and they do seem to be based on the most outrageous comparisons, telling us more about the catchment area or financial resources of different schools than whether one school is genuinely better than another. The standard way to allow for this is to ensure that performance indicators measure ‘value added’ by schools. Since measured differences in performance may reflect differences in the ability or background of the students rather than in the school itself, it is important to assess the implication of student background versus genuine value added by the school.

While the importance of value-added measures can be readily acknowledged, raw league tables of performance have an important role to play in public policy. Despite their apparent lack of fairness, league tables uncorrected for value added may actually be an excellent way of helping precisely the disadvantaged students and schools that critics claim are most unfairly treated in a simple ranking of school performance. This is because a school in a poor area might be doing well on value added criteria, but its students might still be leaving school without adequate standards of literacy or numeracy. The importance of simple league tables lies in the establishment of benchmarks of performance which all children must achieve, irrespective of social or economic background.

Better information about school performance

It is neither exaggeration nor flippancy to note that Australian parents have more information and more rights when they buy a packet of soap powder than when they choose a school for their child. It is therefore worth starting a discussion of league tables by asking about the role of information in education. Information is an essential component of school choice. ‘School choice’ has become the generic term for a wide array of reforms in education. These range from the limited schemes of dezoning and specialist high schools that are now a feature of most Australian States, to more radical proposals such as ‘charter schools.’ In American terminology, charters are schools that are publicly owned and financed, but are privately established and operated in exchange for being held accountable for student performance. There are now some 8,000 charter schools in the United States.

In Australia, the debate about school choice took a sharper focus when the Coalition Government abolished the New Schools Policy that had been in operation between 1985 and 1996. It was not so much that the Government swept away that policy’s severe restrictions on access to public funding for new non-government schools. The crucial innovation was that government schools will have to face the financial consequences of parental choice. Under the Enrolment Benchmark Adjustment, the subsidy to non-government schools will be met by offsetting its cost against funding for the equivalent number of students in government schools.

A critical feature of these reforms, both here and overseas, is that it makes no sense to allow parents substantial choice of schooling unless they have the information that permits an informed choice. That information must, virtually by definition, include comparisons of school performance. More specifically, performance indicators have to be measured in terms of ‘hard’ outcomes such as standardised test scores if they are to be worthwhile to parents and taxpayers. This is in strong contrast to the input or process measures, such as teacher qualifications, class size, or curriculum requirements, that are more common in Australia. We are not used to outcome indicators in Australia, whereas in the United States there is a long history of public information about standardised test results. And we are even less accustomed to such indicators being used to make direct comparisons between schools in the way that has become routine in Britain since the Education Reform Act of 1988.

In the past it could be argued that detailed information about each school’s performance would serve no useful purpose, because each government school was designed to be the same through the application of equitable spending and staffing allocations. Whether or not this was true in the past, the fact is that there is now quite enough evidence (from what is known as the effective schools literature) to know that schools do differ in their effectiveness.

Another reason for the lack of information is that few would deny the multifaceted nature of schooling. It is obviously desirable to have performance measures that are not confined to one dimension of school outcomes; the reality is that no single indicator of performance can capture all that schools try to achieve. But not all objectives are equally important, and it is sheer cant to pretend that useful conclusions about schools cannot be drawn if we have only limited information about their performance. If the argument is that imperfect or limited information is actually worse than no information at all, it is a principle that would make life impossible if adopted across the board. In just about every aspect of life we have to manage with only imperfect information to guide decisions.

Health provides a useful analogy. There is a multiplicity of ways in which the health of the population or the performance of complex health systems can be measured. No single indicator can provide an overall picture. Nevertheless, despite some obvious limitations, and some progress with more refined measures, it is common in international comparative work to use one indicator, the infant mortality rate, as a surrogate for the quality of health care available to the population as a whole.

To come back to education, student academic achievement is by far the most important and fundamental issue in schooling. Whatever the variety of objectives they pursue, all schools have as their central purpose the academic development of their students. This central purpose is implicitly acknowledged in the public debate when shorthand phrases such as ‘basic skills’ or ‘literacy and numeracy standards’ are used as proxies for school effectiveness. If we cannot measure performance on this central indicator of academic achievement we might as well throw in the towel and concede that we are unlikely to make much progress with the more diffuse objectives.

Probably the main reason for the lack of information about school performance is a belief that only educators can be trusted with the information: parents and others would misuse or misinterpret it. In most Australian states, teachers’ unions have acquiesced in basic skills testing only on condition that there should be no public release of school results and that no use be made of them to compare schools. This argument seemed to get some support in early 1997 when the unfortunate Year 12 at Mount Druitt High School in western Sydney found their class photo splashed all over the front page of the Daily Telegraph. No student at the school had received a Tertiary Entrance Ranking (TER) above 44·4 in the 1996 Higher School Certificate. There is no doubt that the media coverage was extremely uncomfortable for students, parents and the Department of School Education, but these are not conclusive arguments against publication (see Box 1).
 

Box 1: The Mount Druitt episode 

My own view of the Mount Druitt episode is that the class photo should not have been published. I thought its use particularly unfortunate because the two main articles written in the Telegraph about Mt Druitt were both very thoughtful pieces, not at all criticising the students, but raising a legitimate matter of public concern about the school and its HSC results. Whatever the causes and significance of those results, this was an outcome that needed explanation. In my view the sensationalism of the photograph detracted from the thoughtfulness of the articles. But even though I thought it inappropriate to publish that photo, the Mt Druitt case does not support in any way an argument that information about school performance will be misused and therefore should be suppressed.  
I have heard other opinions that the use of that photograph was entirely justified, because it brought home in a way no amount of text or TER rankings can do in a tabloid newspaper the deeply personal impact on those students of their HSC results. The crucial point is that, while some objected to publication of those results, there are other views which claim that there was a legitimate matter of public interest, and that the style of presentation was appropriate for the context.  
 

More generally, the idea that information is OK but only if it is provided on terms that educators find acceptable is itself quite unacceptable. Information about performance on the wharves that came solely from the MUA would be treated with derision; it is not acceptable for airlines to set their own safety standards and to decide which safety statistics should be published; car owners prefer the tests of crash performance or repair costs performed by motoring organisations to the claims put out by the car manufacturers; those in New South Wales have learned to treat with exceptional caution data from the Roads and Traffic Authority on the alleged need for urban expressways, because it is known that the RTA’s engineers like to build roads; New South Wales has also learned not to take the police at their own valuation of themselves; and so on.

In explaining all these examples there is no need to invoke some silly conspiracy theory or cast doubts on anyone’s integrity. It is enough to note that (i) as parents, taxpayers, citizens, we need information to be able to make responsible and informed decisions and (ii) there are legitimate reasons why a consumer’s or taxpayer’s perspective is not necessarily the same as a supplier’s or provider’s perspective. The same logic should apply to education.

The principle of value-added adjustments

General principles about the flow of information are all very well, but critics have a case when they argue that accurate information about school performance is possible only after careful correction for the prior achievement of the students and the social and economic context of the schools. Without that correction, measured differences in performance may reflect differences in the ability and background of the students, rather than the value added by the school itself.

In current British usage, value-added means adjusting students’ test scores for their prior academic attainment. This adjustment doesn’t take into account the social or economic background of the school and its students, corrections that have been prominent in American studies. The argument in Britain is that prior attainment is already influenced by such factors, so correcting test scores for prior attainment will also correct them for relevant socio-economic differences.

On this definition, value-added is simply a variant of what the equivalent American studies term selection bias. Anyone who looks at the American educational literature quickly realises that both researchers and policymakers have available an enviable range of data on academic performance in schools. The consequence is that American studies can make an elaborate correction for socio-economic context, ethnic group, school tracking, and many other variables. The American studies, in short, usually go well beyond the British concept of correcting only for prior attainment and hoping that this will implicitly incorporate socio-economic effects. Nevertheless, the concept is identical: the aim is to compare schools only after ‘netting out’ differences between students.

It is worth noting that although the practical problems of data availability and measurement can be severe, in principle modern statistical methods are quite capable not only of allowing for a wide variety of both quantitative and qualitative variables, but of adjusting for simultaneous or two-way relationships between variables. One of the most influential books on educational policy in recent years (Politics, Markets and America’s Schools by John Chubb and Terry Moe) measured student performance after allowing for an extraordinary range of variables – economic and social and school and family and personal characteristics. They even measured the impact of variables such as administrative routines in classrooms, disciplinary policy and attitudes of the principal.

Data does not exist to do anything remotely comparable in this country. The nearest contemporary example from Australia is the more modest but nonetheless welcome innovation from Victoria, which for the last two years has published a list of schools ranked by VCE performance on the basis of what would be expected from their performance in the General Achievement Test held earlier in Year 12. The Victorian exercise is much closer to the limited British idea of correcting for prior attainment than the all-singing-all-dancing American comparisons.

The Victorian example also brings home the fact that there is no uniquely correct way of making value-added corrections. Even with an extensive set of data the results will depend on how the variables are specified and the type of model in which they are used. There are also technical issues of multi-level modelling and the mobility of students between schools.

When all is said and done and all the academic caveats are made, the bottom line is that if progress is to be made in providing essential information about performance in schools, properly adjusted for student background, value-added comparisons are clearly the way to go. As already acknowledged, league tables not adjusted for value added can seem outrageously unfair. Schools in areas of social and economic deprivation, or those with a high proportion of students from a non-English speaking background, are compared directly with schools in favoured locations. Critics have a case when they argue that position on a raw league table may tell us a lot about a school’s catchment area, but much less about how good it is at educating children.

Why league tables are a legitimate tool of policy

If this argument is conceded, how is it also possible to claim that plain unadjusted league tables of school performance might actually be in the best interests of those very students who attend schools in areas of economic or social deprivation?

To make sense of this paradox it is necessary to understand the difference between relative and absolute standards of attainment. Value added is a relative measure. Value added does not judge schools against some absolute criterion of achievement: it judges them in relation to their starting point and against the progress made by other pupils in the sample. Test scores in school X are corrected for the prior attainment of students in that school. This measures the progress they have made (or value added) over some specified period. Corrected test scores in other schools are calculated in the same way. It is then possible to compare the progress made by students in school X with the progress made by students in all other schools in the sample.

It follows that school X may have low final test scores but, in relation to its raw material (many students with low levels of prior attainment), may nonetheless have made very strong progress relative to other schools in the sample. On a value-added basis it therefore rates very highly against other schools. In particular, its value-added rank might be better than that of school Y, whose high performance on an uncorrected basis may owe more to the quality of its intake than the calibre of its teaching.

This is shown in Figure 1. The horizontal axis shows intake scores, or prior attainment. The vertical axis shows output or later attainment, and the regression or trend line shows average performance for all schools. Any school scoring along or near the line is an average school in value-added terms. Its pupils score on final attainment just what we would expect them to get, given the quality of its intake. School 1, which has very high final scores, is actually no better than average in value-added terms. Its final scores, although outstanding, are no better and no worse than would be expected given the high quality of its intake. School 5, on the other hand, has more modest final scores, but would rate very highly in value-added terms.  School 5 lies well above the regression line, showing that it makes better than expected progress given that its intake has low levels of prior attainment.
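
The logic of Figure 1 can be sketched in a few lines of Python. This is a minimal illustration only, not the method used in any of the studies cited: the six schools and their scores are invented, the names SCHOOLS, fit_trend_line and value_added are hypothetical, and a real value-added exercise would use student-level data and a richer model. The sketch fits the trend line by ordinary least squares and treats each school's value added as the gap between its actual final score and the score the line predicts from its intake.

```python
# Hypothetical mean (intake, final) scores per school. The numbers are
# invented for illustration and are not taken from Figure 1.
SCHOOLS = {
    "School 1": (80.0, 82.0),
    "School 2": (70.0, 74.0),
    "School 3": (60.0, 64.0),
    "School 4": (50.0, 48.0),
    "School 5": (40.0, 56.0),
    "School 6": (30.0, 33.0),
}


def fit_trend_line(points):
    """Least-squares fit of final = intercept + slope * intake."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def value_added(schools):
    """Actual final score minus the score the trend line predicts."""
    intercept, slope = fit_trend_line(list(schools.values()))
    return {
        name: final - (intercept + slope * intake)
        for name, (intake, final) in schools.items()
    }


if __name__ == "__main__":
    va = value_added(SCHOOLS)
    # List schools from highest to lowest value added.
    for name in sorted(va, key=va.get, reverse=True):
        intake, final = SCHOOLS[name]
        print(f"{name}: intake {intake:5.1f}, final {final:5.1f}, "
              f"value added {va[name]:+6.2f}")
```

With these invented figures, School 1 has the highest final score but sits on the trend line, so its value added is about zero, while School 5 has a modest final score but the largest positive gap above the line.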
 

But supposing school 5’s excellent value-added progress masks the fact that its final test scores are very low in an absolute sense? Statistically speaking, the school has done well given its pupil intake, but suppose its graduates are nonetheless leaving with a level of literacy or numeracy considered too low for effective functioning in contemporary society? Exactly how to establish that benchmark is, of course, far from easy, and reasonable people can hold different opinions. Turning to Figure 2, suppose that output score C is judged the benchmark of what is required as an effective score for literacy or numeracy or whatever is being measured along that scale. Measured against that benchmark, school 5’s students are failing to meet adequate standards.
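
The contrast between the relative and the absolute view can be made concrete with a short Python sketch. Everything in it is an assumption made for illustration: the final scores, the value-added figures and the benchmark value standing in for score C are invented, and the function names are hypothetical. The point is only that a raw league table plus a benchmark picks out schools that a value-added ranking would rate highly.

```python
# Hypothetical final scores and value-added figures. All numbers, and the
# benchmark standing in for score C, are invented for illustration.
BENCHMARK_C = 60.0

RESULTS = {
    # school: (final score, value added relative to the trend line)
    "School 1": (82.0, 0.0),
    "School 2": (74.0, 1.0),
    "School 3": (64.0, 0.0),
    "School 4": (48.0, -7.0),
    "School 5": (56.0, 10.0),
    "School 6": (33.0, -4.0),
}


def raw_league_table(results):
    """Schools ordered by final score, highest first."""
    return sorted(results, key=lambda name: results[name][0], reverse=True)


def below_benchmark(results, benchmark):
    """Schools whose final score falls short of the absolute benchmark."""
    return [
        name for name in raw_league_table(results)
        if results[name][0] < benchmark
    ]


def masked_by_value_added(results, benchmark):
    """Schools with above-average value added that still miss the benchmark."""
    return [
        name for name in below_benchmark(results, benchmark)
        if results[name][1] > 0
    ]


if __name__ == "__main__":
    print("Raw league table:", raw_league_table(RESULTS))
    print("Below benchmark:", below_benchmark(RESULTS, BENCHMARK_C))
    print("Strong value added, still below benchmark:",
          masked_by_value_added(RESULTS, BENCHMARK_C))
```

On these invented numbers School 5 leads on value added yet appears in the list of schools below the benchmark, which is exactly the case that a value-added ranking alone would hide.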

David Reynolds, who has played a prominent role in educational policy in Britain, nicely contrasts the advantages of the relative and absolute perspectives:

There’s no doubt that value added was a great advance on what had gone before. But value added was originally a research tool. It has been pushed by enthusiasts as an instrument of public policy. And when it becomes that, you have to use it differently. The policy judgements to be made about value added go beyond its scientific applications. For example, a school doing moderately well in a poor catchment area would, on value-added criteria, be an effective school. Yet the children coming out of the school might well not be literate or numerate. Public policy is concerned about finishing points – about outcomes being as close as possible to absolute, not relative success. Value added relativises things (1997: 6).

It was the government’s insistence that all students, regardless of social or economic background, should achieve appropriate standards of literacy and numeracy that underpinned the development of raw league tables in Britain. Similarly, the acceptance of absolute levels or benchmarks of attainment underlies the Blair Government’s policy of ‘zero tolerance’ of failure. In other words, every child must reach minimum standards in designated curriculum areas, irrespective of social or economic background.

The only way to establish whether this is happening is to judge students and schools by absolute criteria or benchmarks. In Australia, the State Education Ministers have already agreed that every child leaving primary school should be numerate, and able to read, write and spell at an appropriate level. Whether we like it or not, this approach implies setting the same target for all schools, regardless of their social or economic context.

And this leads back to the starting point of this argument, that those ostensibly very unfair raw league tables can actually play a legitimate role in public policy in ensuring that all students reach adequate standards. As Reynolds notes, value added ‘relativises things.’ A harsher way to make the same point is to observe that value added can lead to complacent acceptance of unacceptably low standards of achievement. Schools with low absolute levels of achievement can claim they are doing as well as can be expected, given the raw material they have available. The real problem, say the advocates of absolute standards, is that relative measures can lead you not to expect enough of your students. Low expectations can become self-fulfilling, so that adequate standards are never achieved. Value added then becomes an excuse for unacceptable results.

It is worth noting that Reynolds’s work on international education played an important role in developing the notion of absolute standards and zero tolerance of failure. In Worlds Apart (1996), he noted that high achievement scores in several East Asian and European countries were partly a result of the belief in those countries that all children are able to achieve certain core skills in core subjects, and that there was no need for the substantial ‘trailing edge’ of low performing pupils that is a feature of schooling in Britain and America.

This is very closely related to the conclusion that emerges from the American studies. Many of the academic studies of school performance in the United States compare public and Catholic schools. The general result that emerges from those studies is that Catholic schools outperform government schools, even when the sample is corrected for selection bias, to use the American term. The reason for this is much more difficult to discern, but it seems to be a result of Catholic schools demanding more of their students.

In one of the best known studies, Hoffer and his colleagues pointed out that ‘Catholic schools place in an academic track many students whose sophomore [second year] achievement would relegate them to a general or vocational track in public schools [and] Catholic schools demand more homework and advanced coursework, especially from those who are disadvantaged in one way or another ...’ (Hoffer, Greeley and Coleman 1985). Thus, it is true to say that public schools which made the same demands as the average Catholic school would produce comparable achievement, but the reality is that many public schools make lesser demands of their students.

In Britain a White Paper titled Excellence in Schools was released barely weeks after the last election, and that paper observed that ‘one of the most powerful underlying reasons for low performance in our schools has been low expectations which have allowed poor quality teaching to continue unchallenged. Too many teachers, parents and pupils have come to accept a ceiling on achievement that is far below what is possible. Schools often fail to stretch the most able; and they have not been good at identifying and pushing the modest or poor performers or those with special educational needs. In some cases the excuse has been that “you cannot expect high achievement from children in a run-down area like this”’ (Department for Education and Employment 1997).

And, to come back briefly to the example of Mt Druitt High School, those who have read the Department of School Education’s report (the Laughlin Review) will know that it was very difficult to identify any specific reason why that Year 12 performed badly, but scattered here and there were suggestions that, in the aftermath of a loss of its better pupils to other schools, a culture of high expectations may not have been part of the Mt Druitt environment.

Conclusion

It is really a question of perspective. To say that there is evidence, on a value-added basis, that school 1 performs better than school 2 does not, of itself, demonstrate that performance in school 1 reaches acceptable standards.  If the aim of public policy is to focus upon absolute standards, which all children must achieve regardless of background, then the perspective shifts. It is now entirely appropriate to know which schools and students, regardless of social or economic background and regardless of the quality of the intake, are failing to meet those standards. It is, in short, appropriate to focus upon a simple league table of performance.

On this line of argument, setting absolute standards which all schools are expected to achieve is the only way to ensure that over time all students reach the minimum appropriate standard. It will certainly be uncomfortable, and without question it will seem outrageously unfair to schools that consider themselves disadvantaged in one way or another. But the gains to society as a whole will be substantial. Ranking schools by performance, with full public disclosure of those not meeting adequate standards, will itself contribute to improved achievement. This ranking and its disclosure will both set the appropriate expectations and provide the motivation for schools to lift their game.
 
References

Department for Education and Employment 1997, Excellence in Schools, London.

Hoffer, T., A. Greeley and J. Coleman 1985, ‘Achievement Growth in Public and Catholic High Schools,’ Sociology of Education 58(1).

Reynolds, D. 1997, ‘Labour task force professor defends use of raw results,’ Times Educational Supplement 7 March: 6.

Reynolds, D. and S. Farrell 1996, Worlds Apart? A Review of International Surveys of Educational Achievement Involving England, Office for Standards in Education, HMSO, London.

About the Author:
Ken Gannicott is Professor of Education at the University of Wollongong. This article is an edited extract from a study prepared for the Department of Employment, Education, Training and Youth Affairs and published by DEETYA in June 1998 under the title ‘School Autonomy and Academic Performance’. An earlier version of this paper was presented at the Conference of the National Council of Independent Schools’ Associations held in Melbourne in May 1998.

The author would like to thank Ms Fiona Ogilvy-O’Donnell of the Association of Independent Schools of Victoria for her agreement to publication of the paper. The views expressed in the paper do not necessarily represent the views of either DEETYA or AISV.


Policy is the quarterly review of The Centre for Independent Studies.