'League
Tables' of School Performance
By
Ken Gannicott
A Legitimate
Tool of Public Policy
League tables of school
performance are hated with a passion by many educators, and
they do seem to be based on the most outrageous comparisons,
telling us more about the catchment area or financial resources
of different schools than whether one school is genuinely
better than another. The standard way to allow for this is
to ensure that performance indicators measure value
added by schools. Since measured differences in performance
may reflect differences in the ability or background of the
students rather than in the school itself, it is important
to separate the effect of student background from the genuine
value added by the school.
While the importance
of value-added measures can be readily acknowledged, raw league
tables of performance have an important role to play in public
policy. Despite their apparent lack of fairness, league tables
uncorrected for value added may actually be an excellent way
of helping precisely the disadvantaged students and schools
that critics claim are most unfairly treated in a simple ranking
of school performance. This is because a school in a poor
area might be doing well on value added criteria, but its
students might still be leaving school without adequate standards
of literacy or numeracy. The importance of simple league tables
lies in the establishment of benchmarks of performance which
all children must achieve, irrespective of social or economic
background.
Better information
about school performance
It is neither exaggeration
nor flippancy to note that Australian parents have more information
and more rights when they buy a packet of soap powder than
when they choose a school for their child. It is therefore
worth starting a discussion of league tables by asking about
the role of information in education. Information is an essential
component of school choice. School choice has
become the generic term for a wide array of reforms in education.
These range from the limited schemes of dezoning and specialist
high schools that are now a feature of most Australian States,
to more radical proposals such as charter schools.
In American terminology, charters are schools that are publicly
owned and financed, but are privately established and operated
in exchange for being held accountable for student performance.
There are now some 8,000 charter schools in the United States.
In Australia, the
debate about school choice took a sharper focus when the Coalition
Government abolished the New Schools Policy that had been
in operation between 1985 and 1996. It was not so much that
the Government swept away that policy's severe restrictions
on access to public funding for new non-government schools.
The crucial innovation was that government schools will have
to face the financial consequences of parental choice. Under
the Enrolment Benchmark Adjustment, the subsidy to non-government
schools will be met by offsetting its cost against funding
for the equivalent number of students in government schools.
A critical feature
of these reforms, both here and overseas, is that it makes
no sense to allow parents substantial choice of schooling
unless they have the information that permits an informed
choice. That information must, virtually by definition, include
comparisons of school performance. More specifically, performance
indicators have to be measured in terms of hard
outcomes such as standardised test scores if they are to be
worthwhile to parents and taxpayers. This is in strong contrast
to the input or process measures, such as teacher qualifications,
class size, or curriculum requirements, that are more common
in Australia. We are not used to outcome indicators in Australia,
whereas in the United States there is a long history of public
information about standardised test results. And we are even
less accustomed to such indicators being used to make direct
comparisons between schools in the way that has become routine
in Britain since the Education Reform Act of 1988.
In the past it could
be argued that detailed information about each school's
performance would serve no useful purpose, because each government
school was designed to be the same through the application
of equitable spending and staffing allocations. Whether or
not this was true in the past, the fact is that there is now
quite enough evidence (from what is known as the 'effective
schools' literature) to know that schools do differ in their
effectiveness.
Another reason for
the lack of information is that few would deny the multifaceted
nature of schooling. It is obviously desirable to have performance
measures that are not confined to one dimension of school
outcomes; the reality is that no single indicator of performance
can capture all that schools try to achieve. But not all objectives
are equally important, and it is sheer cant to pretend that
useful conclusions about schools cannot be drawn if we have
only limited information about their performance. If the argument
is that imperfect or limited information is actually worse
than no information at all, it is a principle that would make
life impossible if adopted across the board. In just about
every aspect of life we have to manage with only imperfect
information to guide decisions.
Health provides a
useful analogy. There is a multiplicity of ways in which the
health of the population or the performance of complex health
systems can be measured. No single indicator can provide an
overall picture. Nevertheless, despite some obvious limitations,
and some progress with more refined measures, it is common
in international comparative work to use one indicator, the
infant mortality rate, as a surrogate for the quality of health
care available to the population as a whole.
To come back to education,
student academic achievement is by far the most important
and fundamental issue in schooling. Whatever the variety of
objectives they pursue, all schools have as their central
purpose the academic development of their students. This central
purpose is implicitly acknowledged in the public debate when
shorthand phrases such as 'basic skills' or 'literacy
and numeracy standards' are used as proxies for school
effectiveness. If we cannot measure performance on this central
indicator of academic achievement we might as well throw in
the towel and concede that we are unlikely to make much progress
with the more diffuse objectives.
Probably the main
reason for the lack of information about school performance
is a belief that only educators can be trusted with the information:
parents and others would misuse or misinterpret it. In most
Australian states, teachers' unions have acquiesced in
basic skills testing only on condition that there should be
no public release of school results and that no use be made
of them to compare schools. This argument seemed to get some
support in early 1997 when the unfortunate Year 12 at
Mount Druitt High School
in western Sydney found their class photo splashed all over
the front page of the Daily Telegraph. No student at
the school had received a Tertiary Entrance Ranking (TER)
above 44·4 in the 1996 Higher School Certificate. There
is no doubt that the media coverage was extremely uncomfortable
for students, parents and the Department of School Education,
but these are not conclusive arguments against publication
(see Box 1).
|
Box 1: The
Mount Druitt episode
My own view
of the Mount Druitt episode is that the class photo
should not have been published. I thought its use particularly
unfortunate because the two main articles written in
the Telegraph about Mt Druitt were both very
thoughtful pieces, not at all criticising the students,
but raising a legitimate matter of public concern about
the school and its HSC results. Whatever the causes
and significance of those results, this was an outcome
that needed explanation. In my view the sensationalism
of the photograph detracted from the thoughtfulness
of the articles. But even though I thought it inappropriate
to publish that photo, the Mt Druitt case does not support
in any way an argument that information about school
performance will be misused and therefore should be
suppressed.
I have heard
other opinions that the use of that photograph was entirely
justified, because it brought home in a way no amount
of text or TER rankings can do in a tabloid newspaper
the deeply personal impact on those students of their
HSC results. The crucial point is that, while some objected
to publication of those results, there are other views
which claim that there was a legitimate matter of public
interest, and that the style of presentation was appropriate
for the context.
|
More generally, the
idea that information is OK but only if it is provided on
terms that educators find acceptable is itself quite unacceptable.
Information about performance on the wharves that came solely
from the MUA would be treated with derision; it is not acceptable
for airlines to set their own safety standards and to decide
which safety statistics should be published; car owners prefer
the tests of crash performance or repair costs performed by
motoring organisations to the claims put out by the car manufacturers;
those in New South Wales have learned to treat with exceptional
caution data from the Roads and Traffic Authority on the alleged
need for urban expressways, because it is known that the RTA's
engineers like to build roads; New South Wales has also learned
not to take the police at their own valuation of themselves;
and so on.
In explaining all
these examples there is no need to invoke some silly conspiracy
theory or cast doubts on anyone's integrity. It is enough
to note that (i) as parents, taxpayers, citizens, we need
information to be able to make responsible and informed decisions
and (ii) there are legitimate reasons why a consumer's
or taxpayer's perspective is not necessarily the same
as a supplier's or provider's perspective. The same
logic should apply to education.
The principle of
value-added adjustments
General principles
about the flow of information are all very well, but critics
have a case when they argue that accurate information about
school performance is possible only after careful correction
for the prior achievement of the students and the social and
economic context of the schools. Without that correction,
measured differences in performance may reflect differences
in the ability and background of the students, rather than
the value added by the school itself.
In current British
usage, value-added means adjusting students' test scores
for their prior academic attainment. This adjustment doesn't
take into account the social or economic background of the
school and its students, corrections that have been prominent
in American studies. The argument in Britain is that prior
attainment is already influenced by such factors, so correcting
test scores for prior attainment will also correct them for
relevant socio-economic differences.
On this definition,
value-added is simply a variant of what the equivalent American
studies term 'selection bias'. Anyone who looks at the American
educational literature quickly realises that both researchers
and policymakers have available an enviable range of data
on academic performance in schools. The consequence is that
American studies can make an elaborate correction for socio-economic
context, ethnic group, school tracking, and many other variables.
The American studies, in short, usually go well beyond the
British concept of correcting only for prior attainment and
hoping that this will implicitly incorporate socio-economic
effects. Nevertheless, the concept is identical: the aim is
to compare schools only after netting out differences
between students.
It is worth noting
that although the practical problems of data availability
and measurement can be severe, in principle modern statistical
methods are quite capable not only of allowing for a wide
variety of both quantitative and qualitative variables, but
of adjusting for simultaneous or two-way relationships between
variables. One of the most influential books on educational
policy in recent years (Politics, Markets and America's
Schools by John Chubb and Terry Moe) measured student
performance after allowing for an extraordinary range of variables:
economic, social, school, family and personal
characteristics. They even measured the impact of variables
such as administrative routines in classrooms, disciplinary
policy and attitudes of the principal.
Data does not exist
to do anything remotely comparable in this country. The nearest
contemporary example from Australia is the more modest but
nonetheless welcome innovation from Victoria, which for the
last two years has published a list of schools ranked by VCE
performance on the basis of what would be expected from their
performance in the General Achievement Test held earlier in
Year 12. The Victorian exercise is much closer to the limited
British idea of correcting for prior attainment than the all-singing-all-dancing
American comparisons.
The Victorian example
also brings home the fact that there is no uniquely correct
way of making value-added corrections. Even with an extensive
set of data the results will depend on how the variables are
specified and the type of model in which they are used. There
are also technical issues of multi-level modelling and the
mobility of students between schools.
When all is said and
done and all the academic caveats are made, the bottom line
is that if progress is to be made in providing essential information
about performance in schools, properly adjusted for student
background, value-added comparisons are clearly the way to
go. As already acknowledged, league tables not adjusted for
value added can seem outrageously unfair. Schools in areas
of social and economic deprivation, or those with a high proportion
of students from a non-English speaking background, are compared
directly with schools in favoured locations. Critics have
a case when they argue that position on a raw league table
may tell us a lot about a school's catchment area, but
much less about how good it is at educating children.
Why league tables
are a legitimate tool of policy
If this argument is
conceded, how is it also possible to claim that plain unadjusted
league tables of school performance might actually
be in the best interests of those very students who attend
schools in areas of economic or social deprivation?
To make sense of this
paradox it is necessary to understand the difference between
relative and absolute standards of attainment. Value added
is a relative measure. Value added does not judge schools
against some absolute criterion of achievement: it judges
them in relation to their starting point and against the progress
made by other pupils in the sample. Test scores in school
X are corrected for the prior attainment of students in that
school. This measures the progress they have made (or value
added) over some specified period. Corrected test scores in
other schools are calculated in the same way. It is then possible
to compare the progress made by students in school X with
the progress made by students in all other schools in the sample.
It follows that school
X may have low final test scores and yet, in relation to its raw
material (many students with low levels of prior attainment),
have made very strong progress relative to other
schools in the sample. On a value-added basis it therefore
rates very highly against
other schools. In particular, its value-added rank might be
better than that of school Y, whose high performance on an
uncorrected basis may owe more to the quality of its intake
than the calibre of its teaching.
This is shown in Figure
1. The horizontal axis shows intake scores, or prior attainment.
The vertical axis shows output or later attainment, and the
regression or trend line shows average performance for all
schools. Any school scoring along or near the line is an average
school in value-added terms. Its pupils score on final attainment
just what we would expect them to get, given the quality of
its intake. School 1, which has very high final scores, is
actually no better than average in value-added terms. Its
final scores, although outstanding, are no better and no worse
than would be expected given the high quality of its intake.
School 5, on the other hand, has more modest final scores,
but would rate very highly in value-added terms. School
5 lies well above the regression line, showing that it makes
better than expected progress given that its intake has low
levels of prior attainment.
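The logic of Figure 1 can be sketched numerically. The short Python example below is an illustration only, not the method of any study cited here: the five schools and their scores are invented, and value added is taken, by assumption, to be each school's distance above or below an ordinary least-squares line fitted to intake and final scores.

```python
# Hypothetical (intake score, final score) pairs, invented for illustration.
schools = {
    "School 1": (90, 82),
    "School 2": (70, 71),
    "School 3": (50, 45),
    "School 4": (40, 32),
    "School 5": (30, 44),
}


def value_added(data):
    """Return each school's residual from the least-squares line
    of final score on intake score (one simple reading of value added)."""
    n = len(data)
    mean_x = sum(x for x, _ in data.values()) / n
    mean_y = sum(y for _, y in data.values()) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data.values())
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data.values())
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = actual final score minus the score expected from intake.
    return {name: y - (intercept + slope * x) for name, (x, y) in data.items()}


va = value_added(schools)

# List schools from highest to lowest value added.
for name, resid in sorted(va.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: final score {schools[name][1]}, value added {resid:+.1f}")
```

With these invented figures School 1 sits on the fitted line despite having the highest final score, while School 5 sits well above the line, which is the pattern described for Figure 1.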
But suppose school
5's excellent value-added progress masks the fact that
its final test scores are very low in an absolute sense? Statistically
speaking, the school has done well given its pupil intake,
but suppose its graduates are nonetheless leaving with a level
of literacy or numeracy considered too low for effective functioning
in contemporary society? Exactly how to establish that benchmark
is, of course, far from easy, and reasonable people can hold
different opinions. Turning to Figure 2, suppose that output
score C is judged the benchmark of what is required as an
effective score for literacy or numeracy or whatever is being
measured along that scale. Measured against that benchmark,
school 5's students are failing to meet adequate standards.
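The contrast between the two perspectives can be made concrete with a small Python sketch. Everything in it is hypothetical: the final scores, the value-added figures and the benchmark of 50 standing in for score C are invented for illustration, and no real school data are involved.

```python
# Hypothetical cut-off standing in for benchmark score C.
BENCHMARK_C = 50

# (name, final score, value added) -- all figures invented for illustration.
schools = [
    ("School 1", 82, 0.0),
    ("School 2", 71, 5.0),
    ("School 3", 45, -5.0),
    ("School 4", 32, -10.0),
    ("School 5", 44, 10.0),
]

# Raw league table: rank on final scores alone (absolute perspective).
raw_ranking = [name for name, final, _ in
               sorted(schools, key=lambda s: s[1], reverse=True)]

# Value-added table: rank on progress relative to intake (relative perspective).
va_ranking = [name for name, _, added in
              sorted(schools, key=lambda s: s[2], reverse=True)]

# Schools whose final score falls short of the benchmark,
# whatever their value-added rank.
failing = [name for name, final, _ in schools if final < BENCHMARK_C]

print("Raw league table:  ", raw_ranking)
print("Value-added table: ", va_ranking)
print("Below benchmark C: ", failing)
```

In this invented case School 5 heads the value-added table yet still appears among the schools below the benchmark, which is exactly the situation the raw table is meant to expose.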
David
Reynolds, who has played a prominent role in educational policy
in Britain, nicely contrasts the advantages of the relative
and absolute perspectives:
There's no doubt
that value added was a great advance on what had gone before.
But value added was originally a research tool. It has been
pushed by enthusiasts as an instrument of public policy. And
when it becomes that, you have to use it differently. The
policy judgements to be made about value added go beyond its
scientific applications. For example, a school doing moderately
well in a poor catchment area would, on value-added criteria,
be an effective school. Yet the children coming out of the
school might well not be literate or numerate. Public policy
is concerned about finishing points, about outcomes
being as close as possible to absolute, not relative success.
Value added relativises things (1997: 6).
It was the government's
insistence that all students, regardless of social or economic
background, should achieve appropriate standards of literacy
and numeracy that underpinned the development of raw league
tables in Britain. Similarly, the acceptance of absolute levels
or benchmarks of attainment underlies the Blair Government's
policy of 'zero tolerance of failure'. In other
words, every child must reach minimum standards in designated
curriculum areas, irrespective of social or economic background.
The only way to establish
whether this is happening is to judge students and schools
by absolute criteria or benchmarks. In Australia, the State
Education Ministers have already agreed that every child leaving
primary school should be numerate, and able to read, write
and spell at an appropriate level. Whether we like it or not,
this approach implies setting the same target for all schools,
regardless of their social or economic context.
And this leads back
to the starting point of this argument, that those ostensibly
very unfair raw league tables can actually play a legitimate
role in public policy in ensuring that all students reach
adequate standards. As Reynolds notes, value added relativises
things. A harsher way to make the same point is to observe
that value added can lead to complacent acceptance of unacceptably
low standards of achievement. Schools with low absolute levels
of achievement can claim they are doing as well as can be
expected, given the raw material they have available. The
real problem, say the advocates of absolute standards, is
that relative measures can lead you not to expect enough of
your students. Low expectations can become self-fulfilling,
so that adequate standards are never achieved. Value added
then becomes an excuse for unacceptable results.
It is worth noting
that Reynolds's work on international education played
an important role in developing the notion of absolute standards
and zero tolerance of failure. In Worlds Apart (1996),
he noted that high achievement scores in several East Asian
and European countries were partly a result of the belief
in those countries that all children are able to achieve certain
core skills in core subjects, and that there was no need for
the substantial 'trailing edge' of low performing
pupils that is a feature of schooling in Britain and America.
This is very closely
related to the conclusion that emerges from the American studies.
Many of the academic studies of school performance in the
United States compare public and Catholic schools. The general
result that emerges from those studies is that Catholic schools
outperform government schools, even when the sample is corrected
for selection bias, to use the American term. The reason for
this is much more difficult to discern, but it seems to be
a result of Catholic schools demanding more of their students.
In one of the best
known studies, Hoffer and his colleagues pointed out that
'Catholic schools place in an academic track many students
whose sophomore [second year] achievement would relegate them
to a general or vocational track in public schools [and] Catholic
schools demand more homework and advanced coursework, especially
from those who are disadvantaged in one way or another ...'
(Hoffer, Greeley and Coleman 1985). Thus, it is true to say
that public schools which made the same demands as the average
Catholic school would produce comparable achievement, but
the reality is that many public schools make lesser demands
of their students.
In Britain a White
Paper titled Excellence in Schools was released barely
weeks after the last election, and that paper observed that
one of the most powerful underlying reasons for low
performance in our schools has been low expectations which
have allowed poor quality teaching to continue unchallenged.
Too many teachers, parents and pupils have come to accept
a ceiling on achievement that is far below what is possible.
Schools often fail to stretch the most able; and they have
not been good at identifying and pushing the modest or poor
performers or those with special educational needs.
In some cases the excuse has been that you cannot expect
high achievement from children in a run-down area like this
(Department for Education and Employment 1997).
And, to come back
briefly to the example of Mt Druitt High School, those who
have read the Department of School Education's report
(the Laughlin Review) will know that it was very difficult
to identify any specific reason why that Year 12 performed
badly, but scattered here and there were suggestions that,
in the aftermath of a loss of its better pupils to other schools,
a culture of high expectations may not have been part of the
Mt Druitt environment.
Conclusion
It is really a question
of perspective. To say that there is evidence, on a value-added
basis, that school 1 performs better than school 2 does not,
of itself, demonstrate that performance in school 1 reaches
acceptable standards. If the aim of public policy is
to focus upon absolute standards, which all children must
achieve regardless of background, then the perspective shifts.
It is now entirely appropriate to know which schools and students,
regardless of social or economic background and regardless
of the quality of the intake, are failing to meet those standards.
It is, in short, appropriate to focus upon a simple league
table of performance.
On this line of argument,
setting absolute standards which all schools are expected
to achieve is the only way to ensure that over time all students
reach the minimum appropriate standard. It will certainly
be uncomfortable, and without question it will seem outrageously
unfair to schools that consider themselves disadvantaged in
one way or another. But the gains to society as a whole will
be substantial. Ranking schools by performance, with full
public disclosure of those not meeting adequate standards,
will itself contribute to improved achievement. This ranking
and its disclosure will both set the appropriate expectations
and provide the motivation for schools to lift their game.
References
Department for Education
and Employment 1997, Excellence in Schools, London.
Hoffer, T., A. Greeley
and J. Coleman 1985, 'Achievement Growth in Public and
Catholic High Schools', Sociology of Education 58(1).
Reynolds, D. 1997,
'Labour task force professor defends use of raw results',
Times Educational Supplement 7 March: 6.
Reynolds, D. and S.
Farrell 1996, Worlds Apart? A Review of International Surveys
of Educational Achievement Involving England, Office for
Standards in Education, HMSO, London.
About the Author:
Ken
Gannicott
is Professor of Education at the University of Wollongong.
This article is an edited extract from a study prepared for
the Department of Employment, Education, Training and Youth
Affairs and published by DEETYA in June 1998 under the title
School Autonomy and Academic Performance.
An earlier version of this paper was presented at the Conference
of the National Council of Independent Schools Associations
held in Melbourne in May 1998.
The author would like
to thank Ms Fiona Ogilvy-O'Donnell of the Association
of Independent Schools of Victoria for her agreement to publication
of the paper. The views expressed in the paper do not necessarily
represent the views of either DEETYA or AISV.
Policy
is
the quarterly review of The Centre for Independent Studies.