DCLnews
Special Report
Computers replace humans to rate essays
Computer programs that rate essays are being widely used in academic
settings, and they're proving as good as, and cheaper than, their
human counterparts - but could quality be at stake? DCLnews reports...
THE EDUCATIONAL TESTING SERVICE, a New Jersey-based non-profit
organization that annually administers more than 11 million tests
worldwide, is now using "e-raters" to judge Graduate Management
Admission Test essays. The essays were formerly rated by two human
beings, who averaged 25 essay-reads per hour and typically got paid
$20 an hour. Now they are rated by one human and one computer, which
is faster and cuts costs.
But even that one human could be unnecessary as around 98 percent of
the time the computer score is the same as, or within one point of,
the human score. This is the same degree of agreement you get when the
job is done by two human scorers.
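To make the "same as, or within one point of" measure concrete, here
is a minimal Python sketch of how such an agreement rate could be
computed. The function name, the 0-6 scale and the scores are all
made up for illustration; the article does not describe how ETS
actually calculates its figure.

```python
def adjacent_agreement(human_scores, machine_scores, tolerance=1):
    """Fraction of essays whose two scores differ by at most `tolerance`."""
    if len(human_scores) != len(machine_scores):
        raise ValueError("score lists must be the same length")
    if not human_scores:
        raise ValueError("need at least one pair of scores")
    agree = sum(
        1
        for h, m in zip(human_scores, machine_scores)
        if abs(h - m) <= tolerance
    )
    return agree / len(human_scores)


# Hypothetical scores on an assumed 0-6 scale, not real test data.
human = [4, 5, 3, 6, 2, 4, 5, 3, 4, 1]
machine = [4, 4, 3, 5, 2, 6, 5, 3, 5, 1]

rate = adjacent_agreement(human, machine)
print(f"{rate:.0%} of scores agree within one point")  # 90%
```

In this invented sample only the sixth essay (4 versus 6) falls
outside the one-point band, so nine of the ten pairs count as agreeing.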
So should machine raters replace humans completely?
Michael Gross, Chief Technology Officer at Data Conversion Laboratory,
doesn't think so. While he agrees that cutting the human input out of
the equation would reduce costs and speed up the rating process, he
believes quality would suffer.
In a statement last week, he said: "I think what they're missing is
that if you don't have a human reading it, even though 98 percent of
the time it would not matter, you could occasionally have a computer
giving a completely inappropriate score because of a flaw/bug in the
software. And I don't believe anybody would be willing to live even
with a small percentage of people getting completely unfair scores."
At least with the current system of one computer and one human, the
two are cross-checking each other, Gross added.
This is the method used at DCL for data conversion purposes - and one
that Gross feels is currently superior to both purely automated and
purely manual systems. "We let machines do the markup and people check
the result, which at present is the optimal system," he said.
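The cross-checking Gross describes can be sketched in a few lines of
Python: compare the human and machine score for each essay and set
aside any pair that disagrees by more than a chosen margin. The
function name, the one-point threshold and the sample scores are
assumptions for illustration; the article does not say what ETS or
DCL actually do when the two disagree.

```python
def flag_for_review(score_pairs, max_gap=1):
    """Return positions of essays whose human and machine scores
    differ by more than `max_gap` points."""
    return [
        index
        for index, (human, machine) in enumerate(score_pairs)
        if abs(human - machine) > max_gap
    ]


# Hypothetical (human, machine) score pairs, not real data.
pairs = [(4, 4), (5, 2), (3, 4), (6, 1)]
print(flag_for_review(pairs))  # [1, 3]
```

The two flagged essays are the kind of case Gross is worried about:
with no human in the loop, nothing would catch a badly wrong machine
score.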
Human touch
Many other organizations are realizing that too much reliance on
technology can be detrimental. Intelligence services, for example, now
recognize that they have to get back to feet on the street after years
of relying more and more on electronic intelligence - and missing
crucial information as a result.
The medical profession insists that doctors review medical tests
before sending them to patients - as a wrong interpretation by a
computer could be fatal. Likewise, surgeons mark off the location of
surgery with a marker pen, rather than relying on the computer
printout.
Spiritual Machines
It could be argued, however, that with the speed of computer
development, it is inevitable that one day machines will have
human-type capabilities - and that people will be cut out of the
equation in many industries.
Ray Kurzweil, author of The Age of Spiritual Machines (1999), for
instance, believes that this will happen sooner than most of us think.
He estimates that in 20 years a computer with as much processing power
as the human brain (20 million billion calculations per second) will
cost only $1,000. Then, inevitably, computers will soar past us, he
says.
Optimal system
But will they? It's one thing to use computers instead of humans to
rate essays. After all, human raters use a set of rules to help them
mark such things as grammar competency - and those kinds of rules can
be programmed into a computer.
But what about aesthetic considerations? Could a computer decide what
makes a good piece of literature or not? Could it decide whether a
magazine article sets the right tone and asks the right questions?
Could it judge whether a newspaper headline has the perfect level of
irony?
Perhaps in a hundred years or so computers might be able to perform
such tasks. But, again, some level of quality would be at stake. And
maybe, even with phenomenally powerful computers, the optimal system
would still be to use a mixture of human and machine.
DCLnews
Editorial
To discover more about how computers are being used to rate writing
abilities, go to:
http://www.forbes.com/forbes/2001/1029/122.html
Comments and correspondence to DCLnews@dclab.com