Should the Characteristics of Victims and Criminals Count?
Payne v Tennessee and Two Views of Efficient Punishment
Boston College Law Review
XXXIV No.4
pp.731-769
(July 1993)
David D. Friedman
Olin Fellow in Law and Economics
The University of Chicago Law School
Chicago, IL 60637[1]
The purpose of this paper is to investigate two interrelated
issues. One is the question of how to use economic theory to
construct an efficient set of criminal punishments: I will argue that
a simple rule (set expected punishment [2] equal to the damage done by the crime) provides a
useful first approximation, but only a first approximation, to the
correct answer. The other question is how, if at all, punishment
should be affected by the characteristics of criminal and victim. In
answering that question, I hope to demonstrate both the usefulness
and the limitations of the simple version of the economic theory of
punishment, and the simple rule it implies, for dealing with several
of the issues raised in a recent and controversial Supreme Court
case-Payne v Tennessee.[3]
Part I of the article attempts to work out the economics of
efficient punishment. Part II applies the analysis to the question of
whether punishment ought to be affected by characteristics of the
criminal-whether, for example, rich criminals should pay larger fines
than poor criminals for the same crimes. Part III applies it to the
parallel question raised in Payne-whether punishment ought to
be affected by characteristics of the victim. Part IV expands, from
an economic viewpoint, on one issue raised by Payne-the
possibility of varying punishment according to consequences, in order
to selectively deter criminals who have some but not perfect
knowledge of what the consequence of their crime will be. Part V
considers a constitutional issue raised by Payne and by the
analysis of this article-whether making punishment depend on the
characteristics of the victim violates the requirement of equal
protection, applied not to criminals but to victims. Part VI
considers a problem in moral philosophy raised by Payne and
this article-whether it is just to make punishment depend on
consequences of the crime that the criminal may not have anticipated.
Part I: The Economics of Efficient Punishment
A legal system may be evaluated in a variety of ways by
economists, legal scholars, or moral philosophers. Through most of
this article, however, I shall assume that it has only one purpose:
economic efficiency.[4] I view a legal
system as a set of rules designed to affect behavior; a change in
legal rules is an improvement if the summed benefits to those
affected, measured by their money equivalent, are larger than the
summed losses, where the money equivalent of a benefit or loss is the
largest sum the affected party would pay to receive the benefit or
avoid the loss.
Seen from this perspective, what is wrong with crimes is that they
occur even if they are inefficient. My willingness to buy a
television set demonstrates that it is worth more to me than to its
present owner, so a voluntary sale is an improvement and should be
permitted. But I may be willing to steal a television set even if it
is worth much less to me than to its present owner. So inefficient
theft may occur, and should be prevented.[5]
It seems to follow from this argument that we want to prevent only
inefficient theft. If my stealing your television set produced a net
benefit, even after allowing for associated costs (my time burgling,
your expenses on burglar alarms), then changing the legal system to
permit me to steal it would be an improvement.[6] One way of deterring only inefficient crimes is to set the expected
punishment equal to the damage done. The criminal will commit the
crime only if the value to him is greater than his expected
punishment, hence greater than the damage, so efficient crimes and
only efficient crimes will prove worth committing. Many discussions
of efficient punishment argue either that this is what our legal
system does or that it is what it should do.[7]
According to this view, a punishment for a crime[8] is simply a Pigouvian tax; like an emission fee
for pollution, it forces an actor to bear the cost of his action. If
a criminal commits a crime even though he knows that he will suffer
an expected punishment equal to the damage done by the crime, that
demonstrates that his benefit is greater than the victim's loss; the
crime is efficient and ought not to be deterred.[9]
If this is right, the optimal expected punishment is simply equal
to the damage done. All potential offenders whose benefit from
committing the offense is less than the damage done will be deterred;
they will face a punishment greater than their benefit, making the
return from the offense (benefit minus punishment) a net loss.
Potential offenders for whom the benefit is greater than the damage
done will face a punishment less than their benefit, making the
offense a net gain, so they will commit it. Inefficient offenses are
deterred, efficient offenses are not deterred, so we have the
efficient outcome.
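The decision rule of this simple model can be made concrete with a small sketch. The offender benefits below are hypothetical, chosen only for illustration, and the function names are mine rather than anything in the literature cited.

```python
# Simple "efficient crime" model: an offender commits the offense
# only if his benefit exceeds the expected punishment.
# All offender benefits are hypothetical illustrations.

DAMAGE = 1000  # damage done to the victim per offense (hypothetical)

def committed(benefits, expected_punishment):
    """Benefits of the offenses that still occur at this expected punishment."""
    return [b for b in benefits if b > expected_punishment]

def efficient(benefits, damage):
    """Benefits of the offenses whose gain to the offender exceeds the damage."""
    return [b for b in benefits if b > damage]

benefits = [200, 800, 999, 1001, 1500, 4000]  # hypothetical offenders

# With expected punishment set equal to damage, the offenses that occur
# are exactly the efficient ones.
occurring = committed(benefits, expected_punishment=DAMAGE)
print(occurring)
```

Setting the expected punishment below the damage (say 500) lets the inefficient offenses worth 800 and 999 through as well, which is the outcome the simple rule is meant to avoid.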
If we are talking about speeding tickets, this sounds plausible
enough. Presumably one reason we do not confiscate the cars of
convicted speeders is that that might be too effective a punishment;
we are not sure we want everyone always to keep to the speed limit.
When applied to offenses such as rape or murder, however, the
efficient crime paradigm of enforcement strikes many legal scholars,
especially those who are not economists, as both unrelated to the
real legal system and morally bizarre. It implies, among other
things, that the reason we do not impose stiffer penalties on
convicted murderers is that we are afraid of having too few
murders.[10] It also seems to imply
that the optimal punishment for murder is an increasing function of
the value of the victim-an issue that was central to the recent
Supreme Court case of Payne v. Tennessee.
The Inefficiency of Preventing All and Only Inefficient Crimes
The argument given above for setting expected punishment equal to
damage done is wrong. The reason it is wrong is that it ignores the
cost of preventing crime.[11] In
order to impose a given expected punishment, we must catch some
fraction of offenders and punish them. Both activities are costly.
Typically, the cost per offense increases with both probability of
apprehension and severity of punishment.[12]
It is obvious why the cost per offense increases with probability
of apprehension; it takes more police to catch fifty murderers out of
a hundred than to catch only twenty-five, and it takes more
prosecutors and court time to convict them. To see why it also
increases with the severity of the punishment, it is worth thinking a
little about what, from an economic standpoint, the "cost of
punishment" means.
Suppose the punishment for an offense simply consists of the
convicted offender paying a thousand dollar fine to the state. The
cost to the criminal, which is what gives the punishment its
deterrent effect, is a thousand dollars. But the net cost, what
economists call "social cost," is zero. Every dollar the criminal
loses the state collects. In this case punishment cost, defined as
the difference between the cost the punishment imposes on the
criminal and the benefit it provides to others,[13] is zero.
What if the criminal cannot pay a fine high enough to provide the
amount of deterrence we want to impose? In that case, instead of (or
in addition to) fining him, we imprison him, say for a year. Suppose a
year's imprisonment is equivalent, from his standpoint, to a ten
thousand dollar fine.[14] The cost
the punishment imposes on him is ten thousand dollars, but the
enforcement system receives none of that. Instead, the enforcement
system must spend money-say another ten thousand dollars-to pay the
cost of his imprisonment. So the net cost of the punishment, the
criminal's loss plus the enforcement system's loss, is twenty
thousand dollars.
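The two cases can be put side by side in a short sketch. It uses the definition given above (cost imposed on the criminal minus benefit provided to others) and the dollar figures from the fine and imprisonment examples; the function name is my own.

```python
# Punishment cost = cost imposed on the criminal minus benefit to others.
# Figures follow the fine and imprisonment examples in the text.

def punishment_cost(cost_to_criminal, benefit_to_others):
    return cost_to_criminal - benefit_to_others

# A $1,000 fine: the state collects everything the criminal loses.
fine_cost = punishment_cost(cost_to_criminal=1_000, benefit_to_others=1_000)

# A year in prison: equivalent to $10,000 for the criminal, and the
# enforcement system spends another $10,000, so its "benefit" is negative.
prison_cost = punishment_cost(cost_to_criminal=10_000, benefit_to_others=-10_000)

print(fine_cost, prison_cost)
```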
As we increase the size of the punishment we wish to impose, the
number of offenders who can pay it as a fine decreases, forcing us to
shift to more costly punishments such as imprisonment. So increasing
the severity of the punishment typically increases the punishment
cost per offense punished.[15]
It is inefficient for me to steal a television set that is worth
five hundred dollars to you and only four hundred dollars to me. But
it is still more inefficient to prevent me from stealing the set if
the cost of doing so is two hundred dollars additional expenditure on
police, courts, and prisons. The rule "prevent all inefficient
offenses and only inefficient offenses" is correct only if doing so
is costless. The economically correct rule is to prevent an offense
if and only if the net cost from the offense occurring is greater
than the cost of preventing it. It follows that if there is a
positive cost to preventing an offense, an efficient legal system
will let some inefficient offenses occur.
We now have an answer to one of the criticisms of the economic
approach. The reason we do not increase the punishment for murder
need not be that we are afraid we would then have too few murders. It
may be, and probably is, that although we would like to prevent more
murders than we do prevent (indeed, we might like to prevent all
murders), the cost of doing so is more than we are willing to
pay.[16]
The cost of preventing an offense may sometimes be negative. While
cost per offense increases with increases in expected punishment,
number of offenses decreases, since the higher expected punishment
deters some offenses that would otherwise have been committed. The
fewer offenses occur, the less must be spent to apprehend and punish
offenders. If this second effect outweighs the increase in cost per
offense, then raising the expected punishment lowers the total
enforcement and punishment cost-a system with higher punishments (and
fewer offenses) costs less than a system with lower punishments (and
more offenses). In such a situation, the additional cost of deterring
one more offense is negative, so it is efficient to prevent not only
all inefficient offenses but some efficient ones as well. In the
extreme, one could imagine a society where the penalty for
shoplifting was death, with the result that there were no shoplifters
and nobody ever had to be caught, convicted, and executed.
As a less extreme example of a situation where the cost of
preventing an offense is negative, consider an offense with the
following characteristics:
Cost to victim: $1000 per offense
Cost per offense (enforcement plus punishment costs) of imposing
an expected punishment of P: P/2 (that is, $500 at P=$1000 and $550 at P=$1100)
Number of offenses if P=$1000: 100/year
Number of offenses if P=$1100: 50/year
To simplify the exposition, let the probability of conviction be
one, making expected punishment P equal to actual punishment.
We begin with a penalty of $1000; a hundred offenses are occurring
each year. They are all efficient offenses; the fact that they are
committed despite the penalty means that the offenders are getting
more than $1000 by committing them, so the offenders gain more than
the victims lose.
If we raise the penalty to $1100 we will deter fifty efficient
offenses a year. Each would have harmed the victim by $1000 and
benefitted the criminal by something between $1000 and $1100. We know
that the benefit to the criminal is at least $1000 because he still
commits the offense even when the expected punishment is $1000. We
know it is no more than $1100 because he does not commit it when the
expected punishment is $1100. The net gain from each of those
offenses is between zero and $100, so the loss from deterring fifty
of them is between zero and $5000.
But by deterring those offenses, we save the cost of catching and
punishing the offenders. Imposing a $1000 punishment on 100 criminals
costs us $50,000. Imposing an $1100 punishment on 50 criminals costs
$27,500. By raising the punishment we have saved $22,500 in
punishment and enforcement cost. On net we are better
off.[17]
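The arithmetic of this example can be checked with a brief sketch. It assumes a per-offense enforcement and punishment cost of P/2, a figure inferred from the $50,000 and $27,500 totals quoted above; the variable names are illustrative.

```python
# Raising the penalty from $1000 to $1100 in the text's example.
# Enforcement-plus-punishment cost per offense is assumed to be P/2,
# which matches the $50,000 and $27,500 totals quoted in the text.

def enforcement_cost(punishment, offenses):
    return offenses * punishment / 2

cost_low = enforcement_cost(1000, 100)
cost_high = enforcement_cost(1100, 50)
saving = cost_low - cost_high

deterred = 100 - 50
# Each deterred offense had a net gain (benefit minus $1000 damage)
# somewhere between $0 and $100.
loss_min, loss_max = deterred * 0, deterred * 100

# Net improvement from the higher penalty, worst case and best case.
net_gain_worst = saving - loss_max
net_gain_best = saving - loss_min
print(saving, net_gain_worst, net_gain_best)
```

Even in the worst case, where every deterred offense was worth the full $1100 to the offender, the saving in enforcement and punishment cost exceeds the loss from deterring efficient offenses.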
The situation is shown by Table 1; everything except punishment is
per year. Cost to victims is $1000 (the injury per victim) times the
number of offenses. Gain to criminals is the value to them of
committing the offenses. Net cost is the loss to victims plus
enforcement and punishment cost minus the gain to the criminals; our
objective is to minimize it.
X is the total gain to the criminals if the punishment is $1000
and 100 offenses occur each year. The value of the offense to the 50
offenders who would be deterred if we raised the punishment to $1100
is between $1000 and $1100. For simplicity, assume it is $1050,
making the total value to the criminals of the 50 offenses equal to
$52,500. So the total gain to the criminals falls to X-$52,500 when
we raise the punishment to $1100, as shown in the table.
Table 1
Expected Punishment | Number of Offenses | Cost to Victims | Gain to Criminals | Enforcement and Punishment (E&P) Cost | Net Cost not Including E&P Cost | Net Cost
$1000 | 100 | $100,000 | X | $50,000 | $100,000-X | $150,000-X
$1100 | 50 | $50,000 | X-$52,500 | $27,500 | $102,500-X | $130,000-X
If we increase the punishment from $1000 to $1100, gain to
criminals falls by more than cost to victims, since we are deterring
efficient offenses, so net cost not including enforcement and
punishment cost is higher with the higher punishment, as shown in
the next to last column. But that is more than balanced by the drop
in enforcement and punishment costs, so net cost, the final column,
is lower with the higher punishment.
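The columns of Table 1 can be recomputed from the figures in the text. In the sketch below, X (the total gain to criminals at a punishment of $1000) is left unknown, so each net column is reported as the constant k in "k - X"; the $1050 value per deterred offense is the simplifying assumption made above, and the function name is mine.

```python
# Rebuild the rows of Table 1. X is unknown, so the two "net" columns
# are returned as the constant k in the expression "k - X".

DAMAGE = 1000             # injury per victim, from the text
VALUE_IF_DETERRED = 1050  # assumed value of each deterred offense, as in the text

def table_row(offenses, enforcement_cost, gain_lost):
    """Return (cost to victims, net excluding E&P, net cost)."""
    cost_to_victims = offenses * DAMAGE
    # Gain to criminals is X - gain_lost, so subtracting it adds gain_lost.
    net_excluding_ep = cost_to_victims + gain_lost
    net_cost = net_excluding_ep + enforcement_cost
    return cost_to_victims, net_excluding_ep, net_cost

row_1000 = table_row(offenses=100, enforcement_cost=50_000, gain_lost=0)
row_1100 = table_row(offenses=50, enforcement_cost=27_500,
                     gain_lost=50 * VALUE_IF_DETERRED)
print(row_1000)
print(row_1100)
```

The higher punishment raises net cost excluding enforcement and punishment (102,500 against 100,000) but lowers net cost overall (130,000 against 150,000).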
The table does not show the cost to the criminals of paying the
punishment. If included, it would appear twice, once as a cost and
once as a benefit, and so have no effect on the net cost. It is a
cost to the criminals; if the punishment is $1000, the net gain to
the criminals is only X-$100,000, since they are paying $100,000 in
fines as punishment for their offense. It is a benefit to the
enforcement system that collects the fines. Punishment cost is the
difference between what the criminal pays and what the enforcement
system receives. With a punishment of $1000, for example, the
enforcement system receives $100,000, $50,000 of which goes to pay
the cost of catching and punishing criminals.
We have now seen an example of a situation in which it is
efficient to set an expected punishment higher than the damage done
by the offense, thus deterring some efficient offenses. Generalizing
the argument, we can show that the level of expected punishment
should be set equal to the damage done by an offense only if the
marginal cost of deterring one more offense is zero. If the marginal
cost of deterring one more offense is positive, then expected
punishment should be less than damage done; offenses that are only
slightly inefficient, that injure the victim by only a little more
than they benefit the criminal, are not worth the cost of deterring.
We expect marginal cost of deterrence to be positive for crimes for
which an increase in expected punishment deters only a small fraction
of offenses, so that we end up with almost as many offenses as before
the increase and a substantially larger enforcement and punishment
cost per offense. Such crimes are described, in economic terms, as in
very inelastic supply.
If, on the other hand, the marginal cost of deterring one more
offense is negative, if the sum of enforcement and punishment costs
decreases as we increase the level of punishment, due to the decrease
in the number of offenses to be punished, then the level of
punishment should be more than damage done. In such a situation, as
in the example of Table 1, we are willing to deter a few efficient
offenses in order to avoid the cost of punishing them. We would
expect that situation to occur for crimes for which a small increase
in expected punishment produces a large reduction in offenses-crimes
in very elastic supply. For such crimes the large reduction in number
of offenses as we increase the punishment outweighs the increase in
enforcement and punishment cost per offense.[18]
This solution to the problem of setting optimal punishments
combines elements of two different intuitions: punishment equal to
damage done and enough punishment to deter. If imposing punishment is
inexpensive, the optimum is about equal to damage done-enforcement
and punishment costs are unimportant, so we simply design our system
to deter all inefficient and only inefficient crimes. If the supply
of offenses is highly elastic at some particular level of punishment,
so that below that level there are many offenses and above it very
few, the optimal punishment is at the point where any further
increase would have very little deterrent effect to balance its
cost-just enough punishment to deter most offenses.
Efficient Punishment: A Formal Treatment
The same argument can be put in a more precise mathematical form
as follows. We define:
\(o(b)\): the density of offenses per
year as a function of the gain b to the offender of committing the
offense.[19]
\(O(P) = \int_P^\infty o(b)\,db\): the number of offenses per year
whose perpetrators gain more than P by committing them. Since an
offense will be committed only if the gain is at least as great as
the expected punishment, O(P) is the number of offenses that occur
annually if the expected punishment is P.
C(P): the cost per offense of imposing an expected punishment P,
using the least costly combination of actual punishment and
probability. I assume that this does not depend on the number of
offenses.
D: the damage done per offense. For simplicity this too is assumed
independent of the number of offenses.
We wish to find \(P^*\), the expected
punishment that minimizes the social cost function:
\[ SC(P) = O(P)\,[D + C(P)] - \int_P^\infty b\,o(b)\,db = \int_P^\infty [D + C(P) - b]\,o(b)\,db \qquad \text{(Equation 1)} \]
The first term on the right hand side is the cost of crime-number
of offenses multiplied by damage per offense plus enforcement cost
(the cost of catching, convicting, and punishing offenders) per
offense. The second term is the benefit of offenses to the offenders.
The integral starts at b=P because only crimes for which benefit to
the criminal is at least equal to expected punishment will be
committed.
Setting the derivative of SC(P) with regard to P equal to 0, we
have, for P equal to its optimum value \(P^*\):
\[ 0 = -D\,o(P^*) + \frac{d}{dP}\big[O(P)\,C(P)\big]_{P=P^*} + P^*\,o(P^*) = o(P^*)\,[P^* - D] + \frac{d}{dP}\big[O(P)\,C(P)\big]_{P=P^*} \]
Solving for the optimal punishment \(P^*\) we have:
\[ P^* = D - \frac{1}{o(P^*)}\,\frac{d}{dP}\big[O(P)\,C(P)\big]_{P=P^*} \qquad \text{(Equation 2)} \]
Equation 2 is the mathematical equivalent of the result derived in
the earlier verbal argument. O(P)C(P) is the total cost of imposing
an expected punishment of P on O(P) offenses. Deterring one more
offense requires an increase in P of \(1/o(P)\), so
\(\frac{1}{o(P)}\,\frac{d}{dP}[O(P)\,C(P)]\) is the cost of deterring one more
offense. If \(\frac{d}{dP}[O(P)\,C(P)] > 0\) at \(P = P^*\), then total enforcement cost is
increasing with increasing punishment, and, as can be seen from
Equation 2, the optimal punishment is less than the damage done. If
\(\frac{d}{dP}[O(P)\,C(P)] < 0\) at \(P = P^*\), then total enforcement cost is decreasing with
increasing punishment (due to the decrease in the number of offenses)
and the optimal punishment is more than the damage
done.[20]
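A numerical check may help fix ideas. The sketch below takes Equation 2 to say that the optimal punishment equals the damage done minus the marginal enforcement cost of deterring one more offense. The functional forms are my own illustrative assumptions, not the article's: an exponential density of offender gains and an enforcement cost per offense proportional to P. Under those assumptions the minimum of Equation 1 found by brute force should agree with the value implied by Equation 2.

```python
import math

# Illustrative assumptions (not from the article):
#   o(b) = N * lam * exp(-lam * b)   density of offenses by offender gain b
#   C(P) = c * P                     enforcement cost per offense
N, lam, c, D = 100.0, 1 / 200, 0.5, 1000.0

def offenses(P):
    """O(P): number of offenses whose gain exceeds P."""
    return N * math.exp(-lam * P)

def total_gain(P):
    """Integral of b*o(b) from P to infinity (closed form for this density)."""
    return N * math.exp(-lam * P) * (P + 1 / lam)

def social_cost(P):
    """Equation 1: O(P)[D + C(P)] minus the offenders' total gain."""
    return offenses(P) * (D + c * P) - total_gain(P)

# Brute-force minimum of SC(P) on a $1 grid.
p_grid = min(range(0, 5001), key=social_cost)

# Equation 2 under the same assumptions:
#   P* = D - c(1 - lam*P*)/lam   =>   P* = (D - c/lam) / (1 - c)
p_formula = (D - c / lam) / (1 - c)

print(p_grid, p_formula)
```

With these parameters the optimum is about $1800, above the damage of $1000, because total enforcement cost O(P)C(P) is falling in P at the optimum. A flatter density (for example lam = 1/1500) makes that derivative positive and pushes the optimum below the damage done.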
Wrong Argument, Right Answer?
My analysis so far implies that the simple description of
efficient punishment is wrong. If our objective were economic
efficiency, we would not, even if we could, choose to punish all
inefficient offenses and only inefficient offenses.
Although this way of looking at efficient punishment is wrong, it
is also useful. It provides a simple model that can be applied to a
wide range of legal regulation of behavior. At some extremes, the
model's description is a deceptive one-as when it implies that we are
concerned about not deterring too many murders. But for much
behavior-speeding tickets, pollution charges, library fines, arguably
most of civil law-preventing inefficient behavior is a fairly good,
although somewhat oversimplified, description of our objective.
Even in cases, such as murder, where the literal application of
the model may seem absurd, it still contains a considerable element
of truth. The limiting factor in how many murders we deter is not our
fear of deterring efficient murders. But, seen from the standpoint of
economic efficiency, the reason we are willing to bear substantial
costs in order to deter murder is that we believe it is (very)
inefficient-that the gain to the murderer is typically much less than
the loss to his victim.
Even those who reject economic efficiency as a complete
description of the objectives of our legal system should not reject
it as a partial description. It may well be true that we would want
to deter all murders (supposing we could do so costlessly) even if we
believed that some were, in the strict economic sense,
efficient.[21] But we would be a
great deal less concerned with deterring murders if we did not
believe that the costs of murder were large compared to the benefits.
Consider the following as evidence for that claim. There have been
several famous shipwreck cases involving murder and
cannibalism.[22] People who write
and think about such cases find the punishment of such behavior much
more troubling than the punishment of ordinary murder. That suggests
that if the benefit of committing murder were much higher relative to
the cost, if situations where an individual could preserve his own
life only at the cost of someone else's were common, we might have
substantially different attitudes toward murder.
Alternatively, imagine a society where everyone regarded life
after death-perhaps reincarnation-as a proven fact. To the members of
that society the cost imposed by a murderer on his victim would seem
much lower than it does to most of us. I conjecture that in such a
society murder would be considered less serious relative to other
crimes-more nearly comparable to, say, grand larceny-than it is in
ours.[23]
So one reason the efficient crime model is useful is that it
provides a simple picture that helps unify our view of legal
sanctions. Its simplicity is an advantage, especially for expository
purposes, over the more complicated, more correct, and more general
model that I have set out above. It is also an advantage over models
that treat specific moral judgements, such as our opposition to
murder or theft, as givens, rather than as conclusions to be derived
from more general considerations such as economic efficiency. Its
generality is an advantage over the alternative of considering
separately offenses that we do not deter because they are efficient
(some speeding) and offenses that we do not deter because it would
cost too much to do so (some murders).
A second reason why the model is a useful one is that although it
is quantitatively wrong, it is, for a wide range of cases,
qualitatively right. It does not tell us what the punishment for any
particular offense should be. But it does tell us, in most cases
correctly, in what direction changes in the characteristics of the
offense will move the optimum punishment.
If we actually used our theory to pick out offenses that should or
should not be deterred, the two models would give different results.
But in practice that is not how we use our theory, because we usually
do not have an accurate measure of the benefit to the offender or the
cost to the victim. Rather, we use the theory to produce qualitative
conclusions, to argue, for instance, that certain offenses or certain
offenders will, in an efficient system, be punished more severely
than others.[24] As we will see,
arguments of this sort can be transferred intact from the first model
(efficient crimes) to the second. The quantitative conclusions change
and additional factors become relevant, but the qualitative argument
remains.
I have spoken in the abstract of how moving from one model to the
other affects the conclusions. The rest of this essay provides a
series of examples, showing both how an argument formulated in terms
of the prevention of inefficient crimes remains relevant under the
more sophisticated analysis and how the change introduces additional
factors that might change the conclusion. I start with the question
of how punishment should be affected by the income of the offender,
and then go on to consider how it should be affected by the
characteristics of the victim-the central issue in Payne v.
Tennessee.
Part II: Should the Rich Pay Higher Fines or Receive Shorter Sentences?
In an article published a few years ago in the Journal of
Political Economy,[25] John Lott
argued that the tendency of our legal system to produce lower
probabilities of conviction for higher income defendants is evidence
for, not against, the economic efficiency of the criminal justice
system.[26] His analysis used a
model of efficient law enforcement in which expected punishment was
set at a level designed to deter only inefficient crimes. His
argument may be summarized as follows:
A month in jail, or a week in court, represents a
larger dollar cost to someone with a higher income; measured in
money, his time is more valuable. If rich defendants receive the same
jail sentences with the same probability as poor defendants, then
they are actually paying a higher (dollar) penalty. If the efficient
penalty is equal to the damage done, it should be the same for rich
and poor. It follows that an efficient legal system will either
impose lower (non-money) penalties on richer defendants or impose
them with lower probability. Our legal system does in fact impose
lower expected (non-money) punishments on richer defendants; that is
evidence in favor of the thesis that our system is economically
efficient.
How does the inclusion of punishment costs affect the conclusion
that richer people should receive lower expected jail sentences? In
the simplest case, it does not. If all the relevant functions-cost of
apprehension, cost of punishment, and elasticity of the supply of
offenses-are the same for rich and poor, then Lott's argument goes
through in this more complicated case. The optimal expected
punishment is a particular amount of money, hence fewer days in jail
(or an equal fine) for people with higher incomes.
Intuitively, that result makes sense. In Lott's model, imposing
equal jail terms on rich and poor would mean either that rich people
were being charged more than the damage done by their offenses (and
hence that some efficient crimes were being deterred) or that poor
people were being charged less than the damage done (and hence that
some inefficient crimes were occurring). In my model, equal jail
terms would mean that the marginal offense committed by a rich man,
while perhaps inefficient, would be less inefficient than the
marginal offense committed by a poor man-hence less worth the cost of
deterring. Both models imply equal fines for rich and poor, or
unequal jail sentences.
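The conversion between dollar penalties and jail time behind this conclusion can be sketched as follows. The daily values of time are hypothetical and the function name is mine; the only point is that a fixed dollar penalty corresponds to fewer days for someone whose time is worth more.

```python
# A fixed dollar penalty translates into fewer jail days for an
# offender whose time is worth more. Daily values are hypothetical.

def jail_days_for(dollar_penalty, value_of_day):
    """Days in jail whose cost to the offender equals the dollar penalty."""
    return dollar_penalty / value_of_day

penalty = 10_000  # same dollar penalty for both offenders (hypothetical)
poor_days = jail_days_for(penalty, value_of_day=50)
rich_days = jail_days_for(penalty, value_of_day=500)
print(poor_days, rich_days)
```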
The assumption that the functions are independent of income is,
however, an implausible one, for several reasons.[27] One, at least, brings us back to one of the
intuitions of those who believe that rich and poor should receive the
same jail sentences-and that the rich should pay higher
fines.[28] The supply function for
offenses shows the number of offenses as a function of the expected
punishment. If punishments are in money, and rich and poor people
have different values for money, we would expect the deterrent effect
of a given punishment to vary with income.[29]
To make the argument more rigorous, it is worth distinguishing
between two sorts of offense-those that have a roughly equal payoff
in utility for rich and poor and those that have a roughly equal
payoff in money. Stealing $100 provides the same amount of money to a
rich man as to a poor man, so we would expect that the same fine
would deter it. Indeed, since the time of the rich man is worth more
dollars per hour than that of the poor, we would expect that if they
are equally good thieves, so that it takes each the same amount of
time to steal $100, the rich man would be deterred by a lower fine
than the poor.
Consider, however, an offense whose payoff, measured in money, is
higher for richer offenders. One example would be saving ten minutes
by speeding; another would be slugging someone you were mad at. The
money value of the offense is higher to the richer offender, so it
will require a higher (money) punishment to deter him.
Whether this implies a higher efficient punishment depends, in a
somewhat complicated way, on the shape of the supply function for
offenses and the related cost functions for
deterrence.[30] Where the efficient
rule comes close to "impose just enough punishment to deter all
offenders," then the efficient system would impose higher (dollar)
punishments on higher income offenders, since higher punishments are
needed to deter them. The opposite result occurs if imposing the high
expected punishment necessary to deter high income offenders is so
costly that it is not worth deterring those crimes.
So far, the only difference between high and low income offenders
I have considered is in the supply function for offenses. There is a
second difference with less ambiguous implications. A fine is a more
efficient punishment than a prison term, and richer offenders can pay
higher fines. Even if neither offender can pay a sufficiently high
fine, imposing a given dollar punishment via imprisonment requires
fewer days in jail for a higher income offender, and is therefore
cheaper. So punishment costs (per dollar of punishment) should
decrease as income rises, which implies a higher efficient dollar
level of punishment for richer offenders.[31]
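A rough sketch of this second difference follows. It assumes, hypothetically, that an offender pays as much of the target punishment as he can as a fine (which has zero punishment cost) and serves the remainder in jail, that a jail day costs the state a fixed amount, and that a jail day costs the offender the value of his time. All figures and names are illustrative.

```python
# Punishment cost of imposing a target dollar punishment, paying what
# the offender can as a fine and the rest through jail time.
# All figures are hypothetical.

JAIL_COST_PER_DAY = 100  # assumed cost to the state of one jail day

def punishment_cost(target, max_fine, value_of_day):
    fine = min(target, max_fine)          # a fine is a pure transfer: zero cost
    remaining = target - fine             # dollar punishment imposed via jail
    days = remaining / value_of_day       # jail days needed to impose it
    # Cost = offender's uncompensated loss plus the state's jail spending.
    return remaining + days * JAIL_COST_PER_DAY

target = 10_000
poor_cost = punishment_cost(target, max_fine=0, value_of_day=50)
rich_cost = punishment_cost(target, max_fine=4_000, value_of_day=500)
print(poor_cost / target, rich_cost / target)  # cost per dollar of punishment
```

Under these assumptions the cost per dollar of punishment is lower for the richer offender both because part of the punishment is collected as a fine and because fewer jail days are needed for the remainder.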
Part III: Payne v Tennessee: Does the Value of the Victim's Life Matter?
"Today's majority has obviously been moved by an
argument that has strong political appeal but no proper place in a
reasoned judicial opinion. Because our decision in Lockett ...
recognizes the defendant's right to introduce all mitigating evidence
that may inform the jury about his character, the Court suggests that
fairness requires that the State be allowed to respond with similar
evidence about the victim. ... This argument is a classic non
sequitur: The victim is not on trial; her character, whether good or
bad, cannot therefore constitute either an aggravating or mitigating
circumstance."
(Justice STEVENS, dissenting in Payne v. Tennessee)
On the face of it, Justice Stevens' argument seems compelling.
Permitting the character of the victim, like the character of the
defendant, to be introduced in evidence may be fair as between victim
and defendant, but the victim in a criminal case is not a party to
the suit. Insofar as fairness is relevant in that context, it is
fairness between the defendant and the state. And, as pointed out
elsewhere in the dissent, the usual policy in criminal law is to try
to tilt in favor of the defendant, in order to balance the superior
power of the state.
There is, however, a sense in which the Court's position is
correct. If, as I have been assuming, criminal law is intended to
produce an efficient outcome, then decisions such as whether to
impose the death penalty involve balancing costs and benefits. One of
the benefits is saving the lives of potential victims by deterring
crimes that might have been committed against them.[32] One of the costs is executing criminals. The
value of saving lives depends on the value of the lives saved; the
cost of execution depends on the value of the life ended. So a
correct decision requires the jury to balance the value of the
victim's life against the value of the defendant's
life.[33] To that extent, the Court
is right and Justice Stevens is wrong.[34]
This does not mean that murderers should be executed if and only
if their lives are deemed by the jury less valuable than their
victims' lives. Executing a particular murderer will not save his
victim's life-that is already lost. The jury's willingness to execute
a particular murderer for killing a particular sort of victim may,
however, affect how many similar murders occur in the future. So
there is a tradeoff between murderers' lives and victims' lives, but
not necessarily at a rate of one for one.[35]
To the extent that potential murderers know the value of the lives
of their potential victims, the rule announced by the court means
that expected punishment as perceived by the offenders is an
increasing function of the damage done by the offense, as efficiency
requires. The murderer in Payne was aware of the fact-that his
victim was a mother with two small children-that the prosecution used
in persuading the jury to sentence him to death.[36] In such cases the court's rule will tend,
cæteris paribus, to increase the protection that the
law provides to mothers of small children, and to other victims whose
death will impose large costs on their survivors. Someone
contemplating killing such a person will expect a more severe
penalty, and thus be more likely to be deterred.
The rule established by Payne would also permit such
evidence to be introduced in cases where the offender was not aware
of the relevant facts at the time of the murder. The dissent argued
that this feature of the rule violated the eighth amendment, since it
could make the application of the death penalty depend on something
irrelevant to the wickedness of the murderer's act.[37] A similar argument could be made from an
economic standpoint. If the murderer does not know the value of his
victim's life, then selective punishment will not provide selective
deterrence. Even if the murderer knows that he will be punished more
severely for killing certain kinds of victims, he does not know
whether his potential victim is one of them.[38]
In this case, however, the Court's position can be defended in a
slightly different way. Even if all victims are identical, so that
the issue of selective deterrence does not arise, we still have the
problem of deciding what the penalty for murder should be. A more
severe penalty imposes larger costs on convicted murderers in order
to deter crimes and reduce the cost to potential victims. Where the
decision is whether to impose capital punishment for murder, the jury
is deciding whether to sacrifice the lives of murderers in order to
save the lives of (generic) victims. In choosing a penalty, the jury
is implicitly balancing those costs and benefits.
If the legal rules present the defendant as a living, breathing
human being with parents who care about him, while presenting the
victim as a shadowy abstraction, the result will be to overstate, in
the minds of the jury, the cost of capital punishment relative to the
benefit. So the rule announced in Payne can be interpreted,
not as a way of giving the jury information about the special value
of one victim relative to other victims, but as a way of reminding
the jury that victims, like criminals, are human beings with parents
and children, lives that matter to themselves and others. That seems
relevant information, if the jury is to decide whether the benefit of
deterring some murders is worth the cost of executing some
murderers.[39]
So far in this section I have not distinguished between the simple
version of the efficient punishment model and the correct version.
The reason is that both lead to the same conclusion. If our objective
is to prevent all inefficient murders by setting punishment equal to
damage done, then the punishment for destroying a life should be
higher the more valuable the life. If our objective is to prevent
murders whenever the cost of prevention is less than the net damage
done by the murder, then we should be willing to impose higher
punishment costs-execution rather than imprisonment, for example-for
murders that do more damage. So the result in Payne v.
Tennessee makes sense in terms of both the simple and the
complicated versions of the model.
One feature of the decision that does not seem to fit either
version of the efficient punishment model, however, is the Court's
discussion of what "value of life" means. The Court explicitly
rejected the idea of comparing the value of one life to the value of
another, and seemed to reject the idea of evaluating lives on any
economic basis.[40] The dissent
responded by arguing that without such a comparison evidence about
the victims would tell the jury members nothing they did not already
know, and would introduce "such illicit considerations as ... the
status of the victim in the community." (Justice
Marshall).[41]
One way of making sense out of the Court's position has already
been suggested. If the objective of victim impact statements is not
to give the jury special information about why one victim is more
deserving than another, but rather to remind the jury of the value of
the lives of victims, then no comparative judgement among
victims is required. The comparative judgement is rather between
the lives of victims and the lives of their murderers. This
interpretation seems more consistent with what the Court actually
said than the alternative, in which victim impact statements are
intended to provide the information necessary for selective
deterrence.[42]
A second possible justification for the Court's position was
implied by the Attorneys General of Tennessee and the U.S. in oral
argument.[43] Even if juries cannot
compare the life of one victim to the life of another, the victim is
not the only injured party. If the effect of one murder is simply to
kill the victim, while the effect of another is to kill the victim
and orphan her three small children, one can argue that the latter is
a more serious offense even though the lives of the victims
themselves are equally valuable.[44]
If either of these interpretations is correct, then the Court,
like the dissent, is rejecting one of the implications of the
economic approach to criminal law. Where criminals are, or might be,
aware of characteristics that affect the value of the lives of their
victims, selective punishment would provide selective deterrence and
thus make the criminal law more efficient. The result in Payne v.
Tennessee will allow that to happen but only, to judge by the
Court's dicta, as an unintended consequence.
Part IV: Punishment by Consequences:
The Selective Deterrence of Imperfectly Informed
Criminals
One argument made repeatedly in both Payne and the prior
literature is that it is unjust to make the punishment of the
criminal depend on factors, such as characteristics of his victim, of
which he was unaware when he committed the crime. A similar argument
applies if one's concern is efficient deterrence.
In most real cases, however, criminals are neither perfectly
informed nor perfectly ignorant. Even someone who murders a stranger
in the course of a robbery is likely to have some idea of the age and
sex of his victim-which is relevant to the probability that the
victim is a mother with small children. In less anonymous cases, the
criminal is likely to have more information. In the actual case of
Payne, the only relevant pieces of information the criminal
did not have when he committed the murder were the fact that one of
his victims would survive and the details of how that victim would
react to the death of his mother and sister.
This raises the question of how the economic analysis of selective
deterrence applies to a criminal with some, but imperfect,
information.[45] The answer to that
question provides another example of the general thesis of this
essay-that the simple version of the economic analysis of optimal
punishment gives a first approximation, but only a first
approximation, to the result of the correct model.
In order to see that, let us consider a simple case. There are two
types of victims-low value victims and high value
victims.[46] The total damage done
to everyone affected by a murder-the victim, survivors, other members
of society-is H for a low value victim and 2H for a high value
victim. Each potential murderer i has a probability p_i
that his victim is a low value victim and 1-p_i that his
victim is a high value victim.[47]
Each actual murderer has a .5 probability of being apprehended and
convicted.[48] What is the
consequence of making the punishment of a convicted murderer depend
on the value of his victim? How does that legal rule compare, from
the standpoint of economic efficiency, with the alternatives of
either imposing the same punishment on all murderers or making the
punishment depend upon what the court believes the murderer knew at
the time of the crime-the court's estimate of p_i?
We first consider this question in the context of the simple
model. We assume there are no costs of punishment[49] or apprehension; our objective is therefore to
set the expected punishment equal to the damage done, deterring all
inefficient crimes and only inefficient crimes. We do so by
setting the punishment at 2H for killing a low value victim (expected
punishment = probability of conviction x 2H = H = damage done) and 4H
for killing a high value victim.
Consider a potential criminal i. The expected harm his offense
will do is the probability his victim is low value times H plus the
probability his victim is high value times 2H, which is:
$$\langle \text{Harm} \rangle = p_i H + (1 - p_i)\,2H$$
If he commits the murder, his expected punishment is the
probability his victim is low value (p_i) times the
expected punishment for killing a low value victim plus the
probability his victim is high value (1-p_i) times the
expected punishment for killing a high value victim, giving:
$$\langle \text{Punishment} \rangle = 0.5\,p_i\,(2H) + 0.5\,(1 - p_i)\,(4H) = p_i H + (1 - p_i)\,2H = \langle \text{Harm} \rangle$$
So expected punishment equals expected harm, whatever
p_i may be.
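The equality can be checked numerically. The sketch below is only an illustration of the simple model as stated above: the function names, the normalization H = 1, and the sample values of p_i are assumptions introduced for the example.

```python
# Numerical check of the simple model: expected punishment equals expected harm.
# H is normalized to 1.0; every name below is an illustrative assumption.

P_CONVICTION = 0.5  # probability of apprehension and conviction (from the text)
H = 1.0             # damage done by killing a low value victim (normalized)

DAMAGE = {"low": H, "high": 2 * H}
# Punishment on conviction is damage divided by the probability of conviction,
# which gives 2H for a low value victim and 4H for a high value victim.
PUNISHMENT = {kind: damage / P_CONVICTION for kind, damage in DAMAGE.items()}


def expected_harm(p_low: float) -> float:
    """Expected damage when the victim is low value with probability p_low."""
    return p_low * DAMAGE["low"] + (1 - p_low) * DAMAGE["high"]


def expected_punishment(p_low: float) -> float:
    """Expected punishment when punishment depends on the actual victim."""
    return P_CONVICTION * (
        p_low * PUNISHMENT["low"] + (1 - p_low) * PUNISHMENT["high"]
    )


# Print both quantities side by side for a few sample beliefs p_i.
for p in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(p, expected_harm(p), expected_punishment(p))
```

The court never needs to know p_i in this sketch: it only observes which kind of victim was killed, and the criminal's own beliefs do the weighting.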
To put the same analysis verbally, expected damage is a weighted
average of actual damage, expected punishment is a weighted average
of actual punishment,[50] the
weights (p_i and 1-p_i) are the same in both
cases, so if actual punishment equals actual damage, expected
punishment will equal expected damage. Selective punishment thus
results in the schedule of expected punishments that the court would
impose if it knew p_i and could calculate the expected
damage imposed by each murder and adjust the punishment
accordingly.[51] That is a more
efficient result than could be imposed directly by a court with
anything short of perfect information about what each criminal knew
when he committed his crime.[52]
Consider the limiting case where potential criminals know nothing
about their potential victims; p_i is the same for all i,
say .5. Each criminal faces an expected punishment of (3/2)H, equal
to the expected harm done by his crime. He has one chance in four of
being convicted of killing a low value victim (punishment 2H) and one
chance in four of being convicted of killing a high value victim
(punishment 4H). Since the criminals are assumed to be risk neutral,
this is equivalent to a system where all criminals who were convicted
(probability one half) received a punishment of 3H.
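The arithmetic of this limiting case can be written out as two punishment lotteries. This is a minimal sketch, assuming H is normalized to 1 and using exact fractions; the variable and function names are illustrative.

```python
from fractions import Fraction

H = Fraction(1)              # damage from killing a low value victim (normalized)
P_CONVICTION = Fraction(1, 2)
P_LOW = Fraction(1, 2)       # uninformed criminal: victim equally likely low or high value

# Each lottery is a list of (probability, punishment) pairs facing a murderer.
selective = [
    (P_CONVICTION * P_LOW, 2 * H),        # convicted, victim was low value
    (P_CONVICTION * (1 - P_LOW), 4 * H),  # convicted, victim was high value
    (1 - P_CONVICTION, 0 * H),            # not convicted
]
uniform = [
    (P_CONVICTION, 3 * H),                # convicted, same punishment for all
    (1 - P_CONVICTION, 0 * H),            # not convicted
]


def expected_value(lottery):
    """Expected punishment of a lottery, the only thing a risk neutral criminal weighs."""
    return sum(prob * punishment for prob, punishment in lottery)


print(expected_value(selective), expected_value(uniform))
```

Both lotteries have the same expected value, so a risk neutral criminal with no information about his victim is deterred equally by either.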
So in the worst case for selective punishment (criminals have no
information about the victims) or the best case for punishment based
on criminal's knowledge (the court has perfect information about the
criminals) punishment according to outcome (what sort of victim
actually got killed) is no worse than the alternatives; in any other
situation it is better.
What happens to this result in the more sophisticated model, where
we include in our calculations the cost of catching and punishing
criminals? The answer is that the argument carries over in a
qualitative but not a quantitative sense. It is still true that
selective punishment results in a higher expected punishment for
criminals whose victims are more likely to be of high value, and that
a higher punishment for those criminals is desirable. But it is no
longer true that selective punishment produces the optimal result,
nor that it is better than the alternatives as long as criminals have
some information, however little, about their victims, and courts
have less than perfect information about criminals.
This is true for two reasons. The first is that, although an
efficient system will, cæteris paribus, impose higher
punishments on offenses that do more damage, the relation is no
longer one of simple proportionality between damage done and
efficient punishment. The optimal expected punishment, for reasons
explained in an earlier part of this essay, is damage minus the cost
of adjusting the schedule of punishments (and enforcement) to reduce
the number of offenses by one.[53]
That cost will generally be different at different levels of
punishment. There is no reason to expect that an offense doing twice
as much damage should be punished exactly twice as severely. The
optimal punishment might be three times as large, or only one and a
half times.
The criminal, in calculating the expected punishment he faces,
averages the punishments for killing the two different kinds of
victims, using as weights the relevant probabilities. But the optimal
punishment calculated using the more sophisticated model is not
simply the weighted average of the two punishments. So the expected
punishments that criminals calculate will vary in the right
qualitative way-they will be higher for criminals who have a higher
probability of killing high value victims and thus doing more
damage-but they may well be different from the optimal punishments
that would be set by a court that had all the information the
criminals had and used that knowledge to make punishment depend on
what the criminal knew at the time of his offense.
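A small numerical sketch may make the gap concrete. Everything quantitative here is an assumption made for illustration: the quadratic marginal-cost term is invented solely to make the optimal schedule nonlinear, and the informed court is assumed to apply that same schedule to the expected damage implied by p_i.

```python
H = 1.0  # damage from killing a low value victim (normalized)


def optimal_expected_punishment(damage: float) -> float:
    """Hypothetical optimal schedule: damage minus an assumed marginal cost of deterrence."""
    marginal_cost = 0.2 * damage ** 2  # assumption; chosen only to make the schedule nonlinear
    return damage - marginal_cost


def lottery_expected_punishment(p_low: float) -> float:
    """What the criminal expects when punishment depends on the actual victim."""
    return (p_low * optimal_expected_punishment(H)
            + (1 - p_low) * optimal_expected_punishment(2 * H))


def informed_court_punishment(p_low: float) -> float:
    """What a court knowing p_low would aim for, under the assumed schedule."""
    expected_damage = p_low * H + (1 - p_low) * 2 * H
    return optimal_expected_punishment(expected_damage)


# Compare the two for several beliefs p_i; they coincide only at p_i = 0 and p_i = 1.
for p in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p, lottery_expected_punishment(p), informed_court_punishment(p))
```

Under these assumed numbers the two schedules agree when the criminal is certain of his victim's type and diverge in between, while both still rise as a high value victim becomes more likely.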
The second reason why selective punishment is no longer
necessarily optimal is that, once we introduce punishment costs,
different patterns of punishment that are equivalent from the
standpoint of the criminal may no longer be equivalent from the
standpoint of the rest of society-they may have different costs. To
see the relevance of this, again consider the case where criminals
have no information about their victims, with p_i=.5 for
all i.
With selective punishment, a criminal who is convicted faces a .5
chance of the punishment for killing a high value victim plus a .5
chance of the punishment for killing a low value victim. Even if this
punishment lottery happens to produce the right expected punishment,
it may not be the least expensive way of doing so. It might be less
expensive to choose an intermediate punishment and impose it on all
offenders.
Consider the following example. It may well be that execution,
because of the repulsion towards killing in our society, is a much
more inefficient punishment than imprisonment-one that imposes a
larger cost per unit of deterrence.[54] Suppose that, from the standpoint of the
criminal, life imprisonment is exactly equivalent to (exerts the same
deterrent effect as) a fifty percent chance of execution combined with
a fifty percent chance of a ten year sentence. If so, and if the
social cost of the latter alternative is higher than the social cost
of the former, then selective punishment of completely ignorant
criminals (execution for killing a high valued victim and ten years
for killing a low valued victim) provides the same deterrence as
unselective punishment (life for all murderers), but at a higher
cost.[55]
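The comparison can be laid out with made-up numbers. In the sketch below every figure is hypothetical: the disutilities are chosen so that the lottery and life imprisonment deter equally, as the example supposes, and the social costs are chosen so that execution is the relatively expensive punishment.

```python
# Each punishment has a disutility to the criminal (what deters) and a cost to society.
# All numbers are hypothetical and exist only to illustrate the argument.
PUNISHMENTS = {
    "execution": {"disutility": 100.0, "social_cost": 150.0},
    "ten_years": {"disutility": 20.0, "social_cost": 20.0},
    "life": {"disutility": 60.0, "social_cost": 70.0},
}

# Punishment lotteries facing a convicted murderer who knew nothing about his victim.
selective = {"execution": 0.5, "ten_years": 0.5}  # depends on the victim actually killed
unselective = {"life": 1.0}                        # same sentence for every murderer


def expected(lottery, field):
    """Expected value of one field (disutility or social_cost) over a lottery."""
    return sum(prob * PUNISHMENTS[name][field] for name, prob in lottery.items())


for label, lottery in (("selective", selective), ("unselective", unselective)):
    print(label, expected(lottery, "disutility"), expected(lottery, "social_cost"))
```

With these assumed figures the two regimes deter equally but the selective lottery costs society more; different assumed costs could reverse the ranking, which is why the simple model's conclusion survives only approximately.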
In this case, as in the case discussed earlier where punishment
might vary with the income of the criminal, the simple model of
deterring all inefficient crimes and only inefficient crimes gives us
an approximation of the right answer, but only an approximation. The
argument and the conclusion carry over to the sophisticated model,
but only approximately. If criminals know a good deal about their
potential victims (p_i varies substantially with i), and
courts do not know much about what criminals know (courts do not have
good information about p_i), selective punishment based on
victim characteristics is probably more efficient than either
unselective punishment (all murderers get treated equally) or
selective punishments based on the court's estimate of the criminal's
knowledge at the time of the crime. If criminals are badly informed,
or if courts are well informed about what criminals know, selective
punishment based on victim characteristics is still at least as good in the
simple model, but not necessarily in the sophisticated model.[56]
I have discussed this question in the context of capital
punishment for murder, since that was the issue raised by
Payne, but the analysis applies more generally. The argument
of this section provides both an economic justification for making
the severity of the punishment imposed for a crime (or the amount of
damages awarded for a tort) vary with the damage done and a
qualification to that justification in situations where offenders are
badly informed about the consequences of their acts and courts are
well informed about the minds of offenders. In the context of tort
law, the same argument provides a justification for the familiar rule
that the tortfeasor takes his victim as he finds him.[57]
Throughout the discussion, I have assumed that after the offense
has occurred it is possible to measure the consequences, and that it
is therefore at least possible, although not necessarily desirable,
to make the punishment depend on the damage done. There are some
interesting cases where that is not possible. Many, such as
pollution, are handled through the regulatory system. The emission of
a particular pollutant at a particular place and time may do no
damage at all, or it may result in someone dying who would otherwise
have lived. Punishment is based on some ex ante estimate of
expected cost, since actual cost can usually not be measured. A
similar problem occasionally arises in tort, as in the DES
cases,[58] where it was impossible
to assign liability for particular injuries to particular defendants.
Part V: Payne, McCleskey, and Equal
Protection for Victims
Neither the Court nor the dissent discussed in detail the reasons
for rejecting comparative judgements among victims, and hence
selective deterrence. One obvious candidate is the general norm of
equal protection, as embodied in the Fourteenth Amendment to the U.S.
Constitution.[59] This
possibility is suggested by the evidence offered by the defense in
another case, McCleskey v. Kemp.[60] From the standpoint of our present
discussion, one striking feature of that case is the failure of
either the majority or minority opinions to consider the application
of the principle of equal protection to the protection of potential
victims. To see why one might have expected that issue to arise, it
is worth reviewing the evidence offered:
"In support of his claim, McCleskey proffered a statistical study
performed by Professors David C. Baldus, Charles Pulaski, and George
Woodworth (the Baldus study) that purports to show a disparity in the
imposition of the death sentence in Georgia based on the race of the
murder victim and, to a lesser extent, the race of the defendant. The
Baldus study is actually two sophisticated statistical studies that
examine over 2,000 murder cases that occurred in Georgia during the
1970's. The raw numbers collected by Professor Baldus indicate that
defendants charged with killing white persons received the death
penalty in 11% of the cases, but defendants charged with killing
blacks received the death penalty in only 1% of the cases. The raw
numbers also indicate a reverse racial disparity according to the
race of the defendant: 4% of the black defendants received the death
penalty, as opposed to 7% of the white defendants.
...Baldus subjected his data to an extensive analysis,
taking account of 230 variables that could have explained the
disparities on nonracial grounds. One of his models concludes that,
even after taking account of 39 nonracial variables, defendants
charged with killing white victims were 4.3 times as likely to
receive a death sentence as defendants charged with killing blacks.
According to this model, black defendants were 1.1 times as likely to
receive a death sentence as other defendants."[61]
The defense argued that this evidence showed an unconstitutional
discrimination against black defendants. On the evidence presented,
the direction of the discrimination is ambiguous. Black murderers
appear slightly more likely to receive a death sentence than white
murderers, all other things held constant-including the race of the
victim. But black murderers mostly kill black
victims[62]-with the result that
actual black murderers are substantially less likely than
actual white murderers to receive a death sentence-4% vs
7%.[63]
What is unambiguous is the discrimination against black
victims.[64] The evidence suggests
that, all other things held constant, the murderer of a white victim
is more than four times as likely as the murderer of a black victim
to receive a death sentence. If we take murders as they occur, rather
than trying to use statistical methods to control for factors that
correlate with race, the actual murderer of a white victim (in
Georgia) was about eleven times as likely as the murderer of a black
victim to receive the death penalty.[65]
The fourteenth amendment to the constitution provides that: "...
nor shall any State ... deny any person within its jurisdiction the
equal protection of the laws."[66]
Part of the protection I receive from the law, arguably the most
important part, is the protection provided by a legal system that
punishes crimes committed against me. One important argument in favor
of the death penalty is that it deters more effectively than lesser
punishments. If so, then the evidence presented in McCleskey
strongly suggests that blacks in Georgia get substantially less
protection of the law from murder than do whites. It seems odd that
neither the Court nor (with one partial exception) the minority in
the case discussed that issue.[67]
The Court in McCleskey neither explicitly accepted nor
rejected the proposition that a judicial system whose policies
resulted in less protection for blacks than for whites was in
violation of the fourteenth amendment. If they had rejected that
proposition, they might still have accepted the weaker claim that
features of a legal system deliberately designed to provide different
levels of protection to different potential victims were
unconstitutional. Even if it is obvious that the law does not, in
practice, protect everyone equally, it may still be improper to make
stronger protection for more valuable lives an explicit justification
for a legal rule.[68] If so, that
would provide an explanation of the Court's unwillingness to base its
defense of victim impact statements on their ability to provide
selective deterrence.
A slightly different reason is hinted at by the dissent, and was
raised explicitly in the briefs and in oral argument.[69] If it is appropriate to impose especially high
punishments on the murderers of especially valuable victims, then it
would seem equally appropriate to impose especially low punishments
on the murderers of especially worthless victims. This raises the
specter of a system where sufficiently unpopular people-prostitutes,
drug users, members of unpopular religious, racial, or political
groups-could be killed with impunity.[70]
One way that a court might try to deal with this problem would be
by creating a legal rule that permitted victim impact statements by
the prosecution but not by the defense-a possibility discussed in the
oral argument.[71] Since the
prosecution is presumably trying to get as high a punishment as
possible, only evidence favorable to the victim would be
introduced.[72] Legal problems
aside,[73] this raises interesting
difficulties of a game-theoretic nature.
Will Prosecutors Tell All?
Suppose we have a legal system in which the prosecution, but not
the defense, may introduce evidence on characteristics of the victim.
Further assume that the objective of each prosecutor is to get as
severe a sentence as possible in the case currently being prosecuted,
and that juries are fully rational and aware of how prosecutors
behave. Finally, assume that the characteristics of victims can be
ranked by their potential effect on the jury, and that prosecutors
are aware of the ranking; they know how juries will react to the
facts about particular victims. How will prosecutors behave?
Suppose a prosecutor follows a policy of only introducing evidence
on the characteristics of a victim if the victim is "above
average"-if the information will lead the jury to impose a more
severe sentence than if the jury knew nothing at all about the
victim. The problem with that policy is that having the jury know
nothing at all about the victim is not one of the prosecutor's
options, since the jury can get information not only from what the
prosecutor says but from what he does not say. When the prosecutor
chooses not to introduce evidence on the characteristics of the
victim, a rational jury will deduce that the victim must be below
average. The jury will therefore treat the victim about whom it has
been told nothing not as an average victim but as an average
unattractive victim-and reduce its sentence accordingly.
To make the argument more precise, imagine that we rank the
victims on a percentile scale, with the most attractive victim rated
1.00, the median victim 0.50, and the least attractive 0.00.
Prosecutors pick some X between 0 and 1 and introduce evidence on the
victim's characteristics if and only if the victim rates above X. If
the prosecutor does not introduce such evidence, a rational jury
aware of how prosecutors behave will conclude that the victim ranks
between 0 and X, and will base the verdict on an average victim,
ranking about X/2.[74]
Consider a prosecutor in a case where the victim ranks slightly
below X but above X/2. The prosecutor can expect to get a more severe
sentence if he reveals the victim's characteristics to the jury than
if he does not. So a strategy of only introducing evidence for
victims who rank above X is unstable-it pays a prosecutor to break
the rule by introducing evidence on any victim slightly below the
cutoff. The argument applies as long as X is greater than zero. The
only stable strategy is for prosecutors to introduce evidence on the
characteristics of all save the least attractive victims.
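The unraveling can be traced step by step. This is a minimal sketch under the assumptions already stated (ranks are percentiles, so the undisclosed victims are spread evenly between 0 and the believed cutoff); the function names and the starting cutoff are illustrative.

```python
def best_response_cutoff(believed_cutoff: float) -> float:
    """Cutoff an individual prosecutor prefers when the jury believes believed_cutoff.

    The jury treats an undisclosed victim as the average of the ranks between
    0 and believed_cutoff. A prosecutor therefore gains by disclosing any
    victim ranked above that average, so his own cutoff falls to it.
    """
    inferred_rank = believed_cutoff / 2
    return inferred_rank


def unravel(initial_cutoff: float, rounds: int) -> list:
    """Successive cutoffs as juries keep updating to what prosecutors actually do."""
    path = [initial_cutoff]
    for _ in range(rounds):
        path.append(best_response_cutoff(path[-1]))
    return path


print(unravel(0.5, 5))
```

Each round halves the cutoff, so any starting X above zero shrinks toward zero, and X = 0 is the only value that is its own best response.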
This analysis assumes that prosecutors, in deciding whether to
introduce evidence in a case, consider only the effect on the outcome
of that case. If there were a single prosecutor controlling all
cases, he would realize that lowering X in order to get a better
result in one case would produce a worse result in cases where he
chose not to reveal the information, since a lower X would result in
a lower estimate by the jury of the attractiveness of victims whose
characteristics were not revealed.
In such a situation, the two effects on the average verdict of
changes in X tend to balance each other. For each victim, there is a
penalty the jury fully informed as to the characteristics of that
victim would give to his killer. If the penalty chosen by a jury
ignorant of the victim's characteristics is simply its desired
penalty averaged over the characteristics the victim might have
had,[75] the balancing is exact;
with more complicated jury preferences it is not.[76] In the simple case, at least, a single
prosecutor controlling all cases would be indifferent to the level of
X, unless he himself, like the jury, was in favor of giving more
severe punishments to defendants who had killed more attractive
victims, in which case he would set X=0 and reveal all.
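The exact balancing in the simple case can be verified on a grid of victims. The sketch assumes ranks are spread evenly over the percentile scale and that an uninformed jury imposes its desired sentence averaged over the undisclosed group; the two sentence functions are hypothetical stand-ins for jury preferences.

```python
def average_sentence(cutoff: float, sentence, n: int = 10_000) -> float:
    """Average sentence over all cases when only victims ranked above cutoff are described."""
    ranks = [(k + 0.5) / n for k in range(n)]  # evenly spaced percentile ranks
    hidden = [r for r in ranks if r <= cutoff]
    shown = [r for r in ranks if r > cutoff]

    # Informed juries sentence according to the disclosed rank.
    total = sum(sentence(r) for r in shown)
    if hidden:
        # An uninformed jury imposes its desired sentence averaged over the hidden group.
        pooled = sum(sentence(r) for r in hidden) / len(hidden)
        total += pooled * len(hidden)
    return total / n


def linear(rank: float) -> float:
    return rank          # hypothetical: sentence proportional to rank


def convex(rank: float) -> float:
    return rank ** 2     # hypothetical: sentence rising faster for top-ranked victims


for x in (0.0, 0.3, 0.9):
    print(x, average_sentence(x, linear), average_sentence(x, convex))
```

Under this averaging rule the overall average sentence does not depend on the cutoff, which is why a single prosecutor who cares only about average severity has no reason to prefer one X over another.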
In a system with many prosecutors, each more concerned with the
severity of his verdicts than with the severity of the verdicts
gotten by other prosecutors, the incentive for a prosecutor to lower
X is stronger.[77] Most of the
undesirable effect of lowering X in a particular case is borne by
other prosecutors in other cases, so the gain to a prosecutor from
lowering X for his cases will almost always be greater than the loss,
making the only stable situation one in which X=0 and juries are
fully informed of the characteristics of victims.
So there is reason to believe that rational prosecutors, dealing
with rational juries, would find themselves driven to tell all-to
reveal the characteristics of all but the least attractive victims.
Rational juries would then deduce that any case in which the
prosecution did not provide a victim impact statement involved an
extraordinarily unattractive victim, and set the sentence
accordingly. In such a situation, giving only the prosecution the
power to introduce evidence on victim characteristics would not be
sufficient to protect unattractive victims.[78]
If this argument is correct, there may be no way of achieving the
efficiency gains of selective deterrence without the cost of
permitting juries to effectively nullify the law against murdering
unpopular people.[79] This would be
a strong argument against the result in Payne if there were no
other way in which juries could achieve that undesirable result.
Unfortunately, that is not the case. One of the costs of a jury
system is the potential for jury nullification of good laws as well
as bad ones. The experience of German Americans during World War I,
Japanese Americans during World War II, and Black Americans during
much of the past century demonstrates that our legal system provides
very limited protection to sufficiently unpopular minorities.
Part VI: Punishment by Desert or Punishment by
Consequence:
The Problem of Moral Luck
A second issue suggested by Payne v. Tennessee raises some
philosophical puzzles about what ought to determine punishment. The
dissent argued that, at least in the case of capital punishment, the
only thing that matters is the blameworthiness of the particular
defendant. If, as may often be the case, the murderer does not know
much about the victim when he decides to kill her, then her
characteristics are irrelevant to how wicked he is and should be
irrelevant to his punishment. A victim impact statement, in those
cases where the criminal did not know the victim, makes the
murderer's punishment depend on morally irrelevant factors, and
should therefore be prohibited.
In order to fit this argument into the discussion of this article,
I now drop the assumption that economic efficiency is the only value
by which legal rules ought to be judged. Suppose we assume, instead,
that efficiency and justice are distinct goals, both desirable; we
may be willing to accept some reduction in efficiency in order to
make our system more just, and we may be willing to accept some
reduction in justice in order to make the system more
efficient.[80]
How will this change in our assumptions alter our conclusions so
far as the question of punishing by results is concerned? If it is
just to impose a more severe punishment on a criminal who has done
more damage, then that reinforces any efficiency arguments in favor
of punishment by consequences. If, on the other hand, the only just
basis of punishment is the nature of the act as perceived by the
criminal when he committed it, that is an argument for basing
punishment on the best estimate the court can make of ex ante
expected injury rather than on the court's observation of ex
post actual injury-even if we conclude that the latter rule would
be more efficient.
On the face of it, the moral argument against basing punishment on
actual consequences seems to apply to all crimes and punishments, not
merely murder and execution.[81] If
punishment ought to be a function of how blameworthy the criminal is,
then punishment should never be affected by factors that the criminal
did not know about or could not control. That sounds persuasive but,
as the Court points out, it does not describe how our system actually
works.[82]
Indeed, it probably does not describe how any legal system
actually works. A drunk driver who runs into a tree is subject to
considerably less severe sanctions than one who runs into a
pedestrian. A gunman whose victim survives is guilty of attempted
murder; one whose victim dies is guilty of actual murder. The
blameworthiness is the same, but the penalty is
different.[83] In a wide range of
civil and criminal cases, the sanction visited upon an offender
depends in part on things that have little or nothing to do with how
bad a person he is.[84]
This paradox-that punishment does, and that to most people it
seems that punishment should, depend on factors unrelated to how
wicked the crime shows the perpetrator to have been-has long
concerned philosophers writing about both moral desert and legal
punishment. Current discussions often include it in the more general
category of "Moral Luck."[85] The
case in favor of ignoring luck in moral judgements was made by Kant,
who wrote:
"The good will is not good because of what it effects
or accomplishes or because of its adequacy to achieve some proposed
end; it is good only because of its willing, i.e. it is good of
itself. ... Usefulness or fruitlessness can neither diminish nor
augment this worth."
Kant does not go on to apply the argument to bad will. Adam Smith,
however, made a strong argument against the moral relevance of luck,
for good or ill:
"Whatever praise or blame can be due to any action,
must belong, either, first, to the intention or affection of the
heart, from which it proceeds; or, secondly, to the external action
or movement of the body, which this affection gives occasion to; or,
lastly, to the good or bad consequences, which actually, and in fact,
proceed from it. These three different things constitute the whole
nature and circumstances of the action, and must be the foundation of
whatever quality can belong to it."
"...To the intention or affection of the heart,
therefore, to the propriety or impropriety, to the beneficence or
hurtfulness of the design, all praise and blame, all approbation or
disapprobation of any kind, which can justly be bestowed upon any
action, must ultimately belong."[86]
"That the last two of these three circumstances cannot be the
foundation of any praise or blame, is abundantly evident; nor has the
contrary ever been asserted by any body. ... The consequences which
actually, and in fact, happen to proceed from any action, are, if
possible, still more indifferent either to praise or to blame, than
even the external movement of the body. As they depend, not upon the
agent, but upon fortune, they cannot be the proper foundation for any
sentiment, of which his character and conduct are the objects."
"... yet, when we come to particular cases, the actual
consequences which happen to proceed from any action, have a very
great effect upon our sentiments concerning its merit or demerit, ...
. Scarce, in any one instance, perhaps, will our sentiments be found,
after examination, to be entirely regulated by this rule, which we
all acknowledge ought entirely to regulate them."[87]
The next two chapters of Smith's discussion of the
paradox[88] provide an explanation
of why we feel this way-an explanation of moral sentiments not moral
facts. He concludes with a consequentialist argument, designed to
show that our feelings, although irrational, are useful, and thus
evidence of the divine wisdom:
"Sentiments, designs, affections, though it is from
these that according to cool reason human actions derive their whole
merit or demerit, are placed by the great Judge of hearts beyond the
limits of every human jurisdiction, and are reserved for the
cognizance of his own unerring tribunal. That necessary rule of
justice, therefore, that men in this life are liable to punishment
for their actions only, not for their designs and intentions, is
founded upon this salutary and useful irregularity in human
sentiments concerning merit or demerit, which at first sight appears
so absurd and unaccountable. But every part of nature, when
attentively surveyed, equally demonstrates the providential care of
its Author; and we may admire the wisdom and goodness of God even in
the weakness and folly of men."[89]
Smith's argument is that, since we can observe outcomes but not
intentions, it is sensible to base human punishments on outcomes and
leave the punishing (and rewarding) of intentions to
God.[90] It is equivalent, in a less
mathematical form, to my earlier discussion of punishing imperfectly
informed criminals. If the court had the information that God has, it
would know the ex ante probabilities facing the criminal; by
basing its punishment on that information, it could (in a world of
costly punishment) do better than if it based punishment on actual
outcome. Since courts do not have that information, they are better
off basing punishment on outcome.
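The equivalence can be illustrated with a small numerical sketch. It is only an illustration under assumptions that are not in the text: apprehension is certain, the punishment is a fine, the damage figure of $1000 is hypothetical, and the cost of imposing punishment is not modeled. The function names are likewise invented for the example. The point it shows is that a court which sees only the outcome, and fines the offender the damage when harm occurs, confronts the offender with the same expected fine that an all-knowing court would set from the ex ante probability.

```python
from fractions import Fraction

DAMAGE = 1000  # hypothetical dollar value of the injury if it occurs


def outcome_based_fine(harm_occurred):
    """Ex post rule: the court observes only whether harm occurred."""
    return DAMAGE if harm_occurred else 0


def expected_fine_under_outcome_rule(p_harm):
    """Fine an offender who knows p_harm expects to pay under the ex post rule."""
    return (p_harm * outcome_based_fine(True)
            + (1 - p_harm) * outcome_based_fine(False))


def ex_ante_fine(p_harm):
    """Fine an all-knowing court would charge for every attempt, harmful or not."""
    return p_harm * DAMAGE


# Exact fractions are used so the comparison is not blurred by rounding.
for p_harm in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    assert expected_fine_under_outcome_rule(p_harm) == ex_ante_fine(p_harm)
    print(f"p = {p_harm}: expected fine = ${expected_fine_under_outcome_rule(p_harm)}")
```

Note that outcome_based_fine never takes the probability as an argument; that is the sense in which the outcome rule economizes on information the court does not have.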
For Smith, and similarly for Beccaria,[91] this provides a moral as well as a prudential
argument for punishment according to outcome. It is the best that man
can do, and God will take care of correcting the inevitable errors in
both directions. For those of us who are concerned with providing
justice without divine assistance, however, the prudential argument
still leaves a moral problem.[92]
Even if it is prudent to use selective punishment to provide
selective deterrence, is it just to punish differently offenders who
may be equally wicked, merely because one had the good luck to miss
his intended target or to choose a less attractive victim?
Thomas Nagel discusses this problem at considerable
length.[93] His analysis, applied to
the sort of situation considered here, implies not only equal
punishment for the murderer who succeeds and the murderer who fails,
but also equal punishment for the person who, yielding to a
particular temptation, commits murder and the person who would have
committed murder if faced by the same temptation, but had the good
luck never to be so tempted. After discussing the clash between such
conclusions and our moral intuitions, he concludes that:
"I believe that in a sense the problem has no
solution, because something in the idea of agency is incompatible
with actions being events, or people being things. But as the
external determinants of what someone has done are gradually exposed,
in their effect on consequences, character, and choice itself, it
becomes gradually clear that actions are events and people things.
Eventually nothing remains which can be ascribed to the responsible
self, and we are left with nothing but a portion of the larger
sequence of events, which can be deplored or celebrated, but not
blamed or praised."
...
"The problem of moral luck cannot be understood
without an account of the internal conception of agency and its
special connection with the moral attitudes as opposed to other types
of value. I do not have such an account. ..." (pp. 37-38)
One possible answer to these problems is that Nagel and others are
too quick to assume that what people deserve depends only on what
they are. They, following Smith and Kant, take it for granted that
differences in outcome due to factors beyond the agent's control
cannot be morally relevant. To put it differently, they take it for
granted that the answer to the question "what ought to happen to you"
can depend only on the answer to the question "what sort of a person
are you" and not on such extraneous issues as what consequences your
actions have caused.
My point here is closely related to one raised by Robert Nozick in
a different context. In discussing the problem of defining a just
society,[94] he distinguished
between ethics of desert and ethics of entitlement. The distinction
can be shown with a simple example:
Suppose we have a society in which everyone has what
he deserves, however that is correctly calculated. In this society,
two people decide to bet a dollar on a flip of a coin. The loser pays
the winner.
If justice is a matter of desert, the society is now unjust. The
winner did not deserve to win-which way the coin fell was a matter
utterly unrelated to his moral worthiness. Since it was unrelated to
moral worth, it cannot have increased what the winner deserved by a
dollar and decreased what the loser deserved by a dollar. Yet most of
us would say that it is just for the loser to pay off his voluntarily
incurred debt.
Nozick deals with this problem by the idea of entitlement-a moral
category different from desert. I am entitled to something if I have
acquired it in a morally legitimate way from someone who legitimately
owned it. Mutual assent, as in the case of the bet, is a morally
legitimate form of transfer, and the starting situation was, by
assumption, just, so the winner is entitled to his dollar.
This simple example brings up an important tension in our moral
intuitions. On the one hand, we feel as though reward and punishment
ought to be deserved. On the other hand, we feel as though certain
acts create obligations or entitlements, not because of what they
tell us about the moral worthiness of those who take them but because
of their consequences.
I offer the conjecture that these two different approaches to
moral desert are ultimately grounded in two different ways of
thinking about the moral problem. One approach considers the problem
from the viewpoint of God judging mankind. Actual consequences are
irrelevant-God can cancel them, if he wishes, with a wave of his
hand. Desert is entirely a question of how good or bad a person is,
and that is a matter that God is competent to judge.
The other approach assumes moral judgements are to be made within
a society of equals. My opinion about how good or bad a person you
are has no special status-there is no reason to believe that it is
more accurate than anyone else's opinion, including yours. The
consequences of your acts, on the other hand, are there to be
observed by everyone.[95] Thus a
moral system that makes punishment and reward depend on outcomes
seems more appropriate to such a society than one in which they
depend on someone's opinion of moral merit.
Furthermore, a society of equals, unlike a society ruled by divine
providence, faces a budget constraint. If my careless driving results
in an accident that damages your car, somebody is going to have to
pay for fixing it.[96] Bad outcomes
that occur without any wicked intention still result in costs that
must be paid by someone; wicked intentions alone, without bad
consequences, do not. So it again makes sense for the system of moral
obligations to be based on outcomes, not merely
intentions.[97]
We are left with two different sorts of rules. One sort allocates
punishments and rewards according to moral merit-a sort of divine
report card. The other bases them on something more like a system of
accounts. Certain acts under certain circumstances result in some
people having obligations to others-obligations that may be entirely
independent of moral merit.
While I have presented the former approach as theist and the
latter as humanist, that is a description of the pattern of the
rules, not the beliefs of those who hold them. Rules suitable to be
applied by humans may seem appropriate to a theist considering human
institutions. That is the position of both Smith and Beccaria. They
reject punishment by moral desert not because it is inappropriate in
itself but because it is inappropriate to human courts and should
therefore be left to divine justice. Similarly, one may believe in reward and
punishment based on moral merit even if one does not believe in the
existence of a God with the knowledge and power necessary fully to
carry out such a program.
Seen from this standpoint, the dissent's claim that whether or not
someone is executed should depend only on his blameworthiness seems
problematic. One factor relevant to punishment is how bad a person
the criminal has revealed himself to be, but another may be how much
damage he has done.[98]
[1] I would like to thank Gary Becker, David
Emmanuel, Wendy Gordon, William Landes, James Lindgren, Larry Lessig
and Richard Posner for useful comments and suggestions.
[2] Expected punishment is probability of
punishment times amount of punishment. If an offender faces a .1
probability of having to pay a $1000 fine, his expected punishment is
.1x$1000 = $100. If there are several different possible punishments
for the same offense, then the expected punishment is probability
times punishment summed over all the punishments. Thus if the
offender faces a .1 probability of a $1000 fine and a .2 probability
of a $100 fine, his expected punishment is .1x$1000 + .2x$100 = $120.
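The arithmetic in this note can be checked with a short sketch. The function name and the input format (a list of probability and amount pairs) are assumptions made for the illustration, not anything used elsewhere in the article; the punishments are treated as mutually exclusive, with any remaining probability being the chance of no punishment.

```python
from fractions import Fraction


def expected_punishment(outcomes):
    """Return the sum of probability * amount over (probability, amount) pairs.

    `outcomes` lists mutually exclusive punishments for one offense; any
    leftover probability is the chance of no punishment at all.
    """
    total_probability = sum(p for p, _ in outcomes)
    if any(p < 0 for p, _ in outcomes) or total_probability > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return sum(p * amount for p, amount in outcomes)


# The two cases from the note, written as exact fractions to avoid rounding noise.
single_fine = expected_punishment([(Fraction(1, 10), 1000)])
two_fines = expected_punishment([(Fraction(1, 10), 1000), (Fraction(2, 10), 100)])

print(f"one possible fine:  ${single_fine}")
print(f"two possible fines: ${two_fines}")
```

The same function applies to non-monetary punishments only if they are first converted to a common unit, which the sketch does not attempt.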
[3] 111 S.Ct. 2597, 115 L.Ed.2d 720, 59 U.S.L.W.
4814, reversing Booth v. Maryland, 482 U.S. 496, 107 S.Ct.
2529, 96 L.Ed.2d 440 (1987), and South Carolina v. Gathers,
490 U.S. 805, 109 S.Ct. 2207, 104 L.Ed.2d 876 (1989).
[4] This is the same objective that Richard
Posner describes as "wealth maximization." (Richard Posner,
Economic Analysis of Law, 1992, pp 12-16, "The Problems of
Jurisprudence," 1990, pp. 356-357.) For a more detailed discussion of
what it means and why it might be a desirable objective, see Chapter
15 of Price Theory: An Intermediate Text (2nd edn.), David
Friedman, 1990. The assumption that efficiency is the only purpose of
the legal system will be dropped in the final section of the article,
where I discuss some philosophical issues relevant to the decision in
Payne.
[5] One reason the television set is worth less
to me may be that its value is net of the cost to me of stealing
it-burglar's tools, time and effort spent breaking into your house,
and the like. Economic analysis of the market for theft implies that
the marginal thief gets no net benefit; the cost to him of being a
thief equals the value to him of what he steals, so the cost to his
victim is a net loss with no benefit to balance it. The analysis is
worked out in Friedman (1990) Chapter 20, pp. 565-569.
[6] This particular example is an implausible
one. If your television set is worth more to me, there is no need for
me to steal it; I can buy it instead. My gain from stealing it is
only the money I save by not buying it from you. But that is equal to
your loss, so after including the associated costs the theft is
inefficient. It follows that if a crime is simply an involuntary
substitute for a voluntary transaction, we would never expect it to
be efficient. See Friedman (1990) pp. 569-573, Posner (1992), pp.
14-16, 206-211, 220-222.
There are, however, involuntary transactions that have no
voluntary substitute. Many are things we usually classify as torts,
but some are crimes. If, for instance, I drive home after having two
glasses of beer, I save myself a taxi fare but impose a cost, in
possible death or injury, on every driver, rider, and pedestrian
along my route. Even if the savings to me are larger than the cost to
them, there are severe transactional problems with trying to get all
of them to agree to allow me to drive. Or consider an efficient
assault. One can imagine a situation where one person is so angry at
another that he is willing to attack him, even though he knows he
will be fined an amount equal to the full damage done. A more exotic
example would be efficient theft-by someone who enjoyed the
excitement enough to more than make up for the associated costs. See
Posner (1992) p. 218.
[7] See, for example, the discussion in Chapter
7 of Posner (1992). Posner suggests that expected punishment should
be slightly above damage done to deter inefficient crimes and force
criminals whose crimes would be efficient to substitute still more
efficient market transactions, while permitting efficient crimes for
which no good market substitute exists.
[8] Or a damages award for a tort. The analysis
applies to civil damages as well as to criminal penalties, as I
discuss in "An Economic Explanation of Punitive Damages," Alabama
Law Review 40 (spring 1989) 1125-114. The same analysis could
also be applied to administrative penalties and to sanctions used by
a firm to control the behavior of its employees. In this article I
will be concentrating on the application of the analysis to punishing
crimes, since that is the particular issue raised by Payne.
[9] Throughout my analysis, I assume that costs
and benefits to criminals count, in social welfare calculations, just
like costs and benefits to anyone else. This assumption has been
questioned by a few scholars in the law and economics field, most
notably George Stigler and Gordon Tullock, who suggest that benefits
to criminals ought to be given no weight in such calculations. My
reasons for rejecting this position are discussed at some length in
David Friedman, "An Economic Explanation of Punitive Damages,"
Alabama Law Review 40, 3 (spring 1989).
[10] This point, and to some extent this
article, were suggested to me by discussions with Stephen Schulhofer,
themselves arising out of a correspondence between Stephen Schulhofer
and John Lott.
[11] I am ignoring in this essay two other
problems with the argument. The probability of apprehension, and hence
the expected punishment, is different for different criminals, so
even if we wanted to prevent inefficient and only inefficient crimes,
there is no pattern of enforcement that would do so. And some
punishments, such as imprisonment or execution, not only provide an
incentive not to commit a crime but also make it more difficult to
commit further crimes. I am considering only deterrence, not
incapacitation.
[12] This argument appears in David Friedman,
"Reflections on Optimal Punishment or Should the Rich Pay Higher
Fines?," Research in Law and Economics, 1981. A more recent
discussion is in Friedman (1989). See also A. Mitchell Polinsky and
Daniel L. Rubinfeld, "The Welfare Implications of Costly Litigation
for the Level of Liability," XVII JLS 1, (1988).
[13] The benefit considered here is the direct
benefit of the punishment-a fine received by the state, tort damages
received by the victim in a civil case, the cost of running a prison
(a negative benefit) or the like. It does not include the deterrent
effect of the punishment, which is considered separately in the
analysis. It does include benefits or costs that the victim, or
others, receive from knowing that the punishment has been imposed.
The death penalty might be a very inefficient punishment if many
people in the society were made unhappy by the knowledge that a
criminal had been executed.
Deterrence depends on expected punishment, but punishment cost per
unit of punishment typically depends on the actual punishment
employed. This is probably true of costs such as public disapproval
as well as costs such as maintaining a prison. While economic theory
focusses on the appropriate expected punishment for a given
crime, the public, which observes punishment but not probability, may
well judge the system by the relation of the actual punishment
to the crime. If so, then one effect of the victim impact statements
discussed below may be to raise the perceived wickedness of the crime
in the eyes of the jury, and thus raise the ceiling on the maximum
punishment the jury is willing to impose. The effect is the same as
if the jury were adjusting expected punishment-probability times
actual punishment-in response to an increase in its perception of the
damage done by the offense, since in either case the probability of
imposing the death penalty rises, but the reason for the effect is
different. This point was suggested to me by James Lindgren.
[14] When I say that one punishment is
equivalent to another, I mean that they have the same deterrent
effect. From the standpoint of utility theory, this is equivalent to
saying that both punishments have the same disutility for the
offender.
[15] A more rigorous form of the argument
appears in Friedman (1981).
[16] While my discussion will focus on direct
costs of enforcement and punishment, one should also include costs
such as the possibility that more severe enforcement will result in
more innocent parties being convicted, that increases in governmental
powers designed to catch more criminals may be used in other and less
desirable ways, etc.
[17] An even better solution would be to punish
only the inefficient offenses. We would thus avoid both the cost of
punishing the efficient offenses and the inefficiency of preventing
them. But in order to know which offenses are efficient, we must
somehow find out whether the criminal's gain is more or less than the
cost imposed on the victim. The way we find out, just as on ordinary
markets, is by charging a price-an expected punishment in the case of
offenses-and seeing whether the criminal is willing to pay it. In
order to do that we must impose the punishment on inefficient as well
as efficient offenses.
Where there is some other way of identifying the efficient
offenses, we can save the expense of punishing them. One example is
the excuse of necessity. The hunter who, lost in the woods and
starving, breaks into a locked cabin in order to telephone for help
will not be treated like an ordinary trespasser.
[18] The earliest mention of the effect of the
elasticity of the supply function for offenses on optimal punishment
that I have come across is in Gary Becker, "Crime and Punishment: An
Economic Approach," 76 JPE 169 (1968).
[19] I am making no assumption as to whether or
not each offense is committed by a different offender. Since I am
considering only deterrence and not incapacitation, the analysis is
the same for the case where all offenses are committed by the same
criminal, the case where each offense is by a different criminal, or
anything in between.
[20] It is worth noting that the relationship
between optimal expected punishment and damage done depends on how
enforcement cost changes with expected punishment at the
optimum. In general, the elasticity of the supply of offenses
will be different at different values of P.
[21] Readers who cannot imagine how a murder
could be efficient may find the following hypothetical of interest. A
wealthy and bored big game hunter decides that the only animal
dangerous enough to be really worth hunting is man. He accordingly
makes the following offer to a group of adventurers:
"I will pay ten of you a million dollars each. In
exchange, each agrees that I may choose one of the ten at random and
attempt to kill him."
Ten adventurers accept the offer. Ex ante, the contract is
a Pareto improvement-everyone concerned is better off, since the
adventurers are each willing to accept a ten percent chance of being
picked in exchange for a million dollars (assume nobody else knows
about the contract). Yet many people would still believe that such a
contract should not be enforced-that murder ought to be illegal even
between consenting adults.
[22] Regina v Dudley and Stephens 1884,
14 QBD 273 is one example. See the discussion in Posner (1992), pp.
241-242.
[23] This raises an interesting historical
question. Was the differential punishment for murder less in
societies that believed in either an afterlife or reincarnation? In
England in the Middle Ages, all felonies, not only murder, were
capital offenses-and punishment for all felonies, including murder,
might sometimes be converted into a fine.
A further complication is that in such societies death may be a
weaker sanction than in ours, at least if the offender expects an
attractive afterlife-which may explain why heretics were often not
merely executed, but executed in strikingly painful ways. See the
discussions in Paul Brest, "The Misconceived Quest for the Original
Understanding," in Boston University Law Review vol. 60, pp. 204-238
at 221 and in Posner (1992), pp. 229-230.
[24] If we believe that efficiency is
desirable, we might also use the analysis to make qualitative
recommendations-to argue that certain offenses ought to be punished
more severely than others.
[25] Lott, John, "Should the Wealthy be Able to
`Buy Justice'," JPE 95, December 1987, pp. 1307-1316.
[26] We would also expect, from applying the
simple model to differences in the money equivalent of the damage
done rather than the money equivalent of the punishment, that fines
for assaulting rich people would be higher than for assaulting poor
people. Here again, one may believe that the result is unjust but
also that it is correct-that in this regard our legal system does
resemble an economically efficient one, whether or not it should.
Such distinctions were an explicit element of the Anglo-Saxon law out
of which our law developed. One argument against the Court's decision
in Payne v. Tennessee, discussed below, is that one of its
effects may be to make the punishment for murdering rich people
higher than for murdering poor people.
[27] "But what is the same punishment? Is the
same fine, for example, productive of the same effect on rich and
poor? Or does the same number of years in prison have the same effect
on different individuals regardless of their diverse temperaments or
physique?" Morris Raphael Cohen, "Moral Aspects of the Criminal Law,"
Yale Law Journal, Vol. 49, pp. 987-1026 (1940).
[28]
"Some crimes are attempts against the person, others
against property. The penalties for the first should always be
corporal punishments. ... The great and rich should not have it in
their power to set a price upon attempts made against the weak and
the poor; otherwise riches, which are, under the laws, the reward of
industry, become the nourishment of tyranny. ... I shall limit myself
to considering only the punishments to be assigned to noblemen,
asserting that they should be the same for the first as for the least
citizen." (Cesare Beccaria, On Crimes and Punishments, 1764,
p.40. Henry Paolucci translator, Bobbs Merrill, Indianapolis 1963.
pp. 69-70.)
In this passage, Beccaria is arguing, in effect, for equal jail
sentences rather than either equal fines or unequal jail sentences;
he does not consider the possibility of unequal fines. He goes on to
deal with the claim that the punishment really imposes a larger cost
on a noble, because of his greater education and greater
vulnerability to social stigma ("the disgrace that is spread over a
noble family") by arguing that the proper measure of punishments is
the public injury done and that greater damage is done by a crime
"when committed by a person of rank"-presumably because of the bad
example. His argument is in part consistent and in part in conflict
with mine.
[29] This argument is worked out in
considerably more detail in Friedman (1981).
[30] See Friedman (1981) for an explanation and
formal analysis.
[31] A similar argument might apply to the
enforcement of rules regulating the behavior of firms. Suppose that
any judgement above $10 million will push a particular firm into
bankruptcy. Further suppose that bankruptcy is a bad thing-the real
value of the firm is greater as a going concern. In that case a
punishment of $11 million (of which only $10 million will be paid) is
much more costly than a punishment of $9 million. This would apply to
criminal punishments, civil punishments, and administrative
penalties. In each case, punishment cost becomes large when the
punishment reaches a level that creates a significant probability of
bankruptcy and becomes infinite when it exceeds the liquidation value
of the firm. It follows that it may be efficient to impose larger
punishments on wealthier firms, even if the offense is the same.
Another application of the analysis would be to a firm attempting
to control the behavior of its employees. Some employees can be
punished for malfeasance by firing, denial of promotion or other
internal sanctions. Others can only be punished by expensive legal
procedures. The optimal sanctions for employee malfeasance and the
appropriate level of precautions will vary accordingly.
[32] I, like most economists, assume that
increasing the penalty for a crime will tend to decrease its
occurrence; I realize that some people disagree, and that there are
other grounds on which the case for and against capital punishment
can be and is argued.
[33] Throughout my discussion, I assume that
the sentence is set by the jury, as was the case in Payne.
Essentially the same arguments would apply if it were set by the
judge instead.
In order to avoid misunderstanding, I should make it clear that I
am not claiming that jurors (or judges) are necessarily trying to
produce the economically efficient result, still less that they
always succeed in doing so. My claim is only that the relation
between the value of the victim and the value of the defendant is
relevant both to the efficient decision and to the actual behavior of
the jury.
[34] For a general overview of the history of
capital murder and the attempt to determine which murderers ought to
be executed, see the discussion in the commentary on the Model Penal
Code § 210.6 (American Law Institute 1985).
[35] Isaac Ehrlich, in a widely discussed and
widely criticized study of the effect of capital punishment ("The
Deterrent Effect of Capital Punishment: A Matter of Life and Death,"
65 Am. Econ. Rev. 397 (1975)), concluded that each execution
deterred several murders.
[36] He was not aware that one of the children
would survive, to be the subject of the prosecutor's oratory. The
dissent did not, however, try to argue that, having done his best to
kill all three victims, he should not be held morally responsible for
the emotional pain to the one who survived.
[37] "Where, as is ordinarily the case, the
defendant was unaware of the personal circumstances of his victim,
admitting evidence of the victim's character and the impact of the
murder upon the victim's family predicates the sentencing
determination on "factors ... wholly unrelated to the blameworthiness
of [the] particular defendant." Booth v. Maryland, supra, 482
U.S., at 504, 107 S.Ct., at 2534; South Carolina v. Gathers,
supra, 490 U.S., at 810, 109 S.Ct., at 2210." (Justice Marshall,
dissenting in Payne v. Tennessee) The dissent also argued
that, even if the defendant was aware of the relevant circumstances,
presenting them to the jury would tend to produce a decision based on
emotion rather than reason. The experience of reading the case,
surely less moving than the experience of sitting through it,
provides both evidence for this claim and a powerful argument in
favor of the jury's decision.
[38] If the criminal has some information about
the value of the victim's life-knows, for example, that she is of an
age at which she is likely to be a mother with small children-then
selective punishment produces some selective deterrence, although
less than if the criminal were perfectly informed about the victim.
This point is discussed at greater length below.
[39] The point is demonstrated,
unintentionally, by one of the briefs opposing the result eventually
reached by the court:
"In the case at bar, therefore, the prosecution could
have argued to the jury that the perpetrator likely knew that, if by
chance a child survived the attack, he or she would long for his or
her mother or sibling.
The fact that the prosecution could have made this argument does
not justify its formal presentation of Ms. Zvolanek's testimony in
blatant violation of Booth. Her live emotional testimony that
Nicholas did in fact cry for his mother, that he repeatedly asked for
"my Lacie", and that he asked his grandmother if she "also missed
Lacie" is markedly different from the prosecutor's merely drawing a
general inference during an argument." (Petitioner's brief,
Payne.)
The obvious response is that "drawing a general inference during
an argument" presents a less, not more, accurate picture of the
damage done by a murder than the sort of dramatic and emotional
testimony that was actually introduced. This point is made in the
amicus brief of the State of California: "Booth has
relegated the victim of a capital crime to a faceless,
undifferentiated mass ... ." and again in the amicus brief for
The National Organization For Victim Assistance:
"Victims speaking of harm done, of the effect the
crime has had on their lives, do not claim that one life is more
valued than another, but rather bring into sharp focus for the judge,
the jury, and society, the realities of what the aftermath of violent
crime exacts on each of these essential parts of life in a free
society. To muzzle all victims at capital sentencing hearings for
fear that some may be more persuasive or express more eloquently the
horrors of crime, is the truly arbitrary and capricious decision."
(Judith Rowland)
"To require, as we have, that all mitigating factors which render
capital punishment a harsh penalty in the particular case be placed
before the sentencing authority, while simultaneously requiring, as
we do today, that evidence of much of the human suffering the
defendant has inflicted be suppressed is in effect to prescribe a
debate on the appropriateness of the capital penalty with one side
muted. If that penalty is constitutional, as we have repeatedly said
it is, it seems to me not remotely unconstitutional to permit both
the pros and the cons in the particular case to be heard."
Booth at 520 (Scalia, J. dissenting).
"What Booth and Gathers ... are suggesting is a
generic victim, an abstract victim, an invisible victim at the
sentencing." (Burson, page 44 of the transcript provided by Alderson
Reporting Company, Inc., hereafter referred to as "the transcript").
[40] "Payne echoes the concern voiced
in Booth's case that the admission of victim impact evidence permits
a jury to find that defendants whose victims were assets to their
community are more deserving of punishment than those whose victims
are perceived to be less worthy. Booth, supra, 482 U.S., at 506, n.
8, 107 S.Ct., at 2534 n. 8. As a general matter, however, victim
impact evidence is not offered to encourage comparative judgments of
this kind-for instance, that the killer of a hardworking, devoted
parent deserves the death penalty, but that the murderer of a
reprobate does not. It is designed to show instead each victim's
"uniqueness as an individual human being," whatever the jury might
think the loss to the community resulting from his death might be.
The facts of Gathers are an excellent illustration of this:
the evidence showed that the victim was an out of work, mentally
handicapped individual, perhaps not, in the eyes of most, a
significant contributor to society, but nonetheless a murdered human
being." (Rehnquist, C.J., for the Court in Payne v. Tennessee)
The amicus brief by the State of California, on the other
hand, argued for comparative judgements among victims:
"Contrary to the assumption in Booth, the harm to
society may be greater depending upon the characteristics of the
victim. The murder of a police officer, parent or child harms society
more than the murder of a drug dealing child molester."
[41] "The fact that each of us is unique is a
proposition so obvious that it surely requires no evidentiary
support. What is not obvious, however, is the way in which the
character or reputation in one case may differ from that of other
possible victims. Evidence offered to prove such differences can only
be intended to identify some victims as more worthy of protection
than others." (Stevens, J., dissenting in Payne v. Tennessee).
[42] Another possible interpretation of what is
actually happening in cases like Payne is that the prosecution
is establishing, not the value of the victim's life, but the
innocence of the victim. Jurors may be more likely to identify with
victims who are entirely innocent than with those who were, in some
sense, partly the cause of their own deaths. One example would be a
drug dealer killed by a rival; a less clear one would be the victim
in a marital quarrel. This point was suggested to me by Wendy Gordon.
Such considerations were not raised by either the Court or the
dissent.
[43] Burson distinguished during the hearing
between "worth and sanctity of a human life," which is the same for
all lives, and societal harm, which might vary from one victim to
another (p. 38 of the transcript). And Thornburgh responded to a
question by Justice Scalia with "It's not the characteristics
themselves but what has resulted from the death of that individual in
a loss to the victim, the family, and the community" (p. 54 of the
transcript).
[44] The civil law, in case of wrongful death,
has traditionally carried this argument even farther, basing damages
on the injury to everyone except the victim. The concept of "hedonic
damages" represents a recent attempt to include in the calculation the
value of the victim's life to himself.
[45] In this discussion, I am taking the
criminal's knowledge as given. One effect of a legal system that made
the severity of punishment depend, in part, on the characteristics of
the victim would be to give potential criminals an incentive to learn
more about potential victims before deciding whether to kill them.
[46] At this point I am adopting the view the
court rejected-that punishment should vary with the value of the
victim's life. Readers who are uncomfortable with the idea that some
victims are more valuable than others may wish to think of high value
victims as mothers with small children and low value victims as
ninety year old men with incurable cancer. Those who are still
uncomfortable with the idea may wish to transfer the analysis to some
less serious crime than murder, and consider it as applicable to the
question of whether imperfectly foreseen harm should be considered in
setting the sentence for that crime.
[47] The probability is a description of what
the potential murderer knows when he decides whether to commit the
murder. It is his knowledge that is relevant to his decision, and it
is his decision that we are trying to affect by imposing a punishment
in order to deter a crime.
[48] In a more elaborate analysis, one would
want to let the probability of apprehension depend on the value of
the victim; the police could, probably do, and in an efficient system
probably would, try harder to apprehend murderers of victims whom
they consider more valuable. A further possibility not considered
here is that in some cases the difficulty of catching the offender
may depend on the outcome of the offense. Consider, for example,
attempted and actual murder. A successful murderer does not have to
worry about being identified by his victim's testimony-an
unsuccessful murderer does. For a situation where the offender who
has done more damage is easier to apprehend, consider violations of
safety regulations. It is harder to conceal a violation if it has
killed someone.
[49] This assumption implies that criminals are
risk neutral, since an uncertain but otherwise costless punishment
imposed on a risk averse criminal would generate a cost of risk
bearing.
[50] More precisely, actual expected punishment
for killing a victim of a given value. It is still an expected
punishment because it is an actual punishment if convicted times the
probability of being convicted.
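The relation described in this footnote can be written compactly. The notation is illustrative only and is not the article's own: p stands for the probability of conviction, treated here as independent of the victim's value (the simplification noted in fn 48), F(v) for the punishment imposed on conviction for killing a victim of value v, and E(v) for the resulting expected punishment faced by a risk-neutral offender (fn 49).

```latex
\[
  E(v) = p \, F(v)
\]
```

Under this reading, "selective punishment" means letting F vary with v, so that E(v) varies with v as well even though p is held fixed.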
[51] This appears to be the policy advocated by
most of the opponents of the Court's decision in Payne,
insofar as they are willing to accept the idea that the consequences
of some murders are predictably more heinous than the consequences of
others. See the passage from the petitioner's brief in Payne
quoted in fn 39 above.
[52] Throughout this analysis I assume that any
attempt at selective deterrence by the court must be based on the
criminal's beliefs about the victim. One could imagine a system where
a court was better informed than the criminal, ex ante, about
the costs imposed by a particular offense, and conveyed that
information to the potential criminal by announcements about its
penalty schedule. For example, the court might (and some
legislatures, in effect, do) announce that "the lives of policemen
are especially valuable, and we will therefore execute you if you
kill one." Such an announcement affects the incentives of a potential
murderer, even if his subjective probability that the policeman he is
contemplating killing is a particularly valuable person is very low,
since what matters is what he thinks the court thinks.
In my analysis, I am concerned with how the court uses the
criminal's knowledge about the victim to provide selective
deterrence-either by basing punishment on what the court thinks the
criminal knew, or by basing punishment on the actual outcome, and
relying on the effect of that policy working through the criminal's
probabilities for the alternative outcomes.
This issue is discussed at greater length in David Friedman,
"Deterring Imperfectly Informed Tortfeasors: Optimal Rules for
Penalty and Liability" (1992) (manuscript available from the author).
[53] This form of the result of the argument is
worked out explicitly as Equation 2 in part I above and in Friedman
(1981).
[54] Here, as in most (but not all) of the law
and economics literature, the social cost of the punishment includes
both the cost to the criminal (his life, in the case of capital
punishment) and the cost to others, including moral revulsion, the
cost of running prisons, the hangman's fee, etc.
[55] Such a situation is particularly likely if
one of the alternatives has a very low probability but a very high
cost. Consider a crime, such as replacing the medicine in a bottle
with aspirin or putting a sub-lethal dose of poison on Chilean
produce in a U.S. grocery store as a protest against the policies of
the Chilean government, which usually does no significant damage but
has a small probability of killing someone. The equivalent of
selective deterrence would be a policy of punishing the perpetrator
according to the damage done-a small fine most of the time, and
execution if someone dies. It may be less costly and more effective
to instead impose a moderately severe punishment based on the
expected damage. A version of this example was suggested to me by
David Emmanuel.
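A minimal numerical sketch of the comparison in this footnote follows. Every number in it is hypothetical and chosen only for illustration; none comes from the article. It assumes conviction is certain, that the offender is risk neutral, and that penalties can be expressed in the same units as damage. It shows only that a penalty schedule based on actual outcome and a single penalty based on expected damage give the same expected punishment; it does not model the costs of imposing either.

```python
def expected_penalty(outcomes):
    """Return the expected penalty for a list of (probability, penalty) pairs.

    The probabilities must sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * penalty for p, penalty in outcomes)


# Hypothetical figures, not taken from the article.
P_DEATH = 0.001             # chance that the tampering kills someone
HARM_MINOR = 100.0          # damage when nobody dies
HARM_DEATH = 5_000_000.0    # damage when someone dies

# Policy A: punish according to the damage actually done.
by_outcome = [(1 - P_DEATH, HARM_MINOR), (P_DEATH, HARM_DEATH)]

# Policy B: one moderately severe penalty equal to the expected damage.
flat = expected_penalty(by_outcome)
by_expectation = [(1.0, flat)]

print("expected penalty, by outcome:    ", expected_penalty(by_outcome))
print("expected penalty, by expectation:", expected_penalty(by_expectation))
print("largest single penalty, by outcome:", max(pen for _, pen in by_outcome))
```

With these figures the two policies deter a risk-neutral offender equally, but the outcome-based schedule does so only by imposing, in the rare fatal case, a penalty several hundred times larger than the flat one. That is the kind of situation in which the costs of very severe punishments could make the flat penalty the cheaper choice.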
[56] This result must be stated in such an
imprecise form because I have not specified the actual form of the
relevant supply curve (of offenses as a function of expected
punishment) and cost curves (for punishment, apprehension, and
conviction). If the additional term in the optimal punishment
calculation added by the existence of these costs varies only
slightly over the relevant range of punishments, then the
sophisticated model gives almost the same result as the simple model.
In that case, either a very well informed court or a very badly
informed set of criminals would be necessary to make selective
punishment by victim less efficient than selective punishment by
court's estimate of criminal's knowledge.
[57] This issue is discussed at greater length
in Friedman (1992) (manuscript available from the author). For a
somewhat different view, see A. Mitchell Polinsky, "Optimal Liability
When the Injurer's Information about the Victim's Loss is Imperfect,"
IRLE (1987), 7 (139-47).
[58] Sindell v. Abbott Laboratories, 26 Cal. 3d
588, 607 P.2d 924, 163 Cal. Rptr. 132 (1980); Murphy v. E.R. Squibb
& Sons, Inc., 40 Cal. 3d 672, 710 P.2d 247, 221 Cal. Rptr. 447
(1985).
[59] "Isn't the real problem with getting into
the-or at least with the prosecution's taking the affirmative in
getting into the character of the victim, that it implies that
society is valuing victims differently?" "Isn't the real problem
one, almost one, a kind of maybe a second-tier equality before the
law argument, that society is placing different values on their
victims-on victims?" (Remarks by Justices on pp. 26 and 27 of the
transcript.)
[60] 481 U.S. 279, 107 S.Ct. 1756
[61] McCleskey v. Kemp, Majority opinion
by Justice Powell.
[62] Black murderers presumably differ in other
statistically relevant ways from white murderers. I have not worked
with the original data, so do not know how much of the discrepancy
between the treatment of black murderers and white murderers is due
to the difference in the race of their victims, but it seems likely
that it is the major factor.
[63]
"Most black victims are killed by black murderers, and
a disproportionate number of murder victims is black. Wherefore the
discrimination in favor of murderers of black victims more than
offsets, numerically, any remaining discrimination against other
black murderers." Ernest Van den Haag, "The Death Penalty Once More,"
U.C. Davis Law Review, Vol. 18 p. 961 (1985).
"Those who demonstrated the pattern seem to have been under the
impression that they had shown discrimination against black
murderers. They were wrong. However, the discrimination against black
victims is invidious and should be corrected." Van den Haag (1985),
p. 961 fn 23.
Gary Kleck, in "Racial Discrimination in Criminal Sentencing: A
Critical Evaluation of the Evidence with Additional Evidence on the
Death Penalty," 46 Am. Soc. Rev. 783, 797-98 found that the risk of a
death sentence was higher for a white defendant than a black
defendant throughout the period 1967-1978, presumably because of
discrimination by race of victim.
[64] This issue is raised in Norval Morris,
"Race and Crime," Judicature vol. 72 p. 111. For a survey of the
various studies, see Samuel R. Gross, "Evaluating Evidence of
Discrimination," U.C. Davis Law Review, Vol. 18 pp. 1275-1325 (1985).
The author concludes that "The scientific implications of these
studies are simple. The evidence indicates, unmistakably, that there
has been substantial discrimination in capital sentencing by race of
victim, at least in those states that have been extensively studied."
[65] "All [Baldus's models] showed
race-of-victim disparities, virtually all of which were highly
statistically significant. Many showed race-of-defendant disparities
as well." McCleskey v. Kemp, Statement of the Case:
Petitioner's Record Evidence, in Landmark Briefs and Arguments of
the Supreme Court of the United States, Philip B. Kurland and
Gerhard Casper, Editors, vol. 171 pp. 468-9. "In sum, most of Baldus'
many measures revealed strong, statistically significant disparities
in capital sentencing in Georgia homicide cases, based upon the race
of the victim. (T. 726-28). Race of defendant disparities also
regularly appeared, although not with the invariable consistency of
the victim statistics."
[66] U.S. Constitution, Amendment XIV, Section
1.
[67] The one exception is a passage in Justice
Blackmun's dissent (beginning "Moreover, the legislative history of
the Fourteenth Amendment reminds us that discriminatory enforcement
of States' criminal laws was a matter of great concern for the
drafters," and including a footnote on discriminatory law enforcement
during the post civil war period) which raises the issue of unequal
protection of potential victims, but does not apply it to the case
under discussion. The issue was also raised in the brief for the
petitioner:
"The history of the Equal Protection Clause
establishes that race-of-victim discrimination was a major concern
for its Framers, just as Professor Baldus has now found that it is a
major feature of Georgia's administration of the death penalty.
Following the Civil War and immediately preceding the enactment of
the Fourteenth Amendment, Southern authorities not only enacted
statutes that treated crimes against black victims more leniently,
but frequently declined even to prosecute persons who committed
criminal acts against blacks. ... The congressional hearings and
debates that led to enactment of the Fourteenth Amendment are replete
with references to this pervasive race-of-victim discrimination; the
Amendment and the enforcing legislation were intended, in substantial
part, to stop it. As the Court recently concluded in Briscoe v.
LaHue, 460 U.S. 325, 338 (1983), "[i]t is clear from the legislative
debates that, in the view of the ... sponsors, the victims of Klan
outrages were deprived of `equal protection of the laws' if the
perpetrators systematically went unpunished." Landmark Briefs and
Arguments of the Supreme Court of the United States, Philip B.
Kurland and Gerhard Casper, Editors, vol. 171. pp. 647-9. "Similarly,
if the death penalty is meant to deter capital crime, it ought to
deter such crime equally whether inflicted against black or against
white citizens." fn 13, pp. 651-652.
The Court, of course, mentioned in its opinion the evidence of a
race-of-victim effect. But it did not discuss the implication that
Georgia's law might be unconstitutional because it failed to protect
black potential victims. Thus the court wrote:
"Similarly, since McCleskey's claim relates to the
race of his victim, other claims could apply with equally logical
force to statistical disparities that correlate with the race or sex
of other actors in the criminal justice system, such as defense
attorneys." (McCleskey v. Kemp, Majority opinion by Justice
Powell.)
Such claims would not apply "with equally logical force" if
McCleskey's claim was seen as based on equal protection for victims
from crime via deterrence. There is a very large and obvious
difference between failing to protect someone from being murdered and
failing to protect someone from not being hired as a defense
attorney.
It is possible that the Court ignored the issue on grounds of lack
of standing; McCleskey was a murderer, not a victim. But, as the recent
case of Powers v. Ohio, 111 S.Ct. 1364 shows, a convicted
criminal can sometimes succeed in raising a ius tertii
defense-a defense based on the violation of someone else's rights.
In oral argument, counsel for the petitioner dealt with the issue
of standing by putting the argument in terms of unfairness to a
defendant who had killed a white, not in terms of unfairness to black
potential victims:
Question: But I am not sure how that supports a claim
of discrimination against the defendant.
Mr. Boger: Well, if the question is one, if you would, of
standing, a defendant - if I have two defendants at my right hand,
and two at my left, and the two at my left have murdered blacks,
surely my defendants on the right hand would have standing if Georgia
had a statute that made killing a white person a more serious crime.
They'd say that's unconstitutional. That's an invidious
discrimination.
(Landmark Briefs and Arguments of the Supreme Court of the United
States, Philip B. Kurland and Gerhard Casper, Editors, vol. 171. p.
970.)
Alternatively, the Court may have ignored the issue because it was
not demonstrated that the discriminatory outcome was a result of
discriminatory intent; see Village of Arlington Heights v.
Metropolitan Hous. Dev. Corp., 429 U.S. 252, 264-66 (1977);
Washington v. Davis, 426 U.S. 229 (1976); cf. Oyler v.
Boles, 368 U.S. 448, 456 (1961) (selective enforcement of
habitual criminal statute does not violate equal protection clause
absent discriminatory intent), all cited in Gross (1985) at p. 1284
fn 43. This seems to have been the grounds for rejecting a 14th
amendment claim in Spinkellink v. Wainwright, 578 F.2d 582 (5th
Cir. 1978), cert. denied, 440 U.S. 976 (1979).
[68] Higher civil damages for the wrongful
death of more valuable victims might be justified as fair
compensation, avoiding the issue of unequal protection. If the
objective of criminal punishment is deterrence, then using selective
punishment to produce selective deterrence implies that the law is
deliberately choosing to protect some potential victims more than
others.
One possible response would be to argue, along lines suggested
earlier, that although selective deterrence aimed at better
protection for richer or better educated victims was unconstitutional
on equal protection grounds, selective deterrence in favor of victims
whose deaths would impose severe costs on other people was not. From
this standpoint, imposing a more severe penalty on the murderer of a
mother with three children than on the murderer of a bachelor is
analogous to imposing a more severe penalty on someone who kills one
person and severely injures three others than on someone who simply
kills one person.
[69]
"Prosecutors and juries would also be authorized to
find that the lives of homeless people, prostitutes, the politically
unpopular, or others who are different are not "worth" as much as
other members of society. See, e.g., Belkin, Texas Judge Eases
Sentence for Killer of 2 Homosexuals, N.Y. Times, Dec. 17, 1988,
section 1, page 8, col. 5 (thirty-year sentence for murders of two
homosexuals explained by: `I put prostitutes and gays at about the
same level. And I'd be hard put to give somebody life for killing a
prostitute.')" from the amicus brief of the SCLC in
Payne.
Stevens: "Should the defendant be allowed to bring out evidence
that the victim was an unworthy person?"
Burson (Attorney General of Tennessee): "No. That would invite
`open season' on victims." (59 LW 3762, 5-14-91.) His comment is
given at greater length in the transcript as: "For instance, a state
may well conclude that to allow a defendant to put on a negative
social impact evidence without the state opening it up, that that, in
essence, would invite open season on victims." (p. 36 of the
transcript).
The implication is that the defense could counter prosecution
evidence about the characteristics of the victim, but could not
introduce such evidence unless the prosecution did. There are similar
comments by Richard Thornburgh on pages 47 and 49 of the transcript.
[70] From the standpoint of economics, if not
of justice, this is a problem of jury error. There is nothing
inefficient about a system where the punishment for ending a life
with little value-say the life of someone who is dying from cancer
and has only a few days left-is relatively low. The problem is that
the jury may be measuring, not how much the victim's life is worth,
but how much it is worth to the jury-which is a very different thing.
"Justice Souter ... asked ... isn't the real problem of getting into
victims' character that society is involving itself in evaluating the
comparative value of victims' lives?" (59 LW 3762, 5-14-91.)
[71] A similar issue came up in a later Supreme
Court case, involving offenders rather than victims. In Dawson v.
Delaware the Court, in an 8-1 vote, ruled that certain negative
information about the offender (his membership in a prison gang
called Aryan Brotherhood) could not be introduced in the sentencing
stage of the trial, even though all positive evidence could be. The
Court based its decision on First Amendment grounds, but Justice
Thomas, the lone dissenter, argued that the case created a double
standard allowing defense lawyers to point out good associations but
forbidding prosecutors from pointing out bad ones.
[72] This might not be the case if the
prosecution, as well as the jury, held the victim in low regard and
therefore wanted the defendant to get off as easily as possible. But
in such a situation it seems unlikely that any prosecution would
occur or, if it occurred, would result in a conviction.
[73] Arguably, such a rule would be
inconsistent with due process. Richard Thornburgh, Attorney General
of the U.S., argued during the case that it would be constitutional
to allow the prosecution to introduce a victim impact statement and
not to allow the defense to do so (p. 47 of the transcript) and that,
in the absence of any specific state law on the subject, the
defendant should not be allowed to raise the issue of victim
characteristics (p. 49).
[74] I say "about X/2" because a jury that is
trying to impose the efficient level of punishment will be making a
calculation more complicated than simply averaging victim rankings;
the argument could be made more precise but at considerable cost in
clarity.
[75] Since the jury knows it has not been given
evidence on the characteristics of the victim, only victims for whom
the prosecutor would not have presented evidence on victim
characteristics are possible victims, so only they are averaged over.
[76] For instance, jurors may be more or less
willing to impose a given average level of punishment if they believe
that it will go selectively to those who have killed particularly
valuable victims. One could imagine a juror who considered it very
important to deter killers of young mothers, but only moderately
important to deter killers of old men. With no information on victim
characteristics he would favor the death penalty for all murderers,
in order to get a sufficiently high level of deterrence against those
who killed young mothers.
[77] This assumes that the jury knows the
behavior of prosecutors in general but not of each individual
prosecutor-the X relevant to the jury's decision is an average of the
values for the different prosecutors. Without that assumption, the
situation is equivalent to having a single prosecutor for all cases.
[78] This conclusion is strengthened by the
likelihood that a jury, during the course of a trial, will acquire
information about the victim even if it is not introduced in the
context of determining punishment.
[79] The argument can be generalized to any
case in which juries are expected to do a bad job of evaluating the
value of the lives of victims. If one believes, as the minority in
Payne v. Tennessee and the majority in Gathers and
Booth perhaps did, that this is the normal case, one will
naturally be suspicious of both selective deterrence and the
Payne result.
[80] Readers whose initial response is that we
should never make such tradeoffs on any terms may wish to ask
themselves whether they would favor spending an additional hundred
billion dollars a year on the court system if the result was to
eliminate one false, and thus unjust, conviction for illegal parking.
[81]
"if evidence of the full range of harm caused by a
defendant is truly irrelevant because it does not inform the
sentencer of the defendant's mental state, then it should be equally
irrelevant in all criminal cases. While the severity of the penalty
in capital cases requires greater procedural safeguards, the
qualitative difference in penalty cannot justify any difference in
the substantive determination of whether a particular class of
evidence is relevant." (Charles W. Burson, Attorney General of
Tennessee, for respondent in Payne.)
This issue is discussed in "The Significance of Victim Harm:
Booth v. Maryland and the Philosophy of Punishment in the
Supreme Court," Richard S. Murphy, 55 Univ. Chi. L.Rev. 1303 (1988).
The author argues that, on a retribution theory of punishment, the
harm the criminal actually caused is irrelevant to the punishment he
deserves, and "Hence, the Supreme Court's decision in Booth,
by holding that victim impact statements are per se irrelevant
to the capital sentencing decision, is completely consistent with and
in fact required by the retributivist model of punishment." He goes
on to argue that "The weakness in the Booth Court's reasoning
is that it fails to recognize that the criminal law categorizes
punishments according to the actual results. Thus, to reject the
degree of harm inflicted as irrelevant, when divorced from the
defendant's intentions, is to reject a principle that pervades the
criminal justice system." He concludes that "as a matter of
constitutional interpretation the Court is misguided" in its
rejection of utilitarian theories in favor of retributive theories of
punishment.
As one example of how strong the intuition of "punishment by moral
desert rather than by consequences" seems to some, consider the
following quote from an English legal philosopher:
"The penalties for attempts used to be lower than
those for successful crimes, and although this is no longer so in
England, courts are still apt to take a more lenient view of them,
illogical as this is. As for harms which are knowingly risked--for
example by motorists who drive `recklessly,'--sentencers usually take
a more lenient view of them if they do not actually happen (again the
logic is questionable)." Nigel Walker, Why Punish, p. 96, Oxford
University Press, Oxford 1991.
In an earlier and more moralistic statement of the case for
punishment as a response to wickedness, Sir James Fitzjames Stephen
wrote:
"Everything which is regarded as enhancing the moral
guilt of a particular offense is recognized as a reason for
increasing the severity of the punishment awarded to it... The
criminal law thus proceeds upon the principle that it is morally
right to hate criminals, and it confirms and justifies that sentiment
by inflicting upon criminals punishments which express it.
I think that whatever effect the administration of criminal
justice has in preventing the commission of crimes is due as much to
this circumstance as to any definite fear entertained by offenders of
undergoing specific punishment." ("Of Crimes in General and of
Punishments," from History of the Criminal Law of England, Vol. II,
pp. 75-93, included in Crime, Law and Society, readings selected by
Abraham S. Goldstein and Joseph Goldstein, The Free Press, London,
1971, pp. 22-23.)
The second paragraph of the quote provides an old and important
prudential argument for what is elsewhere presented as a moral
principle. By punishing (and hating) the wicked we teach people to be
less wicked.
[82] This point is discussed by H.L.A. Hart:
"The almost universal tendency in punishing to
discriminate between attempts and completed crimes rests, I think, on
a version of the retributive theory which has permeated certain
branches of English law, and yet has on occasion been stigmatized
even by English judges as illogical. This is the simple theory that
it is a perfectly legitimate ground to grade punishments according to
the amount of harm actually done, whether this was intended or not;
`if he has done the harm he must pay for it, but if he has not done
it he should pay less.' To many people such a theory of punishment
seems to confuse punishment with compensation ... . Why should the
accidental fact that an intended harmful outcome has not occurred be
a ground for punishing less a criminal who may be equally dangerous
and equally wicked? I may be wrong in thinking that there is so
little to be said for this form of retributive theory. It is
certainly popular ... ." (H.L.A. Hart, Punishment and Responsibility
130-131 (1968).)
He comments further on the issue of making punishment depend on
ex post outcome rather than ex ante expectations, in
the context of liability for negligence, on pp. 134-5, again without
finding any justification for the existing law.
[83]
"Obviously this apportionment of punishment [for
attempt] can be explained only by an assumption that to some extent
it is designed for retribution. If the law's purpose were merely
preventive, it would apply to the act done the same consequence,
regardless of whether the act were successful or unsuccessful, since
its objective would be the prevention of acts likely to result in
harm. The fact that the punishment for success is twice as severe as
the punishment for an unsuccessful attempt must mean that the
additional suffering consequent upon success is a matter of expiation
of retribution because of that success." J. Waite, The Prevention of
Repeated Crime 8-9 (1943)
Waite's claim that a system of punishments designed only for
deterrence must impose the same punishment for an unsuccessful
attempt as for a completed crime is wrong. To see why, apply the
analysis of optimal punishment for the killing of high value and low
value victims given above to the case of murder (high
injury--corresponding to killing a high value victim) and expected
attempted murder (low injury--corresponding to killing a low value victim). The
analysis is the same, so the conclusion, that it may be optimal to
base punishment on outcome ex post instead of expected outcome
ex ante, remains. Waite's point is valid, however, if we take
it as demonstrating that differential punishment for attempts, if
based on desert rather than deterrence, implies that desert is
affected by consequences.
The Model Penal Code provides, in section 5.05(1), that "Except as
otherwise provided in this Section, attempt, solicitation and
conspiracy are crimes of the same grade and degree as the most
serious offense which is attempted or solicited or is an object of
the conspiracy. An attempt, solicitation or conspiracy to commit a
[capital crime or a] felony of the first degree is a felony of the
second degree." Putting aside the exception for first degree
felonies, this is consistent with the idea that punishment should
depend only on intent, not outcome. Similarly, in discussing
aggravating circumstances that may justify the death penalty, the
Code does not seem to include the outcome of the crime, except
insofar as it is foreseeable. Section 210.6(3)(h) gives, as an aggravating
circumstance, that "the murder was exceptionally heinous, atrocious
or cruel, manifesting exceptional depravity." The final requirement
would seem to exclude a murder that was especially heinous for
reasons of which the murderer was unaware when he committed the
crime. (Model Penal Code Official Draft and Explanatory Notes,
American Law Institute 1985).
[84] Stephen Schulhofer has raised this issue
with regard to torts as well as crimes:
"Theoretically, it would be more appropriate for
everyone to pay into an insurance fund a premium based on the risks
he creates in the course of his activities. Those who suffer injury
would then seek compensation from the fund rather than attempting to
impose the entire loss on the negligent defendants who happened to
cause their particular injuries. ... In the absence of such a
framework, however, the law of torts can properly treat those who
cause harm differently from those who do not, in order to allocate
fairly the loss which has befallen the victim. This allocation of the
loss, fortuitous as between risk-creators, is preferable to an
allocation of the loss which would be fortuitous as between faultless
victims." (Stephen Schulhofer, "Harm and Punishment: A Critique of
Emphasis on the Results of Conduct in the Criminal Law," U. of PA
Law Review, 122 p. 5964, fn 64.)
The final conclusion seems problematic. If a driver is only liable
for risk and not for result, then imposing costs on him beyond the
amount of the risk is no more just than imposing them on another
driver, or on the victim. The fact that this seems sharply contrary
to our moral intuition strikes me as evidence against the thesis
that, in determining the obligations of those who have done damage,
justice requires that we ignore consequences insofar as they are due
to events beyond the actor's control.
[85] See Bernard Williams, "Moral Luck," in
Proceedings of the Aristotelian Society, supplementary Vol. L
(1976) pp. 115-35 and, in a slightly revised version, as Chapter 2 of
Bernard Williams, Moral Luck, Cambridge University Press,
Cambridge 1981.
[86] Immanuel Kant, Foundations of the Metaphysics of
Morals, first section third paragraph.
[87] Adam Smith, The Theory of Moral
Sentiments, Part 2 Section III Introduction.
[88] Smith, op.cit., Part 2 Section III
Chapters 1 and 2.
[89] Smith, op. cit., Part 2 Section III
Chapter 3.
[90] One striking difference between Smith's
discussion of these issues and more modern discussions is that Smith
is concerned not with the possibility that punishment by desert will
provide arguments against punishing with special severity those who
(happen to have) committed crimes with particularly heinous
consequences but with the possibility that it will provide arguments
for punishing those who have not committed crimes but might, under
other circumstances, have done so. He writes that, if we resented
intentions as strongly as we resent actions, "Sentiments, thoughts,
intentions, would become the objects of punishment; and if the
indignation of mankind run as high against them as against actions;
if the baseness of the thought which had given birth to no action,
seemed in the eyes of the world as much to call aloud for vengeance
as the baseness of the action, every court of judicature would become
a real inquisition. There would be no safety for the most innocent
and circumspect conduct. Bad wishes, bad views, bad designs, might
still be suspected ... ." (Smith, op. cit., Part 2 Section III
Chapter 3).
[91]
"Finally, some have thought that the gravity of
sinfulness ought to enter into the measure of crimes. The fallacy of
this opinion will at once appear to the eye of an impartial examiner
of the true relations between men and men, and between men and God.
The first are relations of equality. ...The second are relations of
dependance on a perfect Being and Creator, who has reserved to
himself alone the right to be legislator and judge at the same time,
... . If he has established eternal punishments for anyone who
disobeys his omnipotence, what insect is it that shall dare to take
the place of divine justice, ... . The weight of sin depends on the
inscrutable malice of the heart, which can be known by finite beings
only if it is revealed. How then can a norm for punishing crimes be
drawn from this? Men might in such a case punish where God forgives,
and forgive where God punishes." (Cesare Beccaria, On Crimes and
Punishments, 1764, pp.65-66. Henry Paolucci translator, Bobbs
Merrill, Indianapolis 1963.)
[92] But not for Holmes, who wrote: "On the one
side is the notion that there is a mystic bond between wrong and
punishment; on the other, that the infliction of pain is only a means
to an end..." (Oliver Wendell Holmes, Jr., The Common Law, pp.
41-51) and, arguing from the fact that the reasonable man standard
makes a less than reasonable defendant liable even though his action
is not blameworthy, "If the foregoing arguments are sound, it is
already manifest that liability to punishment cannot be finally and
absolutely determined by considering the personal unworthiness of the
criminal alone." (p. 32).
[93] In Chapter 3 ("Moral Luck") of Thomas
Nagel, Mortal Questions, Cambridge University Press, Cambridge
1979.
[94] Robert Nozick, Anarchy, State and Utopia, Chapter
7, especially pp. 155-164, where Nozick puts the distinction in terms
of patterned principles (of which "to each according to his moral
merit" is one example) vs entitlement principles.
[95] I am ignoring here difficult questions of
evidence and causality which are important to the workings of real
legal systems but not, I think, to the point made here.
[96] Or else somebody must bear the cost of
having a car that has been damaged and not fixed.
[97] This need not imply a system based
entirely on outcomes-we still need some way of deciding who bears the
costs, and intention may be one way to decide. My point is only that
in a society facing a budget constraint outcomes become morally
relevant, if only because they constrain the range of possible
allocations. Whatever we may all deserve, once my car has been
destroyed either I do not have a car or someone pays to buy me
another.
[98] A different approach to the problem of
justifying different punishments for offenders who may be equally
wicked is implied by Norval Morris' position that desert sets upper
and lower bounds to appropriate criminal punishment, within which
other considerations may determine actual punishment. If murderers
deserve to die in Morris' sense-if, in other words, capital
punishment is not unjust even though not morally required-then the
court is free to decide on other grounds which murderers are to be
executed. If the court's information about what murderers knew when
they committed their crimes is imperfect, then the arguments given
here in favor of selective deterrence provide a reason for executing
those murderers who have done the most damage, even if we believe
that some of them did not know, ex ante, how much damage they
were doing.
"By a limiting principle of punishment I mean a
principle that, though it would rarely tell us the exact sanction to
be imposed, as deterrence might, would nevertheless give us the outer
limits of leniency and severity which should not be exceeded. Desert,
I will submit, is such a limiting principle." (Norval Morris,
"Punishment, Desert & Rehabilitation" in Equal Justice Under Law:
U.S. Dept of Justice Bicentennial Lecture Series, 1976. pp. 5-6 (pp.
141-2 in the collected papers version of the lectures.))
There is also a lengthy literature which tries to base appropriate
punishments on a retributive principle, with a variety of different
justifications and consequences. A recent example is Margaret Falls,
"Retribution, Reciprocity, and Respect for Persons," in Law and
Philosophy 6 (1987), pp. 25-51; the author writes that "Criminal deeds
differ in the degree to which they involve morally relevant factors
like harm to others, violation of rights, and perhaps wickedness of
intent." (p. 45). She offers a theory of retributivism-punishment
based on the seriousness of the crime-based on the argument that "One
of the most fundamental duties of treating people as autonomous moral
decisionmakers is to hold them responsible for their acts."