The Paradox of Criminal History

Criminal history is all-important in the criminal and immigration systems. But these systems have little substantive information about past crimes. This creates a paradox. A person’s past convictions dictate whether they will face new criminal charges, make bond, suffer a lengthy sentence, or be targeted for deportation, among many other consequences. Yet, despite the vital role that criminal history plays in these decisions, judges and prosecutors know very little about the prior crimes of the people they process. Factually rich accounts of a person’s convictions are rarely available. The system instead relies on rap sheets that record only basic facts—the charge, the date of conviction, and the nominal sentence.

Because of this information poverty, the criminal and immigration systems employ criminal history heuristics when determining the consequences of prior convictions. Such heuristics include the number of past convictions, the types of crimes charged, and the apparent sentences. These heuristics are inputted into mechanical formulas like “three strikes” laws, sentencing guidelines, and bail algorithms. Such formulas translate past conviction information into often-severe consequences like deportations and mandatory minimum sentences. This mechanistic way of using criminal history creates many serious problems in our system. It causes irrational and unjust case outcomes, renders the system arbitrary to the people being processed, exacerbates systemic racism, and makes access to a competent lawyer vital. This Article diagnoses these problems and proposes a variety of possible reforms.

Introduction

Ms. L is a nineteen-year-old lawful permanent resident who lives in Seattle, Washington. In 2009, she is arrested for shoplifting a pair of pants from a J.C. Penney. She is charged with the crime of theft in the third degree, which is the least serious theft crime in Washington.1 Ms. L’s court-appointed lawyer tells her that she should plead guilty to this charge, because it is a misdemeanor and the sentence will be quite short. She follows this advice, and the judge sentences her to probation with one day in jail and a suspended sentence of 365 days. This kind of sentence is standard in Washington for misdemeanors: judges routinely impose a 365-day suspended sentence so that they can send the defendant to jail for any subsequent probation violations, and it is extremely rare for the defendant to end up serving the full sentence.2 Everyone at the sentencing hearing (the judge, the prosecutor, Ms. L’s lawyer, and Ms. L herself) knows that she is only going to serve one day in jail and that the rest of the sentence will not be imposed.3

Now let us fast forward. After Ms. L is released from jail on her theft conviction, she is sent into deportation proceedings. Here, she has no court-appointed lawyer. There is only an “immigration judge” (an executive branch employee), and a lawyer for the Department of Homeland Security (DHS) arguing that she should be stripped of her residency and deported. At Ms. L’s hearing, the DHS lawyer presents a rap sheet printed out from a state database.4 This rap sheet shows the following entry:

Arrest Date: 5/23/2009

Charge: THEFT, RCW 9A.56.050 (MISDEMEANOR)

Disposition: CONVICTED

Sentence: 365 DAYS

No other information about the case is presented. Because this conviction has an announced sentence of one year, and because it is a theft offense, it qualifies as an “aggravated felony” under federal immigration law.5 This means that Ms. L will lose her residency and be deported.6 The hearing lasts ten minutes.7

This case is a typical one in our system. Superficial features of Ms. L’s conviction triggered enormous downstream consequences. She lost her residency and was deported because the statute of conviction (Washington third degree theft) qualifies as a “theft offense,” and the announced sentence was at least one year, making it an aggravated felony.8 If the suspended portion of her sentence had been one day shorter, or if she had pled guilty to a more serious felony charge with no suspended sentence, she would have kept her residency.9 It did not matter that the actual crime was petty, nor did it matter that the judge who imposed the sentence intended for her to serve only one day in jail. The DHS lawyer and immigration judge likely did not even know these facts about her prior conviction. Even if they did, it would not have mattered because suspended sentences count when deciding whether a prior conviction is an aggravated felony.10 Ms. L lost her residency because the superficial features of her state conviction matched a category of crime in federal immigration law.
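The test that decided Ms. L’s case can be written out as a short mechanical check. The Python sketch below only illustrates the logic described above and is not a statement of the statute. The names `RapSheetEntry` and `is_theft_aggravated_felony`, the category label "theft", and the encoding of "at least one year" as 365 days are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

# Assumption: "at least one year" is modeled as 365 days, as in Ms. L's case.
ONE_YEAR_DAYS = 365


@dataclass(frozen=True)
class RapSheetEntry:
    offense_category: str  # hypothetical label, e.g. "theft"
    announced_days: int    # sentence as announced, suspended time included
    served_days: int       # time actually served; the test never reads this


def is_theft_aggravated_felony(entry: RapSheetEntry) -> bool:
    """Return True when the entry's surface features match the category."""
    return (
        entry.offense_category == "theft"
        and entry.announced_days >= ONE_YEAR_DAYS
    )


ms_l = RapSheetEntry("theft", announced_days=365, served_days=1)
one_day_shorter = RapSheetEntry("theft", announced_days=364, served_days=1)

print(is_theft_aggravated_felony(ms_l))             # True
print(is_theft_aggravated_felony(one_day_shorter))  # False
```

The check never consults `served_days` or any fact about the underlying conduct, which is why a one-day difference in the announced sentence changes the outcome.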

Why does the system work like this? Surely a more sensible approach would be for the immigration judge to look into what Ms. L actually did and then decide whether shoplifting pants merits deportation. But instead, criminal and immigration courts use mechanical procedures like this to decide whether certain aspects of prior convictions (like the charge or the sentence) will trigger certain consequences. They do so because of a basic paradox: criminal history matters immensely in the criminal and immigration systems, but these systems know little of substance about prior crimes.

In the United States, a person’s past convictions determine what happens to them at every stage of a criminal and immigration case.11 Prior convictions decide whether a person will be charged with a crime in the first place, what crime they will be charged with,12 whether they will be released pending trial, whether a mandatory minimum sentence will apply, what the ultimate sentence will be, and even whether they qualify for post-conviction relief.13 Every major decision that judges and prosecutors make is dictated, in significant part, by the criminal history of the accused. This is also true of our immigration system. A non-citizen’s prior criminal convictions determine whether they can apply for immigration status, whether they qualify for deferred action, whether they are deportable, whether they qualify for any forms of relief from deportation, and whether they will be targeted for immigration enforcement in the first place.14

But ours is not a legal system that retains much information about past crimes. In the standard criminal case, no substantive information about the defendant’s conduct is established beyond what can be inferred from the charge and sentence. Thus, those pieces of information are all that the system can rely on in future cases. Judges in the United States do not conduct meaningful fact-finding in criminal cases. Nearly all convictions are established by the defendant declaring her- or himself guilty, and not by an official government-conducted accounting of the facts. To the extent that a judge does any fact-finding, it is usually limited to verbally confirming the defendant’s guilty plea. There is normally no police report or presentence report lodged in the record, nor is there any other factually thick narrative of what the defendant actually did. In most cases the only public records that a court keeps are the charging documents, the plea documents, and the minutes of each of the court appearances. These records are costly to obtain and contain little substantive information.

Consequently, our criminal and immigration systems know surprisingly little about the criminal histories of the people they process. The judges and lawyers usually only have access to rap sheets that are produced by law enforcement databases. Such databases make our system’s reliance on criminal history possible. They are used by government agencies to quickly produce rap sheet printouts containing the dates of prior arrests, the statutory sections that were charged, and information about the sentences imposed.15 These rap sheets do not have any substantive details about what actually happened in a prior case aside from the charge, the sentence, the date of arrest, and whether or not the case ended in a conviction (sometimes even some of this information is lacking). And in most cases, these rap sheets provide all the information that the prosecutors, judges, and defense lawyers will ever see about a defendant’s prior convictions.

These two facts—that past convictions are all-important in our system,16 and that we know precious little about them—give rise to cases like Ms. L’s. For the most part, our criminal and immigration systems do not rely on substantive information about a person’s actual past conduct. Instead, they rely on heuristics about past convictions. In the context of this Article, the term “heuristic” refers to partial information about a past case that is used both to infer what actually happened and to impose consequences in the current case. The most common such heuristics are the nature of the convicted charge (e.g., its elements, whether it is a felony or a misdemeanor, the maximum possible punishment), the sentence announced by the judge, and the number of prior cases. These heuristics are operationalized through what this Article will term “mechanical formulas”—procedures that translate particular heuristics about past cases into mandatory consequences for the current case. One such mechanical formula is an intermediate crime category, like “strike,” “aggravated felony,” or “crime of violence,” that is used to give people harsher punishments if any of their past crimes contain certain elements.17 Another mechanical formula is the procedure that most sentencing guidelines systems use to count criminal priors: assigning a numerical value to each prior conviction based on the charge or the sentence, and adding those numbers up to put the defendant in a certain “criminal history category.”18
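The counting procedure described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions. The point values, the category cutoffs, and the function names are invented for the example and do not reproduce any jurisdiction’s actual guidelines.

```python
from typing import List

# Illustrative assumptions only: these point values and cutoffs are invented
# for the sketch and do not reproduce any real sentencing guidelines.


def points_for_prior(announced_days: int) -> int:
    """Assign points to one prior conviction from its announced sentence."""
    if announced_days >= 365:
        return 3
    if announced_days >= 60:
        return 2
    return 1


# (minimum total points, category label), checked from highest to lowest
CATEGORY_CUTOFFS = [(6, "III"), (3, "II"), (0, "I")]


def criminal_history_category(prior_sentences_days: List[int]) -> str:
    """Add up the points for every prior and map the total to a category."""
    total = sum(points_for_prior(days) for days in prior_sentences_days)
    for minimum, label in CATEGORY_CUTOFFS:
        if total >= minimum:
            return label
    return "I"


# One prior with a 365-day announced sentence plus one 30-day prior:
print(criminal_history_category([365, 30]))  # II  (3 + 1 = 4 points)
print(criminal_history_category([]))         # I   (no priors, 0 points)
```

The only inputs are the number of priors and their announced sentences. Under these assumed values, a 365-day suspended sentence like Ms. L’s would score the same as a year actually served.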

These heuristics and mechanical formulas are fundamental features of American criminal and immigration law. Legislatures regularly incorporate them into new laws, and the courts regularly decide important cases that define their effect. For example, just last year Congress passed the “First Step Act,” a law intended to reduce sentences in the federal criminal justice system.19 This law codified several intermediate crime categories, including “serious drug felony,” “serious violent felony,” and “prior 2-point violent offense.”20 Also in just the last year, the Supreme Court decided a number of important legal questions concerning whether certain criminal charges fit into certain mechanical formulas. In Nielsen v. Preap, the Court decided that aliens who are convicted of crimes categorized as “deportable” can be detained indefinitely in the immigration system.21 In Stokeling v. United States, United States v. Stitt, and United States v. Sims, the Court decided that burglary in Arkansas, burglary in Tennessee, and robbery in Florida all qualify as “violent felonies” under the Armed Career Criminal Act.22 In 2018, the State of California enacted a law ending cash bail and replacing it with a system where defendants with certain prior convictions (including “serious felonies” and “violent felonies”) will be presumptively detained without bail.23

This Article explores the consequences of proceduralizing the use of criminal history in this way.24 Our system abstracts away from the actual things that a criminal defendant did in the past. It focuses instead on the cosmetic features of prior court cases, like the statutory section charged and the sentence imposed. Legislatures and sentencing commissions enact, judges interpret, and prosecutors enforce mechanical formulas that rely only on these heuristics. This turns the use of criminal history by prosecutors and defense lawyers into a highly technical, legalistic enterprise. It limits the ability of criminal history to perform the one function that might justify its use—to reveal substantive information about the person who is before the court. Proceduralizing the inquiry divorces it from the questions of moral culpability and future dangerousness that motivate the reliance on prior convictions in the first place.

Mechanical formulas also create a system that processes people in starkly different ways based on arbitrary differences in their records. Enormous harms, and enormous breaks, turn on superficial features of a person’s prior convictions. Whether someone is deported, detained without bail, or sentenced to a twenty-year mandatory minimum turns on things like the precise wording of the statute they were previously convicted under, or obscure details about their prior sentence.25 This creates a number of perverse consequences. It makes the system a minefield that destroys some and spares others, while failing to give the people it processes any morally coherent account of why one must suffer a greater punishment than another. It makes finding a smart and thorough lawyer incredibly important in a system where nearly all criminal defendants rely on a court-appointed lawyer, and nearly all immigration defendants have no lawyer at all. It also exacerbates racial discrimination. Mechanical formulas often prioritize the number of past cases over substantive facts about those cases, and to the extent that minority racial groups are policed more aggressively, they will have quantitatively more prior cases. Given these serious problems, our system should either fundamentally change the way it uses criminal history, or else not rely on criminal history to the degree it currently does.

This Article’s argument proceeds in six Parts. Part I catalogues the many ways criminal history matters in our system. It shows that criminal priors determine prosecutors’ charging decisions, whether a person is released pre-trial, the sentence that is ultimately imposed, and the availability of post-conviction relief like good-time credit, expungement, charge reduction, or parole. It also shows that criminal priors dictate immigration outcomes—whether an immigrant will be targeted for enforcement, receive status, lose status, be eligible for relief, get released on bond, and qualify for deferred action, among many other consequences.

Part II considers why we punish people for crimes they have already been punished for in the past. It concludes that the only defensible justification for this practice is that past crimes reveal substantive information about a person and their decision-making that merits a higher punishment. Therefore, a system that relies on criminal history can be justified, if at all, only to the extent that the criminal history actually reveals information about the person charged.

Part III examines how our system came to rely on heuristics about criminal history. It begins by exploring how few substantive facts are actually recorded in most criminal cases. The informality of our system—its overwhelming reliance on defense waivers and guilty pleas—prevents it from conducting official fact-finding. Our system instead depends on rap sheet databases maintained by the FBI and various state agencies to track people’s criminal histories over time. These databases record only the arrest, the conviction, the charged crime, and the imposed sentence. This in turn forces legislators, sentencing commissioners, judges, and other criminal justice policymakers to rely on heuristics drawn from the limited information that is available in those rap sheets. And this reliance on heuristics obscures what happened in the underlying case. It washes away all of the context and substantive detail, substituting code sections and often inaccurate sentence information. In many situations, this leads to systematic misinterpretation of prior cases.

Part IV turns to mechanical formulas, which are used to translate heuristics about past cases into non-discretionary outcomes for the current case. It begins by examining how legislatures create mechanical formulas like intermediate crime categories (e.g., “strikes” and “aggravated felonies”). It then explores the way state and federal sentencing commissions use criminal history, mostly by counting up prior convictions to create a criminal history score that decides the recommended sentence. Finally, it discusses how mechanical formulas require formalistic procedures for translating crimes across jurisdictions, such as the categorical approach.

Part V develops a broad critique of how our system uses heuristics about past cases, and mechanical formulas built upon those heuristics, to punish and deport people. This set of institutional practices avoids substantive information about prior cases, undermining the justification for relying on criminal history in the first place. It also renders the system arbitrary to the people it processes, discriminatory against groups with more criminal justice contacts, and favorable to those who can hire clever and well-resourced immigration or defense lawyers. In its most nightmarish version, a system based on mechanical formulas could become devoid of human moral judgments altogether.

Part VI suggests a number of directions for reform. It first explores two large-scale fixes—generating substantive information about criminal history and declining to use criminal history at all. It then considers a second dimension of reform—giving discretion to judges and other decisionmakers, thereby moving from a rules-based weighing of criminal history to a standards-based one. Last, it describes a number of reforms that can be pursued in the current system. These include eliminating the convention of “time served” sentences, giving defendants more power to retroactively challenge their convictions and sentences, and writing state criminal laws in a way that anticipates their immigration consequences.

There is a growing scholarly literature on the uses and misuses of criminal history. This literature focuses on the problem that our system uses criminal history too frequently and too harshly. Some authors argue that records of prior convictions are too widely available and carry too many collateral consequences, preventing people from reintegrating into society after they have served their punishments.26 Some argue that sentencing enhancements for prior convictions are much too harsh to be justified.27 Some criticize the very idea of using past crimes to determine the level of punishment.28 There has also been a recent scholarly focus on the unique problems with misdemeanor courts, where people are processed and marked by criminal records without being able to meaningfully contest their charges.29 And a number of scholars have criticized the immigration system’s overreliance on criminal history.30

The present Article expands this scholarly conversation by addressing not just our overuse of criminal history, but also our flawed, mechanical method for using it. This Article’s first contribution is to describe the actual process by which criminal history is used in our criminal and immigration systems, and the surprisingly small amount of information they rely on. Among other things, this provides a deeper institutional explanation for these systems’ dependence on formalistic tests like the categorical approach, which has fallen under heavy criticism by judges and sentencing commissions.31 It also provides crucial insights into the interface between the immigration and criminal justice systems, namely that the former takes partial, and often inaccurate, snippets of information from the latter and uses them to decide whether a person will be deported. The Article’s second contribution is to excavate the problems that emerge from the use of heuristics and mechanical formulas to decide people’s fate. It catalogues the injustices caused by our system’s limited recall of criminal history, which include severe arbitrariness and racial disparities. The Article’s third contribution is that it develops a toolkit for reform. It elaborates a set of proposals, some ambitious and others modest, that would make our approach to counting criminal history more rational and less arbitrary.

I. Prior Convictions Matter Pervasively in the Criminal and Immigration Systems

Prior criminal convictions dictate what happens to a person as they are processed through the American criminal and immigration systems. From the moment that a person encounters the criminal justice system, their prior record is their destiny.32 Past convictions determine what charges will be brought against a person (if any), whether a person is held in custody or released on bond, what their punishment will be, and whether post-conviction relief is available. Past convictions are also decisive at every stage of the immigration system. They determine whether a person will be targeted for immigration enforcement, keep or lose their immigration status, bond out of immigration detention, be permitted to naturalize as a U.S. citizen, qualify for relief from deportation, and face prosecution for immigration crimes. The examples in this Part are illustrative, and certainly not comprehensive. They are meant to show the extreme degree to which American law relies on past convictions when deciding people’s fate.33

The first major decisions a prosecutor makes in a criminal case are whether to charge a person with a crime, and if so, what crime to charge them with.34 In making these decisions, prosecutors rely on a person’s criminal history in a variety of ways. First, many criminal charges contain as an element that the person has been convicted of a prior crime. To take just a few examples: the federal government and a number of states criminalize weapons possession by people with certain prior convictions;35 in many states a second drunk driving offense is a separate, more serious crime;36 and in the federal system, a second illegal entry conviction can be charged as a more serious felony.37 For crimes such as these, the existence of a prior conviction opens the defendant up to prosecution in the first place.38 Second, prosecutors have discretion in every case to decline to charge the accused with a crime, and often also have discretion to refer the accused to a diversion program or some other alternative to the criminal justice system.39 Such decisions are strongly influenced by the prosecutor’s understanding of the defendant’s criminal history.40 Third, a prosecutor has discretion over what charges to bring in a criminal case. In many circumstances there is both a more severe and a less severe charge that cover the same conduct. For example, frequently there is both a felony and a misdemeanor charge that would apply to the same act.41 Prosecutors use a defendant’s criminal history in such circumstances to help decide which charge they will bring.42

The next major decision that is made in a criminal case is whether the defendant will be released on bond or remain in custody. Being released on bond alters the course of a criminal case.43 People who are released pending trial are subject to less pressure to take a plea deal, and are more able to fight their charges and obtain a better outcome.44 Criminal history is important to judges when they decide whether to set bail, and if so, what amount to set. In a standard bail hearing the judge and the prosecutor work from a rap sheet showing the defendant’s criminal history, and they give great weight to the defendant’s prior record (defense lawyers are not present at bail hearings in many jurisdictions; where they are present, they too rely on rap sheets).45 Further, in many states there is a statutory requirement that criminal priors be considered when determining bail, and some laws instruct that people with certain priors be presumptively detained without bail.46 Additionally, an increasing number of courts use algorithmic tools to inform judges’ bail decisions.47 These tools use prior convictions, among other factors, as inputs for formulas that generate recommendations for bail and other conditions of pretrial release.48

In the few cases that go to trial, prior convictions can be used to impeach a witness, including the defendant if they testify in their own defense.49 Consequently, prior convictions determine whether a person will be able to testify at their own trial without the jury learning prejudicial information about past criminal cases.50

If a person is ultimately sentenced for a crime, whether after a trial or after a guilty plea, their past convictions are used against them in a multitude of ways. Prior convictions can increase the maximum available sentence under the law.51 They can also trigger mandatory minimum provisions that require the judge to impose no less than a certain amount of time in prison.52 For example, for certain federal drug charges, having a prior felony that counts as a “serious” violent or drug-related crime triggers a fifteen-year mandatory minimum.53 The federal government and roughly half of the states have enacted “three strikes” laws, which impose harsher sentences on people with certain priors.54 California’s three-strikes law, for example, automatically doubles a defendant’s sentence and adds another five years if they have just one previous “strike” offense.55 A little less than half of the states and the federal government also have sentencing guideline systems that are produced by sentencing commissions.56 In jurisdictions with sentencing guidelines, a person’s prior convictions determine their ultimate sentence under the guidelines.57 Prior convictions also affect discretionary sentencing decisions that are made by prosecutors and judges, even without guidelines or mandatory sentences. In jurisdictions where most sentences are determined through agreements between the prosecutor and the defense, the prosecutor’s plea offers involve longer sentences for defendants with more serious prior records.58 The same is true in jurisdictions like the federal system where judges retain sentencing discretion—defendants with more prior convictions, or more serious ones, are treated more harshly by the judges.59

Past convictions continue to decide a person’s fate even after their sentence is imposed. When state and federal prison bureaus decide what level of prison a person gets sent to—from maximum security to minimum security—they look at that person’s prior convictions.60 Indeterminate sentencing systems are also governed by prior convictions. Some jurisdictions prevent a person from receiving “good time” or “conduct” credit—a type of sentence reduction that is generally awarded for behaving well in custody and participating in prison programming—if they have certain prior convictions.61 A person can also be denied parole based on their prior record.62 Finally, even after release from custody on the current case, a person’s previous cases can still haunt them. A past criminal record prevents people from being able to expunge their new convictions or reduce their severity. For example, in California certain felony convictions can now be retroactively reduced to misdemeanors under Proposition 47, but they cannot be reduced if the person has any prior conviction that counts as a “superstrike.”63

The American immigration system also relies on criminal history. Fundamentally, the immigration system decides who is legally permitted to live in the United States and who can be deported. Subsumed within that governing binary, there are dozens of micro-decisions that determine a particular immigrant’s fate. Criminal priors play a large role in nearly all of these micro-decisions.64 When an immigrant applies for legal status in the United States, whether in the form of a visa, permanent residency, or U.S. citizenship, the immigration services consider their criminal history.65 If an immigrant has achieved legal status short of citizenship, this status can be stripped if they are convicted of crimes categorized as “aggravated felon[ies]” or “crimes involving moral turpitude.”66 Beginning under the Obama administration, immigrants with no legal status who have been in the United States since childhood have been able to apply for a special immigration status called “Deferred Action for Childhood Arrivals” (DACA).67 However, DACA specifically excludes applicants with certain prior convictions, including “significant misdemeanors.”68 Other forms of relief from deportation—such as the 1986 amnesty, voluntary departure, withholding, and cancellation of removal—are also denied based on a person’s criminal history.69 And more generally, a record of prior convictions will make an undocumented immigrant a more likely target for arrest and deportation, and less likely to be able to bond out of immigration detention.70

II. Why Should Past Crimes Matter?

We have established that our system, quite emphatically, gives harsher treatment to people with criminal records. But how can this reliance on past crimes be morally and intellectually justified?71 There are two basic kinds of arguments for such recidivist enhancements, which track the two basic kinds of philosophical justifications for punishment. The first kind is forward-looking—it rationalizes punishing a recidivist more harshly because doing so will provide some social benefit, usually in the form of less crime. The second kind is backward-looking—it rationalizes punishing a recidivist more harshly because their actions are in some way more wrongful than a comparable non-recidivist’s actions.72 For both of these kinds of justifications, a key link in the chain of moral reasoning is the assumption that a past crime reveals important information about the person.

Forward-looking justifications for criminal punishment are consequentialist. They stem from an approach to punishment that emphasizes preventing future crime.73 One familiar justification for punishment is that it deters crime. If someone is contemplating committing a crime, they may take into account the likelihood and severity of punishment, and if these are significant enough they may ultimately decide against it.74 For deterrence theorists, the justification for a recidivist enhancement is straightforward: if a person commits a crime, is punished, then commits another crime, they have revealed that the first punishment was not adequate to deter them.75 Under the assumption that harsher prison terms deter more effectively, this might warrant a higher sentence for people with criminal records. A second forward-looking justification for punishment is that it incapacitates people who commit crimes by separating them from society. An incapacitation theorist might justify harsher punishment for recidivists on the grounds that they have revealed a greater disposition for committing crimes. The basic reasoning here is that a person who has committed one or more crimes in the past is more likely to commit further crimes in the future, and that this merits depriving them of liberty for a longer period of time to lower the risk of new crimes being committed.76 Of course, these forward-looking justifications depend on the contested empirical assumption that added punishment actually reduces crime through deterrence and/or incapacitation.77

Backward-looking justifications for punishment look not to the supposed benefit for society, but instead to the wrongfulness of the acts being punished.78 There is a debate among retributivist theorists over whether having a prior conviction makes a new criminal act more wrong, and therefore more deserving of punishment. Some theorists argue that the law should punish only the present crime, and that people with a criminal record should not receive a harsher penalty.79 Others argue that a person who has previously been convicted of a crime has revealed something about themselves that justifies higher punishment. For example, through past crimes they may have revealed character traits that make their current crime more morally blameworthy.80 One might also say that a recidivist has been warned by society that certain actions are punished and should be avoided, and by choosing to ignore that warning has committed an even worse act. For example, when a person commits their second DUI it is harder for them to claim they did not realize driving drunk was considered wrong. We might also expect them to have taken steps to avoid driving drunk again.81 The basic logic in these backward-looking justifications is that a prior conviction reveals information about the defendant, and that we can use this information to impute a higher degree of wrongfulness to the current crime.82

One key assumption unites each of these justifications for relying on past convictions to decide current punishment: that a prior conviction gives useful information about the convicted. To infer that a defendant needs a harsher punishment to be deterred from future crime, or needs to be incapacitated to prevent future crime, we must assume that the past conviction reveals they are more likely to commit crimes in the future. To infer that a defendant has a more wrongful state of mind, we must assume that the past conviction reveals something about their character. And this principle extends to other uses of criminal history—when we use past crimes as a heuristic to decide if people are released on bond, or as a basis for deporting them and denying them immigration status, we infer that these past crimes reveal important information about the person.83 The ability to make such inferences depends, in turn, on the quality of the information we have about a past conviction—information like how long ago it happened, what the defendant actually did, why they did it, and what else was going on in their life at the time. The richer the information we have about prior convictions, the better equipped we are to make these judgments.

III. Why the System Relies on Heuristics

Our legal system usually does not have access to factually rich narrative accounts of what happened in past criminal cases. If a person was convicted of a misdemeanor DUI, for example, future courts have no easy way of looking up the details of what the person actually did. Was there a crash? Were there passengers in the car? Was the person’s blood alcohol level above .10? .20? .30? Were there other drugs in the person’s system? Such information is not easily accessible to the lawyers and judges who work on future cases. This is due to basic features of our legal system—its informality, its reliance on guilty pleas and defense waivers, its localism, and the separation of the investigation function from the adjudication function. Instead, judges and lawyers in future cases will normally only know the charge of conviction, the date of conviction, and the sentence imposed. These pieces of information function as heuristics. When judges and attorneys make discretionary decisions in future cases—decisions like what to charge, whether to grant bond, etc.—they have to infer what happened in the prior cases from this limited information. As this Part shows, such heuristics are often information-poor or even misleading. They frequently inflate the length of prior sentences, and give little indication of the substance of what a person actually did.

A. Substantive Information About Past Crimes Is Difficult to Obtain

In the American criminal justice system, courts keep records of surprisingly few substantive facts. In the majority of criminal cases, the courts do not generate reports, witness testimony, or other information about what the defendant did. Instead, the parties control the fact-finding.84 All of the investigation is done by police agencies, prosecutors, defense investigators, and defense attorneys. Courts are involved only sporadically, such as when disputes arise over what evidence is being provided to the defense.85 Unlike in inquisitorial systems, American judges and court employees do not usually perform their own investigations into the facts underlying a charge. And the investigative work done by the executive branch is not entered into the court records unless the government is forced to prove its case at a hearing or a trial. This happens only rarely. Nearly all criminal cases end in either a guilty plea or a dismissal.86 At a guilty plea, the judge’s fact-finding is usually limited to asking the defendant whether they are guilty of the elements of the charge (or, if it is an Alford or nolo contendere plea, whether they decline to contest the charge).87 When the time comes for sentencing, judges often impose the sentence that has been worked out between the parties through the plea negotiation process.88 These deals are made at pre-trial conferences or through more informal discussions.89 Thus, at every stage of the standard criminal case—from the investigation through the sentencing hearing—the court does not record a factually rich narrative account of what the defendant actually did.90 The system generates dispositions and sentences, rather than facts. There are exceptions, of course—some cases culminate in trials where evidence is presented and facts are proven. But as a general rule, criminal courts generate quite limited factual information.

This information scarcity is manifested in the documents that are available from courts concerning past criminal cases. If a prosecutor or a defense lawyer wants to order the record of a defendant’s prior case in order to help them in the new case, only a few documents will usually be provided. These documents are sometimes not even available—they might be sealed or expunged on the defendant’s request, or destroyed by the court after a period of time elapses.91 If they are available, these documents must be ordered by contacting the courthouse and sometimes paying a fee to have the documents copied.92 While most states have electronic databases that permit the public to view basic information about a case (the charge, the defendant’s name, the court dates, etc.), very few criminal jurisdictions make actual court documents available electronically.93 And the difficulty of ordering court records is compounded by the system’s decentralization. Criminal courts are local, and each court has its own recordkeeping system and ordering process.94

A number of documents can be obtained from court record requests.95 One is the charging document—called a “complaint,” an “information,” an “indictment,” or some other name, depending on the type of case, the procedural posture, and the jurisdiction. This document will usually contain a recitation of the charged crimes and their elements.96 Another document is the list of minute orders, which briefly states who was present and what occurred in court at each calling of the case.97 A further document is the guilty plea form (or alternatively, plea agreement), which will state what criminal charge(s) the defendant is pleading guilty to and what rights they are giving up.98 The last document is the judgment, which generally closes the case and contains the charge of conviction and the sentence.99 The specific appearance of these documents varies across states and across courthouses. Generally speaking, however, these documents will not contain any narrative of what the defendant is accused of having done, what the police witnessed, or what the other evidence shows.100

It is possible that the lawyers stated such information on the record during a court appearance in the prior case. Certainly, if the case went to trial, there would be witness testimony and argument about what actually happened—but, of course, this is an unusual event. It is also possible that the attorneys went into the facts of the case at another type of hearing, such as a plea colloquy, a sentencing hearing, or a preliminary hearing. If this occurred, then the records of those hearings would only be made available if someone either ordered an audio recording—if the jurisdiction has that capability—or paid a court reporter to type out a transcript.101 This is a lengthy and often expensive process.102 Only rarely do lawyers in a new case undertake to order transcripts or audio recordings of hearings in a prior case.

There are two documents generated in criminal cases that usually do contain substantive facts: the police report and the presentence report. However, for a variety of reasons, these documents are quite difficult to obtain in future cases.

Police reports generally state what the arresting agents witnessed and have details of interviews with victims—in cases with available victims—as well as the statements made by the defendant. They are written by police agencies and are generally provided to the prosecutor and often—but not always—to defense attorneys. However, they are not entered into the court record and are difficult to obtain outside the context of the specific case. Police reports are maintained by law enforcement agencies, so to get a copy of a police report for a case one must go to the relevant police agency itself, make a request, and pay any fees.103 Different jurisdictions have different rules about making these reports available to the public, and most impose restrictions for both privacy reasons and law enforcement reasons.104 Obtaining a police report for a prior case is thus a burdensome process, and in most cases the lawyers do not undertake to do so.105

Presentence reports are prepared by court employees to help a judge decide the sentence in a case. They are not prepared in every case. Policies vary by state, but in most states presentence reports are required only in more serious cases.106 The officer tasked with preparing the presentence report will include the details of the crime, details of interviews with the defendant and the victim(s) if they consent to interviews, and other relevant information like the defendant’s job, family, upbringing, and past convictions.107 Presentence reports are provided to the prosecutor, the defense, the court, and sometimes to victim(s), and they are used at the sentencing hearing.108 They are also used by corrections agencies to decide where a person will be incarcerated and what programming resources they can access.109 Thus, in a case where a presentence report is prepared, there is a factual summary of the available details of the crime. However, presentence reports are incredibly difficult to obtain in future cases. For decades, the majority of states and the federal system have restricted access to these reports to just the parties in the specific case.110 Presentence reports are sealed and not made available to the public, or to lawyers working on future cases, and many states even have statutes prohibiting outside access to them.111 Thus, in future cases, presentence reports are not common sources of information about a past case.

There are also constitutional limitations on using information sources like police reports and presentence reports in future cases. If these documents are used to establish facts in future cases that trigger sentencing enhancements, that creates due process and Sixth Amendment problems.112 For example, if there is a ten-year sentencing enhancement for having a prior cocaine-related conviction, and the court imposing that enhancement relies only on a police report in a prior case to establish that the defendant sold cocaine, then the court is relying on information that was not found by a jury or admitted by the defendant. The Supreme Court declared such a finding unconstitutional in Shepard v. United States, reasoning that documents like police reports are “too far removed from the conclusive significance of a prior judicial record” to be used to decide what happened in a past case.113 Thus, even in the unusual event that substantive documents like police reports and presentence reports are made available in future cases, constitutional doctrine limits future courts’ ability to use them.

The lack of central information management in the criminal justice system can be contrasted with the immigration system. Every person who comes into contact with the American immigration system (as a potential immigrant, an alien being deported, or in some other capacity) is assigned a number, called an “A-number” (short for “alien registration number”) that corresponds to their master case file, called an “A-file.” The A-file is a physical paper file maintained by DHS, and it is updated with each new immigration-related event in the person’s life.114 There are currently over seventy million A-files in existence.115 If a person applies for a visa, becomes a permanent resident, is arrested by immigration authorities, appears at a deportation hearing, or does anything else involving the immigration system, the relevant documents are added to their A-file.116 The A-file contains everything—visa papers, court records, immigration-related arrest and investigation reports (called “I-213” reports), and all other immigration documents.117 The A-file also follows the person around over the course of their lives. If a person has multiple immigration hearings in different states in different years, the A-file will be shipped across the country to the relevant agency for each hearing.118 Thus, at each new deportation hearing (or other immigration hearing), the lawyer for the government will have the A-file at the ready and will be able to provide any information it contains to the immigration judge.

Because the immigration system is located entirely within the executive branch, the investigation and adjudication functions are not as clearly separated as they are in the criminal system.119 Immigration judges are employees of the Department of Justice, the same agency that employs the lawyers who prosecute immigration cases. Furthermore, the exclusionary rule and the formal rules of evidence do not apply in the immigration system, and most of the people who are processed through deportation proceedings are not represented by attorneys.120 These features, combined with the decisive importance of the government-compiled A-file, cause the immigration system to more closely resemble a European inquisitorial system than an American-style adversarial system.121 Unlike in the criminal justice system, the government alone investigates and gathers information on the people it processes, keeps that information in a centralized place, and transmits that information to different agencies without having to deal with separation-of-powers hurdles. There is no equivalent of an A-file in the criminal justice system—no single repository of police reports and prior case information that can be transferred between courts. Centralized recordkeeping about prior cases is more feasible in the immigration system, because it lacks the criminal justice system’s decentralization and adversarial checks. However, the immigration system also relies heavily on inputs from the criminal justice system.122 And to the extent that it does, it faces the same hurdles described above—substantive information about past criminal cases is difficult to obtain.

B. Law Enforcement Rap Sheets Are the Main Source of Information

Rap sheets do for criminal history calculation what nautical maps do for seafaring. It is theoretically possible to compile someone’s criminal history without using their rap sheet, but it is cumbersome and carries a high risk of error. Computerized criminal history databases allow one to find a defendant’s prior arrests and convictions quickly, without needing to track down court records for every past case. Every state in the United States maintains databases where information about arrests, charges, convictions, and sentences is inputted and stored.123 These databases are used by police departments, prosecutors, court employees, and probation and parole departments, among other state agencies.124 Employees of these agencies are responsible for inputting new events when they occur, such as arrests, convictions, or sentences.125 The databases are connected to one another through networks that enable computers to access the information maintained by agencies throughout the state. These networks are referred to as “central repositories.”126 For example, California’s central repository was created in the 1970s and is called the “California Law Enforcement Telecommunications System” (CLETS).127 CLETS allows a person working in one agency to enter a name or set of fingerprints into a computer terminal, and find the entries for that person’s past criminal record.128 Indeed, police departments equip their squad cars with laptops that allow officers to quickly search these databases during a traffic stop.129

The Federal Bureau of Investigation’s National Crime Information Center (NCIC) maintains a system that integrates all of these state repositories, as well as federal agencies’ criminal history information, into a nationwide network. This system was created in 1983, and is called the Interstate Identification Index (III).130 The III permits any participating state or federal agency to input a person’s fingerprints or other identifying information, and to receive back that person’s arrest and conviction history from each repository in the system.131 So at the start of a criminal case, the arresting agent and the prosecutor can use the III to quickly learn whether a person has prior arrests or convictions from anywhere in the United States. The prosecutor can then use that information to inform their decisions about what crimes to charge and what plea bargain terms to offer. They can also disclose that information to the judge and the defense lawyer for bail and sentencing purposes.

The rap sheet printouts produced by these queries look something like this:132

NAME                                          FBI UCN                               DATE REQUESTED

Public, John Q.                          94072PX6                           2019/08/05

ARR/DET/CITE: 1988/01/25 San Diego

Cnt 01: 23152(A) VC-DUI Alcohol/Drugs

Dispo: None Reported

ARR/DET/CITE: 1995/08/31 Los Angeles

Cnt 01: 487.1 PC-Grand Theft: Property

Dispo: Dismissed/Furtherance of Justice

Cnt 02: 484(A) PC-Grand Theft: Property

* Dispo: Convicted

Conv Status: Misdemeanor

Sen: 12 months probation, 1 day jail, work program

1996/02/10

Prob Vio: 14 days jail

1996/06/10

Prob Revoked: 6 months jail

ARR/DET/CITE: 1996/05/23 Los Angeles

Cnt 01: 242 PC-Battery

Dispo: Dismissed

ARR/DET/CITE: 1998/02/14 Pomona

Cnt 01-03: 459 PC-Burglary: First Degree

* Dispo: Convicted

Conv Status: Felony

Cnt 04: 496(A) PC-Receive Known Stolen Property

* Dispo: Convicted

Conv Status: Felony

Sen: 40 months prison, restitution

ARR/DET/CITE: 2007/04/16 Pomona

Cnt 01: 4149 BP-Possess Hypodermic Needle

Dispo: None Reported

Cnt 02: 11550 HS-Use/Under Influence Control Subs

Dispo: None Reported

As this illustration shows, rap sheets generated from criminal history repositories contain limited information. They show entries for arrest dates, charged criminal statutes, case dispositions (conviction, dismissal, etc.), probation violations, and sentences. They do not contain more specific information about what the person did or was accused of doing, or even about what occurred in court beyond the case outcome and sentence. The information contained in rap sheets is entered by human beings, and these human beings are a source of significant error.133 For example, information is frequently missing (especially information about case outcomes, see, e.g., the “none reported” dispositions in the above illustration), and sometimes a person’s rap sheet reflects arrests and convictions that were actually suffered by a different person.134 People also frequently commit errors in reading rap sheets, such as assuming that the listed cases resulted in convictions when they did not.135 Nonetheless, rap sheets are indispensable for processing criminal and immigration cases. At the outset of a criminal case, the prosecutor commonly provides the court and the defense attorney with the defendant’s rap sheet printout, sometimes with multiple rap sheets from different jurisdictions. These rap sheets are used to decide on the charge, set bail, negotiate a plea agreement, and determine the sentence. In the immigration system, they are used to decide whether a person is able to post bond, whether they qualify for immigration relief, and whether they are deported.136

Criminal history repositories are a product of the computer age. They permit any police officer or prosecutor to look up a person’s prior criminal cases with a quick electronic search. It is time-consuming to track down and order court documents, police reports, and hearing transcripts.137 It is easy to enter a name or set of fingerprints into a database. Indeed, this technology is what makes our system’s reliance on criminal history feasible.138 Without rap sheets, the information costs of looking up someone’s criminal record would be steep. One would have to go on a fishing expedition through difficult-to-search court records. And prosecutors, as street-level bureaucrats with high caseloads and resource constraints, avoid such time-consuming investigations if they can.139 It is likely no accident that most of the major recidivist enhancement statutes discussed in this Article—three strikes laws, aggravated felonies, and mandatory minimums—were enacted in the 1980s and 1990s, after electronic criminal history repositories came into widespread use.140 Rap sheets make people’s past criminal records legible to the state.141 They reduce human beings’ complex encounters with the criminal justice system to simple, measurable events that are coded and treated as standardized inputs. Through this process of simplification, they transform criminal histories into measurable units of data. And once this standardization is imposed—turning those past events into a list of arrests, charges, convictions, and sentences—rap sheets can be used to efficiently process people through the criminal and immigration systems.

C. The System Relies on Heuristics About Past Convictions

When judges and lawyers process people based on rap sheets, they make inferences from the limited information contained in those rap sheets. This creates a system of justice by heuristics. Consider a standard bail hearing. The prosecutor, defense lawyer, and judge will be working off of a rap sheet similar to that for “John Q. Public” illustrated above.142 The lawyers will reference the entries in this rap sheet to argue that Mr. Public is or is not a flight risk, and that he is or is not a danger to the community.143 The judge will then use this information to decide on a bond. The pieces of information contained in the rap sheet function as heuristics—shorthand devices that reveal partial information about a complex past event. Here are some examples of how the entries are interpreted. A felony conviction looks much worse to the judge than a misdemeanor conviction. An arrest shows that there has been contact with the criminal justice system but provides less definitive information than does a conviction.144 If the rap sheet has a lot of different entries, the judge will infer that the person presents a higher risk. If none of the convictions are recent, the judge might infer that they are less reflective of the person’s current life. If the sentence for a particular conviction is forty months in prison, the judge will infer that the underlying acts were much worse than if the sentence had been only four months. If the rap sheet shows parole violations, probation violations, or warrants, that will count against the person. And the lawyers and judge will infer what the crime was from the limited information in the caption of the charge—for example, “Battery,” “DUI Alcohol/Drugs,” or “Possess Hypodermic Needle.” The criminal and immigration systems use these heuristics drawn from rap sheets—number of convictions, type of charge, recency of criminal history, sentence severity, etc.—to make discretionary decisions about the people they process. They are used to determine what charges to bring, whether to grant or deny bond, what sentence to impose, and whether to order or withhold deportation, among many other decisions.145

These heuristics are inexact tools for inferring facts about a person and their past life. Consider three sources of error.

First, there is a gap between the underlying reality of a person’s life and the portions of that life that result in recorded encounters with the criminal justice system. A rap sheet will only capture arrests and convictions. These reflect the actions of the individual, but also the choices of the police department.146 For example, a person’s race and the neighborhood they live in play a major role in determining how often they come into contact with the police.147 A young white man living in the Maryland suburbs is far less likely to be arrested and suffer entries in a criminal history database than a young black man living in Baltimore.148 As another example, the homeless are far more likely to be arrested for petty crimes than are other groups. If you take two people addicted to cocaine, one homeless and the other living in an apartment, the homeless person is more likely to be arrested for minor drug use in part because they lack a private place to self-medicate.149 The bias of the system then compounds, as discrimination in who gets arrested causes discrimination in how people are processed through the court system.150 More prior cases means a lower likelihood of bail, a higher likelihood of conviction, and a higher sentence.

Second, the outputs generated by the court system contain a degree of randomness. After a person is arrested, three actors in the legal system will collectively dictate what happens in their case—a prosecutor, a defense lawyer, and a judge. Which prosecutor gets assigned will determine what charges are brought and what the plea offer terms look like. If the prosecutor is aggressive, they may threaten more severe charges or a steep trial penalty unless the defendant accepts a harsh deal. If the prosecutor is lazy, they may offer better terms after the defense lawyer threatens to fight the case. If the prosecutor has a heart, they may be swayed to offer a better deal by the defense lawyer’s equities pitch. The choice of defense lawyer is similarly decisive.151 While criminal defendants can theoretically hire their own lawyers, the reality is that a large majority depend on court-appointed lawyers.152 If a defense lawyer is lazy or complacent, they will do nothing more than try to convince their client to take the first deal the government offers. If a defense lawyer is dedicated and effective, they might use their leverage to seek a better deal, fight the case and try to win, or pitch a more lenient outcome to the prosecutor or judge based on the defendant’s equities.153 And the choice of judge will determine whether the defendant is released on bond and able to fight their case, whether they succeed with pretrial motions, and how harsh the ultimate penalty will be.

The random selection of these three actors dictates the outcome of a case, and thereby complicates the relationship between real-world events and rap sheet entries. The dance between these actors decides whether a conviction happens in the first place (or merely an arrest), the crime the defendant is convicted of, and ultimately the sentence. Take two otherwise equivalent arrestees, randomly assign each a different prosecutor, defense lawyer, and judge, and you can get two very different outcomes. One might end up with a felony and the other with a misdemeanor, or with pre-trial diversion resulting in no conviction. One might end up with a new conviction, while the other might negotiate for just a probation violation. One might receive a short sentence and the other a long one.154 One might be unable to post bond and so may plead guilty to a crime they did not commit, while the other might have a more reasonable bond that allows them to fight the case.155 One person might win a Fourth Amendment motion or other procedural issue, or have a complaining witness refuse to testify, and thereby win the case for reasons having nothing to do with their innocence. Another person might have their sentence inflated because they fought the case, which takes additional time, rather than accepting a deal right away. At a fundamental level, the criminal justice system produces case outcomes that follow the logic of a competitive negotiation process and that reflect the abilities, goals, and leverage of each actor.156 They also reflect each actor’s discretionary choices, which, empirical evidence shows, discriminate by race.157 The entries that are placed in rap sheet databases represent the outputs of this semi-random negotiation process, not any objective evaluation of what the person did. Case outcomes are therefore highly unreliable signals to future actors about the underlying truth of the crime.

Third, the entries in rap sheet databases do not contain detailed information about the actual conduct. They show only the charge and the sentence. These often convey little information. The charge is an abstraction—a category of conduct like “battery,” “DUI,” or “fraud.” This category corresponds to a criminal code section laying out the elements that must be proven. For example, California defines “battery” as “any willful and unlawful use of force or violence upon the person of another.”158 This could describe an enormous range of acts, from pushing a person to stabbing a person. “Assault with a deadly weapon” could mean an assault with anything from a gun to a skateboard.159 Most criminal charges are similarly information poor—they give a broad category of conduct without specific facts. This problem is compounded by inchoate liability doctrines that permit someone to be found guilty of a crime that they merely aided or abetted, assisted after the fact, or attempted to commit.160 If the rap sheet contains a conviction for one of these inchoate crimes, future readers will have no way of knowing the person’s level of involvement. And the charge of conviction is not even necessarily an accurate account of what happened—the parties may negotiate for a guilty plea to a charge that is misleading or factually wrong.161 This information poverty is even more severe for probation and parole violations, which have no substantive charge attached to them.162 When a probation or parole violation appears in a rap sheet, it simply registers as a violation. So if the lawyers in a criminal case negotiated to dispose of the case as a probation violation with no new charge, the rap sheet will contain no substantive information at all.163 This scarcity of information in rap sheet entries makes it difficult to draw inferences in future cases.164

These sources of inaccuracy make it difficult for prosecutors and judges in future cases to reliably decide what a prior case means about the defendant. Unless the presentence report or police report is made available, the only documented information will be the heuristics drawn from rap sheets. A judge who cares about a prior conviction could, theoretically, ask the defendant to give more information about what happened. But this approach has limited utility. If a person gives a self-serving account of the past case—that the charges were false, that the conduct was not so bad, or that the conviction is not even theirs—they are unlikely to be credited. If they have damaging things to say, their defense lawyer (if the defense lawyer is doing their job) will advise them not to speak. And few courts would be willing to call an actual hearing with witness testimony about a past case, as that would expend a lot of resources. The upshot is that decisionmakers lack a reliable source of information about the substance of a defendant’s past criminal acts.

D. The Problem of Translation

There is a further difficulty with using rap sheet entries as heuristics. Not only do they contain only partial information about past events, they are also affirmatively misleading when removed from the local contexts of the systems that produced them. A person looking at a rap sheet containing entries from a different decade, a different state, or even a different county within the same state, will have to translate those entries into a totally different normative and procedural context.

In Philosophical Investigations, Ludwig Wittgenstein writes about a group of people, each of whom is holding a box only they can see into.165 Each of these boxes contains something called a “beetle.” But each box-holding person only learns what a “beetle” is by looking into their own box. Person 1 looks into their box and sees this “beetle.” Person 2 does the same. When these people discuss what is in their boxes, they use the word “beetle.” But in whatever conversations they have about beetles, each has in mind the object in their own box. That object may or may not correspond to the objects in each of the other boxes. They could, without realizing it, be discussing very different things. The word “beetle,” then, functions as a tool in a language game that could be misleading all of them.

Legal systems that use rap sheets to make inferences about prior cases are in the same situation as these beetle-in-a-box owners. Instead of beetles in boxes, they have convictions in rap sheets. Every criminal conviction and sentence is produced in a thick context of criminal laws, procedural rules, normative conventions, and system-specific jargon. Convictions and sentences can only be understood within those thick contexts. But rap sheets abstract these outputs away from their contexts, turning them into entries on a printout. When a person processing a new criminal case reads one of these rap sheets from a different jurisdiction, they must translate its information into a different context. The same is true of one system’s sentence and another system’s sentence, or one system’s conviction and another system’s conviction.166

Consider the difficulty posed by indeterminate sentences. In most states, the announced sentence that gets entered into a person’s rap sheet is longer than the sentence that the person actually serves. This is because most states have indeterminate sentencing systems—people are released early due to good time credit, split sentences, parole, and other policies.167 These sentencing systems are often complex. For example, when a person is sentenced for a crime in the State of California, the default rule is that they will serve only half of the announced sentence.168 If they receive six years in prison, they will serve three. If they receive eight days of picking up trash by the side of the road, they will serve four. However, if a person has prior convictions for crimes designated as “strikes,” then they will be made to serve eighty percent of the announced sentence.169 And if a person is sentenced under a program called “realignment,” they can serve significantly less than half of their sentence.170 Other states have similarly complex indeterminate sentencing schemes, and these vary widely.171 This Article opened with an example from the State of Washington, where there is a convention in certain cases that almost the entire sentence will be suspended and not imposed.172 Further, in most states, people are discretionarily released on parole with some period of time remaining on their sentences.173
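The gap between the announced sentence and the time served under the California rules just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name is hypothetical, realignment and parole are left out, and the eighty percent figure is treated as an exact share.

```python
from fractions import Fraction


def estimated_time_served(announced, has_prior_strike=False):
    """Rough estimate of time served on an announced California sentence.

    Simplified from the text: half of the announced sentence by default,
    eighty percent when the person has a prior strike. The result is in
    the same units the caller passes in (years, months, days).
    """
    share = Fraction(4, 5) if has_prior_strike else Fraction(1, 2)
    return announced * share


six_years = estimated_time_served(6)          # 3 years
eight_days = estimated_time_served(8)         # 4 days
with_strike = estimated_time_served(6, True)  # 24/5, i.e. 4.8 years
```

A rap sheet would show only the `announced` value in each of these cases, which is the translation problem in miniature.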

The complexity of local sentencing rules creates problems of translation between systems. When a judge pronounces a sentence in California, everyone in the courtroom—prosecutor, defense lawyer, and defendant—understands that the real sentence is (normally) half of the announced sentence. However, the rap sheet simply reflects the announced sentence.174 Rap sheets thus record the sticker price of a punishment rather than the true punishment, systematically inflating the apparent severity of past crimes. Further problems of translation arise when part or all of a sentence is suspended. A rap sheet may not even note that a sentence has been suspended. For example, if a person is sentenced to six months in prison and that sentence is suspended so long as they complete community service, the rap sheet may simply reflect the six-month sentence. Alternatively, the rap sheet sometimes contains an entry stating something like “ISS” (imposition of sentence suspended) or “ESS” (execution of sentence suspended), with no further explanation of what this acronym means or what portion of the sentence was suspended.175 Even thornier problems arise when a person is sentenced on more than one charge simultaneously. The rules concerning whether to sentence concurrently or consecutively are often complex and vary by jurisdiction.176 The rap sheet commonly does not note if sentences were imposed consecutively or concurrently, and so future courts will generally see separate crimes with independent sentences, inflating the severity of the person’s criminal history. In short, rap sheets do a terrible job of conveying the true meaning of a sentence. They abstract the announced penalty away from its procedural context, turning it into a misleading number.

Differences between jurisdictions’ criminal laws create another problem of translation. The fifty states and the federal government all have their own criminal codes that define crimes differently. This makes it difficult to interpret rap sheet entries generated in a different system. For example, rap sheets note for every entry whether it is a “felony” or a “misdemeanor.”177 These categories of crime are defined differently in different places. In Maryland, misdemeanors can be very serious crimes resulting in up to ten years in prison.178 In Pennsylvania, you can get up to five years for a misdemeanor.179 In the federal system, you can only get up to one year.180 And the same conduct will be defined as a misdemeanor in one state but a felony in another. For example, shoplifting is a felony in Illinois if the loss is as little as $301.181 In South Carolina, the amount must be over $2,000 for it to be charged as a felony.182 The amount varies widely by state.183 Different states also define substantive crimes differently. For instance, the age thresholds for statutory rape charges are different in different states.184 In California, the crime of “robbery”—which normally requires taking something by force—is defined to include shoplifting where the person merely pushes past a security guard to get away.185 Even something as basic as the definition of “aiding and abetting” varies by jurisdiction.186 Essentially, the crimes listed in rap sheets mean very different things depending on the larger body of law that generated those crimes. People interpreting rap sheet entries from other jurisdictions are therefore likely to misunderstand the meaning of a charge.

A further translation problem stems from differences in local norms. Certain conduct can result in different charges and different sentences in different places.187 This is true even for courthouses within the same state.188 Conduct that would earn a misdemeanor charge in one courthouse might be a felony in another. Conduct that would result in drug treatment in one courthouse might mean a decade in prison in another.189 When a case is charged in the federal system, it will usually result in a much harsher punishment than it would in a state system. These differences reflect the different criminal justice cultures of various localities. Consequently, two people who committed the same criminal acts in two different counties might receive totally different rap sheet entries. Norms also change over time. A crime that was punished harshly a decade ago may merit a slap on the wrist today, and vice versa. If one sees a marijuana-related drug conviction from two decades ago, it is possible the same conduct would merit a much lighter sentence today. This variation in norms across time and space further complicates the relationship between a rap sheet entry and the underlying criminal acts.

When a prosecutor or judge makes a decision in a criminal case, they send two separate signals.190 The first signal gets delivered to the participants in the current case, and is operationalized in the local justice system. It generates a case outcome that sounds in the nuanced language of local practices. The second signal gets stripped of its context and reduced to a few data points on a rap sheet. This signal is sent out into the world to be consumed in all future cases. The actor who sends this outgoing signal may not be aware of its content, or even think about the fact that they are sending it. When people use rap sheets to make inferences about criminal history, they must interpret these context-denuded signals.191 The problem is that context collapse makes this task quite difficult. Without knowing the complex combination of statutes, case law, rules of court, and local norms that produced a case outcome, one cannot know what that outcome implies about the actions giving rise to the case.

IV. From Heuristics to Mechanical Formulas

When lawyers make discretionary choices based on heuristics drawn from rap sheets, they insert several types of arbitrariness into the system. When that discretion is removed and replaced by mandatory rules, the arbitrariness multiplies. This Part discusses laws that turn heuristics about prior cases into mandatory outcomes in the current case, such as deportations and lengthy sentences. These laws are herein labeled “mechanical formulas,” mechanical because they omit discretion, and formulas because they operate like mathematical rules. They are syllogisms of the form “if the defendant’s past cases feature X heuristic(s), then Y result is required in the current case.” Different mechanical formulas attach to different kinds of heuristics. Some apply when the defendant has a prior felony or a past sentence of a certain length. Others involve intermediate crime categories like “violent offense” or “aggravated felony,” which are defined in statutes and elaborated by court decisions. Mechanical formulas impose a veneer of uniformity in criminal cases, because they remove front-line discretion to treat cases differently. But this veneer masks a deeper arbitrariness. Mechanical formulas multiply the flaws in the heuristics-signaling system explored in Part III. They take these heuristics, attach severe consequences to some of them, and then remove decisionmakers’ discretion to avoid those consequences.192 Indeed, mechanical formulas even make it unnecessary for judges to look into what happened in prior cases, because they render the facts of past crimes irrelevant. The actual conduct is no longer what matters. The decisive variables are the outputs the system produced: the charge and sentence.

A. Legislative Mechanization—Strikes, Mandatory Minimums, and Other Typologies of Crime

Legislatures create mechanical formulas to impose consequences like deportation, denial of bond, and lengthy mandatory sentences. These formulas became especially common in the last few decades of the twentieth century. Devices like “strikes” and mandatory minimum enhancements emerged out of the populist crime politics of the 1980s and 1990s. They impose harsh, non-discretionary penalties on people who are charged with a new crime and also have a prior record that matches a certain formula. For example, some mechanical formulas are triggered when a person has a prior felony conviction. Some contain a laundry list of specific offenses, such as the various states’ lists of “violent” offenses.193 Some create intermediate crime categories like “aggravated felony” or “crime of violence.” And some attach to the actual sentence imposed in a past case.

These mechanical formulas serve a number of system interests. They constrain judges’ moral discretion, and enable legislators to take symbolic stands against certain broad categories of crime.194 They permit legislatures to create rules in the abstract, leaving it to judges and other front-line actors to sort through how terms like “controlled substances offense” are operationalized. They also permit the court system to process cases more efficiently, because they make the factual details of a past crime irrelevant. The general observation that rules can be applied more efficiently than standards is true of mechanical formulas.195 All that matters to a mechanical formula is the prior conviction or sentence, so there is no need to spend time litigating what actually happened in the past case. The formula also produces a binary outcome, rather than requiring the judge to decide on standard-like criteria such as whether a past conviction was “violent” or “particularly serious.” This makes the outputs of mechanical formulas relatively cut-and-dried, although the formulas themselves are often complicated. To give a thicker understanding, it will be helpful to explore three concrete examples of mechanical formulas: California’s three strikes law, federal drug mandatory minimums, and aggravated felonies in the immigration context.

Three strikes laws are a product of the criminal law populism of the 1990s. Nearly half of all states and the federal system have enacted three strikes laws.196 The federal three strikes law, for example, mandates life in prison if the defendant has two prior “serious violent felony” convictions.197 The most famous three strikes law is California’s, which was enacted in 1994 as a reaction to the high-profile murder of a child named Polly Klaas by a man with an extensive criminal record.198 Now, two and a half decades later, California’s three strikes law continues to dominate the state’s criminal justice landscape. The California Penal Code contains two lists of crimes that count as strikes—one list of “serious” felonies, and one list of “violent” felonies.199 These lists have expanded significantly over time, as new legislatures and ballot initiatives have added additional crimes and expanded the definitions of the crimes already included.200 Today the lists are lengthy, with twenty-three “violent” felonies listed and forty-two “serious” felonies.201 They contain many specific criminal statutes, as well as general categories of crime like “mayhem,” “carjacking,” “[a]ny robbery,” “any felony in which the defendant personally used a dangerous or deadly weapon,” and “any felony punishable by death or imprisonment in the state prison for life.”202

If a person is convicted of a crime that counts as a strike, then they face an enormous sentence increase for any future felony convictions. Having a strike on your record makes you ineligible for a probation sentence, so the judge must sentence you to a prison term.203 For most felonies in California, there are three possible prison sentences for the judge to choose from: 16 months, 2 years, or 3 years.204 However, having a strike doubles the length of the sentence you will receive, so for the standard felony it instead becomes 32 months, 4 years, or 6 years.205 Having a prior strike also means you will receive 5 additional years on top of your sentence for the new case,206 and that you will serve at least 80% of your sentence (people without strikes serve only half of their sentences).207 So a single prior strike takes you from a sentence that might be a few months in custody followed by probation to a sentence of several years in custody. Having two strikes and then suffering an additional strike conviction will result in a prison term between 25 years and life.208 Basically, this system of strikes creates major sentencing enhancements that attach to certain prior convictions based on the charge. It does not matter how serious the underlying conduct was. A robbery could be as little as shoplifting and pushing past a security guard.209 And some strikes—such as a conviction for “criminal threats”—do not even involve violence.210 Nor does it matter how much or how little time you served as a sentence. All that matters is that the convicted charge counts as a strike. Indeed, prosecutors sometimes offer sentencing discounts to induce defendants to plead to strike charges, reasoning that the strike will massively increase the defendant’s sentence if they commit another felony in the future.
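The arithmetic of a single prior strike, as summarized above, can be made concrete with a short Python sketch. It follows this Article's simplified description rather than the Penal Code itself. All names are hypothetical, and it covers only a person with zero or one prior strike, not the 25-to-life case.

```python
from fractions import Fraction

# Standard felony triad from the text: 16 months, 2 years, 3 years.
STANDARD_TRIAD_MONTHS = (16, 24, 36)
# Five additional years for a prior strike, as described in the text.
STRIKE_ADD_ON_MONTHS = 60


def announced_term_months(base_months, has_prior_strike):
    """Announced prison term: doubled, plus five years, with a prior strike."""
    if not has_prior_strike:
        return base_months
    return 2 * base_months + STRIKE_ADD_ON_MONTHS


def expected_served_months(base_months, has_prior_strike):
    """Time served: 80% of the term with a strike, otherwise half."""
    announced = announced_term_months(base_months, has_prior_strike)
    share = Fraction(4, 5) if has_prior_strike else Fraction(1, 2)
    return announced * share


for base in STANDARD_TRIAD_MONTHS:
    print(base,
          announced_term_months(base, True),
          float(expected_served_months(base, True)))
```

Under these assumptions, the low term of 16 months becomes an announced term of 92 months (32 months after doubling, plus 60). The function never asks what the person actually did in the prior case; only the existence of the strike enters the calculation.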

As a second example, in the 1980s, Congress enacted mandatory minimum sentences for federal drug offenses. These mandatory minimums are pegged to the amount of drugs involved in the case and start at 5 years or 10 years depending on the facts.211 Up until 2019, a person looking at a 10-year mandatory minimum had that amount doubled to 20 years if they had suffered a prior conviction for anything fitting the category “felony drug offense.”212 Under this law, any prior qualifying drug felony doubled a person’s mandatory sentence to 20 years, including even low-level state possession cases resulting in little to no time in custody. Indeed, many state misdemeanor drug offenses triggered the doubling because federal law defines a felony as any crime punishable by more than a year in prison, and many states have misdemeanor crimes fitting that definition.213 Cases that ended in successful diversion could also trigger these enhancements, even if the state conviction was expunged.214 In 2019, through the First Step Act, this enhancement was lowered to a 15-year mandatory minimum, and the category of triggering crimes was changed to include “serious drug felony” and “serious violent felony.”215 The word “serious” in this amendment added a requirement that the defendant had served more than a year in custody on the past conviction for it to qualify.216

The federal drug mandatory minimums also contain a “safety valve” provision, which permits defendants to be sentenced below the 5- and 10-year mandatory minimum sentences if they fulfill certain requirements.217 Previously, safety valve was only available to defendants with zero or one criminal history points under the Federal Sentencing Guidelines.218 This meant that if someone had two prior convictions of any kind, a single prior sentence of more than 60 days, or was on any kind of criminal supervision (including non-reporting misdemeanor probation), they could not escape the 10-year mandatory minimum.219 The First Step Act expanded safety valve in a rather complex manner. The law made safety valve available to people with up to two prior convictions, excluding all convictions with a sentence of less than 60 days, so long as none of the counted convictions involved a “violent offense” or resulted in a sentence over 13 months.220 As this example shows, mechanical formulas can be just as byzantine when they reduce incarceration as when they expand it.
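To show how such a rule operates as a formula, the expanded safety valve can be sketched in Python exactly as this Article summarizes it. This is a paraphrase of a summary, not the statutory text. The `Prior` type, the function name, and the conversion of 13 months to 395 days are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: 13 months approximated as 395 days for comparison purposes.
THIRTEEN_MONTHS_DAYS = 395


@dataclass
class Prior:
    """One rap sheet entry, reduced to the heuristics the rule looks at."""
    sentence_days: int
    violent: bool = False


def safety_valve_eligible(priors):
    """Eligibility under the expanded rule as summarized in the text."""
    # Convictions with a sentence of less than 60 days are not counted.
    counted = [p for p in priors if p.sentence_days >= 60]
    # No more than two counted convictions are allowed.
    if len(counted) > 2:
        return False
    # No counted conviction may be violent or exceed 13 months.
    return not any(
        p.violent or p.sentence_days > THIRTEEN_MONTHS_DAYS for p in counted
    )
```

For example, `safety_valve_eligible([Prior(90), Prior(120)])` is true, while adding a third counted prior or marking one of them as violent makes it false. Every input is a label or a number taken from the record of the earlier case.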

A third example of a mechanical formula comes from the immigration system. In 1988, Congress created an intermediate category of crime it called the “aggravated felony.”221 The original 1988 law defined aggravated felonies as including only three types of crime: murder, drug trafficking, and firearms trafficking.222 However, Congress greatly expanded the definition in subsequent years.223 Today it comprises a lengthy twenty-one-item list, lettered A through U, with many items on the list containing multiple types of crime.224 This list includes both specific statutes and general crime categories. Listed crimes include sexual abuse of a minor, transporting aliens, crimes of violence for which the sentence was at least one year, and theft or burglary offenses for which the sentence was at least one year.225 Aggravated felonies carry severe consequences in both the criminal and immigration systems. If any noncitizen is convicted of an aggravated felony, even a lawful permanent resident with a green card, they will lose their immigration status and be subject to deportation.226 Aggravated felonies also make a noncitizen ineligible for forms of relief from deportation in the immigration system, including policies like voluntary departure, cancellation of removal, or asylum.227 Under federal law, a prior aggravated felony increases the maximum sentence for reentry from two years to twenty years.228 Furthermore, prior to 2016, having a prior aggravated felony conviction meant an enormous increase in the recommended sentence for a reentry crime under the Federal Sentencing Guidelines.229 Much like the prior two examples, aggravated felonies attach mandatory, severe consequences to criminal histories that fit a complex formula. Crimes as severe as murder, or as minor as driving an undocumented alien, will trigger mandatory deportation.230 Indeed, even state misdemeanors can count as aggravated felonies, as the example at the beginning of this Article illustrates.231

These are but a few examples of the many mechanical formulas that American legislatures have created. Such formulas pervade our criminal law.232 They attach mandatory consequences, often quite severe consequences, to heuristic information about past crimes. But those heuristics, as the last Part explained, are unreliable signals of what happened in the underlying case. Further, the disappearance of discretion prevents judges from taking uncertainty into account or trying to smooth out disparate treatment.233 Legislatively enacted mechanical formulas are also arbitrary for a further reason: their choices of which past convictions do or do not trigger a consequence are frequently irrational. They focus on listing particular charges and categories of crime, and in doing so sweep in the least culpable versions of the listed crimes, while leaving out other, more culpable conduct. And, as California “strikes” and federal “aggravated felonies” illustrate, these lists tend to expand over time. Legislators have periodic incentives to take symbolic stands against particular crimes, and adding them to mechanical formulas is a useful way to do this.234 Legislators also make laws in the abstract—they tend to see a criminal statute as the platonic form of that crime, rather than as a broad range of conduct including the least culpable set of facts.235 When the California legislature added “robbery” to its three strikes law, for example, it likely was not considering that a large number of robbery charges involve petty shoplifting.236 The combination of abstraction and periodic crime panics thus causes mechanical formulas to metastasize.237 And there is not, or at least there has not yet been, much countervailing political incentive to reduce the scope of these mechanical formulas.238 This dynamic further erodes their connection to anything resembling a reasoned judgment about the people who must suffer their consequences.

B. Sentencing Guidelines—The Triumph of Counting

Sentencing guideline systems are another quite complex type of mechanical formula. The federal system and nearly half of the states have sentencing guidelines.239 These guidelines input facts about the current case and the defendant’s criminal history into an intricate formula, and then generate sentence lengths as outputs. Judges use these formulas to decide what sentence to impose in particular cases. Guidelines are created and periodically updated by specialized agencies called sentencing commissions. Due to their complexity, guideline systems adopt more granular triggering rules than the laws described in the prior section. Instead of imposing one enormous sentence increase based on a single prior conviction, sentencing guidelines tend to increase punishments gradually for each qualifying past conviction.240 This causes guidelines to generally emphasize the number of past convictions over the type of past convictions. They also have varying degrees of mandatoriness. Some guidelines systems are fully advisory and merely provide a recommendation that the judge can disregard.241 Other guidelines systems limit judges’ discretion.242 And the federal sentencing guidelines were mandatory until 2005, when the Supreme Court in United States v. Booker found that their mandatoriness was unconstitutional and made them advisory.243 Consequently, a number of guideline systems permit judges to exercise back-end discretion by imposing a sentence different from the one the guidelines produce. But even advisory guidelines have a significant anchoring effect on judges’ decisions.244

When a person is convicted of a new crime, sentencing guidelines incorporate their criminal history into a formula that decides what sentence they should receive. A few state guideline systems, like Alabama’s and Virginia’s, use worksheets that take the judge through a series of questions about the facts of the case and the defendant’s past convictions.245 However, most guidelines systems use a two-dimensional grid to determine the sentence.246 The Y-axis for this grid represents the current charge, while the X-axis represents the person’s past criminal convictions. As an example, the grid for the Federal Sentencing Guidelines is reproduced below.

To illustrate how guidelines work, imagine a person convicted of a federal crime, for instance transporting aliens. The number of points they receive on the Y-axis will be decided by the facts of the current case. For an alien transportation conviction they will start at twelve points, and then the number of points will go up or down depending on specific facts, such as whether the person being transported was the spouse or child of the defendant, or whether a dangerous weapon was used.247 The number of points they receive on the X-axis will then be decided by heuristics concerning the person’s prior criminal convictions. And the sentencing range that the guidelines generate is in the grid cell where the X-axis and Y-axis scores intersect. While the federal system just uses one grid, most states have multiple different grids for different types of crime. Minnesota, for example, has a “standard grid,” a “sex offender grid,” and a “drug offender grid.”248 The federal system is also unique, in that it frequently incorporates past criminal cases into the Y-axis as well. This double counting happens for a variety of federal crimes, like re-entry, transporting firearms, and anything designated a “violent” or “drug-related” offense.249 Some of the enhancements on the Y-axis can be enormous—the re-entry guidelines section, for example, can give up to two ten-point enhancements for prior felony convictions exhibiting certain heuristics.250

In deciding where a person lands on the X-axis, guideline systems count up heuristics concerning their prior cases. In the federal system, the main heuristic is sentence length. To generate a person’s federal criminal history score, you add one point for each prior sentence of less than 60 days in the last 10 years, two points for each sentence between 60 days and 13 months in the last 10 years, and three points for each prior sentence over 13 months in the last 15 years.253 The Minnesota sentencing guidelines assign a point value to each crime in Minnesota’s criminal code, running from half a point to two points, and score priors from other jurisdictions according to their equivalent in Minnesota law.254 The State of Washington’s sentencing guidelines simply give one point for each prior felony conviction, and sometimes increase that to two or three points for certain types of priors like “sex offenses” or “serious violent” offenses.255 Oregon’s guidelines decide criminal history score based on the number of prior “person felonies” (crimes involving harm to a person), “non-person felonies” (generally crimes involving property), and “class A misdemeanors.”256 As these examples illustrate, sentencing guideline systems use a number of different formulas to translate the defendant’s past cases into a sentence in the current case. They quantify past convictions based on a variety of heuristics: the charge, the sentence, the type of crime. But fundamentally they all focus on counting past convictions. A person with more convictions, all else equal, receives a higher sentence.

This quantification bias is discriminatory. Recall that arrests and convictions are biased heuristics, and are affected by variables like a person’s race.251 So too are the outcomes of the criminal justice process—a misdemeanor versus a felony, a short versus a long sentence.252 When sentencing guidelines count up past convictions and use them to decide the sentence in the current case, they compound this problem.
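The federal point-counting rule described above is simple enough to write out as a short Python sketch. It implements only the three tiers as this Article states them and ignores every other provision of the Guidelines. The function name, the tuple layout, and the 395-day approximation of 13 months are assumptions made for illustration.

```python
# Assumption: 13 months approximated as 395 days for comparison purposes.
THIRTEEN_MONTHS_DAYS = 395


def federal_history_points(priors):
    """Sum criminal history points from (sentence_days, years_ago) pairs.

    Tiers as summarized in the text:
      under 60 days, within 10 years       -> 1 point
      60 days to 13 months, within 10 years -> 2 points
      over 13 months, within 15 years      -> 3 points
    """
    total = 0
    for sentence_days, years_ago in priors:
        if sentence_days > THIRTEEN_MONTHS_DAYS:
            if years_ago <= 15:
                total += 3
        elif sentence_days >= 60:
            if years_ago <= 10:
                total += 2
        elif years_ago <= 10:
            total += 1
    return total


# One short, one mid-length, and one long prior sentence: 1 + 2 + 3 points.
example_score = federal_history_points([(30, 2), (90, 5), (500, 12)])
```

The only inputs are the announced sentence length and the age of the conviction. Nothing about the conduct, the jurisdiction, or the time actually served enters the score.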

Sentencing guidelines take the limited information we have about prior convictions—the charge and the sentence—and build these into complex formulas that create the appearance of a rational sentencing system. But no amount of sophistication can correct the problem of flawed inputs. In the nineteenth century, there was a vast medical literature on the proper methods for phrenological skull readings.257 Guideline systems superimpose an internally rational formula upon a problematic set of data points. The seeming rationality of guidelines legitimizes them, and disguises the problems with relying on heuristics.258 With the legislative mechanical formulas discussed in the prior section, the injustices are often stark and brutal.259 People are deported or not, and suffer long mandatory sentences or not, based on arbitrary differences in a single past conviction. But because they increase sentences more gradually, and contain opaque and sophisticated formulas, sentencing guidelines normally lack such obvious outward signs of injustice.260 This renders them useful tools for any judges who might wish to process people through the system with a minimal amount of effort or moral contemplation.261 Simply impose the sentence that the guidelines spit out, and move on. And guidelines are not just restricted to the sentencing context. Formulas resembling guidelines have proliferated in the criminal justice system, and are now used in some places to determine bail, parole release, plea bargain offers, and other key decisions.262 This guidelinification raises the specter of a future criminal justice system where every decision is rendered by mechanical formulas that translate heuristics into outcomes, without interference by human moral reasoning.

C. Translation Without Discretion—The Categorical Approach and Fitting Priors into Boxes

When rap sheet entries from other jurisdictions are used to make discretionary choices like sentencing or bond, there are problems of translation.263 It is difficult for judges to know the significance of a conviction from another state or county, or how exactly sentences work there. But this problem can at least be circumscribed—a judge can order further factfinding, or can discount the importance of a prior conviction in light of this uncertainty. When discretion is removed, however, the problem of translating foreign priors becomes much more troublesome. The choice is now binary—a prior conviction either triggers the mechanical formula or it does not. So the system must develop a set of rules to decide whether a foreign conviction qualifies. These rules are necessarily complex, and their outcomes appear arbitrary to the people being processed.

There are two ways for a legislature or sentencing commission to establish such complex translation rules. One is to put them in the law itself. This is what sentencing guideline systems tend to do. For instance, the United States Sentencing Guidelines contain a sophisticated set of rules concerning how many criminal history points to give a prior conviction from any jurisdiction.264 A second, more common method is to delegate the problem of translation to front-line actors, like agency officials in the immigration system and judges in the criminal justice system.265 It is ironic that the translation rules for mechanical formulas are commonly delegated to judges, since the ostensible aim of these laws is to remove judicial discretion. These laws prevent judges from engaging in moral reasoning about whether a recidivist enhancement should apply, but empower them to engage in technical and interpretive reasoning about whether it should apply.

There are limits on how judges can do this translation work. In situations where a mechanical formula increases a defendant’s sentence, the method of translation is constrained by constitutional law. Two doctrines in particular are significant: the void-for-vagueness doctrine and the right to a jury trial. First, a mechanical formula’s triggering rules cannot be so vague that criminal defendants have to guess whether or not certain crimes qualify. In a series of recent decisions, the Supreme Court has held unconstitutional the following federal definition of “crime of violence”: “any … offense that is a felony and that, by its nature, involves a substantial risk that physical force against the person or property of another may be used in the course of committing the offense.”266 The Court reasoned that this formulation gives inadequate notice of which crimes count as violent. The void-for-vagueness doctrine thus places a limit on legislatures’ ability to delegate crime translation to judges—legislatures must do more than gesture at a general description of crimes and then let the courts sort it out.267 Second, if a mechanical formula is triggered by facts of the underlying prior crime, rather than by a legal outcome produced by the prior court, the defendant has the right to a jury trial concerning those facts. This is a consequence of the Supreme Court’s Sixth Amendment decisions Apprendi v. New Jersey and United States v. Booker.268 If a fact about a prior crime triggers a sentence increase in the current case, that fact is effectively an element of the current crime. The Supreme Court’s decision in Almendarez-Torres v. United States creates an exception for the fact of a prior conviction or sentence.269 A past conviction only needs to be proven to a judge by a preponderance of the evidence, using conviction documents from the court that heard the case.270 So prior conduct needs to be agreed to in a plea deal or proven to a jury, but prior convictions only need to be proven to a judge. This exception gives further incentive for legislatures and guidelines commissions to focus on prior convictions rather than prior conduct, given the difficulty of proving events that may have occurred years ago in another part of the country.271

But, as noted, it is difficult to translate the work of another court into a different jurisdictional context. Consider the problem of deciding whether a foreign sentence triggers a mechanical formula. The system must come up with a set of rules to deal with issues like good time credit, indeterminate sentences, suspended sentences, concurrent versus consecutive sentences, and probation violations resulting in custody time. These rules must be easily administrable and avoid discretionary choices. They also must apply to all of the institutional variation in sentencing practices. These requirements produce some overbroad and illogical rules. Here are a few examples. As illustrated by the case of Ms. L at the beginning of this Article, judges must count “suspended” sentences—that is, sentences that will likely never be served—when deciding whether a conviction counts as an aggravated felony.272 The federal sentencing guidelines count concurrent sentences as one single sentence if they are for different crimes resulting in a single arrest, but as separate sentences if they are for different crimes resulting in multiple separate arrests.273 This means that if a person is sentenced to four years in prison on three separate crimes, all to run concurrently, the guidelines will treat this as three independent four-year sentences if the arrests were made on different days. State guideline systems have analogous rules for dealing with this problem.274 The federal guidelines also do not take good time credit or other indeterminate sentencing rules into account, instead assuming that the entire announced sentence was served.275 So when a person is sentenced to sixteen months in California, and everyone in the courtroom understands that this really means eight months, the federal sentencing guidelines will still treat that sentence as sixteen months.
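The two counting rules just described can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the `PriorSentence` record, the `counted_sentences` function, the offense labels, and the dates are all hypothetical, and the code models nothing more than the rules as this Article describes them (concurrent sentences are grouped by arrest, and the announced term is counted rather than the time served). Treating a same-arrest group as one sentence of the longest announced length is a further simplifying assumption, and the actual guidelines contain additional conditions that are omitted here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class PriorSentence:
    """Hypothetical record of what a rap sheet might show for one prior."""
    offense: str
    arrest_date: date
    announced_months: int  # the term pronounced in court
    served_months: int     # time actually spent in custody (never consulted below)


def counted_sentences(priors: List[PriorSentence]) -> List[int]:
    """Return the announced length, in months, of each separately counted sentence.

    Simplified rule: priors stemming from the same arrest are counted once
    (assumption: at the longest announced length); priors stemming from
    different arrests are counted separately. served_months is ignored.
    """
    by_arrest: Dict[date, int] = {}
    for prior in priors:
        longest_so_far = by_arrest.get(prior.arrest_date, 0)
        by_arrest[prior.arrest_date] = max(longest_so_far, prior.announced_months)
    return [by_arrest[arrest] for arrest in sorted(by_arrest)]


# Three concurrent four-year sentences, with arrests on different days.
# The served_months values are hypothetical; the text does not specify them.
three_arrests = [
    PriorSentence("offense A", date(2015, 1, 5), 48, 48),
    PriorSentence("offense B", date(2015, 2, 9), 48, 48),
    PriorSentence("offense C", date(2015, 3, 2), 48, 48),
]

# The same three sentences, but all stemming from a single arrest.
one_arrest = [
    PriorSentence(p.offense, date(2015, 1, 5), p.announced_months, p.served_months)
    for p in three_arrests
]

# A sixteen-month announced sentence of which eight months were served.
california = [PriorSentence("offense D", date(2016, 6, 1), 16, 8)]

print(counted_sentences(three_arrests))  # three separate 48-month sentences
print(counted_sentences(one_arrest))     # one 48-month sentence
print(counted_sentences(california))     # counted at 16 months, not 8
```

The sketch never reads `served_months`, and nothing about the underlying conduct appears in the record at all. The outcome turns entirely on arrest dates and announced terms.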

Such rigid, formalistic rules cause enhancements to apply in irrational ways. Consider the federal guidelines for reentry offenses. A person can get up to two ten-point enhancements for having prior felony convictions, depending on the length of the sentences and when the person was first deported.276 If a person commits one felony and gets a one-week sentence with probation, then is deported, then commits another felony and gets a five-year indeterminate sentence where they are released from custody in two years, and also receives a probation violation on the first case to run concurrently, this will count as two five-year sentences.277 So the person will get a twenty-point enhancement for having two five-year sentences (one before and one after their first deportation), even though they only ever served two years in prison. This kind of illogical result happens in part because the judge in the initial case cannot predict how another jurisdiction will interpret the sentence, and so does not know to send a different signal (for example by announcing a lower sentence, or by not issuing a probation violation). It also happens because all that the system knows is the sentence announced by the prior court, so mechanical formulas must be keyed to that. But the announced sentence is a poor heuristic for the amount of time a person actually served, and an even worse heuristic for the wrongness of what the person did.

Translating intermediate crime categories also presents a problem. When a jurisdiction establishes a category like “crime of violence” or “aggravated felony,” it must frequently decide whether or not crimes from other jurisdictions fit this category. Domestic crimes are a limited enough universe that a law can simply list which ones do or do not count. But this is impossible to do with the crimes of all fifty states and the federal system. Thus, legislatures use categories like “theft offense” or “drug felony” to establish a type of conviction that includes foreign statutes. Doing this delegates to judges the question of which foreign statutes count. But how are judges to decide? American law has tackled this problem through a formalistic test called the “categorical approach.”278 The categorical approach involves comparing the elements of the foreign criminal statute to the elements of the analogous domestic crime (or, if there is no comparable domestic crime, to the generic version of the offense).279 If being guilty under the foreign statute necessarily means one is guilty under the domestic statute, then the two laws are categorical matches. However, if the foreign statute is even slightly broader than the domestic one, then they are not matches. Further, if the foreign statute is properly interpreted as containing several distinct crimes, then courts can use the “modified” categorical approach and look at court records to determine which crime the defendant was convicted of, then compare that one.280 Importantly, the categorical approach does not care about what the defendant actually did. Even if their actual actions violated the laws of both jurisdictions, all that matters is whether the foreign statute is broader than the domestic one. The categorical approach is relied on frequently by courts. In the federal system it is used to decide whether drug mandatory minimum enhancements apply, whether a non-citizen has an aggravated felony and must be deported, and whether certain guidelines enhancements apply, among other questions.281 In state systems it is used to determine whether a foreign crime counts as a “strike,” or as a crime of violence, whether a foreign conviction triggers a guideline enhancement, and other questions.282

This method for translating crimes between systems gives rise to some strange results. Small variations in how different jurisdictions write their criminal laws can make an enormous difference in what happens to particular defendants. For example, say that a person is charged with a federal mandatory minimum enhancement for being convicted of a prior drug felony in the state of Maine. Whether their prior crime counts will depend on whether Maine’s drug law is broader than the federal law. If federal law criminalizes 400 drugs and Maine criminalizes 401, then the laws are not categorical matches and the enhancement does not apply.283 And this is true even if the actual drug the defendant sold was prohibited in both jurisdictions, e.g., if it was heroin. Even the definition of a drug can be overbroad. A recent decision of the Ninth Circuit held that methamphetamine convictions in California are not predicates for federal drug crime enhancements, because California defines methamphetamine as including “optical and geometric isomers,” whereas the federal definition only covers “optical isomers.”284 Other sources of categorical mismatch include differences in the mens rea required for accomplice liability, and differences in the age cutoffs for statutory rape.285
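The logic of the categorical approach in the drug example can be stated as a simple subset test. The following Python sketch is an illustration only, under loud assumptions: each statute is reduced to the set of substances it covers, the substance names other than heroin are placeholders, and the function names are hypothetical. Real statutes have many elements beyond a schedule of substances, and the sketch leaves out the modified categorical approach entirely.

```python
from typing import FrozenSet


def is_categorical_match(state_covers: FrozenSet[str],
                         federal_covers: FrozenSet[str]) -> bool:
    """True only if everything the state statute reaches is also reached by
    the federal statute, i.e. the state statute is no broader."""
    return state_covers <= federal_covers


def enhancement_applies(state_covers: FrozenSet[str],
                        federal_covers: FrozenSet[str],
                        actual_substance: str) -> bool:
    """Decide whether a prior conviction triggers the enhancement.

    actual_substance is accepted but deliberately never consulted: under the
    approach as described in the text, what the defendant actually did is
    irrelevant to the comparison.
    """
    return is_categorical_match(state_covers, federal_covers)


# Hypothetical schedules: 400 federally controlled substances, and a state
# schedule that contains the same 400 plus one more (401 in total).
federal_schedule = frozenset({"heroin"} | {f"substance_{i}" for i in range(1, 400)})
maine_schedule = federal_schedule | {"extra_substance"}

print(len(federal_schedule), len(maine_schedule))
# Heroin is prohibited under both schedules, yet the enhancement does not apply.
print(enhancement_applies(maine_schedule, federal_schedule, "heroin"))
# With identical schedules, the same conviction would count.
print(enhancement_applies(federal_schedule, federal_schedule, "heroin"))
```

The unused `actual_substance` parameter is the point of the sketch. A single extra entry in the state schedule defeats the match for every defendant convicted under that statute, whatever substance was actually involved.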

As these examples suggest, the categorical approach is an incredibly powerful tool for defense lawyers.286 If a clever lawyer can find a minor difference between one jurisdiction’s statute and another’s, they can help their client avoid a severe punishment. Not coincidentally, the categorical approach has been heavily criticized by judges and sentencing commissioners for its arbitrariness. Judge William Pryor, for instance, has called for scrapping the categorical approach and replacing it with a conduct-based approach.287 And the United States Sentencing Commission recently proposed to change the guidelines so that judges use a conduct-based approach, rather than the categorical approach, to decide if prior-conviction-based enhancements apply.288

This criticism of the categorical approach is misplaced.289 As this Article has shown, the categorical approach is just the icing on a layer cake. Our system for counting priors is arbitrary all the way down, beginning with the heuristics drawn from limited information about past cases, moving up to the mechanical formulas built upon those heuristics, and topped off with the problems of interjurisdictional translation for those mechanical formulas. Moving to a conduct-based approach for interjurisdictional translation, even if we could get past the Sixth Amendment issues, would make our system for counting priors only modestly more rational. And, as explained above, there would be major practical problems with moving away from the categorical approach. If the federal sentencing guidelines were reformed to require hearings to prove the conduct underlying past convictions, lawyers would need to go to other court systems to order whatever transcripts and conviction documents were available. These searches would often prove fruitless, because our legal system does not record many substantive facts about past crimes.290 The categorical approach, or something resembling it, is unavoidable in our system because of courts’ inability in most cases to do substantive factfinding.

V. These Features Create a System that Is Mindless, Arbitrary, and Cruel

In the film Cube, five people wake up in a massive complex shaped like a Rubik’s Cube with 17,576 rooms.291 Upon entering some of these rooms, they will immediately suffer a gruesome death. The rest of the rooms are perfectly safe. As they learn over the course of the film, the only way to know which rooms are safe and which are deadly is to master a mathematical equation too complex for the ordinary human mind. They are left to wonder: why are they in this Cube? What is its purpose? What do its designers want with them?

Our criminal justice system functions like this Cube.292 It is inscrutable to the people processed through it, whose fates it decides based on variables they do not understand and cannot control. People are sorted according to cosmetic details of their past encounters with the system—the charges, the announced sentences, the wording of statutes—rather than the substance of their past actions. And this arbitrary sorting process releases some and holds others, gives some misdemeanors and others felonies, locks some up for years with mandatory sentences and releases others on probation. Certain constellations of prior criminal justice encounters will destroy a person; others will spare a person. And the difference can only be explained by someone who understands the system’s technical rules and internal imperatives. It cannot be explained in terms that are morally coherent, or comprehensible to the people being processed.

Returning to the analysis in Part II, how can this system be justified? The only defensible reason for treating someone more severely based on their past actions is that these actions reveal something meaningful about the person, and that revelation merits harsher treatment. Maybe the past actions show that the person is more predisposed than others to committing crimes, or that when they do commit crimes they have a more culpable state of mind than the baseline person. But our system for counting priors does not register the substance of past acts. It is an edifice of formal rules built upon heuristics drawn from the little information recorded by rap sheets. There are three overlapping problems. First there is the heuristics problem: the information the system records as crimes and sentences represents a small and biased slice of the person’s life, is subject to the randomness of the system’s internal processes, and is represented in abstract and information-poor data points.293 Then there is the mechanical formulas problem: these heuristics are built into rigid and arbitrary decision rules.294 Those rules treat certain patterns of heuristics harshly, and others lightly, based solely on whether they match the formula’s triggering requirements. They focus judges’ attention away from the substantive meaning of past events—what actually happened, why it happened, whether the past act reflects who this person is now—and instead on the technical features of past system outputs. And finally, there is the translation problem: the system’s information poverty and efficiency imperative force it to focus on the formal, surface-level features of convictions from other jurisdictions.295 This means that certain foreign convictions will be counted, and others will not, based on differences in states’ sentencing rules and crime definitions. These three problems prevent the system from using criminal history to make meaningful judgments about the actual person.

This way of designing the system also makes lawyers incredibly important. When the technical details of one’s past cases trigger devices like strikes, guideline enhancements, deportation rules, and mandatory minimums, it is vital to have a clever and thorough lawyer who will anticipate these consequences. Consider the example of Ms. L that opened this Article.296 In order for her court-appointed defense lawyer to know that her misdemeanor conviction counted as an aggravated felony, the lawyer would have to be well-versed in the details of federal immigration law. They would have to know not only what an aggravated felony is, but also that a theft offense with a one-year sentence qualifies as an aggravated felony, and further that the suspended portion of a sentence counts towards the sentence. Such knowledge is crucial to correctly advise Ms. L of the consequences of her guilty plea.297 Even more importantly, it is necessary for the lawyer to be able to negotiate effectively on Ms. L’s behalf, for instance by asking for a different charge that will not trigger deportation. This requires the lawyer to look beyond the constraints and imperatives of the current case and consider what a certain outcome might mean for future cases in other systems.298 An effective lawyer is also important in the later case, when the consequences of a prior conviction are determined. Because the technical features of past cases are so decisive, one needs a dedicated lawyer with mastery of the technicalities to look for issues. The categorical approach, for example, requires lawyers to thoroughly compare the statutes of two jurisdictions to find small distinctions. This is complex work. The problem, however, is that our system underinvests in criminal defense lawyers. The majority of criminal defendants are given court-appointed lawyers, and these are often underfunded, complacent, or both.299 And, even more distressingly, the immigration system does not even provide lawyers for defendants in deportation proceedings.300 This means that the highly technical problems that arise in immigration law—e.g., whether a prior conviction counts as an aggravated felony or crime involving moral turpitude under the categorical approach—will be resolved without a lawyer helping the deportee. One could not imagine a more absurd system. People who usually have no legal training and often do not speak English are deported according to rules so arcane they confuse most lawyers, and are given no lawyer to assist them in understanding or using those rules.301

Our system’s method for counting prior convictions also acts as a force multiplier for racial discrimination.302 It renders the discretionary choices of past government actors decisive for how a person is processed in future cases. And, in the criminal justice system, discretion predictably works against minority groups. This includes choices of where to police, whom to arrest, what charges to bring, whether to set bond, what plea bargain terms to offer, what sentence to impose, and how many defense resources to commit.303 Such systemic discrimination means that arrests, convictions, and penalties are unevenly distributed by race. And the reliance on heuristics and mechanical formulas compounds this problem, because it means we exact harsher penalties on people with more past arrests and convictions. Sentencing guideline systems are particularly bad in this regard, because they increase the current sentence based simply on the number of convictions. This emphasis on counting past cases means that the groups who are policed more will also be punished more.

Our system for counting priors is frightening for a further reason: it raises the specter of a criminal justice system that eliminates human moral judgment altogether.304 Systems built around mechanical formulas do not necessarily require human intervention. Indeed, the guidelines used in many jurisdictions for bond determinations, parole release decisions, and sentencing permit judges to absent themselves from the decision-making process if they so choose. They can simply impose the outcome that the guidelines spit out. With such automation, there is a risk of our criminal justice institutions becoming self-reproducing systems that are closed off from any underlying moral justifications, and that operate only according to their own administrative logic.305 Human beings like prosecutors and judges may still technically run the system, but the established case-processing machinery colonizes their intuitions about proper punishment.306 This creates a closed loop where human moral norms provide no check. We defer to the machine. It is larger than any of us, and its outputs dictate our intuitions about what should be done. Mechanical formulas like guidelines, bail formulas, and mandatory sentencing enhancements facilitate such mindlessness.307 They can render moral judgments obsolete by turning human beings into case-processing robots.

Our legal system has built this case-processing machinery so that it can efficiently make use of people’s criminal history. This machinery worsens the system’s racism, creates irrational and unjust results, and fails to justify the use of criminal history in the first place. We should strive to do better.

VI. Directions for Reform

A. Leveling Up or Leveling Down

There are two ways to resolve the paradox of criminal history. One is to increase the amount of information our system has about past criminal convictions, and take that information into account in new cases. The other is to decline to use criminal history when processing new cases. Both of these resolutions would solve the basic contradiction—that criminal history is all-important, but that very little is known about it. Leveling up by providing more information would create a system where the use of criminal history can be theoretically justified, because it reveals meaningful facts about the person. Leveling down by avoiding criminal history would circumvent the issue altogether. Each resolution presents its own challenges.

If we are going to generate more substantive facts about past crimes, we need to find a point in the system where that information can be established and recorded. The difficulty is that, as explained above, the American criminal legal system is designed for rapid pleas and defense waivers rather than for substantive fact-finding.308 Likely the lowest-cost option would be to make police reports more easily available, for example by uploading the texts of police reports to a criminal history repository. The problem with this approach is that police reports are not subjected to the adversary process—they are summaries written by police officers to justify an arrest, and have not been challenged by the defendant or adopted as facts by a court.309 They should not, therefore, become future courts’ official account of what happened.

There are a number of stages in the adjudication process itself where such information could be generated. Here are a few possibilities. First, we could require a contested preliminary hearing in every case, where the government would bring in its witnesses to testify (and be cross-examined) concerning the evidence against the defendant.310 Second, we could require the production of detailed presentence reports in every case, give the defendant the opportunity to challenge the facts in those presentence reports, and then make those reports available to future courts through a searchable database. Third, we could require a more detailed guilty plea colloquy that gets into specific facts concerning the crime, and then have the court publish a summary of the facts as determined at the plea.311 Fourth, in order to generate more specific information about how much time a person served in custody, we could have prison facilities enter release dates into rap sheet databases.312 This is not an exhaustive list of ideas. But it does provide a starting point, should the designers of our criminal legal system decide to invest in generating and preserving information about past cases. Such leveling-up solutions are, of course, expensive. They would require that our legal system actually engage in fact-finding. This would necessarily slow down the rapid-fire case processing machinery that our system relies on to convict so many people every year.313 And furthermore, a leveling-up solution would also require that we develop information systems capable of sharing substantive facts about past cases across time and space.314 This would necessitate a significant investment of resources, as well as coordination between jurisdictions.

The other way out of the paradox is not to use criminal history. One could imagine a criminal legal system where a person is convicted of a crime, and then receives a punishment based only on the facts of that crime. In such a system, the “criminal history score” on the X-axis of sentencing guidelines charts would be removed, and recidivist enhancements like “strikes” and “aggravated felonies” would be repealed. A system of this sort is quite plausible, potentially even desirable. It has been defended by numerous philosophers and criminal law theorists.315 It would not serve the actuarial function of using past crimes to predict future crimes, and assigning punishment based on that prediction.316 Instead, it would simply punish the current crime. Whatever other benefits this would have, it would be more honest than the current system. It would represent a frank recognition that we are unwilling to invest the resources necessary to adequately learn about what happened in past cases. And since our information about past crimes is so poor, we cannot justify relying on it in future cases.

B. Introducing Discretion

There is also a second way to mitigate the problems with the current system: give judges more discretion over how to weigh criminal history. Increasing judicial discretion is, indeed, a necessary component of a leveling-up strategy. If mechanical formulas remained in place, there would be no point in providing more information about past convictions. Rich factual details can only be weighed if judges are given the authority to weigh them. On the other hand, increasing judicial discretion can mitigate the problems of the current system even if we do not pursue a leveling-up strategy. If a judge is given discretion over how much weight to give a prior conviction, they can at least take into consideration their uncertainty about what actually happened. Judges can even, in cases where a past conviction matters a lot, instruct the parties to provide more substantive information (e.g., finding police reports, transcripts, presentence reports, or witnesses). Presently, increasing discretion without providing more information seems to be a popular model of reform.317 This makes some sense—it requires fewer resources than a leveling-up approach, and is more politically palatable than a leveling-down approach.

There is a familiar debate in legal theory over the relative merits of rules and standards.318 Rules have certain advantages: they are easier to administer, provide better notice, and impose uniformity between decisionmakers. Standards also have advantages: they let judges take more factors into account, and mitigate the problems of under- and over-inclusiveness. As a reform strategy, increasing judicial discretion involves replacing rules with standards. Rules certainly have an important place in criminal law—for example, in defining the scope of criminal conduct.319 But this Article has shown that there are significant problems with using rules to count criminal history. In this context, rules are uniquely harsh, complex, and arbitrary.320 Admittedly, moving from a rules-based system to a discretionary system replaces one form of arbitrariness with another. In a rules-based system, the result turns on the superficial features of your prior encounters with criminal courts. In a discretionary system, the result turns on the substantive views of the judge you happen to draw. But the latter system has two features that recommend it as a model for reform. First, it bases decisions on human moral judgments rather than the outputs of a thoughtless machine. This places an important outer limit on unjust punishment, because an actual human being has to decide that the punishment is merited. Second, individual judges can coordinate their decisions through guidelines, courthouse norms, appellate review, and other mechanisms.321 This at least mitigates the randomness of variation between judges.

C. Proposals Within the Current Structure

A further option is to live with the paradox, and to maintain an information-poor bureaucracy that relies on criminal history. If this is the option we pursue, there are a number of limited reforms that could mitigate the system’s defects. Three are explored here. This is by no means an exhaustive list.

1. Eliminate “Time Served” Sentences

In the near-universal custom of American criminal courts, when a person receives a sentence that means they will not serve any more time in custody, that sentence is announced as “time served.” This means that the pronounced sentence includes all of the time that the defendant has already spent in jail, and no additional time. This custom inflates the length of many sentences, because any number of different events can extend the interval of time between an arrest and a sentencing hearing. A person might be sent to the hospital for a period of time while in custody. They might have an unusually slow defense lawyer.322 The court may be overloaded with cases. They may end up fighting the case for a period of time before they plead guilty. They may need to wait on evidence from the government. There may be a natural disaster: in the aftermath of Hurricane Katrina in New Orleans, about 8,000 people remained in custody months past the dates they should have been released because the entire legal system collapsed.323 Court appearances have similarly been suspended across the country during the 2020 coronavirus pandemic. Some people even fight a case for years, beat the more serious charge, and then plead guilty to a lesser charge.324 The problem with imposing a “time served” sentence in these kinds of cases is that the rap sheet then registers the amount of time the person was in custody, and that amount of time is used in future cases as a heuristic for the seriousness of the crime. This is unfairly inflationary. If the amount of time the person deserved to serve is shorter than the amount of time they actually served, then the signal will be too high in future cases.325 This is especially unfair in situations where serious consequences attach to the heuristic of a past sentence length. To remedy this problem, courts should announce the sentence that is deserved at the sentencing hearing, rather than imposing the amount of time the person has served in custody.326

2. Expand Opportunities to Challenge or Revise Past Convictions

If our system is going to decide people’s fates based on cosmetic features of their past cases, it should allow them to revisit, and possibly correct, those features. The actors deciding what happens with a case in Time 1 generally focus only on that case. They often do not anticipate the downstream consequences for a case in Time 2. But minor differences in the charge or sentence in Time 1 can cause immense harm to the defendant in Time 2, through a mandatory minimum sentence or deportation. Such consequences may never have been predicted or intended by the actors in Time 1, and those actors may wish to prevent the consequences, if given the opportunity. Because of this problem, there should be more robust opportunities to revise what happened in Time 1 after the fact. If the court that heard the case in Time 1 is willing to retroactively expunge the conviction, or to reclassify it as a misdemeanor or some other charge that avoids triggering a mechanical formula, the court in Time 2 should accept this change and alter its decision accordingly.327 For example, say that a person received a 16-month indeterminate sentence in state court at Time 1, and it was understood that they would only serve 8 months of this sentence. Now, say in Time 2 they are charged with a federal drug crime, and they cannot get under the 10-year mandatory minimum through the “safety valve” exception because their prior sentence is longer than 13 months.328 In these circumstances, they should be able to go back to state court and ask that the sentence be reclassified as an 8-month sentence. And the federal court should then grant the person safety valve relief. Generally, our system does not allow this kind of thing.329 But it should.

Analogous principles should also apply on direct appeal. For instance, under the mootness doctrine, a defendant may not be able to challenge the length of their sentence if that sentence has already been served when the appeal is decided.330 But, just like the fact of a conviction, the length of a sentence can have enormous consequences for a person in future cases. People should therefore be able to appeal sentences that are too long, and get those sentences retroactively shortened, even if they have already been served.

3. Anticipate Federal Immigration Consequences

The federal immigration system depends on information from state criminal justice systems to deport people.331 The actors who run state criminal justice systems must, therefore, responsibly anticipate the immigration consequences of their choices. This begins with legislatures. Small differences in the wording of state criminal laws can decide whether they match with federal categories of deportable offenses, like “aggravated felonies” or “crimes involving moral turpitude.” Legislatures can, thus, strategically write their laws to protect people from deportation. For example, in 2011 the Washington legislature fixed the problem that inspired the example of Ms. L at the beginning of this Article. It did so by reducing the maximum sentence of all misdemeanors in Washington to 364 days.332 With that change, a misdemeanor will not qualify as an aggravated felony or a crime involving moral turpitude because there is no way to impose a sentence of one year or more.333 Nevada, California, and Utah have all enacted similar laws limiting misdemeanors to 364 days, also intending to prevent misdemeanor defendants from suffering deportation.334 States could make other crimes non-deportable by taking advantage of the categorical approach. They could do so by defining their criminal laws as slightly broader than the relevant federal equivalents.335
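The effect of the 364-day cap can be illustrated with a minimal sketch of the one-year trigger as this Article describes it. The Python below rests on assumptions that should be stated plainly: “one year” is modeled as 365 days, the function and constant names are hypothetical, and the only rule encoded is that the announced term counts in full, including any suspended portion. It is not a statement of how any immigration statute actually operates.

```python
ONE_YEAR_IN_DAYS = 365  # assumption: "one year" is modeled as 365 days


def meets_one_year_threshold(announced_days: int) -> bool:
    """True if the announced term, suspended portion included, is a year or more.

    The number of days the person is actually expected to serve plays no role.
    """
    return announced_days >= ONE_YEAR_IN_DAYS


def cap_prevents_trigger(statutory_max_days: int) -> bool:
    """True if no lawful sentence under the cap can reach the threshold."""
    return not meets_one_year_threshold(statutory_max_days)


# A 365-day suspended sentence, as in the Ms. L example, reaches the threshold
# even though she is expected to spend only one day in jail.
print(meets_one_year_threshold(365))

# With a 364-day statutory maximum, no misdemeanor sentence can reach it.
print(cap_prevents_trigger(364))
print(cap_prevents_trigger(365))
```

A difference of a single day in the statutory maximum flips the result for every misdemeanor defendant in the state. That is why so small a legislative change can carry such large consequences.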

It is also crucial that lawyers working in state criminal justice systems understand the immigration consequences of various state laws. Because state criminal convictions decide whether or not a person is deported, state court systems operate as de facto immigration courts.336 And state prosecutors are the most powerful actors in these courts, since they decide whether immigration-safe charges will be brought. State prosecutors should use this power to protect defendants from deportation in cases where that would be an unjust outcome. Defense lawyers also have an obligation to know the immigration consequences of various charges, and to inform defendants of such.337 This can be quite a complicated task. State public defender systems should employ immigration lawyers who can provide detailed advice on the probable immigration consequences of various charges and plea offers. They should also develop widely accessible practice advisories for each state explaining how certain criminal charges could affect a person in the immigration system.338

Conclusion

Our criminal and immigration systems place enormous weight on past crimes but know little about them. This is because our courts are not designed to make rich factual records containing the details of past crimes. They are instead designed to facilitate a rapid and information-poor plea-bargaining process. To build a system that preserved and made use of substantive information about past crimes would take significant resources. But the system is overloaded. There are nearly 2.3 million people incarcerated in the United States.339 To lock up that many people requires efficient case-processing machinery. Our criminal justice system has too few employees, too few resources, and far too much volume to give substantive consideration to most cases. So instead of investigating a defendant’s past crimes, we rely on a byzantine system of mandatory penalties keyed to heuristics drawn from computerized rap sheets. This machinery lets us use criminal history to process and punish people efficiently, with a minimum amount of thought.

* Acting Professor, University of California-Davis School of Law