# Probability

## Abstract and Keywords

Rather than entailing that a particular outcome will occur, many scientific theories only entail that an outcome will occur with a certain probability. Because scientific evidence inevitably falls short of conclusive proof, when choosing between different theories it is standard to make reference to how probable the various options are in light of the evidence. A full understanding of probability in science needs to address both the role of probabilities in theories, or chances, as well as the role of probabilistic judgment in theory choice. In this chapter, the author introduces and distinguishes the two sorts of probability from one another and attempts to offer a satisfactory characterization of how the different uses for probability in science are to be understood. A closing section turns to the question of how views about the chance of some outcome should guide our confidence in that outcome.

Keywords: probability, chance, frequency, determinism, philosophy of science, credence, confirmation, Bayesianism, the Principal Principle

## 1 Introduction

When thinking about probability in the philosophy of science, we ought to distinguish between probabilities *in* (or *according to*) theories, and probabilities *of* theories.^{1}

The former category includes those probabilities that are assigned to events by particular theories. One example is the quantum mechanical assignment of a probability to a measurement outcome in accordance with the Born rule. Another is the assignment of a probability to a specific genotype appearing in an individual, randomly selected from a suitable population, in accordance with the Hardy-Weinberg principle. Whether or not orthodox quantum mechanics is correct, and whether or not any natural populations meet the conditions for Hardy-Weinberg equilibrium, these theories still assign probabilities to events in their purview. I call the probabilities assigned to events by a particular theory *t*, *t*-*chances*. The *chances* are those *t*-chances assigned by any true theory *t*. The assignment of chances to outcomes is just one more factual issue on which different scientific theories can disagree; the question of whether the *t*-chances are the chances is just the question of whether *t* is a true theory.
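
As a rough illustration of a *t*-chance assignment, the sketch below computes the genotype probabilities that the Hardy-Weinberg principle assigns for a single locus with two alleles. The function name `hardy_weinberg`, the allele labels, and the allele frequency 3/10 are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def hardy_weinberg(p):
    """Genotype probabilities for one locus with two alleles, A and a.

    p is the frequency of allele A (an assumed input); q = 1 - p is the
    frequency of allele a. Returns the Hardy-Weinberg proportions.
    """
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Illustrative allele frequency, chosen only for the example.
genotype_chances = hardy_weinberg(Fraction(3, 10))
```

Whether these numbers are the chances, and not merely the *t*-chances, depends on whether the population really meets the equilibrium conditions.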

Probabilities of theories are relevant to just this question. In many cases, it is quite clear what *t*-chances a particular theory assigns, while it is not clear how probable it is that those are the chances. Indeed, for any contentful theory, it can be wondered how probable it is that its content is correct, even if the theory itself assigns no chances to outcomes.

Perhaps these two seemingly distinct uses of probability in philosophy of science can ultimately be reduced to one underlying species of probability. But, at first glance, the prospects of such a reduction do not look promising, and I will approach the two topics separately in the following two sections of this chapter. In the last section, I will turn to the prospect of links between probabilities in theories and probabilities of theories.

## 2 Probabilities in Theories

## 2.1 Formal Features

Scientific theories have two aims: prediction and explanation. But different theories may predict and explain different things, and generally each theory will be in a position to address the truth of only a certain range of propositions. So, at least implicitly, each theory must be associated with its own space of (possible) *outcomes*, a collection of propositions describing the events potentially covered by the theory. We can assume that, for any theory, its space of outcomes forms an *algebra*. That is, (1) the trivially necessary outcome ⊤ is a member of any space of outcomes; (2) if *p* is an outcome, then so is its negation ¬*p*; and (3) if there exists some countable set of outcomes {*p*_{1}, *p*_{2}, …}, then their countable disjunction ⋁_{i}*p*_{i} is also an outcome. Each of these properties is natural to impose on a space of outcomes considered as a set of propositions the truth value of which a given theory could predict or explain.

A theory includes chances only if its principles or laws make use of a *probability function P* that assigns numbers to every proposition *p* in its space of outcomes *O*, in accordance with these three axioms, first formulated explicitly by Kolmogorov (1933):

1. $P(\top) = 1$;

2. $P(p) \ge 0$ for any $p \in O$;

3. $P\left(\bigvee_{i} p_{i}\right) = \sum_{i} P(p_{i})$ for any countable set of mutually exclusive outcomes $\{p_{1}, \ldots\} \subseteq O$.

To these axioms is standardly added a stipulative definition of *conditional probability*. The notation *P*(*p*|*q*) is read *the probability of p given q*, and defined as a ratio of unconditional probabilities: when $P(q) > 0$, $P(p|q) =_{\mathrm{df}} \frac{P(p \wedge q)}{P(q)}$.
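
The axioms and the ratio definition can be checked mechanically on a small finite outcome space. The sketch below assumes a toy theory whose atoms are the six faces of a die with equal weights; the names `ATOMS`, `WEIGHT`, `P`, `conditional`, and `satisfies_axioms` are hypothetical. In a finite space, countable additivity reduces to additivity over pairs of mutually exclusive outcomes.

```python
from fractions import Fraction
from itertools import combinations

# Assumed toy theory: six atomic outcomes of a die with equal weights.
ATOMS = frozenset(range(1, 7))
WEIGHT = {a: Fraction(1, 6) for a in ATOMS}

def P(event):
    """Probability of an outcome, represented as a set of atoms."""
    return sum((WEIGHT[a] for a in event), Fraction(0))

def conditional(p, q):
    """P(p | q) by the ratio definition; undefined when P(q) = 0."""
    if P(q) == 0:
        raise ValueError("P(q) must be positive")
    return P(p & q) / P(q)

def all_events(atoms):
    """Every subset of the atoms: the algebra of outcomes."""
    items = sorted(atoms)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

EVENTS = all_events(ATOMS)

def satisfies_axioms():
    normalized = P(ATOMS) == 1                      # axiom 1
    non_negative = all(P(e) >= 0 for e in EVENTS)   # axiom 2
    additive = all(P(a | b) == P(a) + P(b)          # axiom 3, finite case
                   for a in EVENTS for b in EVENTS if not (a & b))
    return normalized and non_negative and additive
```

Exact fractions are used so that the equalities in the axioms can be tested without floating-point tolerance.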

A theory that involves such a function *P* meets a formal mathematical condition on being a chance theory. (Of course, a theory may involve a whole family of probability functions, and involve laws specifying which probability function is to apply in particular circumstances.) As Kolmogorov noted, the formal axioms treat probability as a species of *measure*, and there can be formally similar measures that are not plausibly chances. Suppose some theory involved a function that assigned to each event a number corresponding to what proportion of the volume of the universe it occupied. Such a function would meet the Kolmogorov axioms, but “normalized event volume” is not chance. The formal condition is necessary, but insufficient for a theory to involve chances (Schaffer 2007, 116; but see Eagle 2011*a*, Appendix A).

## 2.2 The Modal Aspect of Chance

What is needed, in addition to a space of outcomes and a probability measure over that space, is that the probability measure play the right role in the theory. Because the outcomes are those propositions whose truth value is potentially predicted or explained by the theory, each outcome—or its negation, or both—is a *possibility* according to the theory. Because scientific theories concern what the world is like, these outcomes are objectively (physically) possible, according to the associated theory. It is an old idea that probability measures the “degree of possibility” of outcomes, and chances in theories correspondingly provide an objective measure of how possible each outcome is, according to the theory. This is what makes a probability function in theory *T* a chance: that *T* uses it to quantify the possibility of its outcomes (Mellor 2005, 45–48; Schaffer 2007, 124; Eagle 2011*a*, §2).

Principles about chance discussed in the literature can be viewed as more precise versions of this basic idea about the modal character of chances. Consider the Basic Chance Principle (bcp) of Bigelow, Collins, and Pargetter (1993), which states that when the *T*-chance of some outcome *p* is positive at *w*, there must be a possible world (assuming *T* is metaphysically possible) in which *p* occurs and in which the *T*-chance of *p* is the same as in *w*, and which is relevantly like *w* in causal history prior to *p*:^{2}

In general, if the chance of A is positive there must be a possible future in which A is true. Let us say that any such possible future grounds the positive chance of A. But what kinds of worlds can have futures that ground the fact that there is a positive present chance of A in the actual world? Not just any old worlds … [T]he positive present chance of A in this world must be grounded by the future course of events in some A-world sharing the history of our world and in which the present chance of A has the same value as it has in our world. That is precisely the content of the bcp.

So the existence of a positive chance of an outcome entails the possibility of that outcome, and indeed, the possibility of that outcome under the circumstances which supported the original chance assignment. Indeed, Bigelow et al. argue that “anything that failed to satisfy the bcp would not deserve to be called *chance*.”

Does the bcp merely precisify the informal platitude that *p’s* having some chance, in some given circumstances, entails that *p* is possible in those circumstances? The bcp entails that when an outcome has some chance, a very specific metaphysical possibility must exist: one with the same chances supported by the same history, in which the outcome occurs. This means that the bcp is incompatible with some live possibilities about the nature of chance. For example, consider *reductionism* about chance, the view that the chances in a world supervene on the Humean mosaic of occurrent events in that world (Lewis 1994; Loewer 2004). Reductionism is motivated by the observation that chances and frequencies don’t drastically diverge, and indeed, as noted earlier, that chances predict frequencies. Reductionists explain this observation by proposing that there is a nonaccidental connection (supervenience) between the chances and the pattern of outcomes, including outcome frequencies: chances and long-run frequencies don’t diverge because they can’t diverge. Plausible versions of this view require supervenience of chances at some time *t* on the total mosaic, not merely history prior to *t* (otherwise chances at very early times won’t reflect the laws of nature but only the vagaries of early history). Consider, then, a world in which the chance of heads in a fair coin toss is 1/2; the chance of 1 million heads in a row is $1/2^{(10^{6})}$. By the bcp, since this chance is positive, there must be some world *w* in which the chance of heads remains 1/2, but where this extraordinary run of heads occurs. But if the prior history contains few enough heads, the long run of heads will mean that the overall frequency of heads in *w* is very high, high enough that any plausible reductionist view will deny that the chance of heads in *w* can remain 1/2 while the pattern of outcomes is so skewed toward heads. So reductionists will deny the bcp for at least some outcomes: those that, were they to occur, would *undermine* the actual chances (Ismael 1996; Lewis 1994; Thau 1994).
Reductionists will need to offer some other formal rendering of the informal platitude. One candidate is the claim—considerably weaker than bcp—that when the actual chance of *p* is positive at *t*, there must be a world perfectly alike in history up to *t* in which *p*.^{3}
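
The arithmetic behind the undermining example can be made concrete. The sketch below computes the order of magnitude of the chance of a run of one million heads, and the overall frequency of heads once that run is appended to a short prior history. The prior history of 1,000 tosses with 500 heads is an assumption made purely for illustration, as are the names used.

```python
import math
from fractions import Fraction

RUN_LENGTH = 10**6

# log10 of (1/2)**RUN_LENGTH; the chance itself underflows a float,
# although it is strictly positive.
log10_chance = -RUN_LENGTH * math.log10(2)

def overall_frequency(prior_heads, prior_tosses, run=RUN_LENGTH):
    """Relative frequency of heads once a run of `run` heads is appended."""
    return Fraction(prior_heads + run, prior_tosses + run)

# Assumed prior history: 1,000 tosses, half of them heads.
freq = overall_frequency(500, 1000)
```

On these assumed numbers the chance is positive but has roughly three hundred thousand zeros after the decimal point, while the resulting frequency of heads exceeds 0.999, which is the kind of skew a reductionist takes to be incompatible with a chance of 1/2.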

## 2.3 Chance and Frequency

Chancy theories do not make all-or-nothing predictions about contingent outcomes. But they will make predictions about outcome *frequencies*. It makes no sense to talk about “the frequency” of an outcome because a frequency is a measure of how often a certain type of outcome occurs in a given trial population; different populations will give rise to different frequencies. The frequencies that matter, those that are predicted by chance theories, are those relative to populations that consist of genuine repetitions of the same experimental setup. The population should consist of what von Mises called “mass phenomena or repetitive events … in which either the same event repeats itself again and again, or a great number of uniform elements are involved at the same time” (von Mises 1957, 10–11). Von Mises himself held the radical view that chance simply reduced to frequency, but one needn’t adopt that position in order to endorse the very plausible claim that chance theories make firm predictions only about “practically unlimited sequences of uniform observations.”

How can a chance theory make predictions about outcome frequencies? Not in virtue of principles we’ve discussed already: although a probability function over possible outcomes might tell us something about the distribution of outcomes in different possible alternate worlds, it doesn’t tell us anything about the actual distribution of outcomes (van Fraassen 1989, 81–86). Because chances do predict frequencies, some additional principle is satisfied by well-behaved chances. One suggestion is the Stable Trial Principle (STP):^{4}

If (i) *A* concerns the outcome of an experimental setup *E* at *t*, and (ii) *B* concerns the same outcome of a perfect repetition of *E* at a later time *t′*, then $P_{tw}(A) = x = P_{t'w}(B)$. The STP predicts, for instance, that if one repeats a coin flip, the chance of heads should be the same on both trials.

(Schaffer 2003, 37)

If chances obey the STP for all possible outcomes of a given experimental setup, they will be *identically distributed*. Moreover, satisfaction of the STP normally involves the trials being *independent* (dependent trials may not exhibit stable chances because the chances vary with the outcomes of previous trials). So if, as Schaffer and others argue, STP (or something near enough) is a basic truth about the kinds of chances that appear in our best theories, then the probabilities that feature in the laws of our best theories meet the conditions for the *strong law of large numbers*: that almost all (i.e., with probability 1) infinite series of trials exhibit a limit frequency for each outcome type identical to the chance of that outcome type.^{5}
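
A simulation can illustrate, though not prove, the behavior the strong law describes. The sketch below assumes independent, identically distributed trials with a fixed chance of success and tracks the running relative frequency; the function name, the seed, and the number of trials are arbitrary choices for the example.

```python
import random

def running_frequency(chance, n_trials, seed=0):
    """Simulate independent, identically distributed trials and return the
    relative frequency of successes after each trial."""
    rng = random.Random(seed)
    successes = 0
    frequencies = []
    for n in range(1, n_trials + 1):
        if rng.random() < chance:
            successes += 1
        frequencies.append(successes / n)
    return frequencies

freqs = running_frequency(chance=0.5, n_trials=100_000)
```

The law itself concerns infinite sequences, so a finite pseudo-random run of this kind can only show the frequencies settling near the chance, not that they must.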

If a theory entails that repeated trials are stable in their chances (that is, that the causal structure of the trials precludes “memory” of previous trials and that the chance setup can be more or less insulated from environmental variations), then the chance, according to that theory, of an infinite sequence of trials not exhibiting outcome frequencies that reflect the chances of those outcomes is zero. The converse to the bcp, namely that there should be a possible world in which *p* only if *p* has some positive chance, is false: it is possible that a fair coin, tossed infinitely many times, lands heads every time. But something weaker is very plausible (see the principle HCP discussed later): that if the chance of *p* being the outcome of a chance process is 1, then we ought to expect that *p* would result from that process. Given this, an infinite sequence of stable trials would be expected to result in an outcome sequence reflecting the chances.

So some chance theories make definite predictions about what kinds of frequencies would appear in an infinite sequence of outcomes. They make no definite prediction about what would happen in a finite sequence of outcomes. They will entail that the chance of a reasonably long sequence of trials exhibiting frequencies that are close to the chances is high, and that the chance increases with the length of the sequence. But because that, too, is a claim about the chances, it doesn’t help us understand how chance theories make predictions. If the goals of science are prediction and explanation, we remain unclear on how chancy theories achieve these goals.
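
The claim about finite sequences can be given exact numbers in the simplest case. The sketch below assumes independent trials with a fixed chance of success and computes, from the binomial distribution, the chance that the relative frequency lands within a tolerance of that chance. The tolerance of 1/10 and the sample lengths are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def chance_frequency_close(n, chance=Fraction(1, 2), tol=Fraction(1, 10)):
    """Exact chance that the relative frequency of successes in n
    independent trials lies within `tol` of `chance`."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - chance) <= tol:
            total += comb(n, k) * chance**k * (1 - chance)**(n - k)
    return total

closeness = {n: float(chance_frequency_close(n)) for n in (10, 100, 1000)}
```

For these three lengths the computed value rises toward 1, but every entry is itself a chance, which is why such results do not by themselves explain how chancy theories predict.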

Our actual practice with regard to this issue suggests that we implicitly accept some further supplementary principles. Possibility interacts with the goals of prediction and explanation. In particular: the more possible an outcome is, the more frequently outcomes of that type occur. If a theory assigns to *p* a high chance, it predicts that events relevantly like *p* will frequently occur when the opportunity for *p* arises. Chance, formalizing degree of possibility, then obeys these principles:

• HCE: If the chance of *p* is high according to *T*, and *p*, then *T* explains why *p*.^{6}

• HCP: If the chance of *p* is high according to *T*, then *T* predicts that *p*, or at least counsels us to expect that *p*.

These principles reflect our tacit opinion on how chance theories predict and explain the possible outcomes in their purview. So, if a probability measure in some theory is used in frequency predictions, that suggests the measure is a chance function. Similarly, if an outcome has occurred, and theoretical explanations of the outcome cite the high value assigned to the outcome by some probability measure, then that indicates the measure is functioning as a chance because citing a high chance of some outcome is generally explanatory of that outcome. So, chance theories will predict and explain observed frequencies, so long as the series of repeated stable trials is sufficiently long for the theory to entail a high chance that the frequencies match the theoretical chances.

Emery (2015, 20) argues that a basic characteristic of chance is that it explains frequency: a probability function is a chance function if high probability explains high frequency. Assuming that high frequency of an outcome type can explain why an instance of that type occurred, HCE is a consequence of her explanatory criterion for chance.

## 2.4 Classical and Propensity Chances

Two issues should be addressed before we see some chance theories that illustrate the themes of this section. The first concerns the modal aspect of chances. I’ve suggested that chance theories involve a space of possible outcomes, and that chances measure possibilities. These are both core commitments of the *classical theory of chance*. But that theory adds a further element; namely, that “equally possible” outcomes should be assigned the same chance:

The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought.

(Laplace 1951, 6–7)

This further element is not now widely accepted as part of the theory of chance. It is presupposed in many textbook probability problems that do not explicitly specify a probability function: for example, we might be told the numbers of red and black balls in an urn, and asked a question about the probability of drawing two red balls in a row with replacement. The only way to answer such a question is to assume that drawing each ball is a case “equally possible,” so that the facts about the number of different kinds of ball determine the probability. The assumption seems right in this case, though that appears to have more to do with the pragmatics of mathematics textbooks than with the correctness of the classical theory (the pragmatic maxim being that one ought to assume a uniform distribution over elementary outcomes by default, and accordingly mention the distribution explicitly only when it is non-uniform over such outcomes). But the classical theory fails to accommodate any chance distribution in which elementary outcomes, atoms of the outcome space, are not given equal probability, such as those needed to model weighted dice. Since the space of outcomes for a weighted die is the same as for a fair die, chance cannot supervene on the structure of the space of outcomes in the way that the classical theory envisages. Although chance has a modal dimension, it is unlike alethic modals, where the modal status of a proposition as necessary, or possible, can be read straight off the structure of the space of possible outcomes. And, for that reason, the classical theory of chance is untenable. That is not to say that symmetries play no role in the theory of chance; it is clearly plausible in many cases that empirically detected symmetries in the chance setup and the space of outcomes play a role in justifying a uniform distribution (North 2010; Strevens 1998). 
But the fact that symmetrical outcomes, where they exist, can justify equal probabilities for those outcomes falls well short of entailing that they always exist, or that they always justify equal probabilities.
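
The contrast between the textbook urn problem and the weighted die can be sketched as follows. The urn contents (3 red, 7 black) and the particular weights on the loaded die are assumptions chosen for illustration; the point is only that the two dice share an outcome space while differing in their chance assignments.

```python
from fractions import Fraction

def two_reds_with_replacement(red, black):
    """Classical calculation: each ball is treated as equally possible and
    the two draws are independent because the ball is replaced."""
    p_red = Fraction(red, red + black)
    return p_red * p_red

# Assumed urn contents, chosen only for illustration.
urn_answer = two_reds_with_replacement(red=3, black=7)

# Same six-element outcome space, two different chance assignments.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
weighted_die = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
                4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}
```

Both dictionaries are probability functions over the same atoms, so nothing in the structure of the outcome space alone settles which one applies.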

The other issue I wish to take up here concerns the formal aspect of chances. I’ve assumed that chances must meet the probability axioms, and the pressing question on that assumption was how to discriminate chances from other mathematically similar quantities. But having picked on the connections between chance, possibility and frequency to characterize the functional role of chance, a further question now arises: *must something that plays this functional role be a probability?*

Given that chances predict and explain outcome frequencies in repeated trials, an attractive thought is that they manage to do so by being grounded in some feature of the trials constituting a tendency for the setup involved to cause a certain outcome. It is standard to call such a tendency a *propensity* (Giere 1973; Popper 1959). The idea that chances reflect the strength of a causal tendency both promises to explain the modal character of chances (because if a chance setup has some tendency to produce an outcome, it is not surprising that the outcome should be a possible one for that setup) and also to provide illumination concerning other aspects of the chance role. The STP, for example, turns out to be a direct consequence of requiring that repeated trials be alike in their causal structure.

This invocation of tendencies or propensities is implicitly conditional: a propensity for a certain outcome, given a prior cause. So it is unsurprising that many propensity accounts of chance take the fundamental notion to be conditional, recovering unconditional chances (if needed) in some other way (Popper 1959; see also Hájek 2003). The chance of complications from a disease could be understood as the chance of the complication, given that someone has the disease; or, to put it another way, the propensity for the complication to arise from the disease. The idea of conditional probability antedates the explicit stipulative definition given earlier, and causal propensities seem promising as a way of explicating the pre-theoretical notion.

But as Humphreys pointed out (1985; see also Milne 1986), grounding conditional chances in conditional propensities makes it difficult to understand the existence of non-trivial chances without such underlying propensities. Yet such chances are ubiquitous, given that chances are probabilities. The following theorem, *Bayes’ theorem*, is an elementary consequence of the definition of conditional probability:

$P(p|q) = \frac{P(q|p)\,P(p)}{P(q)}$

We say that *p* is *probabilistically independent* of *q* if $P\left(p|q\right)=P\left(p\right)$. It follows from this definition and Bayes’ theorem that *p* is probabilistically independent of *q* if and only if *q* is independent of *p*. If *p* is causally prior to *q*, then although there may be a propensity for *p* to produce *q*, there will generally be no propensity for *q* to produce *p*. If there is no propensity for *q* to produce *p*, then the chance of *p* given *q* should be just the unconditional chance of *p* because the presence of *q* makes no difference: *p* should be probabilistically independent of *q*. But it will then follow that *q* is independent of *p* too, even though there is a propensity for *p* to produce *q*. Humphreys offers an illustrative example: “heavy cigarette smoking increases the propensity for lung cancer, whereas the presence of (undiscovered) lung cancer has no effect on the propensity to smoke” (Humphreys 1985, 559). In this case, causal dependence is not symmetric. There is a causal dependence of lung cancer on smoking—a propensity for smoking to produce lung cancer. This propensity grounds a chancy dependence of lung cancer on smoking. But there is no causal dependence of smoking on lung cancer; so, on the propensity view, there should not be a dependence in the chances. These two results are not compatible if chances are probabilities because probabilistic dependence is symmetrical:

a necessary condition for probability theory to provide the correct answer for conditional propensities is that any influence on the propensity which is present in one direction must also be present in the other. Yet it is just this symmetry which is lacking in most propensities.

(Humphreys 1985, 559)

Humphreys uses this observation in arguing that chances aren’t probabilistic, because conditional chances are grounded in conditional propensities. But even conceding that some chances are grounded in propensities, many are not. Consider the conditional chance of rolling a 3 with a fair die, given that an odd number was rolled. There is no causal influence between rolling an odd number and rolling a three—whatever dependence exists is constitutive, not causal. These conditional chances exist without any causal propensity to ground them. Given this, the lack of causal dependence of smoking on lung cancer, and the corresponding absence of a propensity, cannot entail that there is no chance dependence between those outcomes. We could retain the role of causal tendencies in explaining certain conditional chances without violating the probability calculus. To do so, we must abandon the idea that every conditional chance should be grounded in a causal tendency. Retaining the formal condition that chances be probabilities certainly puts us in a better position to understand why it is probability functions that appear in our best scientific theories, rather than whatever mathematical theory of “quasi-probabilities” might emerge from executing Humphreys’s program of re-founding the theory of chances on causal propensities. (For an example of what such a theory might look like, see Fetzer 1981, 59–67.) Many successful theories presuppose that chances are probabilities, and dropping that presupposition involves us in a heroic revisionary project that we might, for practical reasons, wish to avoid if at all possible.
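
The die example, and the symmetry of probabilistic dependence on which the argument turns, can be worked through directly from the ratio definition. This is a minimal sketch assuming a fair six-sided die; the helper names are hypothetical.

```python
from fractions import Fraction

FACES = range(1, 7)

def P(event):
    """Chance of an event (a set of faces) for a fair die."""
    return Fraction(len(set(event) & set(FACES)), 6)

def conditional(p, q):
    """P(p | q) by the ratio definition, assuming P(q) > 0."""
    return P(set(p) & set(q)) / P(q)

three = {3}
odd = {1, 3, 5}

chance_three_given_odd = conditional(three, odd)   # 1/3
chance_odd_given_three = conditional(odd, three)   # 1

# Dependence runs in both directions: each conditional chance differs
# from the corresponding unconditional chance.
three_depends_on_odd = chance_three_given_odd != P(three)
odd_depends_on_three = chance_odd_given_three != P(odd)
```

Neither conditional chance here corresponds to a causal tendency, yet both are well defined and both differ from the unconditional values, as the probability calculus requires.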

## 2.5 Examples of Chance in Scientific Theory

I’ve suggested that theories involve chances only when they involve a probability assignment to a theoretically possible outcome that is used by that theory in prediction and explanation of outcome frequencies. So understood, a number of current scientific theories involve chances. Examination of the grounds for this claim involves unavoidable engagement with the details of the theories; and interestingly enough, the way that different theories implement a chance function does not always follow a common pattern. I give more details in the longer online version of this chapter.

What is common to implementations of chance in physics, from quantum mechanics to classical statistical mechanics, is the use of a *possibility space* (see Rickles, this volume: §3.1). Regions of this space are possible outcomes, and the (once appropriately normalized) volumes of such regions get a natural interpretation as chances if the theory in question gives the appropriate theoretical role to such volumes. What we see in quantum and classical theories is that an assignment of numbers that is formally a probability follows more or less directly from the basic postulates of the theories and that those numbers are essentially used by those theories to predict and explain outcomes (including frequency outcomes) and thus play the chance role.^{7}

So, in quantum mechanics, the wavefunction—which represents the quantum state of a system at some time—determines a probability, via the Born rule, for every possible outcome (region of the quantum possibility space), and it is those probabilities that give the theory its explanatory and predictive power because it is those probabilities that link the unfamiliar quantum state with familiar macroscopic measurement outcomes. These probabilities satisfy the role of chance and so are chances. Many questions remain about how precisely the measurement outcomes relate to the possibility space formalism, but no answer to such questions suggests that a quantum explanation of a measurement outcome might succeed even without citing a probability derived using the Born rule (Albert 1992; Ney 2013; Wallace 2011).
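
For a finite-dimensional system, the way the Born rule turns a quantum state into a probability assignment can be sketched in a few lines. The sketch assumes the state is given by its complex amplitudes in the measurement basis; the function name and the two-outcome state are illustrative assumptions.

```python
import math

def born_probabilities(amplitudes):
    """Born rule for a finite-dimensional state written in the measurement
    basis: the probability of outcome i is |amplitude_i|**2, after
    normalizing the state vector."""
    norm_sq = sum(abs(a) ** 2 for a in amplitudes)
    if norm_sq == 0:
        raise ValueError("state vector must be nonzero")
    return [abs(a) ** 2 / norm_sq for a in amplitudes]

# Assumed two-outcome state with illustrative complex amplitudes.
state = [complex(1, 0) / math.sqrt(3), complex(0, 1) * math.sqrt(2 / 3)]
probs = born_probabilities(state)
```

The resulting numbers are non-negative and sum to one, so they meet the formal condition; whether they are chances depends, as argued above, on the role they play in prediction and explanation.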

The story is similar in classical statistical mechanics (Albert 2000, 35–70; Meacham 2005; Sklar 1993). The interesting wrinkle in that case is that, although it postulates probabilities that play an explanatory role—especially with respect to thermodynamic phenomena—and that cannot be excised from the theory without crippling its explanatory power (Eagle 2014, 149–154; Emery 2015, §5), classical statistical mechanics is not a *fundamental* theory. Whereas thermodynamic “generalisations may not be fundamental … nonetheless they satisfy the usual requirements for being laws: they support counterfactuals, are used for reliable predictions, and are confirmed by and explain their instances” (Loewer 2001, 611–612). The statistical mechanical explanation of thermodynamics doesn’t succeed without invoking probability, and those probabilities must therefore be chances.

## 2.6 The Question of Determinism

Chances arise in physics in two different ways: either from fundamentally indeterministic dynamics or derived from an irreducibly probabilistic relationship between the underlying physical state and the observable outcomes, as in statistical mechanics and, indeed, in many attractive accounts of quantum mechanics, such as the Everett interpretation (Greaves 2007; Wallace 2011). But many have argued that this second route does not produce genuine chances and that indeterministic dynamical laws are the only way for chance to get into physics:

To the question of how chance can be reconciled with determinism … my answer is: it can’t be done. … There is no chance without chance. If our world is deterministic there are no chances in it, save chances of zero and one. Likewise if our world somehow contains deterministic enclaves, there are no chances in those enclaves.

(Lewis 1986, 118–120)

There is a puzzle here (Loewer 2001, §1). For it does seem plausible that if a system is, fundamentally, in some particular state, and it is determined by the laws to end up in some future state, then it really has no chance of ending up in any other state. And this supports the idea that nontrivial chances require indeterministic laws. On the other hand, the predictive and explanatory success of classical statistical mechanics, and of deterministic versions of quantum mechanics, suggests that the probabilities featuring in those theories are chances.

This puzzle has recently been taken up by a number of philosophers. Some have tried to bolster Lewis’s incompatibilist position (Schaffer 2007), but more have defended the possibility of deterministic chance. A large number have picked up on the sort of considerations offered in the last section and argue that because probabilities in theories like statistical mechanics behave in the right sort of way, they are chances (Emery 2015; Loewer 2001). Often these sorts of views depend on a sort of “level autonomy” thesis, that the probabilities of nonfundamental theories are nevertheless chances because the nonfundamental theories are themselves to a certain degree independent of the underlying fundamental physics and so cannot be trumped by underlying determinism (Glynn 2010; List and Pivato 2015; Sober 2010).

It is difficult, however, to justify the autonomy of higher level theories because every occurrence supposedly confirming them supervenes on occurrences completely explicable in fundamental theories. If so, the underlying determinism will apparently ensure that only one course of events can happen consistent with the truth of the fundamental laws and that therefore nontrivial probabilities in higher level theories won’t correspond to genuine possibilities. If chance is connected with real possibility, these probabilities are not chances: they are merely linked to epistemic possibility, consistent with what we know of a system. In responding to this argument, the crucial question is: is it possible for determinism to be true and yet more than one outcome be genuinely possible? The context-sensitivity of English modals like *can* and *possibly* allows for a sentence like *the coin can land heads* to express a true proposition even while the coin is determined to land tails as long as the latter fact isn’t contextually salient (Kratzer 1977). We may use this observation, and exploit the connection between chance ascriptions and possibility ascriptions, to resist this sort of argument for incompatibilism (Eagle 2011*a*) and continue to regard objective and explanatory probabilities in deterministic physics as chances.

3 Probabilities of Theories

Many claims in science concern the evaluation, in light of the evidence, of different theories or hypotheses: that general relativity is better supported by the evidence than its nonrelativistic rivals, that relative rates of lung cancer in smokers and nonsmokers confirm the hypothesis that smoking causes cancer, or that the available evidence isn’t decisive between theories that propose natural selection to be the main driver of population
genetics at the molecular level and those that propose that random drift is more important. Claims about whether evidence *confirms, supports*, or is *decisive between* hypotheses are apparently synonymous with claims about how probable these hypotheses are in light of the evidence.^{8} How should we understand this use of probability?

One proposal, due to Carnap, is to invoke a special notion of “inductive probability” in addition to chance to explain the probabilities of theories (Carnap 1955, 318). His inductive probability has a “purely logical nature”—the inductive probability ascribed to a hypothesis with respect to some evidence is entirely fixed by the formal syntax of the hypothesis and the evidence. Two difficulties with Carnap’s suggestion present themselves. First, there are many ways of assigning probabilities to hypotheses relative to evidence in virtue of their form that meet Carnap’s desiderata for inductive probabilities, and no one way distinguishes itself as uniquely appropriate for understanding evidential support. Second, and more importantly, Carnap’s proposal leaves the cognitive role of confirmation obscure. Why should we care that a theory has high inductive probability in light of the evidence, *unless* having high inductive probability is linked in some way to the credibility of the hypothesis?

Particularly in light of this second difficulty, it is natural to propose that the kind of probability involved in confirmation must be a sort that is directly implicated in how credible or believable the hypothesis is relative to the evidence. Indeed, we may go further: the hypothesis we ought to accept outright—the one we ought to make use of in prediction and explanation—is the one that, given the evidence we now possess, we are most confident in. Confirmation should therefore reflect confidence, and we ought to understand confirmation as implicitly invoking probabilistic levels of confidence: what we might call *credences*.

## 3.1 Credences

A credence function is a probability function, defined over a space of propositions: the belief space. Unlike the case of chance, propositions in the belief space are not possible outcomes of some experimental trial. They are rather the possible objects of belief; propositions that, prior to inquiry, might turn out to be true. An agent’s credence at a time reflects their level of confidence then in the truth of each proposition in the belief space.

There is no reason to think that every proposition is in anyone’s belief space. But since a belief space has a probability function defined over it, it must meet the algebraic conditions on outcome spaces, and must therefore be closed under negation and disjunction. This ensures that many unusual propositions (arbitrary logical compounds of propositions the agent has some confidence in) are the objects of belief. It also follows from the logical structure of the underlying algebra that logically equivalent propositions are the same proposition. The laws of probability ensure that an agent must be maximally confident in that trivial proposition which is entailed by every other. Since every propositional tautology expresses a proposition entailed by every other, every agent with a credence function is certain of all logical truths. (Note: they need not be certain of a given sentence that it expresses a tautology, but they must be certain of the tautology expressed.) These features make having a credence function quite demanding. Is there any reason to think that ordinary scientists have them, that they may be involved in confirmation?

The standard approach to this question certainly accepts that it is psychologically implausible to think that agents explicitly represent an entire credence function over an algebra of propositions. But a credence function need not be explicitly represented in the brain to be the right way to characterize an agent’s belief state. If the agent behaves in a way that can be best rationalized on the basis of particular credences, that is reason to attribute those credences to her. Two sorts of argument in the literature have this sort of structure. One, given mostly by psychologists, involves constructing empirical psychological models of cognitive phenomena, which involves assigning probabilistic degrees of belief to agents. The success of those models then provides evidence that thinkers really do have credences as part of their psychological state (Perfors 2012; Perfors et al. 2011).

## 3.2 Credences from Practical Rationality

The second sort of argument, given mostly by philosophers and decision theorists, proposes that certain assumptions about rationality entail that, if an agent has degrees of confidence at all, these must be credences. The rationality assumptions usually invoked are those about *rational preference* (although recent arguments exist that aim to avoid appealing to practical rationality: see Joyce 1998). If an agent has preferences between options that satisfy certain conditions, then these preferences can be represented as maximizing subjective expected utility, a quantity derived from a credence function and an assignment of values to outcomes. Any such *representation theorem* needs to be supplemented by an additional argument to the effect that alternative representations are unavailable or inferior, thus making it plausible that an agent really has a credence function when they can be represented as having a credence function. These additional arguments are controversial (Hájek 2008; Zynda 2000).

There are many representation theorems in the literature, each differing in the conditions it imposes on rational preference (Buchak 2013; Jeffrey 1983; Maher 1993; Savage 1954). Perhaps the simplest is the *Dutch book argument* (de Finetti 1937; Ramsey 1926). The argument has two stages: first, show that one can use an agent’s preferences between options of a very special kind—*bets*—to assign numbers to propositions that reflect degrees of confidence; and, second, show that those numbers must have the structure of probabilities on pain of practical irrationality—in particular, that non-probabilistic credences commit an agent to evaluating a set of bets that collectively guarantee a sure loss as fair.

Ramsey suggests the fundamental irrationality of such a valuation lies in its inconsistent treatment of equivalent options. Such a valuation

would be inconsistent in the sense that it violated the laws of [rational] preference between options… if anyone’s mental condition violated these laws, his choice would depend on the precise form in which the options were offered him, which would be absurd.

(Ramsey 1926, 78)
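The second stage of the Dutch book argument can be illustrated with a small sketch. It assumes, for illustration only, a hypothetical agent whose degrees of confidence in a proposition A and in its negation sum to more than 1, and it reads each degree of confidence as the price the agent regards as fair for a bet paying 1 if the proposition is true. The names `net_gain`, `WORLDS`, `incoherent`, and `coherent` are invented for this example.

```python
def net_gain(betting_quotients, world):
    """Agent's net gain after buying a unit-stake bet on each proposition.

    betting_quotients: proposition -> price the agent treats as fair for a
        bet paying 1 if the proposition is true and 0 otherwise.
    world: proposition -> truth value in one possible outcome.
    """
    paid = sum(betting_quotients.values())
    won = sum(1.0 for prop in betting_quotients if world[prop])
    return won - paid


# The two ways things could turn out for a proposition A and its negation.
WORLDS = [
    {"A": True, "not_A": False},
    {"A": False, "not_A": True},
]

# Hypothetical incoherent agent: confidence in A and in not-A sums to 1.2.
incoherent = {"A": 0.6, "not_A": 0.6}
# Hypothetical coherent agent: the two values sum to 1.
coherent = {"A": 0.6, "not_A": 0.4}

incoherent_gains = [net_gain(incoherent, w) for w in WORLDS]
coherent_gains = [net_gain(coherent, w) for w in WORLDS]

print("incoherent:", incoherent_gains)  # a loss of roughly 0.2 however things turn out
print("coherent:", coherent_gains)      # roughly break-even however things turn out
```

The incoherent agent treats each bet as fair taken singly, yet the pair guarantees a loss in every outcome; the coherent agent faces no such package. This covers only one way of violating the probability axioms, so it illustrates the argument rather than establishing it.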

The Dutch book argument, like other representation theorems, is suggestive but far from conclusive. But even philosophers can avail themselves of the psychological style of argument for credences, arguing that, because approaching epistemic rationality via credences allows the best formal systematization of epistemology and its relation to rational action, that is reason to suppose that rational agents have credences:

A remarkably simple theory—in essence, three axioms that you can teach a child—achieves tremendous strength in unifying our epistemological intuitions. Rather than cobbling together a series of local theories tailored for a series of local problems—say, one for the grue paradox, one for the raven paradox, and so on—a single theory in one fell swoop addresses them all. While we’re at it, the same theory also undergirds our best account of rational decision-making. These very successes, in turn, provide us with an argument for probabilism: our best theory of rational credences says that they obey the probability calculus, and that is a reason to think that they do.

(Eriksson and Hájek 2007, 211).

## 3.3 Bayesian Confirmation Theory

The basic postulate of *Bayesian confirmation theory* is that evidence *e* confirms hypothesis *h* for *A* if *A*’s credence in *h* given *e* is higher than their unconditional credence in *h*: $C_A(h \mid e) > C_A(h)$. That is, some evidence confirms a hypothesis just in case the scientist’s confidence in the hypothesis, given the evidence, is higher than in the absence of the evidence. Note that, by Bayes’ theorem, *e* confirms *h* if and only if $C_A(e \mid h) > C_A(e)$, that is, if the evidence is more likely given the truth of the hypothesis than otherwise. In answering the question of what scientists should believe about hypotheses and how those beliefs would look given various pieces of evidence, we do not need a separate “logic of confirmation”: we should look to conditional credences in hypotheses on evidence.
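The equivalence between the two formulations can be checked numerically. The following is a minimal sketch that assumes a made-up credence function over four possibilities, each a pair of truth values for *h* and *e*; the numbers and the helper names `credence` and `conditional` are illustrative assumptions, not drawn from any source.

```python
# Hypothetical credences over the four combinations (h is true?, e is true?).
joint = {
    (True, True): 0.3,
    (True, False): 0.1,
    (False, True): 0.2,
    (False, False): 0.4,
}


def credence(joint, holds):
    """Total credence in the set of possibilities where `holds` is true."""
    return sum(p for world, p in joint.items() if holds(world))


def conditional(joint, target, given):
    """Conditional credence C(target | given), defined when C(given) > 0."""
    c_given = credence(joint, given)
    if c_given == 0:
        raise ValueError("conditional credence undefined: C(given) = 0")
    return credence(joint, lambda w: target(w) and given(w)) / c_given


def h(world):
    return world[0]


def e(world):
    return world[1]


c_h = credence(joint, h)
c_e = credence(joint, e)
c_h_given_e = conditional(joint, h, e)
c_e_given_h = conditional(joint, e, h)

# The two tests for confirmation agree, as Bayes' theorem requires.
confirms_by_posterior = c_h_given_e > c_h
confirms_by_likelihood = c_e_given_h > c_e
print(confirms_by_posterior, confirms_by_likelihood)
```

With these numbers the credence in *h* rises from about 0.4 to about 0.6 on *e*, and the credence in *e* rises from about 0.5 to about 0.75 on *h*, so both tests report confirmation.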

This is overtly agent-sensitive. Insofar as confirmation is about the regulation of individual belief, that seems right. But although it seems plausible as a sufficient condition on confirmation—one should be more confident in the hypothesis, given evidence confirming it—many deny its plausibility as a necessary condition. Why should it be that evidence can confirm only those hypotheses that have their credence level boosted by the evidence? For example, couldn’t it be that evidence confirms a hypothesis of which I am already certain (Glymour 1981)?

Some Bayesians respond by adding additional constraints of rationality to the framework. Credences of a rational agent are not merely probabilities: they are probabilities that also meet some further condition. It is fair to say that explicit constructions of such further conditions have not persuaded many. Most have made use of the classical principle of indifference (see Section 2.4), but such principles are overly sensitive to how the hypothesis is presented, delivering different verdicts for equivalent problems (Milne 1983; van Fraassen 1989). It does not follow from the failure of these explicit constructions that anything goes: it just might be that, even though only some credences really reflect rational evaluations of the bearing of evidence on hypotheses, there is no recipe for constructing such credences without reference to the content of the hypotheses under consideration. For example, it might be that the rational credence function to use in confirmation theory is one that assigns to each hypothesis a number that reflects its “intrinsic plausibility … prior to investigation” (Williamson 2000). And it might also be that, when the principle of indifference is conceived as involving epistemic judgments in the evaluation of which cases are “equally possible,” which give rise to further epistemic judgments about equal credences, it is far weaker and less objectionable (White 2009).

Moreover, there are problems faced by “anything goes” Bayesians in explaining why agents who disagree in credences despite sharing all their evidence shouldn’t just suspend judgment about the credences they ought to have (White 2005). The reasonable thing to do in a case like that, it’s suggested, is for the disagreeing agents to converge on the same credences, adopt indeterminate credences, or do something like that—something that involves all rational agents responding in the same way to a given piece of evidence. But how can Bayesians who permit many rational responses to evidence explain this? This takes a more pressing form too, given the flexibility of Bayesian methods in encompassing arbitrary credences: why is it uniquely rational to follow the scientific method (Glymour 1981; Norton 2011)? The scientific method is a body of recommended practices designed to ensure reliable hypothesis acceptance (e.g., prefer evidence from diverse sources as more confirmatory or avoid ad hoc hypotheses as less confirmatory). The maxims of the scientific method summarize techniques for ensuring good confirmation, but if confirmation is dependent on individual credence, how can there be one single scientific method? The most plausible response for the subjective Bayesian is to accept the theoretical possibility of a plurality of rational methods but to argue that current scientific training and enculturation in effect ensure that the credences of individual scientists do by and large respond to evidence in the same way. If the scientific method can be captured by some constraints on credences, and those constraints are widely endorsed, and there is considerable benefit to being in line with community opinion on confirmation (as there is in actual scientific communities), that is a prudential reason for budding scientists to adopt credences meeting those constraints.

Whether we think that scientists have common views about how to respond to evidence as a matter of a priori rationality or peer pressure doesn’t matter. What ultimately matters is whether the Bayesian story about confirmation delivers plausible cases where, either from the structure of credences or plausible assumptions about “natural” or widely shared priors, we can derive that the conditional credence in *h* given *e* will exceed
the unconditional credence if and only if *e* intuitively confirms *h*. Howson and Urbach (1993, 117–164) go through a number of examples showing that, in each case, there is a natural Bayesian motivation for standard principles of scientific methodology (see also Horwich 1982). I treat some of their examples in the online version of this chapter (see also Sprenger, this volume). The upshot is that Bayesian confirmation theory, based on the theory of credences, provides a systematic framework in which proposed norms of scientific reason can be formulated and evaluated and that vindicates just those norms that do seem to govern actual scientific practice when supplied with the kinds of priors it is plausible to suppose working scientists have.

## 3.4 The Problem of Old Evidence

The Bayesian approach to confirmation faces further problems. In the online version of this chapter, I also discuss apparent difficulties with the Bayesian justification of abductive reasoning. But here I will only discuss the so-called *problem of old evidence*.

The problem stems from the observation that “scientists commonly argue for their theories from evidence known long before the theories were introduced” (Glymour 1981). In these cases, there can be striking confirmation of a new theory, precisely because it explains some well-known anomalous piece of evidence. The Bayesian framework doesn’t seem to permit this: if *e* is old evidence, known to be the case already, then any scientist who knows it assigns it credence 1. Why? Because credence is a measure of epistemic possibility. If, for all *A* knows, it might be that ¬*e*, then *A* does not know that *e*. Contraposing, *A* knows *e* only if there is no epistemic possibility of ¬*e*. And if there is no epistemic possibility of ¬*e*, then $C_A(\neg e) = 0$ and $C_A(e) = 1$.^{9} If $C_A(e) = 1$, however, it follows from Bayes’ theorem that $C_A(h \mid e) = C_A(h)$, and therefore *e* does not confirm *h* for anyone who already knows *e*.
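The step from certainty in *e* to the absence of confirmation can be spelled out as follows. This is a sketch that assumes $C_A(h) > 0$, so that the conditional credence $C_A(e \mid h)$ is defined. Since $C_A(\neg e) = 0$, it follows that $C_A(h \wedge \neg e) = 0$ and hence $C_A(h \wedge e) = C_A(h)$, so that $C_A(e \mid h) = 1$. Bayes’ theorem then gives:

$$
C_A(h \mid e) = \frac{C_A(e \mid h)\, C_A(h)}{C_A(e)} = \frac{1 \cdot C_A(h)}{1} = C_A(h).
$$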

The most promising line of response might be to revise orthodox Bayesianism and deny that known evidence receives credence 1.^{10} But can we respond to this objection without revising the orthodox picture? Perhaps by arguing that old evidence should not confirm because confirmation of a theory should involve an increase in confidence. But it would be irrational to increase one’s confidence in a theory based on evidence one already had: either you have already factored in the old evidence, in which case it would be unreasonable to count its significance twice, or you have not factored in the old evidence, in which case you were unreasonable before noticing the confirmation. Either
way, the only reasonable position is that confirmatory increases in confidence only arise with new evidence. If there seems to be confirmation by old evidence, that can only be an illusion, perhaps prompted by imperfect access to our own credences.

## 3.5 Believing Theories

The orthodox response to the problem of old evidence, as well as the problem itself, presupposes a certain view about how the acquisition of confirmatory evidence interacts with credences. Namely: if *h* is confirmed by *e*, and one acquires the evidence that *e*, one should come to be more confident in *h*. The simplest story that implements this idea is that if *A*’s credence is *C*, and *e* is the strongest piece of evidence *A* receives, their new credal state should be represented by that function $C^{+}$ such that for every proposition $p$ in the belief space, $C^{+}(p) = C(p \mid e)$. This procedure is known as *conditionalization* because it involves taking one’s old conditional credences given *e* to be one’s new credences after finding out that *e*. If conditionalization describes Bayesian learning, then the Bayesian story about confirmation can be described as follows: *e* confirms *h* for *A* just in case *A* will become more confident in *h* if they learn *e*.
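Conditionalization over a finite belief space can be sketched in a few lines. The example below assumes a hypothetical credence function over four possibilities, each a pair of truth values for *h* and *e*; the function name `conditionalize` and the numbers are illustrative only.

```python
def conditionalize(credence, evidence):
    """Return the updated credence function C+ with C+(w) = C(w | evidence).

    credence: possibility -> prior credence (non-negative, summing to 1).
    evidence: function picking out the possibilities compatible with what
        was learned.
    """
    c_evidence = sum(p for w, p in credence.items() if evidence(w))
    if c_evidence == 0:
        raise ValueError("cannot conditionalize on evidence with credence 0")
    # Possibilities ruled out by the evidence drop to 0; the remainder are
    # rescaled so that their ratios to one another are unchanged.
    return {w: (p / c_evidence if evidence(w) else 0.0)
            for w, p in credence.items()}


# Hypothetical prior over (h is true?, e is true?).
prior = {
    (True, True): 0.3,
    (True, False): 0.1,
    (False, True): 0.2,
    (False, False): 0.4,
}

posterior = conditionalize(prior, lambda w: w[1])  # learn that e is true

prior_h = sum(p for w, p in prior.items() if w[0])
posterior_h = sum(p for w, p in posterior.items() if w[0])
posterior_e = sum(p for w, p in posterior.items() if w[1])
print(prior_h, posterior_h, posterior_e)
```

Here confidence in *h* rises from about 0.4 to about 0.6 on learning *e*, and *e* itself receives credence 1 afterwards, which is why later conditionalization can never lower it.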

There are a number of controversial issues surrounding conditionalization, when it is treated as the one true principle for updating belief in light of evidence. For example, since it is a truth of probability theory that if *C*(*p*) = 1, then for every *q*, *C*(*p*|*q*) = 1, conditionalization can never lower the credence of any proposition of which an agent was ever certain. If conditionalization is the only rational update rule, then forgetting or revision of prior certainty in light of new evidence is never rational, and this is controversial (Arntzenius 2003; Talbott 1991).

But, regardless of whether conditionalization is always rational, it is sometimes rational. It will often be rational in the scientific case, where evidence is collected carefully and the true propositional significance of observation is painstakingly evaluated, so that certainty is harder to obtain and occasions demanding the revision of prior certainty correspondingly rarer. In scientific contexts, then, the confirmation of theory by evidence in hand will often lead to increased confidence in the theory, to the degree governed by prior conditional credence. This is how Bayesian confirmation theory (and the theory of credence) proposes to explain probabilities of scientific hypotheses.

4 The Principal Principle

## 4.1 Coordinating Chance and Credence

We’ve discussed chance in theories, and credibility of theories. Is there any way to connect the two? Lewis (1986) offered an answer: what he called the *Principal Principle* (PP)
because “it seem[ed] to [him] to capture all we know about chance.” The discussion in Sections 2.2–2.3 already shows that we know more about chance than this; nevertheless the PP is a central truth about the distinctive role of credence about chance in our cognitive economy.

We need some notation. Suppose that *C* denotes some reasonable initial credence function, prior to updating in the light of evidence. Suppose that $⟦P(p)=x⟧$ denotes the proposition that the real chance of *p* is *x*. That is, it is the proposition that is true if the true theory *t* is such that the *t*-chance of *p* is *x*. Suppose that *e* is some admissible evidence, evidence

whose impact on credence about outcomes comes entirely by way of credence about the chances of those outcomes.

(Lewis 1986, 92; see also Hoefer 2007, 553)

Using that notation, the PP can be expressed:

$C(p \mid ⟦P(p)=x⟧ \wedge e) = x$

Given a proposition about chance and further evidence that doesn’t trump the chances, any reasonable conditional credence in *p* should equal the chance.^{11}

The PP doesn’t say that one should set one’s credences equal to the chances; if the chances are unknown, one cannot follow that recommendation; and yet, one can still have conditional credences recommended by PP. It does follow from the PP that (1) if you were to come to know the chances and nothing stronger and (2) you update by conditionalization from a reasonable prior credence, then your credences will match the chances.^{12}

But even before knowing the chances, the PP allows us to assign credences to arbitrary outcomes, informed by the credences we assign to the candidate scientific hypotheses. For (assuming our potential evidence is admissible and that we may suppress mention of *e*), the theorem of total probability states that, where *Q* is a set of mutually exclusive and jointly exhaustive propositions, each with non-zero credence, then for arbitrary $p$, $C(p) = \sum_{q_i \in Q} C(p \mid q_i)\, C(q_i)$. If *Q* is the set of rival scientific hypotheses about the chances (i.e., each $P_x = ⟦P(p)=x⟧$), then the PP entails that $C(p) = \sum_{P_x \in Q} C(p \mid P_x)\, C(P_x) = \sum_{P_x \in Q} x \cdot C(P_x)$.

That is, the PP entails that one’s current credence in *p* is equal to one’s subjective expectation of the chance of *p*, weighted by one’s confidence in various hypotheses about the chance. You may not know which chance theory is right, but if you have a credence distribution over those theories, then the chances in those theories are already influencing your judgments about possible outcomes. Suppose, for example, you have an unexamined coin that you are 0.8 sure is fair and 0.2 sure is a double-headed trick coin. On the former hypothesis, *P*(*heads*) = 0.5; on the latter, *P*(*heads*) = 1. Applying the result just derived, $C(\mathit{heads}) = 1 \times 0.2 + 0.5 \times 0.8 = 0.6$, even while you remain in ignorance of which chance hypothesis is correct.
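The coin calculation is an instance of taking an expectation of chance, which can be sketched as below. The function name `expected_chance` and its input format are assumptions made for this illustration.

```python
def expected_chance(hypotheses):
    """Credence in an outcome as the credence-weighted average of its chance.

    hypotheses: list of (credence in the chance hypothesis, chance that the
        hypothesis assigns to the outcome) pairs. The hypotheses are assumed
        to be mutually exclusive and jointly exhaustive.
    """
    total_credence = sum(c for c, _ in hypotheses)
    if abs(total_credence - 1.0) > 1e-9:
        raise ValueError("credences over the hypotheses must sum to 1")
    return sum(c * x for c, x in hypotheses)


# 0.8 sure the coin is fair (chance of heads 0.5),
# 0.2 sure it is double-headed (chance of heads 1).
credence_heads = expected_chance([(0.8, 0.5), (0.2, 1.0)])
print(round(credence_heads, 3))
```

The result agrees, up to floating-point rounding, with the value of 0.6 computed in the text.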

The PP is a *deference* principle, one claiming that rational credence defers to chances (Gaifman 1988; Hall 2004). It has the same form as other deference principles, such as deference to expert judgment (e.g., adopt conditional credences in rain tomorrow equal to the meteorologist’s credences), or to more knowledgeable agents (e.g., adopt conditional credences given *e* equal to your future credences if you were to learn *e*, the Reflection Principle of van Fraassen [1984]). It is worth noting that the proposition about chance in the PP is not a conditional credence; this seems to reflect something about chance, namely, that the true unconditional chances are worth deferring to, regardless of the (admissible) information that the agent is given (Joyce 2007).

Lewis showed that, from the PP, much of what we know about chance follows. For example, if it is accepted, we needn’t add as a separate formal constraint that chances are probabilities (Section 2.1). Suppose one came to know the chances, had no inadmissible evidence, and began with rational credences. Then, in accordance with the PP, one’s new unconditional credences in possible outcomes are everywhere equal to the chances of those outcomes. Since rational credences are probabilities, so too must chances be (Lewis 1986, 98). This and other successes of the PP in capturing truths about chance lead Lewis to claim that

A feature of Reality deserves the name of chance to the extent that it occupies the definitive role of chance; and occupying the role means obeying the [PP]…

(Lewis 1994, 489)

One major issue remains outstanding. How can we link up probabilities *in* and probabilities *of* theories? In particular: can we offer an explanation of how, when a theory makes predictions about frequency, the observation of frequencies in line with prediction is confirmatory? Of course it *is* confirmatory, as encapsulated in the direct inference principle HCP from Section 1.3. But it would be nice to offer an explanation.

A chance theory *t*, like the theory of radioactive decay, makes a prediction *f* that certain frequencies have a high chance according to the theory. Let us make a simplifying assumption that we are dealing with a collection of rival hypotheses that share an outcome space. Then, we may say that *t* predicts *f* if and only if $P_t(f)$ is high. Notice that such a prediction doesn’t yet permit the machinery of Bayesian confirmation theory to show that the observation of *f* would confirm *t*. While the *t*-chance of *f* is high, there is no guarantee yet that anyone’s credence in *f* given *t* is correspondingly high. So, even if $P_t(f)$ is considerably greater than $C_A(f)$, that doesn’t yield confirmation of *t* by *f* for *A* unless $C_A(f \mid t) \approx P_t(f)$.

Suppose *t* is some theory that entails a claim about the chance of a certain frequency outcome *f*. Presumably, for any plausible *t*, the rest of *t* is compatible with that part of *t* that is about the chance of *f*; so factorize *t* into a proposition $⟦P(f)=x⟧$ and a remainder *t*′. Suppose that one doesn’t have other information *e* that trumps the chances. Then, the PP applies to your initial credence function (assuming your current credence *C* was obtained by conditionalizing on *e*), so that $C(f \mid t) = C_{\text{initial}}(f \mid t \wedge e) = C_{\text{initial}}(f \mid ⟦P(f)=x⟧ \wedge t' \wedge e) = x$. But since $⟦P(f)=x⟧$ is a consequence of *t*, $P_t(f) = x$, and we thus have the desired equation, $C(f \mid t) = P_t(f)$, which allows the frequency predictions of a theory to confirm it, or disconfirm it (Howson and Urbach 1993, 342–347; Lewis 1986, 106–108).
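To see how this equation feeds into confirmation, consider a sketch that reuses the fair versus double-headed coin hypotheses. It assumes, for illustration, that tosses are independent under each hypothesis, so that the *t*-chance of a frequency is binomial, and it sets the credence in the frequency given *t* equal to that *t*-chance, as the PP recommends. All names are invented for the example.

```python
from math import comb


def chance_of_frequency(p_heads, heads, tosses):
    """Binomial chance of exactly `heads` heads in `tosses` independent tosses."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**(tosses - heads)


def update_on_frequency(priors, chance_of_heads, heads, tosses):
    """Conditionalize credences over chance hypotheses on an observed frequency.

    Following the PP, the credence in the frequency given hypothesis t is
    taken to equal the t-chance of that frequency.
    """
    likelihood = {t: chance_of_frequency(chance_of_heads[t], heads, tosses)
                  for t in priors}
    credence_in_f = sum(priors[t] * likelihood[t] for t in priors)
    if credence_in_f == 0:
        raise ValueError("observed frequency has credence 0 on every hypothesis")
    return {t: priors[t] * likelihood[t] / credence_in_f for t in priors}


priors = {"fair": 0.8, "double_headed": 0.2}
chance_of_heads = {"fair": 0.5, "double_headed": 1.0}

# Observed frequency f: three heads in three tosses.
posteriors = update_on_frequency(priors, chance_of_heads, heads=3, tosses=3)
print(posteriors)
```

Three heads in three tosses has chance 1 on the double-headed hypothesis and 0.125 on the fair hypothesis, so credence in the double-headed hypothesis rises from 0.2 to about 2/3: the frequency confirms that hypothesis and disconfirms its rival.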

Not all is smooth sailing for the PP. Consider the case of undermining discussed in Section 2.2. There, a reductionist chance theory assigned some positive chance to a subsequent pattern of outcomes that, if it occurred, would undermine the chance theory. Since the actual chance theory *t* is logically incompatible with the chance theory that holds in the possibility where the undermining pattern exists, it is not possible for *t* to be true and for the undermining future *u* to occur. Given logical omniscience, $C(u \mid t) = 0$. But by the PP, $C(u \mid t) = C(u \mid ⟦P_t(u)=x⟧ \wedge e) = x > 0$. Contradiction: the existence of undermining futures, which seems to follow from reductionism, and the PP are jointly incompatible. There is a large literature examining the prospects for the PP and for reductionism and canvassing various related principles that are not susceptible to this problem (Hall 1994; Ismael 2008; Joyce 2007; Lewis 1994). I discuss some of this further in the online version of this chapter.

## Suggested Reading

Briggs (2010) covers material similar to that found in this chapter, and some readers may find a second opinion useful. Implicit in much of the chapter were references to various substantive theories of the truth-conditions for probability claims, historically known as “interpretations” of probability: Hájek (2012) contains a much fuller and more explicit account of the various positions on this issue. Lewis (1986) is the classic account of the Principal Principle, the relationship between credence and chance. The foundations of the theory of credences can be found in a wonderful paper (Ramsey 1926); the best account of the application of the theory of credences to confirmation is Howson and Urbach (1993). There are many textbooks on the philosophy of probability; none is bad, but one that is particularly good on the metaphysical issues around chance is Handfield (2012). But perhaps the best place to start further reading is with Eagle (2011*b*), an anthology of classic articles with editorial context: it includes the items by Ramsey, Lewis, and Howson and Urbach just recommended.

## References

Albert, David Z. (1992). *Quantum Mechanics and Experience*. (Cambridge, MA: Harvard University Press).Find this resource:

Albert, David Z. (2000). *Time and Chance*. (Cambridge, MA: Harvard University Press).Find this resource:

Arntzenius, Frank. (2003). “Some Problems for Conditionalization and Reflection,” *Journal of Philosophy* 100: 356–370.Find this resource:

Bigelow, John, Collins, John, and Pargetter, Robert. (1993). “The Big Bad Bug: What are the Humean’s Chances?” *British Journal for the Philosophy of Science* 44: 443–462.Find this resource:

Briggs, Rachael. (2010). “The Metaphysics of Chance.” *Philosophy Compass* 5: 938–952.Find this resource:

Buchak, Lara. (2013). *Risk and Rationality*. (Oxford: Oxford University Press).Find this resource:

Carnap, Rudolf. (1955). *Statistical and Inductive Probability*. (Brooklyn: Galois Institute of Mathematics and Art). Reprinted in Eagle (2011*b*), 317–326; references are to this reprinting.Find this resource:

de Finetti, Bruno. (1937). “Foresight: Its Logical Laws, Its Subjective Sources.” In Henry E. Kyburg, Jr., and Howard E. Smokler (eds.). *Studies in Subjective Probability, [1964].* (New York: Wiley), 93–158.Find this resource:

Eagle, Antony. (2011*a*). “Deterministic Chance.” *Noûs* 45: 269–299.Find this resource:

Eagle, Antony (ed.). (2011*b*). *Philosophy of Probability: Contemporary Readings*. (London: Routledge).Find this resource:

Eagle, Antony. (2014). “Is the Past a Matter of Chance?” In Alastair Wilson (ed.), *Chance and Temporal Asymmetry* (Oxford: Oxford University Press), 126–158.Find this resource:

Emery, Nina. (2015). “Chance, Possibility, and Explanation.” *British Journal for the Philosophy of Science* 66: 95–120.Find this resource:

Eriksson, Lina, and Hájek, Alan. (2007). “What Are Degrees of Belief?” *Studia Logica* 86: 183–213.Find this resource:

Fetzer, James H. (1981). *Scientific Knowledge: Causation, Explanation, and Corroboration*. (Dordrecht: D. Reidel).Find this resource:

Gaifman, Haim. (1988). “A Theory of Higher Order Probabilities.” In Brian Skyrms and William Harper (eds.), *Causation, Chance and Credence, vol. 1* (Dordrecht: Kluwer), 191–219.Find this resource:

Garber, Daniel. (1983). “Old Evidence and Logical Omniscience in Bayesian Confirmation Theory.” In John Earman (ed.), *Minnesota Studies in Philosophy of Science, volume 10: Testing Scientific Theories*. (Minneapolis: University of Minnesota Press), 99–132.Find this resource:

Giere, Ronald N. (1973). “Objective Single-Case Probabilities and the Foundations of Statistics.” In P. Suppes, L. Henkin, G. C. Moisil, et al. (eds.), *Logic, Methodology and Philosophy of Science IV*. (Amsterdam: North-Holland), 467–483.Find this resource:

Glymour, Clark. (1981). “Why I am not a Bayesian.” In C. Glymour (ed.), *Theory and Evidence*. (Chicago: University of Chicago Press), 63–93.Find this resource:

Glynn, Luke. (2010). “Deterministic Chance.” *British Journal for the Philosophy of Science* 61: 51–80.Find this resource:

Greaves, H. (2007). “Probability in the Everett Interpretation.” *Philosophy Compass* 2: 109–128.Find this resource:

Hájek, Alan. (2003). “What Conditional Probability Could Not Be.” *Synthese* 137: 273–323.Find this resource:

Hájek, Alan. (2008). “Arguments for—or Against—Probabilism?” *British Journal for the Philosophy of Science* 59(4): 793–819.Find this resource:

Hájek, Alan. (2012). “Interpretations of Probability.” In Edward N. Zalta (ed.), *The Stanford Encyclopedia of Philosophy* (Winter 2012 Edition) http://plato.stanford.edu/archives/win2012/entries/probability-interpret/.Find this resource:

(p. 438)
Hall, Ned. (1994). “Correcting the Guide to Objective Chance.” *Mind* 103: 505–518.

Hall, Ned. (2004). “Two Mistakes About Credence and Chance.” In Frank Jackson and Graham Priest (eds.), *Lewisian Themes*. (Oxford: Oxford University Press), 94–112.

Handfield, Toby. (2012). *A Philosophical Guide to Chance*. (Cambridge: Cambridge University Press).

Hoefer, Carl. (2007). “The Third Way on Objective Probability: A Sceptic’s Guide to Objective Chance.” *Mind* 116: 549–596.

Horwich, Paul. (1982). *Probability and Evidence*. (Cambridge: Cambridge University Press).

Howson, Colin, and Urbach, Peter. (1993). *Scientific Reasoning: The Bayesian Approach*, 2nd ed. (Chicago: Open Court).

Humphreys, Paul. (1985). “Why Propensities Cannot be Probabilities.” *Philosophical Review* 94: 557–570.

Ismael, Jenann. (1996). “What Chances Could Not Be.” *British Journal for the Philosophy of Science* 47: 79–91.

Ismael, Jenann. (2008). “Raid! Dissolving the Big, Bad Bug.” *Noûs* 42: 292–307.

Ismael, Jenann. (2009). “Probability in Deterministic Physics.” *Journal of Philosophy* 106: 89–108.

Jeffrey, Richard C. (1983). *The Logic of Decision*, 2nd ed. (Chicago: University of Chicago Press).

Joyce, James M. (1998). “A Nonpragmatic Vindication of Probabilism.” *Philosophy of Science* 65: 575–603.

Joyce, James M. (2007). “Epistemic Deference: the Case of Chance.” *Proceedings of the Aristotelian Society* 107: 187–206.

Kolmogorov, A. N. (1933). *Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse der Mathematik und ihrer Grenzgebiete*, no. 3 (Berlin: Springer); translated as (1956). *Foundations of the Theory of Probability*, 2nd ed. (New York: Chelsea).

Kratzer, Angelika. (1977). “What ‘Must’ and ‘Can’ Must and Can Mean.” *Linguistics and Philosophy* 1: 337–355.

Laplace, Pierre-Simon. (1951). *Philosophical Essay on Probabilities*. (New York: Dover).

Lewis, David. (1986). “A Subjectivist’s Guide to Objective Chance.” In D. Lewis (ed.), *Philosophical Papers*, vol. 2. (Oxford: Oxford University Press), 83–132.

Lewis, David. (1994). “Humean Supervenience Debugged.” *Mind* 103: 473–490.

List, Christian, and Pivato, Marcus. (2015). “Emergent Chance.” *Philosophical Review* 124: 59–117.

Loewer, Barry. (2001). “Determinism and Chance.” *Studies in History and Philosophy of Modern Physics* 32: 609–620.

Loewer, Barry. (2004). “David Lewis’s Humean Theory of Objective Chance.” *Philosophy of Science* 71: 1115–1125.

Maher, Patrick. (1993). *Betting on Theories*. (Cambridge: Cambridge University Press).

Meacham, Christopher J. G. (2005). “Three Proposals Regarding a Theory of Chance.” *Philosophical Perspectives* 19(1): 281–307.

Mellor, D. H. (2005). *Probability: A Philosophical Introduction*. (London: Routledge).

Milne, P. (1983). “A Note on Scale Invariance.” *British Journal for the Philosophy of Science* 34: 49–55.

Milne, P. (1986). “Can There Be a Realist Single-Case Interpretation of Probability?” *Erkenntnis* 25: 129–132.

Ney, Alyssa. (2013). “Introduction.” In Alyssa Ney and David Z. Albert (eds.), *The Wave Function*. (New York: Oxford University Press), 1–51.

North, Jill. (2010). “An Empirical Approach to Symmetry and Probability.” *Studies in History and Philosophy of Modern Physics* 41: 27–40.

Norton, John. (2011). “Challenges to Bayesian Confirmation Theory.” In P. S. Bandyopadhyay and M. R. Forster (eds.), *Handbook of the Philosophy of Science, vol. 7: Philosophy of Statistics*. (Amsterdam: North-Holland), 391–439.

Perfors, Amy. (2012). “Bayesian Models of Cognition: What’s Built in After All?” *Philosophy Compass* 7(2): 127–138.

Perfors, Amy, Tenenbaum, Joshua B., Griffiths, Thomas L., et al. (2011). “A Tutorial Introduction to Bayesian Models of Cognitive Development.” *Cognition* 120: 302–321.

Popper, Karl. (1959). “A Propensity Interpretation of Probability.” *British Journal for the Philosophy of Science* 10: 25–42.

Popper, Karl. (1963). *Conjectures and Refutations*. (New York: Routledge).

Ramsey, F. P. (1926). “Truth and Probability.” In D. H. Mellor (ed.), *Philosophical Papers, 1990*. (Cambridge: Cambridge University Press), 52–94.

Savage, Leonard J. (1954). *The Foundations of Statistics*. (New York: Wiley).

Schaffer, Jonathan. (2003). “Principled Chances.” *British Journal for the Philosophy of Science* 54: 27–41.

Schaffer, Jonathan. (2007). “Deterministic Chance?” *British Journal for the Philosophy of Science* 58: 113–140.

Sklar, Lawrence. (1993). *Physics and Chance*. (Cambridge: Cambridge University Press).

Sober, Elliott. (2010). “Evolutionary Theory and the Reality of Macro Probabilities.” In E. Eells and J. H. Fetzer (eds.), *The Place of Probability in Science*. (Dordrecht: Springer), 133–161.

Strevens, Michael. (1998). “Inferring Probabilities from Symmetries.” *Noûs* 32: 231–246.

Talbott, W. J. (1991). “Two Principles of Bayesian Epistemology.” *Philosophical Studies* 62: 135–150.

Thau, Michael. (1994). “Undermining and Admissibility.” *Mind* 103: 491–503.

van Fraassen, Bas C. (1984). “Belief and the Will.” *Journal of Philosophy* 81: 235–256.

van Fraassen, Bas C. (1989). *Laws and Symmetry*. (Oxford: Oxford University Press).

von Mises, Richard. (1957). *Probability, Statistics and Truth*. (New York: Dover).

Wallace, David. (2011). *The Emergent Multiverse*. (Oxford: Oxford University Press).

White, Roger. (2005). “Epistemic Permissiveness.” *Philosophical Perspectives* 19: 445–459.

White, Roger. (2009). “Evidential Symmetry and Mushy Credence.” In T. S. Gendler and J. Hawthorne (eds.), *Oxford Studies in Epistemology*. (Oxford: Oxford University Press), 161–186.

Williamson, Timothy. (2000). *Knowledge and Its Limits*. (Oxford: Oxford University Press).

Zynda, Lyle. (2000). “Representation Theorems and Realism About Degrees of Belief.” *Philosophy of Science* 67: 45–69.

## Notes:

(^{1})
I use *theory* as a catchall term for scientific hypotheses, both particular and all-encompassing; I hereby cancel any implication that theories are *mere* theories, not yet adequately supported by evidence. Such well-confirmed theories as general relativity are still theories in my anodyne sense.

(^{2})
In the same vein is the stronger (assuming that the laws entail the chances) “Realization Principle” offered by Schaffer (2007, 124): that when the chance of *p* is positive according to *T*, there must be another world alike in history, sharing the laws of *T*, in which *p*.

(^{3})
Since reductionists don’t accept that chance can float free from the pattern of occurrences, independent restriction to a world with the same chances is redundant where it is not—as in the case above—impossible.

(^{4})
Here “*P*_{tw}” denotes the probability function at time *t* derived from the laws of *w*.

(^{5})
Stepping back from the earlier remarks, the STP may be satisfied by processes that produce merely *exchangeable* sequences of outcomes (de Finetti 1937). In these, the probability of an outcome sequence of a given length depends only on the frequency of outcome types in that sequence and not on their order. For example, sampling *without replacement* from an urn gives rise to an exchangeable sequence, although the chances change as the constitution of the urn changes. If a “perfect repetition” involves drawing from an urn of the same prior constitution, then this may be a stable trial even though successive draws from the urn have different probabilities. It is possible to prove a law of large numbers for exchangeable sequences to the effect that almost all of them involve outcome frequencies equal to the chances of those outcomes.
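
The urn case can be checked with a small calculation. What follows is only a minimal sketch, assuming a hypothetical urn of three red and two black balls; the urn contents and the helper name `sequence_probability` are illustrative choices and do not come from the text. It computes the probability of each ordering of two reds and one black under sampling without replacement, and compares the chance of red on the first draw with the chance of red on the second draw given a red first.

```python
from fractions import Fraction
from itertools import permutations


def sequence_probability(draws, red, black):
    """Probability of drawing the colours in `draws` ('R' or 'B'), in order,
    without replacement, from an urn with `red` red and `black` black balls.
    Hypothetical helper; assumes len(draws) <= red + black."""
    prob = Fraction(1)
    for colour in draws:
        total = red + black
        if colour == "R":
            prob *= Fraction(red, total)
            red -= 1  # the urn's constitution changes after each draw
        else:
            prob *= Fraction(black, total)
            black -= 1
    return prob


# Assumed urn for illustration: 3 red balls and 2 black balls.
RED, BLACK = 3, 2

# All distinct orderings of two reds and one black.
orderings = sorted(set("".join(p) for p in permutations("RRB")))
probabilities = {seq: sequence_probability(seq, RED, BLACK) for seq in orderings}

# Chance of red on the first draw, and on the second draw given a red first.
p_red_first = Fraction(RED, RED + BLACK)
p_red_second_given_red = Fraction(RED - 1, RED + BLACK - 1)

for seq, prob in probabilities.items():
    print(seq, prob)
print(p_red_first, p_red_second_given_red)
```

Under these assumptions every ordering of two reds and one black receives probability 1/5, so the sequence is exchangeable, while the chance of red falls from 3/5 on the first draw to 1/2 on the second once a red has been removed.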

(^{6})
In the sense of *explains* on which *T explains why p* doesn’t entail or presuppose *T*.

(^{7})
Indeed Ismael (2009) argues that every testable theory comes equipped with a probability function which enables the theory to make predictions from partial information, and that such probability functions are perfectly objective features of the content of the theory.

(^{8})
Given the focus of this chapter, I am setting aside those philosophical views that deny that claims about support or confirmation can be helpfully understood using probability (Popper 1963). See Sprenger (this volume) for more on confirmation.

(^{9})
This argument is controversial. Because someone can reasonably have different fair betting rates on *p* and *q*, even when they know both, the betting dispositions interpretation of credence does not require that each known proposition has equal credence, so they needn’t each have credence 1.

(^{10})
Other revisionary responses include claiming that something is learned and confirmatory: namely, that *h* entails *e* (Garber 1983). Note that an agent who genuinely learns this is subject to a Dutch book before learning it (since they are uncertain of a trivial proposition) and so is irrational; this response therefore cannot be a general account of confirmation by old evidence for rational agents.

(^{11})
Lewis’s own formulation involves reference to time-dependency of chance, a reference that, in my view, is a derivative feature of the dependence of chance on the physical trial and that I suppress here for simplicity (Eagle 2014).

(^{12})
The PP is a schema that holds for any proposition of the form $⟦P(p) = x⟧$, not just the true one. So, whatever the chances might be, there is some instance of PP that allows updating on evidence about the actual chances.