Fine tuning, total evidence and indexicals
This is a follow-up of the previous post.
The debate on fine tuning and multiverses hinges on complex issues related to Bayesian reasoning. An influential argument from White in particular seems to show that we cannot infer the existence of the multiverse from our evidence of fine tuning. (White’s argument is apparently the main reason Philip Goff rejects the multiverse hypothesis, since most of his examples come from this particular paper.)
The argument rests on the requirement of total evidence that White illustrates with this example:
Suppose I’m wondering why I feel sick today, and someone suggests that perhaps Adam got drunk last night. I object that I have no reason to believe this hypothesis since Adam’s drunkenness would not raise the probability of me feeling sick. But, the reply goes, it does raise the probability that someone in the room feels sick, and we know that this is true, since we know that you feel sick, so the fact that someone in the room feels sick is evidence that Adam got drunk.
This reasoning is silly: the fact that Adam got drunk makes it more likely that someone in the room feels sick, but not that I do. Focusing on weak evidence instead of strong evidence can lead us astray in our inferences.
How is this supposed to apply to fine tuning? Our strong evidence is “our universe is fine-tuned”. From this, we infer the weak evidence “at least one universe is fine-tuned”. The multiverse hypothesis explains why at least one universe is fine-tuned: if there are many universes, the probability that at least one of them is fine-tuned is much higher than if there is only one. But according to White, this does not explain why our universe in particular is fine-tuned. Claiming that it does would mean falling prey to an inverse gambler’s fallacy (see previous post).
White’s argument is fallacious, and I’ll explain why by using toy models.
Bayesianism
Let me first note that I take Bayesian inference to tell us how likely a model is given our evidence, or how much we should boost our confidence in this model.
A model can be probabilistic, in which case it incorporates probabilistic processes. Then the model predicts that our evidence E should occur with probability p. We can use Bayes’s theorem to “invert” this probability and infer the likelihood of the model itself given our evidence.
This rationale is not devoid of problems (it requires that we fix prior likelihoods for our models: how?), but the details will not matter much here. For the sake of this article, we can just consider the following weaker principle:
A model M1 is favoured (or “boosted”) by a piece of evidence E compared to another model M2 if M1 makes E more likely than M2 does.
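To make this principle concrete, here is a minimal Python sketch of the comparison. The function names and the numbers are mine, purely for illustration:

```python
# A minimal sketch of the principle above: evidence E favours M1 over M2
# exactly when the likelihood ratio P(E|M1) / P(E|M2) is greater than 1.
# The numbers below are illustrative placeholders.

def bayes_factor(p_e_given_m1: float, p_e_given_m2: float) -> float:
    """Likelihood ratio of M1 over M2 for the same evidence E."""
    return p_e_given_m1 / p_e_given_m2

def posterior_odds(prior_odds: float, p_e_given_m1: float, p_e_given_m2: float) -> float:
    """Posterior odds of M1 over M2 = prior odds x likelihood ratio."""
    return prior_odds * bayes_factor(p_e_given_m1, p_e_given_m2)

# Example: E is twice as likely under M1 as under M2, with equal priors.
print(bayes_factor(0.2, 0.1))         # 2.0 -> M1 is boosted
print(posterior_odds(1.0, 0.2, 0.1))  # 2.0
```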
[Side note]
In real scientific practice, the evidence is statistical, because we want our models to have a certain level of generality. This cannot be the case with fine-tuning: we only have one piece of evidence, associated with the values of the constants of our universe.
Strictly speaking, this is a problem for the argument for the multiverse. Why should we assign a probability space to the possible values of constants? Why not simply say that our evidence favours the model with actual constants?
The point is that constants are primitive in a theory, but the problem of fine-tuning prompts us to explain their values, which is tantamount to speculating about what future theories would tell us about them. There is no reason to think that future theories will associate the value of physical constants with a probabilistic process (although it’s not excluded).
I personally think that this is enough to discard the problem: just wait for a future theory before speculating. However, this is not the issue that I want to discuss here, so let us grant that there’s a problem that can be framed in a probabilistic setting.
[End side note]
Illustration
It will be easier to first illustrate the issue with a mundane case analogous to fine-tuning. This will shed light on how exactly the requirement of total evidence and the inverse gambler fallacy are involved.
Mary and John are infertile. Their only way to have children is IVF. In their case, the success rate of IVF is 10%. The law of their country stipulates that a couple can only try IVF twice, with a three-year interval between the two trials. They can try a second time even when the first trial was successful.
Let us consider three models.
In the first model, M0, Mary and John never try IVF. The model predicts with 100% chance that Mary will not get pregnant.
In the second model, M1, Mary and John only attempt IVF once. The model predicts a probability of 10% that Mary gets pregnant, and 90% that she doesn’t.
In the third model, M2, Mary and John attempt IVF twice. This gives a 1% chance that the two trials were successful, a 9% chance that only the first was, a 9% chance that only the second was, which means a 19% chance in total that Mary gets pregnant at least once, and an 81% chance that she doesn’t get pregnant at all.
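For readers who want to check these numbers, here is a small Python sketch, assuming each attempt succeeds independently with probability 10% (the variable names are mine):

```python
# Checking the numbers of the three models, assuming each IVF attempt
# succeeds independently with probability 0.1.

P_SUCCESS = 0.10

# M1: a single attempt.
p_pregnant_m1 = P_SUCCESS                                # 0.10

# M2: two independent attempts.
p_both        = P_SUCCESS * P_SUCCESS                    # 0.01
p_only_first  = P_SUCCESS * (1 - P_SUCCESS)              # 0.09
p_only_second = (1 - P_SUCCESS) * P_SUCCESS              # 0.09
p_pregnant_m2 = p_both + p_only_first + p_only_second    # 0.19
p_never_m2    = (1 - P_SUCCESS) ** 2                     # 0.81

print(round(p_pregnant_m1, 2), round(p_pregnant_m2, 2), round(p_never_m2, 2))
# -> 0.1 0.19 0.81
```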
Now let us consider three scenarios.
- First scenario:
Mary is one of Robert’s remote relatives. Robert knows the procedure that Mary has to follow to get pregnant. Robert asks Mary if she knows what it is like to be pregnant. Mary answers “oh yes, I know it first hand”. From Mary’s answer, Robert learns that Mary got pregnant at least once, but he doesn’t know whether it was once or twice.
Clearly, Robert should boost his confidence in M2, because it predicts his evidence with a 19% probability instead of a 10% one for M1. M0 is completely excluded by his evidence.
To see that Robert’s reasoning is correct, we can imagine that there are 300 situations like Robert’s in the world. In 100 of these, the relative has never tried IVF. In 100 other situations, she has tried IVF once, and in 10 of them she got pregnant. In the last 100 situations, the relative has tried IVF twice, and in 19 of them she got pregnant. Most of the situations where the relative can say “oh yes, I know it first hand” are situations where she made two attempts, so people in Robert’s situation are right to bet that they are probably in one such situation.
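Here is a short Python sketch of this count; the labels are mine, and the numbers simply restate those above:

```python
# Counting the 300 situations like Robert's. Among relatives who can say
# "oh yes, I know it first hand", how many belong to the two-attempt group?

situations = {
    "M0 (no attempt)":   {"total": 100, "pregnant": 0},
    "M1 (one attempt)":  {"total": 100, "pregnant": 10},
    "M2 (two attempts)": {"total": 100, "pregnant": 19},
}

pregnant_total = sum(s["pregnant"] for s in situations.values())  # 29
for name, s in situations.items():
    print(f"{name}: {s['pregnant']}/{pregnant_total} = {s['pregnant'] / pregnant_total:.0%}")
# M2 accounts for 19 of the 29 "first hand" answers (about 66%),
# so Robert is right to bet on two attempts.
```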
- Second scenario:
Alice is a doctor at the hospital. She takes care of Mary for her IVF. After a few weeks, she learns that the IVF was successful: Mary got pregnant. However, Alice has no idea whether it’s Mary’s first or second attempt. She doesn’t even know if Mary already has a child. All she knows is that this particular IVF was successful.
Should Alice conclude from the success that it’s Mary’s second attempt? After all, two attempts make success more likely. However, this would be a fallacy.
In this scenario, Alice should consider three models. The first one is M1: this is Mary’s first and only attempt. The second one is M2, complemented with the information that this is the first attempt, some kind of “you are here” sign attached to the model if you will. Call this complemented model M2′. Finally, there is M2 complemented with the information that this is the second attempt, call it M2″.
Note that M0 should not be considered at all: not because it will be discarded by Alice’s evidence, but because it does not apply to the case. When considering a situation, we should only take into account the models that are apt to represent the situation, and M0 is not one of them. Given that she is attending one of Mary’s attempts (this is the context), there is no more reason for Alice to consider M0 than there is to consider the quantum model of the hydrogen atom to represent the situation.
In this case, Alice’s evidence does not favour any of the three models. Indeed, the probability of success is 10% with M1, and it is also 10% with M2′ or M2″: 1% chance that the two trials were successful including this one, and 9% chance that only this one was.
Here we have an illustration of the requirement of total evidence. If Alice went from the strong evidence “this attempt was successful” to the weaker evidence “at least one attempt was successful”, she would mistakenly favour M2 over M1. She would be committing an inverse gambler’s fallacy. She must take into account her total evidence, and therefore consider M2′ and M2″ instead of M2, and then no model is boosted.
We can imagine 300 situations like Alice’s to see why her reasoning is correct. Assume that in 100 of them it is the patient’s first and only attempt, 10 of which will be successful, in 100 of them it is the patient’s first attempt among two, 10 of which will be successful, and in 100 of them it is the patient’s second attempt, 10 of them being successful as well. Successes are equally distributed in all three groups, so people in Alice’s situation have no reason to infer from success that their situation is among any of the three groups.
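Again, a small Python sketch of this count, with labels of my own:

```python
# Counting the 300 situations like Alice's. Successes are spread evenly
# across the three groups, so observing a success does not tell Alice
# which group she is in.

groups = {
    "M1 (only attempt)":            {"total": 100, "successes": 10},
    "M2′ (first of two attempts)":  {"total": 100, "successes": 10},
    "M2″ (second of two attempts)": {"total": 100, "successes": 10},
}

total_successes = sum(g["successes"] for g in groups.values())  # 30
for name, g in groups.items():
    print(f"{name}: {g['successes']}/{total_successes} = {g['successes'] / total_successes:.0%}")
# Each group accounts for 10 of the 30 successes: no model is boosted.
```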
Note, however, that there is a sense in which M2 is boosted in this scenario. When Alice first learns that Mary will attempt IVF, before learning about the outcome, she has some reason to favour the hypothesis that this is one of two attempts, simply because if Mary makes two attempts instead of one in her life, there is a greater chance that she will meet Alice. But then the focus is slightly different: we are not considering the narrow situation where an IVF attempt is made, but the larger situation where Mary could have decided to attempt IVF or not, could have met Alice or another doctor instead, etc., and we can suspect that more information should be taken into consideration in order to evaluate which of M0, M1 and M2 is boosted by the evidence (for example, the likelihood that Mary meets Alice rather than another doctor). In any case, the evidence of success plays no role in this context.
- Third scenario:
Jane is Mary and John’s daughter. She knows about the process that her parents had to follow for her conception. However, because of dramatic circumstances, she was taken away from her parents just after her birth, so she doesn’t know how many IVF attempts her parents made, and she doesn’t even know if she has a sibling.
The question is: are we closer to the first or to the second scenario?
The fact that Jane has evidence, from her own existence, that her mother got pregnant at one specific time, following one specific conception procedure, could let us think that her situation is very close to the second scenario. However, this is a mistake.
The reason is that in the second scenario, Alice first learns that Mary wishes to attempt IVF. This sets up a context for her inferences. Then she learns that the IVF was successful: this is her evidence. But this is not the case for Jane.
What sets up the context in Jane’s case is when she learns that her parents had to follow a particular procedure to have children, without reference to one particular IVF attempt. This context does not actually imply that her parents made any attempt: M0 is still a possible model in this context. It is her evidence that she exists that will favour one model or another among M0, M1 and M2.
Her evidence will discard M0: the probability of her own existence is 0% with this model. It will favour M2 over M1: her mother probably made two attempts. This raises the probability of her own existence from 10% to 20% (the case where both attempts are successful in M2 makes her existence twice as likely, so we get 20% instead of 19%).
Why doesn’t Jane consider our models M2′ and M2″ above? Because this would be asking a different question: whether she is her mother’s first or second child. Her evidence tells her nothing about this, and indeed, none of M1, M2′ and M2″ is boosted in this inferential context. But it also makes sense for Jane to ask how many times her parents tried IVF, and in this inferential context, her evidence favours the hypothesis that they tried twice.
To see that Jane’s reasoning is correct, we can imagine that there are many cases like Jane’s in the world. Imagine that 100 parents never attempt IVF, 100 parents attempt IVF only once, resulting in 10 children like Jane being born, and that 100 parents attempt IVF twice, resulting in 20 children like Jane being born. Twice as many children in Jane’s situation are children from parents that made two IVF attempts instead of one. So, each of these children would be right to infer that their parents probably made two attempts, because they are more likely to be among the 20 than among the 10.
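The same conclusion can be reached with a quick Monte Carlo sketch in Python, under the assumption that each attempt succeeds independently with probability 10% and that a third of families fall in each group (no attempt, one attempt, two attempts), as in the count above:

```python
# A Monte Carlo version of the count above: simulate many families, a third
# in each group, with each attempt succeeding independently with probability
# 0.1, and ask where children like Jane come from.

import random

random.seed(0)
P_SUCCESS = 0.10
N_FAMILIES = 100_000

children_from_one_attempt = 0
children_from_two_attempts = 0

for _ in range(N_FAMILIES):
    attempts = random.choice([0, 1, 2])                    # M0, M1 or M2
    births = sum(random.random() < P_SUCCESS for _ in range(attempts))
    if attempts == 1:
        children_from_one_attempt += births
    elif attempts == 2:
        children_from_two_attempts += births

total_children = children_from_one_attempt + children_from_two_attempts
print(children_from_two_attempts / total_children)
# -> close to 2/3: a child like Jane is about twice as likely to come
#    from a two-attempt family.
```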
So, Jane’s case is actually closer to the first scenario, despite her knowledge that her conception occurred. Her inferential context is one where she knows that her parents followed an IVF procedure once or twice, and what she learns from her own existence is not that the first attempt was successful, nor that the second attempt was successful, but only that at least one attempt (which happens to correspond to her conception) was successful.
Back to cosmology
How can we transpose our reasoning to the problem of fine-tuning?
Consider a cosmological model with only one universe U1, with a 10% probability that there is life in this universe, and another cosmological model with two universes U1 and U2, with a 10% probability for each universe to contain life. This is exactly analogous to our models M1 and M2: a universe creation event is analogous to an IVF attempt, and the development of life in this universe is analogous to a success leading to pregnancy. The first model incorporates a single-universe hypothesis, and the second a multiverse (actually, a bi-universe) hypothesis.
We are in the situation of Jane. We do not know if we have “sibling” universes. We do not know if more than one universe was created. Our only evidence is that we are in a universe with life. We know that it is our universe, this universe, just like Jane knows that she is the result of her conception. However, we have no idea, under the hypothesis that the right model is M2, whether this universe would be U1 or U2 in the model. And just like in Jane’s case, our evidence that our universe contains life favours M2 over M1.
[Side note]
If there is a disanalogy between the cosmological case and the previous case, it lies in the fact that the justification we adopted (“imagine there are 300 situations like this one…”) cannot be adopted here. There is only one multiverse.
One way around this problem is to consider 200 possible situations like ours. In 100 of them, there is only one universe, and in 10 out of them, it’s a universe with life. In the 100 other situations, there are two universes. In 18 out of them, there is one universe with life, and in 1 there are two universes with life. Any possible population of a universe with life would be right to assume that they live in a multiverse, since this is the case for 20 possible universes among 30.
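Here is a small Python sketch of this count, assuming a 10% chance of life per universe and independence between the two universes (the variable names are mine):

```python
# Counting the 200 possible situations, assuming a 10% chance of life
# per universe and two independent universes in the multiverse case.

P_LIFE = 0.10
N_EACH = 100   # 100 single-universe situations, 100 two-universe situations

# Single-universe situations: universes with life.
single_life_universes = N_EACH * P_LIFE                      # 10

# Two-universe situations.
exactly_one = N_EACH * 2 * P_LIFE * (1 - P_LIFE)             # 18 situations
both        = N_EACH * P_LIFE * P_LIFE                       # 1 situation
multi_life_universes = exactly_one * 1 + both * 2            # 20

total_life_universes = single_life_universes + multi_life_universes  # 30
print(round(multi_life_universes / total_life_universes, 2))
# -> 0.67: two out of three possible populations of a universe with life
#    belong to a two-universe situation.
```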
Is this fallacious reasoning? Perhaps, but then any probabilistic reasoning about multiverses will be fallacious, including White’s (see the side note in the introduction). But if we accept that probabilistic reasoning makes sense in this context, we should accept that the multiverse hypothesis is boosted by our evidence.
[End side note]
White insists that we must take into account our evidence that this universe exists, and that this fact is part of our background knowledge in Bayesian inference. The way he models this in his article amounts to pointing to one of the two universes in M2 and declaring “this is our universe”. In essence, White asks us to consider the model M2′ as the right representative of the multiverse hypothesis instead of the model M2. And he remarks that this model is not favoured by our evidence, which is right. But M2′ is not the right model.
M2′ would correspond to our situation if we had some means of identifying our universe with respect to other universes in the model. This would be the case, for example, if we were there before the creation of our own universe, just like Alice was there before Mary’s pregnancy. We could have pointed to our universe and said “let’s call this one U1, and let’s see if it contains life”. Then we would have applied our two models, M1 and M2′, and seen that neither model is favoured by our evidence that U1 eventually contains life.
But that’s not how it works. There might be something that identifies our universe with respect to the other ones in the multiverse (perhaps they are ordered in a time sequence, just like Mary’s IVFs) but this is not some information to which we have access. We do not have the indexical information that White thinks we have that would allow us to locate ourselves in the multiverse.
We could use models like M2′ and M2″ if we wanted to know “is U1 in the model our universe, or is it U2?”, in the way Jane could wonder “am I the first or the second child?”, and as expected, our evidence would not favour either hypothesis. But this is not the question that we want to ask in the context of fine-tuning. What we want to know is whether there are other universes. This is why we should consider M2 as the right model of the multiverse instead of M2′ or M2″. The fact that our situation resembles Jane’s much more than Alice’s is enough to make this point.
The difference between M2′ and M2 is that the former comes with a “you are here” sign. Scientific models rarely come with a “you are here” sign, and theoretical models never do. They aim at generality. They are interested in types, not in particulars, and cosmology, despite its peculiar object, the universe, is no exception to the rule. It applies to the universe the same methods that worked for other types of system, thus implicitly considering that our universe is just a representative of a certain type of physical object (modelled, for example, like an infinite gas, without any specification of our own position).
Now philosophers, drawn by metaphysical considerations, may wonder which “you are here” cosmological model is favoured by our evidence, taking into account our best scientific theories. It’s always nice to see metaphysicians taking the outcomes of science into account in their reflections, but it would be better if they paid due respect to what these models aim at before asking the wrong questions. In this regard, they should acknowledge that science has never been in the business of vindicating “you are here” models.