Sunday, September 28, 2014

Conservation of Information in Evolutionary Search - Talk by William Dembski - part 4

For an introduction to this post, take a look here.

Part 4: 31' 25" - 45' 00"

(I had to pause at 45': there is such an elementary mistake in Dembski's math, it was just too funny...)

Topics: What is Conservation of Information?

William Dembski: Now let us get to the heart of things: "Conservation of Information". What is this conservation? Let me put up the next slide.

William Dembski: This is probably the most gem-packed slide in this talk. I want to make a distinction between what I call probable and improbable events, and probable and improbable searches. An improbable event is just something of high improbability: flip a coin a thousand times, get a thousand heads in a row. Highly improbable. It happens: if you believe in a multiverse, then there is a universe where this is happening, where someone like me is speaking and my doppelgänger flips a coin over the next hour and sees 1000 heads in a row. A probable or improbable search, by contrast, is about the probability that a search is successful. It is not so much asking whether it actually succeeds; it is not concerned with the result. It is concerned with the probability distribution associated with the search. This is an important distinction, because so many intelligent design arguments look for a discontinuity in the evolutionary process. We look for highly improbable events. That is what the intelligent design people do; you get it, for instance, in Thomas Nagel's "Mind and Cosmos". He is basically looking at probabilistic miracles, at how the origin of life undercuts a materialistic understanding of biology. So he is looking at improbable events. That is what we do when we try to find evidence for a discontinuity. What I'm doing in this talk is saying: look, I'm going to give you evolution, give you common ancestry, all of that. That is no problem. What I'm interested in, though, is the probability of success for a search.
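(Aside: the distinction is easy to make concrete in a few lines of code. This is just a toy illustration of mine, not anything from Dembski's slides: on one side the probability of a single fixed improbable event, on the other the success probability of a search procedure - here an invented blind search that gets $k$ queries into $N$ boxes, whose success probability is $k/N$ no matter how any particular run turns out.)

```python
import random

# Probability of one fixed improbable event: 1000 heads in a row.
p_event = 0.5 ** 1000                    # about 9.3e-302

# Success probability of a *search*: blindly query k of N boxes,
# one of which hides the target. The exact value is k/N; we can also
# estimate it by running the search procedure many times.
N, k = 1_000, 50

def blind_search():
    target = random.randrange(N)
    queries = random.sample(range(N), k)     # k distinct blind guesses
    return target in queries

runs = 20_000
estimate = sum(blind_search() for _ in range(runs)) / runs
print(p_event)                # probability of one particular outcome
print(k / N, estimate)        # success probability of the search: exact, estimated
```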

member of the audience: What are we searching for?

William Dembski: It is whatever the target happens to be.

member of the audience: [???] Can you give an example? [???]

William Dembski: I think that is what I would challenge you on. Actually, you are jumping ahead; I will address this a little bit later. Someone like Richard Dawkins will say that the problem with this METHINKS IT IS LIKE A WEASEL example is that it introduces a target, while real biology does not give us targets - and then he takes that back. I will give you a quote on that later. But I would say that the target in biology is teleology. Biological systems are teleological systems, teleological agents; that is what they produce, that is what needs to be explained. If you want to put it in terms of philosophy: there is a natural kind that becomes the target, and that is teleological agents. In fact, one of my good friends and colleagues is also here, James Barham [?]; if you want to talk with him, that would be good. Give me a moment, because I want to speak to that, it will really come up.
In the computational context, it is never a problem: you are trying to solve something. Even the people who are writing these AVIDA and ev programs: for instance, in AVIDA, if you saw the article in "Nature" back in 2003 where they were arguing that this program was evolving irreducibly complex systems, they were specifically trying to get Boolean operators of a certain complexity. That was what they were rewarding. That was their target. What I describe to you now is Conservation of Information in a theoretical [???]. What we then do is go and look at these actually evolving systems - usually in silico - and show where the information was put in. We have a theory, and then we show how the theory applies to these specific cases. Give me a moment - I know what you are asking. This is commonly how evolution is presented: teleology is supposed to be absent. In fact, what I think they do is slip it in.
Improbable search. Think of it this way: you have got a disease and two procedures you can take to get well. One has a higher probability of success, but maybe is more expensive. Which procedure do you want to use? You want to use the high-probability one. The actual outcomes may vary: someone who takes the low-probability procedure may be successful, he may get lucky; someone who takes the high-probability one may be unlucky. But the concern is: how likely is the search to find the target? That is what we are interested in in science. Getting lucky is not a good scientific explanation. If you are facing a needle-in-the-haystack problem and trying to find that needle, what are you going to do? You try to find a better search, one that does not leave it a needle-in-the-haystack but provides you with a high probability. That is what Dawkins does.
In METHINKS IT IS LIKE A WEASEL, he does not solve it by randomly shaking out scrabble pieces - that would be $27^{28}$, roughly $10^{40}$, and that would be your waiting time on average to get to that target sequence. That becomes your waiting time; waiting time and probability are interchangeable. That would be your average waiting time to get there. Because he substitutes his Darwinian search for blind search, he gets there much faster. But the question then is: what justifies him substituting that search?... The sense that I'm getting is that my presentation time is running out, and I think this is a good place to come in with this.
But what Dawkins does is essentially this. He says: look, there is this blind search that is hopeless, it is needle-in-the-haystack, a highly improbable search. What I'm going to do - and that is why Darwin is so great - is give you a high-probability search that is going to get you there. Then he says: see, Darwin has solved all our problems. Now, I think we have somebody on the faculty here who has a blog, "Why Evolution is True". I'd say it should probably be renamed "How Evolution is True", because the question "why evolution is true" - why does this work so well, what did Dawkins do to give us this search, this Darwinian search which is supposed to work, why does it work? Because he infused it with information. That is why it works. That is where I'm going with it...
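(Aside: to make the contrast concrete, here is a minimal sketch of the two searches. Dawkins never published his exact program, so the population size of 100 and the 4% mutation rate below are my own guesses; the point is only that the blind search has an average waiting time of about $27^{28} \approx 1.2 \cdot 10^{40}$ draws, while cumulative selection toward the fixed target string gets there in at most a few hundred generations.)

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "        # 27 symbols: A-Z and space

# Blind search: one random draw of 28 characters hits the target with
# probability 27^-28, i.e. an average waiting time of about 1.2e40 draws.
waiting_time = len(ALPHABET) ** len(TARGET)
print(f"blind-search average waiting time: {waiting_time:.2e} draws")

# Cumulative selection (a common reconstruction, not Dawkins's actual code):
# each generation, keep the best of 100 mutated copies of the current string.
def score(s):
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s, rate=0.04):
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in s)

current = "".join(random.choice(ALPHABET) for _ in TARGET)
generation = 0
while current != TARGET:
    generation += 1
    current = max((mutate(current) for _ in range(100)), key=score)

print(f"cumulative selection reached the target after {generation} generations")
```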
So, the distinction between probable and improbable search. We can think then of a p-search as a search that has probability p of finding the target. Next, consider that a search can itself be an object of search. What did Dawkins do with METHINKS IT IS LIKE A WEASEL? He did a search for the search. He gave us a search which then, with high probability, found the target sequence he was after.
This is something people in optimization do; one name for it is "hyper-heuristics". You are looking at heuristics, at searches, and the question is how you choose among your heuristics. If you are choosing among heuristics, you are doing a search for a search. We abbreviate that as S4S. Conservation of Information - usually abbreviated as CoI - this is probably the purpose of this talk, and it is as clear a statement as you can get: if you have $p < q$ and you want to improve a p-search to a q-search - the p in Dawkins's weasel is about $1:10^{40}$, and now you are going to improve this to, well, if you allow yourself 50 or 60 queries, a q close to 1 - that improvement requires a $p/q$-search for the q-search. What you have done is: the search for the search has become difficult. If p is very small and q is large, then $p/q$ becomes pretty small. The search for a search becomes difficult, the search for a good search becomes difficult. If you think of Dawkins's weasel, the unimodal distribution is one of many other possible distributions.
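(Aside: written out, the claim on the slide is an inequality: if the original search finds the target with probability $p$ and the improved search with probability $q$, then blindly finding such an improved search is itself at most a $p/q$-search. Plugging in the weasel figures Dembski just quoted - the numbers are his, the rearrangement is mine:

$$\frac{p}{q} \approx \frac{10^{-40}}{1} = 10^{-40}, \qquad -\log_2\frac{p}{q} \approx 40 \cdot \log_2 10 \approx 133 \text{ bits},$$

so on this accounting the difficulty has not disappeared, it has moved into the search for the search - which is exactly where Dembski wants to go.)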

William Dembski: Let me give you an example. You have got an Easter egg hunt. A standard Easter egg - an Easter egg that is well hidden, and it is hidden in a huge field. Blind search is highly unlikely to find you that Easter egg. What you are going to want is a directed search, a search which is [assisted?]. Blind search would be a lot of sampling; you may try to do an exhaustive search, but you are not able to exhaust things, because your query limit does not allow you to exhaust the search space.

William Dembski: So you are going to do a directed search. What does a directed search look like? You are walking along the field, and somebody is telling you "warm", "warmer", "cold", "warmer", "warmer", "hot", "you are burning up" - and there it is. That sort of direction - "warm", "warmer", "hot" - what is that? That is information. It is information that is going into the search. Here is the question: where is this information source? Does the information source know where it is? Is it a search for the information source? Perhaps not a search for the information source. The information source knows the answer, but the process - in this case me meandering about - is getting information. I am doing a search. Let me give you another angle on conservation of information, because I have described information as something that increases the probability [???]. Usually you apply a negative-logarithmic transformation and turn information into something that is additive and looks more like money, which is convenient. But let us think of it probabilistically. We do pay to increase probabilities all the time.
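(Aside: the "warmer/colder" oracle is easy to simulate. The sketch below is my own toy version, not anything from the talk: the egg sits in one of $N$ cells on a line, the only help is an oracle that says whether one position is warmer than another, and a bisection driven by those answers finds the egg in about $\log_2 N$ queries instead of the roughly $N/2$ a blind search needs on average. The last line applies the negative-logarithmic transformation he mentions, turning the target's probability $1/N$ into bits.)

```python
import math
import random

N = 1_000_000                      # cells in the "field"
egg = random.randrange(N)          # where the Easter egg is hidden

def warmer(here, there):
    """The oracle: does moving from `here` to `there` bring you closer to the egg?"""
    return abs(there - egg) < abs(here - egg)

# Directed search: repeatedly ask which end of the remaining stretch is warmer
# and discard the colder half -- a bisection driven only by warmer/colder answers.
lo, hi, queries = 0, N - 1, 0
while lo < hi:
    queries += 1
    mid = (lo + hi) // 2
    if warmer(lo, hi):             # the far end is warmer: the egg is in the right half
        lo = mid + 1
    else:                          # the near end is at least as warm: the left half
        hi = mid

print(f"directed search: egg found at cell {lo} after {queries} oracle answers")
print(f"blind search: about {N // 2} guesses on average for the same egg")
print(f"information supplied by the oracle: -log2(1/N) = {-math.log2(1 / N):.1f} bits")
```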

William Dembski: If I'm playing a lottery, the more lottery tickets I buy, the more likely I am to win.

William Dembski: But this is in the case of a fair lottery (unlike the lotteries that the state runs), where everything that was paid in gets paid out under proper probabilistic principles: by buying more tickets, I will increase my probability of winning the lottery. But have I increased my expected gain? No. I can pay more to increase the probability of winning, but in the end I do not gain anything. Conservation of information works like that. Let me give you perhaps the simplest example, and actually do the numbers for you.
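(Aside: the fair-lottery arithmetic, under the payout rule he describes - everything paid in is paid out to the single winner. Suppose $N$ tickets are sold at price $c$ each, so the pot is $Nc$, and I buy $k$ of them; $N$, $c$ and $k$ are just my labels:

$$E[\text{gain}] \;=\; \frac{k}{N}\cdot Nc \;-\; k\,c \;=\; 0 .$$

The probability of winning, $k/N$, grows with every ticket bought, but the expected gain stays at zero - which is the analogy to conservation of information he is drawing.)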

William Dembski: We all remember "Let's Make a Deal" with Monty Hall.

William Dembski: There are three curtains with a prize behind one of the curtains. Let us say the prize is behind curtain 1. What is the probability of winning? I'm going to do this search. I have got one opportunity. That is my query limit. One opportunity, so I have got a probability of 1/3 to win this thing. But now let's say that someone comes to me and gives me a ticket:

William Dembski: It is one of these tickets. This ticket (1,1) says "it is behind curtain 1"; this one (1,2) says "it is behind curtain 1 or curtain 2 with equal probability". Of the nine possible tickets, these five will increase my probability of getting to curtain 1 and thus winning the prize. But the thing is: only five of these tickets! p is 1/3, that is the original probability; I'm now trying to bump it up to 1/2, that is q; but the actual probability of finding one of these tickets is less than that, it is 5/9 - the probability is going down, it is less than p/q. This is typical for these search-for-a-search situations:

So, Dr. Dr. William Dembski does the numbers for us, for this, the simplest of all examples: $p = 1/3$ and $q = 1/2$. What? Wasn't q the probability of finding the prize while using our search strategy, i.e., $P(Choosing\,curtain\,1|Using\,one\,of\,the\,five\,tickets)$? But that is not $1/2$ as he says, it is actually $\frac{4}{5}\cdot \frac{1}{2} + \frac{1}{5} \cdot 1 = \frac{3}{5}$! And therefore $p/q = \frac{1}{3} / \frac{3}{5} = \frac{5}{9}$, exactly the probability of finding a circled ticket. No surprise here, that is how conditional probabilities work:

$$p = \frac{1}{3} = P(Choosing\,curtain\,1) = P(Choosing\,curtain\,1 \mid Using\,one\,of\,the\,five\,tickets) \cdot P(Using\,one\,of\,the\,five\,tickets) + P(Choosing\,curtain\,1 \mid Using\,one\,of\,the\,other\,tickets) \cdot P(Using\,one\,of\,the\,other\,tickets) = \frac{3}{5} \cdot \frac{5}{9} + 0 \cdot \frac{4}{9} = q \cdot \frac{5}{9}$$
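(For the skeptical reader, a brute-force check of these numbers, under the same uniform assumptions as above - a ticket is drawn uniformly from the nine, and a ticket $(i,j)$ is followed by picking each named curtain with equal probability:)

```python
from fractions import Fraction
from itertools import product

PRIZE = 1                                          # the prize is behind curtain 1
tickets = list(product([1, 2, 3], repeat=2))       # the nine possible tickets (i, j)

def win_probability(ticket):
    """Follow the ticket: pick each of its two named curtains with probability 1/2."""
    i, j = ticket
    return Fraction(sum(curtain == PRIZE for curtain in (i, j)), 2)

helpful = [t for t in tickets if PRIZE in t]       # the five "circled" tickets

p = Fraction(1, 3)                                 # unassisted probability of winning
q = sum(win_probability(t) for t in helpful) / len(helpful)

print(q)                                      # 3/5, not Dembski's 1/2
print(p / q)                                  # 5/9
print(Fraction(len(helpful), len(tickets)))   # 5/9, the chance of drawing a helpful ticket
```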

This error is so elementary that the audience wasn't able to spot it...

I have to agree with Dembski, though: this is typical for these search-for-a-search situations...
