Friday, October 2, 2009

Which parameters did Dawkins use for his weasel?

If R. Dawkins used a weasel algorithm as understood by virtually everyone who read his book (and tried to program it), he has two parameters to play with:
  1. the size of the population S and
  2. the probability µ to change a letter
So, which values did he use? That, I can't answer. But I can say which values he should have used:

Obviously, the greater the size of the population for a fixed µ, the smaller the number of generations it takes to find the target. On the other hand, for a fixed population size, a similar rule for µ can't be given - one probably wants to keep the parent in the next generation, so µ has to be quite small for small populations, and can be bigger for great ones.

Here, the coloured lines represent the expected values of the number of generations for different sizes of population as a function of µ. At each curve, I added the minimal expected number - and I joined these minima by a black line. So, a population size of 200 results at best in 49 generation on average - and this minimum is reached for µ ~ 0.044.

But if you want to have a fast algorithm, you'd look at the number of queries, not the number of generations, i.e., the total number of children, or how often the fitness function has to be calculated:

(Yep, I like this pic :-) Here, you can see that for getting the best results, the size of the population is surprisingly small: 20 individuals. But though it takes over 325 generations on average to reach the target, the fitness value for only ~6514 individuals has to be calculated - that is, if you chose µ ~ 0.041.
Another interesting effect: for population sizes from 20 to 200, the optimal µ doesn't change very much, it's just a little bit over 0.04.

Of course, no one does these calculations before running the program: you write your weasel and play with it. But the natural behaviour of this algorithm will lead you to population sizes from 20 to 200, and a µ of 0.04 - 0.05.

4 comments:

  1. I assume you got the plots by way of Markov chain analysis. Does "mutation" always change the character? If not, the effective mutation rate is µ * 26 / 27.

    You mentioned on my blog that you have been working in R. Would you be willing to share your code with us?

    ReplyDelete
  2. 1. Yes, I modeled a Markov chain
    2. Yes, indeed, the effective mutation rate is µ*26/27. I'm aware of this inconsistency: when I talked about Dembski's algorithm called Random Mutation, I used the effective rate, while elsewhere, I allowed for a change being not a change at all :-)
    3. Though my code is not elegant, I'll share it, of course. I'm just looking for a place to do so.

    ReplyDelete
  3. DiEb,

    Indeed, who wants to do anything to beat the dead Weasel but hack out the code. I was too lazy to do anything like generate plots. Yours are fantastic. You really do have a gift for visualization. (I've looked at your Wikipedia versus Conservapedia graphics.)

    If you send the R code to the email address in my profile, I'll gladly upload it to my website and provide you with links. They'll be good for at least two years.

    ReplyDelete
  4. By the way, the last two analyses in the IEEE SMC paper of Dembski and Marks are the (1,2)-EA and (1+1)-EA [evolutionary algorithm] solving the ONEMAX problem. I know for a fact that Marks knew the established names for the algorithms, and that he knew that are many analyses of them in the literature. Yet the article cites nothing. With five minutes of googling, I found a paper from 8 years ago that gives an exact analysis that allows for the target to drift due to mutation: Stefan Droste, "On the Analysis of the (1+1) EA for a Dynamically Changing ONEMAX." Obviously a fixed target is a special case.

    ReplyDelete