University of Wisconsin - Madison, CS 540 Lecture Notes, C. R. Dyer

Genetic Algorithms (Chapter 4.1.4)


Evolution

Many human inventions were inspired by nature. Artificial neural networks are one example; Genetic Algorithms (GAs) are another. GAs search by simulating evolution: starting from an initial set of solutions or hypotheses, they generate successive "generations" of solutions. This branch of AI was inspired by the way living things evolved into more successful organisms in nature. The main idea is survival of the fittest, a.k.a. natural selection.

A chromosome is a long, complicated thread of DNA (deoxyribonucleic acid). Hereditary factors that determine particular traits of an individual are strung along the length of these chromosomes, like beads on a necklace. Each trait is coded by some combination of the four DNA bases: A (Adenine), C (Cytosine), T (Thymine), and G (Guanine). Like letters of an alphabet, meaningful combinations of the bases produce specific instructions to the cell.

Changes occur during reproduction. The chromosomes from the two parents exchange segments at random positions by a process called crossover. As a result, the offspring exhibit some traits of the father and some traits of the mother.
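Viewing each "chromosome" as a bit string, the exchange can be sketched as single-point crossover. This is a minimal illustrative sketch, not code from the notes; the cut point is chosen at random inside the string.

```python
import random

def crossover(parent1, parent2):
    """Single-point crossover: swap the tails of two equal-length strings."""
    point = random.randint(1, len(parent1) - 1)  # cut strictly inside the string
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

random.seed(0)
c1, c2 = crossover("11111111", "00000000")
# each child carries a prefix from one parent and a suffix from the other
```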

A rarer process called mutation also changes some traits. Sometimes an error occurs during the copying of chromosomes (mitosis). The parent cell may have -A-C-G-C-T-, but an accident may change the new cell to -A-C-T-C-T-. Much like a typist copying a book, sometimes a few mistakes are made. Usually this results in a nonsensical word and the cell does not survive. But over millions of years, occasionally the accidental mistake produces a more beautiful phrase for the book, thus producing a better species.

Natural Selection

In nature, an individual with better survival traits survives for a longer period of time. This in turn gives it a better chance to produce offspring carrying its genetic material. Therefore, after a long period of time, the population comes to consist of many genes from the superior individuals and fewer from the inferior ones. In a sense, the fittest survived and the unfit died out. This force of nature is called natural selection.

The existence of competition among individuals of a species was certainly recognized before Darwin. The mistake made by the older theorists (like Lamarck) was believing that the environment acts directly on an individual, forcing it to adapt and pass that adaptation on. The molecular explanation of heredity shows that this is biologically implausible: acquired traits are not written back into the genes. The species does not adapt to the environment; rather, only the fittest survive and reproduce.

Simulated Evolution

To simulate the process of natural selection in a computer, we need to define the following:

  - a representation (encoding) of an individual, e.g., a bit string
  - a fitness function for ranking individuals
  - a method for selecting individuals for reproduction
  - genetic operators (crossover and mutation) for producing new individuals
  - a halting criterion, e.g., a fitness threshold

With the above defined, one way to define a Genetic Algorithm is as follows:

proc GA(Fitness, theta, n, r, m)
    ; Fitness is the fitness function for ranking individuals
    ; theta is the fitness threshold, which is used to determine
    ;   when to halt
    ; n is the population size in each generation (e.g., 100)
    ; r is the fraction of the population generated by crossover (e.g., 0.6)
    ; m is the mutation rate (e.g., 0.001)

    P := generate n individuals at random
    ; initial generation is generated randomly

    while max_i Fitness(h_i) < theta do
      ; define the next generation S (also of size n)

      Reproduction step: Probabilistically select
      (1-r)n individuals of P and add them to S intact, where
      the probability of selecting individual h_i is
      Prob(h_i) = Fitness(h_i) / SUM_j Fitness(h_j)

      Crossover step: Probabilistically select rn/2 pairs
      of individuals from P according to Prob(h_i)

      foreach pair (h1, h2), produce two offspring by applying
      the crossover operator and add these offspring to S

      Mutation step: Choose m percent of the individuals in S
      uniformly at random and invert one randomly chosen bit in each

      P := S

    end_while

    Find b such that Fitness(b) = max_i Fitness(h_i)
    return(b)

end_proc
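The pseudocode above can be sketched concretely in Python. This is a minimal illustrative implementation, not code from the notes: individuals are bit strings, selection is fitness-proportionate, crossover is single-point, and the OneMax fitness (count the 1-bits) and all parameter values in the demonstration are assumptions chosen so the run converges quickly.

```python
import random

def ga(fitness, theta, n, r, m, length):
    """Sketch of the GA pseudocode: bit-string individuals,
    fitness-proportionate selection, single-point crossover, point mutation."""
    # initial generation is generated randomly
    pop = ["".join(random.choice("01") for _ in range(length)) for _ in range(n)]
    while max(fitness(h) for h in pop) < theta:
        total = sum(fitness(h) for h in pop)
        weights = [fitness(h) / total for h in pop]  # Prob(h_i)
        # Reproduction step: carry (1-r)n individuals over intact
        nxt = random.choices(pop, weights=weights, k=round((1 - r) * n))
        # Crossover step: rn/2 pairs, each producing two offspring
        for _ in range(round(r * n / 2)):
            p1, p2 = random.choices(pop, weights=weights, k=2)
            cut = random.randint(1, length - 1)
            nxt += [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
        # Mutation step: in a fraction m of S, invert one randomly chosen bit
        for i in range(len(nxt)):
            if random.random() < m:
                j = random.randrange(length)
                nxt[i] = nxt[i][:j] + ("1" if nxt[i][j] == "0" else "0") + nxt[i][j + 1:]
        pop = nxt
    # Find b with maximal fitness and return it
    return max(pop, key=fitness)

random.seed(1)
best = ga(fitness=lambda h: h.count("1"), theta=8, n=20, r=0.6, m=0.05, length=8)
```

With theta equal to the maximum possible fitness, the loop only exits once an all-ones individual appears, so `best` is "11111111".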

Conclusion

Genetic Algorithms are easy to apply to a wide range of problems, from optimization problems like the traveling salesperson problem, to inductive concept learning, scheduling, and layout problems. The results can be very good on some problems, and rather poor on others.

If only mutation is used, the algorithm is very slow. Crossover makes the algorithm significantly faster.

A GA is a kind of hill-climbing search; more specifically, it is very similar to a randomized beam search. As with all hill-climbing algorithms, there is the problem of local maxima. In a GA, a local maximum corresponds to an individual stuck with a pretty good, but not optimal, fitness: any small mutation makes its fitness worse. Fortunately, crossover can combine such individuals with others to escape a local maximum, and because mutation is a random process, an occasional large mutation can do the same. (Strictly speaking, the stuck individuals themselves never get out; it is their offspring that escape the local maximum.) One significant difference between GAs and ordinary hill-climbing is that in a GA it is generally a good idea to let many individuals accumulate at the local maxima. Overall, GAs have fewer problems with local maxima than back-propagation neural networks do.


Copyright © 1996-2003 by Charles R. Dyer. All rights reserved.