University of Wisconsin - Madison | CS 540 Lecture Notes | C. R. Dyer |
Many human inventions were inspired by nature. Artificial neural networks are one example. Another example is Genetic Algorithms (GAs). GAs search by simulating evolution, starting from an initial set of solutions or hypotheses, and generating successive "generations" of solutions. This particular branch of AI was inspired by the way living things evolved into more successful organisms in nature. The main idea is survival of the fittest, a.k.a. natural selection.
A chromosome is a long, complicated thread of DNA (deoxyribonucleic acid). Hereditary factors that determine particular traits of an individual are strung along the length of these chromosomes, like beads on a necklace. Each trait is coded by some combination of DNA (there are four bases: A (Adenine), C (Cytosine), T (Thymine), and G (Guanine)). Like an alphabet in a language, meaningful combinations of the bases produce specific instructions to the cell.
Changes occur during reproduction. The chromosomes from the parents exchange randomly by a process called crossover. Therefore, the offspring exhibit some traits of the father and some traits of the mother.
A rarer process called mutation also changes some traits. Sometimes an error may occur during the copying of chromosomes (mitosis). The parent cell may have -A-C-G-C-T-, but an accident may occur that changes the new cell to -A-C-T-C-T-. Much like a typist copying a book, sometimes a few mistakes are made. Usually this results in a nonsensical word and the cell does not survive. But over millions of years, occasionally the accidental mistake produces a more beautiful phrase for the book, thus producing a better species.
In nature, the individual that has better survival traits will survive for a longer period of time. This in turn provides it a better chance to produce offspring with its genetic material. Therefore, after a long period of time, the entire population will consist of lots of genes from the superior individuals and less from the inferior individuals. In a sense, the fittest survived and the unfit died out. This force of nature is called natural selection.
The existence of competition among individuals of a species was certainly recognized before Darwin. The mistake made by the older theorists (like Lamarck) was to believe that the environment directly alters an individual's heritable traits; that is, the environment forces an individual to adapt, and those acquired adaptations are passed on. The molecular explanation of evolution shows that this is biologically impossible. The species does not adapt to the environment; rather, only the fittest survive.
To simulate the process of natural selection in a computer, we need to define the following: a representation (encoding) of candidate solutions, a fitness function for evaluating them, and a selection method, together with genetic operators (crossover and mutation) for producing the next generation.
For example, say we want to find the optimal quantity of the three major ingredients in a recipe (say, sugar, wine, and sesame oil). We can use the alphabet {1, 2, 3 ..., 9} denoting the number of ounces of each ingredient. Some possible solutions are 1-1-1, 2-1-4, and 3-3-1.
As another example, the traveling salesperson problem is the problem of finding the optimal path to traverse, say, 10 cities. The salesperson may start in any city. A solution is a permutation of the 10 cities: 1-4-2-3-6-7-9-8-5-10.
As another example, say we want to represent a rule-based system. Given a rule such as "If color=red and size=small and shape=round then object=apple" we can describe it as a bit string by first assuming each of the attributes can take on a fixed set of possible values. Say color={red, green, blue}, size={small, big}, shape={square, round}, and fruit={orange, apple, banana, pear}. Then we could represent the value for each attribute as a substring of length equal to the number of possible values of that attribute. For example, color=red could be represented by 100, color=green by 010, and color=blue by 001. Note also that we can represent color=red or blue by 101, and any color (i.e., a "don't care") by 111. Doing this for each attribute, the above rule might then look like: 100 10 01 0100. A set of rules is then represented by concatenating together each rule's 11-bit string.
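The rule encoding above can be sketched in code. The attribute names and value orderings below are taken from the example; the function name is a hypothetical helper for illustration.

```python
# Bit-string encoding of rules: one bit per possible value of each attribute.
# Attribute value orderings follow the example in the text.
ATTRIBUTES = [
    ("color", ["red", "green", "blue"]),
    ("size",  ["small", "big"]),
    ("shape", ["square", "round"]),
    ("fruit", ["orange", "apple", "banana", "pear"]),
]

def encode_rule(conditions):
    """Encode a rule as an 11-bit string.

    conditions maps attribute name -> set of allowed values; an absent
    attribute is a "don't care", so all of its bits are set to 1."""
    bits = []
    for name, values in ATTRIBUTES:
        allowed = conditions.get(name, set(values))
        bits.append("".join("1" if v in allowed else "0" for v in values))
    return "".join(bits)

# "If color=red and size=small and shape=round then fruit=apple"
rule = encode_rule({"color": {"red"}, "size": {"small"},
                    "shape": {"round"}, "fruit": {"apple"}})
print(rule)  # 10010010100
```

Note that a rule with no condition on some attribute simply has all 1s in that attribute's substring, matching the "don't care" convention described above.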
For another example see page 620 in the textbook for a bit-string representation of a logical conjunction.
For example, one can give a subjective judgement from 1 to 5 for the dish prepared with the recipe 2-1-4.
Similarly, the length of the route in the traveling salesperson problem is a good measure, because the shorter the route, the better the solution.
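As a sketch of this fitness measure, the following computes the length of a closed tour; the city coordinates are made-up values for illustration, and the route is the permutation from the earlier example (0-indexed).

```python
import math

# Route length as a fitness measure for the traveling salesperson problem.
# City coordinates below are invented for illustration.
cities = [(0, 0), (1, 5), (2, 3), (5, 2), (6, 6),
          (8, 3), (7, 0), (4, 4), (3, 1), (9, 5)]

def tour_length(perm):
    """Total length of the closed tour that visits cities in the given order."""
    total = 0.0
    for i in range(len(perm)):
        x1, y1 = cities[perm[i]]
        x2, y2 = cities[perm[(i + 1) % len(perm)]]  # wrap back to the start
        total += math.hypot(x2 - x1, y2 - y1)
    return total

route = [0, 3, 1, 2, 5, 6, 8, 7, 4, 9]  # the permutation 1-4-2-3-6-7-9-8-5-10
print(tour_length(route))
```

Since shorter routes are better, one could define Fitness(i) = 1 / tour_length(i) so that fitter individuals get larger values.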
For classification problems, the fitness function could be the percent correct classification on a given training set. For example, Fitness(i) = (correct(i))^2.
A simple selection method gives each individual i the probability Fitness(i) / sum_over_all_individuals_j Fitness(j), where Fitness(i) is the fitness function value for individual i. This method is sometimes called fitness proportionate selection. Other selection methods have also been used, e.g., rank selection, which sorts all the individuals by fitness; the probability that an individual will be selected is then proportional to its rank in this sorted list.
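Fitness proportionate selection is often implemented as a "roulette wheel": spin a random point on a wheel whose slices are sized by fitness. The following is a minimal sketch; the population and fitness values are made up for illustration.

```python
import random

def select(population, fitness_values):
    """Pick one individual with probability Fitness(i) / sum_j Fitness(j)."""
    total = sum(fitness_values)
    r = random.uniform(0, total)        # random point on the wheel
    running = 0.0
    for individual, f in zip(population, fitness_values):
        running += f
        if running >= r:
            return individual
    return population[-1]               # guard against floating-point round-off

pop = ["0010", "1011", "1110", "0001"]
fit = [1.0, 3.0, 5.0, 1.0]
# "1110" holds half the total fitness (5 / 10), so it is selected
# about half the time on average.
picks = [select(pop, fit) for _ in range(10000)]
print(picks.count("1110") / len(picks))
```

Python's standard library offers the same behavior via random.choices(pop, weights=fit); the explicit loop above just makes the wheel visible.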
One potential problem that can be associated with the selection method is called crowding. Crowding occurs when the individuals that are most fit quickly reproduce so that a large percentage of the entire population looks very similar. This reduces diversity in the population and may hinder the long-run progress of the algorithm.
proc GA(Fitness, theta, n, r, m)
  ; Fitness is the fitness function for ranking individuals
  ; theta is the fitness threshold, which is used to determine when to halt
  ; n is the population size in each generation (e.g., 100)
  ; r is the fraction of the population generated by crossover (e.g., 0.6)
  ; m is the mutation rate (e.g., 0.001)

  P := generate n individuals at random    ; initial generation is generated randomly

  while max_i Fitness(h_i) < theta do
    ; define the next generation S (also of size n)
    Reproduction step: Probabilistically select (1-r)n individuals of P
        and add them to S intact, where the probability of selecting
        individual h_i is Prob(h_i) = Fitness(h_i) / SUM_j Fitness(h_j)
    Crossover step: Probabilistically select rn/2 pairs of individuals
        from P according to Prob(h_i); for each pair (h1, h2), produce
        two offspring by applying the crossover operator and add these
        offspring to S
    Mutate step: Choose m% of S and randomly invert one bit in each
    P := S
  end_while

  Find b such that Fitness(b) = max_i Fitness(h_i)
  return(b)
end_proc
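The pseudocode above can be sketched as a runnable program on the toy "max-ones" problem (fitness = number of 1 bits in a bit string). The parameter defaults follow the examples in the pseudocode, except the mutation rate, which is raised to 0.05 here so the toy run converges quickly.

```python
import random

def ga(fitness, theta, n=100, r=0.6, m=0.05, length=20):
    """Simple GA over bit strings, following the pseudocode above."""
    # Initial generation is generated randomly.
    pop = ["".join(random.choice("01") for _ in range(length))
           for _ in range(n)]
    while max(fitness(h) for h in pop) < theta:
        weights = [fitness(h) for h in pop]
        # Reproduction: copy (1-r)n individuals intact, chosen with
        # probability proportional to fitness.
        nxt = random.choices(pop, weights=weights, k=int((1 - r) * n))
        # Crossover: select rn/2 pairs and produce two offspring per pair.
        for _ in range(int(r * n / 2)):
            h1, h2 = random.choices(pop, weights=weights, k=2)
            point = random.randrange(1, length)   # single-point crossover
            nxt += [h1[:point] + h2[point:], h2[:point] + h1[point:]]
        # Mutation: invert one random bit in a fraction m of the new population.
        for i in random.sample(range(len(nxt)), int(m * len(nxt))):
            j = random.randrange(length)
            flipped = "1" if nxt[i][j] == "0" else "0"
            nxt[i] = nxt[i][:j] + flipped + nxt[i][j + 1:]
        pop = nxt
    return max(pop, key=fitness)

best = ga(fitness=lambda h: h.count("1"), theta=18)
print(best)
```

With theta = 18 the loop halts as soon as some individual has at least eighteen 1 bits; since max-ones has no local optima, selection plus crossover and mutation reliably reach the threshold.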
Genetic Algorithms are easy to apply to a wide range of problems, from optimization problems like the traveling salesperson problem, to inductive concept learning, scheduling, and layout problems. The results can be very good on some problems, and rather poor on others.
If only mutation is used, the algorithm is very slow. Crossover makes the algorithm significantly faster.
GA is a kind of hill-climbing search; more specifically, it is very similar to a randomized beam search. As with all hill-climbing algorithms, there is a problem of local maxima. Local maxima in a genetic problem are those individuals that get stuck with a pretty good, but not optimal, fitness measure; any small mutation gives worse fitness. Fortunately, crossover can help them get out of a local maximum. Also, mutation is a random process, so it is possible that a sudden large mutation gets these individuals out of this situation. (In fact, these individuals never get out; it's their offspring that get out of local maxima.) One significant difference between GAs and hill-climbing is that in GAs it is generally a good idea to fill the local maxima up with individuals. Overall, GAs have fewer problems with local maxima than back-propagation neural networks.
Copyright © 1996-2003 by Charles R. Dyer. All rights reserved.