Genetic Algorithms (GA) are a class of optimization or search algorithms that imitate the biological process of evolution. They are a particular class of Evolutionary Algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, natural selection, and recombination (or crossover). GA use the three major principles of evolution:
- Replication + Inheritance
- Variation + Recombination
- Selection + Survival of the Fittest
A GA is based on a population of individuals. Each individual has a random number of descendants, i.e. gets copied a random number of times, and the descendants inherit properties of the parents to a certain degree. The number of descendants is proportional to its fitness. As in biological evolution, the survival of the fittest changes and shapes the population, i.e. only individals which are well adapted to the conditions of the environment survive. Therefore in the course of time the population is converging to an optimal or sub-optimal solution, which is determined by the fitness function(s).
Members of the population are set of genes or chromosomes that are represented by some form of strings (bits, numbers or characters). In the most basic form the individuals, i.e. strings, encode the solution directly and define an solution for an optimization problem. In a more advanced form the individuals are some form of "genetic" blueprint for the construction of a phenotype, which has a certain measurable fitness against some criterion. Here the phenotype determines the fitness and the amount of offspring or descendants.
The first population is initialized at random and evolves from generation to generation. In each generation the population is affected by genetic operators and selection mechanism. Genetic operators such as crossover and mutation provide information flow among chromosomes while selection promotes survival of the fittest individuals.
They were first used and invented by John H. Holland and his group around 1970. Holland proved what is called the schema theorem, showing that, "if certain patterns of strings are fitter than others, then, in the long run, the GA can't help but find them, and the average number of strings in the population matching those patterns will increase exponentially" (see ).
Initialization : Choose random initial population with high variety Repeat 1. Evaluate the individual Fitness 2. Select the best Individuals 3. Reproduce them (including Recombination and Mutation) Until (Solution Found) or (Terminating Condition)
Terminating conditions often include:
- Fixed number of generations reached
- Budgeting: allocated computation time/money used up
- An individual is found that satisfies minimum criteria
- The highest ranking individual's fitness is reaching or has reached a plateau such that successive iterations are not producing better results anymore.
- Manual inspection. May require start-and-stop ability
- Combinations of the above
As Kevin Kelly says in his book "Out of Control", the price of evolution and evolving systems is to give up control. It is hard to understand and predict incomprehensible evolved systems, they are usually neither simple nor elegant, and therefore it is nearly impossible to correct, change or fix them manually. He argues "The things we are proud of in engineering - precision, predictability, exactness, and correctness - are diluted when evolution is introduced. These have to be diluted because survivability in a world of accidents, unforeseen circumstances, shifting environments - in short, the real world - demands a fuzzier, looser, more adaptable, less precise stance" (p.310).
Thus a few of the problems are
- Even if the GA finds an optimal solution, it is often hard to understand why the result is optimal (and none of the other solutions) and how it works exactly
- The GA may converge towards local optima rather than finding the global optimum of the problem
- The problem to be solved must be represented by a list of parameters or a string of characters, and the optimal solution must be encoded by a certain fitness function. To find an appropriate population and a suitable fitness function is a first step towards a solution.
Differences to real Evolution
Contrary to real evolution, there is no natural selection in genetic algorithms, only "artificial selection". What is selected is chosen by the programmer. In true evolutionary systems and "real life" the system itself defines the criteria for what it selects, as Kevin Kelly argues in his book "Out of Control" (p.341). The selection criteria "survival of the fittest" arises naturally in biological evolution. Another difference between artificial and natural selection is a litte amount of arbitrariness and randomness: in nature, even the fittest individual could be killed by accident, disaster, misfortune or whatever, which is not the case in artificial systems.
In biological evolution, there is also a clear distinction between genotype and phenotype. Many implementations of genetic algorithms use no digital counterpart of "genes", and do not distinguish between genotype and phenotype.
- John H. Holland, Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence, University of Michigan Press, 1975, ISBN 0262581116
- Melanie Mitchell, An Introduction to Genetic Algorithms, The MIT Press, 1996, ISBN 0262631857
- Kevin Kelly, Out of Control: The New Biology of Machines, Social Systems and the Economic World, Addison-Wesley, 1994, ISBN 0201483408