Genetic Algorithm Fundamentals Basic Concepts Notes
Genetic Algorithms are a family of computational models inspired by evolution. These algorithms encode a potential solution to a specific problem on a simple chromosome-like data structure and apply recombination operators to these structures as as to preserve critical information. Genetic algorithms are often viewed as function optimizer, although the range of problems to which genetic algorithms have been applied are quite broad.
An implementation of genetic algorithm begins with a population of (typically random) chromosomes. One then evaluates these structures and allocated reproductive opportunities in such a way that these chromosomes which represent a better solution to the target problem are given more chances to `reproduce’ than those chromosomes which are poorer solutions. The ‘goodness’ of a solution is typically defined with respect to the current population.
Many human inventions were inspired by nature. Artificial neural networks is one example. Another example is Genetic Algorithms (GA). GAs search by simulating evolution, starting from an initial set of solutions or hypotheses, and generating successive “generations” of solutions. This particular branch of AI was inspired by the way living things evolved into more successful organisms in nature. The main idea is survival of the fittest, a.k.a. natural selection.
A chromosome is a long, complicated thread of DNA (deoxyribonucleic acid). Hereditary factors that determine particular traits of an individual are strung along the length of these chromosomes, like beads on a necklace. Each trait is coded by some combination of DNA (there are four bases, A (Adenine), C (Cytosine), T (Thymine) and G (Guanine). Like an alphabet in a language, meaningful combinations of the bases produce specific instructions to the cell.
Changes occur during reproduction. The chromosomes from the parents exchange randomly by a process called crossover. Therefore, the offspring exhibit some traits of the father and some traits of the mother. A rarer process called mutation also changes some traits. Sometimes an error may occur during copying of chromosomes (mitosis). The parent cell may have -A-C-G-C-T- but an accident may occur and changes the new cell to -A-C-T-C-T-. Much like a typist copying a book, sometimes a few mistakes are made. Usually this results in a nonsensical word and the cell does not survive. But over millions of years, sometimes the accidental mistake produces a more beautiful phrase for the book, thus producing a better species.
In nature, the individual that has better survival traits will survive for a longer period of time. This in turn provides it a better chance to produce offspring with its genetic material. Therefore, after a long period of time, the entire population will consist of lots of genes from the superior individuals and less from the inferior individuals. In a sense, the fittest survived and the unfit died out. This force of nature is called natural selection.
The existence of competition among individuals of a species was recognized certainly before Darwin. The mistake made by the older theorists (like Lamarck) was that the environment had an effect on an individual. That is, the environment will force an individual to adapt to it. The molecular explanation of evolution proves that this is biologically impossible. The species does not adapt to the environment, rather, only the fittest survive.
To simulate the process of natural selection in a computer, we need to define the following: A representation of an individual At each point during the search process we maintain a “generation” of “individuals.” Each individual is a data structure representing the “genetic structure” of a possible solution or hypothesis. Like a chromosome, the genetic structure of an individual is described using a fixed, finite alphabet. In GAs, the alphabet 0, 1 is usually used. This string is interpreted as a solution to the problem we are trying to solve.
For example, say we want to find the optimal quantity of the three major ingredients in a recipe (say, sugar, wine, and sesame oil). We can use the alphabet 1, 2, 3 …, 9 denoting the number of ounces of each ingredient. Some possible solutions are 1-1-1, 2-1-4, and 3-3-1
As another example, the traveling salesperson problem is the problem of finding the optimal path to traverse, say, 10 cities. The salesperson may start in any city. A solution is a permutation of the 10 cities: 1-4-2-3-6-7-9-8-5-10.
As another example, say we want to represent a rule-based system. Given a rule such as “If color=red and size=small and shape=round then object=apple” we can describe it as a bit string by first assuming each of the attributes can take on a fixed set of possible values. Say color=red, green, blue, size=small, big, shape=square, round, and fruit=orange, apple, banana, pear. Then we could represent the value for each attribute as a sub-string of length equal to the number of possible values of that attribute. For example, color=red could be represented by 100, color=green by 010, and color=blue by 001. Note also that we can represent color=red or blue by 101, and any color (i.e., a “don’t care”) by 111. Doing this for each attribute, the above rule might then look like: 100 10 01 0100. A set of rules is then represented by concatenating together each rule’s 11-bit string. For another example see page 620 in the textbook for a bit-string representation of a logical conjunction.
Genetic algorithm vocabulary
Explanation of Genetic Algorithm terms:
Genetic Algorithms Explanation
Chromosome(string, individual) Solution (coding)
Genes (bits) Part of solution
Locus Position of gene
Alleles Values of gene
Phenotype Decoded solution
Genotype Encoded solution
The Canonical Genetic Algorithm
Genetic Algorithms are search algorithms that are based on concepts of natural selection and natural genetics.Genetic algorithm was developed to simulate some of the processes observed in natural evolution, a process that operates on chromosomes (organic devices for encoding the structure of living being). The genetic algorithm differs from other search methods in that it searches among a population of points, and works with a coding of parameter set, rather than the parameter values themselves. It also uses objective function information without any gradient information. The transition scheme of the genetic algorithm is probabilistic, whereas traditional methods use gradient information.Because of these features of genetic algorithm, they are used as general purpose optimization algorithm. They also provide means to search irregular space and hence are applied to a variety of function optimization, parameter estimation and machine learning applications.
The working principle of a canonical GA is illustrated in Fig. 1. The major steps involved are the generation of a population of solutions, finding the objective function and fitness function and the application of genetic operators. These aspects are described briefly below. They are described in detail in the following subsection.
n important characteristic of genetic algorithm is the coding of variables that describes the problem. The most common coding method is to transform the variables to a binary string or vector; GAs perform best when solution vectors are binary.If the problem has more than one variable, a multi-variable coding is constructed by concatenating as many single variables coding as the number of variables in the problem. Genetic Algorithm processes a number of solutions simultaneously. Hence, in the first step a population having P individuals is generated by pseudo random generators whose individuals represent a feasible solution. This is a representation of solution vector in a solution space and is called initial solution. This ensures the search to be robust and unbiased, as it starts from wide range of points in the solution space.
In the next step, individual members of the population are evaluated to find the objective function value. In this step, the exterior penalty function method is utilized to transform a constrained optimization problem to an unconstrained one. This is exclusively problem specific. In the third step, the objective function is mapped into a fitness function that computes a fitness value for each member of the population. This is followed by the application of GA operators.
To illustrate the working principles of GAs, an unconstrained optimization problem is considered. Let us consider following maximization problem,
where, and are the lower and upper bound the variable can take. Although a maximization problem is considered here, a maximization problem can also be handled using GAs. The working of GAs is completed by performing the following tasks.
In order to use GAs to solve the above problem (equation 1), variables ‘s are first coded in some string structures. It is important to mention here that the coding of the variables is not absolutely necessary. There exist some studies where GAs are directly used on the variables themselves, but here these exceptions are ignored and the working principles of a simple genetic algorithm is discussed.