1 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Heuristic Techniques: A Basic Introduction to Genetic Algorithms Lecture 11 March 8, 2004 Olivier de Weck Multidisciplinary System Design Optimization 2 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Heuristic Search Techniques Main Motivation for Heuristic Techniques: (1) To deal with local optima and not get trapped in them (2) To allow optimization for systems, where the design variables are not only continuous, but discrete, integer or even Boolean These techniques do not guarantee that global optimum can be found. Generally Karush-Kuhn-Tucker conditions do not apply. i x ?x i ={1,2,3,4,5}, x i ={‘A’,’B’,’C’} x i ={true, false} x J 3 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Principal Heuristic Algorithms Genetic Algorithms (Holland – 1975) – Inspired by genetics and natural selection Simulated Annealing (Kirkpatrick – 1983) – Inspired by molecular dynamics – energy minimization Particle Swarm Optimization (Eberhart and Kennedy - 1995) – Inspired by the social behavior of swarms of insects or flocks of birds These techniques all use a combination of randomness and heuristic “rules” to guide the search for global maxima or minima 4 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Today: Genetic Algorithms Genetics and Natural Selection A simple genetic algorithm (SGA) “The Genetic Algorithm Game” Encoding - Decoding (Representation) Fitness Function - Selection Crossover – Insertion - Termination Part 1 - Introduction 5 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Premise of GAs Natural Selection is a very successful organizing principle for optimizing individuals and populations of individuals If we can mimic natural selection, then we will be able to optimize more successfully A possible design of a system – as represented by its design vector x - can be considered as an individual who is fighting for survival within a larger population. Only the fittest survive – Fitness is assessed via objective function J. 6 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Matlab GA demo (“peaks”) Maximize “peaks” function Population size: 40 Generations: 20 Mutation Rate: 0.002 -Observe convergence -Notice “mutants” -Compare to gradient search 7 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Natural Selection Charles Darwin (1809-1882) Extremely controversial and influential book (1859) On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life Observations: Species are continually developing Homo sapiens sapiens comes from ape-like stock Variations between species are enormous Huge potential for production of offspring, but only a small percentage survives to adulthood Evolution = natural selection of inheritable variations 8 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Inheritance of Characteristics Gregor Mendel (1822-1884) Investigated the inheritance of characteristics (“traits”) Conducted extensive experiments with pea plants Examined hybrids from different strains of plant tall tall tall tall tall short short short short Character (gene) for tallness is dominant Character (gene) for shortness is recessive 9 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Genetics … the great unknown …. Walter Sutton (1877-1916) Discovered that genes are part of the chromosomes inside the nucleus of cells … still unexplained … Darwin predicted continuous variation within species. In nature we often observe discontinuous variations … strinkingly different, unexpected variants sometimes appear, how to explain ??? Hugo de Varis (1848-1935) developed a theory of mutation - at first seemingly at odds with Darwinism How has this dilemma been resolved? 11 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics GA Terminology chromosome gene population Generation n Generation n+1 selection crossover insertion mutation genetic operators individuals 12 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Chromosomes Chromosome (string) gene 0 1 0 1 1 1 1 0 1 0 0 1 ….. 0 1 alleles Each chromosome represents a solution, often using strings of 0’s and 1’s. Each bit typically corresponds to a gene. This is called binary encoding. The values for a given gene are the alleles. A chromosome in isolation is meaningless - need decoding of the chromosome into phenotypic values 13 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics GA over several generations Initialize Population (initialization) Select individual for mating (selection) Mate individuals and produce children (crossover) Mutate children (mutation) Insert children into population (insertion) Are stopping criteria satisfied ? Finish y n next gen e r ati on Ref: Goldberg 14 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics “The GA Game” Ca. 15 minutes 111 212 313 4520 584 695 7642 83 9327 10 3 30 1100 200 40 6.075 GA Game Initial Population 0 2 4 6 8 10 12345678910112 Fitness Value N u m b e r of I ndi v i du a l s Generation 1: Population size: N=40 Mean Fitness: F=6.075 (Fitness F = total number of 1’s in chromosome) 0 <= F <= 12 Goal: Maximize Number of “1”s 15 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Creating a GA on Computer (1) define the representation (encoding-decoding) (2) define “fitness” function F - incorporate feasibility (constraints) and objectives (3) define the genetic operators - initialization, selection, crossover, mutation, insertion (4) execute initial algorithm run - monitor average population fitness - identify best individual (5) tune algorithm - adjust selection, insertion strategy, mutation rate 16 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Encoding - Decoding phenotype genotype Biology Design “blue eye” UGCAACCGU (“DNA” blocks) 10010011110 expression (chromosome) decoding encoding Radius R=2.57 [m] H sequencing coded domain decision domain Genetic Code: (U,C,G,A are the four bases of the nucleotide building blocks of messenger-RNA): Uracil-Cytosin- Adenin-Guanin - A triplet leads to a particular aminoacid (for protein synthesis) e.g. UGG-tryptophane 17 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Decoding 0 1 0 1 1 1 1 0 1 0 0 1 ….. 0 1 Radius (genotype) Height Coding and decoding MATLAB functions available: decode.m, encode.m E.g. binary encoding of integers: 10100011 (1*2 7 +0*2 6 +1*2 5 +0*2 4 +0*2 3 +0*2 2 +1*2 1 +1*2 0 ) 128 + 0 + 32 + 0 + 0 + 0 + 2 + 1 = 163 Material x 1 x 2 x n 18 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Binary Encoding Issues Number of bits dedicated to a particular design variable is very important. Resolution depends on: - upper and lower bounds x LB , x UB - number of bits x LB x UB ?x=(x UB- x LB )/2 nbits [G]=encode(137.56,50,150,8) G = 1 1 0 1 1 1 1 1 [X]=decode(G,50,150,8); X = 137.4510 x∈So ?x= (150-50)/2 8 = 0.39 Example Loss in precision !!! Number of bits needed: ln ln 2 UB LB xx x nbits a ? o §· ¨? ?? ? ?1 = ?? 19 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Other Encoding Schemes Not all GA chromosomes are binary strings Can use a different ALPHABET for GA coding Most common is binary alphabet {0,1} can also have - ternary: {0,1,2} {A,B,C} - quaternary: {0,1,2,3} {T,G,C,A} => biology - integer: {1,2,….13,….} -real valued: {3.456 7.889 9.112} -Hexadecimal {1,2,..,A,B,C,D,E,F} Used for Traveling Salesman Problem The set of symbols is the “alphabet” 20 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Traveling Salesman Problem Two representations of the TSP city 1 2 3 4 5 6 The arcs The ordering 1 2 3 4 5 6 6 3 1 2 4 5 1 6 5 4 2 3 Same problem, but two different chromosome representations 21 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics 13 5 A representation for the fire station location problem 1 2 3 4 6 7 8 10 9 11 12 14 1 1 0 1 0 1 0 0 0 0 1 0 0 1 0 “1” represents a fire station 22 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Fitness and Selection Probability Typically, selection is the most important and most computationally expensive step of a GA. 01001110101 decode 1 1.227 Al-7075 n x x aoa o ??? ? = ??? ? ??? ? ??? ? ## Evaluate objective function () Jf= x Map raw objective to Fitness ()F f J= F drives probability of being selected ()P selected F∝ yes no simcode 23 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Fitness Function (I) Each chromosome has a “fitness” The objective function (value) is usually mapped into the fitness of each individual For the TSP the fitness is usually the cost of the tour (time, distance, price) The fitness for the fire station problem should incorporate feasibility – Example: fitness= K- #of fire stations - # of uncovered districts 24 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Fitness Function (II) Choosing the right fitness function is very important, but also quite difficult GAs do not have explicit “constraints” Constraints can be handled in different ways: – implicitly via the fitness function – penalty for violation – via the selection operator (“reject constraint violators”) – implicitly via representation/coding e.g. only allow representations of the TSP that correspond to a valid tour Choosing the right fitness function: an important Genetic Algorithm Design Issue 25 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Maximization vs. Minimization There are many ways to convert a minimization problem to a maximization problem and vice-versa: N-obj 1/obj -obj 26 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Selection Operator (I) Goal is to select parents for crossover Should create a bias towards more fitness Must preserve diversity in the population Example: Let select the k th most fit member of a population to be a parent with probability 1 1 k PD k ? §· = ¨? ?1 () 1 jP Dj ∈ = | (1) Selection according to RANKING Better ranking has a higher probability of being chosen, e.g. 1st ∝ 1, 2nd ∝ 1/2, 3rd ∝ 1/3 ... 27 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Selection Operator (II) (2) Proportional to FITNESS Value Scheme Example: Let select the k th most fit member of a population to be a parent with probability 1 () k P Fitness k F ? =? () jP F Fitness j ∈ = | Probability of being selected for crossover is directly proportional to raw fitness score. This scheme tends to favor the fittest individuals in a population more than the ranking-scheme, faster convergence, but can also be a disadvantage. 28 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Roulette Wheel Selection Roulette Wheel Selection 1 2 3 4 5 6 Probabilistically select individuals based on some measure of their performance. Sum Sum of individual’s selection probabilities 3rd individual in current population mapped to interval [0,Sum] Selection: generate random number in [0,Sum] Repeat process until desired # of individuals selected Basically: stochastic sampling with replacement (SSR) 29 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Tournament Selection 2 members of current population chosen randomly Dominant performer placed in intermediate population of survivors Population Filled ? Crossover and Mutation form new population Old Population Fitness 101010110111 8 100100001100 4 001000111110 6 Survivors Fitness 101010110111 8 001000111110 6 101010110111 8 n y 30 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Crossover 0 1 0 1 1 1 1 0 1 1 1 1 ….. 1 1 1 1 1 0 0 1 0 0 0 1 0 1 ….. 0 0 P1 P2 O1 O2 Question: How can we operate on parents P1 and P2 to create offspring O1 and O2 (same length, only 1’s and 0’s)? crossover ? ? 31 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Crossover in Biology a b c d Crossover produces either of these results for each chromosome ac ac OR ad OR bc OR bd Child P1 P2 This is where the word crossover comes from 32 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Crossover Operator (I) Crossover (mating) is taking 2 solutions, and creating 1 or 2 more Classical: single point crossover 0 1 1 0 1 1 0 0 1 1 The parents 0 1 1 1 1 1 0 0 0 1 crossover point The children (“offspring”) P1 P2 O1 O2 33 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Crossover Operator (II) 0 1 1 0 1 1 0 0 1 1 0 1 1 1 1 1 0 0 0 1 P1 P2 C1 C2 A crossover bit “i” is chosen (deliberately or randomly), splitting the chromosomes in half. Child C1 is the 1st half of P1 and the 2nd half of P2 Child C2 is the 1st half of P2 and the 2nd half of P1 More on 1-point crossover …. i=3 i=3 [ ] 1, 1iil∈∩∈ ?l=length of chromosome l=5 34 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Crossover Operator (III) One can also do a 2-point crossover or a multi-point crossover The essential aspect is to create at least one child (solution/design) from two (or more) parent (solutions/designs) there are many solutions to do this do not necessarily have to do crossover, and do crossover with a probability P x after pairs are chosen Some crossover operations: - single point, versus multiple point crossover - path relinking - permutation operators (list operators), incl. Random keys approach 35 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Path Relinking Given Parents P1 and P2 Create a sequence of children – The first child is a neighbor of P1 – Each child is a neighbor of the previous child – The last child is a neighbor of P2 P1 P2 C1 C2 Cn ... 36 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Example: Path Relinking Parents 1 0 0 1 0 0 1 and 0 0 1 0 1 0 0 P1 P2 Children 1 0 0 1 0 0 0 1 0 0 1 10 0 1 0 0 010 0 1 0 1010 0 Create a path of children, then select the best one. Good approach, but solutions tend to be interpolations of initial population. 37 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics A na?ve crossover operation 1 2 3 4 5 Route A 1 2 3 4 5 5 3 1 2 4 Route B 1 2 3 4 5 2 4 5 3 1 Problem: Na?ve 1-point crossover does not produce a valid route. 1 2 3 4 5 5 3 1 3 1 doesn’t work !!! 38 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Clever TSP crossover rule Select a random subpath P from parent 1 Turn the subpath P into a complete tour by visiting the cities not in P in the order they appear in parent 2 Example: 3 7 8 6 1 9 2 4 5 8 2 6 9 4 5 3 1 7 Parent 1 Parent 2 8 6 1 9 2 4 5 3 7 Child subpath P 39 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Some Insertion Strategies Can replace an entire population at a time (go from generation k to k+1 with no survivors) - select N/2 pairs of parents - create N children, replace all parents - polygamy is generally allowed Can select two parents at a time - create one child - eliminate one member of population (weakest?) “Elitist” strategy - small number of fittest individuals survive unchanged “Hall-of-fame” - remember best past individuals, but don’t use them for progeny N = # of members in population if steady state 40 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Replacement schemes Replacement scheme specifies, how individuals from the parent generation k are chosen to be replaced by children from next generation k+1: replace all replace worst replace parent replace random replace most similar there are others …. Academic question: what happens in real life? 41 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Initialization Random initial population, one of many options Use random number generator to create initial population (caution with seeds !) Typically use uniform probability density functions (pdf’s) Typical goal: Select an initial population that has both quality and diversity Somehow we need to create an initial population of solutions to start the GA. How can this be done? Example: N ind - size of binary population L ind - Individual chromosome length Need to generate N ind x L ind random numbers from {0,1} round(rand(1,6)) >> 1 1 1 1 0 0 42 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics GA Convergence Typical Results generation global optimum (unknown) Converged too fast (mutation rate too small?) Average performance of individuals in a population is expected to increase, as good individuals are preserved and bred and less fit individuals die out. Average Fitness 43 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics GA Stopping Criteria Again as for other heuristics there are no clear, obvious termination criteria. Some options: X number of generations completed - typically O(100) mean deviation in performance of individuals in the population falls below a threshold σ J <x (genetic diversity has become small) Stagnation - no or marginal improvement from one generation to the next: (J n+1 -J n )<X A particular point in the search space is encountered 44 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics GAs versus traditional methods Differ from traditional search/optimization methods: GAs search a population of points in parallel, not only a single point Gas use probabilistic transition rules, not deterministic ones Gas work on an encoding of the parameter set rather than the parameter set itself Gas do not require derivative information or other auxiliary knowledge - only the objective function and corresponding fitness levels influence search 45 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics Next Lecture(s) Mutation Operator, Schema Theorem Speciality GA’s Particle Swarm Optimization (PSO) Tabu Search Selection of Optimization Algorithms Example Design Applications Heuristic Methods 2 nd part 46 ? Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox Engineering Systems Division and Dept. of Aeronautics and Astronautics References Holland J., “Adaptation in natural and artificial systems”, University of Michigan Press, 1975 Goldberg, D.E.,” Genetic Algorithms in Search, Optimization and Machine Learning”, Addison Wesley, 1989 A.S. Schulz, 15.057 Systems Optimization, Course Notes. Note a number of charts in this lecture are derived from these class notes. T. Baeck, Oxford, N.Y. (1996) Evolutionary Algorithms in Theory and Practice A. Zalzala, P.J. Fleming, “Genetic Algorithms in Engineering Systems” Control Engineering Series 55, The Institution of Electrical Engineers (IEE), 1997