


# Local Search Algorithms

📗 For some problems, every state is a solution, but some states are better than others, as measured by a cost function (sometimes called a score or reward): Wikipedia.
📗 The search strategy will go from state to state, but the path between states is not important.
📗 Local search assumes similar (nearby) states have similar costs, and searches through the state space by iteratively improving the cost to find a (locally) optimal state.
📗 The successor states of a state are called its neighbors (together, the move set).
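📗 To make the three ingredients (state, cost, neighbors) concrete, here is a minimal sketch. The bit-string state, the `cost` function, and the one-bit-flip move set are all hypothetical choices made for illustration; they are not taken from the notes.

```python
from typing import List, Tuple

# Hypothetical state encoding: a tuple of 0/1 bits.
State = Tuple[int, ...]


def cost(state: State) -> int:
    """Hypothetical cost: the number of adjacent positions holding equal bits."""
    return sum(1 for a, b in zip(state, state[1:]) if a == b)


def neighbors(state: State) -> List[State]:
    """Move set: every state reachable by flipping exactly one bit."""
    result = []
    for i in range(len(state)):
        flipped = state[:i] + (1 - state[i],) + state[i + 1:]
        result.append(flipped)
    return result


# Every state is a valid "solution"; the cost only says which ones are better.
start = (0, 0, 0)
for s in neighbors(start):
    print(s, cost(s))
```

Flipping one bit changes the cost by at most 2 here, which is the "nearby states have similar costs" assumption in miniature.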



# Hill Climbing or Valley Finding

📗 Hill climbing is the discrete version of gradient descent: Wikipedia.
➩ It starts at a random state.
➩ It moves to the best neighbor (successor) state.
➩ It stops when no neighbor is better than the current state (local minimum).
📗 Random restarts run hill climbing from multiple random initial states and keep the best local minimum found (similar to neural network training).
📗 If there are too many neighbors, first-choice hill climbing randomly generates neighbors until a better neighbor is found, then moves to it.
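📗 The three variants above can be sketched as follows. This is an illustrative sketch, not the course's reference code: the toy landscape `COSTS`, the helper names, and the `max_tries` cutoff are assumptions introduced here.

```python
import random

# Hypothetical toy landscape: the state is an index 0..9, the cost is COSTS[state].
COSTS = [5, 3, 4, 6, 2, 1, 7, 8, 0, 9]


def cost(state):
    return COSTS[state]


def neighbors(state):
    """Neighbors are the adjacent indices that stay inside the landscape."""
    return [s for s in (state - 1, state + 1) if 0 <= s < len(COSTS)]


def hill_climb(start, cost, neighbors):
    """Move to the best neighbor until no neighbor is strictly better."""
    current = start
    while True:
        best = min(neighbors(current), key=cost)
        if cost(best) >= cost(current):
            return current  # local minimum (ties also stop the search)
        current = best


def random_restart(n_restarts, random_state, cost, neighbors):
    """Run hill climbing from several starts and keep the best local minimum."""
    runs = [hill_climb(random_state(), cost, neighbors) for _ in range(n_restarts)]
    return min(runs, key=cost)


def first_choice_step(current, cost, random_neighbor, max_tries=100):
    """Sample random neighbors until one is better; give up after max_tries."""
    for _ in range(max_tries):
        candidate = random_neighbor(current)
        if cost(candidate) < cost(current):
            return candidate
    return current


print(hill_climb(0, cost, neighbors))  # gets stuck in a local minimum
print(random_restart(20, lambda: random.randrange(len(COSTS)), cost, neighbors))
```

On this landscape a single run from state 0 stops at a local minimum, which is why restarts help.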



# Simulated Annealing

📗 Simulated annealing uses a process similar to heating solids (heating and slow cooling to toughen and reduce brittleness): Wikipedia.
➩ Each time, a random neighbor is generated.
➩ If the neighbor has a lower cost, move to the neighbor.
➩ If the neighbor has a higher cost, move to the neighbor with a small probability: \(p = e^{- \dfrac{\left| f\left(s'\right) - f\left(s\right) \right|}{T\left(t\right)}}\), where \(f\) is the cost, \(s\) is the current state, \(s'\) is the neighbor, and \(T\left(t\right)\) is the temperature, which decreases in \(t\).
➩ Stop when bored (for example, after a fixed number of iterations).
📗 Simulated annealing is a version of the Metropolis-Hastings algorithm: Wikipedia.
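📗 The steps above can be sketched as a short loop. The geometric cooling rate, the fixed step budget standing in for "bored", the temperature floor (to avoid dividing by zero), and keeping track of the best state seen are all assumptions added for this sketch.

```python
import math
import random


def simulated_annealing(start, cost, random_neighbor,
                        t0=10.0, decay=0.9, steps=1000, rng=None):
    """Sketch of simulated annealing; returns the best state visited."""
    rng = rng or random.Random()
    current = best = start
    temperature = t0
    for _ in range(steps):  # assumed stopping rule: fixed budget
        candidate = random_neighbor(current, rng)
        delta = cost(candidate) - cost(current)
        if delta <= 0:
            current = candidate  # not worse: always move
        elif rng.random() < math.exp(-abs(delta) / temperature):
            current = candidate  # worse: move with probability p
        if cost(current) < cost(best):
            best = current
        # Assumed schedule: geometric cooling with a small floor.
        temperature = max(decay * temperature, 1e-12)
    return best


# Hypothetical toy problem: integer states, cost (x - 3)^2, neighbors x - 1 and x + 1.
def toy_cost(x):
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice([-1, 1])


print(simulated_annealing(20, toy_cost, toy_neighbor, rng=random.Random(0)))
```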
Example
📗 The traveling salesman problem is often solved by simulated annealing.



# Temperature

📗 The temperature function should be decreasing over time. It can decrease arithmetically or geometrically.
➩ Arithmetic sequence: for example, \(T\left(t + 1\right) = \displaystyle\max\left\{T\left(t\right) - 1, 1\right\}\).
➩ Geometric sequence: for example, \(T\left(t + 1\right) = 0.9 T\left(t\right)\).
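📗 When the temperature is high: the acceptance probability is close to 1, so almost any neighbor is accepted (close to a random walk).
📗 When the temperature is low: the acceptance probability for worse neighbors is close to 0, so the algorithm behaves like first-choice hill climbing.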
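📗 A small numeric check of the two schedules and of the high and low temperature behavior. The function names are hypothetical; the formulas are the ones given above.

```python
import math


def arithmetic_cooling(temperature):
    """T(t + 1) = max(T(t) - 1, 1)."""
    return max(temperature - 1.0, 1.0)


def geometric_cooling(temperature, rate=0.9):
    """T(t + 1) = 0.9 T(t), with the rate exposed as a parameter."""
    return rate * temperature


def acceptance_probability(delta, temperature):
    """p = exp(-|f(s') - f(s)| / T) for a neighbor that is worse by delta."""
    return math.exp(-abs(delta) / temperature)


# A neighbor that is worse by 1:
print(acceptance_probability(1.0, 100.0))  # high temperature: close to 1
print(acceptance_probability(1.0, 0.01))   # low temperature: close to 0
```

Note that the arithmetic example never drops below 1, so it never becomes fully greedy, while the geometric example keeps shrinking toward 0.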



# Genetic Algorithm

📗 A genetic algorithm starts with a fixed-size population of initial states, and the successors are found through cross-over and mutation: Wikipedia.
📗 Each state in the population of \(N\) states has a probability of reproduction proportional to its fitness (so a state with a lower cost gets a higher probability): \(p_{i} = \dfrac{F\left(s_{i}\right)}{F\left(s_{1}\right) + F\left(s_{2}\right) + ... + F\left(s_{N}\right)}\).
📗 If the states are encoded by strings, cross-over means swapping the substrings after a chosen cross-over point: for example, abcde and ABCDE cross-over at position 2 results in abCDE and ABcde.
📗 If the states are encoded by strings, mutation means randomly changing characters, each with a small probability called the mutation rate: for example, abcde can be updated to abCde or aBcDe or ..., each with a small probability.
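📗 The selection, cross-over, and mutation operators for string states can be sketched as below. The function names, the `alphabet` parameter, and the choice to redraw a mutated character uniformly from the alphabet are assumptions made for illustration.

```python
import random


def reproduction_probabilities(fitnesses):
    """p_i = F(s_i) / (F(s_1) + ... + F(s_N))."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def select_parents(population, fitnesses, rng):
    """Draw two parents with probability proportional to fitness."""
    probs = reproduction_probabilities(fitnesses)
    return rng.choices(population, weights=probs, k=2)


def crossover(a, b, point):
    """Swap the substrings after the cross-over point."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(state, rate, alphabet, rng):
    """Replace each character with a random one with probability `rate`."""
    return "".join(
        rng.choice(alphabet) if rng.random() < rate else ch for ch in state
    )


print(crossover("abcde", "ABCDE", 2))          # ('abCDE', 'ABcde')
print(reproduction_probabilities([1, 2, 1]))   # [0.25, 0.5, 0.25]
print(mutate("abcde", 0.2, "abcdeABCDE", random.Random(0)))
```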
TopHat Quiz (Past Exam Question) ID:
📗 [4 points] When using the Genetic Algorithm, suppose the states are \(\begin{bmatrix} x_{1} & x_{2} & ... & x_{T} \end{bmatrix}\) = , , , . Let \(T\) = , the fitness function (not the cost) is \(\max \left\{t \in \left\{0, ..., T\right\} : x_{t} = 1\right\}\) with \(x_{0} = 1\) (i.e. the index of the last feature that is 1, or 0 if no feature is 1). What is the reproduction probability of the first state: ?
📗 Answer: .
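📗 The specific numbers of the question are not shown above, so the sketch below works through the same kind of computation on made-up states. The four states are hypothetical and are not the ones from the exam.

```python
from fractions import Fraction


def last_one_index(state):
    """Fitness: the largest t with x_t = 1, where x_0 = 1 by convention."""
    fitness = 0  # covers the all-zero state through x_0 = 1
    for t, x in enumerate(state, start=1):
        if x == 1:
            fitness = t
    return fitness


# Hypothetical population with T = 4 (not the exam's values).
population = [
    [1, 0, 1, 0],  # last 1 at index 3
    [0, 0, 0, 1],  # last 1 at index 4
    [0, 1, 0, 0],  # last 1 at index 2
    [0, 0, 0, 0],  # no 1, so the fitness is 0
]

fitnesses = [last_one_index(s) for s in population]
total = sum(fitnesses)
probabilities = [Fraction(f, total) for f in fitnesses]

print(fitnesses)         # [3, 4, 2, 0]
print(probabilities[0])  # 1/3
```

With these made-up states the first state reproduces with probability 3 / (3 + 4 + 2 + 0) = 1/3.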




# Variants of Genetic Algorithm

📗 The parents do not survive in the standard genetic algorithm, but if reproduction between two copies of the same state is allowed, the parents can survive.
📗 The fitness or cost functions can be replaced by the ranking.
📗 In theory, cross-over is much more efficient than mutation.
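📗 As one possible reading of replacing the fitness by the ranking, the sketch below gives the worst state rank 1 and the best state rank \(N\), then makes the reproduction probability proportional to the rank. This linear ranking rule and the function name are assumptions; other ranking schemes exist.

```python
def rank_probabilities(costs):
    """Reproduction probabilities proportional to rank (lowest cost ranks highest)."""
    n = len(costs)
    # Sort indices from worst (highest cost) to best (lowest cost).
    order = sorted(range(n), key=lambda i: costs[i], reverse=True)
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = n * (n + 1) // 2  # 1 + 2 + ... + N
    return [r / total for r in ranks]


# Costs 10, 1, 5 get ranks 1, 3, 2, so the probabilities are 1/6, 3/6, 2/6.
print(rank_probabilities([10, 1, 5]))
```

Only the order of the costs matters here, so one state with an extreme fitness value cannot dominate the selection.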
Example
📗 Many problems can be solved by genetic algorithms (but in practice, reinforcement learning techniques are more efficient and produce better policies).
➩ Walkers: Link.
➩ Cars: Link.
➩ Eaters: Link.
➩ Image: Link.



# State Representation of Neural Networks

📗 A neural network can be represented by a sequence of weights (a single state).
📗 Two neural networks can swap a subset of weights (cross-over).
📗 One neural network can randomly update a subset of weights with small probability (mutation).
📗 Genetic algorithms can be used to train neural networks to perform reinforcement learning tasks.
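📗 A minimal sketch of cross-over and mutation on networks stored as flat weight lists. The per-weight swap probability, the Gaussian noise used for mutation, and all names are assumptions for illustration; the notes do not specify how the subset of weights is chosen or updated.

```python
import random


def crossover_weights(w1, w2, rng, swap_prob=0.5):
    """Swap a random subset of weights between two equally sized networks."""
    if len(w1) != len(w2):
        raise ValueError("both networks must have the same number of weights")
    c1, c2 = list(w1), list(w2)
    for i in range(len(c1)):
        if rng.random() < swap_prob:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2


def mutate_weights(weights, rng, rate=0.05, scale=0.1):
    """Add Gaussian noise to each weight with probability `rate`."""
    return [
        w + rng.gauss(0.0, scale) if rng.random() < rate else w
        for w in weights
    ]


rng = random.Random(0)
parent_a = [0.1, -0.4, 0.7, 0.0]  # hypothetical flattened weights
parent_b = [0.9, 0.2, -0.3, 0.5]
child_a, child_b = crossover_weights(parent_a, parent_b, rng)
print(child_a, child_b)
print(mutate_weights(child_a, rng, rate=0.5))
```

In a reinforcement learning setting, the fitness of each weight list would be the reward the corresponding network earns, which needs no gradient.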



📗 Notes and code adapted from the course taught by Professors Jerry Zhu, Yingyu Liang, and Charles Dyer.






Last Updated: November 30, 2024 at 4:35 AM