# Warning: this is a draft, please do not start until the homework is announced on Canvas


# A7 Assignment Instruction

📗 Enter your ID (the wisc email ID without @wisc.edu) and click the button (or hit the "Enter" key).
📗 You can also load your answers from your saved file.
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The official deadline is August 5; late submissions within one week will be accepted without penalty.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could either copy and paste or load your program outputs into the text boxes for individual questions or print all your outputs to a single text file and load it using the button at the bottom of the page.
📗 Please do not refresh the page: your answers will not be saved.
📗 You should implement the algorithms using the mathematical formulas from the slides. You can use packages and libraries to preprocess and read the data and format the outputs. It is not recommended that you use machine learning packages or libraries, but you will not lose points for doing so.
📗 Please report any bugs on Piazza.

# Warning: please enter your ID before you start!


📗 (Introduction) In this programming homework, you will use a genetic algorithm to train a neural network to control a simplified version of Flappy Bird (Wikipedia), similar to this project: Link. Your neural network will have two inputs (the horizontal and vertical distances to the center of the next obstacle or pipe) and one output (whether to flap). You should use a single hidden layer, and you can decide the number of hidden units.

📗 (Part 1) Make sure you can simulate the environment correctly. The obstacles (pipes in the original game) are \(h\) = ? units apart, with a gap of \(g\) = ? units for the bird to fly through at a random position between \(0\) and \(100\). For simplicity, assume the horizontal thickness of the obstacle is \(0\) (this makes the game and the geometry significantly easier). The birds move down (due to gravity) by \(d\) = ? units, move up (when the flap action is used) by \(u\) = ? units, and move forward \(1\) unit every frame. For simplicity, you can allow the birds to fly above the top or below the bottom of the screen.
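A minimal sketch of one way to simulate the environment. It rests on several assumptions the description does not fix: the bird starts at horizontal position 0 and a hypothetical height `start_y`, obstacle `k` sits at horizontal position `(k + 1) * h`, the vertical feature is the gap center minus the bird's height, each frame the bird either flaps (up by `u`) or falls (down by `d`), and a crash is checked only on the frame the bird reaches an obstacle (thickness 0). The name `simulate` and its return values are illustrative only.

```python
def simulate(centers, actions, h, g, d, u, start_y=50.0):
    """Replay an action sequence; return (features, distance, survived).

    features: one (horizontal, vertical) pair per frame, recorded before
    the action of that frame is applied.
    """
    x, y = 0, start_y
    features = []
    for action in actions:
        k = x // h  # index of the next obstacle (assumed at x = (k + 1) * h)
        if k >= len(centers):
            break  # every obstacle has been passed
        features.append((float((k + 1) * h - x), centers[k] - y))
        # Assumption: a flap replaces gravity for that frame.
        y += u if action == 1 else -d
        x += 1
        # Thickness 0: the bird can only hit an obstacle exactly at its x.
        if x % h == 0 and abs(centers[k] - y) > g / 2:
            return features, x, False
    return features, x, True
```

If your reading of the rules differs (for example, gravity applies on every frame and a flap adds `u` on top), only the line that updates `y` needs to change.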

📗 (Part 1) Manually create a policy function to generate a training set to train a few neural networks using gradient descent. You can create multiple different training sets or randomly sample subsets from a large training set to get different networks.
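As an illustration of the manual policy idea, the sketch below uses a deliberately simple hand-written rule (flap whenever the bird is below the center of the next gap) and records (horizontal, vertical, action) rows as training data. The rule, the names `hand_policy` and `make_training_set`, the starting height, and the sign convention (vertical = center minus bird height) are all assumptions rather than part of the assignment. Crashes are ignored here because the goal is only to collect labeled examples.

```python
import random


def hand_policy(horizontal, vertical):
    """Hypothetical rule: flap when the bird is below the gap center."""
    return 1 if vertical > 0 else 0


def make_training_set(n_obstacles, h, d, u, seed=0, start_y=50.0):
    """Fly the hand policy past random obstacles and log every frame."""
    rng = random.Random(seed)
    centers = [rng.uniform(0, 100) for _ in range(n_obstacles)]
    y = start_y
    rows = []
    for x in range(n_obstacles * h):
        k = x // h  # next obstacle, assumed at x = (k + 1) * h
        horizontal = (k + 1) * h - x
        vertical = centers[k] - y
        action = hand_policy(horizontal, vertical)
        rows.append((horizontal, vertical, action))
        y += u if action == 1 else -d  # assumption: flap replaces gravity
    return centers, rows
```

Calling `make_training_set` with different seeds, or sampling random subsets of the rows, gives the different training sets mentioned above.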

📗 (Part 2) Start with the random networks from Part 1 and compute the total distance traveled minus the distance to the center of the obstacle. Use this value as the fitness measure for the genetic algorithm.
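One possible reading of this fitness measure, sketched below: run a policy until it crashes or passes the last obstacle, then return the horizontal distance traveled minus the vertical distance between the bird and the center of the obstacle it just reached. The function name `run_fitness`, the starting height, the obstacle placement at `(k + 1) * h`, and the either-flap-or-fall update are assumptions. The result is left unrounded; how to convert it to the integer the questions ask for is not decided here.

```python
def run_fitness(policy, centers, h, g, d, u, start_y=50.0):
    """Distance traveled minus |center - bird height| when the game ends.

    policy: callable (horizontal, vertical) -> 0 or 1.
    """
    x, y = 0, start_y
    miss = 0.0
    for k, center in enumerate(centers):
        for _ in range(h):  # fly the h frames up to obstacle k
            action = policy((k + 1) * h - x, center - y)
            y += u if action == 1 else -d
            x += 1
        miss = abs(center - y)
        if miss > g / 2:
            break  # the bird hit obstacle k
    return x - miss
```

Any callable with the same two inputs, such as a trained network wrapped in a function, can be passed as `policy`.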

📗 (Part 2) Cross-over the networks by randomly swapping weights and biases of the networks. Choose the cross-over probabilities based on the fitness measures. You can allow cross-over between two copies of the same network (which means the same network will be in the next generation).
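A hedged sketch of the cross-over step, treating each network as a flat list of its weights and biases. Parents are drawn with probability proportional to fitness after shifting the values to be non-negative; that shift, the 0.5 swap probability, and the function names are assumptions, since the exact scheme is left open above. Drawing the same network twice is allowed and reproduces it unchanged.

```python
import random


def select_parents(population, fitnesses, rng):
    """Pick two parents with probability proportional to shifted fitness."""
    low = min(fitnesses)
    # Shift so every weight is non-negative; the tiny constant keeps the
    # total positive even when all fitness values are equal.
    weights = [f - low + 1e-9 for f in fitnesses]
    return rng.choices(population, weights=weights, k=2)


def crossover(parent_a, parent_b, rng, swap_prob=0.5):
    """Child takes each weight or bias from one parent chosen at random."""
    return [b if rng.random() < swap_prob else a
            for a, b in zip(parent_a, parent_b)]
```

Passing an explicit `random.Random` instance as `rng` makes runs reproducible, which helps when debugging.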

📗 (Part 2) Randomly mutate the networks with small probabilities by multiplying or dividing the weights and biases by a random number between 0 and 0.5.
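The mutation step could be sketched as follows, again on a flat list of weights and biases. The per-weight mutation rate, the even split between multiplying and dividing, and the small positive lower bound `low` (used to avoid dividing by zero) are assumptions added for illustration.

```python
import random


def mutate(weights, rng, rate=0.05, low=1e-3, high=0.5):
    """With probability `rate`, multiply or divide a weight by r in [low, high]."""
    mutated = []
    for w in weights:
        if rng.random() < rate:
            r = rng.uniform(low, high)
            # Multiplying shrinks the weight; dividing enlarges it.
            w = w * r if rng.random() < 0.5 else w / r
        mutated.append(w)
    return mutated
```

Note that this kind of mutation never changes the sign of a weight, so sign changes can only come from cross-over between networks.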

📗 (Part 2) Repeat the process many times until the best network can pass through all obstacles.

You can play a simulation of the game environment here (or use it to generate sample data):


Click to restart the game (and clear data):
Distance to next obstacle: horizontal: , vertical:
Score: current distance: , fitness (after game ends):
Obstacle centers:
Features: horizontal: , vertical:
Actions:
Combined data (row 1 is feature 1, row 2 is feature 2, row 3 is action):

📗 Note: if you are interested in reinforcement learning, you can also train the neural network using policy gradient methods similar to Link

# Question 1 (Part 1)

📗 [5 points] Given the following centers for the obstacles and the sequence of actions (0 means no flap, 1 means flap), compute the input features (horizontal and vertical distances to the center of the next obstacle) for every frame: \(t\) lines, \(2\) numbers each line, rounded to 2 decimal places.
➩ Centers:
➩ Actions:




# Question 2 (Part 1)

📗 [2 points] Train a neural network (either gradient descent or some machine learning package) to fit the action sequence from Question 1. Enter the first layer weights here: \(3\) lines, \(n\) numbers each line, rounded to 4 decimal places, first line for feature 1 weights (horizontal distance), second line for feature 2 weights (vertical distance), and the last line contains the bias terms.




# Question 3 (Part 1)

📗 [2 points] (Continue from Question 2) Enter the second layer weights here: \(n + 1\) numbers in one line, rounded to 4 decimal places, the last number is the bias for the output unit.




# Question 4 (Part 1)

📗 [10 points] Evaluate your network from Question 2 and Question 3 based on the obstacle centers from Question 1. If you trained your network correctly, your answer to this question should be the same as the action sequence in Question 1 (minor differences are okay). \(t\) integers (0 or 1) in one line.
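For reference, a minimal sketch of evaluating a network stored in the layout used by Question 2 and Question 3: `first_layer` holds three rows of \(n\) numbers (feature 1 weights, feature 2 weights, biases) and `second_layer` holds \(n + 1\) numbers with the output bias last. The logistic activation in both layers and the 0.5 threshold are assumptions; use whatever activation your own network was trained with.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(first_layer, second_layer, horizontal, vertical):
    """Return 1 (flap) or 0 (no flap) for one frame."""
    w_h, w_v, b = first_layer  # rows: feature 1, feature 2, biases
    hidden = [sigmoid(wh * horizontal + wv * vertical + bj)
              for wh, wv, bj in zip(w_h, w_v, b)]
    z = sum(w * a for w, a in zip(second_layer[:-1], hidden))
    z += second_layer[-1]  # output bias is the last number
    return 1 if sigmoid(z) >= 0.5 else 0
```

Calling `predict` once per frame, with the features recomputed after each action, produces the \(t\) actions for this question.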




# Question 5 (Part 1)

📗 [5 points] Compute the fitness value of the above action sequence. Enter a single integer.
Answer:

# Question 6 (Part 2)

📗 [2 points] Use genetic algorithm to train a network and find the best network in the last iteration. Enter the first layer weights here: \(3\) lines, \(n\) numbers each line, rounded to 4 decimal places, first line for feature 1 weights (horizontal distance), second line for feature 2 weights (vertical distance), and the last line contains the bias terms.




# Question 7 (Part 2)

📗 [2 points] (Continue from Question 6) Enter the second layer weights here: \(n + 1\) numbers in one line, rounded to 4 decimal places, the last number is the bias for the output unit.




You can simulate the game using your network here:



# Question 8 (Part 2)

📗 [10 points] Evaluate your network from Question 6 and Question 7 based on the obstacle centers from Question 1. \(t\) integers (0 or 1) in one line.




# Question 9 (Part 2)

📗 [40 points] Compute the fitness value of the above action sequence. Enter a single integer. This question is worth 40 points because it is graded based on (1) consistency with the previous 3 questions and (2) the performance of your network: the higher the fitness value, the higher your grade.



# Question 10

📗 [1 point] Please confirm that you are going to submit the code on Canvas under Assignment A7, and make sure you give attribution for all blocks of code you did not write yourself (see bottom of the page for details and examples).
I will submit the code on Canvas.

# Question 11

📗 [1 point] Please enter any comments and suggestions, including possible mistakes and bugs with the questions and the auto-grading, and materials relevant to solving the questions that you think are not covered well during the lectures. If you have no comments, please enter "None": do not leave it blank.
📗 Answer: .

# Grade



# Submission


📗 Please do not modify the content in the above text field: use the "Grade" button to update.
📗 Warning: grading may take around 10 to 20 seconds. Please be patient and do not click "Grade" multiple times.


📗 You can submit multiple times (but please do not submit too often): only the latest submission will be counted.
📗 Please also save the text in the above text box to a file using the button, or copy and paste it into a file yourself. You can also include the resulting file with your code on Canvas Assignment A7.
📗 You can load your answers from the text (or txt file) in the text box below using the button. The first two lines should be "##a: 7" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.



📗 Saving and loading may take around 10 to 20 seconds. Please be patient and do not click "Load" multiple times.
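If you print all your outputs to a single text file, a small helper along these lines can assemble it in the "##a: 7" / "##id: ..." / "##1: ..." format described above. The function name and the choice to pass each answer as an already formatted string are assumptions; how multi-line answers should be laid out is not specified here, so check that the page loads your file correctly.

```python
def build_answer_file(student_id, answers):
    """Build the answer text from a dict {question number: answer string}."""
    lines = ["##a: 7", f"##id: {student_id}"]
    for number in sorted(answers):
        lines.append(f"##{number}: {answers[number]}")
    return "\n".join(lines) + "\n"
```

The returned string can be written to a .txt file with `open(path, "w")` and loaded using the button on this page.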

# Solutions

📗 The sample solution in Java and Python will be posted on Piazza around the deadline. You are allowed to copy and use parts of the solution with attribution. You are allowed to use code from other people (with their permission) and from the Internet, but you must give attribution at the beginning of your code. You are allowed to use large language models such as GPT4 to write parts of the code for you, but you have to include the prompts you used in the code submission. For example, you can put the following comments at the beginning of your code:
% Code attribution: (TA's name)'s A7 example solution.
% Code attribution: (student name)'s A7 solution.
% Code attribution: (student name)'s answer on Piazza: (link to Piazza post).
% Code attribution: (person or account name)'s answer on Stack Overflow: (link to page).
% Code attribution: (large language model name e.g. GPT4): (include the prompts you used).
📗 You can get help on understanding the algorithm from any of the office hours; to get help with debugging, please go to the TA's office hours. For times and locations see the Home page. You are encouraged to work with other students, but if you use their code, you must give attribution at the beginning of your code.





Last Updated: July 03, 2024 at 12:23 PM