
# Warning: auto-grading is not working at the moment, please wait for an announcement on Canvas.


# A9 Project Submission Checklist

📗 Regular component (out of 5) should be submitted using the "Grade" and "Submit" buttons at the bottom of the page.
➩ Submission of the text file generated by the auto-grader to Canvas Assignment A9 is optional.
➩ Due date: August 9, no submission after that will be accepted.
📗 Competition component (out of 5): the text file generated using the Question 9 "Generate" button should be submitted to the Canvas Assignment A9C: Link
➩ Submission of an incorrectly formatted text file and any additional files to A9C will result in a competition score of \(-\infty\).
➩ Due date: August 4, no submission after that will be accepted under any circumstances.
📗 Note: the Canvas A9 and A9C due dates are the recommended due dates. Competition submissions made before the recommended due date will participate in trial competitions, with the option to keep the score (not the ranking).
📗 Hint: example submissions, discussion session schedules, and group recommendations (very different for different assignments) can be found on Piazza: Link.

# A9 Project Instruction

📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click (or hit the "Enter" key)
📗 You can also load from your saved file
and click .
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could either copy and paste or load your program outputs into the text boxes for individual questions or print all your outputs to a single text file and load it using the button at the bottom of the page.
📗 Please do not refresh the page: your answers will not be saved.
📗 You can write the code in any programming language and using any large language models. You do not have to submit your code.
📗 Please report any bugs on Piazza.

# Warning: please enter your ID before you start!




📗 (Introduction) In the project, you will train a neural network to control a simplified version of Flappy Bird (Wikipedia). Your neural network will have three inputs (horizontal and vertical distances to the top and bottom of the next obstacle or pipe) and one output (whether to flap). You can use one or two hidden layers with a maximum of 100 units in each layer.

📗 (Part 1) Make sure you can simulate the environment correctly. Use Q value iteration to solve for one possible optimal solution for a simplified environment.

📗 (Part 2) Apply Q value iteration to the full environment and generate a training data set for imitation learning. Train a neural network to replicate the optimal behavior.

📗 (Competition) Submit your network to control the bird in a competition game. You can choose one of three groups to compete in, and within the group, you can choose the type of bird you want to use. During the competition, birds of the same type will not clash with each other, but birds of different types can push each other.

Your four digit ID will be used to determine your group, and your emoji will be used to determine your type.

Type 1 (animal):
Type 2 (food):
Type 3 (other):

Type 1 can push type 2, type 2 can push type 3, and type 3 can push type 1.

You will fly through 30 pipes with decreasing gap size (the last few have size 1). Suppose your bird flies through \(d\) of the pipes and hits the pipe at \(z\) units away from the center of the gap; then your score is
➩ \(20 1_{\left\{d = 30\right\}} + 10 d - z\)
that is,
➩ Passing through all pipes gives a bonus of \(20\).
➩ Passing through previous pipes gives \(10\) points each.
➩ Hitting the pipes closer to the gap will lead to relatively higher scores.
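
The score formula above can be written as a short function. This is only a sketch: the function name `competition_score` and the `total_pipes` parameter are illustrative, and the value of \(z\) when all 30 pipes are passed is assumed to be 0 (the page does not say).

```python
def competition_score(d: int, z: float, total_pipes: int = 30) -> float:
    """Score = 20 * 1{d == total_pipes} + 10 * d - z."""
    # Bonus of 20 only when every pipe is passed.
    bonus = 20 if d == total_pipes else 0
    return bonus + 10 * d - z


# All 30 pipes passed (z assumed to be 0): 20 + 300 - 0
print(competition_score(30, 0))    # 320
# 12 pipes passed, then a hit 1.5 units from the gap center: 120 - 1.5
print(competition_score(12, 1.5))  # 118.5
```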

Your project grade is based on your submission to this assignment (out of 5) plus your ranking in the class (out of 5):
Top 20% gets 5/5.
Next 20% gets 4/5.
Next 20% gets 3/5.
Next 20% gets 2/5.
Next 20% gets 1/5.
(Students who do not participate in the competition will be given a score of negative infinity when computing the rankings.)

# Competition Simulator [DO NOT USE FOR TESTING]


You can play a simulation of the game environment here (or use it to generate sample data):


Click to restart the game (and clear data):
Distance to next obstacle: horizontal: , vertical:
Score: current distance: , fitness (after game ends):
Obstacle centers:
Features: horizontal: , vertical:
Actions:
Combined data (row 1 is feature 1, row 2 is feature 2, row 3 is action):

📗 Note: if you are interested in reinforcement learning, you can also train the neural network using policy gradient methods similar to Link

# Question 1 (Part 1)

📗 [2 points] Train a neural network (either gradient descent or some machine learning package) to fit the action sequence from Question 1. Enter the first layer weights here: \(3\) lines, \(n\) numbers each line, rounded to 4 decimal places, first line for feature 1 weights (horizontal distance), second line for feature 2 weights (vertical distance), and the last line contains the bias terms.
Hint
📗 See the MNIST assignment.
📗 You can decide the number of hidden units \(n\), but it should be at least \(4\).
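
One possible way to fit the action sequence with plain gradient descent is sketched below. It assumes a single hidden layer with logistic (sigmoid) activations and a cross-entropy loss; the page does not prescribe the activation, loss, learning rate, or feature scaling, so all of these (and the names `train` and `predict_proba`) are illustrative choices.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def predict_proba(X, W1, b1, w2, b2):
    """Forward pass: X is (m, 2), W1 is (2, n), b1 is (n,), w2 is (n,)."""
    H = sigmoid(X @ W1 + b1)
    return sigmoid(H @ w2 + b2)


def train(X, y, n_hidden=4, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0, 0.5, size=n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)                   # hidden activations, (m, n)
        p = sigmoid(H @ w2 + b2)                   # flap probabilities, (m,)
        d_out = (p - y) / len(y)                   # dLoss/d(output pre-activation)
        d_hid = np.outer(d_out, w2) * H * (1 - H)  # dLoss/d(hidden pre-activation)
        w2 = w2 - lr * (H.T @ d_out)
        b2 = b2 - lr * d_out.sum()
        W1 = W1 - lr * (X.T @ d_hid)
        b1 = b1 - lr * d_hid.sum(axis=0)
    return W1, b1, w2, b2
```

Here `X` holds one row per time step (horizontal distance, vertical distance) and `y` holds the 0/1 actions. Rescaling the features to a small range before training is usually helpful, but that is a design choice and not something the assignment requires.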




# Question 2 (Part 1)

📗 [2 points] (Continue from Question 1) Enter the second layer weights here: \(n + 1\) numbers in one line, rounded to 4 decimal places, the last number is the bias for the output unit.
Hint
📗 See the MNIST assignment.
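
The entry formats for the two weight layers (3 lines of \(n\) numbers, then one line of \(n + 1\) numbers, all rounded to 4 decimal places) could be produced as follows. This is a sketch with hypothetical helper names; the page does not state the separator, so a comma is assumed here and may need to be changed.

```python
import numpy as np


def format_first_layer(W1, b1, sep=","):
    """W1 is (2, n): row 0 = horizontal-distance weights,
    row 1 = vertical-distance weights. b1 is (n,): hidden biases.
    Returns 3 lines with n numbers each."""
    rows = np.vstack([np.asarray(W1, dtype=float),
                      np.asarray(b1, dtype=float).reshape(1, -1)])
    return "\n".join(sep.join(f"{v:.4f}" for v in row) for row in rows)


def format_second_layer(w2, b2, sep=","):
    """w2 is (n,): output weights. b2 is the output bias (written last).
    Returns one line with n + 1 numbers."""
    values = list(np.asarray(w2, dtype=float)) + [float(b2)]
    return sep.join(f"{v:.4f}" for v in values)


# Tiny example with n = 2 (the assignment itself asks for n >= 4).
print(format_first_layer([[0.12346, -1.0], [2.0, 0.00004]], [0.5, -0.25]))
print(format_second_layer([1, 2.5], -0.3))
```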




# Question 3 (Part 1)

📗 [10 points] Evaluate your network from Question 1 and Question 2 based on the obstacle centers from Question 1. If you trained your network correctly, your answer to this question should be the same as the action sequence in Question 1 (minor differences are okay). Enter \(t\) integers (0 or 1) in one line, where \(t\) is the length of the actions vector in Question 1 (i.e. compute the actions even after the bird hits an obstacle).
Hint
📗 See the MNIST assignment.
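
Evaluating a trained network amounts to a forward pass followed by thresholding. The sketch below assumes logistic activations in both layers and a 0.5 threshold for "flap"; neither is specified on this page, and the name `predict_actions` is hypothetical.

```python
import numpy as np


def predict_actions(X, W1, b1, w2, b2, threshold=0.5):
    """X is (t, 2): one row of (horizontal, vertical) distances per step.
    Returns t integers, 1 = flap and 0 = do nothing."""
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))  # hidden layer, (t, n)
    p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))  # output unit, (t,)
    return (p >= threshold).astype(int)


# Hand-set weights with n = 4: only the vertical distance matters here.
W1 = np.zeros((2, 4))
W1[1, 0] = 5.0
b1 = np.zeros(4)
w2 = np.array([4.0, 0.0, 0.0, 0.0])
b2 = -2.0
X = np.array([[3.0, 2.0], [3.0, -2.0]])
print(predict_actions(X, W1, b1, w2, b2))  # [1 0]
```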




# Question 4 (Part 1)

📗 [5 points] Compute the fitness value of the above action sequence. Enter a single integer.
Hint
📗 The fitness is the x-distance traveled before hitting a pipe minus the absolute y-distance to the center of the pipe. If the current position of the bird is \(\left(x, y\right)\) and the center of the pipe is \(\left(c_{x}, c_{y}\right)\) where \(c_{x} = x\), then the fitness is \(x - \left| y - c_{y} \right|\).
Answer:

You can plot the path of your action sequence using .
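
As a quick check of the fitness formula in the hint, here is a minimal sketch (the function name `fitness` and the example numbers are illustrative only):

```python
def fitness(x: float, y: float, c_y: float) -> float:
    """x-distance traveled minus |y - c_y|, where (x, y) is the bird's
    position at the hit and c_y is the vertical center of the pipe gap."""
    return x - abs(y - c_y)


# Bird hits a pipe at x = 25, 3 units above the gap center: 25 - 3
print(fitness(25, 7, 4))  # 22
```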


# Question 5 (Part 2)

📗 [2 points] Use a genetic algorithm to train a network and find the best network in the last iteration. Enter the first layer weights here: \(3\) lines, \(n\) numbers each line, rounded to 4 decimal places, first line for feature 1 weights (horizontal distance), second line for feature 2 weights (vertical distance), and the last line contains the bias terms.
Hint
📗 Start with \(N\) neural networks with random weights, or random perturbations of the neural networks from Questions 1 and 2.
📗 Compute the fitness of the neural networks \(f_{i}\), and the reproduction probability: \(\dfrac{f_{i}}{f_{1} + f_{2} + ... + f_{N}}\).
📗 Randomly select two networks based on the reproduction probabilities, and swap the weights and biases of the networks (there are many ways to cross-over, one example is to flatten all weights and biases to a long vector, choose a random position, and swap the weights and biases of the two networks after that position).
📗 Randomly mutate each of the resulting networks (mutation probabilities should be small, and there are many ways to mutate, multiplying or dividing by a random number between 0 and 0.5 is one example, but you can also try adding or subtracting a random number).
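
The steps in the hint above can be sketched as one generation of a genetic algorithm. This is only one of the many variants the hint allows: each network is assumed to be flattened into a 1-D NumPy vector, fitness values are assumed to be positive (otherwise the reproduction probabilities are not valid and the fitness must be shifted first), and mutation adds small Gaussian noise. All function names and the `rate` and `scale` values are hypothetical.

```python
import numpy as np


def reproduction_probabilities(fitness):
    """p_i = f_i / (f_1 + ... + f_N); assumes every f_i > 0."""
    f = np.asarray(fitness, dtype=float)
    return f / f.sum()


def crossover(a, b, rng):
    """Single-point crossover: swap everything after a random position."""
    pos = rng.integers(1, len(a))  # cut point in 1 .. len(a) - 1
    child1 = np.concatenate([a[:pos], b[pos:]])
    child2 = np.concatenate([b[:pos], a[pos:]])
    return child1, child2


def mutate(w, rng, rate=0.05, scale=0.1):
    """With small probability `rate`, add Gaussian noise to each entry."""
    mask = rng.random(w.shape) < rate
    return w + mask * rng.normal(0.0, scale, size=w.shape)


def next_generation(population, fitness, rng):
    """Build a new population of the same size from the current one."""
    probs = reproduction_probabilities(fitness)
    new_pop = []
    while len(new_pop) < len(population):
        # Parents are drawn with replacement, so a network may pair with itself.
        i, j = rng.choice(len(population), size=2, p=probs)
        c1, c2 = crossover(population[i], population[j], rng)
        new_pop.extend([mutate(c1, rng), mutate(c2, rng)])
    return new_pop[:len(population)]
```

A typical loop would evaluate the fitness of every vector by simulating the game, call `next_generation`, and repeat; keeping the best network of each generation unchanged is a common addition but is not part of the hint.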




# Question 6 (Part 2)

📗 [2 points] Enter the second layer weights here: \(n + 1\) numbers in one line, rounded to 4 decimal places, the last number is the bias for the output unit.
Hint
📗 See Question 5.




You can simulate the game using your network here:


Click to restart the game (and clear data):
Distance to next obstacle: horizontal: , vertical:
Score: current distance: , fitness (after game ends):
Obstacle centers:
Features: horizontal: , vertical:
Actions:
Combined data (row 1 is feature 1, row 2 is feature 2, row 3 is action):

# Question 7 (Part 2)

📗 [10 points] Evaluate your network from Question 5 and Question 6 based on the obstacle centers from Question 1. Enter \(t\) integers (0 or 1) in one line, where \(t\) is the length of the actions vector in Question 1 (i.e. compute the actions even after the bird hits an obstacle).
Hint
📗 Same as Question 3.




# Question 8 (Part 2)

📗 [20 points] Compute the fitness value of the above action sequence. Enter a single integer. This question is worth 20 points because it is graded based on (1) consistency with the previous 3 questions and (2) the performance of your network: the higher the fitness value, the higher your grade.
Hint
📗 Same as Question 4.
Answer:

You can plot the path of your action sequence using .


# Question 9 (Competition)

📗 [1 point] Please use the following form to generate a text file:
➩ Wisc Net ID (the ??? in ???@wisc.edu):
➩ Group:
➩ Player Icon (text from this icon):
➩ Player ID (a number between 0 and 9999):
➩ Network First (see net):

➩ Network Second:


➩ Output file:

📗 Every student must perform training independently and submit different trained networks.
📗 Submit this file on Canvas to Assignment A9C.
📗 To get the point for this question, please check this box if you submitted the file on Canvas or decided not to participate in the competition:

# Question 10

📗 [1 point] Please list the AI tools and references you used and the names of other students and course staff you discussed the assignment or competition with. Please also enter any comments and suggestions, including possible mistakes and bugs with the questions and the auto-grading. If you completed the assignment without any help (not recommended), please enter "None" and do not leave this question blank.
📗 Answer: .

# Grade


📗 Grading may take around 5 to 10 seconds. Please be patient and do not click "Grade" multiple times.

# Submission

 
📗 Please do not modify the content in the above text field: use the "Grade" button to update.


📗 You can submit multiple times (but please do not submit too often): only the latest submission will be counted.
📗 Please also save the text in the above text box to a file using the button, or copy and paste it into a file yourself.
📗 You could load your answers from the text (or txt file) in the text box below using the button . The first two lines should be "##a: 9" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.
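
If your program prints all outputs to a single text file, the layout described above could be generated like this. It is a sketch only: `build_answer_file` is a hypothetical helper, the example ID and answers are made up, and each answer is inserted verbatim (how multi-line answers such as the first layer weights should be encoded is not stated here, so check the loaded answers on the page before submitting).

```python
def build_answer_file(student_id: str, answers: dict, assignment: int = 9) -> str:
    """Return text starting with '##a: 9' and '##id: ...', followed by
    one '##k: answer' entry per question, in question order."""
    lines = [f"##a: {assignment}", f"##id: {student_id}"]
    for q in sorted(answers):
        lines.append(f"##{q}: {answers[q]}")
    return "\n".join(lines)


# Made-up ID and answers, for illustration only.
text = build_answer_file("bbadger", {4: "22", 3: "1,0,0,1"})
print(text)
```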



📗 Saving and loading may take around 5 to 10 seconds. Please be patient and do not click "Load" multiple times.

# Presentations and Interviews

📗 Presentations and interviews are optional for the competitions.
📗 If your competition grade is 2, 3, or 4, you can book an interview with the TA for 15 to 30 minutes.
📗 Interviews can only be booked during discussion sessions on Zoom (either during the current discussion session or for a future date and time): Link. Please do not email/spam the TA.
📗 A maximum of 3 interviews can be booked per person, and in the case you need 1 point for the next letter grade, we will allow a 4th one after the final exam.
📗 During the interviews, you will give a 5-to-10-minute presentation to explain anything you did on the project that is creative or technically challenging. Then you will answer three technical questions about your presentation or any materials related to the assignment.
➩ If you answer any one of the three questions incorrectly, you will get \(-1\).
➩ If you answer all questions correctly, and if your presentation ideas are correct, interesting, consistent with your submissions, and not done by many other students (we will make the decision after all interviews are done), you will get \(+1\). 
➩ Otherwise, your grade will not change.





Last Updated: June 26, 2026 at 3:06 AM