
# Lecture

📗 The lecture is in person, but you can join Zoom: 8:50-9:40 or 11:00-11:50. Zoom recordings can be viewed on Canvas -> Zoom -> Cloud Recordings. They will be moved to Kaltura over the weekends.
📗 The in-class (participation) quizzes should be submitted on TopHat (Code:741565), but you can submit your answers through Form at the end of the lectures too.
📗 The Python notebooks used during the lectures can also be found on: GitHub. They will be updated weekly.


# Lecture Notes

📗 Access Links
➩ WebDriver.find_elements("tag name", "a") finds all the hyperlinks on the page (the locator string is "tag name" with a space, which is the value of By.TAG_NAME in Selenium).
➩ Use url = WebElement.get_attribute("href") to get the URL of the hyperlink, then use WebDriver.get(url) to load that page.
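The two steps above can be sketched as one helper. This is a minimal sketch, not the notebook's code: `get_links` is a hypothetical name, and `driver` is assumed to be an already-created Selenium WebDriver with a page loaded.

```python
def get_links(driver):
    """Return the href of every <a> element on the page currently loaded in driver."""
    # "tag name" is the string value of selenium's By.TAG_NAME locator
    anchors = driver.find_elements("tag name", "a")
    urls = []
    for a in anchors:
        url = a.get_attribute("href")
        if url:  # an <a> without an href gives None, so skip it
            urls.append(url)
    return urls
```

With a real browser session, each returned URL could then be passed to `driver.get(url)` to load that page.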

TopHat Discussion
➩ Follow this link: Link, and find the page with an image (two images, counting the UW Madison logo). What search strategy will you use?

Treasure Hunt Example
➩ Follow this link: Link, and find the page with an image (two images, counting the UW Madison logo). What is the image?
➩ Code to find the page with image: Notebook.
➩ Code to find the "network" of pages: Notebook.

📗 Infinite Graph Search
➩ If the pages are nodes, and links on one page to another page are edges, the digraph formed by pages will possibly have infinite depth and may contain cycles.
➩ To find a specific (goal) page, or to discover reachable pages from the initial page, breadth first search should be used (since depth first search may not terminate on trees with infinite depth).
➩ Since there are cycles, a list of visited pages should be kept.
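Breadth first search with a visited set might look like the sketch below. The `pages` dictionary is a made-up toy site standing in for real web pages, and `get_children` is a hypothetical callback; for a real crawl it could return the links found on a page.

```python
from collections import deque

def bfs(start, is_goal, get_children):
    """Breadth first search; returns the path to the first goal found, or None."""
    visited = {start}          # pages already queued, so cycles are not followed again
    queue = deque([[start]])   # queue of paths, shallowest first
    while queue:
        path = queue.popleft()
        node = path[-1]
        if is_goal(node):
            return path
        for child in get_children(node):
            if child not in visited:
                visited.add(child)
                queue.append(path + [child])
    return None

# Hypothetical toy site: page -> pages it links to (A -> B -> A is a cycle)
pages = {"A": ["B", "C"], "B": ["A", "D"], "C": ["D"], "D": ["E"], "E": []}
path = bfs("A", lambda p: p == "E", lambda p: pages[p])
print(path)  # ['A', 'B', 'D', 'E']
```

Without the `visited` set, the cycle between A and B would cause A to be queued again and again.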

TopHat Discussion
➩ Start from the "Data science" Wikipedia page: Link, follow the links on the page, and try to reach the "Cat" Wikipedia page: Link.
➩ The daily Wikipedia game: Link, and solution? Link.

📗 Search Heuristics
➩ A search heuristic is an estimate of how close the current node is to the goal node in the search tree.
➩ The heuristic function is fixed before the search starts, so it may not be an accurate estimate of the distance from the current node to the goal node.
➩ A heuristic that never overestimates the true distance to the goal is called an admissible heuristic.
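On a small finite graph, admissibility can be checked directly by comparing the heuristic with the true distances. The sketch below assumes an unweighted graph (distance = number of edges); the graph, the heuristic tables, and the function names are all made up for illustration.

```python
from collections import deque

def true_distances(graph, goal):
    """Number of edges from each node to goal, found by BFS on the reversed graph."""
    reverse = {n: [] for n in graph}
    for n, children in graph.items():
        for c in children:
            reverse[c].append(n)
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        n = queue.popleft()
        for parent in reverse[n]:
            if parent not in dist:
                dist[parent] = dist[n] + 1
                queue.append(parent)
    return dist  # nodes that cannot reach the goal are left out

def is_admissible(graph, goal, h):
    """True if h never overestimates the true distance to the goal."""
    return all(h[n] <= d for n, d in true_distances(graph, goal).items())

# Hypothetical graph: true distances to G are S: 2, A: 1, B: 2, G: 0
graph = {"S": ["A", "B"], "A": ["G"], "B": ["A"], "G": []}
h_good = {"S": 2, "A": 1, "B": 1, "G": 0}
h_bad = {"S": 1, "A": 3, "B": 1, "G": 0}   # h(A) = 3 overestimates the true distance 1
print(is_admissible(graph, "G", h_good))  # True
print(is_admissible(graph, "G", h_bad))   # False
```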

📗 Informed Search
➩ Searching the nodes in the order according to the heuristic is called best first greedy search (GBS, since BFGS is reserved for the non-linear optimization method, Link).
➩ Since the heuristic could be incorrect, it might not find the shortest path to the goal node.
➩ Searching the nodes in the order according to the current distance from the initial node plus the heuristic is called A search (the name of the search algorithm is "A").
➩ If the heuristic is admissible, A search is called A* search (A-star search).

📗 Priority Queue
➩ For GBS, use a Priority Queue with the priority based on the heuristic: Doc.
➩ For A search, use a Priority Queue with the priority based on current distance plus the heuristic.
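Both priority rules can share one loop. The sketch below uses Python's `heapq` module as the priority queue, which is an assumption and may differ from the linked Doc; the graph, edge costs, and heuristic values are made up so that the two searches return different paths.

```python
import heapq

def informed_search(start, goal, graph, h, use_g):
    """graph: node -> list of (child, edge_cost).
    use_g=False: GBS, priority is h. use_g=True: A search, priority is g + h."""
    frontier = [(h[start], start, 0, [start])]   # (priority, node, g, path)
    visited = set()
    while frontier:
        _, node, g, path = heapq.heappop(frontier)   # smallest priority first
        if node == goal:
            return path, g
        if node in visited:
            continue
        visited.add(node)
        for child, cost in graph[node]:
            if child not in visited:
                new_g = g + cost
                priority = new_g + h[child] if use_g else h[child]
                heapq.heappush(frontier, (priority, child, new_g, path + [child]))
    return None, None

# Hypothetical weighted graph; h never overestimates the true distance to G
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 1, "A": 0, "B": 1, "G": 0}
print(informed_search("S", "G", graph, h, use_g=False))  # (['S', 'A', 'G'], 6)
print(informed_search("S", "G", graph, h, use_g=True))   # (['S', 'B', 'G'], 5)
```

In this example GBS follows the smaller heuristic value at A and ends up on the longer path, while the g + h priority finds the path of total cost 5.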

Treasure Hunt with Hints Example
➩ Follow this link: Link, and find the page with an image (two images, counting the UW Madison logo). What is the image?
➩ Code to find the page using GBS: Notebook.
➩ Code to find the page using A: Notebook.

Additional Examples
➩ Give an example in which the goal node is reachable (a finite number of edges away) from the initial node, but the given method cannot find it. Assume the nodes are \(\left\{G, 0, 1, 2, ...\right\}\), where \(0\) is the initial node and \(G\) is the goal node, and each node has a finite number of children.
(1) BFS: impossible.
(2) DFS: edges between \(\left(0, 2\right)\), \(\left(2, G\right)\), \(\left(0, 1\right)\), \(\left(1, 3\right)\), \(\left(3, 4\right)\), \(\left(4, 5\right)\), ...
(3) GBS: edges between \(\left(0, 2\right)\), \(\left(2, G\right)\), \(\left(0, 1\right)\), \(\left(1, 3\right)\), \(\left(3, 4\right)\), \(\left(4, 5\right)\), ... with \(h\left(2\right) = 1\) and \(h\left(n\right) < 1\) (for example, \(0\)) for every node \(n\) on the chain \(1, 3, 4, 5, ...\).
(4) A: impossible.
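The graph in (2) can be generated lazily to see the difference between the two uninformed searches. This is a sketch under two assumptions: DFS tries child 1 before child 2 at node 0, and a step budget (`max_steps`, a hypothetical parameter) stands in for "never terminates".

```python
from collections import deque

def children(n):
    """Edges (0,1), (0,2), (2,G), (1,3), (3,4), (4,5), ..."""
    if n == 0:
        return [1, 2]
    if n == 1:
        return [3]
    if n == 2:
        return ["G"]
    if n == "G":
        return []
    return [n + 1]   # the infinite chain 3 -> 4 -> 5 -> ...

def search(use_stack, max_steps):
    """Return how many nodes were checked when G is reached, or None if the budget runs out."""
    frontier = deque([0])
    visited = {0}
    for step in range(1, max_steps + 1):
        if not frontier:
            return None
        # DFS takes the newest node (stack), BFS takes the oldest node (queue)
        node = frontier.pop() if use_stack else frontier.popleft()
        if node == "G":
            return step
        kids = children(node)
        if use_stack:
            kids = reversed(kids)   # so the first-listed child is popped first
        for c in kids:
            if c not in visited:
                visited.add(c)
                frontier.append(c)
    return None

print(search(use_stack=False, max_steps=10_000))  # 5
print(search(use_stack=True, max_steps=10_000))   # None
```

BFS reaches G as the fifth node checked, while DFS keeps following the chain 1, 3, 4, 5, ... and never returns to node 2.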

➩ Suppose the following nodes are in the priority queue, {node: "A", g: 0, h: 4}, {node: "B", g: 2, h: 3}, {node: "C", g: 4, h: 2}, {node: "D", g: 6, h: 1}, where "g" represents the distance from the initial node and "h" represents the heuristic (estimated distance to the goal node). Which node will be checked next if the following informed search algorithms are used?
(1) GBS: it selects the node with the smallest h, which is "D".
(2) A: it selects the node with the smallest h + g, which is "A" with h + g = 4.
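The two selections can be checked with a few lines of Python. A plain list and `min` are used here instead of a real priority queue, purely for illustration.

```python
queue = [
    {"node": "A", "g": 0, "h": 4},
    {"node": "B", "g": 2, "h": 3},
    {"node": "C", "g": 4, "h": 2},
    {"node": "D", "g": 6, "h": 1},
]

# GBS: smallest h
next_gbs = min(queue, key=lambda n: n["h"])
# A search: smallest g + h (A: 4, B: 5, C: 6, D: 7)
next_a = min(queue, key=lambda n: n["g"] + n["h"])

print(next_gbs["node"])  # D
print(next_a["node"])    # A
```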


 Notes and code adapted from the course taught by Yiyin Shen Link and Tyler Caraza-Harter Link






Last Updated: January 20, 2025 at 3:11 AM