In the undirected graph, there is an edge between node 1 and node 3; therefore:
Now consider the following (directed) graph:
Note that the layout of the graph is arbitrary -- the important thing is which nodes are connected to which other nodes. So, for example, the following graph is the same as the one given above; it has just been drawn differently:
Also note that an edge can connect a node to itself; for example:
For each of the following graphs, say whether it is:
In general, the nodes of a graph represent objects and the edges represent
relationships.
Here are some examples:
In a tree, all nodes can be reached from the root node, so a tree
can be represented using two classes:
a Treenode class (used to represent each individual node),
and a Tree class that contains a pointer to the root node.
Some graphs have a similar property; i.e., there is a special "root" node
from which all other nodes are reachable (control-flow graphs often have
this property).
In that case, a graph can also be represented using a Graphnode
class for the individual nodes, and a Graph class that
contains a pointer to the root node.
However, if there is no root node, then the Graph class
needs to use some other data structure to keep track of the nodes
in the graph.
There are many possibilities: an array, a List, or a Set
of Graphnodes could be used.
The Graphnodes will contain whatever data is stored in a node (e.g.,
the name of a city, the name of a CS class, the statement represented
by a control-flow graph node).
The nodes will also contain pointers to their successors (stored
e.g., in an array, a List, or a Set).
Here's one reasonable pair of (incomplete) class definitions for
directed graphs, using ArrayLists to store the nodes in the graph
and the successors of each node:
Suppose we have a weighted graph (one in which each
edge has an associated value).
How could the class definitions given above be extended
to store the edge weights?
As discussed above,
graphs are often a good representation for problems involving objects
and their relationships because there are standard graph operations that
can be used to answer useful questions about those relationships.
Here we discuss two such operations: depth-first search and
breadth-first search, and some of their applications.
Both depth-first and breadth-first search are "orderly" ways to traverse
the nodes and edges of a graph that are reachable from some starting node.
The main difference between depth-first and breadth-first search is the
order in which nodes are visited.
Of course, since in general not all nodes are reachable from all other nodes,
the choice of the starting node determines which nodes and edges will be
traversed (either by depth-first or breadth-first search).
The basic idea of a depth-first search is to start at some node n, and then
to follow an edge out of n, then another edge out, and so on,
getting as far away from n as possible before visiting any more of
n's successors.
To prevent infinite loops in graphs with cycles, we must keep track
of which nodes have been visited.
Here is the basic algorithm for a depth-first search from node n,
starting with all nodes marked "unvisited":
Note that in the example illustrated above, the order in which the nodes
are visited is: 0, 2, 3, 1, 4.
Another possible order (if node 4 were the first successor of node 0) is:
0, 4, 2, 3, 1.
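As a minimal sketch of how successor order determines visit order, the Java program below runs a depth-first search over a small graph stored as arrays of successor indices. The edge set (0→2, 0→4, 2→3, 3→1, 3→4, 1→0) is an assumption chosen to reproduce the two visit orders quoted above, and the names DfsOrderSketch and visitOrder are hypothetical. Here the visited information is kept in an auxiliary boolean array rather than in the nodes themselves.

```java
import java.util.ArrayList;
import java.util.List;

class DfsOrderSketch {
    // Depth-first search from node n; successors[i] lists the successors of node i.
    // Each node is appended to 'order' at the moment it is first visited.
    static void dfs(int n, int[][] successors, boolean[] visited, List<Integer> order) {
        visited[n] = true;
        order.add(n);
        for (int m : successors[n]) {
            if (!visited[m]) dfs(m, successors, visited, order);
        }
    }

    // Starts with all nodes unvisited and returns the order in which
    // the nodes reachable from 'start' are visited.
    static List<Integer> visitOrder(int[][] successors, int start) {
        boolean[] visited = new boolean[successors.length];  // all false initially
        List<Integer> order = new ArrayList<>();
        dfs(start, successors, visited, order);
        return order;
    }

    public static void main(String[] args) {
        // Assumed edges: 0->2, 0->4, 2->3, 3->1, 3->4, 1->0
        int[][] first = { {2, 4}, {0}, {3}, {1, 4}, {} };
        System.out.println(visitOrder(first, 0));   // prints [0, 2, 3, 1, 4]

        // Same edges, but node 4 is now the first successor of node 0.
        int[][] second = { {4, 2}, {0}, {3}, {1, 4}, {} };
        System.out.println(visitOrder(second, 0));  // prints [0, 4, 2, 3, 1]
    }
}
```

The two graphs differ only in the order of node 0's successors, yet the whole visit order changes.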
To analyze the time required for depth-first search, note that one call
is made to dfs for each node that is reachable from the start node.
Each call looks at all successors of the current node, so
the time is O(# reachable nodes + total # of outgoing edges from those nodes).
In the worst case, this is all nodes and all edges, so the
worst-case time is O(N + E), where N is the number of nodes in the graph,
and E is the number of edges in the graph.
Assume that you start with all nodes "unvisited", and you do a depth-first
search.
Write a (Graph) method that sets all nodes back to "unvisited".
Recall that at the beginning of this section we said that depth-first
search can be used to answer questions about a graph such as:
The first question we will consider is: is there a path from node j to node k?
This question might be useful, for example:
Consider the example given above to illustrate depth-first search.
There is a cycle in that graph starting from
node 0.
Is there something that happens during the depth-first search that
indicates the presence of that cycle?
Note that during dfs(1), 0 is a successor of 1, but is already visited.
But that isn't quite enough to say that there's a cycle, because during
dfs(3), node 4 is a successor of 3 that has already been visited, but there
is no cycle starting from node 4.
What's the difference?
The answer is that when node 0 is considered as a successor of node 1,
the call dfs(0) is still "active" (i.e., its activation record is still
on the stack); however, when node 4 is considered
as a successor of node 3, the call dfs(4) has already finished.
How can we tell the difference?
The answer is to keep track of when a node is "inProgress" (as well as
whether it has been visited or not).
We can do this by using a "mark" field with three possible values:
Here's the code for cycle detection:
Think again about the graph that represents course prerequisites.
As long as there are no cycles in the graph
there is at least one order in which to take courses, such that
all prereqs are satisfied; i.e., so that for every course,
all prerequisites are taken before the course itself is taken.
(Note that it is reasonable to assume that there are no cycles
in a graph that represents course prerequisites,
because a cycle would mean that a course was a prerequisite for itself!)
Topological numbering can be used to find the order in which
to take the classes (so that all prereqs are satisfied first).
The goal is to assign numbers to nodes so that for every edge
j → k, the number assigned to j is less than the number assigned to k.
A topological numbering of the prerequisites graph would tell you
one legal order in which to take the CS courses.
For example:
To find a topological numbering, we use a variation of depth-first search.
The intuition is as follows: As long as there are no cycles in the graph,
there must be at least one node with no outgoing edges:
Question 1:
Give two different topological numberings for the following graph.
Question 2:
The topNum method given above only assigns numbers to the nodes reachable
from node n.
Write pseudo code for method numberGraph, similar to the code given for
method graphHasCycle above, that assigns topological
numbers to all nodes in a graph.
Question 3:
Write a Graph method isConnected, that returns true iff the graph
is connected.
Assume that every node has a list of its predecessors as well as a list
of its successors.
Some special kinds of graphs
A directed graph can also be a complete graph; in that case, there must
be an edge from every node to every other node.
Here's an example of a weighted, directed graph:
If the graph is a directed graph, also say whether it is cyclic or acyclic.
Uses for Graphs
The reason graphs are good representations in cases like those described
above is that there are many standard graph algorithms (operations on
graphs) that can be used to answer useful questions like:
Representing Graphs
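import java.util.ArrayList;

class Graphnode {
    // *** fields ***
    private Object data;                      // whatever is stored in the node
    private ArrayList<Graphnode> successors;  // targets of this node's outgoing edges
    // *** methods ***
    ...
}
class Graph {
    // *** fields ***
    private ArrayList<Graphnode> nodes; // each item in the list will be a Graphnode
    // *** methods ***
    ...
}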
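To make the representation concrete, here is a minimal runnable sketch in the same spirit. The class names SketchNode and SketchGraph and the methods addNode, addEdge, and size are hypothetical; they are not part of the class definitions above, which leave the methods unspecified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified counterpart of the Graphnode class.
class SketchNode {
    private final Object data;
    private final List<SketchNode> successors = new ArrayList<>();

    SketchNode(Object data) { this.data = data; }

    Object getData() { return data; }
    List<SketchNode> getSuccessors() { return successors; }
}

// Hypothetical, simplified counterpart of the Graph class.
// It keeps a list of all nodes, so no special "root" node is needed.
class SketchGraph {
    private final List<SketchNode> nodes = new ArrayList<>();

    // Creates a node holding the given data and records it in the graph.
    SketchNode addNode(Object data) {
        SketchNode n = new SketchNode(data);
        nodes.add(n);
        return n;
    }

    // Adds a directed edge from 'from' to 'to'.
    void addEdge(SketchNode from, SketchNode to) {
        from.getSuccessors().add(to);
    }

    int size() { return nodes.size(); }
}

class GraphSketchDemo {
    public static void main(String[] args) {
        SketchGraph g = new SketchGraph();
        SketchNode madison = g.addNode("Madison");
        SketchNode chicago = g.addNode("Chicago");
        g.addEdge(madison, chicago);
        g.addEdge(chicago, chicago);  // an edge may connect a node to itself
        System.out.println(g.size() + " nodes; Madison has "
                + madison.getSuccessors().size() + " successor(s)");
    }
}
```

Because an edge is stored only in its source node's successor list, the graph is directed: adding an edge from Madison to Chicago does not add one from Chicago to Madison.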
Graph Operations
Depth-first Search
Depth-first search can be used to answer many questions about a graph:
Information about which nodes have been visited can be kept in
the nodes themselves (e.g., using a boolean field) or, if the nodes
are numbered from 1 to N, the "visited" information can be
stored in an auxiliary array of booleans of size N.
Below is code for depth-first search, assuming that visited information
is in a node field named "visited", and that each node's successors
are in a List named "successors", and that the Graphnode class
provides the usual get/set methods to access its fields.
Note that this basic depth-first search doesn't actually do anything
except mark nodes as having been visited.
We'll see in the next section how to use variations on this code to
do useful things.
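static void dfs (Graphnode n) {
    // mark n before looking at its successors, so a cycle cannot cause infinite recursion
    n.setVisited( true );
    Iterator it = n.getSuccessors().iterator();
    while (it.hasNext()) {
        Graphnode m = (Graphnode)it.next();
        // only recurse into successors that have not been visited yet
        if (! m.getVisited()) dfs(m);
    }
}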
Here's a picture that illustrates the dfs method.
In this example, node numbers are used to denote the nodes themselves
(i.e., the call dfs(0) really means that the dfs method is called with
a pointer to the node labeled 0).
Two different colors are used to indicate the node currently being visited
and the previously visited node.
Uses for Depth-First Search
Questions 2, 3 and 5 are discussed; the others are left as exercises.
Path Detection
To answer the question, do the following:
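One plausible way to carry this out with depth-first search is sketched below: start with every node unvisited, run a depth-first search from node j, and then check whether node k was visited. The class name PathSketch, the method hasPath, and the array-of-successor-indices representation are assumptions made for illustration.

```java
class PathSketch {
    // Marks every node reachable from n as visited; successors[i] lists node i's successors.
    static void dfs(int n, int[][] successors, boolean[] visited) {
        visited[n] = true;
        for (int m : successors[n]) {
            if (!visited[m]) dfs(m, successors, visited);
        }
    }

    // Returns true iff node k is reachable from node j.
    // Under this sketch's convention, a node always reaches itself.
    static boolean hasPath(int[][] successors, int j, int k) {
        boolean[] visited = new boolean[successors.length];  // all nodes start unvisited
        dfs(j, successors, visited);
        return visited[k];
    }

    public static void main(String[] args) {
        // Assumed edges: 0->1, 1->2, 3->0
        int[][] g = { {1}, {2}, {}, {0} };
        System.out.println(hasPath(g, 0, 2));  // prints true
        System.out.println(hasPath(g, 2, 0));  // prints false
    }
}
```

Since the graph is directed, a path from j to k does not imply a path from k to j, as the two calls in main illustrate.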
Cycle Detection
There are two variations that might be interesting:
instead of the boolean "visited" field we've been using.
Initially, all nodes are marked "unvisited".
When the dfs method is first called for node n, it is marked "inProgress".
Once all of its successors have been processed, it is marked "done".
There is a cyclic path reachable from node n iff some node's successor
is found to be marked "inProgress" during dfs(n).
static boolean hasCycle(Graphnode n) {
    n.setMark( inProgress );
    Iterator it = n.getSuccessors().iterator();
    while (it.hasNext()) {
        Graphnode m = (Graphnode)it.next();
        // a successor whose call is still active closes a cycle
        if (m.getMark() == inProgress) return true;
        if (m.getMark() != done) {
            if (hasCycle(m)) return true;
        }
    }
    // all successors processed without finding a cycle
    n.setMark( done );
    return false;
}
Note that if we want to know whether a graph contains a cycle anywhere (not
just one that is reachable from node n) we might have to call
hasCycle at the "top-level" more than once.
Here's a pseudo-code version
of a method of the Graph class that returns true iff there is a
cycle somewhere in the graph:
public boolean graphHasCycle() {
    mark all nodes unvisited;
    for each node k in the graph {
        if (node k is marked unvisited) {
            if (hasCycle(k)) return true;
        }
    }
    return false;
}
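The two pieces above can be combined into a small runnable sketch. The enum Mark, the nested Node class with public fields, and the build helper are assumptions made to keep the example self-contained; they stand in for the Graphnode accessors used in the code above.

```java
import java.util.ArrayList;
import java.util.List;

class CycleSketch {
    enum Mark { UNVISITED, IN_PROGRESS, DONE }

    // Hypothetical stand-in for Graphnode, with a three-valued mark.
    static class Node {
        Mark mark = Mark.UNVISITED;
        final List<Node> successors = new ArrayList<>();
    }

    // Returns true iff a cycle is reachable from n.
    static boolean hasCycle(Node n) {
        n.mark = Mark.IN_PROGRESS;
        for (Node m : n.successors) {
            if (m.mark == Mark.IN_PROGRESS) return true;  // m's call is still active
            if (m.mark != Mark.DONE && hasCycle(m)) return true;
        }
        n.mark = Mark.DONE;
        return false;
    }

    // Returns true iff there is a cycle anywhere in the graph.
    static boolean graphHasCycle(List<Node> nodes) {
        for (Node k : nodes) k.mark = Mark.UNVISITED;
        for (Node k : nodes) {
            if (k.mark == Mark.UNVISITED && hasCycle(k)) return true;
        }
        return false;
    }

    // Helper: builds nodes from arrays of successor indices.
    static List<Node> build(int[][] successors) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < successors.length; i++) nodes.add(new Node());
        for (int i = 0; i < successors.length; i++) {
            for (int j : successors[i]) nodes.get(i).successors.add(nodes.get(j));
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Node 0 has no edges; the cycle 1->2->1 is not reachable from node 0.
        List<Node> cyclic = build(new int[][] { {}, {2}, {1} });
        System.out.println(graphHasCycle(cyclic));   // prints true

        // 0->1, 0->2, 1->2: node 2 is reached twice, but there is no cycle.
        List<Node> acyclic = build(new int[][] { {1, 2}, {2}, {} });
        System.out.println(graphHasCycle(acyclic));  // prints false
    }
}
```

The first graph shows why more than one top-level call may be needed: a search from node 0 alone would never reach the cycle between nodes 1 and 2.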
Topological Numbering
These 2 situations correspond to the point in
method hasCycle where node n is marked "done" (when it has no
more unvisited successors).
We just need to keep track of the current number.
Below is a method that, given a node n and a number num,
assigns topological numbers to all unvisited nodes reachable from n, starting
with num and working down.
Note that before calling this method for the first time, all nodes should
be marked "unvisited", and that the initial call should pass N (the number of
nodes in the graph) as the 2nd parameter.
static int topNum (Graphnode n, int num) throws CycleException {
    n.setMark( inProgress );
    Iterator it = n.getSuccessors().iterator();
    while (it.hasNext()) {
        Graphnode m = (Graphnode)it.next();
        if (m.getMark() == inProgress) {
            // no topological ordering for a cyclic graph!
            throw new CycleException();
        }
        // number everything reachable from m first, continuing from the current number
        if (m.getMark() != done) num = topNum(m, num);
    }
    // here when n has no more unvisited successors
    n.setMark( done );
    n.setNumber( num );
    return num-1;
}
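The following is a runnable sketch of the same idea for a single starting node, using arrays indexed by node number in place of Graphnode fields. The class name TopNumSketch, the helper numberFrom, and the use of IllegalStateException instead of CycleException are assumptions made so that the example compiles on its own.

```java
class TopNumSketch {
    static final int UNVISITED = 0, IN_PROGRESS = 1, DONE = 2;

    // Assigns numbers to all unvisited nodes reachable from n, starting with
    // num and working down; returns the next number that is still free.
    static int topNum(int n, int num, int[][] successors, int[] mark, int[] number) {
        mark[n] = IN_PROGRESS;
        for (int m : successors[n]) {
            if (mark[m] == IN_PROGRESS) {
                // stand-in for CycleException: a cyclic graph has no topological numbering
                throw new IllegalStateException("graph has a cycle");
            }
            if (mark[m] != DONE) num = topNum(m, num, successors, mark, number);
        }
        // all successors are numbered, so n gets the largest number still free
        mark[n] = DONE;
        number[n] = num;
        return num - 1;
    }

    // Numbers the nodes reachable from 'start'; the initial number is N.
    // Entries for unreachable nodes are left at 0.
    static int[] numberFrom(int start, int[][] successors) {
        int n = successors.length;
        int[] mark = new int[n];    // all UNVISITED initially
        int[] number = new int[n];
        topNum(start, n, successors, mark, number);
        return number;
    }

    public static void main(String[] args) {
        // Assumed edges: 0->1, 0->2, 1->3, 2->3
        int[][] g = { {1, 2}, {3}, {3}, {} };
        int[] number = numberFrom(0, g);
        for (int j = 0; j < g.length; j++) {
            for (int k : g[j]) {
                System.out.println(j + " -> " + k + ": " + number[j] + " < " + number[k]);
            }
        }
    }
}
```

Each line printed by main shows one edge j → k together with the two assigned numbers, so the requirement that j's number be smaller than k's can be checked edge by edge.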
As was the case for cycle detection, we might need several "top-level"
calls to number all nodes in a graph.