Finding Eulerian tours
Last updated
Last updated
Let us focus for now on solving the Eulerian tour problem, solution which can then be extended to address the route inspection problem, thus solving the assembly problem for de Bruijn graphs. First a definition:
Eulerian tour problem. Given a graph G(V, E), find a tour through the graph (traversal that starts and ends at the same node) that visits every edge of the graph exactly once.
This problem was initially analyzed by the mathematician (hence the name) who lived at the time in a city named Königsberg. This city had seven bridges across a river that broke up the city into four different neighborhoods. Euler wondered if he could start a walking tour from his house and return home after traversing all the bridges but without crossing any of them twice. The corresponding graph is shown below.
Euler observed that a necessary condition for solving the problem was for each node to have an even degree (each node has an even number of neighbors). The reasoning is that as you "walk" along the graph, once you enter a node you must have a way to exit the node. Since you can't use an edge twice, you must choose an edge you haven't seen before, thus each edge through which you entered a node must have a corresponding edge through which you can leave, leading to the even degree. If a node has an odd degree, you will either get stuck because you enter the node using the last available edge, thus cannot leave it, or you cannot return back to it because you will no longer have an available edge. As you can tell from the figure above, the graph corresponding to the city of Königsberg has only nodes with odd degrees, thus Euler was not able to enjoy a perfect walk.
The description so far has focused on undirected graphs (where the edges can be traversed in either direction), however the Eulerian tour problem can also be addressed on directed graphs such as de Bruijn graphs. Imagine, for example, that the streets in Königsburg are all one way. Instead of simply requiring that each node have even degree, we must add the additional condition that the number of in-edges is equal to the number of out-edges at each node. Or, in the street metaphor, for every one way street entering a neighborhood there should be a one way street exiting the neighborhood.
Is the condition that all nodes have odd degrees sufficient for finding an Eulerian tour? In the following section we'll describe an algorithm that demonstrates that it is.
The greedy solution to the Eulerian tour problem follows the same strategy you might also employ if walking around Königsberg.
Essentially, this algorithm picks a node that has unvisited edges adjacent to it, then traverses the graph along unvisited edges until no such edges exist. The following panels demonstrate its execution in a simple graph.
Starting with node A in the top left panel in the figure below, we proceed to node C (thickened edge), then E, then F, then return to A. Since A still has unvisited edges, we choose one of them and move to node E, which still has an available edge to D, then we proceed to C, and then B, and then return to A (bottom right panel). At this point A does not have any unvisited edges adjacent to it and we can terminate the algorithm, reporting the full tour: A, C, E, F, E, D, C, B, A.
Note, however, that we got quite lucky. An alternative traversal is shown below. Assume here that we started at node F, followed to A, then E, then returned to F (the first row of images in the figure). At this point, the algorithm is stuck as node F has no more adjacent edges that have not been visited. As described above, the algorithm now looks for a new node that has unvisited edges, for example C, and continues the traversal as highlighted in red in the figure. Now we can proceed from C to A, to B, back to C, then D, E, and back to C again, completing another cycle. We have managed to visit every edge in the graph, exactly once, however we have not been able to do so with one cycle.
It turns out that there's a simple solution to this problem. We can select one of the nodes where the two cycles meet, then "stitch" the two cycles together by swapping the order in which we visit the edges. Take node A for example. In the original tour, we visited this node coming from F, then going to E, along the black tour, then coming from C and proceeding to B in the red tour. We can swap the traversal, by going to B after coming from F, thus breaking the black cycle and proceeding along the red one. Eventually the red cycle will each A again (arriving from C) and now we can return to the black cycle continuing to E, as shown by the thin arrows in the figure below. We have, thus, effectively combined the black and red cycles into a single cycle, thus solving the Eulerian tour problem.
It should be fairly easy to see that the algorithm runs in linear time in terms of the number of edges and nodes in the graph as we only visit each edge and node a constant number of times.
Note that this algorithm works even if the graph is directed, we simply have to take the orientation of each edge in consideration.
The greedy solution proposed above is somewhat dissatisfying as we need to return and fix problems (join cycles) after the initial greedy stage. An elegant solution that avoids this problem uses a spanning tree of the graph to guide the traversal. As a reminder, a spanning tree of a graph is a tree that includes every node in the graph. Computing a spanning tree is computationally easy: we keep adding edges to the tree as long as they don't form cycles, i.e., an edge is added to the tree if at least one of its ends is not already in the tree. This more elegant algorithm proceeds as follows:
Effectively, we use the spanning tree as an escape strategy. We only include spanning tree edges into the tour if no other edges are available, thus ensuring the greedy traversal cannot get stuck at any node until the tour is complete.
An example is shown below. The top left panel shows the spanning tree in thick edges, then the progressive construction of the tour is shown in red. Starting at node F, we select the edge to E as it is the only unvisited edge available that is not part of the spanning tree. At node E our only choice are spanning tree edges, and we select an arbitrary one, proceeding to node C. There we proceed along one of the non-spanning tree edges to D where again our only choice is a spanning tree edge to E. There, again, we have to follow the only unvisited edge remaining to A. At a, we select the unvisited edge that is not part of the spanning tree, returning to C, where we only have one choice remaining, moving to B, then along the only available edge to A and then, finally, returning to F along the last unvisited edge in the graph. In the end, the cycle we have found is: F, E, C, D, E, A, C, B, A, F.
The reason why this algorithm works is that there are no cycles in the graph that are completely built from tree edges or from non-tree edges, thus the greedy algorithm cannot get stuck.
In a directed graph, we will want to create a spanning tree that has all edges pointing towards its root, and start the guided traversal from the root of the tree, ensuring that there will always be an escape path leading back to the starting point.
Is the following graph Eulerian?