Some special strings

When playing with strings, either for fun, or to test and execute algorithms, it is always useful to have some way of creating artificial strings with specific properties. Here are two special strings that have specific properties that make them useful when reasoning about strings (e.g., to find worst case scenarios) or when trying to trip up the code you wrote.

Fibonacci strings

As you can imagine, they are strings inspired by the recursively defined Fibonacci series. Specifically, let us denote by FnF_n the Fibonacci number of order nn. By definition Fn=Fn1+Fn2F_n = F_{n-1} + F_{n-2} for all n>=3n >= 3. F1=1,F2=1F_1 = 1, F_2 = 1. Similarly, we can define FSn=FSn1FSn2FS_n = FS_{n-1} FS_{n-2} to be the Fibonacci string of order nn, where the addition operation is replaced by the string concatenation operation. Also, FS1=b,FS2=aFS_1 = b, FS_2 = a. It should be easy to see that the Fibonacci strings have lengths equal to the Fibonacci numbers of the same order. Here are some examples:

FS1=bFS_1 = b FS2=aFS_2 = a FS3=abFS_3 = ab FS4=abaFS_4 = aba FS5=abaabFS_5 = abaab

\cdots

De Bruijn strings

De Bruijn strings are a type of 'comprehensive' strings. Specifically, for a given alphabet Σ\Sigma and a length kk, a de Bruijn string of order kk contains as substrings all strings of length kk over Σ\Sigma. To see that such strings exist, are compact, and can be constructed we can build a simple graph (called a de Bruijn graph – more on that later). This graph contains as nodes all strings over Σ\Sigma of length k1k-1, and its edges connect nodes that overlap in a suffix-prefix fashion by exactly k2k-2 characters. Specifically, an edge exists for every pair of nodes A,BA, B, where the corresponding strings have the form xαx\alpha, αy\alpha y where α\alpha is a string of length k2k-2 and xx and yy are characters in Σ\Sigma.

It is easy to see that such a graph is Eulerian (each node has equal in-degree and out-degree) and thus we can traverse it in such a way that every edge is seen exactly once. This traversal (which is not unique – rather one can have an exponential number of such traversals) spells out the de Bruijn string of order kk.

De Bruijn graph of order 3 over the alphabet {a, b}. The nodes of the graph are all 2-letter strings: aa, ab, ba, bb. The edges reflect the one-letter overlaps between nodes. Node aa is connected to itself with an edge labeled a, and to node ab with an edge labeled b.Node ab is connected to node bb with an edge labled b and to the node ba with an edge labeled a. Node bb is connected to itself with an edge labeled b and to node ba with an edge labeled a. Node ba is connected to node aa with an edge labeled a and to node ab with an edge labeled b.
De Bruijn graph of order 3 over the alphabet {a, b}. The nodes are all 2-letter strings and the edges represent all possible 1-letter overlaps between the nodes. An Eulerian tour of this graph generates a string that contains all 3-letter strings, i.e., the de Bruijn string of order 3.

Last updated