Recursive Sequences as Linear Transformations

Fibonacci Spirals the Drain

February 02, 2024 · John Peach

This problem comes from Otto K. Bretscher and was posted in the Classical Mathematics Facebook group.

Figure 1. Recursive problem.

We’re working on a new platform for scientific research called eurAIka that combines The Whiteboard for writing notes, equations, and pasting in graphics, The Librarian for AI-assisted literature search, and The Coder where you can write programs. This will be our first attempt at using eurAIka for a Wild Peaches article.

The Fibonacci Sequence

Let’s start with an easier problem, the Fibonacci sequence:

Figure 2. The Fibonacci sequence.

where the recursive relationship is $x_{n+2} = x_{n+1} + x_n$ and $x_0 = x_1 = 1$. We could generalize this to include any sequence composed of multiples of the previous two terms,

$$x_{n+2} = \alpha x_{n+1} + \beta x_{n}.$$

The coefficients are $\alpha = \beta = 1$ for the Fibonacci sequence, and for the problem we're trying to solve, $\alpha = 4$ and $\beta = -7$.

Figure 3. The Fibonacci spiral.

It wouldn't be difficult to write a function in a computer language that calculates every element of the sequence up to the one we're interested in. To see how the value of $x_1 = b$ changes the outcome, though, we'd have to run the code over many test cases.
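Here's a minimal sketch of that brute-force approach in Julia (the function name and keyword arguments are our own, not part of the original problem):

```julia
# Iterate x_{n+2} = α x_{n+1} + β x_n and return x_N,
# starting from x_0 = a and x_1 = b.
function recurse(N, α, β; a = 1, b = 1)
    xprev, x = big(a), big(b)      # BigInt so large terms don't overflow
    for _ in 2:N
        xprev, x = x, α * x + β * xprev
    end
    return N == 0 ? xprev : x
end

recurse(14, 1, 1)            # 610, the Fibonacci number F_15
recurse(14, 4, -7, b = 0)    # one term of the Bretscher sequence with b = 0
```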

Another way to get the value of $x_{2024}$ in a single step uses linear algebra. Let's call $\overrightarrow{v_{0}}$ the vector composed of the first two entries of the sequence, so

$$\overrightarrow{v_{0}} = \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix}.$$

The next step in the sequence puts $x_2$ in the first position and moves $x_1$ down, giving

$$\overrightarrow{v_{1}} = \begin{bmatrix} x_{2} \\ x_{1} \end{bmatrix} = \begin{bmatrix} \alpha x_{1} + \beta x_{0} \\ x_{1} \end{bmatrix}.$$

The entries of this new vector are linear combinations of the previous two entries of the sequence, so we can define a matrix $A$ that multiplies $\overrightarrow{v_{0}}$ to get

$$\overrightarrow{v_{1}} = A \overrightarrow{v_{0}} = \begin{bmatrix} \alpha & \beta \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix}.$$

The recursion step says that $\overrightarrow{v_2} = A \overrightarrow{v_1}$ and so on, or in terms of the original vector, $\overrightarrow{v_2} = A(A \overrightarrow{v_0}) = A^2 \overrightarrow{v_0}$. For the Fibonacci sequence, $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. For example, we can calculate $F_{15} = 610$ by computing $A^{14} = \begin{bmatrix} 610 & 377 \\ 377 & 233 \end{bmatrix}$, which places $F_{15}$ in the $[1,1]$ location. A convenient trick to extract just the upper left entry using Julia is to write this as (A^14)[1,1].
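Spelled out, with the expected values in the comments:

```julia
A = [1 1; 1 0]
A^14            # [610 377; 377 233]
(A^14)[1, 1]    # 610 = F_15
```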

If $Ax = \lambda x$ for a nonzero vector $x$, then $x$ is called an eigenvector and the constant $\lambda$ is an eigenvalue. For the Fibonacci matrix $A$ there are two eigenvalues, $\lambda = \frac{1 \pm \sqrt{5}}{2}$ (the positive one is the golden ratio, $\phi$), and if you take the ratio of $F_{n+1}$ to $F_n$ you'll see that it converges to $\lambda_1 = \phi$. The other eigenvalue is the negative reciprocal of the first, $\lambda_2 = -1/\phi$, the limit of $-F_n / F_{n+1}$.
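A quick numerical check with Julia's LinearAlgebra standard library:

```julia
using LinearAlgebra

A = [1 1; 1 0]
eigvals(A)          # (1 - √5)/2 ≈ -0.618 and (1 + √5)/2 ≈ 1.618
(1 + sqrt(5)) / 2   # the golden ratio φ ≈ 1.6180339887
610 / 377           # F_15 / F_14 ≈ 1.6180371, already close to φ
```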

The matrix $A$ can be decomposed into $A = P D P^T$, where $P$ is the $2 \times 2$ array of eigenvectors and $D$ is a $2 \times 2$ diagonal array with $\lambda_1$ and $\lambda_2$ on the diagonal. Because $A$ is symmetric, its eigenvectors are orthogonal, and normalizing them to unit length makes $P$ orthogonal, meaning that $P P^T = I$, the identity matrix. With this decomposition, powers of $A$ can be written as

$$A^n = P D^n P^T.$$

Since $D$ is diagonal, $D^n$ is also diagonal with $\lambda_1^n$ and $\lambda_2^n$ on the diagonal, making matrix powers easy to calculate. Now $A^{14}$ can be found from

$$A^{14} = P D^{14} P^T = \begin{bmatrix} 610.0 & 377.0 \\ 377.0 & 233.0 \end{bmatrix}.$$

This is numerically unstable because $D^n$ will quickly overflow or underflow as powers of $\lambda_1$ and $\lambda_2$ approach either zero or $\pm \infty$, but this idea will become theoretically useful for our problem.
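For a power as small as 14, though, the roundoff is negligible, and the decomposition is easy to check numerically; here's a sketch using Julia's eigen:

```julia
using LinearAlgebra

A = [1.0 1.0; 1.0 0.0]
E = eigen(Symmetric(A))             # A is symmetric, so P is orthogonal
P, D = E.vectors, Diagonal(E.values)
P * D^14 * P'                       # ≈ [610 377; 377 233], up to roundoff
P * P' ≈ I(2)                       # true
```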

As we saw above, the ratios of successive values of the Fibonacci sequence approach the golden ratio, $\phi$. If you plotted the points $(F_n, F_{n+1})$ you would see that they all lie nearly on the line $y = \phi x$, and the fit gets better the farther out you go. The points $(F_n, F_{n+1})$ are the vectors which, when multiplied by $A$, give the next point or vector in the sequence.

Geometrically, matrix multiplication transforms vectors through rotation and scaling. What we're seeing here is that in the decomposition $P D^n P^T v_{n}$, the first multiplication $P^T v_n$ only performs a rotation since $P$ is orthogonal. The scaling occurs with $D^n (P^T v_n)$, and multiplication by $P$ restores the direction.

As an example, let $v_5 = [5, 3]^T$ so

$$\begin{aligned} P^T v_{5} &= \begin{bmatrix} 0.0767 \\ -5.830 \end{bmatrix} \\ D P^T v_{5} &= \begin{bmatrix} -0.0474 \\ -9.434 \end{bmatrix} \\ P D P^T v_{5} &= \begin{bmatrix} 8 \\ 5 \end{bmatrix} \end{aligned}$$
Figure 4. Rotation and scaling effects of matrix multiplication.

Notice that $v_5$ and $P^T v_5$ lie on the same circle, as do the points $D P^T v_5$ and $P D P^T v_5$, showing that $P$ is a rotation, while the scaling between $F_5$ and $F_6$ is multiplication by $\lambda_1$.
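The same numbers can be reproduced in Julia. The signs of the intermediate vectors depend on the sign convention the eigensolver chooses for the columns of $P$, so they may differ from Figure 4, but the final product is the same:

```julia
using LinearAlgebra

A = [1.0 1.0; 1.0 0.0]
E = eigen(Symmetric(A))
P, D = E.vectors, Diagonal(E.values)
v5 = [5.0, 3.0]

r = P' * v5          # rotated: same length as v5
s = D * r            # scaled by the eigenvalues
P * s                # ≈ [8.0, 5.0], the next Fibonacci pair
norm(v5) ≈ norm(r)   # true, so P' only rotates (or reflects)
```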

The Bretscher Sequence

Now let's apply this same technique to the Bretscher sequence $x_{n+2} = 4 x_{n+1} - 7 x_n$. Define the transition matrix,

$$B = \begin{bmatrix} 4 & -7 \\ 1 & 0 \end{bmatrix}$$

which has the characteristic equation derived from the determinant of $B - \lambda I$:

$$\begin{aligned} 0 &= (4 - \lambda)(-\lambda) + 7 \\ &= \lambda^2 - 4 \lambda + 7 \\ \Rightarrow \lambda &= 2 \pm \sqrt{3}\, i. \end{aligned}$$

The eigenvectors are found by solving $(B - \lambda I) u = 0$ for a nonzero $u$. This gives

$$\begin{bmatrix} 4 - (2 + \sqrt{3}i) & -7 \\ 1 & -(2 + \sqrt{3}i) \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = 0.$$

Expanding this gives the system of equations,

$$\begin{aligned} (2 - \sqrt{3}i)u_1 - 7 u_2 &= 0 \\ u_1 - (2 + \sqrt{3}i) u_2 &= 0 \end{aligned}$$

giving eigenvectors $v = \begin{bmatrix} 2 \pm \sqrt{3}i \\ 1 \end{bmatrix}$. The inverse of

$$P = \begin{bmatrix} 2+\sqrt{3}i & 2-\sqrt{3}i \\ 1 & 1 \end{bmatrix}$$

is

$$P^{-1} = \begin{bmatrix} -\frac{i}{2 \sqrt{3}} & \frac{1}{2} + \frac{i}{\sqrt{3}} \\ \frac{i}{2 \sqrt{3}} & \frac{1}{2} - \frac{i}{\sqrt{3}} \end{bmatrix}.$$

Let $D$ be the diagonal matrix formed from the eigenvalues,

$$D = \begin{bmatrix} 2+\sqrt{3}i & 0 \\ 0 & 2-\sqrt{3}i \end{bmatrix}$$

and then verify that $B = PDP^{-1}$ and $P P^{-1} = I$, the identity matrix.
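Here is that verification carried out numerically in Julia, using the (unnormalized) eigenvector columns written above:

```julia
using LinearAlgebra

B = [4 -7; 1 0]
λ1, λ2 = 2 + sqrt(3) * im, 2 - sqrt(3) * im
P = [λ1 λ2; 1 1]
D = Diagonal([λ1, λ2])

P * D * inv(P) ≈ B    # true
P * inv(P) ≈ I(2)     # true
inv(P)                # matches the P⁻¹ written above
```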

For the Fibonacci sequence, the eigenvectors formed an orthonormal basis, so we could quickly calculate powers of the matrix $A$ because

$$A^n = PDP^{-1}\, PDP^{-1} \cdots PDP^{-1} = P D I \cdots I D P^{-1} = P D^n P^{-1}.$$

The Bretscher matrix also has this property, except that the eigenvectors aren't orthogonal. Still, the eigenvectors found above are linearly independent and form a basis, so we can still calculate $B^n = P D^n P^{-1}$ for any power $n$. This wouldn't be so bad for some small values of $n$, but we'll need another method to answer the question of how to maximize $x_{2024}$.
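Before moving on, here's a quick check for a small power, a self-contained Julia sketch comparing the exact integer power of $B$ with the eigendecomposition route:

```julia
using LinearAlgebra

B = [4 -7; 1 0]
λ1, λ2 = 2 + sqrt(3) * im, 2 - sqrt(3) * im
P, D = [λ1 λ2; 1 1], Diagonal([λ1, λ2])

n = 10
B^n                               # exact integer matrix power
round.(real.(P * D^n * inv(P)))   # the same entries, up to roundoff
```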

Otto Bretscher says in response to his original question,

The roots of the characteristic polynomial are 2 ± i√3 = √7exp[±i*arctan(√3/2)]. We can write x(n) as a linear combination of (2 + i√3)^n and (2 - i√3)^n and bring this expression into real form. While these computations are straightforward, they are somewhat tedious, perhaps best left to a computing device; the answer provided by WolframAlpha is attached. Since the coefficient of b turns out to be negative for n = 2024, we maximize x(2024) by letting b = 0.

Figure 5. The characteristic polynomial in Wolfram Alpha.

PARI/GP can calculate numbers with very high precision, so let's give it a try. The sequence begins $x_0 = 1, x_1 = b$, so the next element is found from

$$B \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix} = \begin{bmatrix} 4 & -7 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} b \\ 1 \end{bmatrix} = \begin{bmatrix} 4b - 7 \\ b \end{bmatrix}.$$

The fourth element is the top entry of $B^2 \begin{bmatrix} b \\ 1 \end{bmatrix}$, so the $2024^{\text{th}}$ element comes from multiplying by $B^{2022}$. Enter the matrix $B$ in PARI/GP as

B = [4, -7; 1, 0]
Figure 6. The PARI/GP solution.

So, how do you choose $b \in [0, 2\pi]$ to maximize $x_{2024}$? You don't need to know the value of the enormous number PARI just computed; all you need is its sign. The top entry of $B^{2022} \begin{bmatrix} b \\ 1 \end{bmatrix}$ is a linear function of $b$ whose coefficient is the entry in the $[1,1]$ position of this little $2 \times 2$ matrix. Since that entry is negative, the $b$ that maximizes $x_{2024}$ is zero.
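The same sign check can be done in Julia with exact BigInt arithmetic, since the entries of $B^{2022}$ are far too large for 64-bit integers:

```julia
B = BigInt[4 -7; 1 0]
coeff_of_b = (B^2022)[1, 1]   # the coefficient of b in the answer
sign(coeff_of_b)              # -1, so b = 0 maximizes x_2024
ndigits(coeff_of_b)           # hundreds of digits, but only the sign matters
```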

Spiraling to the Solution: Where Mathematics Meets Beauty

This exploration of the Bretscher sequence demonstrates how linear algebra can be useful in solving seemingly complex recursive problems. We found that $x_{2024}$ is a linear function of $b$ whose coefficient is a single entry of $B^{2022}$, and since that coefficient is negative, $b = 0$ gives the maximum.

If you're interested in exploring further, try other coefficients $\alpha$ and $\beta$ to see how the eigenvalues control growth and decay, or look at how the complex eigenvalues of $B$ make the Bretscher sequence spiral, as the title suggests.

The methods demonstrated here extend beyond mathematical curiosity; they have applications in fields ranging from financial modeling to quantum mechanics. Understanding how to analyze recursive sequences through linear transformations provides a powerful tool for tackling complex problems in science and engineering.

Figure 7. Recursive image of this problem in eurAIka.

Definitions

Linear combination: If a vector $v_3 = \alpha v_1 + \beta v_2$ for some constants $\alpha$ and $\beta$, then $v_3$ is a linear combination of the two vectors $v_1$ and $v_2$.

Eigenvector, eigenvalue: If a nonzero vector $v$ satisfies $Av = \lambda v$ for some matrix $A$ and constant $\lambda$, then $v$ is an eigenvector of $A$, and $\lambda$ is the associated eigenvalue.

Orthogonal vectors: Vectors are orthogonal if the angle between them is $90 \degree$ in two or three dimensions. A more general definition, which holds in any number of dimensions, is that the dot product of the two vectors is zero. To calculate the dot product, multiply each element of one vector by the corresponding element of the second vector and add up the results. For example, $[1, 2] \cdot [2, -1] = 1 \cdot 2 + 2 \cdot (-1) = 0$, so those two vectors are orthogonal.

Orthogonal matrix: A square matrix with real-valued elements whose inverse is equal to its transpose.

Diagonal matrix: All elements not on the main diagonal are zero.

Transition matrix: A matrix that transforms vectors from one basis to another. Also called a change of coordinates matrix.

Characteristic polynomial: The polynomial arising from $\det (A - \lambda I) = 0$, where $\det$ is the determinant and $I$ is the identity matrix.


Code for this article

Problem2024.ipynb - JupyterLab notebook in Julia

Bretscher_sequence.ipynb - JupyterLab notebook using Wolfram Language 13.2

Fibonacci.html - GeoGebra worksheet to demonstrate the rotation and scaling effects of linear transformations (open in a browser)

Software

References and further reading

Image credits