Understanding the Levinson-Durbin Algorithm

Author

Nick Appleton

Published

July 30, 2023

Introduction

I’ve recently been playing with this algorithm and wanted to do a write-up of my understanding on how it works. Better references certainly exist, but I wanted to write down how my brain thinks about the algorithm if for no other reason than for my own future reference.

I’m going to focus on the algorithm and explain how to build it - it won’t be the most efficient version, but will hopefully demonstrate the concepts.

Toeplitz systems

The Levinson-Durbin algorithm finds the solution to a Toeplitz system which has a non-zero main diagonal:

Tx=y(t0t1t2tn1t1t0t1t2t1t0t2t1t(n1)t2t1t0)x=y

Toeplitz systems arise frequently in signal processing problems and, whenever I see them, I think “convolution”. For example, given a discrete convolution:

y[n]=ka[k]b[nk]

Where a[k] is zero outside of the range 0kNa and b[k] is zero outside of the range 0kNb, we can write the equation as a Toeplitz matrix vector product:

(y0y1y2y3)=(a0000a1a000a2a1a00a3a2a1a0)(b0b1b2b3)

Toeplitz matrix multiplications can be implemented using convolutions!

The algorithm

Start by solving a 2x2 Toeplitz system using Cramer’s rule:

(t0t1t1t0)(x0x1)=(y0y1)

(x0x1)=1t02t1t1(t0t1t1t0)(y0y1)

To move to higher order systems, let’s define some new quantities:

  • Tn is the n-by-n Topelitz matrix that is taken as the first n rows and columns of a larger Toeplitz matrix. In the above 2x2 example:

T2=(t0t1t1t0)T1=t0

  • un is the n-row column vector that when multiplied by Tn produces e1 (i.e. the column vector which has a 1 in the top row and 0 in all other rows). See u, think upper - it is orthogonal to all but the top row of Tn.

u2=1t02t1t1(t0t1)

  • ln is the n-row column vector that when multiplied by Tn produces en (i.e. the column vector with a 1 in the bottom row and 0 in all other rows). See l, think lower - it is orthogonal to all but the bottom row of Tn.

l2=1t02t1t1(t1t0)

  • yn is the top n rows of the column vector y being solved.
  • xn is the top n rows of the column vector that satisfies Tnxn = yn.

Let’s go ahead and define a T3 matrix now:

T3=(t0t1t2t1t0t1t2t1t0)=(T2t2t1t2t1t0)=(t0t1t2t1t2T2)

Notice that Tn can be expressed in two different by equivalent ways in terms of Tn1 with two new variables tn1 and t(n1) inserted into the matrix (all the other new elements are already present in Tn1). All Tn matricies contain t0 on the diagonal.

From here we derive three new quantities y^3, u^3 and l^3, and which are based on x2, u2 and l2 that we defined earlier.

y^3=T3(x20)=(T2t2t1t2t1t0)(x20)=(y2r3)u^3=T3(u20)=(T2t2t1t2t1t0)(u20)=(e1α3)l^3=T3(0l2)=(t0t1t2t1t2T2)(0l2)=(β3e2)

u^3 is almost e1 except for the annoying probably-non-zero entry α3 in the bottom row and l^3 is almost e3 except for a probably-non-zero entry β3 at the top row. These scalars have values of:

α3=(t2t1)u2β3=(t1t2)l2

It is a straight-forward to show that for n, u^n and l^n have values of:

u^n=Tn(un10)=(e1αn)l^n=Tn(0ln1)=(βnen1)

These values are useless independently, but because of linearity in matrix multiplication, we can use u^n and l^n to obtain un and ln:

Tn(γ(un10)+δ(0ln1))=γ(e1αn)+δ(βnen1)

To make the RHS e1, we must solve γ+δβn=1 and γαn+δ=0:

γ=11βnαnδ=αn1βnαnun=γ(un10)+δ(0ln1)=11βnαn((un10)αn(0ln1))

To make the RHS en, we must solve γ+δβn=0 and γαn+δ=1:

δ=11βnαnγ=βn1βnαnln=δ(0ln1)+γ(un10)=11βnαn((0ln1)βn(un10))

Because Tnln=en, we know that it is an orthogonal component i.e. any multiple of it may be added to xn and it will only influence the value of yn1 i.e. the scalar final row of Tnxn.

y^3=T3(x20)=(y2r3)y3=T3((x20)+(y2r3)l3)=T3x3

That is, use l3 to remove the erroneous quantity r3 from y^3 and set it to the desired value i.e. the last element of y3 which is y2. x3 is now:

x3=(x20)+(y2r3)l3

The algorithm continues to replace the l, u and x vectors until the complete solution exists in x.

C Implementation

The following implementation follows the naming of quantities in the above working. l, u and x are first initialised to the quantities l2, u2 and x2. The loop then updates and extends these arrays until the solution is found. u and x get a new value appended to the end for each iteration. l gets a new element pre-pended per iteration which removes the need for a reverse-access read.

/* Solves Tx = y for x where x and y are vectors and T is Toeplitz.
 *
 * `ord` is the number of rows and columns in the matrix.
 * `t` contains the toeplitz diagonal coefficients from the bottom left to the
 *   top right (it must contain 2*ord-1 elements). t[ord-1] must be non-zero.
 * `y` contains the desired output (it must contain ord elements).
 * `x` will contain the computed solution. */
static void levindurb(double *x, const double *t, const double *y, unsigned ord) {
    const int diag_offset = ord-1;
    double    u[16];
    double    l[16];
    double    tmp;
    int       i, j;

    assert(ord < 16);

    tmp      = 1.0 / (t[diag_offset] * t[diag_offset] - t[diag_offset+1] * t[diag_offset-1]);
    u[0]     = tmp * t[diag_offset];
    u[1]     = tmp * -t[diag_offset-1];
    l[ord-2] = tmp * -t[diag_offset+1];
    l[ord-1] = u[0];
    x[0]     = y[0] * u[0] + y[1] * l[ord-2];
    x[1]     = y[0] * u[1] + y[1] * l[ord-1];
    for (i = 2; i < ord; i++) {
        double alpha, beta, r, scale;
        double *p_l = &(l[ord-i]);
        for (j = 0, alpha = 0.0, beta = 0.0, r = 0.0; j < i; j++) {
            r     += x[j]   * t[diag_offset-i+j];
            alpha += u[j]   * t[diag_offset-i+j];
            beta  += p_l[j] * t[diag_offset+1+j];
        }
        scale = 1.0 / (1.0 - beta * alpha);
        for (j = 0; j <= i; j++) {
            double ut = (j < i) ? u[j]     : 0.0;
            double lt = (j > 0) ? p_l[j-1] : 0.0;
            double xt = (j < i) ? x[j]     : 0.0;
            u[j]      = scale * (ut - alpha * lt);
            p_l[j-1]  = scale * (lt - beta * ut);
            x[j]      = xt + (y[i] - r) * p_l[j-1];
        }
    }
}