Everyday DSP for Programmers: The Discrete Fourier Transform

Last week we covered how to measure the frequency of a periodic signal with a single dominant frequency. This week we'll cover a way to measure the full frequency spectrum of a signal that can have any number of frequencies in it, up to the Nyquist frequency, of course. This tool is called the Discrete Fourier Transform (DFT), and we can derive it from the basic concepts of sine waves, signal transforms, and averaging.

The DFT may seem like a complicated, confusing thing at first, but fundamentally, it's actually fairly straightforward, if a bit tedious to calculate by hand. We'll explore the DFT by building it from the ground up using a signal with some known frequencies in it as a starting point. That way, we'll have a way to check that the results we're getting are reasonable. As for the tedious parts, we can write code to take care of that.

The Signal

We'll use a signal that's interesting enough to keep us going and easily decomposed into sine waves so that we know what frequencies we should get, but not so simple that it's a trivial conversion. The signal is the Fourier Series for a square wave that was covered in the post on transforms, and repeated here:

f(t) = 4 ∑(n=1,3,5,…) sin(nπt)/n

We can also calculate the sampled points of this signal with a JavaScript function like this:

function calcSquareWave(amplitude, n_max, t) {
  var y = 0;
  for (var n = 1; n <= n_max; n += 2) {
    y += amplitude*Math.sin(Math.PI*n*t)/n;
  }
  return y;
}

We'll limit the series to n=9, so if we use this function to draw a square wave, the resulting signal looks like this (click to play and pause these graphs):

Of course, we already know the frequencies and amplitudes of the sine waves that make up this signal, but we want to figure out a way to calculate those values from the sample points in the signal itself instead of using our prior knowledge as a crutch. We won't have that luxury for an arbitrary input signal.
We also can't use the same technique that we used to measure frequency before, as we'll only get the value of one of the frequencies in the signal, and we won't get the amplitude from that method. We need to try something different.

A First Step: Multiplying by a Sine Wave

It turns out that if we multiply the signal by a sine wave with an amplitude of one, we can actually pull out the part of the signal that looks like that sine wave. Then we can take the average of all of those multiplied samples to get the amplitude of the part of the signal that looks like the sine wave. This process is called correlating the signal with a sine wave. Another way to think about it is that you are taking the dot product of the array of signal values and an array of sine values, then dividing by the number of values. Here's what it looks like in code:

function CorrelateSine(signal, f) {
  var scaled_sig = signal.map(function(x, t) { return x*Math.sin(2*Math.PI*f*t) })
  return 2*scaled_sig.reduce(function(corr, x) { return corr + x }) / signal.length
}

It's just a map-reduce function, and notice that I slipped in a correction factor of 2 in the return value. This correction factor is needed because the correlation is actually split between this frequency and a higher frequency, but we'll get to that later. Visually, the correlation looks like this for the sine wave with a frequency corresponding to n=1 of our input signal:

The scale of this graph is slightly different than the last one, so that I could squeeze two plots into it. Notice how the scaled signal has the same wiggles in it as the original signal, but it looks quite a bit like the absolute value of a sine wave. The average is fixed at a value of 1, even when the graph is scrolling, because the sine wave is locked in phase with the signal. In a real application, this setup isn't likely to be possible because it can be quite difficult to determine what the phase of each sine wave is that makes up the input signal before doing the correlation. If we look at all of the possible combinations of this sine wave with the input signal by scrolling the input signal past the sine wave, the average varies between 1 and -1:

Depending on where the signal is relative to the sine wave, we're going to get different values for the correlation. That's not going to work, but no worries, we can fix it.

Use Your Imagination

What we need to do is correlate the signal with another sine wave so that we combine the two correlations in a way that always results in the correct amplitude. If you recall your basic trigonometric identities, the pertinent one for this problem is

sin²(t) + cos²(t) = 1

Since we can combine the result of the sine with the result of a cosine using the above equation and always get 1, cosine is our sine wave of choice. The next graph shows what we get when we find the correlation of the signal with both a sine and a cosine wave at the same frequency as before:

Notice that in addition to averages for both the sine and cosine, there is a calculated magnitude of those two averages that is fixed at a value of 1. This magnitude calculation can be thought of as the identity above, but it can also be thought of as the magnitude of a value on the real-imaginary plane. The cosine value is on the real axis, and the sine value is on the imaginary axis. The magnitude of this imaginary number would be calculated as

|cos(t) - i sin(t)| = √(cos²(t) + sin²(t)) = 1

This is why the DFT has real and imaginary parts, because the cosine correlation is represented as the real part and the sine correlation is represented as the imaginary part.

We can also verify that a frequency that should not be present in the signal will result in magnitude of zero:

As expected, the magnitude is fixed at zero for a frequency that is half of the lowest frequency present in the signal. Now that we've tried this correlation operation for a couple of frequencies, what about the rest of them?

Now for the Rest of the Frequencies

It turns out that the frequency shown in the last graph is the lowest frequency, other than zero, that we can measure in the DFT for this signal because one full period of the sine and cosine waves will fit within the number of samples that we're using to do the DFT. If we denote N as the number of samples in the DFT, and f_s as the sample rate, then the highest frequency we can measure is

f_MAX = f_s(N-1)/N

If we take one more step up in frequency from here, we reach f_s, and every sample of the sine and cosine waves will be the same value, making them equivalent to DC signals. As a result, the range of frequencies we can measure with an N-point DFT is

f_k = f_s·k/N for k = [0..N-1]

If we calculate all of the frequencies for our example signal, it looks like this:

The real component of the frequencies is in green and the imaginary component is in red. The frequencies that are present in the signal have non-zero real and imaginary components, and they correspond nicely to the frequencies of the sine waves that we added together to make the input signal. But notice that there is some symmetry in the frequencies. The real parts are mirrored across the center frequency, and the imaginary parts are mirrored and inverted across the center frequency. This is a property of DFTs because the center frequency is also the Nyquist frequency for the sample rate of the input signal.

Frequencies for the sine and cosine correlations that are above the Nyquist frequency will result in the same values as the equivalent frequency below the Nyquist frequency (with the sine values being inverted). You can think of the signal's energy being split between pairs of frequencies, and this is why the magnitude of any given frequency in the DFT is half of what it should be from the original signal. However, there is only one DC signal, so it's magnitude is the full value of the DC magnitude in the signal.

Knowing this property, we can correct for it. We can also stop after calculating frequencies up to the Nyquist frequency since anything higher than that doesn't provide any additional information. Here's what the magnitudes look like for the frequencies less than the Nyquist frequency:

And there we have it. The frequencies of 1, 3, 5, 7, and 9 from the original signal are all present and at the correct magnitudes, and the magnitudes are independent of the phase of the input signal. The final DFT calculation that we came up with can be represented mathematically as

f_k = ∑[n=0..N-1] s[n](cos(2πn·k/N) - i sin(2πn·k/N))/N
for k = [0..N/2-1]

Where s[n] is the nth sample of the input signal, and all other variables are as previously defined. The corresponding JavaScript function for finding the frequency magnitudes would be

function DftMagnitudes(signal) {
  var N = signal.length
  var mags = new Array(N)
  for (var k = 0; k < N/2; n++ ) {
    var scaled_sin = signal.map(function(x, n) { return x*Math.sin(2*Math.PI*n*k/N) })
    var imag = scaled_sin.reduce(function(corr, x) { return corr + x })
    var scaled_cos = signal.map(function(x, n) { return x*Math.cos(2*Math.PI*n*k/N) })
    var real = scaled_cos.reduce(function(corr, x) { return corr + x })
    mags[k] = 2*Math.sqrt(real*real + imag*imag)/N
  }
  return mags
}

The separate real and imaginary parts of the DFT could also be returned, but the magnitudes of the frequencies are normally the most useful. Sometimes the phase of each frequency is also of interest, and this can be computed from the real and imaginary parts by calculating arctan(imag/real).

So the DFT, a seemingly complicated operation of transforming a time domain signal into a frequency domain signal, actually consists of no more than multiplying the signal by sine waves at particular frequencies, averaging the results, and optionally calculating some magnitudes. This barely scratches the surface of the world of DFTs, so next week we'll look at the faster but restricted version of the DFT, the FFT, and how some common signals behave with the DFT.

Other DSP posts:
Basic Signals
Transforms
Sampling Theory
Averaging
Step Response of Averaging
Signal Variance
Edge Detection
Signal Envelopes
Frequency Measurement
The Discrete Fourier Transform
The DFT in Use
Spectral Peak Detection
FIR Filters
Frequency Detection
DC and Impulsive Noise Removal