Everyday DSP for Programmers: Averaging

After finishing up the introductory concepts of DSP last week, we've now moved firmly into the digital, sampled domain, and we're going to start taking a look at the programming algorithms that are commonly used in DSP. Fundamentally, digital signal processing has a lot in common with statistics. When we start analyzing signals, some of the first tools we look at using are based in statistical analysis: maximum and minimum values, averages, and variances.

Maximums and minimums mostly come into play when bounding and normalizing signals. If you want to find the amplitude of a signal, then you'll scan through the samples to find the extreme values. It's fairly straightforward, and there's no need to delve into them until they're needed.

Averages are another story, though. Not all averages are created equal, and different types of averages are more useful in different situations. We'll take a look at five types of averages in this post, including the full, block, moving, exponential, and filter averages. To ground the discussion a bit, let's apply these different averages to a real-world dataset. Gas prices are something that most people watch pretty closely, and it happens to be a series with some interesting characteristics, so let's go with that. I'll use the weekly U.S. regular reformulated retail gasoline prices from the last twenty years, made available online through the EIA.gov website.


Full Average


The most basic of the averages is the one everyone learns in the first week of any introductory statistics course. For lack of a better name, I'll call it the full average because you use the full signal to compute it. To do so, you add up the values of all of the samples and divide by the number of samples, like so:

mean = ∑[i=1..n] s[i]/n

where s[i] is the ith sample, and n is the number of samples. We can easily define a function to compute the full average. Here's one example in JavaScript:
function Mean(samples) {
  return samples.reduce(function(a, b) { return a + b; }) / samples.length
}
Note that to do this calculation, we need to have all of the samples available at the time the function is called. This is not always the case, especially if we're dealing with a real-time system, but when we do have all samples of a signal available, it will return the average value of the signal. Let's see what we get when we apply this function to the historical gas prices:


Click on the graph to see the average gas price over the last twenty years. It turns out to be about $2.30, but the graph tells us a couple more things. During the first decade, from 1996 to 2006, the gas price was nearly always below the 20-year average, and during the second decade, it was nearly always above the 20-year average. Also, the gas price dipped below the average during the Great Recession and again briefly early this year.

In addition to knowing the average value of a signal, the full average can be used to get a sense of how a signal behaves over time. Some signals will oscillate around their average, some signals will pass through their average only once, and still other signals will cross their average near the beginning and end of their domains. Finally, notice that the full average removes all frequency content and leaves the DC value of the signal. That means the full average is one way (the most direct way) to find the DC part of the signal if you need to use it or remove it.

Block Average

If you have an application where you need an average, but you don't have access to all of the samples at once, or you don't want to average over the entire signal, you can do the averaging in chunks of samples instead. Mathematically, this operation looks nearly the same, except you end up creating an array of means where each mean is the average of a continuous block of samples, and each block is of equal size.

mean[j] = ∑[i=0..k] s[j-i]/k

Where j is the jth block average and k is the number of samples per block. Here's the equivalent code:
function BlockAverage(samples, blockSize) {
  var means = []
  for (var i = 0; i < samples.length; i += blockSize) {
    means.push(Mean(samples.slice(i, i + blockSize)))
  }
  return means
}
This code is making a couple of assumptions. First, it assumes that all of the samples are already available. If samples were being acquired in real-time, they would likely be added to a buffer and the Mean() function would be called each time the buffer was full. Second, the length of samples is assumed to be a multiple of the block size. In production code, there would have to be a guarantee that this requirement was met, or there would have to be a check on the inputs so that error conditions could be handled appropriately.

Coming back to gas prices, suppose we want to see what the average gas price was for each year of the data we have. To find out, we could do a block average for each 52-week range of data points. This doesn't work out exactly to one mean per year, but it's close enough. Here's what it looks like (click to run the block averaging):


The yearly averages show that gas prices rose fairly steadily from 1998 to 2007, jumped in 2008, dove in 2009, recovered, flattened out, and are now falling again. Block averaging has removed some of the weekly and monthly noise and exposed the underlying long-term trends in the data. It has also decimated the data, and by that I don't mean that it destroyed the data (although it did eliminate some of it). In DSP, decimate means to reduce the sample rate of the data, and here the sample rate was reduced from weekly to yearly. As a result, the higher frequency movement in the data is removed, and the lower frequency movement remains.

Moving Average

What if we don't want to decimate the data, but we still want to remove some of the high frequency content of the signal? We can use a similar operation where we're still averaging a block of samples, but we move the block one sample at a time, taking an average each time. This is also referred to as a sliding-window average because it looks like a window is sliding over the data while averaging it. Mathematically, the moving average can be expressed as

mean[j] = ∑[i=j..(j+k)] s[i]/k

Where j is the jth sample and k is the block size. Notice that with the moving average, we will run out of samples near the end of the data, so we have to do something different either at the beginning or the end to fit in the averaging without a full block of samples. If the application is acquiring data in real-time, then we need to decide how to handle the startup behavior—either by holding off the averaging until the block of samples is full or computing partial averages only with the available samples. Here is what the code would look like for the latter case:
function MovingAverage(samples, blockSize) {
  var means = []
  for (var i = 1; i <= samples.length; i++) {
    var start = (i < blockSize) ? 0 : i-blockSize
    means.push(Mean(samples.slice(start, i)))
  }
  return means
}
This code makes the same assumptions about available samples and block size as the block average code did, and notice how we need to be careful about the start and end arguments for slice(). The sliced samples does not include the end value, so we need to pass it a value one more than what we want, and the easiest way to do that is to increment the start and end of the for loop by 1.

A real-time application would also commonly handle incoming samples by putting them in a circular buffer, which is a regular buffer that keeps track of the insertion point for new data. Each new data sample overwrites the oldest sample, and when the insertion point reaches the end of the buffer, it wraps around to the beginning again. Let's see how this moving average looks with the gas price data:


The averaged curve has similar characteristics to the block averaged version, but it's not decimated. It's a smoothed curve of the gas price data with much of the weekly volatility removed so that it's easier to see the underlying trends in the data. This smoothing function can be controlled by the block size. A shorter block will retain more of the volatility while a longer block will have a stronger smoothing effect.

This block size was kept the same as the block average at 52 weeks. Looking at the smoothed curve, it's clear that the 2008-2010 peak and trough were an overshoot and undershoot due to a shock to the system, and it will be interesting to see what happens to gas prices in the near future because of the sharp drop this year.

Another characteristic of the moving average curve is that the movements are delayed from the original data. That delay happens because older data is being combined with newer data, so recent changes in the data don't show up strongly in the means right away. It's an intrinsic behavior of averaging, and the delay is half the length of the block of samples that are being averaged.

Exponential Average

In some applications, especially embedded applications, either memory or processing time (or both) are limited, so there aren't enough resources to do a moving average with a large block size. In these cases it's still possible to retain most of the advantages of the moving average, but do the calculation in much less space and time, with an exponential average. This type of average only needs the current sample and the previous average to compute the new average:

mean[i] = w·s[i] + (1-w)·mean[i-1]

Where w is the weighting of the current sample and is a value between 0 and 1. The larger the weighting, the stronger the impact of the current sample on the mean. The code to compute the exponential average is pretty simple:
function ExpAvg(samples, w) {
  var means = [samples[0]]
  for (var i = 1; i < samples.length; i++) {
    means.push(w*samples[i] + (1-w)*means[i-1])
  }
  return means
}
This code assumes that the weighting is between 0 and 1. If it's not, the results will get pretty weird. Normally, you wouldn't see an exponential average laid out this way because samples would be processed as they were coming in. All you would have is an updating average with the calculation that's inside the push() function in the code above.
avg = w*newSample + (1-w)*avg
An exponential average with a weighting of 1/26 will have a similar smoothing function to a moving average with a block size of 52, so here is what that looks like with the gas price data:


The box that scans across the signal has shrunk to a line because the exponential average only needs the current sample and previous average, while the moving average needed an entire block of previous samples. The end result is not exactly the same as the moving average. Notice that the undershoot is not as pronounced in 2009, and a bit more of the volatility is present in other areas of the curve. The exponential average also has less of a delay associated with it because it's giving more weight to the most recent sample, and the relative weight of older samples falls off exponentially, hence the name. The behavior of this average is not as nice as a moving average, but when you need fast calculations that don't require large memory buffers, the exponential average is an excellent tool.

Sinc Filter

We saw with the moving average and the exponential average that the output signal was smoothed in comparison to the input signal by removing some of the high frequency content of the signal. We can refine this behavior by making a modification to the moving average function.

Consider that for each calculated mean, the moving average is actually multiplying each sample in the block by 1/k, where k is the block size, before adding the samples together. We can change the behavior of the average function by multiplying each sample by a different value, depending on its position in the block. The operation of sweeping this block of multipliers across a signal, summing the multiplied values at each step, is called a convolution. The block of multipliers is called a Finite Impulse Response (FIR) Filter, and the multipliers themselves are called filter taps.

We've stumbled upon FIR filters without even realizing it. Now, what should we replace the taps of the moving average with? Let's try the sinc function.

h[i] = sin(2πi/k) / (2πi/k)

Where h[i] is the ith tap and k is the length of the filter. We want to be careful how we calculate the sinc function for the taps, because if we simply use integer values for i/k, then the sinc function will reduce to the impulse function, and we won't get any filtering out of it. It will just return the signal values back with a delay added. Here's a function that generates filter taps of length k using a sinc function:
function GenFilter(k) {
  var taps = []
  for (var i = 0.5-k/2.0; i < k/2.0; i+=1.0) {
    taps.push(Math.sin(2*Math.PI*i/k) / (2*Math.PI*i/k))
  }
  var gain = taps.reduce(function(a, b) { return a + b; })
  taps = taps.map(function(a) { return a / gain })
  return taps
}
This function assumes an even number of taps and centers the taps around zero. If you wanted to use a filter with an odd number of taps, you would need to adjust the centering and force a tap value of 1 when i=0. Also notice that the gain of the filter is calculated by summing all of the taps, and then each tap is divided by this gain. That makes sure that the output of the filter will not amplify the signal. If you wanted to amplify the signal, you could add any gain you wanted by scaling the taps.

Applying the filter to the signal requires only a slight modification to the moving average calculation. Mathematically,

y[j] = ∑[i=0..(k-1)] s[j-i]·h[i]

And programmatically,
function Filter(ary, taps) {
  if (ary.length < taps.length) {
    var gain = taps.slice(0, ary.length).reduce(function(a, b) { return a + b; })
    taps = taps.map(function(a) { return a / gain })
  }
  return ary.reduce(function(a, b, i) { return a + b*taps[i] }, 0)
}
What this function is doing in the if statement is managing the startup behavior of the filter by taking a slice of the taps and scaling them up so that the smaller slice of taps has unity gain. The main calculation is all done in the reduce() function at the end. This Filter() function is a modification of the Mean() function in the first section of this article, so it is generating one sample every time it is called. To sweep over an entire signal, you would call it inside a function like the MovingAverage() function. Otherwise, if samples were coming in real-time, the sample would be added to a buffer, and the Filter() function would be called with that buffer.

Let's see what this sinc filter looks like when used on the gas price data:


As you can see, the sinc filter does a different kind of smoothing than the moving average, leaving the filtered data with a much more sinusoidal characteristic than before. Such filters have a very clean smoothing effect on the data that's especially desirable in situations where you want a good boundary between frequencies that are left in the signal and those frequencies that are removed. Since filters are a more general form of a moving average, they add the same delay to the filtered output, so that should be taken into account.

The sinc function is not the only function that can be used for the filter taps in place of the constant value for the moving average. There are many other functions that can be used for filter taps, and they all have their own frequency response characteristics. We'll look at some of them later on, after we cover the DFT. Next week, we'll expound on averaging a bit more and also take a look at signal variance.


Other DSP posts:
Basic Signals
Transforms
Sampling Theory
Averaging
Step Response of Averaging
Signal Variance
Edge Detection
Signal Envelopes
Frequency Measurement
The Discrete Fourier Transform
The DFT in Use
Spectral Peak Detection
FIR Filters
Frequency Detection
DC and Impulsive Noise Removal

No comments:

Post a Comment