What factors influence the quality of a spectral correlation function estimate?
The two non-parametric spectral-correlation estimators we’ve looked at so far, the frequency-smoothing and time-smoothing methods, require the choice of key estimator parameters. These are the total duration of the processed data block and the spectral resolution.
For the frequency-smoothing method (FSM), an FFT with length equal to the data-block length is required, and the spectral resolution is equal to the width of the smoothing function. For the time-smoothing method (TSM), multiple FFTs of a shorter length are required, and the frequency resolution is the reciprocal of that FFT length (in normalized frequency units).
The choice for the block length is partially guided by practical concerns, such as computational cost and whether the signal is persistent or transient in nature, and partially by the desire to obtain a reliable (low-variance) spectral correlation estimate. The choice for the frequency (spectral) resolution is typically guided by the desire for a reliable estimate.
Cross correlation functions can be normalized to create correlation coefficients. The spectral correlation function is a cross correlation and its correlation coefficient is called the coherence.
In this post I introduce the spectral coherence function, or just coherence. It deserves its own post because the coherence is a useful detection statistic for blindly determining significant cycle frequencies of arbitrary data records. See the posts on the strip spectral correlation analyzer and the FFT accumulation method for examples.
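Let’s start with reviewing the standard correlation coefficient $\rho$ defined for two random variables $X$ and $Y$ as

$$\rho = \frac{E[(X - m_X)(Y - m_Y)]}{\sigma_X \sigma_Y},$$

where $m_X$ and $m_Y$ are the mean values of $X$ and $Y$, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$. That is,

$$m_X = E[X], \quad m_Y = E[Y], \quad \sigma_X^2 = E[(X - m_X)^2], \quad \sigma_Y^2 = E[(Y - m_Y)^2].$$

So the correlation coefficient is the covariance between $X$ and $Y$ divided by the geometric mean of the variances of $X$ and $Y$.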
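To make the definition concrete, here is a minimal sketch that estimates the correlation coefficient from samples, with sample averages standing in for the expectations. The function name `correlation_coefficient` and the three test sequences are my own illustrative choices, not something taken from the CSP Blog code.

```python
import math
import random


def correlation_coefficient(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    m_x = sum(x) / n
    m_y = sum(y) / n
    cov = sum((a - m_x) * (b - m_y) for a, b in zip(x, y)) / n
    var_x = sum((a - m_x) ** 2 for a in x) / n
    var_y = sum((b - m_y) ** 2 for b in y) / n
    # sqrt(var_x * var_y) is the geometric mean of the two variances.
    return cov / math.sqrt(var_x * var_y)


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
noise = [random.gauss(0.0, 1.0) for _ in range(5000)]
y = [2.0 * a + 1.0 for a in x]          # exact linear function of x
z = [a + b for a, b in zip(x, noise)]   # x plus independent noise

rho_xy = correlation_coefficient(x, y)      # linearly dependent pair
rho_xz = correlation_coefficient(x, z)      # partially correlated pair
rho_xn = correlation_coefficient(x, noise)  # independent pair
print(rho_xy, rho_xz, rho_xn)
```

The normalization is what makes the three numbers comparable: the linearly dependent pair lands at (essentially) one, the independent pair near zero, and the partially correlated pair in between, regardless of the scale of the inputs.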
Why do we need or care about higher-order cyclostationarity (HOCS)? Because second-order cyclostationarity is insufficient for our signal-processing needs in some important cases.
To contrast with HOCS, we’ll refer to second-order parameters such as the cyclic autocorrelation and the spectral correlation function as parameters of second-order cyclostationarity (SOCS).
The first question we might ask is: why do we care about HOCS? One answer is that SOCS does not provide all the statistical information about a signal that we might need to perform some signal-processing task. There are two main limitations of SOCS that drive us to HOCS.
Spectral correlation in CSP means that distinct narrowband spectral components of a signal are correlated: they contain either identical information or some degree of redundant information.
Spectral correlation is perhaps the most widely used characterization of the cyclostationarity property. The main reason is that the computational efficiency of the FFT can be harnessed to characterize the cyclostationarity of a given signal or data set in an efficient manner. And not just efficient, but with a reasonable total computational cost, so that one doesn’t have to wait too long for the result.
Just as the normal power spectrum is actually the power spectral density, or more accurately, the spectral density of time-averaged power (or simply the variance when the mean is zero), the spectral correlation function is the spectral density of time-averaged correlation (covariance). What does this mean? Consider the following schematic showing two narrowband spectral components of an arbitrary signal:
Figure 1. Illustration of the concept of spectral correlation. The time series represented by the narrowband spectral components centered at $f - \alpha/2$ and $f + \alpha/2$ are downconverted to zero frequency and their correlation is measured. When $\alpha = 0$, the result is the power spectral density function, otherwise it is referred to as the spectral correlation function. It is non-zero only for a countable set of numbers $\{\alpha\}$, which are equal to the frequencies of sine waves that can be generated by quadratically transforming the data.
Let’s define narrowband spectral component to mean the output of a bandpass filter applied to a signal, where the bandwidth of the filter is much smaller than the bandwidth of the signal.
The sequence of shaded rectangles on the left is meant to imply a time series corresponding to the output of a bandpass filter centered at one of the two frequencies, with a narrow bandwidth. Similarly, the sequence of shaded rectangles on the right implies a time series corresponding to the output of a bandpass filter centered at the other frequency, also with a narrow bandwidth.
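The idea of correlating two narrowband spectral components can be sketched numerically. The following is a rough illustration, assuming a frequency-smoothed cyclic periodogram as the estimator and a rectangular-pulse BPSK signal with bit rate 1/10 as the input. The helper name `smoothed_cyclic_periodogram`, the data length, and the smoothing width are all assumptions made for this example, not the blog's own code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Rectangular-pulse BPSK: 10 samples per bit (bit rate 1/10), no carrier offset.
samples_per_bit = 10
num_bits = 1000
bits = rng.choice([-1.0, 1.0], size=num_bits)
x = np.repeat(bits, samples_per_bit)    # 10000 samples, FFT bin spacing 1/10000

X = np.fft.fft(x)


def smoothed_cyclic_periodogram(X, f_bin, alpha_bin, half_width):
    """Average X[f + alpha/2] * conj(X[f - alpha/2]) / N over bins near f_bin.

    alpha_bin must be even so that alpha/2 falls exactly on an FFT bin.
    """
    N = X.size
    k = f_bin + np.arange(-half_width, half_width + 1)
    upper = X[(k + alpha_bin // 2) % N]   # component centered at f + alpha/2
    lower = X[(k - alpha_bin // 2) % N]   # component centered at f - alpha/2
    return np.mean(upper * np.conj(lower)) / N


half_width = 100    # smooth over 201 bins, a width of about 0.02
psd_0 = smoothed_cyclic_periodogram(X, 0, 0, half_width)        # separation 0
scf_bit = smoothed_cyclic_periodogram(X, 0, 1000, half_width)   # separation 0.1
scf_off = smoothed_cyclic_periodogram(X, 0, 1300, half_width)   # separation 0.13
print(abs(psd_0), abs(scf_bit), abs(scf_off))
```

With a separation of zero the estimate is just a smoothed power spectrum. With a separation equal to the bit rate the two components are strongly correlated, while at a separation that is not a harmonic of the bit rate the measured correlation should be small and shrink further as the smoothing width or data length grows.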
We’ll use this simple textbook signal throughout the CSP Blog to illustrate and tie together all the different aspects of CSP.
To test the correctness of various CSP estimators, we need a sampled signal with known cyclostationary parameters. Additionally, the signal should be easy to create and understand. A good candidate for this kind of signal is the binary phase-shift keyed (BPSK) signal with rectangular pulse function.
PSK signals with rectangular pulse functions have infinite bandwidth because the signal bandwidth is determined by the Fourier transform of the pulse, which is a sinc() function for the rectangular pulse. So the rectangular pulse is not terribly practical: infinite bandwidth is bad for other users of the spectrum. However, it is easy to generate, and its statistical properties are known.
So let’s jump in. The baseband BPSK signal is simply a sequence of binary ($\pm 1$) symbols convolved with the rectangular pulse. The MATLAB script make_rect_bpsk.m does this and produces the following plot:
Figure 1. Time-domain plot of a baseband (not yet modulated by a carrier) rectangular-pulse BPSK signal with bit rate 1/10.
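For readers without MATLAB, here is a minimal Python sketch of the same construction: a train of random binary symbols convolved with a rectangular pulse that is one bit long. It is not a port of make_rect_bpsk.m. The variable names, the number of bits, and the assumption of independent, equally likely symbols are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

samples_per_bit = 10    # bit rate 1/10 in normalized frequency units
num_bits = 50

# Binary (+1/-1) symbols, assumed independent and equally likely.
symbols = rng.choice([-1.0, 1.0], size=num_bits)

# Place each symbol at the start of its bit interval (an impulse train) ...
impulses = np.zeros(num_bits * samples_per_bit)
impulses[::samples_per_bit] = symbols

# ... and convolve with a unit-height rectangular pulse that is one bit long.
pulse = np.ones(samples_per_bit)
baseband = np.convolve(impulses, pulse)[: impulses.size]
```

Because the pulse is exactly one bit long, the convolution just holds each symbol value for ten samples, so `np.repeat(symbols, samples_per_bit)` gives the same result. The convolution form is the one that carries over to other pulse shapes.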
The signal alternates between amplitudes of +1 and -1 randomly. After frequency shifting and adding white Gaussian noise, we obtain the power spectrum estimate:
Figure 2. Power spectrum estimate for a simulated rectangular-pulse BPSK signal in noise. The signal power is unity, or 0 dB, and the noise power is 1/10, or -10 dB. The bit rate is 1/10 and the carrier offset frequency is 0.05. Note that the nulls (minima) of the signal spectrum are at $0.05 \pm k/10$ for $k = 1, 2, \ldots$, or harmonics of the bit rate offset by the carrier.
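A power spectrum estimate along these lines can be sketched in Python as follows. This is only an illustration: it assumes complex white Gaussian noise and a simple average of periodograms over non-overlapping blocks, which may differ from the method used to produce the figure, and the block length and number of blocks are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

samples_per_bit = 10    # bit rate 1/10
carrier = 0.05          # carrier offset in normalized frequency
noise_power = 0.1       # -10 dB relative to the unit-power signal
block_len = 1000        # FFT length, so the bin spacing is 1/1000
num_blocks = 100

n = np.arange(block_len * num_blocks)
bits = rng.choice([-1.0, 1.0], size=n.size // samples_per_bit)
baseband = np.repeat(bits, samples_per_bit)

# Frequency-shift to the carrier offset and add complex white Gaussian noise.
signal = baseband * np.exp(2j * np.pi * carrier * n)
noise = np.sqrt(noise_power / 2) * (
    rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size)
)
x = signal + noise

# Average the periodograms of the non-overlapping blocks.
blocks = x.reshape(num_blocks, block_len)
psd = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / block_len
freqs = np.fft.fftfreq(block_len)

peak_bin = 50           # f = 0.05, the carrier offset
null_bin = 150          # f = 0.05 + 1/10, the first null above the carrier
print(freqs[peak_bin], 10 * np.log10(psd[peak_bin]))
print(freqs[null_bin], 10 * np.log10(psd[null_bin]))
```

Under these assumptions the estimate at the carrier offset should sit well above the estimate at the first null, where the signal contributes almost nothing and the value is set mostly by the noise floor.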
The power spectrum plot shows why the rectangular-pulse BPSK signal is not popular in practice. The range of frequencies for which the signal possesses non-zero average power is infinite, so it will interfere with signals “nearby” in frequency. However, it is a good signal for us to use as a test input in all of our CSP algorithms and estimators.
The MATLAB script that creates the BPSK signal and the plots above is here. It is an m-file but I’ve stored it in a .doc file due to WordPress limitations I can’t yet get around.