# The Spectral Coherence Function

Cross-correlation functions can be normalized to create correlation coefficients. The spectral correlation function is a cross-correlation, and its correlation coefficient is called the coherence.

In this post I introduce the spectral coherence function, or just coherence. It deserves its own post because the coherence is a useful detection statistic for blindly determining significant cycle frequencies of arbitrary data records. See the posts on the strip spectral correlation analyzer and the FFT accumulation method for examples.

Let’s start with reviewing the standard correlation coefficient $\rho$ defined for two random variables $X$ and $Y$ as $\rho = \displaystyle \frac{E[(X - m_X)(Y - m_Y)]}{\sigma_X \sigma_Y}, \hfill (1)$

where $m_X$ and $m_Y$ are the mean values of $X$ and $Y$, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$. That is, $m_X = E[X] \hfill (2)$ $m_Y = E[Y] \hfill (3)$ $\sigma_X^2 = E[(X-m_X)^2] \hfill (4)$ $\sigma_Y^2 = E[(Y-m_Y)^2] \hfill (5)$

So the correlation coefficient is the covariance between $X$ and $Y$ divided by the geometric mean of the variances of $X$ and $Y$.
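To make the normalization concrete, here is a minimal sketch (the function name and data are invented for illustration) that estimates Eq. (1) from samples: the sample covariance divided by the geometric mean of the sample variances.

```python
import math
import random

def corr_coeff(x, y):
    """Sample correlation coefficient: the covariance of x and y divided
    by the geometric mean of their variances, as in Eq. (1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(10000)]
y = [a + 0.5 * random.gauss(0.0, 1.0) for a in x]  # y = x plus independent noise

print(corr_coeff(x, x))  # a variable against itself gives 1
print(corr_coeff(x, y))  # near the theoretical 1/sqrt(1.25) ~= 0.894
```

The second print illustrates the interpretation: the shared component of $x$ and $y$ sets the covariance, while the added noise inflates $\sigma_Y$ and pulls the coefficient below one.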

Now consider the spectral correlation function, $S_x^\alpha(f) = \displaystyle \lim_{T\rightarrow\infty} \frac{1}{T} E[ X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)], \hfill (6)$

where the expectation operator $E[\cdot]$ can be either a time average or the ensemble average used in a stochastic-process formulation. For each finite $T$, we can consider the quantity $\displaystyle \frac{1}{T} E[X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)], \hfill (7)$

which is the correlation between the two random variables $\displaystyle \frac{1}{T^{1/2}} X_T(t, f+\alpha/2)$ and $\displaystyle \frac{1}{T^{1/2}} X_T(t, f-\alpha/2)$. To form the correlation coefficient, we need the variances for these two random variables, $\sigma_1^2(T) = E[\displaystyle\frac{1}{T} \left| X_T(t, f+\alpha/2) \right|^2] \hfill (8)$

and $\sigma_2^2(T) = E[\displaystyle\frac{1}{T} \left| X_T(t, f-\alpha/2) \right|^2]. \hfill (9)$

Here we are assuming that the two variables have zero means, which is always true when there is no additive sine-wave component in the data with frequency $f\pm \alpha/2$.

For finite $T$, then, we can form the correlation coefficient $\rho(T) = \displaystyle \frac{\frac{1}{T} E[X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)]}{[\sigma_1^2(T) \sigma_2^2(T)]^{1/2}}. \hfill (10)$

Now, as $T\rightarrow \infty$, the numerator of $\rho(T)$ converges to the spectral correlation function $S_x^\alpha(f)$ and $\displaystyle \lim_{T\rightarrow\infty} \sigma_1^2(T) = S_x^0(f+\alpha/2) \hfill (11)$

and $\displaystyle \lim_{T\rightarrow\infty} \sigma_2^2(T) = S_x^0(f-\alpha/2). \hfill (12)$

If the limit of the quotient exists, then the correlation coefficient is given by $\rho = C_x^\alpha(f) = \displaystyle\frac{S_x^\alpha(f)}{[S_x^0(f+\alpha/2)S_x^0(f-\alpha/2)]^{1/2}}. \hfill (13)$

We call this function the spectral coherence. A similar argument holds for the conjugate spectral coherence, $C_{x^*}^\alpha(f) = \displaystyle\frac{S_{x^*}^\alpha(f)}{[S_x^0(f+\alpha/2)S_x^0(\alpha/2-f)]^{1/2}}, \hfill (14)$

which is a normalization of the conjugate spectral correlation function. Since the coherence is a valid correlation coefficient, its magnitude is less than or equal to one. And since the involved random variables are, in general, complex-valued, so are the spectral correlation function and the coherence. Therefore, the coherence lies in the closed unit disk in the complex plane.
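As a sanity check on the normalization in Eq. (13), here is a toy sketch (the shared-component model is invented for illustration, and the sample average over trials stands in for the expectation) showing that the coherence magnitude of two partially correlated complex variables lands at the predicted value and cannot exceed one.

```python
import math
import random

random.seed(1)

def cgauss():
    """Unit-variance circularly-symmetric complex Gaussian sample."""
    return complex(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) / math.sqrt(2.0)

# Two partially correlated "spectral components," standing in for the
# down- and up-shifted transforms in Eq. (6); the shared term models
# their correlation, the independent terms their uncorrelated parts.
num = 0j
den1 = 0.0
den2 = 0.0
for _ in range(20000):
    shared = cgauss()
    u = shared + 0.5 * cgauss()
    v = shared + 0.5 * cgauss()
    num += u * v.conjugate()     # numerator: cross correlation
    den1 += abs(u) ** 2          # denominator variances
    den2 += abs(v) ** 2

coherence = num / math.sqrt(den1 * den2)
print(abs(coherence))  # near E|shared|^2 / 1.25 = 0.8, and never above 1
```

Here $E[uv^*] = 1$ and $E|u|^2 = E|v|^2 = 1.25$, so the coherence magnitude should settle near $1/1.25 = 0.8$.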

Here is an FSM-based estimate of the coherence for our rectangular-pulse BPSK signal:

Figure 1. Estimated spectral coherence magnitudes for a rectangular-pulse BPSK signal with a bit rate of 100 kHz and a carrier frequency offset of 50 kHz.

The data-block length is $65536$ samples and the frequency resolution (width of $g(f)$ in the FSM) is set to $0.01$ (one percent of the sampling rate, which here is $1.0$) for both the numerator spectral correlation function and the denominator PSDs.

In practice, of course, the numerator and denominator of the coherence are estimates corresponding to finite-duration data blocks. The tricky part of estimating the coherence is choosing a good estimator for the denominator PSDs $S_x^0(f\pm\alpha/2)$. If the PSDs are not well resolved, or contain zeros, the coherence quotient can be numerically unstable or erroneous.
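One way to contain that instability is to zero out the coherence wherever the denominator is too small to trust. The function name, threshold, and sample values below are illustrative assumptions, not the FSM's actual implementation.

```python
import math

def guarded_coherence(scf, psd_plus, psd_minus, eps=1e-8):
    """Form the coherence quotient of Eq. (13) bin by bin, emitting zero
    wherever the geometric-mean denominator is too small to trust."""
    out = []
    for s, p1, p2 in zip(scf, psd_plus, psd_minus):
        denom = math.sqrt(max(p1, 0.0) * max(p2, 0.0))
        out.append(s / denom if denom > eps else 0.0)
    return out

# Illustrative values only: the last bin mimics an out-of-band frequency
# where both shifted PSD estimates vanish (the 0/0 hazard).
scf = [0.9, 0.4, 0.0]
psd_plus = [1.0, 0.5, 0.0]
psd_minus = [1.0, 0.8, 0.0]
print(guarded_coherence(scf, psd_plus, psd_minus))  # last bin forced to 0
```

Outputting zero for untrustworthy bins is conservative: it suppresses the spurious large coherence values that a near-zero denominator would otherwise produce.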


## 17 thoughts on “The Spectral Coherence Function”

1. Chen says:

I cannot help asking you how to estimate the denominator PSDs. I suppose that the FAM- or SSCA-based SCF estimates have been obtained. Then I think an FFT algorithm is a must to keep the computational complexity manageable when estimating the PSDs. I have tried a lot, but I cannot work out a detailed procedure for estimating the PSDs.

1. Chad Spooner says:

Did you look at the discussion of coherence and PSD estimates in the FAM post? Also be sure to look through the comments of that post and the SSCA post; there is relevant material there too.

2. Clint says:

If the PSDs are not well resolved, or contain zeros, the coherence quotient can be numerically unstable or erroneous.

I’m just wondering what your thoughts are on how to automatically decide whether a PSD is “well resolved” or not. I’m guessing that it depends on the application requirements, but do you have any rules of thumb?

If I’m analyzing recorded signals of various lengths, how do I decide if they’re too short or otherwise might lead to low-confidence analysis results? On the flip side, if I’m recording a signal to be analyzed, how do I decide between a long recording that uses all my storage space (just to be safe) or a shorter recording that would be sufficient?

Thanks for any thoughts!

Clint

1. Chad Spooner says:

I’m just wondering what your thoughts are on how to automatically decide whether a PSD is “well resolved” or not. I’m guessing that it depends on the application requirements, but do you have any rules of thumb?

I don’t think it will be possible to automatically decide whether a PSD estimate is well-resolved. You’d have to know the true PSD of the signals in the processed data in advance. But there are spectrum features that are inherently unresolvable, such as unit-step functions and additive sine-wave components. They possess features that have spectral widths of zero, and so any spectrum estimate that possesses a finite effective spectral resolution cannot resolve them. For a step-function-like spectrum (such as a square-root raised-cosine QAM signal with zero roll-off), the edges of the spectrum cannot be perfectly estimated. For a sine-wave component, the spectrum for the frequencies near the sine-wave frequency will inevitably reflect the tone’s energy, not just the noise or other continuous-spectrum signal components that lie there.

You might consider building into a power spectrum estimator the ability to detect spectral edges or corners or the like, perhaps using a derivative or similar. Derivatives are tricky when dealing with noisy or random signals though. If you could detect such occurrences in your spectrum, you could then discount high coherence estimates that involve the spectral components near the detected edges/corners.

how do I decide if they’re too short or otherwise might lead to low-confidence analysis results?

For a noiseless random signal with cycle frequency $\alpha$, you’ll want to process a data block with length that is many multiples of $1/\alpha$ to ensure you can see that periodicity. For a noisy signal, though, you also need to process enough data to overcome the effects of noise on the spectral correlation estimate. So there is no easy answer. I capture many wideband signals and RF scenes, and am supplied with such by others as well, and I have plenty of storage, since it is cheap and readily available. So I generally try to store long scenes, where the length in seconds greatly exceeds the reciprocal of the smallest non-zero cycle frequency of interest. That turns out to be a small number of seconds, typically.
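The rule of thumb above can be sketched numerically. The multiple of $1/\alpha$ used here is an illustrative assumption, not a prescription from the post.

```python
def min_block_seconds(min_cycle_freq_hz, multiples=100):
    """Rule-of-thumb record length: many multiples of 1/alpha for the
    smallest nonzero cycle frequency of interest. The default multiple
    count is an illustrative choice."""
    return multiples / min_cycle_freq_hz

print(min_block_seconds(1000.0))  # 0.1 s suffices for alpha_min = 1 kHz
print(min_block_seconds(50.0))    # 2.0 s: "a small number of seconds"
```

Noisy data typically pushes the required length beyond this periodicity-based floor, since the averaging must also beat down the noise in the spectral correlation estimate.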

3. philbar6 says:

I’m trying to implement the spectral coherence for the task of estimating the symbol rate of a low roll-off-factor signal.
If the pulse shape is bandlimited (e.g., SRRC with roll-off = 0 and bandwidth B), there is a 0/0 problem when computing the coherence function, especially for $f_0 >B-\frac{1}{2alpha}$ in $\alpha S^\alpha(f_0)$.
What do you do to avoid 0/0? For in-band singularity points, I just set those NaN points to 0. For out-of-band zeros, I padded with a small value (1e-10). But this will amplify candidate cycle frequencies $\hat{\alpha} > B$, since $S^0(f) = 0$ when $f>B$.

1. philbar6 says:

It’s the first time I’ve used LaTeX in a comment. The first equation should be “Especially for $|f_0| >B-\frac{\alpha}{2}$ in $S^\alpha(f_0)$.”

1. Chad Spooner says:

The coherence is undefined whenever the denominator is zero, which can happen only when you process a noiseless signal. Are you sure you want to process noiseless signals?

4. Clint says:

Here we are assuming that the two variables have zero means, which is true when there is no additive sine-wave component in the data with frequency $f\pm \alpha/2$

What are your thoughts for when you do have additive sine wave components (e.g. pilot tones)? Of course they can make it easier to detect the signal, but it seems that they would confound the detection of cycle frequencies.

I seem to remember reading somewhere that it may be desirable to remove the additive sinusoid components from the PSD (by filtering the computed PSD?), but I’m not seeing it now.

Thanks!

1. Chad Spooner says:

I don’t think it is a good idea to process data containing finite-strength additive sine-wave components with a CSP estimator such as the SSCA, FAM, FSM, etc. Detecting them and removing them will help avoid false alarms due to the unresolvability of the spectrum when the signal contains sine-wave components. There is no advantage to such CSP processing anyway: if you can detect the sine wave, you can estimate its parameters, and nothing that the SCF or cyclic cumulants will tell you about the sine wave is more information than that. I probably said as much, as you suggest, in other comments or posts, but I can’t find the spot at the moment.

You’ll need to do more than process the PSD. You should try to detect the sine-wave components and remove them from the data. There are many ways to detect sine waves, but a good non-parametric way is to just use the Fourier transform and some kind of thresholding. But that’s not really CSP, so I don’t talk about it much on the Blog. It would be a reasonable topic for a Signal Processing ToolKit post, though.
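As a rough sketch of that non-parametric approach (the direct DFT, the median-based threshold, and the factor of 10 are all illustrative choices, not a recommendation from the post), one might flag bins that stand far above the typical bin magnitude:

```python
import cmath
import random

def detect_tones(x, factor=10.0):
    """Flag DFT bins whose magnitude exceeds a multiple of the median bin
    magnitude -- a crude stand-in for 'Fourier transform plus thresholding'."""
    n = len(x)
    mags = []
    for k in range(n):
        # direct DFT (O(n^2)); an FFT would be used in practice
        bin_k = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        mags.append(abs(bin_k))
    median = sorted(mags)[n // 2]
    return [k for k, m in enumerate(mags) if m > factor * median]

# A strong tone at bin 5 in weak complex noise.
random.seed(2)
n = 128
x = [cmath.exp(2j * cmath.pi * 5 * t / n)
     + complex(random.gauss(0.0, 0.01), random.gauss(0.0, 0.01))
     for t in range(n)]
print(detect_tones(x))  # [5]
```

Once the tone frequencies are known, the corresponding sinusoids can be estimated and subtracted from the data before any CSP processing.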

5. Mike says:

Thanks for sharing your CSP knowledge, Chad. It seems I’m overlooking an important detail or two in this topic. First, must expected values be taken at any point?

Second, in the denominators of Eqs. (13) and (14), shifting by $\alpha/2$ and zero-filling guarantees division by 0. Currently, I do not take expected values; I simply divide the BPSK signal’s SCF by the shifted PSDs and replace any division-by-zero in the result with 0.

My non-conjugate spectral coherence graph looks mostly like yours but doesn’t roll off nearly as much as alpha approaches 1. At $\alpha=1$, your non-conjugate spectral coherence value seems to be about 0.4. Mine is about 0.8. (I don’t see a way to attach pictures or I would do that.) Thanks for any pointers to my misunderstandings!

1. Chad Spooner says:

Thanks for checking out the CSP Blog, Mike, and the comments!

First, must expected values be taken at any point?

No. Expectations can only be taken when there is an ensemble. This is almost always a fictitious mathematical ensemble. In rare cases one might have an actual ensemble at hand. The CSP Blog is about signal processing, and so any averaging we do in the context of estimation is time (e.g., TSM) or frequency (e.g., FSM) averaging. Here we are concerned with estimating the spectral correlation function, coherence function, cyclic autocorrelation function, cyclic cumulants, etc., using only a single sampled-data signal (“time-series”).

Second, in the denominators of Eqs. (13) and (14), shifting by $\alpha/2$ and zero-filling guarantees division by 0.

Yes, indeed. In DSP, we only have the Fourier transform over the normalized frequency interval $[-0.5, 0.5)$, and so we just have to live with the fact that $X(f+\alpha/2)X^*(f-\alpha/2)$ has increasingly small overlap as $\alpha$ increases, and similarly for a PSD estimate. And the dividing-by-zero problem is even worse than that, for it is perfectly possible for a PSD to have zeros. So you must avoid outputting a non-zero coherence estimate for any $(f, \alpha)$ pair for which the coherence denominator is zero.

That explains the appearance of my coherence plot in the post. As the cycle frequency $\alpha$ increases in magnitude, the overlap between the two shifted transforms decreases. In continuous-time mathematical analysis, this doesn’t happen. The coherence is perfectly well defined for all $f$ and $\alpha$, except for the case where the denominator PSDs have a zero.
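A tiny sketch of that shrinking overlap (my own illustration, not code from the post): in normalized discrete-time frequency, the band over which both shifted transforms exist has width $1 - |\alpha|$.

```python
def overlap_fraction(alpha):
    """Fraction of the normalized band [-0.5, 0.5) where both shifted
    transforms X(f + alpha/2) and X(f - alpha/2) are defined, so the
    discrete-time coherence denominator can be nonzero."""
    return max(0.0, 1.0 - abs(alpha))

for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, overlap_fraction(alpha))  # shrinks linearly to zero
```

At $|\alpha| = 1$ the overlap vanishes entirely, which is consistent with the roll-off visible in the coherence plot as the cycle frequency approaches the sampling rate.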

As for why your coherence is not decaying as rapidly as mine, what are the signal and noise power levels?

1. Mike says:

Thanks, Chad, for your helpful reply; it seems I’m in better shape than I thought! I converted your BPSK sample code to Python as a starting point. Barring bugs (!), I create the BPSK signal from random +1/-1 symbols (each duplicated to samples-per-symbol), mix to the carrier, then add noise to achieve 15 dB S/N.

1. Mike says:

And I now see that if I drop to 10 dB S/N it falls much more in line with yours! I’ll get a grasp of the basics yet. 🙂

6. Mike says:

As a short follow up to my questions, for my results described in the previous note I used the SCF FSM subroutine that I used to generate correct looking results for previous blog lessons.

7. Rob says:

I am trying to use the SCohF to blindly detect the symbol rate of a set of signals.

I am using the code from J. Antoni here; I do have my own code for the symbol-rate estimation, and I have read your critiques of the algorithm, but I wanted to be sure that what I observe is not due to a bug in my code (yeah, pretty low self-confidence ;)).
The window size for the STFT is 16 -> frequency resolution = 18.75 kHz in the example below.

For clean signals it seems to work pretty well, take for instance the SCohF for a 4-FSK signal (20ksym/s, sampling rate = 200 ksps, generated with a signal generator, SNR = 60 dB):
https://ibb.co/1R11CXr
In the picture I show, for each alpha value on the X-axis, the maximum of the mean of the absolute values returned by Antoni’s code (from the ‘S’ matrix in his code).
The peak is crystal clear, and gives the correct result.

Given the properties of CSP techniques, I would expect the SCohF to be pretty insensitive to AWGN, therefore I’ve taken the signal and lowered the SNR to 10 dB (using Matlab’s awgn() function, so there should be no issues there). In some of your posts you’ve worked with far, far worse SNRs, e.g., I recall a ROC curve where you do signal detection at -11 dB.
However, running again the code with the same parameters on the corrupted signal, this is what I obtain:
https://ibb.co/xLD9n8w
All the peaks are gone… 😦
Am I missing anything?

Best,
Rob

1. Chad Spooner says:

take for instance the SCohF for a 4-FSK signal (20ksym/s, sampling rate = 200 ksps, generated with a signal generator, SNR = 60 dB):

The dominant peak is at 20 kHz, which is good, but the maximum coherence is tiny. This implies that either (1) the SNR is not 60 dB, (2) something is going wrong in your estimator, or (3) the particular kind of FSK signal you are using has very weak cyclostationarity. There are many varieties of FSK, and their statistical properties depend on how the symbol-clock phase and carrier phase are related, and vary, over time. I suggest you use a simpler, more well-understood signal, such as my usual rectangular-pulse BPSK signal (and set its power to unity [0 dB]). That way you have lots of solid reference points to compare to what comes out of your code.

I’ve taken the signal and lowered the SNR to 10 dB (using Matlab’s awgn() function, so there should be no issues there).

Since the peaks have completely disappeared, I would check this step. What was the exact command you used? Did you look at a PSD?

1. Rob says:

(1) I haven’t recorded the data myself, I have taken this value from the metadata that came with the signal dataset.
(2) the estimator is the one from Antoni’s paper. I simply take his code, load my signal, and run it as:
[S, alpha, ~, ~] = Fast_SC(x, Nw, alpha_max, Fs, opt);
where:
– x = my signal
– Nw = 16
– alpha_max = 100e6
– Fs = 200e6
– opt.coh = 1 (to compute the SCohF)
Then I plot mean(abs(S(:, 2 : end)))
My code does the same, but in a non-optimized way and on selected alpha values (to reduce the computational load), and the result is virtually the same (in the AWGN case, I get garbage out).
(3) this might be the answer. In my dataset I have multiple xFSK signals, and everything is fine for all of them except this one… (I do make a few other mistakes, but those happen because I pick the wrong peak, either a harmonic or a sub-harmonic, so it is a mistake in my detection algorithm and not an issue with the CSP output.)
(4) the signal actually disappeared from the PSD…
https://ibb.co/bbXjB5B
which is weird. The SNR reduction is simply made by invoking the Matlab function:
x = awgn(x, 10, 'measured');

Thanks again for your help and have a nice day!
Rob