The Spectral Coherence Function

Cross correlation functions can be normalized to create correlation coefficients. The spectral correlation function is a cross correlation and its correlation coefficient is called the coherence.

In this post I introduce the spectral coherence function, or just coherence. It deserves its own post because the coherence is a useful detection statistic for blindly determining significant cycle frequencies of arbitrary data records. See the posts on the strip spectral correlation analyzer and the FFT accumulation method for examples.

Let’s start with reviewing the standard correlation coefficient \rho defined for two random variables X and Y as

\rho = \displaystyle \frac{E[(X - m_X)(Y - m_Y)]}{\sigma_X \sigma_Y}, \hfill (1)

where m_X and m_Y are the mean values of X and Y, and \sigma_X and \sigma_Y are the standard deviations of X and Y. That is,

m_X = E[X] \hfill (2)

m_Y = E[Y] \hfill (3)

\sigma_X^2 = E[(X-m_X)^2] \hfill (4)

\sigma_Y^2 = E[(Y-m_Y)^2] \hfill (5)

So the correlation coefficient is the covariance between X and Y divided by the geometric mean of the variances of X and Y.

Now consider the spectral correlation function,

S_x^\alpha(f) = \displaystyle \lim_{T\rightarrow\infty} \frac{1}{T} E[ X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)], \hfill (6)

where the expectation operator E[\cdot] can be either a time average or the ensemble average used in a stochastic-process formulation. For each finite T, we can consider the quantity

\displaystyle \frac{1}{T} E[X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)], \hfill (7)

which is the correlation between the two random variables \displaystyle \frac{1}{T^{1/2}} X_T(t, f+\alpha/2) and \displaystyle \frac{1}{T^{1/2}} X_T(t, f-\alpha/2). To form the correlation coefficient, we need the variances for these two random variables,

\sigma_1^2(T) = E[\displaystyle\frac{1}{T} \left| X_T(t, f+\alpha/2) \right|^2] \hfill (8)


\sigma_2^2(T) = E[\displaystyle\frac{1}{T} \left| X_T(t, f-\alpha/2) \right|^2]. \hfill (9)

Here we are assuming that the two variables have zero means, which is always true when there is no additive sine-wave component in the data with frequency f\pm \alpha/2.

For finite T, then, we can form the correlation coefficient

\rho(T) = \displaystyle \frac{\frac{1}{T} E[X_T(t, f+\alpha/2) X_T^* (t, f-\alpha/2)]}{[\sigma_1^2(T) \sigma_2^2(T)]^{1/2}}. \hfill (10)

Now, as T\rightarrow \infty, the numerator of \rho(T) converges to the spectral correlation function S_x^\alpha(f) and

\displaystyle \lim_{T\rightarrow\infty} \sigma_1^2(T) = S_x^0(f+\alpha/2) \hfill (11)


\displaystyle \lim_{T\rightarrow\infty} \sigma_2^2(T) = S_x^0(f-\alpha/2). \hfill (12)

If the limit of the quotient exists, then the correlation coefficient is given by

\rho = C_x^\alpha(f) = \displaystyle\frac{S_x^\alpha(f)}{[S_x^0(f+\alpha/2)S_x^0(f-\alpha/2)]^{1/2}}. \hfill (13)

We call this function the spectral coherence. A similar argument holds for the conjugate spectral coherence,

C_{x^*}^\alpha(f) = \displaystyle\frac{S_{x^*}^\alpha(f)}{[S_x^0(f+\alpha/2)S_x^0(\alpha/2-f)]^{1/2}}, \hfill (14)

which is a normalization of the conjugate spectral correlation function. Since the coherence is a valid correlation coefficient, its magnitude will be less than or equal to one, and since the involved random variables are complex-valued, in general, so is the spectral correlation function and the coherence. Therefore, the coherence lies in the closed unit disk in the complex plane.

Here is an FSM-based estimate of the coherence for our rectangular-pulse BPSK signal:

Figure 1. Estimated spectral coherence magnitudes for a rectangular-pulse BPSK signal with a bit rate of 100 kHz and a carrier frequency offset of 50 kHz.

The data-block length is 65536 samples and the frequency resolution (width of g(f) in the FSM) is set to 0.01 (one percent of the sampling rate, which here is 1.0) for both the numerator spectral correlation function and the denominator PSDs.

In practice, of course, the numerator and denominator of the coherence are estimates corresponding to finite-duration data blocks. The tricky part of estimating the coherence involves good selection of the estimator for the denominator PSDs S_x^0(f\pm\alpha/2). If the PSDs are not well resolved, or contain zeros, the coherence quotient can be numerically unstable or erroneous.

Author: Chad Spooner

I'm a signal processing researcher specializing in cyclostationary signal processing (CSP) for communication signals. I hope to use this blog to help others with their cyclo-projects and to learn more about how CSP is being used and extended worldwide.

22 thoughts on “The Spectral Coherence Function”

  1. I cannot help asking you how to estiamte the denominator PSDs? I supposed that the FAM or SSCA based SCF estimates are obtained. Then I think FFT algorithm is a must to avoid computational complexity when estimating PSDs. I have tried a lot but I cannot present a detail solution to estimate PSDs furthermore.
    So could you please give some advice?

  2. Hi again Chad,

    If the PSDs are not well resolved, or contain zeros, the coherence quotient can be numerically unstable or erroneous.

    I’m just wondering what your thoughts are on how to automatically decide whether a PSD is “well resolved” or not. I’m guessing that it depends on the application requirements, but do you have any rules of thumb?

    If I’m analyzing recorded signals of various lengths, how do I decide if they’re too short or otherwise might lead to low-confidence analysis results? On the flip side, if I’m recording a signal to be analyzed, how do I decide between a long recording that uses all my storage space (just to be safe) or a shorter recording that would be sufficient?

    Thanks for any thoughts!


    1. I’m just wondering what your thoughts are on how to automatically decide whether a PSD is “well resolved” or not. I’m guessing that it depends on the application requirements, but do you have any rules of thumb?

      I don’t think it will be possible to automatically decide whether a PSD estimate is well-resolved. You’d have to know the true PSD of the signals in the processed data in advance. But there are spectrum features that are inherently unresolvable, such as unit-step functions and additive sine-wave components. They possess features that have spectral widths of zero, and so any spectrum estimate that possesses a finite effective spectral resolution cannot resolve them. For a step-function-like spectrum (such as a square-root raised-cosine QAM signal with zero roll-off), the edges of the spectrum cannot be perfectly estimated. For a sine-wave component, the spectrum for the frequencies near the sine-wave frequency will inevitably reflect the tone’s energy, not just the noise or other continuous-spectrum signal components that lie there.

      You might consider building into a power spectrum estimator the ability to detect spectral edges or corners or the like, perhaps using a derivative or similar. Derivatives are tricky when dealing with noisy or random signals though. If you could detect such occurrences in your spectrum, you could then discount high coherence estimates that involve the spectral components near the detected edges/corners.

      how do I decide if they’re too short or otherwise might lead to low-confidence analysis results?

      For a noiseless random signal with cycle frequency \alpha, you’ll want to process a data block with length that is many multiples of 1/\alpha to ensure you can see that periodicity. For a noisy signal, though, you also need to process enough data to overcome the effects of noise on the spectral correlation estimate. So there is no easy answer. I capture many wideband signals and RF scenes, and am supplied with such by others as well, and I have plenty of storage, since it is cheap and readily available. So I generally try to store long scenes, where the length in seconds greatly exceeds the reciprocal of the smallest non-zero cycle frequency of interest. That turns out to be a small number of seconds, typically.

  3. Hi Chad!
    I’m trying to implement the spectral coherence for the task that estimate the symbol rate for low roll-off factor signal.
    If the pulse shape is bandlimit(e.g. SRRC with roll-off=0 and bandwith B). There is 0/0 problem when computing the coherence function. Especially for f_0 >B-\frac{1}{2alpha} in \alpha S^\alpha(f_0)
    What do you do to avoid 0/0? For inband singularity point, I just filter out this NaN points to 0. For outband zeros,I padded with small value(1e-10). But this will amplify candidate cycle frequency \hat{\alpha} > B ,since the S^0(f) = 0 when f>B

      1. The coherence is undefined whenever the denominator is zero, which can happen only when you process a noiseless signal. Are you sure you want to process noiseless signals?

  4. /quote Here we are assuming that the two variables have zero means, which is true when there is no additive sine-wave component in the data with frequency f\pm \alpha/2

    What are your thoughts for when you do have additive sine wave components (e.g. pilot tones)? Of course they can make it easier to detect the signal, but it seems that they would confound the detection of cycle frequencies.

    I seem to remember reading somewhere that it may be desirable to remove the additive sinusoid components from the PSD (by filtering the computed PSD?), but I’m not seeing it now.


    1. I don’t think it is a good idea to process data containing finite-strength additive sine-wave components with a CSP estimator such as the SSCA, FAM, FSM, etc. Detecting them and removing them will help avoid false-alarms due to the unresolvability of the spectrum when the signal contains sine-wave components. There is no advantage to such CSP processing anyway–if you can detect the sine-wave, you can estimate its parameters, and nothing that the SCF or cyclic cumulants will tell you about the sine-wave is more information than that. I probably said as much, as you suggest, in other comments or posts, but I can’t find one at the moment.

      You’ll need to do more than process the PSD. You should try to detect the sine-wave components and remove them from the data. There are many ways to detect sine waves, but a good non-parametric way is to just use the Fourier transform and some kind of thresholding. But that’s not really CSP, so I don’t talk about it much on the Blog. It would be a reasonable topic for a Signal Processing ToolKit post, though.

  5. Thanks for sharing your CSP knowledge, Chad. It seems I’m overlooking an important detail or two in this topic. First, must expected values be taken at any point?

    Second, in eqn 13 & 14 denominators, shifting by $\alpha/2$ and zero filling guarantees div by 0. Currently, I do not take expected values and simply divide the bpsk signal’s SCF by shifted PSD’s and replace div by 0 in result with 0.

    My non-conj spectral coherance graph looks like mostly like yours but dosn’t roll off nearly as much as alpha approaches 1. At $\alpha=1$, your non-conj spectral coherance value seems about 0.4. Mine is about 0.8. (I don’t see a way to attach pictures or would do that.) Thanks for any pointers to my misunderstandings!

    Mike's Plot
    [Added by Chad]

    1. Thanks for checking out the CSP Blog, Mike, and the comments!

      First, must expected values be taken at any point?

      No. Expectations can only be taken when there is an ensemble. This is almost always a fictitious mathematical ensemble. In rare cases one might have an actual ensemble at hand. The CSP Blog is about signal processing, and so any averaging we do in the context of estimation is time (e.g., TSM) or frequency (e.g., FSM) averaging. Here we are concerned with estimating the spectral correlation function, coherence function, cyclic autocorrelation function, cyclic cumulants, etc., using only a single sampled-data signal (“time-series”).

      Second, in eqn 13 & 14 denominators, shifting by $\alpha/2$ and zero filling guarantees div by 0.

      Yes, indeed. In DSP, we only have the Fourier transform over the normalized frequency interval [-0.5, 0.5), and so we just have to live with the fact that X(f+\alpha/2)X^(f-\alpha/2) has increasingly small overlap as \alpha increases, and similarly for a PSD estimate. And the dividing-by-zero problem is even worse than that, for it is perfectly possible for a PSD to have zeros. So you must avoid outputting a non-zero coherence estimate for any (f, \alpha) pair for which the coherence denominator is zero.

      That explains the appearance of my coherence plot in the post. As the cycle frequency \alpha increases in magnitude, the overlap between the two shifted transforms decreases. In continuous-time mathematical analysis, this doesn’t happen. The coherence is perfectly well defined for all f and \alpha, except for the case where the denominator PSDs have a zero.

      As for why your coherence is not decaying as rapidly as mine, what are the signal and noise power levels?

      1. Thanks, Chad, for your helpful reply, and seems I’m in better shape than I thought! I converted your BPSK sample code to python as a starting point. Barring bugs (!) I create the BPSK signal of random 1/-1 (duplicated to samples/sym), mix to carrier, then add noise to achieve 15 dB s/n.

        1. And I now see if I drop to 10 dB s/n it falls much more inline with yours! I’ll get a grasp of the basics yet. 🙂

  6. As a short follow up to my questions, for my results described in the previous note I used the SCF FSM subroutine that I used to generate correct looking results for previous blog lessons.

  7. Dear Chad,
    I am trying to use the SCohF to blindly detect the symbol rate of a set of signals.

    I am using here the code from J. Antoni — I do have my code for the symbol rate estimation and I have read your critics on the algorithm, but I wanted to be sure that what I observe is not due to a bug in my code (yeah, pretty low self confidence ;)).
    The window size for the STFT is 16 -> frequency resolution = 18.75 kHz in the example below.

    For clean signals it seems to work pretty well, take for instance the SCohF for a 4-FSK signal (20ksym/s, sampling rate = 200 ksps, generated with a signal generator, SNR = 60 dB):
    In the picture I show, for each alpha value on the X-axis, the maximum of the mean of the absolute values returned by Antoni’s code (from the ‘S’ matrix in his code).
    The peak is crystal clear, and gives the correct result.

    Given the properties of CSP techniques, I would expect the SCohF to be pretty insensitive to AWGN, therefore I’ve taken the signal and lowered the SNR to 10 dB (using Matlab’s awgn() function, so there should be no issues there). In some of your posts you’ve worked with far, far worse SNRs, e.g., I recall a ROC curve where you do signal detection at -11 dB.
    However, running again the code with the same parameters on the corrupted signal, this is what I obtain:
    All the peaks are gone… 😦
    Am I missing anything?

    Thanks in advance for your help!

    1. take for instance the SCohF for a 4-FSK signal (20ksym/s, sampling rate = 200 ksps, generated with a signal generator, SNR = 60 dB):

      The dominant peak is at 20 kHz, which is good, but the maximum coherence is tiny. This implies that either (1) the SNR is not 60 dB, (2) something is going wrong in your estimator, or (3) the particular kind of FSK signal you are using has very weak cyclostationarity. There are many varieties of FSK, and their statistical properties depend on how the symbol-clock phase and carrier phase are related, and vary, over time. I suggest you use a simpler, more well-understood signal, such as my usual rectangular-pulse BPSK signal (and set its power to unity [0 dB]). That way you have lots of solid reference points to compare to what comes out of your code.

      I’ve taken the signal and lowered the SNR to 10 dB (using Matlab’s awgn() function, so there should be no issues there).

      Since the peaks have completely disappeared, I would check this step. What was the exact command you used? Did you look at a PSD?

      1. Dear Chad,

        Thanks for the quick reply!
        (1) I haven’t recorded the data myself, I have taken this value from the metadata that came with the signal dataset.
        (2) the estimator is the one from Antoni’s paper. I simply take his code, load my signal, and run it as:
        [S, alpha, ~, ~] = Fast_SC(x, Nw, alpha_max, Fs, opt);
        – x = my signal
        – Nw = 16
        – alpha_max = 100e6
        – Fs = 200e6
        – opt.coh = 1 (to compute the SCohF)
        Then I plot mean(abs(S(:, 2 : end)))
        My code does the same, but in a non-optimized way and on selected alpha values (to reduce the computational load), and the result is virtually the same (in the AWGN case, I get garbage out).
        (3) this might be the answer. In my dataset I do have multiple xFSK signals, and everything’s nice for all of them except this one… (I do few other mistakes, but these are due to the fact that I pick the wrong peak — either an harmonic or a sub-harmonic –, thus it is a mistake in my detection algorithm and not an issue with the CSP output)
        (4) the signal actually disappeared from the PSD…
        which is weird. The SNR reduction is simply made by invoking the Matlab function:
        x = awgn(x, 10, ‘measured’);

        Thanks again for your help and have a nice day!

  8. Good Morning Chad,

    I have a question I may not be able to act on as I move to a different job after next week. However, maybe it will interest others (including my current coworkers). This article seems like a reasonable place for the question.

    I appreciate the approach (e.g. Eq. 1, above) of respectively connecting the spectral correlation and coherence functions to the more familiar (to me at least) concepts of off-diagonal elements in a covariance matrix and their underlying correlation coefficients. As we’ve been implementing the SSCA, I’ve wondered whether anyone has taken that connection farther by treating SSCA point estimates as elements of a matrix. It’s only a vague idea in the back of my CSP-novice mind, but linear algebra and probability theory have a great relationship (e.g. Kalman filtering, or factor graphs). Maybe the machinery of matrix manipulation would lead to some efficient and/or insightful approaches to interpreting SSCA outputs.

    Are there any research threads along these lines?

    1. I’m not familiar with any published research along those lines. But I have been working, slowly, with others on the idea of treating the output of the SSCA as an image, and then trying to apply proven machine-learning image-recognition tools (CNNs) to do multiple-signal modulation recognition.

      The SSCA output is pretty huge if you don’t do (1) some kind of thresholding and (2) cycle-frequency binning or equivalent cluster analysis.
      So when you process a short record of, say, 16384 samples with an SSCA having, say, 32 strips, you get a matrix of point estimates that has 16384*32 = 524288 elements (two of these, one for the non-conjugate and one for the conjugate). There are only 32 frequencies but many more (16384) cycle frequencies, so we end up with a skinny sparse matrix I guess. I suppose some kind of sophisticated sparse-matrix tool might replace a straightforward coherence-based thresholding operation. But in any case, I strongly believe that the handful of significant cycle frequencies (said handful arrived at through thresholding and binning, for example) are the gold nuggets in the full-point-estimate pan. I would think a CNN operating on the full SSCA output would somehow focus in on those nuggets as well, but I have no good results yet.

      1. I like the idea of treating the SSCA’s output as an image. It seems well motivated based on our limited efforts thus far to process collected data through the SSCA. A heat map, of sorts, looking down on the full region of support has an image-like feel to it.

        Without question, the SSCA output is gargantuan. However, the sparsity of a matrix representation gives me hope that it may still produce useful results in reasonable time. I find there are only M unique cycle frequency values among all the strips, where M=2N-\frac{N}{N^\prime} and \frac{N}{N^\prime} is an integer. I get the same number of unique spectral frequencies. As the total number of points grows, M quickly becomes a pittance of the total (only about 6.2% using your above values of N and N^\prime). A square, M \times M matrix would be truly massive, but it just ‘feels’ like the sparsity and structured nature of the relationship between the frequency values indexing the strips could lead to something useful.

        I’m too far removed from linear algebra proficiency and far too new to cyclostationarity to go beyond this sparse description of my intuition. There is probably a host of very good reasons nobody conducts research along these lines. Thanks for humoring me.

        1. Thank you for a correct characterization of the numbers of unique cycle frequencies and spectral frequencies that are output by the SSCA! My comment above is incorrect–I must have been mixing up TSM ideas with SSCA ideas or just plain mixed up. I hope your new position doesn’t prevent further commenting and, crucially, error-detection and correction.

          I’ll return to this comment thread if I come across any serious attempts at treating the SSCA output (or FAM or whatever) as a matrix that can be successfully analyzed by linear algebraic means to some good signal-processing end.

          1. I appreciate the sentiments; the blog has taught me a lot. While I’m not aware of a CSP connection to my new job, ya never know. Even history tends to be cyclic.

            As a maybe-useful footnote to this exchange, I reworked the indexing and notation for coordinates in the bi-frequency plane to be easier on my brain. Assuming my WordPress LaTeX doesn’t fail me, I prefer f_{(j,i)}=\frac{\Delta\alpha}{2}(-q_{i}+k_{j}\Delta N) and \alpha_{(j,i)}=\Delta\alpha(-N+q_{i}+k_{j}\Delta N), where \Delta\alpha=\frac{f_s}{N}, \Delta N=\frac{N}{N^\prime}, q_{i}\in[0,N), and k_{j}\in[0,N^\prime). For starters, zero-based indices make me happy. It’s easier for me to see and say that the 0^{th} sample of the 0^{th} strip has a cycle frequency of -f_s. I also think of \Delta\alpha as a unit conversion factor; I look at each equation’s units as Hz = (Hz/Sample)*Sample. Packing all the sample variables together relates to this vague sense that linear algebra might be lurking in the background, waiting to provide some insight.

            Hoping for more CSP fun in the future,

Leave a Comment, Ask a Question, or Point out an Error