I’ve seen several published and pre-published (arXiv.org) technical papers over the past couple of years on the topic of cyclic correntropy (The Literature [R123-R127]). I first criticized such a paper ([R123]) here, but the substance of that review was about my problems with the presented mathematics, not impulsive noise and its effects on CSP. Since the papers keep coming, apparently, I’m going to put down some thoughts on impulsive noise and some evidence regarding simple means of mitigation in the context of CSP. Preview: I don’t think we need to go to the trouble of investigating cyclic correntropy as a means of salvaging CSP from the clutches of impulsive noise.
The various papers simply state that impulsive noise is a problem for cyclic methods of signal detection and modulation classification, but I don’t see any attempt to understand the root cause of the problem. And if you don’t know the root cause, you can’t very well rank the desirability of various solutions to the problem; you’re just treating your view of the symptoms, not the disease. In such cases, different people will resort to reaching for their favorite and trusted tools, not necessarily the best or simplest tool. Nails, hammers.
What’s the Problem?: Impulsive Noise and its Effect on Blind Cycle-Frequency Estimation
First, what is impulsive noise? Generically, it is a signal with a probability distribution that is in some sense heavy-tailed (relative to, say, the Gaussian distribution). So most of the time it looks like typical Gaussian noise, but every once in a while there is a noise spike that is much larger than typical: it looks like an impulse.
The standard model for impulsive noise is alpha-stable noise, which is a random process governed by four parameters: α, β, γ, and δ (sorry for the dual use of α; this is not the same α that we usually use at the CSP blog for cycle frequency). For some values of these parameters, the noise is actually Gaussian noise, but for others it is an impulsive noise whose second-order moment does not exist. [I’ve always been puzzled by the use of a mathematical model with infinite power to describe observed noise in some physical system, which can’t have infinite power (can it?). Why not just restrict our attention to models that have heavy probability-density-function tails, but also finite, albeit huge, variances?]
MATLAB provides an alpha-stable noise generator as part of its random.m function. It takes the four distribution parameters as well as the size of the data structure that will contain the draws from the distribution. A typical impulsive alpha-stable distribution corresponds to a characteristic exponent α strictly less than two, together with fixed choices of β, γ, and δ:
rv_vec_out = random('stable', alpha, beta, gamma, delta, [num 1]);
What happens is that the discrete-time alpha-stable impulsive-noise signal has a Gaussian-like component, but also possesses relatively rare high-amplitude samples–spikes or impulses. Here is a comparison between white Gaussian noise and alpha-stable impulsive noise using the typical parameters listed above:
Here the variance of each signal is equal to one. The rare large noise spikes are evident in the time-domain plot, and it is these spikes that cause trouble for blind cyclostationary signal processing algorithms. Why?
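For readers who want to reproduce this comparison outside of MATLAB, here is a NumPy sketch using the Chambers-Mallows-Stuck method for the symmetric (β = 0) case; the choice α = 1.5 is just an illustrative heavy-tailed value, not necessarily the one used for the plots:

```python
import numpy as np

def symmetric_alpha_stable(alpha, n, rng):
    """Draw n samples from a standard symmetric alpha-stable distribution
    using the Chambers-Mallows-Stuck method (beta = 0 case)."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, n)   # uniform angle
    w = rng.exponential(1.0, n)                 # unit-mean exponential
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos(v - alpha * v) / w) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(1)
n = 100_000
gauss = rng.standard_normal(n)
stable = symmetric_alpha_stable(1.5, n, rng)    # alpha < 2 => heavy tails

# The Gaussian samples stay within a few standard deviations of zero,
# while the alpha-stable draw contains rare, very large spikes (impulses).
print(np.max(np.abs(gauss)))    # a modest value, around four or five
print(np.max(np.abs(stable)))   # much larger: the impulses
```

The same qualitative picture appears for any α below two: the smaller α is, the heavier the tails and the more violent the impulses.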
The main reason has to do with the form of the spectral coherence function. Recall that blind cycle-frequency estimation is efficiently and effectively done using a normalized version of the spectral correlation function called the coherence,

$$C_x^\alpha(f) = \frac{S_x^\alpha(f)}{\left[S_x^0(f+\alpha/2)\, S_x^0(f-\alpha/2)\right]^{1/2}}.$$
In either continuous time or discrete time, the Fourier transform of an impulse is a constant across frequency. Over time, if we correlate the various Fourier bins, we will see that they contain redundant information. That is, a single impulse has a form of spectral correlation! When we then form the coherence, we observe a very large number of high-coherence cycle frequencies–pretty much every possible cycle frequency has a high coherence. Here is a plot of the blindly detected cycle frequencies for the noisy impulse shown above:
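This spectral redundancy is easy to check numerically. Here is a small NumPy illustration (the impulse location and the bin shift are arbitrary choices):

```python
import numpy as np

n = 256
x = np.zeros(n)
x[37] = 1.0                  # a single unit impulse at an arbitrary location

X = np.fft.fft(x)

# The DFT of an impulse has constant magnitude: every frequency bin
# carries the same (redundant) information, differing only in phase.
print(np.allclose(np.abs(X), 1.0))   # True

# Correlating the bins against a frequency-shifted copy of themselves
# yields unit-magnitude coherence for any shift k: the impulse is
# "spectrally correlated" at every possible cycle frequency.
k = 10
Xs = np.roll(X, -k)
rho = np.vdot(X, Xs) / (np.linalg.norm(X) * np.linalg.norm(Xs))
print(abs(rho))              # essentially 1 for an isolated impulse
```

A modulated signal, by contrast, shows high coherence only at its true cycle frequencies; the impulse lights up all of them at once.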
The alpha-stable impulsive-noise signal is low-level more-or-less-normal noise plus randomly placed and scaled impulses, so it leads to a similar set of blindly detected cycle frequencies:
The problem with impulsive noise with respect to blind CSP is that a great many false cycle frequencies are detected. If a modulated signal is also present, its cycle frequencies can be detected too, but the large number of other detected cycle frequencies makes recognizing the true (useful!) cycle frequencies difficult; they become lost in the crowd.
Here is an example using our old friend the textbook rectangular-pulse BPSK signal, with its usual symbol rate and carrier offset frequency, in additive white Gaussian noise (using the SSCA):
The plot shows that all of the cycle frequencies that fall within the diamond-shaped principal domain of the discrete-time/discrete-frequency spectral correlation function are detected, and no others. (See the posts on rectangular-pulse BPSK here and here.)
Here are the blindly detected cycle frequencies for that same signal in alpha-stable impulsive noise (again using the SSCA):
The BPSK cycle-frequency pattern is clear to our eyes, but an algorithm will be confronted with a much more difficult task of cycle-frequency identification/grouping than in the case of white Gaussian noise. And think about what happens when the SNR decreases–even our eyes won’t find the pattern.
Impulsive Noise and its Effect on Non-Blind SCF Estimation
In the non-blind case, we know the cycle frequencies of interest in advance of processing, and we apply some variant of the TSM or FSM to estimate the spectral correlation function, or perhaps the cyclic autocorrelation or spectral coherence functions. The effect of the alpha-stable impulsive noise on the resulting estimates is relatively minor. What follows is a sequence of spectral correlation and coherence plots for the BPSK signal used above, considering only the true cycle frequencies exhibited by the signal:
I suppose one mitigation approach is to develop cyclic correntropy, which is a nonlinear function of the data that essentially mixes together all possible cyclic moments of the input data. The kernel parameter controls the weighting that is placed on higher-order-moment components of the correntropy, so that one can obtain a result that is, for example, primarily a mixture of second- and fourth-order cyclic moments by selecting the appropriate kernel value.
One might apply some kind of signal-amplitude limiter so that large spikes are automatically reduced.
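Such a limiter can be sketched in a few lines; the `soft_limit` helper and its clamp level below are hypothetical illustrations, not a specific published design:

```python
import numpy as np

def soft_limit(x, c):
    """Clamp the magnitude of each sample to at most c, preserving the
    sample's sign (or phase, for complex data)."""
    mag = np.abs(x)
    scale = np.minimum(1.0, c / np.maximum(mag, 1e-12))  # avoid divide-by-zero
    return x * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
x[500] = 50.0                      # an artificial impulse
y = soft_limit(x, 4.0)
print(np.max(np.abs(y)))           # no sample exceeds the clamp level
```

The drawback is that the limiter also slightly distorts legitimate large-amplitude signal samples, which is one motivation for the spike-detection approach below.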
My Preferred Approach
But why not attack the problem directly? Can we simply estimate the locations of the offending spikes and remove them? I think so.
First, to detect the locations of the larger spikes, we can apply a sorting routine to the magnitudes of the incoming data-block samples and declare the largest elements in the list to be the spikes to be eliminated. More concretely, we can form a histogram of the magnitudes, determine the median of that histogram, and then declare all samples with magnitudes greater than some threshold multiple T of the median to be spikes to be dealt with. So then we know the locations of the spikes. What next?
Two options come to mind. The first is to simply replace the spikes with zeros. The second is to replace the spike value by an interpolated value using the nearby non-spike values of the signal.
If the alpha-stable impulsive noise obeys the typical model, then the spikes are relatively rare (but still harmful), so replacing them with zeros will not affect subsequent CSP estimators much. It is somewhat better to fill the spikes in with interpolated values, and since only a few samples are involved, the added complexity is small.
So that’s what I do here: find the median, fix the threshold multiplier T, find the locations of the spikes, and replace each spike with a simple interpolated value using the points to the left and the right of the spike.
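To make the procedure concrete, here is a minimal real-valued NumPy sketch; the function name and the threshold multiplier of ten are illustrative choices, not the exact values used in the processing runs below:

```python
import numpy as np

def remove_spikes(x, t_mult=10.0):
    """Flag samples whose magnitude exceeds t_mult times the median
    magnitude, then replace each flagged sample with a value linearly
    interpolated from the surrounding non-spike samples."""
    x = np.array(x, dtype=float)          # work on a copy
    med = np.median(np.abs(x))
    spikes = np.abs(x) > t_mult * med
    good = ~spikes
    idx = np.arange(x.size)
    # np.interp clamps at the block edges if a spike sits at an endpoint.
    x[spikes] = np.interp(idx[spikes], idx[good], x[good])
    return x, spikes

rng = np.random.default_rng(2)
x = rng.standard_normal(10_000)
spike_locs = [100, 5000, 9000]
x[spike_locs] = [80.0, -120.0, 60.0]      # inject three large impulses

y, detected = remove_spikes(x, t_mult=10.0)
print(detected[spike_locs])               # the injected impulses are flagged
print(np.max(np.abs(y)))                  # back near the Gaussian background level
```

For complex-valued data, the same idea applies with the real and imaginary parts interpolated separately (or the magnitude and phase handled jointly).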
Processing Example: Blind Modulation Recognition and Parameter Estimation
Let’s see how this works on a blind modulation-recognition problem. I consider five input signal types: BPSK, QPSK, π/4-DQPSK, 16QAM, and MSK. The signals have unit power (0 dB), a common symbol rate (in normalized frequency), and carrier offsets that are chosen randomly from trial to trial. The PSK and QAM signals use a bandwidth-efficient square-root raised-cosine pulse function. The alpha-stable impulsive-noise power is fixed across the trials, and each trial processes a long block of samples of the noisy signal.
I apply these signals to a blind modulation-recognition algorithm (see the ML Challenge post for more information) that makes extensive use of CSP, including the spectral correlation and coherence functions as well as high-order cyclic cumulants.
When I do no impulsive-noise mitigation, I obtain the following confusion matrix:
The decision called CF-CHAIN is a generic decision that happens when the system blindly detects the presence of a large number of harmonically related cycle frequencies, but cannot reconcile them with signals that produce such sets, such as direct-sequence spread spectrum signals. We can see that the presence of the strong impulsive noise ruins the ability of this algorithm to blindly find the signals’ cycle frequencies, and so there is no hope of performing the subsequent modulation recognition that is based on those cycle frequencies. (But an improved algorithm still might be able to do it.)
When I apply the median-based mitigation method as a preprocessor to the modulation-recognition algorithm, I get the following confusion matrix:
I’m sure there are other ways of detecting and replacing the large noise spikes that are produced by alpha-stable impulsive-noise processes. The median-magnitude threshold-based method I outlined here applies to both blind and non-blind CSP, and greatly mitigates the deleterious effects of the impulses contained in impulsive noise. It is an effective pre-processor for CSP algorithms that are designed using benign noise models.
I’m wondering why the various papers that expressly state that CSP is highly vulnerable to impulsive noise don’t at least compare the correntropy machinery to relatively simple, straightforward impulsive-noise mitigation strategies such as the one outlined here. Maybe correntropy is better! But there is a cost in using correntropy: The statistics of all orders are mixed together in the correntropy function. One of the key virtues of using cyclic cumulants is that the contributions to a detection/classification feature are separated not only in terms of cumulant order but also in terms of contributing signal. It is very easy to control exactly which orders of moments or cumulants are included in the processing. We’d like to retain that advantage while also mitigating the bad effects of the impulses.
Please consider leaving a comment, criticism, or correction in the Comments Section below.