# Wow, Elsevier, Just … Wow. Comments On “Cyclic Correntropy: Properties and the Application in Symbol Rate Estimation Under Alpha-Stable Distributed Noise,” by S. Luan et al.

Can we fix peer review in engineering by some form of payment to reviewers?

Let’s talk about another paper about cyclostationarity and correntropy. I’ve critically reviewed two previously, which you can find here and here. When you look at the correntropy as applied to a cyclostationary signal, you get something called cyclic correntropy, which is not particularly useful except if you don’t understand regular cyclostationarity and some aspects of garden-variety signal processing. Then it looks great.

But this isn’t a post that primarily takes the authors of a paper to task, although it does do that. I want to tell the tale to get us thinking about what ‘peer’ could mean, these days, in ‘peer-reviewed paper.’ How do we get the best peers to review our papers?

Let’s take a look at The Literature [R173].

The paper in question is published in Elsevier’s open-access peer-reviewed journal called Digital Signal Processing. Before connecting the contents of the paper, and the fact that it was actually peer-reviewed, to criticisms about peer-review, let’s just establish these aspects of the DSP journal.

Here is what Google produces when asked the question “is elsevier ‘digital signal processing’ a peer-reviewed journal?”

So, yeah, peer-reviewed for sure. And some puffery too: “ensuring top-quality research and information …” Who’s authoring and reviewing these papers? We have top men working on it … Top … Men. I’m so reassured!

Then when you pluck up the courage to go look at the Elsevier DSP site itself, more reassurance!

### The Paper

I’ll take you down the rabbit hole, starting from the beginning, but I have to get something out of my system first. In Appendix C we see the opening text as in Figure 3.

The signal in equation (C.1) is rectangular-pulse BPSK with independent and identically distributed bits and zero carrier-frequency offset. Also known as the CSP Blog mascot signal. This process is cyclostationary with cyclostationary sample paths (spectral correlation and cyclic autocorrelation), is not ergodic, and is cycloergodic. Try to ignore or give a pass to the rapid switch in notation between bits $c(nT_B)$ and bits $c_n$. (What would $c(nT_B)$ even look like?)

Also, no matter what pulse-shaping function $q(t)$ we use in (C.1), the product of $T_B$ and $R_B$ is always equal to one because the symbol rate $R_B$ is defined to be the reciprocal of the symbol interval $T_B$!

So, yeah, they are saying that the cyclostationary signal that forms the basis of the theoretical development and processing examples over the entire Cyclostationary Signal Processing Blog is stationary. Oh no!

We also know that the real-valued signal in (C.2) is a cyclostationary signal with cycle frequencies relating to both $1/T_B$ and $f_c$, and could be an almost cyclostationary signal if $f_cT_B$ is irrational.

Is it reasonable that reviewers (peers) did not demand that any of this be corrected? No. It. Is. Not.

In the Introduction we see an attempt to paint correlation as a tool that explicitly requires Gaussian variables or signals as shown in Figure 4. But correlation and covariance are just similarity-measuring tools for arbitrary random variables.

Similarly, the basic probabilistic functions of cyclostationary signal processing, such as the spectral correlation function (which is just a correlation after all), were apparently ‘designed’ for Gaussian noise (see Figure 5). Well, it turns out that spectral correlation, cyclic correlation, and higher-order cyclic moments and cumulants are not ‘designed’ at all, any more than a probability density function is ‘designed.’ They are probabilistic aspects of a random process and/or its infinite-length sample paths. As aspects of the signal that exits a transmission antenna, they depend not at all on whether the signal is eventually corrupted by Gaussian noise, exponential noise, shot noise, impulsive noise, cochannel interference, or propagation effects. The probabilistic aspects of the received signal do, of course, depend on those things.

Also, higher-order cyclic moments and cumulants from [20,21] are the same thing as $k$th-order cyclic statistics in [22]. And neither is ‘cyclostationary-like.’ They are simply fundamental aspects of cyclostationary random processes and/or their infinite-length sample paths. Harrumph.

Moving on, the authors define several correntropy-related functions in Section 2.2. These are the core of the paper, and are shown here in Figure 6.

As in the other two cyclic correntropy posts I’ve written (here in 2016 and here in 2019), I understand the correntropy function to essentially form a weighted sum of all possible temporal moment functions which, in the case of a cyclostationary input signal, are periodic or polyperiodic functions of time. That is, the correntropy is purposely used to do the exact opposite of what I advocate with cyclic cumulants, which is to quantify that part of a temporal moment function that is unique to the specified order $n$, and is not simply a combination of lower-order effects. But OK.

So the functions in Figure 6 are probabilistic parameters, not estimates. They all depend on the expectation in the authors’ equation (2). As such, one needs to be careful to not violate cycloergodicity when going back and forth between estimates using a single sample path of the process and parameters that involve averaging across the infinite ensemble. Turns out here, in this paper, this fundamental problem is elided because no estimators are ever presented or spoken of at all. As we’ll see.

After the probabilistic parameters are presented, some results are presented in terms of theorems. None of the presented theorems deserves the appellation of ‘theorem,’ but, hey, that’s just an opinion. The theorems have to do only with probabilistic parameters and stochastic process models for BPSK and AM DSB.

This bit of math is then followed immediately by the presentation of a “Symbol rate estimation algorithm.” I reproduce the algorithm in Figures 7 and 8.

Keep in mind, for my processing results supplied near the end of this post, the assertion that the method is expected to be robust to cochannel interference. But focus for now on Step 1.A: “Calculate the correntropy $V_x(t, \tau)$ of the modulated signal.” Well, we probably want to estimate the correntropy using received or simulated data. But the function $V_x(t,\tau)$ is defined only through the authors’ equation (2), and that is an expected value over a fictitious ensemble. This is a basic error for anyone trying to publish a paper dealing with stochastic processes and estimators, and no competent reviewer should have let this go.

By simply never mentioning any transition from stochastic processes (and their average-over-the-ensemble probabilistic parameters) to estimators (operating on finite-length data blocks), the authors never have to specify things like the block length, the signal parameters, or the signal-to-noise ratio. It’s all ideal correntropy all the time in the paper.

But … they must apply some kind of (mysterious to the reader) estimator, judging by the cyclic correntropy spectrum profiles shown in the authors’ Figure 1, because you can see that the function is erratic, or random, over the cycle-frequency parameter.
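To make the missing step concrete, here is a minimal sketch of one plausible finite-data estimator: a time average of the kernel of the lagged difference, multiplied by a complex sine wave at the candidate cycle frequency. This is not taken from the paper, which gives no estimator. The function names, the asymmetric lag convention, the block length, and the noise level are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel using |d|^2, as in the authors' equation (3)."""
    return np.exp(-np.abs(d) ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def cyclic_correntropy_estimate(x, alpha, tau, sigma=1.0):
    """Time-average estimate of the cyclic correntropy at one (alpha, tau).

    x is a finite block of samples, alpha is in cycles/sample, and tau is a
    non-negative integer lag in samples. The asymmetric lag x[n + tau] - x[n]
    is an assumption made for this sketch.
    """
    x = np.asarray(x)
    n = np.arange(len(x) - tau)
    v = gaussian_kernel(x[tau:] - x[:len(x) - tau], sigma)
    return np.mean(v * np.exp(-2j * np.pi * alpha * n))

# Rectangular-pulse BPSK, 10 samples per bit, in weak Gaussian noise.
rng = np.random.default_rng(0)
bits = rng.choice([-1.0, 1.0], size=2000)
x = np.repeat(bits, 10) + 0.1 * rng.standard_normal(20000)

on_grid = abs(cyclic_correntropy_estimate(x, alpha=0.1, tau=5))     # bit rate
off_grid = abs(cyclic_correntropy_estimate(x, alpha=0.137, tau=5))  # not a cycle frequency
print(on_grid, off_grid)
```

Even this toy version forces the choices the paper never states: the block length, the lag, the kernel width, and the noise level all have to be written down before a single number comes out.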

The authors’ Figure 1 is reproduced here as Figure 9 for your convenience, dear reader. There is a bit of trickiness to this result. Notice that the carrier is set to exactly twice the BPSK bit rate. Since the cycle frequencies for BPSK are always of the form $\alpha = (n-2m)f_c + kR_B$ (known to the authors; see Appendix C), this means that all the cycle frequencies lie on the grid formed by harmonics of the bit rate $R_B$! So the picture is a bit misleading in its cycle-frequency regularity.
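The grid coincidence is easy to check by enumerating the second-order cycle frequencies $\alpha = (n-2m)f_c + kR_B$ for $n=2$ with exact fractions. This is a small sketch: the function names, the harmonic range, the bit rate of $1/10$, and the non-magic carrier are illustrative assumptions, not values from the paper.

```python
from fractions import Fraction

def second_order_cycle_freqs(fc, rb, max_harmonic=3):
    """Enumerate alpha = (n - 2m)*fc + k*rb for n = 2 and m in {0, 1, 2}."""
    n = 2
    alphas = set()
    for m in range(n + 1):
        for k in range(-max_harmonic, max_harmonic + 1):
            alphas.add((n - 2 * m) * fc + k * rb)
    return sorted(alphas)

def all_on_bit_rate_grid(alphas, rb):
    """True when every cycle frequency is an integer multiple of rb."""
    return all((a / rb).denominator == 1 for a in alphas)

rb = Fraction(1, 10)                                              # hypothetical bit rate
magic = second_order_cycle_freqs(fc=2 * rb, rb=rb)                # fc = 2 R_B, as in the paper
generic = second_order_cycle_freqs(fc=Fraction(23, 100), rb=rb)   # hypothetical carrier

print(all_on_bit_rate_grid(magic, rb))
print(all_on_bit_rate_grid(generic, rb))
```

With the carrier at twice the bit rate, the carrier-related cycle frequencies are indistinguishable from bit-rate harmonics. Move the carrier off that value and they separate.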

Since the symbol-rate estimation algorithm explicitly states that one must form a complex-valued downconverted signal and then use the correntropy, the authors next show what the correntropy cyclic-domain profiles look like for downconverted BPSK. But this is mathematically slippery and leads to some misleading conclusions. Remember that the probabilistic parameter of the correntropy was defined for a ‘real random process $X(t)$‘ back in the authors’ equations (2) and (3) (my Figure 6). In equation (3) the Gaussian kernel is specified with an absolute value or modulus operation thusly

$\displaystyle \kappa_\sigma (x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-|x|^2/2\sigma^2} \hfill (1)$

But it could just as well have been defined with $x^2$ instead of $|x|^2$ since $x$ is supposed to be real. And in fact, in the analysis in the Appendices, the authors do switch to $x^2$.

But for complex-valued $x$ it absolutely does matter whether you use $|x|^2$ or $x^2$. Once you decide, it locks you into correntropy that is reflective of non-conjugate cyclostationarity ($|x|^2$, which delivers only cycle frequencies associated with, in our language, $n=2m$, or for BPSK, $kR_B$) or conjugate cyclostationarity ($x^2$, which delivers only cycle frequencies associated with $m= 0$, or $Kf_c + kR_B$).
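A quick numerical check shows why the choice matters. Rotating a complex lagged difference by a phase, which is what a residual carrier does as time advances, leaves the $|x|^2$ kernel unchanged but changes the $x^2$ kernel. The sample value and rotation angle below are arbitrary, and the function names are mine.

```python
import cmath
import math

def kernel_abs(d, sigma=1.0):
    """Gaussian kernel with |d|^2 in the exponent (always real and positive)."""
    return math.exp(-abs(d) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def kernel_sq(d, sigma=1.0):
    """Gaussian kernel with d^2 in the exponent (complex for complex d)."""
    return cmath.exp(-d ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

d = 0.8 + 0.3j                      # a hypothetical lagged difference
rotated = d * cmath.exp(1j * 0.7)   # the same difference after a carrier-phase rotation

same_abs = math.isclose(kernel_abs(d), kernel_abs(rotated))
same_sq = cmath.isclose(kernel_sq(d), kernel_sq(rotated))
print(same_abs, same_sq)
```

The $|x|^2$ kernel cannot see the rotating carrier phase at all, so carrier-related cycle frequencies have no way to appear in its output. The $x^2$ kernel sees $e^{i2\theta}$, which is where the doubled-carrier terms come from.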

Not understanding this, the authors then show some cyclic correntropy profiles for the downconverted BPSK signal in their Figure 2, which I reproduce here in Figure 10.

These profiles will look the same no matter what the value of $f_c$ is (as I demonstrate below) if you are using $|x|^2$ in the Gaussian kernel. The authors are impressed with the result, saying “Meanwhile, the peaks related to the carrier frequency and its harmonic components are nowhere to be found.” Also note, though, that the ‘magic’ parameter of $f_c = 2R_B$ is used here, which is a choice that obscures what is going on rather than clarifies it.

The authors then move into a simulation study. The only parameters for the study are shown in their Table 1, which I reproduce here in Figure 11.

Then they define figures of merit. One is ‘estimation accuracy’ and the other is root mean-squared error (RMSE). The former is described in the text reproduced here in Figure 12. Just not very well. To paraphrase: ‘the estimation accuracy determines the accuracy, and the probability of accuracy is the number of accurate estimates divided by the total number of estimates.’ Got it.
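For comparison, here is what unambiguous definitions of the two figures of merit could look like. The paper does not say what counts as an ‘accurate’ estimate, so the relative tolerance below is purely a hypothetical choice, as are the function names and the sample estimates.

```python
import math

def estimation_accuracy(estimates, true_value, rel_tol=0.01):
    """Fraction of estimates within rel_tol (relative) of the true value."""
    hits = sum(abs(e - true_value) <= rel_tol * abs(true_value) for e in estimates)
    return hits / len(estimates)

def rmse(estimates, true_value):
    """Root mean-squared error of the estimates about the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

# Hypothetical symbol-rate estimates for a true rate of 0.1 (normalized).
estimates = [0.1000, 0.1002, 0.0999, 0.1250]
acc = estimation_accuracy(estimates, true_value=0.1)
err = rmse(estimates, true_value=0.1)
print(acc, err)
```

The point is that the tolerance is a parameter of the metric. Without it, an ‘estimation accuracy’ curve cannot be reproduced or compared.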

I’m not going to reproduce all the performance curves. They are pretty much meaningless without knowing the processing block length. Maybe I’ll take it as one sample. Look how great the correntropy is even when using only a single sample! Also, look how great it is using eleventy jillion samples! The same!

Finally, let’s turn to what the authors say is their main achievement in the paper, the thorough analysis of the existence question regarding the correntropy. We begin with Appendix A: Existence condition of cyclic correntropy.

#### Flaws in Appendix A: Existence Condition of Cyclic Correntropy

We see immediately that the $|x|^2$ in (3) is replaced by $x^2$ in (A.2).

This is followed by a straightforward application of the series expansion for $e^x$, leading to a condition for existence which is that a particular weighted sum of an infinite number of even-order moments is finite.
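For reference, the expansion can be sketched as follows, where $U$ is my shorthand for the real-valued lagged difference that forms the kernel argument, and where I assume the expectation and the infinite sum can be interchanged:

$\displaystyle E\left[\kappa_\sigma(U)\right] = \frac{1}{\sqrt{2\pi}\sigma} E\left[e^{-U^2/2\sigma^2}\right] = \frac{1}{\sqrt{2\pi}\sigma} \sum_{n=0}^\infty \frac{(-1)^n}{n!\,(2\sigma^2)^n} E\left[U^{2n}\right]$

Every even-order moment $E[U^{2n}]$ of the lagged difference appears, weighted by $(-1)^n/(n!\,(2\sigma^2)^n)$, so the existence question is a question about this whole family of moments at once.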

But then the authors try to extend this idea to a signal that is wide-sense cyclostationary, or ‘second-order cyclostationary,’ which they define as meaning that we know only that the first- and second-order moments are periodic. So we see the statement in Figure 14.

Here the second- and higher-order parts are defined as in Figure 15.

But then the authors apply the fraction-of-time sine-wave-extraction operation, which returns the sum of all finite-strength additive sine-wave components that exist in its input. This operation is typically denoted by $E^{\{\alpha\}}[\cdot]$, but the authors use the notation $[[\cdot ]]$, as in Figure 16, where they make the major mistake of assuming that $r_X(t,\tau)$ does not contain a periodic component.

The mistake here is using “the periodicity in other statistics of higher order is not guaranteed” to conclude “the periodicity in higher-order statistics are zero”! We know full well that the signal of particular interest to these authors, rectangular-pulse BPSK (see Appendix C), has non-zero periodic higher-order time-varying moments (and cumulants!) for all even orders. So, no, $[[r_X(t,\tau)]]$ is not equal to zero. This is basic. So the main mathematical achievement touted by the authors, that of thoroughly determining the correntropy existence conditions, is fatally flawed.
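This is easy to confirm numerically. The sketch below takes baseband rectangular-pulse BPSK, forms the fourth power of a lagged difference (one of the higher-order terms in the series expansion), and measures the strength of the sine-wave component at the bit rate. The signal parameters, the lag, and the off-grid comparison frequency are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
samples_per_bit = 10
bits = rng.choice([-1.0, 1.0], size=4096)
x = np.repeat(bits, samples_per_bit)     # baseband rectangular-pulse BPSK

tau = 5                                  # lag in samples (half a bit)
y = (x[tau:] - x[:-tau]) ** 4            # fourth power of the lagged difference
n = np.arange(len(y))

def fourier_coefficient(z, alpha):
    """Finite-time estimate of the sine-wave amplitude at cycle frequency alpha."""
    return np.mean(z * np.exp(-2j * np.pi * alpha * n))

at_bit_rate = abs(fourier_coefficient(y, 1.0 / samples_per_bit))
off_grid = abs(fourier_coefficient(y, 0.137))
print(at_bit_rate, off_grid)
```

The coefficient at the bit rate is large and stable, while the coefficient at a frequency that is not a cycle frequency shrinks toward zero as the block grows. The higher-order term has a periodic component, so it cannot be discarded.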

#### Flaws in Appendix B: Relation between Spectrum Peaks and Carrier Frequency for DSB Signals

It gets weirder, and more damaging to students and new researchers, in Appendix B, where the focus is on the cyclic correntropy profile elements for an input of analog-message amplitude modulation (AM DSB).

The AM signal is of the form $x(t) = a(t) \cos(2\pi f_c t)$, where $a(t)$ is the message waveform. The message waveform is here modeled as a stochastic process, and several expected values relating to $a(t)$ are presented in (B.1), which is reproduced in Figure 17.

The first says that the message has zero average value. The second says that the autocorrelation is not zero. Typically one would want to express the fact that this autocorrelation function is not zero as a function of $\tau$, which means it is ‘not identically zero.’ When we say $x(t) = 0$, the proper interpretation is that for the domain value $t$, the function is zero. It is free to be non-zero at other domain values. When the function is zero for all domain values, we say $x(t) \equiv 0$, and when it is not identically equal to zero, we can say $x(t) \not\equiv 0$, meaning that there is at least one domain value $t$ for which $x(t) \neq 0$.

Turning to the third equation in (B.1), $E[a(t)e^{-i2\pi\xi t}] = 0\ \ \ \forall \xi \neq 0$, there is no indication that the frequency $\xi$ is random, so this expectation is $e^{-i2\pi \xi t} E[a(t)]$. Since the authors already assert that $a(t)$ has zero mean, this expectation is also zero for all $\xi$, including $\xi = 0$.

Finally, the fourth expectation is equal to $e^{-i 2 \pi \xi t} E[a(t+\tau/2)a(t-\tau/2)]$, which cannot be zero for any $\tau$ for which the second expectation (the autocorrelation) in the list is not zero, because the exponential is never equal to zero. So the second and fourth expectations are contradictory. This is basic.
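Written out, with $R_a(\tau)$ as my shorthand for the autocorrelation $E[a(t+\tau/2)a(t-\tau/2)]$ of the stationary message, the nonrandom exponential simply factors out of each expectation:

$\displaystyle E\left[a(t)e^{-i2\pi\xi t}\right] = e^{-i2\pi\xi t} E[a(t)] = 0 \ \ \ \forall \xi$

$\displaystyle E\left[a(t+\tau/2)a(t-\tau/2)e^{-i2\pi\xi t}\right] = e^{-i2\pi\xi t} R_a(\tau), \ \ \ \left| e^{-i2\pi\xi t} R_a(\tau) \right| = |R_a(\tau)|$

So the magnitude of the last expectation equals $|R_a(\tau)|$ for every $\xi$, and it can vanish only where the autocorrelation itself vanishes.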

The stationary message signal $a(t)$ is multiplied by a real-valued sine wave to obtain an AM DSB signal in (B.2). The authors also note that time averages will equal expected values for a stationary ergodic signal. But then they apply that idea to the AM DSB signal, which is neither stationary nor ergodic! (But since there is no added carrier-phase random variable, it is cycloergodic.) See Figure 18.

What justifies (B.3) is the fact that the AM DSB signal is cyclostationary and cycloergodic.
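The cyclostationarity is visible in one line. Using $\cos A \cos B = \frac{1}{2}[\cos(A-B) + \cos(A+B)]$ and writing $R_a(\tau)$ (my shorthand) for the autocorrelation of the stationary message $a(t)$, the second-order moment of $x(t) = a(t)\cos(2\pi f_c t)$ is

$\displaystyle E\left[x(t+\tau/2)x(t-\tau/2)\right] = \frac{R_a(\tau)}{2}\left[\cos(2\pi f_c \tau) + \cos(4\pi f_c t)\right]$

The second term depends on $t$ with period $1/(2f_c)$, so the moment is periodic in time with cycle frequencies $0$ and $\pm 2f_c$. A stationary signal would have no $t$ dependence at all.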

#### Flaws in Appendix C: Relation between Spectrum Peaks and Signal Parameters for BPSK Signals

Finally, the authors put together Appendix C in an attempt to examine the cyclic correntropy profile for rectangular-pulse BPSK. Have a look at Figure 19.

This looks to be a sort of cut-and-paste kind of situation. They are trying to do the same kind of analysis here as they performed for AM DSB in Appendix B. But the signal in (C.1) is not stationary and is not ergodic. The modulated signal in (C.2) is also not stationary and not ergodic. This is basic, and will be quite confusing to all readers.

### Some Relevant Processing Results from the CSP Blog

First let’s just look at the spectral correlation function for rectangular-pulse BPSK in additive white Gaussian noise and in alpha-stable impulsive noise. We’ll use the known cycle frequencies for the signal, which are harmonics of the bit rate $k/10$ for the non-conjugate spectral correlation function and the doubled carrier plus harmonics of the symbol rate $2(0.05)+ k/10$ for the conjugate spectral correlation function, and a block length of 65536 samples. The resulting non-conjugate and conjugate spectral correlation surfaces are shown in Figure 20. All is normal.

Next we process this same BPSK signal but switch the AWGN for impulsive noise with $\alpha = 1.5$, $\beta = 0$, $\gamma = 1$, and $\delta = 0$. The resulting surfaces are shown in Figure 21, where, again, the true cycle frequencies are used. Clearly the spectral correlation is highly distorted by the presence of the impulse-like spikes in the alpha-stable noise.
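For readers who want to generate comparable noise, here is a minimal sketch of the Chambers-Mallows-Stuck construction for the symmetric case ($\beta = 0$, $\alpha \neq 1$). Treating $\gamma$ as a scale factor and $\delta$ as a location shift is an assumption about the parameterization, and the function name is hypothetical. This is not necessarily the generator used for the figures.

```python
import numpy as np

def symmetric_alpha_stable(alpha, gamma, delta, size, rng):
    """Draw symmetric (beta = 0) alpha-stable samples, alpha != 1,
    using the Chambers-Mallows-Stuck construction."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform phase
    w = rng.exponential(1.0, size)                 # unit-mean exponential
    unit = (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))
    # Assumed mapping: gamma scales the unit-scale variate, delta shifts it.
    return gamma * unit + delta

rng = np.random.default_rng(3)
noise = symmetric_alpha_stable(alpha=1.5, gamma=1.0, delta=0.0, size=65536, rng=rng)
print(np.median(np.abs(noise)), np.max(np.abs(noise)))
```

The gap between the median magnitude and the maximum magnitude is the whole story: most samples look like ordinary noise, and a small number are enormous.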

But there is no need to create a theory around correntropy, wantonly mixing together a bunch of higher-order cyclic moments, to accommodate impulsive noise. As I’ve described before, you can simply detect and remove the worst of the impulses, and proceed with whatever CSP algorithms you wish after that. This isn’t difficult signal processing. The simple impulse-detection-and-removal algorithm removes a couple hundred impulses (and removes zero impulses when applied to the BPSK-in-AWGN signal above), leading to the surfaces shown in Figure 22.
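To give a sense of how little machinery is involved, here is a hypothetical impulse exciser. It is not the algorithm from my earlier post: the median-based threshold, the threshold factor, and the zero-fill are stand-in assumptions chosen only to show the idea.

```python
import numpy as np

def excise_impulses(x, threshold_factor=6.0):
    """Zero out samples whose magnitude exceeds threshold_factor * median magnitude.

    Returns the cleaned block and the number of excised samples. Both the
    median-based threshold and the zero-fill are assumptions for this sketch.
    """
    x = np.asarray(x)
    mag = np.abs(x)
    mask = mag > threshold_factor * np.median(mag)
    cleaned = np.where(mask, 0.0, x)
    return cleaned, int(np.count_nonzero(mask))

rng = np.random.default_rng(2)
bits = rng.choice([-1.0, 1.0], size=1000)
clean = np.repeat(bits, 10) + 0.1 * rng.standard_normal(10000)   # BPSK in AWGN

corrupted = clean.copy()
hit = rng.choice(len(corrupted), size=200, replace=False)
corrupted[hit] += 50.0 * rng.choice([-1.0, 1.0], size=200)       # synthetic impulses

_, removed_from_clean = excise_impulses(clean)
_, removed_from_corrupted = excise_impulses(corrupted)
print(removed_from_clean, removed_from_corrupted)
```

On this toy data the exciser leaves the AWGN-only block untouched and removes exactly the injected impulses, after which any standard CSP estimator can be applied.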

As a final indication that simple signal processing can effectively handle impulsive noise, the mitigated BPSK signal is subjected to blind CSP using the SSCA. This results in a set of estimated cycle frequencies. When using those, we obtain the surface shown in Figure 23, which is nearly identical to that obtained in Figure 22 where the true exact cycle frequencies are used.

Next, we show some cyclic correntropy profile results from our own processing. I’ll display the cyclic correntropy spectral density (CCSD) function from the original cyclic correntropy post. It is similar to the cyclic correntropy profile used by the authors of The Literature [R173].

First we consider a real-valued BPSK signal with bit rate $1/128$ and carrier frequency of $1/128$. The cyclic correntropy profile is shown in Figure 24 for a block length of $16384$ samples. Note that since the doubled carrier frequency is equal to the second harmonic of the bit rate, all the cycle frequencies in the profile lie on the grid $k/128$. This is an unusual (perhaps pathological) case–I typically refer to these kinds of parameter choices as ‘magic numbers.’ Another example is when all the parameters of a signal are equal to multiples of reciprocals of dyadic numbers, such as $k/2^M$, and signal processing algorithms are applied that exclusively use FFTs of lengths $2^N$. So all cycle frequencies are ‘on bin center.’
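Checking for this kind of magic is simple: a cycle frequency is on bin center when its product with the FFT length is an integer. A small sketch with exact fractions follows. The function name and the non-magic comparison value are mine.

```python
from fractions import Fraction

def on_bin_center(alpha, fft_length):
    """True when alpha (cycles/sample) is an integer multiple of 1/fft_length."""
    return (Fraction(alpha) * fft_length).denominator == 1

print(on_bin_center(Fraction(3, 128), 16384))   # a k/128 cycle frequency, power-of-two block
print(on_bin_center(Fraction(3, 100), 16384))   # hypothetical non-magic cycle frequency
```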

When we remove the magic, as in Figure 25, we see the full cyclic correntropy pattern.

Moving to complex-valued signals, I want to demonstrate the differences between using $x^2$ and $|x|^2$ in the correntropy kernel. We’ll consider a BPSK signal in AWGN. It has a bit rate of $f_{bit} = 1/32$ and carrier offset of $0.01$. We employ the correntropy kernel with $\sigma = 1.0$ and an estimation block with length 16384 (as before). When we use $|x|^2$ in the kernel, we obtain the cyclic correntropy profile shown in Figure 26. The symbol-rate peak is prominent–this is what the authors are going after, and that’s a good thing.

When we use $x^2$ in the kernel, we obtain the results shown in Figure 27. We can call the cyclic correntropy profile with $|x|^2$ the non-conjugate profile and with $x^2$ the conjugate profile. Note that the conjugate profile in Figure 27 has many many prominent (and correct) cycle frequencies of the form $2Kf_c$ and $2Kf_c \pm kf_{bit}$.

The authors mention correntropy’s utility to, or tolerance of, cochannel interference in several places. So as our final example of how useful correntropy is, let’s apply it to two cochannel BPSK signals. The two bit rates are $1/32$ and $1/15$, and the corresponding carrier frequency offsets are $0.01$ and $0.011$. The kernel uses $\sigma = 1$ for the case of a non-conjugate kernel ($|x|^2$), and the result is shown in Figure 28. Notice the two prominent peaks at the two bit rates, but the even more prominent peak near zero.

Turning to the conjugate correntropy (kernel uses $x^2$), we notice that the outcome is quite sensitive to the particular value of the kernel scaling factor $\sigma$. When $\sigma = 1$ (a value we have used in many of the results shown in this post so far), we obtain the result in Figure 29. Here the maximum value on the y-axis (obscured by the title, sorry) is $3\times 10^{31}$, so some kind of overflow is occurring.

Turning down the gain, or turning up the scaling factor, to $\sigma = 2$ results in the profile shown in Figure 30. Which is more believable, but still weird. The profiles for $\sigma = 3$ and $5$ are shown in Figures 31 and 32, the latter of which is understandable–we can see $2(0.01)$ and $2(0.011)$. But the profile varies wildly over a small range of $\sigma$, which I view as a serious drawback to the method, and it will also depend on the power levels of the involved signals, indicating that maybe this method isn’t so useful for cochannel problems after all.
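One piece of arithmetic may be relevant to the huge values and the $\sigma$ sensitivity, although I have not verified that it is the cause. For a complex kernel argument $z = a + ib$, we have $z^2 = (a^2 - b^2) + i2ab$, so the magnitude of the $x^2$ kernel is

$\displaystyle \left| \frac{1}{\sqrt{2\pi}\sigma} e^{-z^2/2\sigma^2} \right| = \frac{1}{\sqrt{2\pi}\sigma} e^{(b^2 - a^2)/2\sigma^2}$

Unlike the $|x|^2$ kernel, which is bounded by $1/(\sqrt{2\pi}\sigma)$, this grows exponentially whenever the imaginary part of the argument dominates the real part, and the growth rate scales with $1/\sigma^2$. Larger signal powers or smaller $\sigma$ both push the exponent up quickly.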

CSP is touted (by me!) as shining in situations involving interference. This one is no different. By applying the simple impulse-detection and impulse-excision algorithm of the 2019 correntropy post, we can see the before and after PSDs for the scenario in Figure 33.

When we apply the strip spectral correlation analyzer (SSCA) method of blind cycle-frequency estimation to the cochannel signals with the unmitigated impulsive noise, we get the result shown in Figure 34. Clearly the impulsive noise causes many false alarms (for reasons I’ve already mathematically documented), but also we see the true cycle frequencies in the list as well.

The SSCA is also applied to the mitigated scene with the good results in Figure 35. This mitigation is so good that the downstream multiple-signal modulation recognition algorithm (My Papers [26]) has no trouble producing the decision of two BPSK signals along with the correct parameters for each.

### Who are the Peers? Who Watches the Watchers?

I went through this peer-reviewed paper in painstaking detail to prove that it contains so many fundamental flaws that it should not have been published. Graduate students or professional engineers who are trying to learn about the field (CSP or even correntropy) will either waste a lot of time or actually learn the wrong things, causing them to spend time at some point in the future unlearning them, if they are lucky. And going through a paper in the manner that I did has its own value for the small set of readers and researchers out there that really care about this topic.

But the wider problem is the point of the post: The review process is broken. We know this paper was reviewed and we know it was published. Therefore, the reviews were sufficiently positive for the editor to recommend, and achieve, publication. But the reviewers, the editor, and the authors, are wrong about the technical content of the paper. Wrong or ignorant, I suppose.

We can contemplate several reasons for the peers’ failures. The first is that they willfully ignored the errors that they saw–perhaps they are invested in seeing correntropy gain wide acceptance, or perhaps the authors are colleagues or friends. However, I prefer to adopt Hanlon’s Razor here, and I like Napoleon Bonaparte’s version:

Never ascribe to malice that which is adequately explained by incompetence.

Napoleon Bonaparte

From that point of view, we just have incompetent reviewers. Judging from the number of times I’ve done this kind of review of published papers (leave aside arxiv.org papers if they didn’t make it to publication), combined with the fact that I see only a tiny fraction of the technical literature, many reviewers are incompetent.

What to do?

### You Get What you Pay For?

So, enough analysis and criticism. How about a solution?

We don’t pay reviewers. However, when we want something done right, we do pay for it. So we have to find a way to pay reviewers so that the best, most competent, reviewers will spend some of their valuable time reviewing technical papers.

But it may be a trap to try to create a corps of paid full-time reviewers because their expertise will grow stale. What we need is to entice current experts in their fields to do the reviews alongside their normal work, which helps maintain and strengthen their expertise over time.

I’d like to see a system where the IEEE, Elsevier, and others choose a subset of reviewers in some quasi-random way and then pay them. Not a completely uniform distribution over the reviewers though–the chances of getting a reviewer’s payday can be increased by good reviewer behavior. And just what is that?

• No trivial reviews. Ever seen a review of a 10-page technical manuscript that is a couple sentences long? Not useful. Odds of a payday go down with trivial reviews.
• Reviewers in the running for payment should do a minimum number of reviews each year. Odds of a payday go up with review frequency.
• Reviewers should be required to read and rate the other reviews for a manuscript they have reviewed. Payday odds go down when other reviewers consistently rate a reviewer as poor.

We could then create a formula based on these metrics that adjusts the chance of getting paid to do reviews. The IEEE could award a substantial payment of, say, \$1000 each month in each of its main technical areas. The payee would be chosen by a random number generator using the probabilities as adjusted by the reviewer’s behaviors.
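As a sketch of how such a lottery might work, the snippet below turns the three behaviors into a weight and draws a payee in proportion to it. The weight formula, the minimum review count, the reviewer names, and the numbers are all hypothetical. The real formula would need to be designed and tuned by the publisher.

```python
import random

def payday_weight(num_reviews, trivial_fraction, mean_peer_rating, min_reviews=4):
    """Hypothetical weight: zero below a minimum review count, then scaled
    up by review frequency and peer rating and down by trivial reviews."""
    if num_reviews < min_reviews:
        return 0.0
    return num_reviews * (1.0 - trivial_fraction) * mean_peer_rating

# Made-up reviewers: peer ratings are on a 0-to-1 scale.
reviewers = {
    "reviewer_a": payday_weight(num_reviews=10, trivial_fraction=0.0, mean_peer_rating=0.9),
    "reviewer_b": payday_weight(num_reviews=10, trivial_fraction=0.5, mean_peer_rating=0.4),
    "reviewer_c": payday_weight(num_reviews=2, trivial_fraction=0.0, mean_peer_rating=1.0),
}

rng = random.Random(0)
names = list(reviewers)
winner = rng.choices(names, weights=[reviewers[n] for n in names], k=1)[0]
print(winner)
```

Here the diligent reviewer is several times more likely to be paid than the one who writes trivial, poorly rated reviews, and the reviewer below the minimum count is not in the draw at all.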

So you could still do lots of great reviews, and yet never get paid. To deal with that, we could layer on a system where each year editors pick reviewers to be paid from a list of all those that have not yet received any payment for their work.

A researcher could still deliver poor reviews, but the other reviewers and the editor would downgrade that reviewer, and his/her chances of getting paid would go down. So the incentive to earn good ratings from other reviewers would work in favor of producing better and better reviews for authors and editors.